RELATED APPLICATIONS

[0001] This patent application is a continuing patent application of U.S. patent application Ser. No. 09/932,806, filed Aug. 17, 2001.
TECHNICAL FIELD

[0002] The present disclosure is directed to video presentation transmission and, more particularly, to methods and apparatus for generating navigation information on the fly.
BACKGROUND

[0003] Many digitized moving picture systems use the well-known protocols and formats developed by the Moving Pictures Experts Group, generically referred to as MPEG. Various versions of these protocols have been developed, and are referred to as MPEG-1, MPEG-2, etc. In an MPEG system, compressed video and audio data is packetized into elementary streams wrapped inside packet headers that contain information necessary to decompress the individual streams during playback. These individual audio and video elementary streams can be further assembled, or multiplexed, into a single stream with timing information in the packet headers that identifies the time at which the contents of each packet should be presented. In this way, video packets can be synchronized with audio packets during playback. MPEG systems use two basic types of multiplexed streams, namely, Program Streams (PS) and Transport Streams (TS). Program Streams are targeted primarily for storage media. Transport Streams are targeted primarily for transmission, which carries a potentially higher error rate than storage.
[0004] In encoders of MPEG systems, audio and video are individually compressed and packetized. A multiplexer then combines the individual packets into a PS or TS. On the decoder side, the packets are retrieved from the stream by a demultiplexer, individual packets are depacketized and decompressed, and synchronization between audio and video is achieved by using the appropriate fields in the PS or TS headers. Decoding is typically performed on the fly (i.e., dynamically) as the audio/video is played back. The packets are time-stamped, allowing the playback to be manipulated to perform such functions as: moving directly to specified portions of the audio and/or video presentation, pausing, playing only audio or only video, playing audio in different languages, etc., while maintaining proper synchronization. These and similar functions are collectively referred to as navigation. Generating navigation data for an MPEG stream is conventionally performed during the encoding operation, and the navigation data is placed into the MPEG stream in the form of navigation packets.
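By way of illustration only, the following sketch (in Python) shows the time-stamp-driven packet model described above. The field names, units, and helper functions are assumptions made for the example; the actual packet layouts are defined by the MPEG specifications.

from dataclasses import dataclass

@dataclass
class PesPacket:
    stream_id: str   # e.g., "video", "audio-en", "audio-fr"
    pts: int         # presentation time stamp, in 90 kHz clock ticks
    payload: bytes   # compressed elementary-stream data

def multiplex(video_packets, audio_packets):
    # Interleave the individually packetized streams into one stream,
    # ordered by presentation time, as a multiplexer would.
    return sorted(video_packets + audio_packets, key=lambda p: p.pts)

def packets_due(stream, clock_ticks):
    # During playback, packets whose PTS matches the system clock are
    # presented together, keeping audio and video synchronized.
    return [p for p in stream if p.pts == clock_ticks]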
[0005] Previously, particularly relevant points in the MPEG data stream were selected to be navigation points or links that were presented to a user on a display screen, such as a television that was also displaying decoded program content. The viewer could then select one of the navigation points from the display and have the audio and video associated with the selected navigation point presented to the viewer. However, conventionally the navigation points were generated prior to broadcast of the MPEG stream and were transmitted at substantially the same time as the MPEG stream. Accordingly, in prior systems, program content had always been stored and previewed before it was broadcast so that the navigation points could be generated and broadcast with the MPEG stream including the program content.
BRIEF DESCRIPTION OF THE DRAWINGS

[0006] FIGS. 1 and 2 are block diagrams of example communication systems.

[0007] FIG. 3 is a block diagram of the example encoders of FIGS. 1 and 2.

[0008] FIG. 4 is a block diagram of the example navigation generator of FIG. 1.

[0009] FIG. 5 is a block diagram of the example decoder/players of FIGS. 1 and 2.

[0010] FIG. 6 is a block diagram of the example playback stack of FIG. 5.

[0011] FIG. 7 is a block diagram of the example seek point controller of FIG. 6.

[0012] FIG. 8 is an example display view including navigation seek points presented to a user of the decoder/player of FIGS. 1 and 2.
[0013] FIG. 9 is a flow diagram of an example compose navigation database process.
[0014] FIG. 10 is a flow diagram of an example playback process.
DETAILED DESCRIPTION

[0015] FIG. 1 shows a block diagram of an example end-to-end system 1 including an encoder 10 coupled to a navigation generator 20 that is further coupled to a decoder/player 30. The system 1 is used to create a navigation database and an audio/video data stream, such as, for example, an MPEG stream including audio and video information. In the disclosed example, the encoder 10 receives audio and video data, compresses the received data to reduce the storage space and bandwidth required by that data, packetizes the compressed data into audio and video packets, and multiplexes the packets together into an MPEG stream that is coupled to the navigation generator 20.
[0016] The navigation generator 20, as described below in conjunction with FIG. 4, receives information from a composer (e.g., a user who manually enters navigation data while viewing a video presentation and who is typically located at a broadcast operation center (BOC)), examines the MPEG stream received from the encoder 10, and produces an associated navigation database containing navigation metadata for performing navigation functions on the MPEG stream. The particular navigation functions that are performed are specified by the composer inputs.
[0017] Alternatively, as shown in the example alternative end-to-end system 2 of FIG. 2, the encoder 10 provides information to an automatic navigation generator 40. The automatic navigation generator 40 automatically generates the navigation database by analyzing the content of the MPEG stream and identifying desired points in the MPEG stream based on predefined criteria. In this alternative example, the navigation generator 40 produces a navigation database using an automated process rather than relying on direct user inputs to specify which parts of the MPEG stream are identified in the navigation database.
[0018] The output of either the generator 20 or the generator 40 is coupled to the decoder/player 30 to allow a user to select various portions of the MPEG stream for playback based on the data in the navigation database. The selected portions are demultiplexed, depacketized, and decompressed by the decoder/player 30 to produce the desired video and/or audio outputs (e.g., for playing, for viewing, for listening and/or recording). The navigation data in the navigation database can be used to produce these outputs and/or to create special effects such as pause/resume, freeze-frame, fast playback, and slow playback, as well as to provide a user the ability to jump to a particular location in the programming.
[0019] In the examples of FIGS. 1 and 2, the functions of decoding and playing are integrated into the decoder/player 30. In another example, the decoder and player are separate units, and the decoded data developed by the decoder is stored and/or transmitted to the player by a transmitter (not shown) for presentation. In an example, the functions of the encoder 10, the generator 20, and the decoder/player 30 are performed at different times and on different platforms, and one or more storage media are used to hold the data stream and/or navigation database until the next stage is ready to receive that data. For example, the encoder 10 may store the encoded data stream before passing the encoded data stream to the generator 20. Additionally or alternatively, the generator 20 may hold, or store, navigation database information until such information is passed to the decoder/player 30.
[0020] FIG. 3 illustrates an example encoder 10. The encoder 10 could be located at a BOC for a Video-on-Demand network, for example. In the example of FIG. 3, a raw video signal is provided to a video encoder 11 and a raw audio signal is provided to an audio encoder 12. According to the disclosed example, the video and audio signals are digitized before presentation to the encoders 11, 12. However, in another example, the encoders 11, 12 could include analog-to-digital converters and, therefore, could accommodate the input of analog signals. The video data is compressed through any known video compression algorithm by the video encoder 11. Similarly, the audio data is compressed through any known audio compression algorithm by the audio encoder 12. For example, the encoders could perform MPEG compression or any other suitable type of compression.
[0021] The compressed video data from the video encoder 11 is segmented into packets with predefined sizes, formats, and protocols by a video packetizer 13, while the compressed audio data from the audio encoder 12 is segmented into packets with predefined sizes, formats, and protocols by an audio packetizer 14. Each packet developed by the video packetizer 13 and/or the audio packetizer 14 may be referred to as an audio/video packetized elementary stream (A/V PES) and contains timing information, such as, for example, a presentation time stamp (PTS) that identifies where in the playback presentation the data in the packet should be placed. The playback operation later synchronizes the video and audio packets in the proper timing relationships by matching up timing information from various packets.
[0022] In the foregoing examples, the audio data may contain multiple audio tracks, such as voice tracks in different languages for a movie. Each audio track uses the same relative timing data and, therefore, each packet is identified by a sequence number or other identifier. As described below in detail, the navigation data uses these sequence numbers to identify particular packets for playback, thereby permitting selection of the audio packets for the desired audio track. Persons of ordinary skill in the art will readily appreciate that the navigation data can also use the packet identifiers to permit mixing specified video and specified audio in other predetermined ways during playback.
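Building on the PesPacket sketch above (again with hypothetical stream identifiers that are assumptions for the example), selecting packets for a desired audio track by identifier might look like the following:

def select_track(packets, video_id, audio_id):
    # Keep the video stream and only the requested audio track; alternative
    # language tracks share the same PTS values but carry different identifiers.
    return [p for p in packets if p.stream_id in (video_id, audio_id)]

# Usage (assumed identifiers): present the video with the French voice track.
# french_presentation = select_track(mpeg_packets, "video", "audio-fr")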
[0023] In the example of FIG. 3, the packets produced by the packetizers 13, 14 are combined into a single data stream by a multiplexer 15. The multiplexer 15, which receives a system time clock input, may be, for example, a program stream multiplexer or a transport stream multiplexer. The multiplexed data may contain additional information related to timing and/or contents, and may follow an MPEG-2 transport protocol.
[0024] In the illustrated example, the MPEG stream is stored in an optional storage device 16 before being provided to the navigation generator 20. The storage device 16 may be internal or external to the encoder 10, and may utilize a portable medium such as, but not limited to, a compact disk read-only memory (CD-ROM) or a digital versatile disk (DVD). The data stream may be written directly into the storage device 16, or may be transmitted through a transmission medium before being stored. Regardless of the particular configuration of the storage device 16, the MPEG stream may be read out of the storage device 16 for transmission to the navigation generator 20. Alternatively, the optional storage device 16 may be completely eliminated and, therefore, the MPEG stream may not be stored at all.
[0025] An example navigation generator 20 is illustrated in FIG. 4. In the illustrated example, the navigation generator 20 includes an authoring tool 21. The authoring tool 21 is used to examine the MPEG stream and to generate navigation information identifying content of interest in the MPEG data stream. The authoring tool 21 may be implemented in hardware, software and/or firmware, or in any suitable combination thereof. The authoring tool 21 is responsive to navigation configuration information 24, which includes a file of desired seek points in the MPEG stream as defined by composer inputs 23. The composer is typically a person who views the MPEG stream content and manually records the seek points in the file according to a predefined format, which is discussed below. In an example, the composer specifies the composer inputs through a keyboard, by pointing to icons on a screen with a mouse or other pointing device, or by any other data input device. Additionally or alternatively, the authoring tool 21 may be programmed to search through the MPEG stream in various ways to locate and define desired seek points. In an example, the authoring tool examines the timing information of the packets.
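The predefined format of the seek point file is not reproduced here; purely as an assumed sketch, the composer inputs might be captured as labeled start and end times that the authoring tool 21 later resolves against the MPEG stream:

# Hypothetical composer seek-point file (the format is an assumption made
# for illustration): one label plus start and end times per line.
SEEK_POINT_FILE = """\
Home run, bottom of the 3rd | 00:48:12 | 00:49:05
Pitching change             | 01:05:30 | 01:06:10
"""

def parse_seek_points(text):
    points = []
    for line in text.strip().splitlines():
        label, start, end = (field.strip() for field in line.split("|"))
        points.append({"label": label, "start": start, "end": end})
    return points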
[0026] In some examples, the authoring tool 21 may be interactive, thereby allowing personnel at the BOC to view the MPEG encoded programming as it is broadcast to viewers. For example, the authoring tool 21 may be embodied in a workstation or any other suitable computer installation that allows personnel at the BOC to identify portions of the broadcast programming that are of interest as the MPEG stream is broadcast. Accordingly, personnel at the BOC view the programming at substantially the same time as the viewers, and the personnel at the BOC surmise the events in the programming that the viewers will find interesting and will want to review at a later time. For example, considering a baseball game, personnel at the BOC watch the MPEG stream delivering the baseball game programming content at approximately the same time the viewers view the content. As the BOC personnel and the viewers watch the game, a player from the home team hits a home run. At that point, the BOC personnel control the authoring tool 21 to make an indication that an event of interest has occurred. The indication of the interesting event is translated into a navigation file that is broadcast to the viewer's location to identify the content of interest. At some later time (e.g., between innings of the baseball game), the viewers may select a link that causes their receiver/decoder to present the audio and video of the home run. To permit this on the fly insertion of navigation seek points, there may be a delay (e.g., 3 minutes) between the time the BOC personnel are provided with the MPEG stream and the broadcast time of that MPEG stream.
[0027] While the foregoing example is pertinent to a baseball game in particular and, in general, to sporting events, those having ordinary skill in the art will readily recognize that other programming events could be analyzed at the BOC as they are watched by viewers and, subsequently, navigation information could be provided to each user. Additionally, it will be readily appreciated by those having ordinary skill in the art that the application of the disclosed system is not limited only to live events such as sporting events. To the contrary, the disclosed system could be used for previously recorded programs stored at the BOC for which the BOC does not generate navigation information until the program is broadcast. However, the application to live broadcasts is particularly advantageous.
[0028] In another example, the MPEG stream may include a video presentation such as a digitized movie or other video sequence. In such an example, the composer-inputted criteria may be seek points that are located specified amounts of time after the movie starts, or may be points that divide the movie into a specified number of equal time segments.
[0029] After the content of interest has been identified, the authoring tool 21 locates a video intraframe (I-frame) that is closest to each specified part of the MPEG presentation for which a seek point is desired, and identifies that I-frame, or the packet containing that I-frame, as the requested point in the MPEG sequence. The identified points and corresponding I-frames divide the data stream into labeled segments. In this example, I-frames are used as reference points because, unlike predicted frames (P-frames) or bi-directional frames (B-frames), they are self-contained video images that do not depend on previous or subsequent frames for their reconstruction.
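A minimal sketch of the closest-I-frame lookup, assuming the authoring tool already holds a sorted list of I-frame presentation times in seconds (the helper name and units are assumptions for the example):

import bisect

def nearest_i_frame(i_frame_times, requested_time):
    # i_frame_times must be sorted in ascending order; return the I-frame
    # presentation time closest to the requested seek time.
    pos = bisect.bisect_left(i_frame_times, requested_time)
    candidates = i_frame_times[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda t: abs(t - requested_time))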
[0030] In the example of FIG. 4, the navigation data generated by the authoring tool 21 is placed into one or more navigation files 22. Unlike conventional systems that encode navigation data into the MPEG stream, the navigation file(s) 22 of FIG. 4 are separate files from the files holding the MPEG stream and are broadcast after the MPEG stream is broadcast. Both the navigation file(s) 22 and the associated MPEG stream may be stored at the BOC until needed for playback, although they may remain as separate files. In the example illustrated in FIG. 4, both the navigation file(s) 22 and the MPEG stream are stored in the storage device 26 (e.g., a hard drive). The storage device 26 may be internal or external to the navigation generator 20, and may include a fixed or portable medium. Alternatively, the navigation files 22 and the associated MPEG stream may be broadcast "on-the-fly" to client devices by a transmitter (not shown) over a transmission medium.
[0031] In an example, the navigation file comprises two files. The first file is an Extensible Markup Language (XML) file or a file of some other pre-defined format. The first file contains chapter times, positions and labels, and audio/video stream packet IDs and labels. It may also include timing information, titles for scenes in the media presentation, bitmap files for the scenes, and I-frame locations in the stream. The second file is a binary file referred to as an I-frame index file. The second file contains the presentation time and file offset of the packet corresponding to each video I-frame. The I-frame index file is used for video trick modes, such as fast forward and fast reverse. Additionally, the I-frame index may be used to provide a user the ability to jump to particular locations in the presentation and may also be used as a quick-scan source for locating specific time points in the presentation. The navigation files (including seek points) and the MPEG stream are read out and broadcast (or simply broadcast in the case of "on-the-fly" broadcast during play of the media presentation) to client devices.
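The exact binary layout of the I-frame index file is not spelled out above. One plausible sketch, assuming fixed-size little-endian records of presentation time and file offset (an assumption made for illustration, not the actual format), is:

import struct

RECORD = struct.Struct("<QQ")  # (presentation time, byte offset), both 64-bit

def write_i_frame_index(path, entries):
    # entries: iterable of (presentation_time, file_offset) pairs,
    # one per video I-frame, in stream order.
    with open(path, "wb") as f:
        for pts, offset in entries:
            f.write(RECORD.pack(pts, offset))

def read_i_frame_index(path):
    with open(path, "rb") as f:
        data = f.read()
    return [RECORD.unpack_from(data, i) for i in range(0, len(data), RECORD.size)]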
[0032] FIG. 5 shows an example client device 30 for playing all or a portion of the MPEG stream. A playback stack 31 reads the navigation files 22 and presents navigation options to the display 34. Such options may include chapters available, chapter labels, etc. A playback control 33 is adapted to provide input data to the playback stack 31 to identify the portions or segments of the MPEG stream that are to be selected by a viewer for presentation. The playback stack 31 responds to input data from the playback control 33 by reading the navigation files to identify where in the MPEG stream requested segments of the presentation are located. The selected MPEG segments are then read from the playback stack 31 and presented to the decoder 32, where they are decoded and played on one or more media players. The illustrated example shows media players such as a display 34 for presenting video data and a speaker 35 for presenting audio data.
[0033] As shown in FIG. 6, the example playback stack 31 includes an intelligent reception system 54, an internal mass storage device 56 for storing the MPEG data stream and the navigation files, and a seek point controller 58. The intelligent reception system 54, which may be implemented as software, hardware and/or firmware, controls whether to store navigation file(s) and whether to associate the file with the MPEG stream content. The intelligent reception system 54 may be used, for example, to allow only subscribers of the system wishing seek point functionality to have access to navigation information. Additionally or alternatively, the intelligent reception system 54 may be used to allow only viewers of a particular program or viewers who have expressed interest in a particular program to receive and view the navigation files associated with that program. The mass storage device 56 may be a hard drive, an optical disk and its associated optical drive, semiconductor memory or any other memory device. In any of the foregoing examples, the navigation file(s) may be encrypted by the encoder 10 and decrypted by the intelligent reception system 54 of the paying subscriber. Alternatively, the navigation files may be unencrypted.
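As a rough sketch of the kind of decision the intelligent reception system 54 makes (the subscriber and program fields below are assumptions, not part of the disclosure), a received navigation file might be stored only for subscribers of the seek point feature who are watching, or have expressed interest in, the associated program:

def should_store_navigation_file(nav_file, subscriber):
    # Store only when the viewer subscribes to seek point functionality and the
    # file's program matches what the viewer watches or has flagged as of interest.
    if not subscriber.get("seek_points_enabled", False):
        return False
    program = nav_file.get("program_id")
    return (program == subscriber.get("current_program")
            or program in subscriber.get("programs_of_interest", []))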
[0034] The seek point controller 58 enables a user of the client device 30 to use the navigation seek points to navigate within the MPEG media presentation. The seek point controller 58 may be located within the playback stack 31, as shown, or may be located external to the playback stack 31, and may be implemented as software, firmware and/or hardware or any suitable combination thereof. When implemented as software or firmware, for example, the seek point controller 58 could be an additional application stack added to an existing playback device.
[0035] An example seek point controller 58 is illustrated in detail in FIG. 7. In this example, the seek point controller 58 includes a controller/selector 62 that receives an input of I-frame data and navigation file information from the intelligent reception system 54. The navigation file information includes the navigation seek points. The controller/selector 62 reads or parses the file(s) 22, which include stored seek points or locations, and outputs the navigation points to a user's media playing device (e.g., either the visual display 34, the audio speaker 35, or both) via a seek point output device 68.
[0036] The seek point output device 68 formats the navigation seek points for visual display on a user's visual display 34. Additionally, audio data concerning the seek points may also be formatted and output for audio indication of the navigation seek points to the audio speaker 35, for example. The control output block 64 of the seek point controller 58 also controls the mass storage device 56 to control the flow of information from the mass storage device 56 to the decoder 32. For example, seek point information and MPEG stream portions may be output from the mass storage device 56 to the decoder 32 for presentation on the display 34 (FIG. 5).
[0037] The receiver 66 of the seek point controller 58 receives information from the playback control 33, which may be, for example, a remote control, and passes the information on to the controller/selector 62. The information received by the receiver 66 may include indications of links that a viewer desires to select. Accordingly, the information passed to the receiver 66 is used to control, via the control output 64, the MPEG stream information passed from the mass storage device 56 to the decoder 32.
[0038] An example of a navigation display presented to a user is illustrated in FIG. 8. As shown, the display 34 includes a program viewing area 70 in which programming information such as television shows may be presented. The display 34 also includes a number of sections of navigation seek point information, as shown at reference numerals 72-78. While the navigation seek point information and the programming information are shown as occupying two separate areas of the display 34, those having ordinary skill in the art will readily recognize that this arrangement is just one example arrangement because the information may be presented together, such as by overlaying one type of information over the other. The navigation seek point information displayed at reference numerals 72-78 may be textual, but may also include, for example, image and audio information. Of course, as will be readily appreciated by those having ordinary skill in the art, the navigation seek point information is not limited to display in the locations denoted with reference numerals 72-78. To the contrary, the locations denoted with reference numerals 72-78 are merely examples of locations at which the navigation seek point information may be displayed.
[0039] After the navigation seek point data is displayed to the user, the user may input a selection of a seek point. Returning to FIGS. 6 and 7, such a selection is sent from the playback control 33 to the receiver 66 within the seek point controller 58 of the playback stack 31. The receiver 66 may also be configured to accept other data from the playback control 33, such as commands to move graphical indications (e.g., cursors) on the display to indicate which navigation seek point is desired. In this way, the user is presented with a graphical user interface to graphically move between different seek points to facilitate selection of a particular seek point.
[0040] Once a navigation seek point selection is received by the receiver 66 from the playback control 33, the selection is communicated to the controller/selector 62 to effect presentation of the desired portion of the MPEG media presentation. The controller/selector 62 correlates the selected seek point with the particular I-frame carrying the corresponding portion of the media selection by addressing the navigation database. The controller/selector 62 then commands the control output 64 to relay the selection information, including the starting I-frame, to the playback stack 31. In response, the playback stack 31 accesses the MPEG data from the mass storage device 56 corresponding to the selected portion of the media presentation and serves up the MPEG data to the display 34 and/or speaker 35 through the decoder 32. Thus, the disclosed playback system permits a client/user to jump within an MPEG media presentation to portions of the presentation that the user wants to play back through visually or audibly presented navigation seek points.
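A minimal sketch of that correlation step, assuming a navigation database keyed by seek point label and a hypothetical playback object (both are assumptions made for the example, not the disclosed apparatus):

def handle_seek_selection(label, navigation_db, playback):
    # Look up the chosen seek point, retrieve its starting I-frame record
    # (file offset and presentation time), and direct playback to that point.
    entry = navigation_db[label]               # e.g., {"offset": ..., "pts": ..., "end_pts": ...}
    playback.seek(entry["offset"])             # position the MPEG read point at the I-frame
    playback.play_until(entry.get("end_pts"))  # present until the noted end of the segment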
[0041] FIG. 9 is a flow chart of an example process 80 for composing a navigation database. The process 80 of FIG. 9 is, for example, implemented in the authoring tool 21 (FIG. 4), at which BOC personnel (e.g., an operator) can view the MPEG stream to display the programming content of the MPEG stream (block 82). As the operator views the programming, the operator waits for content of interest to be displayed (block 84). If no content of interest is displayed (block 84), the operator continues to view the programming (block 82). If, however, content of interest is presented to the operator (block 84), the operator makes an identification of the content of interest (block 86). The identification could include the operator selecting a link provided on the authoring tool 21, depressing a button provided on the authoring tool 21, or making a time notation by hand and later keying the time information into the authoring tool 21. Both the start and end time of the content of interest may be indicated.
[0042] After the operator has identified the time of the content of interest, the authoring tool selects an MPEG I-frame having a timestamp close to the identified time (block 88). The I-frame is selected because I-frames include all visual information necessary to generate an image (i.e., I-frames are not relative frames dependent on other frames for visual content information). After the I-frame is identified (block 88), an identification of the I-frame is stored in a navigation file (block 90), along with the I-frame file offset and the presentation time stamp (PTS). The navigation file is then transmitted to viewers watching the programming (block 92). An indication of the end of the identified content of interest may also be stored in the navigation information. In the disclosed example, the navigation file is broadcast after viewers have had the opportunity to watch the programming content in the MPEG stream. Alternatively or additionally, the navigation file may be transmitted to all receiver/decoders so that the receiver/decoders can store the navigation file in the event that a viewer later tunes to programming related to the navigation file.
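Combining blocks 88-92, a hedged sketch of composing and writing out one navigation entry might read as follows. The entry fields are assumptions, and JSON is used only to keep the sketch short; the disclosure above describes an XML chapter file and a binary I-frame index.

import json

def compose_entry(label, marked_time, end_time, i_frames):
    # i_frames: list of dicts {"pts": presentation time in seconds,
    # "offset": byte offset in the stream}, one per video I-frame.
    # Pick the I-frame whose timestamp is closest to the operator's marked
    # time (block 88) and record it with its offset and PTS (block 90).
    frame = min(i_frames, key=lambda f: abs(f["pts"] - marked_time))
    return {"label": label, "pts": frame["pts"], "offset": frame["offset"],
            "end_time": end_time}

def write_navigation_file(entries, path="navigation.json"):
    # Written as a file separate from the MPEG stream, for transmission to
    # viewers after the programming has been broadcast (block 92).
    with open(path, "w") as f:
        json.dump(entries, f, indent=2)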
[0043] The navigation file in the disclosed example is separate from any file containing all or part of the MPEG stream itself, and may even be stored in a separate medium from the MPEG stream. Additionally, the metadata in the navigation file may also contain one or more of timing information, titles for scenes of the MPEG media presentation, bitmap or MPEG data for displaying images corresponding to particular scenes of the MPEG presentation, and I-frame location.
[0044] If there is more content to view (block 94), the authoring tool 21 continues to display programming (block 82). In the alternative, if there is no more programming content to view (block 94), the process 80 may end execution or may return control to any routine that called the process 80.
[0045] In the foregoing example, an operator at the BOC was described as selecting content of interest. However, persons of ordinary skill in the art will recognize that content of interest could be identified at locations other than the BOC and that the content of interest could be identified in an automated manner without operator interaction. For example, software or hardware could be adapted to identify content of interest by monitoring video information in the MPEG stream. For example, the authoring tool 21 could be automated to detect instant replays contained in sporting event broadcasts and could, therefore, identify the instant replays as content of interest.
[0046] Having described examples of how navigation information is identified and broadcast, attention is now turned to a process 100 (FIG. 10) for playing back portions of the MPEG stream associated with the navigation information. The process 100 may be implemented by the playback stack 31 (FIG. 5). During the playback process 100, the navigation information is displayed to the viewer watching the programming to which the navigation information corresponds (block 102). For example, as shown in FIG. 8, the navigation information may be displayed in areas denoted with the reference numerals 72-78. Additionally or alternatively, navigation information related to other programming content may be stored for later use.
[0047] After the navigation information is displayed (block 102), the process 100 determines if a viewer has selected any of the navigation information presented on the display (block 104). If the viewer has not selected any of the navigation information (block 104), real time programming continues to be displayed (block 106). If, however, a viewer has selected the navigation information (block 104), real time programming is interrupted (block 108) and the frames associated with the selected navigation information are recalled (block 110). The program content beginning with the information represented by the I-frame associated with the selected navigation information is then presented to the viewer (block 112) until the end of the programming content associated with the selected navigation information is reached (block 114). When the end of the programming associated with the selected navigation information is reached (block 114), real time programming is again displayed (block 106).
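As a hedged sketch of blocks 102-114 (the player object, its methods, and the selection callback are assumptions made for the example, not part of the disclosed playback stack):

def playback_process(player, navigation_entries, get_viewer_selection):
    # Display the available seek points, then either keep presenting real time
    # programming or jump to the selected segment and return when it ends.
    player.display_navigation([e["label"] for e in navigation_entries])  # block 102
    selection = get_viewer_selection()        # block 104; None if nothing selected
    if selection is None:
        player.show_real_time()               # block 106
        return
    entry = navigation_entries[selection]
    player.interrupt_real_time()              # block 108
    player.seek(entry["offset"])              # block 110: recall the stored frames
    player.play_until(entry["end_time"])      # blocks 112/114
    player.show_real_time()                   # block 106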
[0048] Although the foregoing describes real time programming as being displayed (block 106), it will be readily appreciated by those having ordinary skill in the art that other pre-recorded programming could be displayed. For example, if a user were watching an hour-long program and went back to re-watch a portion of the first 30 minutes of the program, which re-watching took, for example, 10 minutes, the user could either resume watching the real time programming, which is at minute 40, or could resume watching the program from minute 30 from a recording thereof. Accordingly, resuming real time programming is merely one example of content that may be displayed after a user navigates to view stored information.
[0049] The disclosed example may be implemented in circuitry or as a method. The disclosed example may also be implemented as information or instructions stored on media, which may be read and executed by at least one processor to perform the functions described herein. A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium may include read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); and others.
[0050] From the foregoing, persons of ordinary skill in the art will appreciate that the disclosed examples permit navigational data to be generated from an encoded data stream, thus allowing creation of a navigation database after the MPEG data has been compressed, packetized, and multiplexed. The data stream contains video and/or audio data that has been compressed and packetized according to any of various known formats and protocols such as any of the MPEG formats. The navigation data permits selective retrieval of portions of the data stream for playback by identifying packets or other portions of the data stream that are associated with navigation points (i.e., points in a presentation carried in the data stream that the user may wish to access quickly and begin playing). Navigation data may also include data that enables special effects or trick modes, such as fast forward or fast reverse. In addition, the navigation information provides the viewer the ability to jump to a particular point in the data stream.
[0051] The configuration information in the navigation database includes information on the particular points identified in the navigation database. These points and/or the associated information may be specified by a user. Alternately, the configuration data in the navigation database may be generated automatically. The data in the navigation database, which is data about other data, may be referred to as navigation metadata. The navigation database may be kept separately from the MPEG stream.
[0052] Although certain apparatus and methods have been described herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all embodiments of the teachings of the invention fairly falling within the scope of the appended claims, either literally or under the doctrine of equivalents.