GB2342802A - Indexing conference content onto a timeline - Google Patents

Indexing conference content onto a timeline

Info

Publication number
GB2342802A
Authority
GB
United Kingdom
Prior art keywords
conference
participant
sound
audio
timeline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB9916394A
Other versions
GB2342802B (en)
GB9916394D0 (en)
Inventor
Steven L Potts
Peter L Chu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Polycom LLC
Original Assignee
Picturetel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Picturetel Corp
Publication of GB9916394D0
Publication of GB2342802A
Application granted
Publication of GB2342802B
Anticipated expiration
Expired - Fee Related


Abstract

A method and system for indexing the content of conferences. The method includes identifying each conference participant producing a sound, capturing an image of each such participant, and correlating the images of the conference participants with the audio segments of an audio recording corresponding to the sound each participant produced. The indexing system includes a sound recording mechanism; at least one identifier of the locations of conference participants; a camera; an image storage device; a processor for associating the still images captured by the camera with the sound recorded by the sound recording mechanism, thereby correlating still images of the conference participants to the audio segments they produced; and a graphical user interface which allows easy access to the stored sound, images, and correlation data. The system may also include an aiming device for pointing the camera at the person speaking.

Description

METHOD AND APPARATUS FOR INDEXING CONFERENCE CONTENT

This invention relates to the field of multimedia.
With the advent of economical digital storage media and sophisticated video/audio decompression technology capable of running on personal computers, thousands of hours of digitized video/audio data can be stored with virtually instantaneous random access. In order for this stored data to be utilized, it must be indexed efficiently in a manner allowing a user to find desired portions of the digitized video/audio data quickly.
For recorded conferences having a number of participants, indexing is generally performed on the basis of "who" said "what" and "when" (at what time). Currently used methods of indexing do not reliably give this information, primarily because video pattern recognition, speech recognition, and speaker identification techniques are unreliable technologies in the noisy, reverberant, uncontrolled environments in which conferences occur.
Also, a need exists for a substitute for tedious trial-and-error techniques for finding when a conference participant first starts speaking in a recording.
The invention features a method and a system for indexing the content of a conference by matching images captured during the conference to the recording of sounds produced by conference participants.
Using reliable sound source localization technology implemented with microphone arrays, the invention produces reliable information concerning "who" and "when" (which persons spoke at what time) for a conference. While information concerning "what" (subject matter) is missing, the "who-when" information greatly facilitates manual annotation for the missing "what" information. In many search-retrieval situations, the "who-when" information alone will be sufficient for indexing.
In one aspect of the invention, the method includes identifying a conference participant producing a sound, capturing a still image of the conference participant, correlating the still image of the conference participant to the audio segments of the audio recording corresponding to the sound produced by the conference participant, and generating a timeline by creating a speech-present segment representing the correlated still image and associated audio segment. Thus, the timeline includes speech-present segments representing a still image and associated audio segments. The still image is a visual representation of the sound source producing the associated audio segments.
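The timeline described in this aspect can be sketched as a simple data model: a still image paired with a list of speech-present intervals. All names below are hypothetical; the patent does not specify any data structures:

```python
from dataclasses import dataclass, field

@dataclass
class SpeechSegment:
    """One interval during which a single source was producing sound."""
    start: float  # seconds elapsed from the start of the conference
    end: float

@dataclass
class Timeline:
    """Timeline for one conference participant: a still image plus the
    intervals in which that participant was the active sound source."""
    image_ref: str                      # reference to the stored still image
    segments: list = field(default_factory=list)

    def add_segment(self, start, end):
        self.segments.append(SpeechSegment(start, end))

    def total_speech_time(self):
        return sum(s.end - s.start for s in self.segments)

# Hypothetical usage: a participant who spoke twice
t = Timeline(image_ref="frame_0012.jpg")
t.add_segment(3.0, 10.5)
t.add_segment(42.0, 47.0)
print(t.total_speech_time())  # 12.5
```

A real implementation would store participant locations and far-end flags with each timeline as well; this sketch shows only the image-to-segments association that the aspect claims.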
The audio recording can be segmented into audio segments and associated with conference participants, whose images are captured, for example, with a video camera.
Embodiments of this aspect of the invention may include one or more of the following features.
The still image of each conference participant producing a sound is captured as a segment of a continuous video recording of the conference, thereby establishing a complete visual indicator of all speakers participating in a conference.
The timeline is presented visually so that a user can quickly and easily access individual segments of the continuous recording of the conference.
The timeline can include a colored line or bar representing the duration of each speech segment with a correlated image to index the recorded conference. The timeline can be presented as a graphical user interface (GUI), so that the user can use an input device (for example, a mouse) to select or highlight the appropriate part of the timeline corresponding to the start of the desired recording, access that part, and start playing the recording. Portions of the audio and video recordings can be played on a playback monitor.
Various approaches can be used to identify a conference participant. In one embodiment, a microphone array is used to locate the conference participant by sound.
The microphone arrays together with reliable sound source localization technology reliably and accurately estimate the position and presence of sound sources in space.
The time elapsed from a start of the conference is stored with each audio segment and each still image. An indexing engine can be provided to generate the timeline by matching the elapsed time associated with an audio segment and a still image.
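A minimal sketch of how an indexing engine might match audio segments to still images by elapsed time, as described above. The function name, tuple layout, and tolerance value are all assumptions, not details from the patent:

```python
def build_index(audio_segments, still_images, tolerance=0.5):
    """Pair each audio segment with the still image whose capture time is
    closest to the segment's start time (within `tolerance` seconds).

    audio_segments: list of (start_time, end_time) tuples, seconds elapsed
    still_images:   list of (capture_time, image_ref) tuples
    Returns a list of (image_ref, start_time, end_time) index entries.
    """
    index = []
    for start, end in audio_segments:
        # find the image captured nearest to the segment onset
        capture_time, image_ref = min(
            still_images, key=lambda img: abs(img[0] - start))
        if abs(capture_time - start) <= tolerance:
            index.append((image_ref, start, end))
    return index

segments = [(3.0, 10.5), (12.0, 20.0)]
images = [(3.1, "speaker_a.jpg"), (12.2, "speaker_b.jpg")]
print(build_index(segments, images))
# [('speaker_a.jpg', 3.0, 10.5), ('speaker_b.jpg', 12.0, 20.0)]
```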
The system can be used to index a conference with only one participant. The timeline then includes an indication of the times in which sound was produced, as well as an image of the lone participant.
In applications in which more than one conference participant is present and identified, the system stores, with each still image, the time elapsed from the start of the conference and an identification of when the speaker begins speaking, a participant being associated with each image.
The elapsed time is also stored with the audio recording each time a change in sound location is identified. The indexing engine creates an index, that is, a list of associated images and sound segments. Based on this index, a timeline is then generated for each still image (that is, each conference participant) designating the times from the start of the conference when the participant speaks. The timeline also indicates any other conference participant who might also appear in the still image (for example, a neighbor sitting in close proximity to the speaker), but is silent at the particular elapsed time, thus giving a comprehensive overview of the sounds produced by all conference participants, as well as helping identify all persons present in the still images. The timeline may be generated either in real time or after the conference is finished.
In embodiments in which a video camera is used to capture still images of the conference participants, it can also be used to record a continuous video recording of the conference.
The system can be used for a conference with all participants in one room (near-end participants) as well as for a conference with participants (far-end participants) at another site.
Assuming that a speaker has limited movement during a conference, the same person is assumed to be talking every time sound is detected from a particular locality. Thus, if the speech source is determined to be the same as the locality of a previously detected conference participant, a speech-present segment is added to the timeline for the previously detected conference participant. If the location of a conference participant is different from a previously detected location of a near-end conference participant, a still image of the new near-end conference participant is stored and a new timeline is started for the new near-end conference participant.
In a video conference involving a far-end participant, the audio source is a loudspeaker at the near end transmitting a sound from a far-end speech source. The timeline is then associated with the far end, and generating a timeline includes creating a speech-present segment for the far end if a far-end speech source is present. Thus, a user of the invention can identify and access far-end speech segments. Further, if a far-end speech source is involved in the conference, echo can be suppressed by subtracting a block of accumulated far-end loudspeaker data from a block of accumulated near-end microphone array data.
Advantageously, therefore, a video image of a display presented at the conference is captured, and a timeline is generated for the captured video image of the display. This enables the indexing of presentation material as well as sounds produced by conference participants.
The present invention is illustrated in the following figures.
Fig. 1 is a schematic representation of a videoconferencing embodiment using two microphone arrays; Fig. 2 is a block diagram of the computer which performs some of the functions illustrated in Fig. 1; Fig. 3 is an exemplary display showing timelines generated during a videoconference; and Fig. 4 is a flow diagram illustrating operation of the microphone array conference indexing method.
While the description which follows is associated with a videoconference between the local or near-end site and a distant or far-end site, the invention can be used with a single site conference as well.
Referring then to Fig. 1, a videoconference indexing system 10 (shown enclosed by dashed lines) is used to record and index a videoconference having, in this particular embodiment, four conference participants 62,64,66,68 sitting around a table 60 and engaged in a videoconference.
One or more far-end conference participants (not shown) also participate in the conference through the use of a local videoconferencing system 20 connected over a communication channel 16 to a far-end video conferencing system 18. The communication channel 16 connects the far-end video conferencing system to the near-end videoconferencing system 20 and far-end decompressed audio is available to a source locator 22.
Videoconference indexing system 10 includes videoconferencing system 20, a computer 30, and a playback system 50. Videoconferencing system 20 includes a display monitor 21 and a loudspeaker 23 for allowing the far-end conference participant to be seen and heard by conference participants 62, 64, 66, and 68. In an alternative embodiment, the arrangement shown in Fig. 1 is used to record a meeting not in a conference-call mode, eliminating the need for the display monitor 21 and loudspeaker 23 of videoconferencing system 20. System 20 also includes microphone arrays 12, 14 for acquiring sound (for example, participants' speech), the source locator 22 for determining the location of a sound-producing conference participant, and a video camera 24 for capturing video images of the setting and participants as part of a continuous video recording. In one embodiment, source locator 22 is a standalone hardware unit, called "LimeLight", manufactured and sold by PictureTel Corporation, which is a videoconferencing unit having an integrated motorized camera and microphone array. The "LimeLight" locator 22 has a digital signal processing (DSP) integrated circuit which efficiently implements the source locator function, receiving electrical signals representing sound picked up in the room and outputting source location parameters. Further details of the structure and implementation of the "LimeLight" system are described in U.S. 5,778,082, the contents of which are incorporated herein by reference. (In other embodiments of the invention, multiple camera and microphone configurations can be used.)

Alternative methods can be used to fulfill the function of source locator 22. For example, a camera video pattern recognition algorithm can be used to identify the location of an audio source based on mouth movements. In another embodiment of the invention, an infrared motion detector can be used to identify an audio source location, for example to detect a speaker approaching a podium.
Computer 30 includes an audio storage 32 and a video storage 34 for storing audio and video data provided from microphone arrays 12,14 and video camera 24, respectively.
Computer 30 also includes an indexing engine software module 40 whose operations will be discussed in greater detail below.
Referring to Fig. 2, the hardware for computer 30 used to store and process data and computer instructions is shown. In particular, computer 30 includes a processor 31, a memory storage 33, and a working memory 35, all of which are connected by an interface bus 37. Memory storage 33, typically a disk drive, is used for storing the audio and video data provided from microphone arrays 12,14 and camera 24, respectively, and thus includes audio storage 32 and video storage 34. In operation, indexing engine software 40 is loaded into working memory 35, typically RAM, from memory storage 33 so that the computer instructions from the indexing engine can be processed by processor 31. Computer 30 serves as an intermediate storage facility which records, compresses, and combines the audio, video, and indexing information data as the actual conference occurs.
Referring again to Fig. 1, playback system 50 is connected to computer 30 and includes a playback display 52 and a playback server 54, which together allow the recording of the videoconference to be reviewed quickly and accessed at a later time.
Although a more detailed description of the operation is provided below, in general, microphone arrays 12,14 generate signals, in response to sound generated in the videoconference, which are sent to source locator 22.
Source locator 22, in turn, transmits signals representative of the location of a sound source both to a pointing mechanism 26 connected to video camera 24 and to computer 30. These signals are transmitted along lines 27 and 28, respectively. Pointing mechanism 26 includes motors which, in the most general case, control panning, tilting, zooming, and auto-focus functions of the video camera (subsets of these functions can also be used). Further details of pointing mechanism 26 are described in U.S. 5,633,681, incorporated herein by reference. Video camera 24, in response to the signals from source locator 22, is then pointed, by pointing mechanism 26, in the direction of the conference participant who is the current sound source.
Images of the conference participant captured by video camera 24 are stored in video storage 34 as video data, along with an indication of the time which has elapsed from the start of the conference.
Simultaneously, the sound picked up by microphone arrays 12,14 is transmitted to and stored in audio storage 32, also along with the time which has elapsed from the start of the conference until the beginning of each new sound segment. Thus, the elapsed time is stored with each sound segment in audio storage 32. A new sound segment corresponds to each change, determined by the source locator 22, in the detected location of sound source.
In order to minimize storage requirements, both the audio and video data are stored, in this illustrated embodiment, in a compressed format. If further storage minimization is necessary, only those portions of the videoconference during which speech is detected will be stored, and further, if necessary, the video data, other than the conference participant still images, need not be stored.
Although the embodiment illustrated in Fig. 1 uses one camera, more than one camera can be used to capture the video images of conference participants. This approach is especially useful for cases where one participant may block a camera's view of another participant. Alternatively, a separate camera can be dedicated to recording, for example viewgraphs or whiteboard drawings, shown during the course of a conference.
As noted above, audio storage 32 and video storage 34 are both part of computer 30 and the stored audio and video images are available to both the indexing engine 40 and playback system 50. The latter includes the playback display 52 and the playback server 54 as noted above.
Indexing engine 40 associates the stored video images with the stored sound segments based on elapsed time from the start of the conference, and generates a file with indexing information; it indexes the compressed audio and video data using a protocol such as, for example, the AVI format. For long-term storage, audio, video, and indexing information are transmitted from computer 30 to the playback server 54 for access by users of the system. Playback server 54 can retrieve the audio and video data from its own memory when requested by a user. Playback server 54 stores data from the conference in such a way as to make it quickly available to many users on a computer network. In one embodiment, playback server 54 includes many computers, with a library of multimedia files distributed across the computers. A user can access playback server 54, as well as the information generated by the indexing engine 40, by using GUI 45 with a GUI display 47. The playback display terminal 52 is then used to display video data stored in video storage 34 and to play audio data stored in audio storage 32; playback display 52 is also used to display video data and to play audio data stored in playback server 54.
Alternatively, instead of using video images for indexing, an icon is generated based on a still image selected from the continuous video recording. The icon of the conference participant is then associated with the audio segment generated by that participant. Thus the system builds a database index associating, with each identified sound source and its representative icon or image, a sequence of elapsed times and time durations for each instance when the participant was a "sound source".
The elapsed times and the durations can be used to access the stored audio and video as described in detail below.
One feature of the invention is to index conference content using the identification of various sound sources and their locations. In the embodiment shown in Fig. 1, the identification and location of sound sources is achieved by the source locator 22 and the two microphone arrays 12, 14. Each microphone array is a PictureTel "LimeLight" array having four microphones, one microphone positioned at each vertex of an inverted T and one at the intersection of the two linear portions of the "T". In this illustrated embodiment, the inverted-T array has a height of 12 inches and a width of 18 inches. Arrays of this type are described in U.S. Patent 5,778,082 by Chu et al., the contents of which are incorporated herein by reference.
In other embodiments, other microphone array position estimation procedures and microphone array configurations, with different structures and techniques of estimating spatial location, can be used to locate a sound source. For example, a microphone can be situated close to each conference participant, and any microphone with a sufficiently loud signal indicates that the particular person associated with that microphone is speaking.
Accurate time-of-arrival differences of emitted sound in the room are obtained between selected combinations of microphone pairs in each microphone array 12, 14 by the use of a highly modified cross-correlation technique (modified for robustness to room echo and background noise degradation), as described in U.S. 5,778,082. Assuming plane sound waves (the far-field assumption), these pairs of time differences can be translated by source locator 22 into bearing angles from the respective array. The angles provide an estimate of the location of the sound source in three-dimensional space.
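Under the far-field assumption, a single time-difference-of-arrival between a microphone pair maps to a bearing angle: the path-length difference is c·τ = d·cos(θ), so θ = arccos(c·τ/d). A minimal sketch of that translation (the function name and structure are assumptions; the patent's locator combines many such pairs):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

def bearing_from_tdoa(tdoa, mic_spacing):
    """Far-field bearing angle (radians) of a source relative to the axis
    of a microphone pair, given the time-difference-of-arrival `tdoa` (s)
    and the spacing `mic_spacing` between the two microphones (m).

    Plane-wave assumption: c * tdoa = d * cos(theta).
    """
    ratio = SPEED_OF_SOUND * tdoa / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # guard against estimation noise
    return math.acos(ratio)

# A source broadside to the pair arrives at both mics simultaneously:
print(math.degrees(bearing_from_tdoa(0.0, 0.45)))  # 90.0
```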
In the embodiment shown in Fig. 1, the sound is picked up by a microphone array integrated with the sound localization array, so that the microphone arrays serve double duty as both sound localization and sound pick-up apparatus. However, in other embodiments, one microphone or microphone array can be used for recording while another microphone or microphone array can be used for sound localization.
Although two microphone arrays 12,14 are shown in use with videoconferencing indexing system 10, only one array is required. In other embodiments, the number and configurations of microphone arrays may vary, for example, from one microphone to many. Using more than one array provides advantages. In particular, while the azimuth and elevation angles provided by each of arrays 12,14 are highly accurate and are estimated to within a fraction of a degree, range estimates are not nearly as accurate. Even though the range error is higher, however, the information is sufficient for use with pointing mechanism 26.
However, the larger range estimation error of the microphone arrays gives rise to sound source ambiguity problems for a single microphone array. Thus, with reference to Fig. 1, microphone array 12 might view persons 66,68 as the same person, since their difference in range to microphone array 12 might be less than the range error of array 12. To address this problem, source localization estimates from microphone array 14 could be used by source locator 22 as a second source of information to separate persons 66 and 68, since persons 66 and 68 are separated substantially in azimuth angle from the viewpoint of microphone array 14.
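The way a second array resolves this ambiguity can be illustrated by intersecting the two bearing estimates, here reduced to two dimensions. This is only an illustrative sketch under stated assumptions; the patent's locator works in three dimensions with pan, tilt, and range:

```python
import math

def intersect_bearings(p1, theta1, p2, theta2):
    """Intersect two bearing rays (2-D, angles in radians from the +x axis,
    array positions p1 and p2) to estimate a source position.
    Returns (x, y), or None if the bearings are (nearly) parallel."""
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None  # parallel rays carry no range information
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# Two arrays 4 m apart, both sighting a talker at (2, 2):
print(intersect_bearings((0, 0), math.atan2(2, 2),
                         (4, 0), math.atan2(2, -2)))  # approximately (2.0, 2.0)
```

Because azimuth from each array is accurate to a fraction of a degree, the intersection point pins down the range far better than either array's own range estimate.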
An alternative approach to indexing by sound source location is to use manual camera position commands such as pan/tilt commands and presets to index the meeting. These commands in general may indicate a change in content whereby a change in camera position is indicative of a change in sound source location.
Fig. 3 shows an example of a display 80, viewed on GUI display 47 (Fig. 1), resulting from a videoconference.
The following features, included in the display 80, indicate to a user of system 10 exactly who was speaking and when that person spoke. Horizontal axis 99 is a time scale representing the actual time during the recorded conference.
Pictures of conference participants appear along the vertical axis of display 80. Indexing engine 40 (Fig. 1) selects and extracts from video storage 34 pictures 81, 83, 85 of conference participants 62, 64, 66 on the basis of the elapsed time from the start of the conference and the beginning of new sound segments. These pictures represent the conference participant(s) producing the sound segment(s). Pictures 81, 83, 85 are single still frames from a continuous video recording captured by video camera 24 and stored in video storage 34. A key criterion for the selection of images is the elapsed time from the start of the conference to the beginning of each respective sound segment: the pictures selected for the timeline are the ones captured at the same elapsed time as the beginning of each respective sound segment.
Display 80 includes a picture 87, denoting a far-end conference participant. This image, too, is selected by the indexing engine 40. It can be an image of the far-end conference participant, if images from a far-end camera are available. Alternatively, it can be an image of a logo, a photograph, etc., captured by a near-end camera.
Display 80 also includes a block 89 representing, for example, data presented by one of the conference participants at the conference. Data content can be recorded by use of an electronic viewgraph display system (not shown) which provides signals to videoconferencing system 20. Alternatively, a second camera can be used to record slides presented with a conventional viewgraph. The slides, greatly reduced in size, would then form part of display 80.
Associated with each picture 81, 83, 85, 87 and block 89 are line segments representing when sound corresponding to each respective picture occurred. For example, segments 90, 92, 92', and 94 represent the duration of sound produced by three conference participants, e.g., 62, 64, and 66 of Fig. 1. Segment 96 represents sounds produced by a far-end conference participant (not shown in Fig. 1).
Segments 97 and 98, on the other hand, show when data content was displayed during the presentation and show a representation of the data content. The segments may be different colors, with different meaning assigned to each color. For example, a blue line could represent a near-end sound source, and a red line could represent a far-end sound source. In essence, the pictures and blocks, together with the segments, provide a series of timelines for each conference participant and presented data block.
In display 80, the content of what each person 62, 64,66 said is not presented, but this information can, if desired, be filled in after-the-fact by manual annotation, such as a note on the display 80 through the GUI 45 at each speech segment 90,92,92', and 94.
A user can view display 80 using GUI 45, GUI display 47, and playback display 52. In particular, the user can click a mouse or other input device (for example, a trackball or cursor control keys on a keyboard) on any point in segments 90,92,92', 94,96,97, and 98 in the display 80 to access and playback or display that portion of the stored conference file.
A flow diagram of a method 100, according to the invention, is presented in Fig. 4. Method 100 of Fig. 4 is generic to system operation and could be applied to a wide variety of different microphone array configurations. With reference also to Figs. 1-3, the operation of the system will be described.
In operation, audio is simultaneously acquired from both the far end and the near end of a videoconference.
From the far end, audio is continuously acquired for successive preselected durations of time as it is received by videoconferencing system 20 (step 101). Audio received from the far-end videoconferencing system 18 is thus directed to the source locator 22 (step 102). The source locator analyzes the frequency components of the far-end audio signals. The onset of a new segment is characterized by (i) the magnitude of a particular frequency component being greater than the background noise for that frequency and (ii) the magnitude of a particular frequency component being greater than the magnitude of the same component acquired during a predetermined number of preceding time frames. If speech is present, an audio segment (e.g., segment 96 in Fig. 3) is begun (step 103) for the timeline corresponding to audio produced by the far-end conference participant(s).
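The two onset conditions might be sketched as follows. This is a simplification with hypothetical names; the patent gives no thresholds, frame sizes, or bin counts:

```python
def segment_onset(spectrum, noise_floor, history, margin=2.0):
    """Return True if this frame marks the onset of a new speech segment.

    spectrum:    per-bin magnitudes of the current frame
    noise_floor: per-bin background-noise magnitude estimates
    history:     spectra of a predetermined number of preceding frames
    A bin signals onset when it is (i) well above the noise floor and
    (ii) larger than the same bin in every frame of the recent history.
    """
    for k, mag in enumerate(spectrum):
        above_noise = mag > margin * noise_floor[k]
        above_history = all(mag > past[k] for past in history)
        if above_noise and above_history:
            return True
    return False

quiet = [1.0, 1.0, 1.0]
noise = [1.0, 1.0, 1.0]
loud = [1.0, 5.0, 1.0]  # energy appears in the middle bin
print(segment_onset(loud, noise, history=[quiet, quiet]))   # True
print(segment_onset(quiet, noise, history=[quiet, quiet]))  # False
```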
An audio segment is continued for the timeline, corresponding to a far-end conference participant, if speech continues to be present at the far-end and there has been no temporal interruption since the beginning of the previously started audio segment.
While the preselected durations of far-end audio are being acquired (step 101) and analyzed, the system simultaneously acquires successive N second durations of audio from microphone arrays 12,14 (step 104). Because the audio from the far-end site can interfere with near-end detection of audio in the room, the far-end signal received through the microphone arrays is suppressed by the subtraction of a block of N second durations of far-end audio from the acquired near-end audio (step 105). In this way, false sound localization of the loudspeaker as a "person" (audio source) will not occur. Echo suppression will not affect a signal resulting from two near-end participants speaking simultaneously. In this case, the sound locator locates both participants, locates the stronger of the two, or does nothing.
Echo suppression can be implemented with adaptive filters, or by use of a bandpass filter bank (not shown) with band-by-band gating (setting to zero those bands with significant far-end energy, allowing processing to occur only on bands with far-end energy near the far-end background noise level), as is well known to those skilled in the art. Methods for achieving both adaptive filtering and echo suppression are described in U.S. 5,305,307 by Chu, the contents of which are incorporated herein by reference.
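A toy sketch of the band-by-band gating idea (the names and margin factor are assumptions; a real filter bank would process many bands per frame):

```python
def gate_bands(near_bands, far_bands, far_noise_floor, margin=2.0):
    """Zero out near-end filter-bank bands that carry significant far-end
    energy, so that only bands where the far end is near its background
    noise level are passed on to the source locator."""
    return [near if far <= margin * noise else 0.0
            for near, far, noise in zip(near_bands, far_bands, far_noise_floor)]

near = [4.0, 3.0, 5.0]
far = [0.5, 9.0, 0.4]     # strong far-end energy in the middle band
noise = [0.5, 0.5, 0.5]
print(gate_bands(near, far, noise))  # [4.0, 0.0, 5.0]
```

Gating trades some near-end signal for robustness: the zeroed band loses any near-end speech it carried, but the loudspeaker can no longer be falsely localized as a talker.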
The detection and location of speech of a near-end source is determined (step 106) using source locator 22 and microphone arrays 12, 14. If speech is detected, then source locator 22 estimates the spatial location of the speech source (step 107). Further details of the manner in which source location is accomplished are described in U.S. 5,778,082. This method involves estimating the time delay between signals arriving at a pair of microphones from a common source. As described in connection with the far-end audio analysis, a near-end speech source is detected if the magnitude of a frequency component is significantly greater than the background noise for that frequency, and if the magnitude of the frequency component is greater than that acquired for that frequency in a predetermined number of preceding time frames. The fulfillment of both conditions signifies the start of a speech segment from a particular speech source. A speech source location is calculated by comparing the time delay of the signals received at the microphone arrays 12, 14, as determined by source locator 22.
Indexing engine 40 compares the newly derived source location parameters (step 107) to the parameters of previously detected sources (step 108). Due to errors in estimation and small movements of the person speaking, the new source location parameters may differ slightly from previously estimated parameters of the same person. If the difference between the location parameters for the new source and an old source is small enough, it is assumed that a previously detected source (person) is audible (speaking) again, and the speech segment in his/her timeline is simply extended or reinstated (step 111).
The difference thresholds for the location parameters according to one particular embodiment of the invention are:

1. If the range of both of the two sources (previously detected and current) is less than 2 meters, then it is determined that a new source is audible if: the pan angle difference is greater than 12 degrees, or the tilt angle difference is greater than 4 degrees, or the range difference is greater than 0.5 meters.

2. If the range of either of the two sources is greater than 2 meters but less than 3.5 meters, then it is determined that a new source is audible if: the pan angle difference is greater than 9 degrees, or the tilt angle difference is greater than 3 degrees, or the range difference is greater than 0.75 meters.

3. If the range of either of the two sources is greater than 3.5 meters, then it is determined that a new source is audible if: the pan angle difference is greater than 6 degrees, or the tilt angle difference is greater than 2 degrees, or the range difference is greater than 1 meter.
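These three threshold rules can be collected into a single decision function. The sketch below follows the numbers above, but the function name and the (pan, tilt, range) tuple convention are assumptions:

```python
def is_new_source(prev, curr):
    """Decide whether a newly located source differs from a previously
    detected one, using the range-dependent thresholds given above.
    `prev` and `curr` are (pan_deg, tilt_deg, range_m) tuples."""
    pan_diff = abs(curr[0] - prev[0])
    tilt_diff = abs(curr[1] - prev[1])
    range_diff = abs(curr[2] - prev[2])
    max_range = max(prev[2], curr[2])

    if max_range < 2.0:            # both sources within 2 m
        limits = (12.0, 4.0, 0.5)
    elif max_range < 3.5:          # either source between 2 m and 3.5 m
        limits = (9.0, 3.0, 0.75)
    else:                          # either source beyond 3.5 m
        limits = (6.0, 2.0, 1.0)

    return (pan_diff > limits[0] or tilt_diff > limits[1]
            or range_diff > limits[2])

# Small drift at close range: treated as the same talker
print(is_new_source((0.0, 0.0, 1.5), (5.0, 1.0, 1.6)))   # False
# Large pan change at long range: treated as a new talker
print(is_new_source((0.0, 0.0, 4.0), (10.0, 0.0, 4.0)))  # True
```

Note that the thresholds tighten with distance, consistent with the text: angular separation shrinks as talkers sit farther from the array, so smaller angle differences must already count as a new source.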
Video camera 24, according to this embodiment of the invention, is automatically pointed, in response to the determined location, at the current or most recent sound source. Thus, during a meeting, a continuous video recording can be made of each successive speaker. Indexing engine 40, based on correlating the elapsed times for the video images and sound segments, extracts still images from the video for purposes of providing images to be shown on GUI display 47 to allow the user to visually identify the person associated with a timeline (step 109). A new segment of data storage is begun for each new speaker (step 110).
Alternatively, a continuous video recording of the meeting can be sampled after the meeting is over, and still video images, such as pictures 81, 83, and 85 of the participants, can be extracted by the indexing engine 40 from the continuous stored video recording.
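Correlating elapsed times between the stored video and the speech segments, as the indexing engine does when extracting stills, amounts to a nearest-timestamp lookup. A hypothetical sketch; the list-based data layout is an assumption for illustration.

```python
import bisect

def extract_stills(frame_times, segment_starts):
    """For each speech segment's start time (seconds of elapsed
    conference time), return the index of the stored video frame whose
    timestamp is closest -- the frame used as that speaker's still image.
    """
    stills = []
    for t in segment_starts:
        i = bisect.bisect_left(frame_times, t)
        # choose the nearer of the two neighboring frames
        if i > 0 and (i == len(frame_times) or
                      t - frame_times[i - 1] <= frame_times[i] - t):
            i -= 1
        stills.append(i)
    return stills

frames = [0.0, 0.5, 1.0, 1.5, 2.0]           # frame timestamps
print(extract_stills(frames, [0.6, 1.9]))    # [1, 4]
```

Because `frame_times` is sorted, each lookup is a binary search, so sampling stills after the meeting remains cheap even for a long recording.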
Occasionally, a person may change his position during a conference. The method of Fig. 4 treats the new position of the person as a new speaker. By using video pattern recognition and/or speaker audio identification techniques, however, the new speaker can be identified as being one of the old speakers who has moved. When such a positive identification occurs, the new speaker timeline (including, for example, images and sound segments, 85 and 94 in Fig. 3) can be merged with the original timeline for the speaker. Techniques of video-based tracking are discussed in a co-pending patent application (Serial No.
09/79840, filed May 15, 1998), assigned to the assignee of the present invention, the contents of which are hereby incorporated by reference. The co-pending application describes the combination of video with audio techniques for autopositioning the camera.
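Merging a moved speaker's timeline back into the original can be sketched as combining the two segment lists under one representative still image. The dict layout and file names below are illustrative assumptions, not structures from the patent.

```python
def merge_timelines(original, duplicate):
    """Fold a re-identified speaker's timeline (created when the person
    moved to a new position) back into the original timeline.

    A timeline here is a dict holding a representative still image and
    a list of (start, end) speech segments in elapsed seconds.
    """
    merged = dict(original)  # keep the original speaker's still image
    merged["segments"] = sorted(original["segments"] + duplicate["segments"])
    return merged

a = {"image": "speaker3.jpg", "segments": [(0.0, 12.5), (40.0, 55.0)]}
b = {"image": "speaker7.jpg", "segments": [(70.0, 82.0)]}
print(merge_timelines(a, b)["segments"])
# [(0.0, 12.5), (40.0, 55.0), (70.0, 82.0)]
```

After the merge, the duplicate timeline would be dropped from the display so the GUI shows a single row for that participant.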
In some cases, more than one conference participant may appear in a still image. The timeline can also indicate any other conference participant who appears in the still image (for example, a neighbor sitting in close proximity to the speaker) but is silent at the particular elapsed time, thus giving a comprehensive overview of the sounds produced by all conference participants, as well as helping identify all persons present in the still images.
Conference data can also be indexed for a multipoint conference in which more than two sites engage in a conference together. In this multipoint configuration, microphone arrays at each site can send indexing information for the stream of video/audio/data content from that site to a central computer for storage and display.
Additions, deletions, and other modifications of the described embodiments will be apparent to those practiced in this field and are within the scope of the following claims.

Claims (26)

Claims
1. A method for indexing the content of a conference with at least one participant, said method comprising: recording an audio recording of the conference; identifying a conference participant producing a sound; capturing a still image of the identified conference participant; correlating the still image of the conference participant to at least one audio segment portion of the audio recording, said at least one segment corresponding to the sound produced by the identified conference participant; and generating a timeline by creating at least one speech-present segment representing the correlated still image and associated at least one audio segment.
2. The method claimed in claim 1, further comprising: displaying the timeline on a display monitor; and accessing the timeline displayed on the monitor using a graphical user interface (GUI).
3. The method claimed in claim 2, wherein capturing the still image includes making a video recording of the conference and capturing a video image of the conference participant producing the sound from a segment of the associated video recording of the conference, and further comprising: using the GUI to select a portion of a specific audio segment for replaying portions of the audio and video recordings on a playback monitor.
4. The method of claim 1, wherein capturing the still image comprises capturing a video image of the conference participant producing the sound from a segment of an associated continuous video recording of the conference.
5. The method of claim 1, further comprising using a video camera to capture the still video image.
6. The method of claim 1 wherein identifying the conference participant is based on identifying the location of the participant.
7. The method of claim 6, wherein identifying the conference participant includes using a microphone array.
8. The method of claim 1, further comprising: storing time elapsed from a start of the conference with the audio segment and the still image, wherein the timeline is generated by an indexing engine matching the elapsed time associated with the audio segment and the still image.
9. The method of claim 1, further comprising: identifying a plurality of conference participants; capturing a still image of each one of the plurality of conference participants; storing a time elapsed from a start of the conference indicating the time of the capturing of each still image; and storing a time elapsed from a start of the conference in association with the audio recording each time a change in audio source location is identified, wherein generating a timeline includes indicating for each identified conference participant the particular elapsed times from the start of the conference during which the particular participant was speaking, and wherein generating the timeline includes indicating any other conference participant who also appears in the video image and is silent at the particular elapsed time.
10. The method of claim 9, wherein a conference participant has been previously identified and wherein a speech-present segment is added to the timeline for the previously detected conference participant when the participant speaks.
11. The method of claim 10, wherein each identified conference participant is a near-end conference participant.
12. The method of claim 11, wherein identifying each near-end conference participant is based on location.
13. The method of claim 12, wherein a still image of a new near-end conference participant is identified and a new timeline is started for the new near-end conference participant, if the location of the new near-end conference participant is different from previously detected locations of the other identified near-end conference participant.
14. The method of claim 1, wherein the audio source is a far-end loudspeaker transmitting a sound from a far-end speech source, wherein the timeline is a far-end timeline, and wherein generating the far-end timeline includes creating a speech-present segment on the far-end timeline if a far-end speech source is present.
15. The method of claim 14, further comprising: accumulating a block of far-end loudspeaker microphone array data; accumulating a block of near-end microphone array data; and suppressing echo by subtracting accumulated far-end loudspeaker data from accumulated near-end microphone array data.
16. The method of claim 1, further comprising: capturing a video image of a display presented at the conference; and generating a timeline for the captured video image of the display.
17. The method of claim 1, wherein the generated timeline is color-coded.
18. A system for indexing the content of a conference with at least one participant, said system comprising: a sound recording mechanism which records sound created by a conference participant; at least one source locator for identifying the location of a conference participant, wherein the source locator generates signals corresponding to the location of the conference participant; a camera assembly including a camera and a camera movement device which, in response to the signals generated by said source locator, moves the camera to point at the conference participant; an image capture unit for capturing an image of the conference participant; an image storage device for storing images captured by said image capture unit; a processor for associating the image captured by the camera to the sound recorded by the sound recording mechanism and for creating a timeline comprising images and indicators of presence of associated sound; and a graphical user interface which allows access to the stored sound, images, and timeline.
19. The system of claim 18, wherein the sound locator uses at least one microphone array.
20. The system of claim 18, wherein the sound locator uses a plurality of microphones.
21. The system of claim 18, wherein the sound locator comprises a plurality of microphone arrays.
22. A system for indexing the content of a conference with at least one participant, said system comprising: means for recording an audio recording of the conference; means for identifying each conference participant producing a sound; means for capturing a still image of each identified conference participant; and means for associating the still image of each identified conference participant to at least one audio segment portion of the audio recording corresponding to the sound produced by such conference participant.
23. A method for presenting an audio index database representation of a conference, comprising: generating a plurality of participant timelines, each timeline having at least one speech-present segment representing a correlated still image and at least one associated audio segment; enabling a user to identify any of the segments representing audio desired; and playing back the identified segment.
24. A method for indexing the content of a conference with at least one participant substantially as herein described with reference to Figures 1 to 4.
25. A system for indexing the content of a conference with at least one participant substantially as herein described and shown with reference to Figures 1 to 4.
26. A method for presenting an audio index database representation of a conference substantially as herein described with reference to Figures 1 to 4.
GB9916394A1998-10-141999-07-13Method and apparatus for indexing conference contentExpired - Fee RelatedGB2342802B (en)

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
US17346298A1998-10-141998-10-14

Publications (3)

Publication NumberPublication Date
GB9916394D0 GB9916394D0 (en)1999-09-15
GB2342802Atrue GB2342802A (en)2000-04-19
GB2342802B GB2342802B (en)2003-04-16

Family

ID=22632148

Family Applications (1)

Application NumberTitlePriority DateFiling Date
GB9916394AExpired - Fee RelatedGB2342802B (en)1998-10-141999-07-13Method and apparatus for indexing conference content

Country Status (2)

CountryLink
JP (1)JP2000125274A (en)
GB (1)GB2342802B (en)



Citations (6)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
JPS60205151A (en)*1984-03-291985-10-16Toshiba Electric Equip Corp solar tracking device
EP0660249A1 (en)*1993-12-271995-06-28AT&T Corp.Table of contents indexing system
WO1997001932A1 (en)*1995-06-271997-01-16At & T Corp.Method and apparatus for recording and indexing an audio and multimedia conference
US5717869A (en)*1995-11-031998-02-10Xerox CorporationComputer controlled display system using a timeline to control playback of temporal data representing collaborative activities
US5729741A (en)*1995-04-101998-03-17Golden Enterprises, Inc.System for storage and retrieval of diverse types of information obtained from different media sources which includes video, audio, and text transcriptions
US5786814A (en)*1995-11-031998-07-28Xerox CorporationComputer controlled display system activities using correlated graphical and timeline interfaces for controlling replay of temporal data representing collaborative activities

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
JPH03162187A (en)*1989-11-211991-07-12Mitsubishi Electric Corp video conferencing equipment
JP3266959B2 (en)*1993-01-072002-03-18富士ゼロックス株式会社 Electronic conference system
JPH06266632A (en)*1993-03-121994-09-22Toshiba CorpMethod and device for processing information of electronic conference system
US5778082A (en)*1996-06-141998-07-07Picturetel CorporationMethod and apparatus for localization of an acoustic source
JPH10145763A (en)*1996-11-151998-05-29Mitsubishi Electric Corp Conference system


US8560309B2 (en)2009-12-292013-10-15Apple Inc.Remote conferencing center
US10679605B2 (en)2010-01-182020-06-09Apple Inc.Hands-free list-reading by intelligent automated assistant
US10705794B2 (en)2010-01-182020-07-07Apple Inc.Automatically adapting user interfaces for hands-free interaction
US8903716B2 (en)2010-01-182014-12-02Apple Inc.Personalized vocabulary for digital assistant
US10706841B2 (en)2010-01-182020-07-07Apple Inc.Task flow identification based on user intent
US8892446B2 (en)2010-01-182014-11-18Apple Inc.Service orchestration for intelligent automated assistant
US12087308B2 (en)2010-01-182024-09-10Apple Inc.Intelligent automated assistant
US10496753B2 (en)2010-01-182019-12-03Apple Inc.Automatically adapting user interfaces for hands-free interaction
US9318108B2 (en)2010-01-182016-04-19Apple Inc.Intelligent automated assistant
US10553209B2 (en)2010-01-182020-02-04Apple Inc.Systems and methods for hands-free notification summaries
US9548050B2 (en)2010-01-182017-01-17Apple Inc.Intelligent automated assistant
US10276170B2 (en)2010-01-182019-04-30Apple Inc.Intelligent automated assistant
US11423886B2 (en)2010-01-182022-08-23Apple Inc.Task flow identification based on user intent
US10607141B2 (en)2010-01-252020-03-31Newvaluexchange Ltd.Apparatuses, methods and systems for a digital conversation management platform
US9424862B2 (en)2010-01-252016-08-23Newvaluexchange LtdApparatuses, methods and systems for a digital conversation management platform
US9431028B2 (en)2010-01-252016-08-30Newvaluexchange LtdApparatuses, methods and systems for a digital conversation management platform
US8977584B2 (en)2010-01-252015-03-10Newvaluexchange Global Ai LlpApparatuses, methods and systems for a digital conversation management platform
US10984327B2 (en)2010-01-252021-04-20New Valuexchange Ltd.Apparatuses, methods and systems for a digital conversation management platform
US10984326B2 (en)2010-01-252021-04-20Newvaluexchange Ltd.Apparatuses, methods and systems for a digital conversation management platform
US9424861B2 (en)2010-01-252016-08-23Newvaluexchange LtdApparatuses, methods and systems for a digital conversation management platform
US10607140B2 (en)2010-01-252020-03-31Newvaluexchange Ltd.Apparatuses, methods and systems for a digital conversation management platform
US12307383B2 (en)2010-01-252025-05-20Newvaluexchange Global Ai LlpApparatuses, methods and systems for a digital conversation management platform
US11410053B2 (en)2010-01-252022-08-09Newvaluexchange Ltd.Apparatuses, methods and systems for a digital conversation management platform
US9633660B2 (en)2010-02-252017-04-25Apple Inc.User profiling for voice input processing
US10049675B2 (en)2010-02-252018-08-14Apple Inc.User profiling for voice input processing
US8452037B2 (en)2010-05-052013-05-28Apple Inc.Speaker clip
US9386362B2 (en)2010-05-052016-07-05Apple Inc.Speaker clip
US10063951B2 (en)2010-05-052018-08-28Apple Inc.Speaker clip
US8866867B2 (en)2010-09-152014-10-21Zte CorporationMethod and apparatus for video recording in video calls
EP2557778A4 (en)*2010-09-152014-01-15Zte CorpMethod and apparatus for video recording in video calls
US8644519B2 (en)2010-09-302014-02-04Apple Inc.Electronic devices with improved audio
US10762293B2 (en)2010-12-222020-09-01Apple Inc.Using parts-of-speech tagging and named entity recognition for spelling correction
GB2486793A (en)*2010-12-232012-06-27Samsung Electronics Co LtdIdentifying a speaker via mouth movement and generating a still image
US8687076B2 (en)2010-12-232014-04-01Samsung Electronics Co., Ltd.Moving image photographing method and moving image photographing apparatus
GB2486793B (en)*2010-12-232017-12-20Samsung Electronics Co LtdMoving image photographing method and moving image photographing apparatus
US9262612B2 (en)2011-03-212016-02-16Apple Inc.Device access using voice authentication
US10102359B2 (en)2011-03-212018-10-16Apple Inc.Device access using voice authentication
US8811648B2 (en)2011-03-312014-08-19Apple Inc.Moving magnet audio transducer
US9225701B2 (en)2011-04-182015-12-29Intelmate LlcSecure communication systems and methods
US10032066B2 (en)2011-04-182018-07-24Intelmate LlcSecure communication systems and methods
US9674625B2 (en)2011-04-182017-06-06Apple Inc.Passive proximity detection
US9007871B2 (en)2011-04-182015-04-14Apple Inc.Passive proximity detection
US11120372B2 (en)2011-06-032021-09-14Apple Inc.Performing actions associated with task items that represent tasks to perform
US10241644B2 (en)2011-06-032019-03-26Apple Inc.Actionable reminder entries
US10706373B2 (en)2011-06-032020-07-07Apple Inc.Performing actions associated with task items that represent tasks to perform
US10057736B2 (en)2011-06-032018-08-21Apple Inc.Active transport based notifications
US10402151B2 (en)2011-07-282019-09-03Apple Inc.Devices with enhanced audio
US10771742B1 (en)2011-07-282020-09-08Apple Inc.Devices with enhanced audio
US9798393B2 (en)2011-08-292017-10-24Apple Inc.Text correction processing
US8989428B2 (en)2011-08-312015-03-24Apple Inc.Acoustic systems in electronic devices
US10241752B2 (en)2011-09-302019-03-26Apple Inc.Interface for a virtual digital assistant
US10284951B2 (en)2011-11-222019-05-07Apple Inc.Orientation-based audio
US9020163B2 (en)2011-12-062015-04-28Apple Inc.Near-field null and beamforming
US8903108B2 (en)2011-12-062014-12-02Apple Inc.Near-field null and beamforming
EP2709357A4 (en)*2012-01-162014-11-12Huawei Tech Co LtdConference recording method and conference system
US10134385B2 (en)2012-03-022018-11-20Apple Inc.Systems and methods for name pronunciation
US9483461B2 (en)2012-03-062016-11-01Apple Inc.Handling speech synthesis of content for multiple languages
WO2013169621A1 (en)*2012-05-112013-11-14Qualcomm IncorporatedAudio user interaction recognition and context refinement
US9736604B2 (en)2012-05-112017-08-15Qualcomm IncorporatedAudio user interaction recognition and context refinement
US10073521B2 (en)2012-05-112018-09-11Qualcomm IncorporatedAudio user interaction recognition and application interface
WO2013169618A1 (en)*2012-05-112013-11-14Qualcomm IncorporatedAudio user interaction recognition and context refinement
US9746916B2 (en)2012-05-112017-08-29Qualcomm IncorporatedAudio user interaction recognition and application interface
US9953088B2 (en)2012-05-142018-04-24Apple Inc.Crowd sourcing information to fulfill user requests
US10079014B2 (en)2012-06-082018-09-18Apple Inc.Name recognition system
US9495129B2 (en)2012-06-292016-11-15Apple Inc.Device, method, and user interface for voice-activated navigation and browsing of a document
US9576574B2 (en)2012-09-102017-02-21Apple Inc.Context-sensitive handling of interruptions by intelligent digital assistant
US9971774B2 (en)2012-09-192018-05-15Apple Inc.Voice-based media searching
US9820033B2 (en)2012-09-282017-11-14Apple Inc.Speaker assembly
US8858271B2 (en)2012-10-182014-10-14Apple Inc.Speaker interconnect
US9357299B2 (en)2012-11-162016-05-31Apple Inc.Active protection for acoustic device
US8942410B2 (en)2012-12-312015-01-27Apple Inc.Magnetically biased electromagnet for audio applications
US10199051B2 (en)2013-02-072019-02-05Apple Inc.Voice trigger for a digital assistant
US10978090B2 (en)2013-02-072021-04-13Apple Inc.Voice trigger for a digital assistant
US11499255B2 (en)2013-03-132022-11-15Apple Inc.Textile product having reduced density
US9368114B2 (en)2013-03-142016-06-14Apple Inc.Context-sensitive handling of interruptions
US9922642B2 (en)2013-03-152018-03-20Apple Inc.Training an at least partial voice command system
US9697822B1 (en)2013-03-152017-07-04Apple Inc.System and method for updating an adaptive speech recognition model
US9582608B2 (en)2013-06-072017-02-28Apple Inc.Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9620104B2 (en)2013-06-072017-04-11Apple Inc.System and method for user-specified pronunciation of words for speech synthesis and recognition
US9633674B2 (en)2013-06-072017-04-25Apple Inc.System and method for detecting errors in interactions with a voice-based digital assistant
US9966060B2 (en)2013-06-072018-05-08Apple Inc.System and method for user-specified pronunciation of words for speech synthesis and recognition
US9966068B2 (en)2013-06-082018-05-08Apple Inc.Interpreting and acting upon commands that involve sharing information with remote devices
US10657961B2 (en)2013-06-082020-05-19Apple Inc.Interpreting and acting upon commands that involve sharing information with remote devices
US10185542B2 (en)2013-06-092019-01-22Apple Inc.Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10176167B2 (en)2013-06-092019-01-08Apple Inc.System and method for inferring user intent from speech inputs
US9300784B2 (en)2013-06-132016-03-29Apple Inc.System and method for emergency calls initiated by voice command
US10791216B2 (en)2013-08-062020-09-29Apple Inc.Auto-activating smart responses based on activities from remote devices
US10063977B2 (en)2014-05-122018-08-28Apple Inc.Liquid expulsion from an orifice
US9620105B2 (en)2014-05-152017-04-11Apple Inc.Analyzing audio input for efficient speech and music recognition
US10592095B2 (en)2014-05-232020-03-17Apple Inc.Instantaneous speaking of content on touch devices
US9502031B2 (en)2014-05-272016-11-22Apple Inc.Method for supporting dynamic grammars in WFST-based ASR
US10289433B2 (en)2014-05-302019-05-14Apple Inc.Domain specific language for encoding assistant dialog
US9842101B2 (en)2014-05-302017-12-12Apple Inc.Predictive conversion of language input
US10497365B2 (en)2014-05-302019-12-03Apple Inc.Multi-command single utterance input method
US10169329B2 (en)2014-05-302019-01-01Apple Inc.Exemplar-based natural language processing
US10170123B2 (en)2014-05-302019-01-01Apple Inc.Intelligent assistant for home automation
US10083690B2 (en)2014-05-302018-09-25Apple Inc.Better resolution when referencing to concepts
US9715875B2 (en)2014-05-302017-07-25Apple Inc.Reducing the need for manual start/end-pointing and trigger phrases
US9760559B2 (en)2014-05-302017-09-12Apple Inc.Predictive text input
US9430463B2 (en)2014-05-302016-08-30Apple Inc.Exemplar-based natural language processing
US10078631B2 (en)2014-05-302018-09-18Apple Inc.Entropy-guided text prediction using combined word and character n-gram language models
US9633004B2 (en)2014-05-302017-04-25Apple Inc.Better resolution when referencing to concepts
US9734193B2 (en)2014-05-302017-08-15Apple Inc.Determining domain salience ranking from ambiguous words in natural speech
US11257504B2 (en)2014-05-302022-02-22Apple Inc.Intelligent assistant for home automation
US11133008B2 (en)2014-05-302021-09-28Apple Inc.Reducing the need for manual start/end-pointing and trigger phrases
US9966065B2 (en)2014-05-302018-05-08Apple Inc.Multi-command single utterance input method
US9785630B2 (en)2014-05-302017-10-10Apple Inc.Text prediction using combined word N-gram and unigram language models
US10904611B2 (en)2014-06-302021-01-26Apple Inc.Intelligent automated assistant for TV user interactions
US9668024B2 (en)2014-06-302017-05-30Apple Inc.Intelligent automated assistant for TV user interactions
US9338493B2 (en)2014-06-302016-05-10Apple Inc.Intelligent automated assistant for TV user interactions
US10659851B2 (en)2014-06-302020-05-19Apple Inc.Real-time digital assistant knowledge updates
US10446141B2 (en)2014-08-282019-10-15Apple Inc.Automatic speech recognition based on user feedback
US10431204B2 (en)2014-09-112019-10-01Apple Inc.Method and apparatus for discovering trending terms in speech requests
US9818400B2 (en)2014-09-112017-11-14Apple Inc.Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en)2014-09-122020-09-29Apple Inc.Dynamic thresholds for always listening speech trigger
US10074360B2 (en)2014-09-302018-09-11Apple Inc.Providing an indication of the suitability of speech recognition
US10127911B2 (en)2014-09-302018-11-13Apple Inc.Speaker identification and unsupervised speaker adaptation techniques
US9986419B2 (en)2014-09-302018-05-29Apple Inc.Social reminders
US9886432B2 (en)2014-09-302018-02-06Apple Inc.Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US9668121B2 (en)2014-09-302017-05-30Apple Inc.Social reminders
US9646609B2 (en)2014-09-302017-05-09Apple Inc.Caching apparatus for serving phonetic pronunciations
US10362403B2 (en)2014-11-242019-07-23Apple Inc.Mechanically actuated panel acoustic system
US9525943B2 (en)2014-11-242016-12-20Apple Inc.Mechanically actuated panel acoustic system
US10552013B2 (en)2014-12-022020-02-04Apple Inc.Data detection
US11556230B2 (en)2014-12-022023-01-17Apple Inc.Data detection
US9711141B2 (en)2014-12-092017-07-18Apple Inc.Disambiguating heteronyms in speech synthesis
US9865280B2 (en)2015-03-062018-01-09Apple Inc.Structured dictation using intelligent automated assistants
US11087759B2 (en)2015-03-082021-08-10Apple Inc.Virtual assistant activation
US10311871B2 (en)2015-03-082019-06-04Apple Inc.Competing devices responding to voice triggers
US9886953B2 (en)2015-03-082018-02-06Apple Inc.Virtual assistant activation
US10567477B2 (en)2015-03-082020-02-18Apple Inc.Virtual assistant continuity
US9721566B2 (en)2015-03-082017-08-01Apple Inc.Competing devices responding to voice triggers
US9899019B2 (en)2015-03-182018-02-20Apple Inc.Systems and methods for structured stem and suffix language models
US9842105B2 (en)2015-04-162017-12-12Apple Inc.Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en)2015-05-272018-09-25Apple Inc.Device voice control for selecting a displayed affordance
US10127220B2 (en)2015-06-042018-11-13Apple Inc.Language identification from short strings
US10101822B2 (en)2015-06-052018-10-16Apple Inc.Language input correction
US10356243B2 (en)2015-06-052019-07-16Apple Inc.Virtual assistant aided communication with 3rd party service in a communication session
US11025565B2 (en)2015-06-072021-06-01Apple Inc.Personalized prediction of responses for instant messaging
US10255907B2 (en)2015-06-072019-04-09Apple Inc.Automatic accent detection using acoustic models
US10186254B2 (en)2015-06-072019-01-22Apple Inc.Context-based endpoint detection
US9900698B2 (en)2015-06-302018-02-20Apple Inc.Graphene composite acoustic diaphragm
US10671428B2 (en)2015-09-082020-06-02Apple Inc.Distributed personal assistant
US11500672B2 (en)2015-09-082022-11-15Apple Inc.Distributed personal assistant
US10747498B2 (en)2015-09-082020-08-18Apple Inc.Zero latency digital assistant
US9697820B2 (en)2015-09-242017-07-04Apple Inc.Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US9858948B2 (en)2015-09-292018-01-02Apple Inc.Electronic equipment with ambient noise sensing input circuitry
US10366158B2 (en)2015-09-292019-07-30Apple Inc.Efficient word encoding for recurrent neural network language models
US11010550B2 (en)2015-09-292021-05-18Apple Inc.Unified language modeling framework for word prediction, auto-completion and auto-correction
US11587559B2 (en)2015-09-302023-02-21Apple Inc.Intelligent device identification
US11526368B2 (en)2015-11-062022-12-13Apple Inc.Intelligent automated assistant in a messaging environment
US10691473B2 (en)2015-11-062020-06-23Apple Inc.Intelligent automated assistant in a messaging environment
US10049668B2 (en)2015-12-022018-08-14Apple Inc.Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en)2015-12-232019-03-05Apple Inc.Proactive assistance based on dialog communication between devices
US10446143B2 (en)2016-03-142019-10-15Apple Inc.Identification of voice inputs providing credentials
US9934775B2 (en)2016-05-262018-04-03Apple Inc.Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en)2016-06-032018-05-15Apple Inc.Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en)2016-06-062019-04-02Apple Inc.Intelligent list reading
NO20160989A1 (en)*2016-06-082017-12-11Pexip ASVideo Conference timeline
US10049663B2 (en)2016-06-082018-08-14Apple, Inc.Intelligent automated assistant for media exploration
US11069347B2 (en)2016-06-082021-07-20Apple Inc.Intelligent automated assistant for media exploration
US10354011B2 (en)2016-06-092019-07-16Apple Inc.Intelligent automated assistant in a home environment
US10067938B2 (en)2016-06-102018-09-04Apple Inc.Multilingual word prediction
US10509862B2 (en)2016-06-102019-12-17Apple Inc.Dynamic phrase expansion of language input
US10733993B2 (en)2016-06-102020-08-04Apple Inc.Intelligent digital assistant in a multi-tasking environment
US11037565B2 (en)2016-06-102021-06-15Apple Inc.Intelligent digital assistant in a multi-tasking environment
US10192552B2 (en)2016-06-102019-01-29Apple Inc.Digital assistant providing whispered speech
US10490187B2 (en)2016-06-102019-11-26Apple Inc.Digital assistant providing automated status report
US10521466B2 (en)2016-06-112019-12-31Apple Inc.Data driven natural language event detection and classification
US10297253B2 (en)2016-06-112019-05-21Apple Inc.Application integration with a digital assistant
US10269345B2 (en)2016-06-112019-04-23Apple Inc.Intelligent task discovery
US10089072B2 (en)2016-06-112018-10-02Apple Inc.Intelligent device arbitration and control
US11152002B2 (en)2016-06-112021-10-19Apple Inc.Application integration with a digital assistant
US10043516B2 (en)2016-09-232018-08-07Apple Inc.Intelligent automated assistant
US10553215B2 (en)2016-09-232020-02-04Apple Inc.Intelligent automated assistant
US10593346B2 (en)2016-12-222020-03-17Apple Inc.Rank-reduced token representation for automatic speech recognition
US10755703B2 (en)2017-05-112020-08-25Apple Inc.Offline personal assistant
US10791176B2 (en)2017-05-122020-09-29Apple Inc.Synchronization and task delegation of a digital assistant
US11405466B2 (en)2017-05-122022-08-02Apple Inc.Synchronization and task delegation of a digital assistant
US10410637B2 (en)2017-05-122019-09-10Apple Inc.User-specific acoustic models
US10810274B2 (en)2017-05-152020-10-20Apple Inc.Optimizing dialogue policy decisions for digital assistants using implicit feedback
US10482874B2 (en)2017-05-152019-11-19Apple Inc.Hierarchical belief states for digital assistants
US11217255B2 (en)2017-05-162022-01-04Apple Inc.Far-field extension for digital assistant services
US11307661B2 (en)2017-09-252022-04-19Apple Inc.Electronic device with actuators for producing haptic and audio output along a device housing
US11907426B2 (en)2017-09-252024-02-20Apple Inc.Electronic device with actuators for producing haptic and audio output along a device housing
US10873798B1 (en)2018-06-112020-12-22Apple Inc.Detecting through-body inputs at a wearable audio device
US11743623B2 (en)2018-06-112023-08-29Apple Inc.Wearable interactive audio device
US10757491B1 (en)2018-06-112020-08-25Apple Inc.Wearable interactive audio device
US12413880B2 (en)2018-06-112025-09-09Apple Inc.Wearable interactive audio device
US11740591B2 (en)2018-08-302023-08-29Apple Inc.Electronic watch with barometric vent
US11334032B2 (en)2018-08-302022-05-17Apple Inc.Electronic watch with barometric vent
US12099331B2 (en)2018-08-302024-09-24Apple Inc.Electronic watch with barometric vent
US11561144B1 (en)2018-09-272023-01-24Apple Inc.Wearable electronic device with fluid-based pressure sensing
US11857063B2 (en)2019-04-172024-01-02Apple Inc.Audio output system for a wirelessly locatable tag
US12256032B2 (en)2021-03-022025-03-18Apple Inc.Handheld electronic device

Also Published As

Publication number | Publication date
JP2000125274A (en)2000-04-28
GB2342802B (en)2003-04-16
GB9916394D0 (en)1999-09-15

Similar Documents

Publication | Publication Date | Title
GB2342802A (en)Indexing conference content onto a timeline
KR101238586B1 (en)Automatic face extraction for use in recorded meetings timelines
Lee et al. — Portable meeting recorder
Cutler et al. — Distributed meetings: A meeting capture and broadcasting system
US7428000B2 (en)System and method for distributed meetings
US5548346A (en)Apparatus for integrally controlling audio and video signals in real time and multi-site communication control method
CN107820037B (en)Audio signal, image processing method, device and system
US7355623B2 (en)System and process for adding high frame-rate current speaker data to a low frame-rate video using audio watermarking techniques
JP3620855B2 (en) Method and apparatus for recording and indexing audio and multimedia conferences
US7362350B2 (en)System and process for adding high frame-rate current speaker data to a low frame-rate video
US20060251384A1 (en)Automatic video editing for real-time multi-point video conferencing
US7355622B2 (en)System and process for adding high frame-rate current speaker data to a low frame-rate video using delta frames
CN111193890B (en)Conference record analyzing device and method and conference record playing system
JP2006085440A (en)Information processing system, information processing method and computer program
JP4414708B2 (en) Movie display personal computer, data display system, movie display method, movie display program, and recording medium
WO2002013522A2 (en)Audio and video notetaker
Arnaud et al.The CAVA corpus: synchronised stereoscopic and binaural datasets with head movements
TWI799048B (en)Panoramic video conference system and method
KR20010079719A (en)Real-time tracking of an object of interest using a hybrid optical and virtual zooming mechanism
Sumec — Multi camera automatic video editing
JP6860178B1 (en) Video processing equipment and video processing method
Rui et al. — PING: A Group-to-individual distributed meeting system
JP2000333125A (en)Editing device and recording device

Legal Events

Date | Code | Title | Description
732E | Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977)
PCNP | Patent ceased through non-payment of renewal fee

Effective date: 2015-07-13
