CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of, and claims priority to, co-pending U.S. Patent Application entitled “DETECTION OF CAST MEMBERS IN VIDEO CONTENT,” filed on Apr. 10, 2013, and assigned application Ser. No. 13/860,347, which is incorporated herein by reference in its entirety.
BACKGROUND
People often want more information about the movies and other video content they are watching. To this end, people may search the Internet to find out more information about the video content. This information may include, for example, biographies of actors, production information, trivia, goofs, and so on.
BRIEF DESCRIPTION OF THE DRAWINGS
Many aspects of the present disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, with emphasis instead being placed upon clearly illustrating the principles of the disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
FIG. 1 is a drawing of an exemplary face detection and recognition scenario according to various embodiments of the present disclosure.
FIG. 2 is a drawing of a networked environment according to various embodiments of the present disclosure.
FIGS. 3A-3C are drawings of examples of user interfaces rendered by a client in the networked environment of FIG. 2 according to various embodiments of the present disclosure.
FIGS. 4A and 4B are flowcharts illustrating examples of functionality implemented as portions of a cast member detection application executed in a computing environment in the networked environment of FIG. 2 according to various embodiments of the present disclosure.
FIG. 5 is a schematic block diagram that provides one example illustration of a computing environment employed in the networked environment of FIG. 2 according to various embodiments of the present disclosure.
DETAILED DESCRIPTION
The present disclosure relates to detection of cast members in video content. Systems may wish to present an identification of the cast member(s) who are present in a current scene of a movie, television show, and so on. Manually creating associations between scenes and cast members may be labor intensive, and exceptionally so when performed on a large scale for a multitude of video programs. Various embodiments of the present disclosure facilitate automated cast member detection in video content using face detection and recognition. Existing data associating cast members with video content may be employed, and facial data models may be updated as part of the face recognition process. Manual disambiguation and confirmation may be used to a limited extent to verify and improve facial data models.
Turning now to FIG. 1, shown is an exemplary face detection and recognition scenario 100 according to various embodiments. Face detection is performed on a video frame 103 from a video program. In this example, two faces 106a and 106b are detected. Face recognition is then performed on each of the faces 106a and 106b using reference images 109a and 109b that correspond to cast members that are known to appear in the video program. As a result of the face recognition, the detected face 106a is recognized as being the cast member corresponding to the reference image 109a, and the detected face 106b is recognized as being the cast member corresponding to the reference image 109b. Accordingly, the corresponding cast members may be associated with the video frame 103 and other video frames 103 from the same scene. In the following discussion, a general description of the system and its components is provided, followed by a discussion of the operation of the same.
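By way of illustration only, the following Python sketch captures this two-stage flow for a single video frame 103. It assumes the open-source face_recognition library; the frame array and the reference_encodings mapping (cast member name to a face encoding derived from a reference image 109) are hypothetical inputs and are not part of the disclosure.

    import face_recognition

    def recognize_cast_in_frame(frame, reference_encodings):
        """Return the cast member names recognized in one video frame.

        frame: an RGB image as a numpy array (note that OpenCV reads BGR).
        reference_encodings: dict mapping cast member name -> face
        encoding derived from that member's reference image.
        """
        if not reference_encodings:
            return []
        # Stage 1: face detection -- locate candidate face regions.
        locations = face_recognition.face_locations(frame)
        # Stage 2: face recognition -- encode each detected face and
        # compare it against the known reference encodings.
        encodings = face_recognition.face_encodings(frame, locations)
        names = list(reference_encodings.keys())
        recognized = []
        for encoding in encodings:
            distances = face_recognition.face_distance(
                [reference_encodings[n] for n in names], encoding)
            best = distances.argmin()
            if distances[best] < 0.6:  # illustrative matching tolerance
                recognized.append(names[best])
        return recognized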
With reference to FIG. 2, shown is a networked environment 200 according to various embodiments. The networked environment 200 includes a computing environment 203 and one or more clients 206 in data communication via a network 209. The network 209 includes, for example, the Internet, intranets, extranets, wide area networks (WANs), local area networks (LANs), wired networks, wireless networks, or other suitable networks, etc., or any combination of two or more such networks.
The computing environment 203 may comprise, for example, a server computer or any other system providing computing capability. Alternatively, the computing environment 203 may employ a plurality of computing devices that are arranged, for example, in one or more server banks or computer banks or other arrangements. Such computing devices may be located in a single installation or may be distributed among many different geographical locations. For example, the computing environment 203 may include a plurality of computing devices that together may comprise a cloud computing resource, a grid computing resource, and/or any other distributed computing arrangement. In some cases, the computing environment 203 may correspond to an elastic computing resource where the allotted capacity of processing, network, storage, or other computing-related resources may vary over time.
Various applications and/or other functionality may be executed in the computing environment 203 according to various embodiments. Also, various data is stored in a data store 212 that is accessible to the computing environment 203. The data store 212 may be representative of a plurality of data stores 212 as can be appreciated. The data stored in the data store 212, for example, is associated with the operation of the various applications and/or functional entities described below.
The components executed on the computing environment 203, for example, include a cast member detection application 215, a scene break detection application 218, a manual review system 221, and other applications, services, processes, systems, engines, or functionality not discussed in detail herein. The cast member detection application 215 is executed to process video frames 103 to determine which cast members appear in which video frames 103. To this end, the cast member detection application 215 may include face detection logic 224, face tracker logic 225, face recognition logic 227, and temporal smoothing logic 228.
The face detection logic 224 is executed to detect whether a face is present in a given video frame 103. The face tracker logic 225 may assist the face detection logic 224 by tracking a sequence of faces occurring across multiple video frames 103, where the faces in the sequence are similar in appearance and geometric proximity or position. The face recognition logic 227 is executed to recognize a detected face within a video frame 103 as corresponding to a particular person or cast member. The temporal smoothing logic 228 may employ a temporal smoothing factor to smooth the face recognition results across video frames 103 in which a previously or subsequently recognized face is unrecognized.
The scene break detection application 218 is executed to detect scene breaks within video programs. To this end, the scene break detection application 218 may monitor contrast and other characteristics that change between video frames 103 to determine that the video program has moved from one scene to another. The manual review system 221 may be executed to provide manual review functionality for the cast member detection application 215 and/or the scene break detection application 218. For example, the manual review system 221 may submit unrecognized faces for manual identification. Also, the manual review system 221 may submit recognized faces for manual confirmation or disambiguation from multiple possible cast members.
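As a non-authoritative sketch of one way such scene break detection might be performed, the following Python code (assuming OpenCV) flags a likely scene break wherever the color histogram correlation between consecutive video frames 103 drops sharply; the threshold value is illustrative.

    import cv2

    def find_scene_breaks(video_path, threshold=0.5):
        """Yield frame indices where the color histogram changes sharply
        between consecutive frames, suggesting a cut between scenes."""
        cap = cv2.VideoCapture(video_path)
        prev_hist = None
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            hist = cv2.calcHist([frame], [0, 1, 2], None,
                                [8, 8, 8], [0, 256, 0, 256, 0, 256])
            cv2.normalize(hist, hist)
            if prev_hist is not None:
                # Correlation near 1.0 means similar frames; a sharp drop
                # indicates a likely scene break.
                similarity = cv2.compareHist(prev_hist, hist,
                                             cv2.HISTCMP_CORREL)
                if similarity < threshold:
                    yield index
            prev_hist = hist
            index += 1
        cap.release()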
The data stored in the data store 212 includes, for example, video programs 230, scene data 233, cast member data 236, cast member/frame associations 239, unrecognized face data 242, manual review data 245, recognition data models 248, detection data models 251, detection and recognition configuration data 254, and potentially other data. Each of the video programs 230 corresponds to video data comprising a sequence of video frames 103. For example, a video program 230 may include 24 frames per second, 30 frames per second, or another frame rate. A video program 230 may correspond to a movie, a television show, and/or other video content in which people appear.
The scene data 233 describes various scenes into which the video programs 230 may be divided. A scene corresponds to a period of time in the video program 230 having multiple video frames 103, and may be determined as having a distinct plot element or setting. In one embodiment, a scene is defined as having a beginning video frame 103 and an ending video frame 103. In another embodiment, a scene is defined as having a beginning video frame 103 and a duration. The scene data 233 may be generated automatically by the scene break detection application 218 or may be predetermined.
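For illustration, the two embodiments of the scene definition could be represented as follows in Python; the field names are assumptions, not part of the disclosure.

    from dataclasses import dataclass

    @dataclass
    class Scene:
        """First embodiment: a scene delimited by a beginning video
        frame and an ending video frame."""
        begin_frame: int
        end_frame: int

    @dataclass
    class SceneByDuration:
        """Second embodiment: a beginning video frame plus a duration,
        expressed here as a frame count."""
        begin_frame: int
        duration_frames: int

        @property
        def end_frame(self) -> int:
            return self.begin_frame + self.duration_frames - 1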
The cast member data 236 describes various actors, actresses, extras, etc., who appear in one or more of the video programs 230. The cast member data 236 may include reference images 109 and cast member/video program associations 257. Each reference image 109 is known to depict a particular cast member. A reference image 109 may correspond to a headshot, a publicity still, a screen grab from a video frame 103, and/or any other image that is known to depict a particular cast member. The reference image 109 may show the cast member in character as he or she appears in a video program 230. Alternatively, the reference image 109 may show the cast member out of character or having the appearance of another character not in a particular video program 230.
The cast member/video program associations 257 correspond to pre-existing data that associates particular cast members with particular video programs 230. For example, the cast member/video program associations 257 may be obtained from cast listings provided by external sources of information. The cast member/video program associations 257 indicate cast members who appear in the video programs 230. In some cases, the cast member/video program associations 257 may indicate cast members who participate in the production of a video program 230 but do not actually appear (e.g., voice talent).
The cast member/frame associations 239 are generated by the cast member detection application 215. The cast member/frame associations 239 indicate cast members who are recognized by the cast member detection application 215 as appearing in a particular video frame 103 or are predicted to appear in the particular video frame 103. In some cases, the cast member/frame associations 239 indicate that a cast member appears in a particular scene comprising a given video frame 103, even if the cast member is not actually detected and recognized as being in the given video frame 103. The cast member/frame associations 239 may be made on a per-frame basis, on a per-scene basis, on a time-of-appearance basis, or determined according to other approaches.
The unrecognized face data 242 includes data that corresponds to faces that have been detected but not recognized in the video program 230. For example, a face may correspond to a person who appears in the video program 230 but is uncredited and not included in the cast member/video program associations 257. Alternatively, a face may correspond to a known cast member with reference images 109 but may be unrecognizable due to camera angle, lighting, character makeup, and/or other factors.
The manual review data 245 includes data that facilitates, and is produced as a result of, a manual review through the manual review system 221. The manual review data 245 may record whether a face recognition was confirmed correct or incorrect, a selection of one of multiple possible cast members for a detected face, an identification of a cast member for an unrecognized face, and so on. The tasks relating to manual review may be assigned to various agents or other users, who may be contracted on a per-task basis. The manual review data 245 may track the productivity and accuracy of the various agents, where the accuracy may be assessed through a multi-layer manual review involving multiple agents.
The recognition data models 248 and the detection data models 251 may be employed for machine learning purposes. For example, the recognition data models 248 and the detection data models 251 may be trained through manual confirmation of correct or incorrect face detections and/or face recognitions. Where correct recognitions and/or detections are confirmed, the particular detected face may be employed in the recognition data models 248 and/or the detection data models 251 to improve the accuracy of further detections and recognitions for a particular video program 230 or for a particular cast member appearing across multiple video programs 230.
The detection and recognition configuration data 254 may include various parameters controlling the face detection logic 224 and the face recognition logic 227. For example, the detection and recognition configuration data 254 may include a temporal smoothing factor for use by the temporal smoothing logic 228. In one embodiment, the temporal smoothing factor may correspond to a maximum number of video frames 103 in which a cast member may be unrecognized and, despite being unrecognized, still be associated with the video frames 103 due to being detected prior to and/or after the video frames 103. The detection and recognition configuration data 254 may include a maximum threshold for a quantity of faces to be detected in a video frame 103. For example, a video frame 103 may depict a large crowd of extras, and it may be desirable to disable cast member detection for the particular video frame 103 or scene.
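A minimal sketch of how the detection and recognition configuration data 254 might be structured follows; all field names and default values are illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class DetectionRecognitionConfig:
        """Illustrative parameters for the detection/recognition pipeline.

        temporal_smoothing_frames: maximum number of consecutive frames a
        cast member may go unrecognized while still being associated with
        those frames (the temporal smoothing factor).
        max_faces_per_frame: above this count (e.g., a crowd scene),
        recognition may be disabled or restricted.
        frame_sample_rate: frames examined per second of video.
        """
        temporal_smoothing_frames: int = 24
        max_faces_per_frame: int = 15
        frame_sample_rate: float = 1.0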
The client 206 is representative of a plurality of client devices that may be coupled to the network 209. The client 206 may comprise, for example, a processor-based system such as a computer system. Such a computer system may be embodied in the form of a desktop computer, a laptop computer, personal digital assistants, cellular telephones, smartphones, set-top boxes, music players, web pads, tablet computer systems, game consoles, electronic book readers, or other devices with like capability. The client 206 may include a display 260. The display 260 may comprise, for example, one or more devices such as liquid crystal display (LCD) displays, gas plasma-based flat panel displays, organic light emitting diode (OLED) displays, LCD projectors, or other types of display devices, etc.
The client 206 may be configured to execute various applications such as a manual review client application 263 and/or other applications. The manual review client application 263 may be executed to facilitate completing a task that is a part of a manual review of face detection and/or face recognition. The manual review client application 263 may be executed in a client 206, for example, to access network content served up by the computing environment 203 and/or other servers, thereby rendering a user interface 266 on the display 260. The manual review client application 263 may, for example, correspond to a browser, a mobile application, etc., and the user interface 266 may correspond to a network page, a mobile application screen, etc. The client 206 may be configured to execute applications beyond the manual review client application 263 such as, for example, video content player applications, browsers, mobile applications, email applications, social networking applications, and/or other applications. Although the manual review client application 263 is described as being executed in a client 206, in some embodiments the manual review client application 263 may be executed in the same system as the cast member detection application 215 or other components described herein.
Next, a general description of the operation of the various components of the networked environment 200 is provided. To begin, various video programs 230 and cast member data 236 may be loaded into the data store 212. In some embodiments, the scene data 233 may then be loaded from an external source or generated by way of the scene break detection application 218. The recognition data models 248 and/or the detection data models 251 may be primed based at least in part on previous detections and/or recognitions performed through the cast member detection application 215 and potentially subjected to manual review via the manual review system 221.
The cast member detection application 215 begins processing a particular video program 230 and obtains a set of reference images 109 that corresponds to the cast member/video program associations 257. The set of reference images 109 shows the cast members who appear or might appear in the video program 230. It is noted that various cast members, credited or uncredited, may appear in the video program 230 but have no corresponding reference images 109. In one embodiment, data encoding characteristics of the set of reference images 109 (e.g., histograms, hashes, facial profiles, etc.) may be obtained rather than data encoding the reference images 109 themselves.
The cast member detection application 215 processes the video program 230 by sampling a video frame 103. A particular video program 230 may have vast quantities of video frames 103, so the cast member detection application 215 may be configured to sample the video program 230 by processing, for example, one video frame 103 per second of video rather than all 24 video frames 103 within that second of video. In some embodiments, every video frame 103 may be processed by the cast member detection application 215.
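For example, sampling one video frame 103 per second might be implemented as in the following sketch (assuming OpenCV); the fallback frame rate is an assumption for streams that do not report one.

    import cv2

    def sample_frames(video_path, samples_per_second=1.0):
        """Yield (frame_index, frame) pairs at roughly the requested
        rate, e.g., one frame per second rather than all 24 or 30."""
        cap = cv2.VideoCapture(video_path)
        fps = cap.get(cv2.CAP_PROP_FPS) or 24.0  # assumed fallback
        step = max(1, round(fps / samples_per_second))
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % step == 0:
                yield index, frame
            index += 1
        cap.release()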
In processing a video frame 103, the cast member detection application 215 uses the face detection logic 224 to detect zero or more faces present in the particular video frame 103. In some cases, the face detection logic 224 may employ the face tracker logic 225, which may use previous or subsequent video frames 103 to map a trajectory of a detected face (i.e., a sequence of faces similar in appearance and/or position), which may improve the accuracy of face detection in the intervening frames. The face detection logic 224 may employ a detection data model 251 to perform the detection.
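As a stand-in for the detection data model 251, the following sketch uses a pretrained OpenCV Haar cascade to detect candidate faces in a frame; the actual detection model of the disclosure is not specified, and the face tracker logic 225 is omitted here.

    import cv2

    def detect_faces(frame):
        """Return bounding boxes (x, y, w, h) of candidate faces using a
        pretrained Haar cascade bundled with opencv-python."""
        cascade_path = (cv2.data.haarcascades +
                        "haarcascade_frontalface_default.xml")
        detector = cv2.CascadeClassifier(cascade_path)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        return detector.detectMultiScale(gray, scaleFactor=1.1,
                                         minNeighbors=5)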
In response to detecting a face, the cast member detection application 215 employs the face recognition logic 227 to recognize the detected face. For example, the face recognition logic 227 may operate on a portion of the video frame 103 that has been identified by the face detection logic 224 as likely depicting a face. The face recognition logic 227 may compare data from the reference images 109 to recognize which person corresponds to the detected face.
In one embodiment, the face recognition logic 227 may employ a universal set of the reference images 109 across cast members of a multitude of video programs 230. In other embodiments, the face recognition logic 227 employs only those reference images 109 that correspond to cast members identified in the cast member/video program associations 257. This reduction in the reference images 109 to consider may improve processing speed and may reduce the likelihood of mistaken recognitions. In some of these embodiments, the face recognition logic 227 may expand the set of reference images 109 to consider beyond those cast members indicated in the cast member/video program associations 257 when a recognition could not be made using the specific set associated with the known cast members.
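The following sketch illustrates this restrict-then-expand strategy using the same hypothetical encoding-based matching as the earlier sketch: the program-specific cast references are tried first, and the universal set is consulted only if no match is found. The tolerance value is illustrative.

    import face_recognition

    def recognize_with_fallback(encoding, cast_refs, universal_refs,
                                tolerance=0.6):
        """Match a detected face encoding against the known cast of the
        video program first; fall back to the universal reference set
        only when no match is found. Both reference arguments map cast
        member names to reference face encodings."""
        def best_match(refs):
            if not refs:
                return None
            names = list(refs.keys())
            distances = face_recognition.face_distance(
                [refs[n] for n in names], encoding)
            i = distances.argmin()
            return names[i] if distances[i] < tolerance else None

        # Smaller candidate set first: faster, and fewer mistaken matches.
        return best_match(cast_refs) or best_match(universal_refs)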
Upon recognition of a face, the cast member detection application 215 generates a cast member/frame association 239 for the particular video frame 103. The cast member/frame association 239 may indicate a position of the recognized face within the particular video frame 103 and/or may merely indicate that the recognized face appears somewhere in the particular video frame 103. Due to sampling, one cast member/frame association 239 may pertain to a range of multiple video frames 103.
Additionally, the face recognition logic 227 may employ the temporal smoothing logic 228 to account for video frames 103 in which the face of a cast member is briefly absent or cannot be detected/recognized due to camera angle, lighting, etc. For example, if a cast member is detected in a first frame, the cast member may implicitly be detected in the next N frames (or previous N frames) as specified by a temporal smoothing factor. Alternatively, if a cast member is detected in a first frame, not detected in N second frames, and then detected again in a third frame, the cast member may implicitly be detected in the N second frames depending on the temporal smoothing factor and the value of N.
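A minimal sketch of such gap filling follows: given the frame indices at which a cast member was recognized, any gap no longer than the temporal smoothing factor is filled in.

    def smooth_appearances(frames_seen, max_gap):
        """Fill gaps of up to max_gap frames between recognitions of the
        same cast member, per the temporal smoothing factor.

        frames_seen: sorted frame indices where the member was recognized.
        Returns the full set of frame indices to associate."""
        associated = set(frames_seen)
        for prev, curr in zip(frames_seen, frames_seen[1:]):
            gap = curr - prev - 1
            if 0 < gap <= max_gap:
                # Recognized both before and after the gap, so the member
                # is implicitly present in the intervening frames.
                associated.update(range(prev + 1, curr))
        return associated

    # Example: with max_gap=5, the short gap at frames 11-12 is filled,
    # but the long gap between frames 13 and 40 is not.
    # smooth_appearances([10, 13, 40], max_gap=5)
    # -> {10, 11, 12, 13, 40}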
In various embodiments, associations of cast members to video frames 103 may be performed on a per-scene basis. In one embodiment, if a cast member is recognized once in a scene, the cast member may be associated with the entire scene. In another embodiment, the cast member becomes associated with the rest of a scene beginning with a first recognized appearance in a video frame 103 of the scene.
Face recognition may be handled in a different manner when many faces appear. For example, in video frames 103 where a crowd is shown, face recognition may be disabled or restricted to a set of recently recognized faces based at least in part on the detection of N faces, where N is a maximum threshold for a quantity of faces. When face recognition is disabled, the previously recognized cast members may continue to be associated with the particular video frames 103 subject to the temporal smoothing factor and/or special thresholds that apply.
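A short sketch of this crowd handling follows; recognize_faces is a hypothetical per-frame recognition helper, and the carry-over of recently recognized cast members stands in for the temporal smoothing behavior described above.

    def recognize_or_skip(detected_faces, recent_cast, max_faces,
                          recognize_faces):
        """Disable recognition for crowd shots: when more faces are
        detected than the configured maximum threshold, keep associating
        the recently recognized cast members instead of running
        recognition on every face in the crowd."""
        if len(detected_faces) > max_faces:
            return list(recent_cast)
        return recognize_faces(detected_faces)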
In one scenario, a particular face is not recognized by the face recognition logic 227. If so, data indicating or derived from the unrecognized face may be recorded in the unrecognized face data 242. In one embodiment, a clustering analysis may be employed on the unrecognized face data 242 to determine groupings of unrecognized people or characters who appear in the video program 230. The unrecognized people may then be submitted for manual review and identification via the manual review system 221. For example, the manual review system 221 may instruct the client 206 to display one or more representative images for each unrecognized person. A user interface 266 may request a name and/or other information from the manual reviewer. In some embodiments, the user interface 266 may present a listing of possible choices for cast members, with the manual reviewer selecting from the listing.
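As one possible form of that clustering analysis, the following sketch (assuming scikit-learn and numpy) groups unrecognized face encodings with DBSCAN so that each cluster likely corresponds to a single unidentified person; the eps and min_samples values are illustrative.

    import numpy as np
    from sklearn.cluster import DBSCAN

    def cluster_unrecognized(encodings, eps=0.5):
        """Group unrecognized face encodings so that each cluster likely
        corresponds to one unidentified person; each cluster can then be
        submitted for manual identification as a unit.

        encodings: list of face encodings. Returns a dict mapping
        cluster label -> list of encoding indices (-1 = noise)."""
        labels = DBSCAN(eps=eps, min_samples=3).fit_predict(
            np.asarray(encodings))
        clusters = {}
        for i, label in enumerate(labels):
            clusters.setdefault(int(label), []).append(i)
        return clusters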
Manual review may also be indicated in situations where a face cannot be recognized up to a certain confidence level or threshold. For example, the face recognition logic 227 may determine that the face is likely to correspond to a particular cast member (or a subset of cast members from the video program 230) but cannot confidently make the determination. In such a situation, the manual reviewer at the client 206 may be asked to confirm or disambiguate the determination.
In response to manual confirmation or rejection of recognitions and/or detections, the recognition data models 248 and/or the detection data models 251 may be updated according to machine learning techniques. In one example, a manually confirmed face may be captured from a video frame 103 and added to the reference images 109 as pertaining to the cast member or, in particular, to the cast member as depicted in the video program 230. Non-confirmed detections and/or recognitions may also be employed in some embodiments for the purposes of updating the reference images 109, the recognition data models 248, and/or the detection data models 251.
The cast member/frame associations 239 that are generated through the cast member detection application 215 may be employed to show viewers who is appearing at a given time in a video program 230. Referring to FIG. 3A, shown is one example of a user interface 266a rendered by a client 206 (FIG. 2) in the networked environment 200 (FIG. 2) according to embodiments of the present disclosure. The user interface 266a includes the video frame 103 that is currently being displayed, various media controls 303, and a cast member identification component 306 that identifies the cast members associated with the video frame 103 and/or scene.
Referring to FIG. 3B, shown is another example of a user interface 266b rendered by a client 206 (FIG. 2) in the networked environment 200 (FIG. 2) according to embodiments of the present disclosure. In the alternative example of FIG. 3B, the cast member identification components 306a and 306b are rendered such that they are positioned relative to the detected faces 106a and 106b (FIG. 1) of the video frame 103 using position information recorded in the cast member/frame associations 239. The cast member/frame associations 239 may be employed in a variety of ways to facilitate navigation throughout the video program 230, e.g., through a user interface 266 that shows locations or scenes in a video program 230 where a particular cast member appears on screen, to indicate a location or scene in a video program 230 where a cast member first appears, and so on. According to one embodiment, if a particular cast member is determined to be in a current scene but is not recognized in a current frame, a cast member identification component may be rendered separately, e.g., off to the side, at the top, at the bottom, etc., possibly with a label indicating that the particular cast member is not pictured or is off-screen.
Various techniques related to enhancing video content using extrinsic data such as the cast member/frame associations 239 are described in U.S. patent application Ser. No. 13/227,097 entitled “SYNCHRONIZING VIDEO CONTENT WITH EXTRINSIC DATA” and filed on Sep. 7, 2011, U.S. patent application Ser. No. 13/601,267 entitled “ENHANCING VIDEO CONTENT WITH EXTRINSIC DATA” and filed on Aug. 31, 2012, U.S. patent application Ser. No. 13/601,235 entitled “TIMELINE INTERFACE FOR VIDEO CONTENT” and filed on Aug. 31, 2012, and U.S. patent application Ser. No. 13/601,210 entitled “PROVIDING EXTRINSIC DATA FOR VIDEO CONTENT” and filed on Aug. 31, 2012, all of which are incorporated herein by reference in their entirety.
Turning next to FIG. 3C, shown is another example of a user interface 266c rendered by a client 206 (FIG. 2) in the networked environment 200 (FIG. 2) according to embodiments of the present disclosure. The user interface 266c corresponds to an exemplary manual review interface rendered by the manual review client application 263 (FIG. 2). The user interface 266c presents a set of detected faces 313a, 313b, 313c, and 313d for manual review and confirmation. The detected faces 313 may have been recognized by the face recognition logic 227 (FIG. 2). In one example, one or more of the detected faces 313 may be recognized below a minimum confidence level, thereby prompting manual confirmation.
In the example of FIG. 3C, each of the detected faces 313 is presented in association with a respective exclude component 316a, 316b, 316c, and 316d for excluding the corresponding detected face 313 from being recognized. In this non-limiting example, the detected face 313d is an outlier and does not belong with the other detected faces 313a, 313b, and 313c, so the manual review user will likely select the detected face 313d to be excluded by way of the respective exclude component 316d. In other examples, the detected face 313d may be emphasized or highlighted in the user interface 266c to indicate to the user that the confidence level associated with the recognition is below a threshold.
A labeling component 319 may be provided for the manual review user to enter a name or other label for the cast member associated with the detected faces 313. In this case, the cast member is to be labeled “Jim Kingsboro.” In various embodiments, a selection component may be provided for the manual review user to search for and select a particular cast member from a database, such as the cast member/video program associations 257 (FIG. 2). An update associations 321 component may be provided for the manual review user to send the update (e.g., changed labeling, excluded detected faces 313, etc.) to the manual review system 221 (FIG. 2). Such an update may be reviewed and verified by other users before it is committed to the data store 212 (FIG. 2). By excluding the outlier detected face 313d and confirming the other detected faces 313a, 313b, and 313c, the recognition data models 248 (FIG. 2) may be updated and improved for subsequent face recognition and scrubbing of the data set.
Referring next to FIG. 4A, shown is a flowchart that provides one example of the operation of a portion of the cast member detection application 215 according to various embodiments. It is understood that the flowchart of FIG. 4A provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the cast member detection application 215 as described herein. As an alternative, the flowchart of FIG. 4A may be viewed as depicting an example of steps of a method implemented in the computing environment 203 (FIG. 2) according to one or more embodiments.
Beginning with box 403, the cast member detection application 215 obtains a video frame 103 (FIG. 2) from a video program 230 (FIG. 2). In box 406, the cast member detection application 215 employs the face detection logic 224 (FIG. 2) to perform face detection on the video frame 103. The face detection logic 224 may, for example, use a detection data model 251 (FIG. 2).
In box 409, the cast member detection application 215 determines whether a face is detected. If a face is not detected, the cast member detection application 215 moves from box 409 to box 412 and determines whether another video frame 103 remains to be processed. If so, the cast member detection application 215 returns to box 403 and obtains the next video frame 103 to process. If another video frame 103 does not remain to be processed, the portion of the cast member detection application 215 ends.
If a face is detected, the cast member detection application 215 moves from box 409 to box 415. In box 415, the cast member detection application 215 employs the face recognition logic 227 (FIG. 2) to perform face recognition on the detected face. To this end, the face recognition logic 227 may use the reference images 109 (FIG. 2), the cast member/video program associations 257 (FIG. 2), the recognition data models 248 (FIG. 2), and/or other data. In box 418, the cast member detection application 215 determines whether a face has been recognized. If the face has not been recognized, the cast member detection application 215 continues from box 418 to box 421 and adds the detected face to the unrecognized face data 242 for later cluster analysis and manual identification. The cast member detection application 215 then proceeds to box 424.
If the face has been recognized, the cast member detection application 215 instead moves from box 418 to box 427. In box 427, the cast member detection application 215 associates the video frame 103 with the recognized cast member. Accordingly, the cast member detection application 215 may generate a cast member/frame association 239 (FIG. 2). In some cases, the face may not be confidently recognized, e.g., the recognition may fall below a confidence level threshold. In such a situation, the detected face may be submitted for manual review and confirmation before an association is generated. The cast member detection application 215 continues to box 424.
In box 424, the cast member detection application 215 determines whether another face in the video frame 103 is detected. If so, the cast member detection application 215 returns to box 415 and performs face recognition on the detected face. If no other faces are detected, the cast member detection application 215 continues to box 430. In box 430, the cast member detection application 215 determines whether another video frame 103 remains to be processed. If another video frame 103 is to be processed, the cast member detection application 215 returns to box 403 and obtains the next video frame 103. Otherwise, the portion of the cast member detection application 215 ends.
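Putting the steps of FIG. 4A together, a minimal sketch of the main loop follows. It reuses the earlier sketches; detect_and_encode (returning a face encoding per detected face) is a hypothetical helper, and the confidence-based manual review path is omitted for brevity.

    def process_video(video_path, cast_refs, detect_and_encode):
        """Main loop corresponding to FIG. 4A: sample frames, detect and
        recognize faces, record cast member/frame associations, and queue
        unrecognized faces for cluster analysis and manual review."""
        associations = []   # (frame_index, cast_member_name)
        unrecognized = []   # (frame_index, face_encoding)
        for index, frame in sample_frames(video_path):
            for encoding in detect_and_encode(frame):
                name = recognize_with_fallback(encoding, cast_refs, {})
                if name is not None:
                    associations.append((index, name))
                else:
                    unrecognized.append((index, encoding))
        return associations, unrecognized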
Moving on to FIG. 4B, shown is a flowchart that provides another example of the operation of a portion of the cast member detection application 215 according to various embodiments. It is understood that the flowchart of FIG. 4B provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the cast member detection application 215 as described herein. As an alternative, the flowchart of FIG. 4B may be viewed as depicting an example of steps of a method implemented in the computing environment 203 (FIG. 2) according to one or more embodiments.
To begin, in box 433, the cast member detection application 215 employs the face detection logic 224 (FIG. 2) to detect face sequences in a video program 230 (FIG. 2). To this end, the face detection logic 224 may use the face tracker logic 225 (FIG. 2) to detect a sequence of faces that spans multiple video frames 103 (FIG. 2). In box 436, the cast member detection application 215 employs the temporal smoothing logic 228 (FIG. 2) to determine the video frames 103 that are associated with the detected face sequences.
In box 439, the cast member detection application 215 utilizes the face recognition logic 227 (FIG. 2) to perform face recognition on the detected face sequences. In box 442, the cast member detection application 215 associates video frames 103 with a recognized cast member for the face sequences that are recognized. In box 445, the cast member detection application 215 submits the unrecognized face sequences for manual review through the manual review system 221. Thereafter, the portion of the cast member detection application 215 ends.
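A sketch of this sequence-oriented variant follows. The majority vote used to settle a sequence-level identity is an assumption; the disclosure does not prescribe how per-frame recognitions within a sequence are combined.

    from collections import Counter

    def process_face_sequences(face_sequences, cast_refs):
        """Variant corresponding to FIG. 4B: recognize each tracked face
        sequence as a whole, associate every frame the sequence spans,
        and return the sequences left over for manual review.

        face_sequences: list of lists of (frame_index, encoding) pairs,
        each list produced by the face tracker across nearby frames."""
        associations, for_review = [], []
        for sequence in face_sequences:
            votes = Counter()
            for _, encoding in sequence:
                name = recognize_with_fallback(encoding, cast_refs, {})
                if name is not None:
                    votes[name] += 1
            if votes:
                best_name, _ = votes.most_common(1)[0]
                associations.extend(
                    (frame, best_name) for frame, _ in sequence)
            else:
                for_review.append(sequence)
        return associations, for_review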
With reference to FIG. 5, shown is a schematic block diagram of the computing environment 203 according to an embodiment of the present disclosure. The computing environment 203 includes one or more computing devices 500. Each computing device 500 includes at least one processor circuit, for example, having a processor 503 and a memory 506, both of which are coupled to a local interface 509. To this end, each computing device 500 may comprise, for example, at least one server computer or like device. The local interface 509 may comprise, for example, a data bus with an accompanying address/control bus or other bus structure as can be appreciated.
Stored in the memory 506 are both data and several components that are executable by the processor 503. In particular, stored in the memory 506 and executable by the processor 503 are the cast member detection application 215, the scene break detection application 218, the manual review system 221, and potentially other applications. Also stored in the memory 506 may be a data store 212 and other data. In addition, an operating system may be stored in the memory 506 and executable by the processor 503.
It is understood that there may be other applications that are stored in the memory 506 and are executable by the processor 503 as can be appreciated. Where any component discussed herein is implemented in the form of software, any one of a number of programming languages may be employed such as, for example, C, C++, C#, Objective C, Java®, JavaScript®, Perl, PHP, Visual Basic®, Python®, Ruby, Flash®, or other programming languages.
A number of software components are stored in the memory 506 and are executable by the processor 503. In this respect, the term “executable” means a program file that is in a form that can ultimately be run by the processor 503. Examples of executable programs may be, for example, a compiled program that can be translated into machine code in a format that can be loaded into a random access portion of the memory 506 and run by the processor 503, source code that may be expressed in proper format such as object code that is capable of being loaded into a random access portion of the memory 506 and executed by the processor 503, or source code that may be interpreted by another executable program to generate instructions in a random access portion of the memory 506 to be executed by the processor 503, etc. An executable program may be stored in any portion or component of the memory 506 including, for example, random access memory (RAM), read-only memory (ROM), hard drive, solid-state drive, USB flash drive, memory card, optical disc such as compact disc (CD) or digital versatile disc (DVD), floppy disk, magnetic tape, or other memory components.
The memory 506 is defined herein as including both volatile and nonvolatile memory and data storage components. Volatile components are those that do not retain data values upon loss of power. Nonvolatile components are those that retain data upon a loss of power. Thus, the memory 506 may comprise, for example, random access memory (RAM), read-only memory (ROM), hard disk drives, solid-state drives, USB flash drives, memory cards accessed via a memory card reader, floppy disks accessed via an associated floppy disk drive, optical discs accessed via an optical disc drive, magnetic tapes accessed via an appropriate tape drive, and/or other memory components, or a combination of any two or more of these memory components. In addition, the RAM may comprise, for example, static random access memory (SRAM), dynamic random access memory (DRAM), or magnetic random access memory (MRAM) and other such devices. The ROM may comprise, for example, a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other like memory device.
Also, the processor 503 may represent multiple processors 503 and/or multiple processor cores, and the memory 506 may represent multiple memories 506 that operate in parallel processing circuits, respectively. In such a case, the local interface 509 may be an appropriate network that facilitates communication between any two of the multiple processors 503, between any processor 503 and any of the memories 506, or between any two of the memories 506, etc. The local interface 509 may comprise additional systems designed to coordinate this communication, including, for example, performing load balancing. The processor 503 may be of electrical or of some other available construction.
Although the cast member detection application 215, the face detection logic 224 (FIG. 2), the face tracker logic 225 (FIG. 2), the face recognition logic 227 (FIG. 2), the temporal smoothing logic 228 (FIG. 2), the scene break detection application 218, the manual review system 221, and other various systems described herein may be embodied in software or code executed by general purpose hardware as discussed above, as an alternative the same may also be embodied in dedicated hardware or a combination of software/general purpose hardware and dedicated hardware. If embodied in dedicated hardware, each can be implemented as a circuit or state machine that employs any one of or a combination of a number of technologies. These technologies may include, but are not limited to, discrete logic circuits having logic gates for implementing various logic functions upon an application of one or more data signals, application specific integrated circuits (ASICs) having appropriate logic gates, field-programmable gate arrays (FPGAs), or other components, etc. Such technologies are generally well known by those skilled in the art and, consequently, are not described in detail herein.
The flowcharts of FIGS. 4A-4B show the functionality and operation of an implementation of portions of the cast member detection application 215. If embodied in software, each block may represent a module, segment, or portion of code that comprises program instructions to implement the specified logical function(s). The program instructions may be embodied in the form of source code that comprises human-readable statements written in a programming language or machine code that comprises numerical instructions recognizable by a suitable execution system such as a processor 503 in a computer system or other system. The machine code may be converted from the source code, etc. If embodied in hardware, each block may represent a circuit or a number of interconnected circuits to implement the specified logical function(s).
Although the flowcharts of FIGS. 4A-4B show a specific order of execution, it is understood that the order of execution may differ from that which is depicted. For example, the order of execution of two or more blocks may be scrambled relative to the order shown. Also, two or more blocks shown in succession in FIGS. 4A-4B may be executed concurrently or with partial concurrence. Further, in some embodiments, one or more of the blocks shown in FIGS. 4A-4B may be skipped or omitted. In addition, any number of counters, state variables, warning semaphores, or messages might be added to the logical flow described herein, for purposes of enhanced utility, accounting, performance measurement, or providing troubleshooting aids, etc. It is understood that all such variations are within the scope of the present disclosure.
Also, any logic or application described herein, including the cast member detection application 215, the face detection logic 224, the face tracker logic 225, the face recognition logic 227, the temporal smoothing logic 228, the scene break detection application 218, and the manual review system 221, that comprises software or code can be embodied in any non-transitory computer-readable medium for use by or in connection with an instruction execution system such as, for example, a processor 503 in a computer system or other system. In this sense, the logic may comprise, for example, statements including instructions and declarations that can be fetched from the computer-readable medium and executed by the instruction execution system. In the context of the present disclosure, a “computer-readable medium” can be any medium that can contain, store, or maintain the logic or application described herein for use by or in connection with the instruction execution system.
The computer-readable medium can comprise any one of many physical media such as, for example, magnetic, optical, or semiconductor media. More specific examples of a suitable computer-readable medium would include, but are not limited to, magnetic tapes, magnetic floppy diskettes, magnetic hard drives, memory cards, solid-state drives, USB flash drives, or optical discs. Also, the computer-readable medium may be a random access memory (RAM) including, for example, static random access memory (SRAM) and dynamic random access memory (DRAM), or magnetic random access memory (MRAM). In addition, the computer-readable medium may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other type of memory device.
It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.