CROSS REFERENCE TO RELATED APPLICATION This application is a continuation of U.S. patent application Ser. No. 09/516,983, filed Mar. 1, 2000, and entitled “Subscriber Characterization System with Filters,” which is a continuation-in-part of U.S. patent application Ser. No. 09/204,888, filed Dec. 3, 1998, now U.S. Pat. No. 7,150,030, the entire disclosures of which are incorporated herein by reference.
BACKGROUND OF THE INVENTION Subscribers face an increasingly large number of choices for entertainment programming, which is delivered over networks such as cable TV systems, over-the-air broadcast systems, and switched digital access systems which use telephone company twisted wire pairs for the delivery of signals.
Cable television service providers have typically provided one-way broadcast services but now offer high-speed data services and can combine traditional analog broadcasts with digital broadcasts and access to Internet web sites. Telephone companies can offer digital data and video programming on a switched basis over digital subscriber line technology. Although the subscriber may only be presented with one channel at a time, channel change requests are instantaneously transmitted to centralized switching equipment and the subscriber can access the programming in a broadcast-like manner. Internet Service Providers (ISPs) offer Internet access and can offer access to text, audio, and video programming which can also be delivered in a broadcast-like manner in which the subscriber selects “channels” containing programming of interest. Such channels may be offered as part of a video programming service or within a data service and can be presented within an Internet browser.
Along with the multitude of programming choices which the subscriber faces, subscribers are subject to advertisements, which in many cases subsidize or pay for the entire cost of the programming. While advertisements are sometimes beneficial to subscribers and deliver desired information regarding specific products or services, consumers generally view advertising as a “necessary evil” for broadcast-type entertainment.
In order to deliver more targeted programming and advertising to subscribers, it is necessary to understand their likes and dislikes to a greater extent than is presently done. Systems which identify subscriber preferences based on their purchases and responses to questionnaires allow for the targeted marketing of literature in the mail, but do not in any sense allow for the rapid and precise delivery of programming and advertising which is known to have a high probability of acceptance by the subscriber. In order to determine which programming or advertising is appropriate for the subscriber, knowledge of that subscriber and of the subscriber's product and programming preferences is required.
Specific information regarding a subscriber's viewing habits or the Internet web sites they have accessed can be stored for analysis, but such records are considered private and subscribers are not generally willing to have such information leave their control. Although there are regulatory models which permit the collection of such data on a “notice and consent” basis, there is a general tendency towards legal rules which prohibit the collection of such raw data.
SUMMARY OF THE INVENTION For the foregoing reasons, there is a need for a subscriber characterization system which may generate and store subscriber characteristics that reflect the probable demographics and preferences of the subscriber and household.
The present invention includes a system for characterizing subscribers watching video or multimedia programming based on monitoring their detailed selection choices, including the time duration of their viewing, the number of channel changes, the volume at which the programming is listened to, and the program selection, and on collecting text information about that programming to determine what type of programming the subscriber is most interested in.
Furthermore, the system is equipped with one or more filters that assist in determining selection data associated with irrelevant activities by the subscriber which should be excluded from the actual viewing selection data, e.g., selection data associated with channel surfing and/or channel jumping (up and down) activities by the subscriber.
The channel surfing activity refers to one or more rapid channel changes initiated by the subscriber for the purpose of selecting a channel/program for actual viewing. Generally, the subscriber selects a channel, views the contents of the program on the selected channel for a few seconds (about 3-4 seconds), and then changes the channel to view the contents of the next channel. Such rapid changes generally occur a few times in a row before the subscriber selects a channel/program for actual viewing. The filters of the present invention are configured to detect channel surfing activities by the subscriber by monitoring and evaluating the associated viewing times, so that the channel surfing activities are not considered in the determination of actual viewing selections.
Channel jumping refers to an activity wherein the subscriber changes channels very rapidly in order to move from an existing channel to a desired channel. Here, the subscriber is not channel surfing; instead, the subscriber already knows the intended channel/program for actual viewing and is jumping channels to reach the desired channel. For example, if the subscriber is at channel number 6 and wants to go to channel number 12, the subscriber may jump by changing the channel six times. Generally, in channel jumping, the channel changes occur very rapidly and the viewing time at each channel is very brief, e.g., less than one second. The filters of the present invention are configured to detect channel jumping, so that the channel jumping activities are not considered in the determination of actual viewing selections.
The filters of the present invention are also capable of monitoring extended spans of inactivity, e.g., a lack of any channel changes, volume changes, or any other selection change activity for more than 3 hours. Such spans of inactivity are considered “dead periods,” implying that the subscriber is not actively watching the video and/or other multimedia programming. Such dead periods may occur because the subscriber has left the room, because the subscriber is not active (e.g., the subscriber has gone to sleep or has dozed off), or because the subscriber is actively engaged in another activity within the room and is not attending to the programming.
The system of the present invention analyzes the actual viewing selections made by the subscriber or the subscriber household, and generates a demographic description of the subscriber or household. This demographic description describes the probable age, income, gender and other demographics. The resulting characterization includes probabilistic determinations of what other programming or products the subscriber/household will be interested in.
The present invention also encompasses the use of heuristic rules in logical form or expressed as conditional probabilities to aid in forming a subscriber profile. The heuristic rules in logical form allow the system to apply generalizations that have been learned from external studies to obtain a characterization of the subscriber. In the case of conditional probabilities, determinations of the probable content of a program can be applied in a mathematical step to a matrix of conditional probabilities to obtain probabilistic subscriber profiles indicating program and product likes and dislikes, as well as for determining probabilistic demographic data.
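By way of illustration only, the mathematical step described above can be sketched as a weighted sum over a matrix of conditional probabilities. The following Python is a minimal sketch under stated assumptions, not the claimed implementation: the content categories, the demographic groups, the function name `demographic_profile`, and every numeric value are hypothetical and chosen only to make the arithmetic visible.

```python
# Hypothetical content categories and demographic groups, for illustration only.
CONTENT = ["news", "sports", "cartoons"]
GROUPS = ["age_under_18", "age_18_to_49", "age_50_plus"]

# P(group | content): one row per content category, each row summing to 1.
# The numbers are invented for the example and are not taken from any study.
COND_PROB = [
    [0.05, 0.45, 0.50],  # news
    [0.15, 0.60, 0.25],  # sports
    [0.80, 0.15, 0.05],  # cartoons
]


def demographic_profile(content_probs):
    """Combine P(content) with P(group | content) to obtain P(group)."""
    profile = [0.0] * len(GROUPS)
    for p_content, row in zip(content_probs, COND_PROB):
        for j, p_group in enumerate(row):
            profile[j] += p_content * p_group
    return dict(zip(GROUPS, profile))


# A program judged 30% likely to be news and 70% likely to be sports.
profile = demographic_profile([0.3, 0.7, 0.0])
```

Because each row of the matrix sums to one and the content probabilities sum to one, the resulting profile is itself a probability distribution over the demographic groups.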
In accordance with the principles of the present invention, the resulting probabilistic information can be stored locally and controlled by the subscriber, or can be transferred to a third party that can provide access to the subscriber characterization. The information can also be encrypted to prevent unauthorized access in which case only the subscriber or someone authorized by the subscriber can access the data.
These and other features and objects of the invention will be more fully understood from the following detailed description of the preferred embodiments which should be read in light of the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS The accompanying drawings, which are incorporated in and form a part of the specification, illustrate the embodiments of the present invention and, together with the description, serve to explain the principles of the invention.
In the drawings:
FIG. 1A illustrates a context diagram for a subscriber characterization system having filters;
FIG. 1B illustrates a functional diagram of the processing utilized by filters;
FIG. 2 illustrates a block diagram for a realization of a subscriber monitoring system for receiving video signals;
FIG. 3 illustrates a block diagram of a channel processor;
FIG. 4 illustrates a block diagram of a computer for a realization of the subscriber monitoring system;
FIG. 5 illustrates a channel sequence and volume over a twenty-four (24) hour period;
FIG. 6A illustrates a time of day detailed record;
FIG. 6B illustrates the processing utilized by filters of FIG. 1A to determine channel surfing activities;
FIG. 6C illustrates the processing utilized by filters of FIG. 1A to determine channel jumping activities;
FIG. 7 illustrates a household viewing habits statistical table;
FIG. 8A illustrates an entity-relationship diagram for the generation of program characteristics vectors;
FIG. 8B illustrates a flowchart for program characterization;
FIG. 9A illustrates a deterministic program category vector;
FIG. 9B illustrates a deterministic program sub-category vector;
FIG. 9C illustrates a deterministic program rating vector;
FIG. 9D illustrates a probabilistic program category vector;
FIG. 9E illustrates a probabilistic program sub-category vector;
FIG. 9F illustrates a probabilistic program content vector;
FIG. 10A illustrates a set of logical heuristic rules;
FIG. 10B illustrates a set of heuristic rules expressed in terms of conditional probabilities;
FIG. 11 illustrates an entity-relationship diagram for the generation of program demographic vectors;
FIG. 12 illustrates a program demographic vector;
FIG. 13 illustrates an entity-relationship diagram for the generation of household session demographic data and household session interest profiles;
FIG. 14 illustrates an entity-relationship diagram for the generation of average and session household demographic characteristics;
FIG. 15 illustrates average and session household demographic data;
FIG. 16 illustrates an entity-relationship diagram for generation of a household interest profile; and
FIG. 17 illustrates a household interest profile including programming and product profiles.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT In describing a preferred embodiment of the invention illustrated in the drawings, specific terminology will be used for the sake of clarity. However, the invention is not intended to be limited to the specific terms so selected, and it is to be understood that each specific term includes all technical equivalents that operate in a similar manner to accomplish a similar purpose.
With reference to the drawings, in general, and FIGS. 1 through 17 in particular, the apparatus of the present invention is disclosed.
The present invention is directed at an apparatus for generating a subscriber profile that contains useful information regarding the subscriber likes and dislikes. Such a profile is useful for systems which provide targeted programming or advertisements to the subscriber, and allow material (programs or advertisements) to be directed at subscribers who will have a high probability of liking the program or a high degree of interest in purchasing the product.
Since there are typically multiple individuals in a household, the subscriber characterization may not be a characterization of an individual subscriber but may instead be a household average. When used herein, the term subscriber refers both to an individual subscriber as well as the average characteristics of a household of multiple subscribers.
In the present system the programming viewed by the subscriber, both entertainment and advertisement, can be studied and processed by the subscriber characterization system. In this study, system filters are configured to eliminate selection data associated with irrelevant activities from the actual selection data. The actual selection data is then used to determine the program characteristics. This determination of the program characteristics is referred to as a program characteristics vector. This vector may be a truly one-dimensional vector, but can also be represented as an n dimensional matrix which can be decomposed into vectors.
The subscriber profile vector represents a profile of the subscriber (or the household of subscribers) and can be in the form of a demographic profile (average or session) or a program or product preference vector. The program and product preference vectors are considered to be part of a household interest profile which can be thought of as an n dimensional matrix representing probabilistic measurements of subscriber interests.
In the case that the subscriber profile vector is a demographic profile, the subscriber profile vector indicates a probabilistic measure of the age of the subscriber or average age of the viewers in the household, sex of the subscriber, income range of the subscriber or household, and other such demographic data. Such information comprises household demographic characteristics and is composed of both average and session values. Extracting a single set of values from the household demographic characteristics can correspond to a subscriber profile vector.
The household interest profile can contain both programming and product profiles, with programming profiles corresponding to probabilistic determinations of what programming the subscriber (household) is likely to be interested in, and product profiles corresponding to what products the subscriber (household) is likely to be interested in. These profiles contain both an average value and a session value, the average value being a time average of data, where the averaging period may be several days, weeks, months, or the time between resets of the unit.
Since a viewing session is likely to be dominated by a particular viewer, the session values may, in some circumstances, correspond most closely to the subscriber values, while the average values may, in some circumstances, correspond most closely to the household values.
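As a non-limiting illustration of how a session value might be folded into an average value, the following Python sketch maintains a running mean over sessions. The helper name `update_average`, the use of a simple unweighted mean, and the sample values are all assumptions; the description above only requires that the average be a time average over some period.

```python
def update_average(average, session, sessions_seen):
    """Fold one session vector into a running mean (hypothetical helper).

    `sessions_seen` counts the sessions already included in `average`;
    passing 0 corresponds to the first session after a reset of the unit.
    """
    n = sessions_seen + 1
    return [a + (s - a) / n for a, s in zip(average, session)]


# Two sessions of a two-element interest vector (values invented).
avg = update_average([0.0, 0.0], [0.8, 0.2], sessions_seen=0)
avg = update_average(avg, [0.4, 0.6], sessions_seen=1)
```

Under these assumptions the session vector is kept as-is for the current viewing session, while the running mean approximates the household value over the averaging period.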
FIG. 1A depicts the context diagram of a preferred embodiment of a Subscriber Characterization System with Filters (SCSF) 100. A context diagram, in combination with entity-relationship diagrams, provides a basis from which one skilled in the art can realize the present invention. The present invention can be realized in a number of programming languages including C, C++, Perl, and Java, although the scope of the invention is not limited by the choice of a particular programming language or tool. Object oriented languages have several advantages in terms of construction of the software used to realize the present invention, although the present invention can be realized in procedural or other types of programming languages known to those skilled in the art.
Filters of SCSF 100 may be a computer means or a software module configured with some predetermined rules. These predetermined rules assist in recognizing irrelevant activities and in eliminating the associated selection data from the raw subscriber selection data. Filters and their related processing are described in detail later.
In the process of collecting raw subscriber selection data, the SCSF 100 receives from a user 120 commands in the form of a volume control signal 124 or program selection data 122, which can be in the form of a channel change but may also be an address request which requests the delivery of programming from a network address. A record signal 126 indicates that the programming or the address of the programming is being recorded by the user. The record signal 126 can also be a printing command, a tape recording command, a bookmark command or any other command intended to store the program being viewed, or program address, for later use.
The material being viewed by the user 120 is referred to as source material 130. The source material 130, as defined herein, is the content that a subscriber selects and may consist of analog video, Moving Picture Experts Group (MPEG) digital video source material, other digital or analog material, Hypertext Markup Language (HTML) or other type of multimedia source material. The subscriber characterization system 100 can access the source material 130 received by the user 120 using a start signal 132 and a stop signal 134, which control the transfer of source related text 136 which can be analyzed as described herein.
In a preferred embodiment, the source related text 136 can be extracted from the source material 130 and stored in memory. The source related text 136, as defined herein, includes source related textual information including descriptive fields which are related to the source material 130, or text which is part of the source material 130 itself. The source related text 136 can be derived from a number of sources including but not limited to closed-captioning information, Electronic Program Guide (EPG) material, and text information in the source itself (e.g. text in HTML files).
Electronic Program Guide (EPG) 140 contains information related to the source material 130 which is useful to the user 120. The EPG 140 is typically a navigational tool which contains source related information including but not limited to the programming category, program description, rating, actors, and duration. The structure and content of EPG data is described in detail in U.S. Pat. No. 5,596,373 assigned to Sony Corporation and Sony Electronics which is herein incorporated by reference. As shown in FIG. 1, the EPG 140 can be accessed by the SCSF 100 by a request EPG data signal 142 which results in the return of a category 144, a sub-category 146, and a program description 148.
In one embodiment of the present invention, EPG data is accessed and program information such as the category 144, the sub-category 146, and the program description 148 is stored in memory.
In another embodiment of the present invention, the source related text 136 is the closed-captioning text embedded in the analog or digital video signal. Such closed-captioning text can be stored in memory for processing to extract the program characteristics vectors 150.
The raw subscriber selection data 110 is accumulated from the monitored activities of the user. The raw subscriber selection data 110 includes time 112A, which corresponds to the time of an event, channel ID 114A, program ID 116A, program title 117A, volume level 118A, and channel change record 119A. A detailed record of selection data is illustrated in FIG. 6A.
Generally, the raw subscriber selection data 110 contains the raw data accumulated over a predetermined period of time and relates to viewing selections made by the subscriber over the predetermined period of time. The filters of SCSF 100 evaluate the raw subscriber selection data 110, eliminate any selection data associated with irrelevant activities, and in turn generate actual subscriber selection data 199 that corresponds only to the actual viewing selections made by the subscriber. The actual subscriber selection data 199 comprises time 112B, which corresponds to the time of an actual viewing event exclusive of channel surfing, channel jumping or dead periods, channel ID 114B, program ID 116B, program title 117B, volume level 118B, and channel change record 119B.
The raw subscriber selection data 110 may be processed in accordance with some pre-determined heuristic rules to generate actual subscriber selection data 199. In one embodiment, the selection data associated with channel surfing, channel jumping and dead periods is eliminated from the raw subscriber selection data to generate actual subscriber selection data 199.
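For illustration, the relationship between the raw subscriber selection data 110 and the actual subscriber selection data 199 can be sketched as a record type plus a rule-driven filter. The Python below is a sketch only: the class name `SelectionRecord`, its field names, the added `dwell` field, and the function `to_actual` are hypothetical and merely mirror items 112 through 119 described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SelectionRecord:
    """One row of selection data; field names are assumptions that mirror
    items 112 through 119 in the description."""
    time: datetime          # time of the event (112)
    channel_id: int         # channel ID (114)
    program_id: str         # program ID (116)
    program_title: str      # program title (117)
    volume_level: int       # volume level (118)
    channel_changes: int    # channel change record (119)
    dwell: timedelta        # viewing time until the next event (assumed field)


def to_actual(raw, rules):
    """Drop every record that any predetermined rule marks as irrelevant."""
    return [r for r in raw if not any(rule(r) for rule in rules)]
```

Each rule is a predicate over a single record, so rules for channel surfing, channel jumping, and dead periods could be supplied independently and combined in one pass.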
Based on the actual subscriber selection data 199, SCSF 100 generates one or more program characteristics vectors 150 which are comprised of program characteristics data 152, as illustrated in FIG. 1. The program characteristics data 152, which can be used to create the program characteristics vectors 150 both in vector and table form, are examples of source related information which represent characteristics of the source material. In a preferred embodiment, the program characteristics vectors 150 are lists of values which characterize the programming (source) material in accordance with the category 144, the sub-category 146, and the program description 148. The present invention may also be applied to advertisements, in which case program characteristics vectors contain, as an example, a product category, a product sub-category, and a brand name.
As illustrated in FIG. 1A, the SCSF 100 uses heuristic rules 160. The heuristic rules 160, as described herein, are composed of both logical heuristic rules as well as heuristic rules expressed in terms of conditional probabilities. The heuristic rules 160 can be accessed by the SCSF 100 via a request rules signal 162 which results in the transfer of a copy of rules 164 to the SCSF 100.
The SCSF 100 forms program demographic vectors 170 from program demographics 172, as illustrated in FIG. 1A. The program demographic vectors 170 also represent characteristics of source related information in the form of the intended or expected demographics of the audience for which the source material is intended.
In a preferred embodiment, household viewing data 197, as illustrated in FIG. 1A, is computed from the actual subscriber selection data 199. The household viewing data 197 is derived from the actual subscriber selection data 199 by looking at viewing habits at a particular time of day over an extended period of time, usually several days or weeks, and making some generalizations regarding the viewing habits during that time period. The SCSF 100 also transforms household viewing data 197 to form household viewing habits 195, i.e., a statistical representation of subscriber/household viewing data illustrating patterns in viewing.
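As an illustrative sketch of deriving time-of-day generalizations from actual selection data, the Python below accumulates minutes viewed per channel for each hour of the day and reports the most-watched channel for a given hour. The function names `viewing_by_hour` and `favorite_channel`, the tuple layout of the input, and the choice of "most minutes viewed" as the generalization are assumptions made for the example.

```python
from collections import defaultdict


def viewing_by_hour(records):
    """Accumulate minutes viewed per channel for each hour of the day.

    `records` is an iterable of (hour_of_day, channel_id, minutes_viewed)
    tuples collected over several days or weeks (hypothetical layout).
    """
    totals = defaultdict(lambda: defaultdict(float))
    for hour, channel, minutes in records:
        totals[hour][channel] += minutes
    return totals


def favorite_channel(totals, hour):
    """Return the channel with the most minutes in that hour, or None."""
    by_channel = totals.get(hour)
    if not by_channel:
        return None
    return max(by_channel, key=by_channel.get)
```

Other statistics, such as the fraction of time per program category in each time slot, could be accumulated in the same manner.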
The program characteristics vector 150 is derived from the source related text 136 and/or from the EPG 140 by applying information retrieval techniques. The details of this process are discussed in accordance with FIG. 8.
The program characteristics vector 150 is used in combination with a set of the heuristic rules 160 to define a set of the program demographic vectors 170 illustrated in FIG. 1A describing the audience the program is intended for.
One output of the SCSF 100 is a household profile including household demographic characteristics 190 and a household interest profile 180. The household demographic characteristics 190 result from the transfer of household demographic data 192, and the household interest profile 180 results from the transfer of household interests data 182. Both the household demographic characteristics 190 and the household interest profile 180 have a session value and an average value, as will be discussed herein.
Referring now to FIG. 1B, exemplary processing of the filters is shown. As mentioned before, filters 115 evaluate the raw subscriber selection data 110 to identify any data associated with irrelevant selection activities and then generate actual subscriber selection data 199, which does not include the irrelevant selection data. The irrelevant selection data generally corresponds to channel surfing, channel jumping, or dead period activities. These activities are generally recognized by reviewing the corresponding viewing times. In the case of channel surfing or channel jumping, the associated viewing times are very brief, e.g., a few milliseconds or a few seconds. In the case of dead periods, the span without any action is relatively long, e.g., a few hours.
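The recognition of irrelevant activities from viewing times can be sketched as a simple classification by dwell time. The following Python is a minimal sketch, not the claimed implementation: the function name `classify_dwell` and the exact thresholds (one second for jumping, four seconds for surfing, three hours for dead periods) are assumptions drawn from the example figures given in the summary, and in practice such values would be configurable predetermined rules.

```python
from datetime import timedelta

# Illustrative thresholds taken from the examples in the summary; real
# values would be configurable predetermined rules.
JUMP_MAX = timedelta(seconds=1)   # "less than one second"
SURF_MAX = timedelta(seconds=4)   # "about 3-4 seconds"
DEAD_MIN = timedelta(hours=3)     # "more than 3 hours" without any action


def classify_dwell(dwell: timedelta) -> str:
    """Label the span between two selection events (hypothetical helper)."""
    if dwell < JUMP_MAX:
        return "jumping"
    if dwell <= SURF_MAX:
        return "surfing"
    if dwell > DEAD_MIN:
        return "dead_period"
    return "viewing"
```

Only spans labeled "viewing" would contribute to the actual subscriber selection data 199 under this sketch.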
The monitoring system depicted in FIG. 2 is responsible for monitoring the subscriber activities, and can be used to realize the SCSF 100. In a preferred embodiment, the monitoring system of FIG. 2 is located in a television set-top device or in the television itself. In an alternate embodiment, the monitoring system is part of a computer which receives programming from a network.
In an application of the system for television services, an input connector 220 accepts the video signal coming either from an antenna, cable television input, or other network. The video signal can be analog or digital MPEG. Alternatively, the video source may be a video stream or other multimedia stream from a communications network including the Internet.
As illustrated in FIG. 2, a system control unit 200 receives commands from the user 120, decodes each command, and forwards it to the destination module. In a preferred embodiment, the commands are entered via a remote control to a remote receiver 205 or a set of selection buttons 207 available at the front panel of the system control unit 200. In an alternate embodiment, the commands are entered by the user 120 via a keyboard.
The system control unit 200 also contains a Central Processing Unit (CPU) 203 for processing and supervising all of the operations of the system control unit 200, a Read Only Memory (ROM) 202 containing the software and fixed data, and a Random Access Memory (RAM) 204 for storing data. The CPU 203, RAM 204, ROM 202, and I/O controller 201 are attached to a master bus 206. A power supply in the form of a battery can also be included in the system control unit 200 for backup in case of a power outage.
An input/output (I/O) controller 201 interfaces the system control unit 200 with external devices. In a preferred embodiment, the I/O controller 201 interfaces to the remote receiver 205 and a selection button such as the channel change button on a remote control. In an alternate embodiment, it can accept input from a keyboard or a mouse.
The program selection data 122 is forwarded to a channel processor 210. The channel processor 210 tunes to a selected channel and the media stream is decomposed into its basic components: the video stream, the audio stream, and the data stream. The video stream is directed to a video processor module 230 where it is decoded and further processed for display on the TV screen. The audio stream is directed to an audio processor 240 for decoding and output to the speakers.
The data stream can be EPG data, closed-captioning text, Extended Data Service (EDS) information, a combination of these, or an alternate type of data. In the case of EDS, the call sign, program name and other useful data are provided. In a preferred embodiment, the data stream is stored in a reserved location of the RAM 204. In an alternate embodiment, a magnetic disk is used for data storage. The system control unit 200 also writes to a dedicated memory, which in a preferred embodiment is the RAM 204, the selected channel, the time 112A of selection, the volume level 118A, the program ID 116A and the program title 117A. Upon receiving the program selection data 122, the new selected channel is directed to the channel processor 210 and the system control unit 200 writes to the dedicated memory the channel selection end time and the program title 117A at the time 112A of channel change. The system control unit 200 keeps track of the number of channel changes occurring during the viewing time via the channel change record 119A. This data forms part of the raw subscriber selection data 110.
The volume control signal 124 is sent to the audio processor 240. In a preferred embodiment, the volume level 118A selected by the user 120 corresponds to the listening volume. In an alternate embodiment, the volume level 118A selected by the user 120 represents a volume level to another piece of equipment such as an audio system (home theatre system) or to the television itself. In such a case, the volume can be measured directly by a microphone or other audio sensing device which can monitor the volume at which the selected source material is being listened to.
A program change occurring while watching a selected channel is also logged by the system control unit 200. Monitoring the content of the program at the time of the program change can be done by reading the content of the EDS. The EDS contains information such as the program title, which is transmitted via the vertical blanking interval (VBI). A change in the program title field is detected by the monitoring system and logged as an event. In an alternate embodiment, an EPG is present and program information can be extracted from the EPG. In a preferred embodiment, the programming data received from the EDS or EPG permits distinguishing between entertainment programming and advertisements.
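The logging of a program change from the program title field can be sketched as a comparison of successive readings. The Python below is illustrative only: the function name `log_title_changes` and the (time, title) sample layout are assumptions, and no particular EDS decoding interface is implied.

```python
def log_title_changes(samples):
    """Return one event per change in the program title field.

    `samples` is an iterable of (time, title) readings of the title field
    on the currently selected channel (hypothetical layout). Each event is
    a (time, previous_title, new_title) tuple.
    """
    events = []
    previous = None
    for time, title in samples:
        if previous is not None and title != previous:
            events.append((time, previous, title))
        previous = title
    return events
```

The first reading establishes the current title and does not itself produce an event; only a later reading with a different title is logged.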
FIG. 3 shows the block diagram of the channel processor 210. In a preferred embodiment, the input connector 220 connects to a tuner 300 which tunes to the selected channel. A local oscillator can be used to heterodyne the signal to the IF signal. A demodulator 302 demodulates the received signal and the output is fed to an FEC decoder 304. The data stream received from the FEC decoder 304 is, in a preferred embodiment, in an MPEG format. In a preferred embodiment, system demultiplexer 306 separates out video and audio information for subsequent decompression and processing, as well as ancillary data which can contain program related information.
The data stream presented to the system demultiplexer 306 consists of packets of data including video, audio and ancillary data. The system demultiplexer 306 identifies each packet from the stream ID and directs the stream to the corresponding processor. The video data is directed to the video processor module 230 and the audio data is directed to the audio processor 240. The ancillary data can contain closed-captioning text, emergency messages, program guide, or other useful information.
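The routing of packets by stream ID can be sketched as a table lookup. The following Python is an illustrative sketch only: the function name `demultiplex`, the (stream_id, payload) packet layout, and the destination names are hypothetical, and the stream ID values used in any routing table passed to it are placeholders rather than values required by the system described.

```python
def demultiplex(packets, routes):
    """Route (stream_id, payload) packets to per-destination lists.

    `routes` maps a stream ID to a destination name such as "video",
    "audio", or "ancillary" (hypothetical names). Packets whose stream
    ID is not in the table are skipped.
    """
    outputs = {name: [] for name in set(routes.values())}
    for stream_id, payload in packets:
        destination = routes.get(stream_id)
        if destination is not None:
            outputs[destination].append(payload)
    return outputs
```

In this sketch the "video" list would feed the video processor module 230, the "audio" list would feed the audio processor 240, and the "ancillary" list would hold program related information for later text analysis.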
Closed-captioning text is considered to be ancillary data and is thus contained in the video stream. The system demultiplexer 306 accesses the user data field of the video stream to extract the closed-captioning text. The program guide, if present, is carried on a data stream identified by a specific transport program identifier.
In an alternate embodiment, analog video can be used. For analog programming, ancillary data such as closed-captioning text or EDS data are carried in a vertical blanking interval.
FIG. 4 shows the block diagram of a computer system for a realization of the subscriber monitoring system based on the reception of multimedia signals from a bi-directional network. A system bus 422 transports data amongst the CPU 203, the RAM 204, Read Only Memory-Basic Input Output System (ROM-BIOS) 406 and other components. The CPU 203 accesses a hard drive 400 through a disk controller 402. The standard input/output devices are connected to the system bus 422 through the I/O controller 201. A keyboard is attached to the I/O controller 201 through a keyboard port 416 and the monitor is connected through a monitor port 418. The serial port device uses a serial port 420 to communicate with the I/O controller 201. Industry Standard Architecture (ISA) expansion slots 408 and Peripheral Component Interconnect (PCI) expansion slots 410 allow additional cards to be placed into the computer. In a preferred embodiment, a network card is available to interface a local area, wide area, or other network.
FIG. 5 illustrates a channel sequence and volume over a twenty-four (24) hour period. The Y-axis represents the status of the receiver in terms of on/off status and volume level. The X-axis represents the time of day. The channels viewed are represented by the windows 501-506, with a first channel 502 being watched followed by the viewing of a second channel 504, and a third channel 506 in the morning. In the evening a fourth channel 501 is watched, a fifth channel 503, and a sixth channel 505. A channel change is illustrated by a momentary transition to the “off” status and a volume change is represented by a change of level on the Y-axis.
A detailed record of the raw subscriber selection data 110 is illustrated in FIG. 6A in a table format. A time column 602 contains the starting time of every event occurring during the viewing time. A Channel ID column 604 lists the channels viewed or visited during that period. A program title column 603 contains the titles of all programs viewed. A volume column 601 contains the volume level 118 at the time 112 of viewing a selected channel.
Generally, the raw subscriber selection data 110 is unprocessed and includes data associated with irrelevant or inconsequential activities, e.g., channel surfing, channel jumping, or dead period activities. Thus, before the subscriber/household viewing habits 195 are determined, the raw subscriber selection data 110 is filtered to eliminate the data associated with these activities.
As illustrated in FIG. 6B, channel surfing is an activity wherein the subscriber rapidly changes channels before arriving at a channel which may be of interest to him. During the channel surfing period, the viewing time of each intermediate channel is very brief, e.g., less than one minute. In this viewing time, the subscriber briefly glances at the channel programming, and then moves on to the next channel.
One or more filters 115 of the present invention are configured to filter out the surfing activity, so that only the actual viewing activity is considered in the make-up of household viewing habits. For example, in FIG. 6B, the viewing record illustrates that the viewing time of each of the channels 2, 3, 4, and 5 is less than a minute, whereas the viewing time of channel 6 is about an hour. The filter 115 of the present invention evaluates this record, and then removes the corresponding viewing times of channels 2, 3, 4, and 5 from the viewing records. The viewing time of channel number 6 is kept, as it is indicative not of channel surfing but of actual viewing.
Similarly, the viewing record also indicates that the corresponding viewing times of each of channel numbers 7, 8, 9, 58, 57, 56, 55, 54, and 53 are about a minute or less, whereas the viewing time of channel 25 is about 10 minutes. This implies that after the subscriber had completed the viewing of channel number 6, the subscriber once again surfed the channels to find programming of interest at channel 25.
Filters 115 of the present invention are configured to evaluate the associated viewing times and to remove the data associated with most of the channel surfing activities. For example, the viewing times of the channel numbers 7, 8, 9, 58, 57, 56, 55, 54, and 53 are removed, but the viewing time associated with channel number 25 is kept. Similarly, the viewing times associated with channels 24, 23, 99, 98, 97, and 2 are eliminated (as indicative of channel surfing) and the viewing time of channel number 3 is kept.
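The surfing filter described above can be sketched as a simple dwell-time threshold. This is a minimal illustration only: the `ViewingEvent` record, the `filter_surfing` function, and the 60-second default threshold are assumptions chosen to match the "less than one minute" example, not elements defined by the specification.

```python
from dataclasses import dataclass


@dataclass
class ViewingEvent:
    """Hypothetical record: a channel and the time spent on it before the next change."""
    channel: int
    seconds: float


def filter_surfing(events, min_seconds=60.0):
    """Keep only events whose dwell time reaches the (assumed) surfing threshold."""
    return [e for e in events if e.seconds >= min_seconds]


# Channels 2-5 are glanced at briefly; channel 6 is watched for about an hour.
record = [
    ViewingEvent(2, 20),
    ViewingEvent(3, 15),
    ViewingEvent(4, 30),
    ViewingEvent(5, 10),
    ViewingEvent(6, 3600),
]
kept = filter_surfing(record)
```

Under these assumptions only the channel 6 entry survives, mirroring the first example of FIG. 6B.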
FIG. 6C illustrates processing involved in the elimination of viewing times associated with the channel jumping activities. The channel jumping activity differs from a channel surfing activity in the sense that the subscriber already knows the intended programming (and corresponding channel number) he wants to watch, and utilizes the channel up or channel down button to arrive at the intended channel.
The viewing times of all the intermediate channels during channel jumping activity are generally very brief (less than a second). Also, as the channel up or channel down button is utilized to reach the desired channels, there generally exists an upwards or a downwards stream of channel changes, i.e., the subscriber may jump through channels 2, 3, 4, and 5 to reach channel number 6 (an intended channel). Similarly, the subscriber may jump through channels 7, 8, 9, 10, 11, 12, 13, 14, 15, and 16 to reach channel 17.
Filters 115 of the present invention are configured to eliminate the channel jumping data from the actual viewing data. The filters generally evaluate the associated viewing times, and all the viewing times which correspond to channel jumping, e.g., those of less than one second, are removed from the viewing records. In the exemplary case of FIG. 6C, the viewing times of channels 15 and 14 are removed, but the viewing time of channel 13 is kept. Similarly, the viewing times of channels 14, 15, 16, 17, 18, 19, 20, and 21 are removed and the viewing time of channel 22 is kept.
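One way to sketch the jumping filter is to drop any sub-second entry that is followed by an adjacent channel number, which captures the upward or downward stream of channel changes. The `filter_jumping` name, the tuple layout, and the adjacency test are illustrative assumptions; the text itself only states that viewing times of less than one second are removed.

```python
def filter_jumping(events, max_jump_seconds=1.0):
    """Drop intermediate channels passed through with the channel up/down button.

    events: ordered list of (channel, seconds) tuples (hypothetical layout).
    An entry is treated as a jump if its dwell time is below max_jump_seconds
    and the next entry is the adjacent channel number (one step up or down).
    """
    kept = []
    for i, (channel, seconds) in enumerate(events):
        has_next = i + 1 < len(events)
        adjacent = has_next and abs(events[i + 1][0] - channel) == 1
        if seconds < max_jump_seconds and adjacent:
            continue  # intermediate channel on the way to the intended one
        kept.append((channel, seconds))
    return kept


# The subscriber steps up through channels 2-5 to reach channel 6.
record = [(2, 0.4), (3, 0.3), (4, 0.5), (5, 0.4), (6, 1800.0)]
result = filter_jumping(record)
```

The adjacency check is a design choice that keeps a very short entry when it is not part of an up/down run; a simpler variant would apply the time threshold alone.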
Filters 115 are also configured to eliminate data associated with dead activities, e.g., extended spans of inactivity. These extended spans of inactivity indicate that the subscriber is not actively watching the programming, e.g., the subscriber has left the room, has gone to sleep, or is otherwise engaged in some other activity. These spans of inactivity may be determined by evaluating channel change commands, volume change commands, or other program selection commands issued by the subscriber. For example, if the evaluation of the viewing record indicates that the subscriber has not issued any channel change, volume change, on/off, or other program selection command in the last three hours, it is assumed that the subscriber is in an inactive condition, and the remaining viewing time of that viewing session is not considered in the make-up of the household viewing habits 195. The spans of inactivity may have many causes, e.g., the subscriber has gone to sleep or has dozed off, or the subscriber is actively engaging in another activity and is not attending to the programming. Also, it is generally known that subscribers often do not turn their televisions and other multimedia sources off before attending to some other activities, e.g., cooking in the kitchen, making a run to the nearby grocery store, or going to the basement for a work-out.
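The dead-period rule can be sketched by capping the viewing time credited after each subscriber command. The function name `credited_seconds`, the timestamp representation, and the choice to credit viewing up to the moment the three-hour limit is reached are assumptions made for illustration; the text does not say whether the three hours themselves are counted.

```python
def credited_seconds(command_times, session_end, inactivity_limit=3 * 3600):
    """Total viewing time credited to a session after removing dead periods.

    command_times: sorted timestamps in seconds of subscriber commands,
                   the first being power-on (hypothetical representation).
    session_end:   timestamp at which the receiver was turned off.
    Each gap between commands is credited only up to inactivity_limit.
    """
    total = 0.0
    boundaries = list(command_times) + [session_end]
    for start, end in zip(boundaries, boundaries[1:]):
        total += min(end - start, inactivity_limit)
    return total


# Power-on at t=0, one channel change after 30 minutes, set left on for 8 hours.
credited = credited_seconds([0, 1800], session_end=8 * 3600)
```

In this example the first 30 minutes and the following three hours are credited, and the remaining four and a half hours are discarded as a dead period.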
The filters 115 of the present invention continuously filter out the irrelevant information associated with the channel surfing activities, the channel jumping activities, or the periods of inactivity, so that the data used for generating household viewing habits is more illustrative of the actual viewing habits. The actual subscriber selection data is then used to create household viewing habits.
A representative statistical record corresponding to the household viewing habits 195 is illustrated in FIG. 7. In a preferred embodiment, a time of day column 700 is organized into periods of time including morning, mid-day, afternoon, night, and late night. In an alternate embodiment, smaller time periods are used. Column 702 lists the number of minutes watched in each period. The average number of channel changes during that period is included in column 704. The average volume is also included in column 706. The last row of the statistical record contains the totals for the items listed in the minutes watched column 702, the channel changes column 704 and the average volume column 706.
FIG. 8A illustrates an entity-relationship diagram for the generation of the program characteristics vector 150. The context vector generation and retrieval technique described in U.S. Pat. No. 5,619,709, which is incorporated herein by reference, can be applied for the generation of the program characteristics vectors 150. Other techniques are well known by those skilled in the art.
Referring to FIG. 8A, the source material 130 or the EPG 140 are passed through a program characterization process 800 to generate the program characteristics vectors 150. The program characterization process 800 is described in accordance with FIG. 8B. Program content descriptors including a first program content descriptor 802, a second program content descriptor 804 and an nth program content descriptor 806, each classified in terms of the category 144, the sub-category 146, and other divisions as identified in the industry accepted program classification system, are presented to a context vector generator 820. As an example, the program content descriptor can be text representative of the expected content of material found in the particular program category 144. In this example, the program content descriptors 802, 804 and 806 would contain text representative of what would be found in programs in the news, fiction, and advertising categories respectively. The context vector generator 820 generates context vectors for that set of sample texts resulting in a first summary context vector 808, a second summary context vector 810, and an nth summary context vector 812. In the example given, the summary context vectors 808, 810, and 812 correspond to the categories of news, fiction and advertising respectively. The summary vectors are stored in a local data storage system.
Referring to FIG. 8B, a sample of the source related text 136 which is associated with the new program to be classified is passed to the context vector generator 820 which generates a program context vector 840 for that program. The source related text 136 can be either the source material 130, the EPG 140, or other text associated with the source material. A comparison is made between the actual program context vectors and the stored program content context vectors by computing, in a dot product computation process 830, the dot product of the first summary context vector 808 with the program context vector 840 to produce a first dot product 814. Similar operations are performed to produce a second dot product 816 and an nth dot product 818.
The values contained in the dot products 814, 816 and 818, while not probabilistic in nature, can be expressed in probabilistic terms using a simple transformation in which the result represents a confidence level of assigning the corresponding content to that program. The transformed values add up to one. The dot products can be used to classify a program, or form a weighted sum of classifications which results in the program characteristics vectors 150. In the example given, if the source related text 136 was from an advertisement, the nth dot product 818 would have a high value, indicating that the advertising category was the most appropriate category, and assigning a high probability value to that category. If the dot products corresponding to the other categories were significantly higher than zero, those categories would be assigned a value, with the result being the program characteristics vectors 150 as shown in FIG. 9D.
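The dot product comparison and the transformation to confidence values can be sketched as follows. The text does not specify the "simple transformation"; this sketch assumes negative dot products are clipped to zero and the results are divided by their sum so that they add up to one. The three-dimensional vectors and the `classify` function are hypothetical placeholders, not the context vectors of U.S. Pat. No. 5,619,709.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def classify(program_vector, summary_vectors):
    """Map dot products against each summary vector to confidences summing to one.

    Assumed transformation: clip negatives to zero, then normalize by the sum.
    """
    scores = {
        name: max(dot(vec, program_vector), 0.0)
        for name, vec in summary_vectors.items()
    }
    total = sum(scores.values())
    if total == 0:
        # No category matches at all: fall back to a uniform assignment.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: s / total for name, s in scores.items()}


# Toy summary context vectors for the three example categories.
summary_vectors = {
    "news": [1.0, 0.0, 0.0],
    "fiction": [0.0, 1.0, 0.0],
    "advertising": [0.0, 0.0, 1.0],
}
program_vector = [0.1, 0.1, 0.8]  # text that mostly resembles advertising
confidences = classify(program_vector, summary_vectors)
best = max(confidences, key=confidences.get)
```

With these toy values the advertising category receives the highest confidence, while the other two categories keep small non-zero values, as in the statistical vectors of FIG. 9D.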
For the sub-categories, probabilities obtained from the content pertaining to the same sub-category 146 are summed to form the probability for the new program being in that sub-category 146. At the sub-category level, the same method is applied to compute the probability of a program being from the given category 144. The three levels of the program classification system (the category 144, the sub-category 146 and the content) are used by the program characterization process 800 to form the program characteristics vectors 150 which are depicted in FIGS. 9D-9F.
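The roll-up from content to sub-category and from sub-category to category amounts to summing probabilities that share a parent. The following sketch assumes a hypothetical `roll_up` helper and invented content, sub-category, and category labels purely for illustration.

```python
from collections import defaultdict


def roll_up(probabilities, parent_of):
    """Sum child probabilities into their parent level of the classification."""
    totals = defaultdict(float)
    for name, p in probabilities.items():
        totals[parent_of[name]] += p
    return dict(totals)


# Hypothetical content-level probabilities for a new program.
content_probs = {"election coverage": 0.5, "debate": 0.25, "forecast": 0.25}
content_to_sub = {
    "election coverage": "politics",
    "debate": "politics",
    "forecast": "weather",
}
sub_to_category = {"politics": "news", "weather": "news"}

sub_probs = roll_up(content_probs, content_to_sub)
category_probs = roll_up(sub_probs, sub_to_category)
```

Applying the same function twice yields the sub-category and category levels in turn.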
The program characteristics vectors 150 in general are represented in FIGS. 9A through 9F. FIGS. 9A, 9B and 9C are an example of deterministic program vectors. This set of vectors is generated when the program characteristics are well defined, as can occur when the source related text 136 or the EPG 140 contains specific fields identifying the category 144 and the sub-category 146. A program rating can also be provided by the EPG 140.
In the case that these characteristics are not specified, a statistical set of vectors is generated from the process described in accordance with FIG. 8. FIG. 9D shows the probability that a program being watched is from the given category 144. The categories are listed in the X-axis. The sub-category 146 is also expressed in terms of probability. This is shown in FIG. 9E. The content component of this set of vectors is a third possible level of the program classification, and is illustrated in FIG. 9F.
FIG. 10A illustrates sets of logical heuristic rules which form part of the heuristic rules 160. In a preferred embodiment, logical heuristic rules are obtained from sociological or psychological studies. Two types of rules are illustrated in FIG. 10A. The first type links an individual's viewing characteristics to demographic characteristics such as gender, age, and income level. A channel changing rate rule 1030 attempts to determine gender based on channel change rate. An income related channel change rate rule 1010 attempts to link channel change rates to income brackets. A second type of rule links particular programs to particular audiences, as illustrated by a gender determining rule 1050 which links the program category 144/sub-category 146 with a gender. The results of the application of the logical heuristic rules illustrated in FIG. 10A are probabilistic determinations of factors including gender, age, and income level. Although a specific set of logical heuristic rules has been used as an example, a wide number of types of logical heuristic rules can be used to realize the present invention. In addition, these rules can be changed based on learning within the system or based on external studies which provide more accurate rules.
FIG. 10B illustrates a set of the heuristic rules 160 expressed in terms of conditional probabilities. In the example shown in FIG. 10B, the category 144 has associated with it conditional probabilities for demographic factors such as age, income, family size and gender composition. The category 144 has associated with it conditional probabilities that represent the probability that the viewing group is within a certain age group dependent on the probability that they are viewing a program in that category 144.
FIG. 11 illustrates an entity-relationship diagram for the generation of the program demographic vectors 170. In a preferred embodiment, the heuristic rules 160 are applied along with the program characteristic vectors 150 in a program target analysis process 1100 to form the program demographic vectors 170. The program characteristic vectors 150 indicate a particular aspect of a program, such as its violence level. The heuristic rules 160 indicate that a particular demographic group has a preference for that program. As an example, it may be the case that young males have a higher preference for violent programs than other sectors of the population. Thus, a program which has the program characteristic vectors 150 indicating a high probability of having violent content, when combined with the heuristic rules 160 indicating that "young males like violent programs," will result, through the program target analysis process 1100, in the program demographic vectors 170 which indicate that there is a high probability that the program is being watched by a young male.
The program target analysis process 1100 can be realized using software programmed in a variety of languages which processes mathematically the heuristic rules 160 to derive the program demographic vectors 170. The table representation of the heuristic rules 160 illustrated in FIG. 10B expresses the probability that the individual or household is from a specific demographic group based on a program with a particular category 144. This can be expressed in probability terms as follows: "the probability that the individuals are in a given demographic group conditional to the program being in a given category". Referring to FIG. 12, the probability that the group has certain demographic characteristics based on the program being in a specific category is illustrated.
The probability that a program is destined for a specific demographic group can be determined by applying Bayes' rule. This probability is the sum of the conditional probabilities that the demographic group likes the program, conditional to the category 144, weighted by the probability that the program is from that category 144. In a preferred embodiment, the program target analysis can calculate the program demographic vectors by application of logical heuristic rules, as illustrated in FIG. 10A, and by application of heuristic rules expressed as conditional probabilities as shown in FIG. 10B. Logical heuristic rules can be applied using logical programming and fuzzy logic using techniques well understood by those skilled in the art, and are discussed in the text by S. V. Kartalopoulos entitled "Understanding Neural Networks and Fuzzy Logic" which is incorporated herein by reference.
Conditional probabilities can be applied by simple mathematical operations multiplying program context vectors by matrices of conditional probabilities. By performing this process over all the demographic groups, the program target analysis process 1100 can measure how likely a program is to be of interest to each demographic group. Those probability values form the program demographic vector 170 represented in FIG. 12.
As an example, the heuristic rules expressed as conditional probabilities shown in FIG. 10B are used as part of a matrix multiplication in which the program characteristics vector 150 of dimension N, such as those shown in FIGS. 9A-9F, is multiplied by an N×M matrix of heuristic rules expressed as conditional probabilities, such as that shown in FIG. 10B. The resulting vector of dimension M is a weighted average of the conditional probabilities for each category and represents the household demographic characteristics 190. Similar processing can be performed at the sub-category and content levels.
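The matrix multiplication described above can be sketched in a few lines. The category probabilities, the two demographic groups, and the conditional probability values below are invented for illustration and are not taken from FIG. 10B; `demographic_vector` is a hypothetical helper name.

```python
def demographic_vector(program_characteristics, conditional_probabilities):
    """Multiply an N-vector of category probabilities by an N x M matrix.

    conditional_probabilities[i][j] is the (assumed) probability of
    demographic group j given that the program is in category i.
    The result is an M-vector: one weighted average per demographic group.
    """
    n = len(program_characteristics)
    m = len(conditional_probabilities[0])
    return [
        sum(program_characteristics[i] * conditional_probabilities[i][j]
            for i in range(n))
        for j in range(m)
    ]


# N = 3 categories (news, fiction, advertising); M = 2 groups (under 35, 35 and over).
program_characteristics = [0.5, 0.3, 0.2]
conditional_probabilities = [
    [0.2, 0.8],  # news
    [0.6, 0.4],  # fiction
    [0.5, 0.5],  # advertising
]
result = demographic_vector(program_characteristics, conditional_probabilities)
```

Because each row of the matrix and the input vector sum to one in this example, the resulting vector also sums to one and can be read directly as a probability distribution over the demographic groups.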
FIG. 12 illustrates an example of the program demographic vector 170, and shows the extent to which a particular program is destined to a particular audience. This is measured in terms of probability as depicted in FIG. 12. The Y-axis is the probability of appealing to the demographic group identified on the X-axis.
FIG. 13 illustrates an entity-relationship diagram for the generation of household session demographic data 1310 and household session interest profile 1320. In a preferred embodiment, the actual subscriber selection data 199 is used along with the program characteristics vectors 150 in a session characterization process 1300 to generate the household session interest profile 1320. The subscriber selection data 110 indicates what the subscriber is watching, for how long and at what volume they are watching the program.
In a preferred embodiment, the session characterization process 1300 forms a weighted average of the program characteristics vectors 150 in which the time duration the program is watched is normalized to the session time (typically defined as the time from which the unit was turned on to the present). The program characteristics vectors 150 are multiplied by the normalized time duration (which is less than one unless only one program has been viewed) and summed with the previous value. Time duration data, along with other subscriber viewing information, is available from the subscriber selection data 110. The resulting weighted average of program characteristics vectors forms the household session interest profile 1320, with each program contributing to the household session interest profile 1320 according to how long it was watched. The household session interest profile 1320 is normalized to produce probabilistic values of the household programming interests during that session.
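The time-weighted averaging can be sketched as follows. The sketch assumes that the session time equals the sum of the (already filtered) viewing durations, and the `session_interest_profile` function and the toy vectors are illustrative only.

```python
def session_interest_profile(viewings):
    """Time-weighted average of program characteristics vectors.

    viewings: list of (seconds_watched, characteristics_vector) pairs.
    Assumption: session time is the sum of the listed durations.
    """
    session_time = sum(seconds for seconds, _ in viewings)
    dim = len(viewings[0][1])
    profile = [0.0] * dim
    for seconds, vector in viewings:
        weight = seconds / session_time  # normalized time duration
        for k in range(dim):
            profile[k] += weight * vector[k]  # summed with the previous value
    total = sum(profile)
    # Final normalization so the profile reads as probabilities.
    return [p / total for p in profile] if total else profile


# 45 minutes of a news program, then 15 minutes of a fiction program.
viewings = [
    (2700, [1.0, 0.0, 0.0]),  # categories: news, fiction, advertising
    (900, [0.0, 1.0, 0.0]),
]
profile = session_interest_profile(viewings)
```

Here the news program contributes three times as much as the fiction program, in proportion to how long each was watched.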
In an alternate embodiment, the heuristic rules 160 are applied to both the actual subscriber selection data 199 and the program characteristics vectors 150 to generate the household session demographic data 1310 and the household session interest profile 1320. In this embodiment, weighted averages of the program characteristics vectors 150 are formed based on the actual subscriber selection data 199, and the heuristic rules 160 are applied. In the case of logical heuristic rules as shown in FIG. 10A, logical programming can be applied to make determinations regarding the household session demographic data 1310 and the household session interest profile 1320. In the case of heuristic rules in the form of conditional probabilities such as those illustrated in FIG. 10B, a dot product of the time averaged values of the program characteristics vectors can be taken with the appropriate matrix of heuristic rules to generate both the household session demographic data 1310 and the household session interest profile 1320.
Volume control measurements which form part of the actual subscriber selection data 199 can also be applied in the session characterization process 1300 to form a household session interest profile 1320. This can be accomplished by using normalized volume measurements in a weighted average manner similar to how time duration is used. Thus, muting a show results in a zero value for volume, and the program characteristics vector 150 for this show will not be averaged into the household session interest profile 1320.
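Volume weighting can be sketched in the same manner as time weighting. The `volume_weighted_profile` function, the 0 to 100 volume scale, and the use of volume alone as the weight are assumptions for illustration; the text does not state how volume and duration weights would be combined.

```python
def volume_weighted_profile(viewings, max_volume=100):
    """Average program characteristics vectors weighted by normalized volume.

    viewings: list of (volume_level, characteristics_vector) pairs,
              with volume on an assumed 0..max_volume scale.
    A muted program has weight zero and does not contribute.
    """
    weights = [volume / max_volume for volume, _ in viewings]
    total = sum(weights)
    dim = len(viewings[0][1])
    if total == 0:
        return [0.0] * dim  # everything was muted
    return [
        sum(w * vector[k] for w, (_, vector) in zip(weights, viewings)) / total
        for k in range(dim)
    ]


# A program watched at half volume and a second program watched while muted.
viewings = [
    (50, [1.0, 0.0]),
    (0, [0.0, 1.0]),
]
profile = volume_weighted_profile(viewings)
```

The muted program drops out of the average entirely, matching the behavior described for a zero volume value.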
FIG. 14 illustrates an entity-relationship diagram for the generation of average household demographic characteristics and session household demographic characteristics 190. A household demographic characterization process 1400 generates the household demographic characteristics 190 represented in table format in FIG. 15. The household demographic characterization process 1400 uses the household viewing habits 195 in combination with the heuristic rules 160 to determine demographic data. For example, a household with a number of minutes watched of zero during the day may indicate a household with two working adults. Both logical heuristic rules as well as rules based on conditional probabilities can be applied to the household viewing habits 195 to obtain the household demographics characteristics 190.
The household viewing habits 195 are also used by the system to detect out-of-habits events. For example, if a household with a zero value for the minutes watched column 702 at late night presents a session value at that time via the household session demographic data 1310, this session will be characterized as an out-of-habits event and the system can exclude such data from the average if it is highly probable that the demographics for that session are greatly different than the average demographics for the household. Nevertheless, the results of the application of the household demographic characterization process 1400 to the household session demographic data 1310 can result in valuable session demographic data, even if such data is not added to the average demographic characterization of the household.
FIG. 15 illustrates the average and session household demographic characteristics. A household demographic parameters column 1501 is followed by an average value column 1505, a session value column 1503, and an update column 1507. The average value column 1505 and the session value column 1503 are derived from the household demographic characterization process 1400. The deterministic parameters such as address and telephone numbers can be obtained from an outside source or can be loaded into the system by the subscriber or a network operator at the time of installation. Updating of deterministic values is prevented by indicating that these values should not be updated in the update column 1507.
FIG. 16 illustrates an entity-relationship diagram for the generation of the household interest profile 180 in a household interest profile generation process 1600. In a preferred embodiment, the household interest profile generation process comprises averaging the household session interest profile 1320 over multiple sessions and applying the household viewing habits 195 in combination with the heuristic rules 160 to form the household interest profile 180 which takes into account both the viewing preferences of the household as well as assumptions about households/subscribers with those viewing habits and program preferences.
FIG. 17 illustrates the household interest profile 180 which is composed of a programming types row 1709, a product types row 1707, a household interests column 1701, an average value column 1703, and a session value column 1705.
The product types row 1707 gives an indication as to what type of advertisement the household would be interested in watching, thus indicating what types of products could potentially be advertised with a high probability of the advertisement being watched in its entirety. The programming types row 1709 suggests what kind of programming the household is likely to be interested in watching. The household interests column 1701 specifies the types of programming and products which are statistically characterized for that household.
As an example of the industrial applicability of the invention, a household will perform its normal viewing routine without being requested to answer specific questions regarding likes and dislikes. Children may watch television in the morning in the household, and may change channels during commercials, or not at all. The television may remain off during the working day, while the children are at school and day care, and be turned on again in the evening, at which time the parents may "surf" channels, mute the television during commercials, and ultimately watch one or two hours of broadcast programming. The present invention provides the ability to characterize the household based on actual viewing selections, i.e., channel surfing, channel jumping, and dead periods are not considered. Based on the actual subscriber selection data, the determinations are made that there are children and adults in the household, and the program and product interests indicated in the household interest profile 180 correspond to a family of that composition. By contrast, a household with two retired adults will have a completely different characterization, which will be indicated in the household interest profile 180.
Although this invention has been illustrated by reference to specific embodiments, it will be apparent to those skilled in the art that various changes and modifications may be made which clearly fall within the scope of the invention. The invention is intended to be protected broadly within the spirit and scope of the appended claims.