RELATED APPLICATION
The present application is a continuation of and claims the benefit of priority to U.S. patent application Ser. No. 11/403,640, filed Apr. 13, 2006, which is hereby incorporated by reference in its entirety for all purposes as if fully set forth herein.
FIELD OF THE INVENTION
The present invention relates generally to content delivery and, in particular, to a method and apparatus for delivering encoded content.
BACKGROUND OF THE INVENTION
The digital representation of content is known. Content includes, but is not limited to, music, video, program code, text and graphical documents, images, interactive presentations, etc. The content is generally encoded in accordance with a pre-set standard to create encoded content, such as a file or streaming media. Each standard generally specifies a protocol for encoding content such that it may be stored or transmitted, and a protocol for decoding content that has been encoded to reconstruct the content for playback. These standards are known as document types, and may involve protocols called codecs. The encoded content can be stored on digital media such as a hard disk drive, a floppy disk, an optical media disk, flash memory, volatile memory or, alternatively, can be transmitted via a communications network. As both storage and network bandwidth have associated costs, such codecs are generally designed to compress the digital representation of the content while maintaining a desired level of quality. For music, a number of codecs exist, including MP3, AAC and WAV. Similarly, for still images, the codecs include, but are not limited to, JPEG, GIF, PNG and TIFF. A number of codecs exist for video, including MPEG-2, MPEG-4, AVI and WMV. Similarly, other encoding schemes for other types of computer-readable content exist, such as plain text, document files (e.g., Microsoft Word and Excel documents), program files (e.g., executables and dynamic link libraries) and interactive media (e.g., Microsoft PowerPoint and Macromedia Shockwave files).
A feature of encoded content is that there are currently few limitations on the ability to reproduce any number of identical copies of the encoded content and distribute it freely without identifying the source of the copies. This ability to make unlicensed copies of encoded content is an artifact of how computers operate. The replication of encoded content is also an artifact of how the majority of communications networks are designed, in that they reproduce data transmitted across them, regardless of the type of data.
In many circumstances, the owners of encoded content wish to limit its unauthorized access, copying and dissemination. There are, however, no widely-implemented mechanisms in current communications protocols or hardware that control the authorized use of the data being processed. That is, computers that use communications protocols to connect to networks, such as the Internet, and that exchange encoded content, do not have sufficient logical controls to determine automatically the proprietorship, source or rights associated with the encoded content that is being processed; hence the complete inability to govern its distribution.
In order to control the distribution and use of such encoded content, some content proprietors have implemented digital rights management systems. Such systems store encoded content in an encrypted digital format that corresponds to an encryption key, and rely on a non-standard application to decrypt the encoded content at the time of presentation or playback. These systems, however, require the use of specialized players and/or content viewers, hereinafter referred to as “decoders”, that are capable of decrypting the encoded content, thereby limiting the selection of decoders available to end-users and/or causing compatibility issues. For example, music content licensed to a person by a proprietor employing a particular digital rights management scheme may only be accessible via a particular decoder application on a particular operating system, and may not be decodable via a traditional hardware appliance, such as a compact disk player.
Further, as the encoded content in such digital rights management systems includes the totality of the data representing the content, successful cryptanalytic attacks can ultimately permit access to the entire content.
There are a number of other schemes for restricting access to content which require specialized devices such as non-standard decoders or physical hardware. Still others employ traditional cryptography and are therefore vulnerable to the same class of cryptanalytic attacks against their restriction mechanisms because the attacker has the entirety of the encoded content. In these cases, only the computational problem of generating the correct decryption key needs to be solved in order to unrestrict the entirety of the content, which can then be copied in a repudiable manner.
Current systems for distributing authorized encoded content may include license identifiers only as a non-essential part or extension of the functional part of the data structure of an encoding scheme, like a credit at the end of a film, or a notice in the headers of a file. As the content itself is not affected, however, the content can be separated reasonably easily from the marked encoded content.
File-sharing services, such as Internet-based Kazaa and Limewire, are under pressure from content proprietors to distinguish encoded content which may be shared freely from that for which the proprietors' permission is required. Without the ability to preclude the unauthorized distribution of encoded content, such services risk liability.
Since existing popular codecs for encoded content do not have standardized features to identify the terms of authorized use and/or the individual licensed to use the encoded content, there is no consistent or automatic way for file-sharing services to assess whether or not the sharing of encoded content via their networks is unauthorized.
Ultimately, in these cases, since the end-user possesses the entirety of the content, the content can potentially be decrypted and copied or distributed in its entirety in a repudiable manner.
SUMMARY OF THE INVENTION
In an aspect of the invention, there is provided a method of delivering encoded content, comprising: extracting a holdback representing a portion of said encoded content, thereby damaging said encoded content; distributing said damaged encoded content; and transmitting said holdback to enable reintegration of said holdback with said damaged encoded content to restore said encoded content.
The holdback can be modified such that a distinct copy of the encoded content is generated when the holdback is reintegrated with the damaged encoded content. The holdback can be modified to include steganographically-embedded information. The holdback can be modified such that when the holdback is reintegrated into the damaged encoded content, the quality of the encoded content is not significantly decreased. The holdback can be modified to include information identifying an authorized end-user, such as by identifying a record in a license database identifying said authorized end-user. The license database can also identify distribution rights for the encoded content for the authorized end-user. The incomplete content can be encrypted prior to extraction of the holdback, such as with a cipher block chain method.
The extracting can comprise extracting a plurality of portions of the encoded content collectively comprising the holdback. The distributing can comprise distributing an optical media disk including the damaged encoded content. Alternatively, the damaged encoded content can be distributed over a communication network.
The holdback can be transmitted over a communication network.
The damaged content can be encrypted using a cipher block chain method, and a holdback can be extracted from the encrypted damaged encoded content.
In accordance with another aspect of the invention, there is provided a method of delivering encoded content, comprising:
- generating a unique identifier;
- embedding said unique identifier in encoded content selected from a library of encoded content; and
- transmitting said encoded content to an end-user associated with the unique identifier.
The end-user can be authorized to access the encoded content prior to generation of the unique identifier, which can identify the end-user.
The unique identifier can identify a record in a license database identifying the authorized end-user.
In accordance with a further aspect of the invention, there is provided a method of delivering encoded content, comprising: dividing said encoded content into segments; generating a set of distinct instances of at least one of said segments; selecting one of said distinct instances from each set of distinct instances; and reassembling said encoded content using said selected distinct instances.
A set of distinct instances can be generated for each of the segments. The set of distinct instances can be encrypted, and the selected distinct instances can be decrypted before reassembly.
The distinct instances can be delivered to a client prior to selection, and the selected distinct instances can be communicated to the client after selection, wherein the client performs the reassembly.
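The segment-and-instance aspect described above can be sketched as follows. This is a minimal illustration only: the function names and the list-of-indices content map are hypothetical, and the way instances are made distinct here (XOR of the last byte with the instance index) is a placeholder assumption standing in for whatever marking technique an actual embodiment would use.

```python
def split_segments(content: bytes, size: int) -> list:
    """Divide the encoded content into fixed-size segments (the last may be shorter)."""
    return [content[i:i + size] for i in range(0, len(content), size)]


def make_instances(segment: bytes, count: int) -> list:
    """Generate `count` distinct instances of a non-empty segment (count <= 256).

    Placeholder variation (an assumption): XOR the last byte with the
    instance index, so instance 0 equals the original segment.
    """
    return [segment[:-1] + bytes([segment[-1] ^ k]) for k in range(count)]


def reassemble(instance_sets: list, content_map: list) -> bytes:
    """Pick one instance per segment, as named by the content map, and join them."""
    return b"".join(
        instances[choice] for instances, choice in zip(instance_sets, content_map)
    )


segments = split_segments(b"ABCDEFGHIJKL", 4)
instance_sets = [make_instances(segment, 4) for segment in segments]
copy_for_user = reassemble(instance_sets, [1, 2, 3])
```

Under these assumptions, the sequence of selected instance indices distinguishes one reassembled copy from another, which is the property the aspect relies on.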
In accordance with yet another aspect of the invention, there is provided a method of authenticating encoded content, comprising: analyzing encoded content to determine at least one characteristic thereof; transmitting said at least one characteristic; and receiving authentication of said encoded content if said encoded content is authenticated.
A hash calculation can be performed on at least a portion of the encoded content. The header of the encoded content can be parsed to read an identifier therein. The encoded content can be parsed to locate a brand, and the encoded content can be analyzed to read an identifier if the brand is present in the encoded content.
In yet another aspect of the invention, there is provided a computer-readable medium, comprising: encoded content having an identifier embedded therein, said identifier corresponding with a record in a license database identifying the end-user to which said encoded content is licensed.
The identifier can be steganographically embedded in the encoded content. Alternatively, the identifier can be embedded in the header of the encoded content.
Further, the identifier can be embedded in the content prior to its encoding. Still further, the identifier can be determined by performing a hash calculation of at least a portion of the encoded content.
A method is provided for customizing encoded content such that it becomes unique to the end-user. The encoded content enables authorized extraction of information that could identify the end-user, or identify the content as having been tampered with in an unauthorized way.
In accordance with yet another aspect of the invention, there is provided a process for encrypting and encoding uniquely identifiable bits of data into the file format of a digital media file such that they are not obvious or identifiable by the end-user using intended means of access, and not extractable by those who would attempt a cryptanalytic attack, yet easily identifiable by the content owner or licensed distributor of the content, or other authorized monitor.
In accordance with yet another aspect of the invention, there is provided a distribution system that encodes, distributes, and licenses encoded content by pre-distributing or delivering only a portion of the encoded content, while withholding the remainder of it, ensuring that the end-user must obtain the withheld portion to make the content in the file usable. Information identifying the end-user and possibly other attributes is obtained and embedded into the withheld portion before the withheld portion is reintegrated with the remainder of the encoded content to generate a reconstituted and usable file.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments will now be described, by way of example only, with reference to the attached Figures, wherein:
FIG. 1 shows a system for delivering encoded content in accordance with an embodiment of the invention;
FIG. 2 is a flowchart of the method of delivering encoded content using the system of FIG. 1;
FIG. 3 is a flowchart showing the steps performed during extraction of a holdback from encoded content in the method of FIG. 2;
FIG. 4 is an abstract datagram of an MPEG-2 file;
FIG. 5 is a holdback reintegration map determined for a video encoded using MPEG-2;
FIG. 6 is an abstract schematic representation of the extraction of the holdback from the MPEG-2 encoded file of FIG. 4 corresponding to the set of re-assembly instructions of FIG. 5;
FIG. 7 is a flowchart of the steps performed during the encrypting of encoded content;
FIG. 8 is a flowchart of the steps performed during the requesting of access to encoded content;
FIG. 9 is a window presenting a list of encoded content that can be selected for accessing;
FIG. 10 shows the system of FIG. 1 with additional components for operating a peer-to-peer file-sharing network;
FIG. 11 is a flowchart showing the method of checking authorization to access encoded content in the system of FIG. 10;
FIG. 12 illustrates the steps performed during checking of whether the encoded content is licensed in the method of FIG. 11;
FIG. 13 shows the method of delivering encoded content in accordance with another embodiment;
FIG. 14 shows the steps performed during analysis of the encoded content in the method of FIG. 13;
FIG. 15A is an abstract diagram of encoded content;
FIG. 15B illustrates segmentation of the encoded content of FIG. 15A;
FIG. 16 illustrates a number of instances of the segments of the encoded content of FIG. 15B;
FIG. 17 illustrates the extraction of holdbacks from the instances of FIG. 16;
FIG. 18 shows an exemplary content map;
FIG. 19 is a flowchart of the steps performed during the decrypting, reconstituting and reassembling of encoded content in the method of FIG. 13; and
FIG. 20 shows the encoded content after reintegration of the holdback and assembly from the segment instances.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Turning now to FIG. 1, a system for delivering encoded content in accordance with an embodiment of the invention is shown generally at 24. In this embodiment, critical data is extracted from the encoded content, thereby damaging the ability to decode the content. The encoded content is then encrypted using a cipher block chain approach and, once again, critical data is extracted from the encrypted incomplete encoded content such that decryption of the encoded content is made infeasible. The encoded content is thus “deconstituted”, and can only be “reconstituted” when the critical data is reintegrated. The deconstituted encoded content is then pre-delivered to an end-user. Upon licensing the encoded content, the critical data initially extracted from the encoded content is modified such that when the critical data is reintegrated with the incomplete encoded content, a distinct copy of the encoded content is generated. The critical data is then delivered to the end-user thereby enabling reconstitution of the encoded content.
Encoded content is delivered in the form of an optical media disk 28 to an end-user 32, who can read the optical media disk 28 using a personal computer (“PC”) 36. The optical media disk 28 includes one or more files that represent “deconstituted” encoded content; that is, the encoded and possibly encrypted content from which critical data has been extracted, thereby damaging its utility and decryption, if applicable. PC 36 is in communication with an authorization server 40 via the Internet 44.
The authorization server 40 is an enterprise-level server that includes web server and database server functionality. The authorization server 40 manages an end-user database 48, a content completion database 52, a security database 56 and a license database 60. The end-user database 48 stores various information about the end-users of the system 24, including End-User ID, a password or passwords, addresses, information required to process a payment, a transaction history and/or a credit balance. This information is collected by the authorization server 40 during registration of an end-user and purchase of credit that is used to access encoded content. The End-User ID is a unique identifier selected for the end-user 32. The content completion database 52 stores critical data of the encoded content held back (hereinafter referred to as a “holdback”, or more particularly a “first holdback”) and information associated with the first holdback. The security database 56 stores encryption keys for the encoded content on the optical media disk 28. In addition, the security database 56 stores critical data extracted from the encrypted incomplete encoded content (hereinafter referred to as a “second holdback”) and second holdback reintegration maps that specify how to reintegrate second holdbacks into the remainder of the encrypted deconstituted encoded content. The license database 60 stores information about each license issued, including the End-User ID, a Content ID, a list of license parameters, etc. The license database 60 includes a content map for each license issued. All other information stored about the transaction, and license parameters, is accessed through the license database 60 using the content map as a key.
The general method of delivering encoded content 100 using the system 24 is shown in FIG. 2. In the method 100, upon authorization of an end-user, a customized first holdback is generated for the end-user. The customized first holdback contains information such that, when the first holdback is reintegrated with the remainder of the encoded content, a customized copy of the encoded content is created. In particular, the first holdback is customized with an identifier known as the content consumer transaction identifier (“CCTID”) that is a collision-resistant hash of the End-User ID, the Content ID, a set of license parameters, etc.
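By way of illustration only, a CCTID of the kind just described could be derived as sketched below. The choice of SHA-256, the canonical JSON serialization, the function name compute_cctid and the field names are all assumptions made for this sketch; the description specifies only that the CCTID is a collision-resistant hash of the End-User ID, the Content ID and license parameters.

```python
import hashlib
import json


def compute_cctid(end_user_id: str, content_id: str, license_params: dict) -> str:
    """Return a hex CCTID derived from the fields of one license transaction."""
    # Serialize canonically (sorted keys, fixed separators) so that the same
    # inputs always produce the same byte string, and hence the same hash.
    payload = json.dumps(
        {
            "end_user_id": end_user_id,
            "content_id": content_id,
            "license": license_params,
        },
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    # SHA-256 is an assumed choice of collision-resistant hash.
    return hashlib.sha256(payload).hexdigest()


cctid = compute_cctid("user-0001", "content-0042", {"plays": "unlimited"})
```

Serializing the fields into a single structured payload, rather than concatenating raw strings, avoids ambiguity between inputs such as ("ab", "c") and ("a", "bc").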
The method 100 commences with the determination of the type of the encoded content and the embedding of a marker referred to hereinafter as a “brand” and Content ID in the encoded content (step 110). For most types of encoded content, the brand and Content ID are embedded in one or more unused header fields. The brand indicates that the encoded content has been processed using the method described herein. The Content ID is a unique identifier generated for the encoded content to distinguish it from other encoded content. Next, a first holdback is extracted from the encoded content (step 120).
FIG. 3 shows the steps performed during the extraction of the first holdback from the encoded content. A holdback scheme corresponding to the encoding type is selected (step121). A holdback scheme is a set of rules that are used to specify how a first holdback is to be determined for a particular content encoding type. In some circumstances, it is desirable to select a first holdback that is small in size in comparison to the size of the encoded content, yet damages the content when extracted therefrom. The holdback scheme is also selected based on suitability for receiving a CCTID.
An exemplary abstract datagram 200 for a generic encoded content file (such as MPEG-2) is shown in FIG. 4 in order to illustrate how a holdback scheme is defined. The datagram includes a header 204, a sequence of content frames 208 and a footer 212. The header 204 contains structural parameters for the sequence 208. In MPEG-2 encoded content, the sequence is organized to permit random access and synchronize with audio streams. The content frames 208 contain the information comprising the content, per se. The content frames 208 include a number of logical units, or video frames, of a video sequence. Some of the video frames, namely F1, F6 and Fn, are I-frames 216 that represent complete images. The remaining frames are difference frames (P-frames and B-frames) 220 that represent changes in video information subsequent to the I-frames. By grouping a series of video frames that have a desired level of common visual elements, each video frame subsequent to the first video frame in the series can be characterized by those portions that differ from the I-frame 216 or the immediately-preceding frame. That is, if there is only movement in the foreground of a series of video frames representing a scene of a movie, the first video frame would include an entire image and subsequent difference frames would only include information about those portions of the image that have changed (objects in the foreground and background revealed by movement of the foreground objects).
The holdback scheme selected takes into consideration the type of encoding used to encode the content. In some cases, it is desirable to select a first holdback that is both small in size and, when removed from the encoded content, damages the quality of the content for an end-user. To damage the content decoded from generic encoded content, it is generally sufficient to remove a plurality of blocks of one or more bytes from random locations in the encoded content. However, it is also desired to remove some blocks suitable for embedding the CCTID. In general, all blocks are classified into three types, namely (a) those blocks that cannot change (e.g., file structural parameters); (b) those that are interdependent with other bytes (e.g., cyclic redundancy checks); or (c) those that are free to change. The entire first holdback consists of the union of an arbitrary selection of as many as needed of each type.
This is achieved in different ways for different encoding types. For content encoded using the MPEG-2 standard, the defined holdback scheme specifies that a block of one or more bytes in length is to be removed from a random location in the header 204, and a block of one or more bytes in length is to be removed from each I-frame 216. When a first holdback is extracted from MPEG-2 encoded content using this holdback scheme, it has been found that the quality of the content for an end-user is damaged.
Once the holdback scheme has been selected, the candidate locations at which the CCTID may be embedded and the portion(s) of the encoded content to be held back are selected (step122). The candidate CCTID embedding locations are determined in accordance with the holdback scheme. It can be desirable to embed the CCTID in the encoded content such that it cannot be readily removed without damaging the encoded content. In addition, it can be desirable to embed the CCTID in a location where it is relatively undetectable to an end-user in comparison to other locations in the encoded content so that there is little or no significant degradation of the content quality. As the CCTID is embedded into the first holdback, the candidate locations for embedding the CCTID are selected in order to achieve these goals within the constraints of the holdback scheme. As the CCTID is embedded in the first holdback prior to delivery of the first holdback to the client, the candidate locations for embedding the CCTID and the portions of the encoded content to be held back are selected such that the candidate CCTID embedding locations are within at least one of the portions of the encoded content to be held back. The candidate CCTID embedding locations are determined in terms of absolute locations in the portions of the encoded content to be held back (i.e., the first holdback). The number of candidate CCTID embedding locations selected well exceeds the number of locations used to embed the CCTID for any one particular end-user. As a result, there will likely be little overlap in the CCTID embedding locations selected for two end-users. In this manner, the embedded CCTIDs are made more resilient against tampering. Further, the set of locations where the CCTID is embedded in the encoded content can be distinct for each user, thereby also providing information which can be used to identify the end-user(s) for which the original encoded content was generated.
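The selection of a per-user subset of the candidate locations, and the embedding of identifier bits at those locations, can be sketched as follows. This is illustrative only: the function names are hypothetical, seeding Python's general-purpose random module from a hash of the CCTID is an assumed (and not cryptographically secure) way of making the selection repeatable, and overwriting the least-significant bit of a byte is an assumed stand-in for whatever embedding technique a given holdback scheme uses.

```python
import hashlib
import random


def select_embedding_locations(candidates, cctid: str, count: int) -> list:
    """Pick `count` of the candidate offsets, repeatably for a given CCTID."""
    if count > len(candidates):
        raise ValueError("not enough candidate locations")
    # Derive a seed from the CCTID so the server can reproduce the selection.
    seed = int.from_bytes(hashlib.sha256(cctid.encode("utf-8")).digest(), "big")
    rng = random.Random(seed)
    return sorted(rng.sample(list(candidates), count))


def embed_bits(holdback: bytes, locations, bits) -> bytes:
    """Write one bit into the least-significant bit of each selected byte."""
    out = bytearray(holdback)
    for location, bit in zip(locations, bits):
        out[location] = (out[location] & 0xFE) | (bit & 1)
    return bytes(out)


def read_bits(holdback: bytes, locations) -> list:
    """Recover the embedded bits from the selected locations."""
    return [holdback[location] & 1 for location in locations]


candidates = list(range(0, 64, 2))  # hypothetical offsets within the first holdback
locations = select_embedding_locations(candidates, "example-cctid", 8)
marked = embed_bits(bytes(64), locations, [1, 0, 1, 1, 0, 0, 1, 0])
```

Because the candidate pool is much larger than the number of locations used per end-user, two end-users will usually be assigned largely different subsets, which is the property the paragraph above relies on.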
One potential attack directed to making the CCTID illegible would be to combine two or more separate distinct copies of the same encoded content to identify bits that differ between the copies and either average the differing values or replace them with random bits. In either case, the locations at which the CCTIDs were embedded can be determined by comparing the resultant copy of the encoded content to the original file.
It has been found that if the CCTID is embedded into a relatively complex portion of the content, its detectability is decreased. In images and/or video frames, such areas of relative complexity could be areas of high texture that do not conform to a regular pattern (for example, blades of grass). For example, greyscale values may be shifted up or down a small increment in such portions during the embedding of the CCTID. By performing such analysis prior to when the CCTID is embedded, an in-depth analysis can be performed to locate those portions of the content which are more suitable for receiving the CCTID.
The first holdback is determined in accordance with the holdback scheme and selected to encompass the candidate CCTID embedding locations.
Upon selecting the first holdback extraction instructions and the candidate CCTID embedding locations, a first holdback reintegration map is determined and registered, along with the candidate CCTID embedding locations (step123). In particular, the first holdback reintegration map identifies portions of the first holdback to be inserted and locations in the deconstituted encoded content at which the portions of the first holdback are to be reinserted.
FIG. 5 illustrates a portion of an exemplary first holdback reintegration map 224 used in the system 24. The first holdback reintegration map 224 consists of holdback portion insertion parameters 228, each including an insertion location 232 and a holdback portion length 236. Each insertion location 232 indicates where a portion of the holdback is to be inserted. The corresponding holdback portion length 236 specifies what portion of the first holdback is to be inserted at the insertion location 232. The portion of the first holdback to be inserted at the location specified in the first set of holdback portion insertion parameters 228 is the portion of the first holdback of the holdback portion length 236 at the start of the encoded content. Subsequent holdback portion insertion parameters correspond to subsequent portions of the first holdback commencing where the previous first holdback portions ended.
Once the first holdback reintegration map and the candidate CCTID embedding locations are determined and registered, the first holdback is extracted from the encoded content (step124). During extraction of the first holdback, the first holdback portions are extracted and concatenated to form a contiguous first holdback. The remaining portions of the encoded content are concatenated together to form the incomplete encoded content. The incomplete encoded content is deconstituted as it is damaged as a result of the extraction of the first holdback.
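The extraction and later reintegration of a holdback, driven by a reintegration map of (insertion location, portion length) pairs, can be sketched as below. The function names and the tuple representation of the map are assumptions for illustration; insertion locations are taken to be offsets into the incomplete (deconstituted) content, consistent with the description of the reintegration map above.

```python
def extract_holdback(content: bytes, spans):
    """Remove the given spans from `content`.

    `spans` is a sorted, non-overlapping list of (offset, length) pairs
    expressed in the original content. Returns the concatenated holdback,
    the incomplete content, and a reintegration map of
    (insertion_location, length) pairs into the incomplete content.
    """
    holdback = bytearray()
    incomplete = bytearray()
    reintegration_map = []
    cursor = 0
    for offset, length in spans:
        if offset < cursor or offset + length > len(content):
            raise ValueError("spans must be sorted, non-overlapping and in range")
        incomplete += content[cursor:offset]
        # The portion is reinserted where the incomplete content currently ends.
        reintegration_map.append((len(incomplete), length))
        holdback += content[offset:offset + length]
        cursor = offset + length
    incomplete += content[cursor:]
    return bytes(holdback), bytes(incomplete), reintegration_map


def reintegrate(incomplete: bytes, holdback: bytes, reintegration_map) -> bytes:
    """Rebuild the original content from the incomplete content and holdback."""
    out = bytearray()
    incomplete_cursor = 0
    holdback_cursor = 0
    for insert_at, length in reintegration_map:
        out += incomplete[incomplete_cursor:insert_at]
        # Each portion starts in the holdback where the previous one ended.
        out += holdback[holdback_cursor:holdback_cursor + length]
        incomplete_cursor = insert_at
        holdback_cursor += length
    out += incomplete[incomplete_cursor:]
    return bytes(out)


original = bytes(range(32))
holdback, incomplete, rmap = extract_holdback(original, [(2, 3), (10, 2), (20, 4)])
restored = reintegrate(incomplete, holdback, rmap)
```

In this sketch the map stores only locations and lengths, so it reveals nothing about the held-back bytes themselves; the holdback must still be obtained separately.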
FIG. 6 illustrates the extraction of the first holdback determined in accordance with the first holdback reintegration map of FIG. 5 for the encoded content of FIG. 4. A plurality of segments 248A to 248D are extracted from the encoded content and concatenated to collectively form a first holdback 252. In particular, segment 248A is extracted from the header 204. The remaining segments of the encoded content are concatenated collectively to form incomplete content 256. Demarcation lines 260A to 260D symbolically indicate the former locations of the first holdback segments 248A to 248D respectively in the deconstituted encoded content.
Referring again to FIG. 2, once the first holdback has been extracted from the encoded content, the incomplete encoded content is encrypted and a second holdback is extracted from it (step 130).
FIG. 7 illustrates the method of encrypting the encoded content and extraction of the second holdback therefrom. An encryption key is randomly generated and used to encrypt the incomplete encoded content 256 prior to distribution on optical media disk 28 (step 131). The incomplete encoded content 256 is encrypted using a symmetric cipher in cipher block chain mode. In cipher block chain mode, the encryption of any particular block of data is dependent on the encrypted value of all the preceding data. As a result, the omission of any segment of the encrypted encoded content makes decryption of the encoded content infeasible. In addition, the name and extension of the file are changed so as to obscure the encoded content. The second holdback is then extracted from the encrypted incomplete encoded content (step 132). The second holdback is an arbitrarily selected set of data blocks that are extracted from the encrypted incomplete encoded content. A second holdback reintegration map is generated to indicate how the second holdback is to be reintegrated with the encrypted incomplete encoded content to restore the encrypted incomplete encoded content to its original form. By extracting the second holdback from the encrypted incomplete encoded content, it is deconstituted; that is, it is damaged and cannot be restored to its original form without the extracted data. The first holdback reintegration map, the candidate CCTID embedding locations, the encryption key, the second holdback and the second holdback reintegration map are stored (step 133). In particular, the first holdback reintegration map, the candidate CCTID embedding locations and the first holdback are stored in the content completion database 52 and the encryption key, the second holdback and the second holdback reintegration map are stored in the security database 56.
The second holdback, the second holdback reintegration map, the encryption key, the first holdback and the first holdback reintegration map are all that is required by the PC 36 to reconstruct the encoded content from the deconstituted encoded content. The candidate CCTID embedding locations enable the authorization server 40 to select locations and embed the CCTID in the first holdback prior to its delivery to the PC 36.
Referring again to FIG. 2, once the first holdbacks have been extracted, the incomplete encoded content has been encrypted and the second holdback has been extracted, the encrypted deconstituted encoded content is placed onto the optical media disk 28 and distributed to the end-user 32 (step 140). Distribution of the optical media disk 28 can be performed in a number of ways, including its inclusion with periodicals, manual distribution at an event, mail distribution via a subscriber list, etc. The optical media disk 28 includes the deconstituted encoded content and a content access application for reconstituting the encoded content. In addition, the optical media disk 28 is provided with a media identifier. The media identifier is registered with the authorization server 40 along with a list of the encoded content on the optical media disk 28.
In order for the end-user 32 of the PC 36 to be able to enjoy any of the encoded content on the optical media disk 28, the end-user 32 requests access to the encoded content and is authorized (step 150). Authorization occurs via the content access application. This application automatically launches when the optical media disk 28 is inserted into the PC 36 and permits the end-user 32 to obtain authorization for accessing encoded content stored in deconstituted form on the optical media disk 28. The end-user 32 logs in by entering the End-User ID and password at a login screen.
The method of authorizing access to encoded content on theoptical media disk28 by thePC36 is illustrated inFIG. 8. The method begins with the generation of a list of encoded content available for licensing by the content access application on the PC36 (step151). The content access application queries theoptical media disk28 to obtain the media identifier, and transmits the media identifier to theauthorization server40 via theInternet44, along with the End-User ID to determine what encoded content is available on theoptical media disk28, and what encoded content on theoptical media disk28 the end-user32 is currently licensed to access. Theauthorization server40 then responds with a list of what encoded content is available on theoptical media disk28, along with what encoded content the end-user32 is currently licensed to access.
Upon obtaining the list of encoded content, the content access application parses the list and presents the list of encoded content available on the optical media disk 28 to the end-user 32 (step 152). The list of encoded content presented by the content access application indicates which encoded content the end-user 32 is currently licensed to access, as provided by the authorization server 40.
FIG. 9 illustrates an exemplary content selection window 300 that presents a list of encoded content 304, music videos in this case, that is available on the optical media disk 28, along with the cost of a license to access the encoded content. An accept button 308 and a cancel button 312 permit acceptance or cancellation of any selections made by the end-user 32. The list of music videos is sorted by genre and artist.
Referring toFIGS. 8 and 9, the content access application then receives the selection of encoded content that the end-user32 wishes to access (step153). Check boxes beside each music video permit the end-user32 to indicate what encoded content he/she wishes to access. Upon selection of the encoded content that the end-user32 wishes to access, the end-user32 clicks on the acceptbutton308. An authorization confirmation dialog box (not shown) appears, prompting the end-user32 to confirm that the end-user32 wishes to spend credit on the selected encoded content. Upon confirmation, the end-user32 is deemed to have confirmed their selection of encoded content.
The content access application then requests access to the selected encoded content (step 154). The request from the PC 36 includes the unique End-User ID of the end-user 32 and the Content ID(s) for the selected encoded content.
The end-user 32 is then authorized to access the requested encoded content by the authorization server 40 (step 155). At this point, a number of events occur. The authorization server 40 receives the request from the PC 36 to access encoded content and authorizes the end-user 32 for the selected encoded content. The End-User ID transmitted with the request to access encoded content is used to retrieve end-user information from the end-user database 48. The end-user database 48 contains information for each end-user 32 including information required to process a payment, a transaction history and a credit balance. In order to obtain authorization to access encoded content, the end-user purchases credit via a website operated by the authorization server 40. The end-user database 48 is then updated to reflect the new credit. If the end-user has sufficient credit to purchase access to the selected encoded content (that is, a license), a transaction history is updated, and the end-user's credit balance is debited the appropriate amount for the selected encoded content. If the end-user does not have sufficient credit to purchase access to all of the selected encoded content, the authorization server 40 can direct the content access application to display a message that insufficient credit is available, with a link to a page where the end-user can purchase additional credit.
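By way of illustration only, the credit check described above can be sketched as follows. The record layout, the integer credit units and the names EndUserRecord and authorize_purchase are assumptions made for this sketch and are not taken from the described embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class EndUserRecord:
    """Hypothetical end-user database row: a credit balance and a transaction history."""
    credit: int
    history: list = field(default_factory=list)


def authorize_purchase(record, prices, selected_ids):
    """Debit credit for all selected Content IDs, or refuse if credit is insufficient."""
    total = sum(prices[cid] for cid in selected_ids)
    if total > record.credit:
        # The content access application would display an "insufficient credit" message here.
        return False
    record.credit -= total
    record.history.append((tuple(selected_ids), total))
    return True


# Illustrative data only: prices are in arbitrary credit units.
user = EndUserRecord(credit=500)
prices = {"video-1": 199, "video-2": 199, "video-3": 199}
ok = authorize_purchase(user, prices, ["video-1", "video-2"])
denied = authorize_purchase(user, prices, ["video-3"])
```

In this sketch a request is refused as a whole when the balance does not cover every selected item, which mirrors the case in which the end-user cannot purchase access to all of the selected encoded content.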
Referring back to FIG. 2, the authorization server 40 then generates a customized first holdback (step 160). The authorization server 40 generates a CCTID for the license by calculating a collision-resistant hash of the End-User ID, the Content ID and license information, and then stores the CCTID in the license database 60.
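As a minimal sketch of the CCTID calculation, and assuming SHA-256 as the collision-resistant hash (the description does not name a particular hash function), the three inputs can be hashed as follows. The function name generate_cctid and the length-prefixed field encoding are illustrative assumptions.

```python
import hashlib


def generate_cctid(end_user_id: str, content_id: str, license_info: str) -> str:
    """Hash the End-User ID, Content ID and license information into one identifier."""
    h = hashlib.sha256()
    for value in (end_user_id, content_id, license_info):
        data = value.encode("utf-8")
        # A length prefix keeps field boundaries unambiguous, so that
        # ("ab", "c") and ("a", "bc") do not hash to the same value.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


# Illustrative identifiers only.
cctid = generate_cctid("user-0001", "content-0042", "single-device;2006-04-13")
```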
Theauthorization server40 uses the Content ID to retrieve the corresponding first holdback, first holdback reintegration map and the candidate CCTID embedding locations from thecontent completion database52. In addition, theauthorization server40 also retrieves the encryption key, the second holdback and the second holdback reintegration map from thesecurity database56. Theauthorization server40 then selects a subset of the candidate CCTID embedding locations for the particular end-user and registers the selected subset in thelicense database60. Theauthorization server40 then embeds the CCTID into the first holdback at the selected candidate CCTID embedding locations. During embedding of the CCTID in the first holdback, error-checking codes frequently used in media files to identify corrupt frames are adjusted to reflect the embedding of the CCTID. MPEG-2 for instance allows a cyclic redundancy check.
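The selection of embedding locations and the adjustment of an error-checking code can be sketched as follows. This is an assumption-laden illustration: the seeded random selection, the least-significant-bit embedding and the helper names are hypothetical, and zlib.crc32 merely stands in for whatever check the media format actually uses.

```python
import random
import zlib


def select_locations(candidates, count, seed):
    """Pick a per-license subset of the candidate embedding locations (hypothetical policy)."""
    rng = random.Random(seed)
    return sorted(rng.sample(candidates, count))


def embed_bits(holdback: bytes, locations, bits) -> bytes:
    """Write one identifier bit into the least-significant bit of each selected byte."""
    out = bytearray(holdback)
    for pos, bit in zip(locations, bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return bytes(out)


def read_bits(holdback: bytes, locations):
    """Recover the embedded bits from the selected locations."""
    return [holdback[pos] & 1 for pos in locations]


# Illustrative data only.
holdback = bytes(range(32))
candidates = list(range(0, 32, 2))
cctid_bits = [1, 0, 1, 1, 0, 0, 1, 0]
locations = select_locations(candidates, len(cctid_bits), seed=42)
customized = embed_bits(holdback, locations, cctid_bits)
# The checksum is recomputed after embedding so that the frame is not flagged as corrupt.
checksum = zlib.crc32(customized)
```

Registering the selected locations alongside the license, as described above, is what would later allow the bits to be read back from a reconstituted copy.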
The customized first holdback is then bundled by the authorization server 40 with the first holdback reintegration map, the encryption key, the second holdback and the second holdback reintegration map to generate a customized reconstitution package. The authorization server 40 then transmits the customized reconstitution package to the PC 36.
The PC 36 receives the customized reconstitution package containing the customized first holdback, the first holdback reintegration map, the encryption key, the second holdback and the second holdback reintegration map (step 170). The PC 36 reintegrates the second holdback in accordance with the second holdback reintegration map and decrypts the incomplete content using the encryption key provided in the customized reconstitution package (step 180). The customized reconstitution package is parsed by the PC 36 to obtain the encryption key, the second holdback and the second holdback reintegration map. The PC 36 reintegrates the second holdback with the remainder of the deconstituted encoded content in accordance with the second holdback reintegration map. Next, the PC 36 uses the encryption key to decrypt the incomplete encoded content using a symmetric cipher in cipher block chain mode. The PC 36 then restores the incomplete encoded content by inserting the customized first holdback into the incomplete encoded content using the first holdback reintegration map (step 190).
The completed encoded content is thereafter stored on non-volatile storage of the PC 36 for accessing at a later time.
Once the deconstituted encoded content has been reconstituted, the encoded content is generally as it was before it was deconstituted, with the exception that the CCTID has been steganographically embedded in the encoded content such that it is generally undetectable by an end-user.
By distributing encoded content in a deconstituted form, the end-user is not provided with all of the data required to enable decoding of the encoded content prior to authorization. By cipher block chain encrypting the incomplete encoded content and extracting a second holdback, the decryption of the incomplete encoded content to its original form is made infeasible. Further, as the end-user does not possess all of the encoded content prior to reconstitution, the end-user cannot reconstruct the entire encoded content from the deconstituted encoded content. Only once the deconstituted encoded content is reintegrated with the second holdback, decrypted and reconstituted can it be properly decoded and enjoyed by an end-user.
Due to the manner in which the CCTID is embedded in the encoded content, wherein actual content is changed, the CCTID in the resultant encoded content is resilient to a decoding/re-encoding attack. Further, as the CCTID is embedded in a set of locations that is distinct for each end-user, the encoded content is resilient to combination attacks where two legitimate copies of the encoded content licensed to two end-users are combined in some manner, and generally can permit identification of both end-users.
It is of interest in many cases for administrators of networks, either physical such as a local area network, or virtual such as a file-sharing network operated over the Internet, to ensure that the encoded content being shared on their networks is properly licensed. For such cases, a method and apparatus are provided to enable the authentication of encoded content.
FIG. 10 shows asystem400 that is similar to thesystem24 ofFIG. 1 and additionally includes anarbiter computer404 and a remote personal computer (“remote PC”)408 operated by aremote user412. Both thearbiter computer404 and theremote PC408 are in communication with thePC36. The end-user32 of thePC36 can offer to make available encoded content stored thereon to other users via a file-sharing network operated via thearbiter computer404. Upon launch of a file-sharing application,PC36 communicates to the arbiter computer404 a list of encoded content that the end-user32 has selected for sharing. This list is combined with lists of encoded content made available by other end-users to form an aggregate list of encoded content available on the file-sharing network. Theremote user412 can connect to the file-sharing network by executing a file-sharing application on theremote PC408. Theremote PC408 connects to thearbiter computer404 and communicates a list of encoded content that theremote user412 has selected for sharing. In order to locate and retrieve encoded content from the file-sharing network, theremote user412 specifies query criteria via the file-sharing application which, in turn, queries thearbiter computer404. Thearbiter computer404 responds with a list of encoded content matching the query of theremote user412. Upon selection by theremote user412 of encoded content to be downloaded, theremote PC408 requests the selected encoded content from thearbiter computer404. Thearbiter computer404 then requests the selected encoded content from thePC36 for transmission to theremote PC408.
FIG. 11 illustrates the method of authenticating and controlling distribution of encoded content via thesystem400 generally at500. The encoded content is received by anarbiter computer404 that is configured to monitor and control traffic and storage of encoded content on the file-sharing network (step510). Upon receiving the encoded content, thearbiter computer404 determines whether the encoded content is branded (step520). If the encoded content is branded, thearbiter computer404 authenticates and determines the distribution permissions for the encoded content (step530). If thearbiter computer404 determines that the encoded content is not authentic or that the encoded content may not be redistributed, thearbiter computer404 prohibits transmission of the encoded content (step540). If, instead, thearbiter computer404 determines that the encoded content is authenticated and redistribution is permitted, or if the encoded content is not branded, thearbiter computer404 permits transmission of the encoded content (step550).
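The decision flow of steps 520 to 550 can be summarized by the following sketch. The function and enumeration names are hypothetical, and the three boolean inputs stand in for determinations that the arbiter computer would make by inspecting the encoded content and consulting the authorization server.

```python
from enum import Enum


class Decision(Enum):
    PERMIT = "permit"
    PROHIBIT = "prohibit"


def arbitrate(is_branded: bool, is_authentic: bool, may_redistribute: bool) -> Decision:
    """Mirror steps 520 to 550: unbranded content passes, branded content must check out."""
    if not is_branded:
        return Decision.PERMIT  # step 550: content that is not branded is permitted
    if is_authentic and may_redistribute:
        return Decision.PERMIT  # step 550: authenticated and redistribution permitted
    return Decision.PROHIBIT  # step 540: inauthentic, or redistribution not permitted
```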
FIG. 12 illustrates in more detail the determination of whether encoded content is authenticated. The arbiter computer determines whether the brand and the Content ID appear to be valid (step531). If it appears that a brand was embedded in the encoded content, and the brand and/or the Content ID have been tampered with, the encoded content is deemed inauthentic (step532). In this case, as the encoded content is deemed inauthentic, it is determined that distribution is not permitted. If, instead, the brand and the Content ID appear to be valid atstep531, thearbiter computer404 then performs a hash calculation on the encoded content (step533). Thearbiter computer404 provides the result to theauthorization server40 along with the Content ID parsed from the encoded content (step534). Upon receiving the result from thearbiter computer404, theauthorization server40 checks thelicense database60 for a record including the Content ID and the hash calculation result (step535). If theauthorization server40 does not find a record including the Content ID and the hash calculation result registered in thelicense database60, the encoded content is deemed to be inauthentic. If, instead, theauthorization server40 finds a record including the Content ID and the hash calculation result in thelicense database60, the encoded content is deemed to be authenticated and theauthorization server40 retrieves any permissions associated with the license from the record. Once the encoded content is determined to be authentic or inauthentic, theauthorization server40 responds back to thearbiter computer404 with the result including any permissions which affect distribution (step536).
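Steps 533 to 536 amount to a keyed lookup, which can be sketched as follows. An in-memory dictionary stands in for the license database, SHA-256 is assumed as the hash calculation, and the names register_license and authenticate are hypothetical.

```python
import hashlib

# Hypothetical stand-in for the license database: (Content ID, digest) -> permissions.
license_db = {}


def register_license(content_id: str, content: bytes, permissions: dict) -> None:
    """Record the hash of a licensed copy together with its permissions."""
    digest = hashlib.sha256(content).hexdigest()
    license_db[(content_id, digest)] = permissions


def authenticate(content_id: str, content: bytes):
    """Return (authentic, permissions) for a copy presented by the arbiter computer."""
    digest = hashlib.sha256(content).hexdigest()
    permissions = license_db.get((content_id, digest))
    if permissions is None:
        return False, None  # no matching record: deemed inauthentic
    return True, permissions


# Illustrative data only.
register_license("content-0042", b"licensed copy bytes", {"redistribute": False})
result_ok = authenticate("content-0042", b"licensed copy bytes")
result_bad = authenticate("content-0042", b"tampered copy bytes")
```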
In some cases, it is desirable to pre-customize portions of the encoded content that are distributed to end-users and still retain the ability to create a distinct version of the encoded content for a particular end-user.
Turning now to FIG. 13, a method of delivering encoded content in accordance with another embodiment is shown generally at 600. For illustration, the method 600 will be described with respect to video content that is encoded using the MPEG-2 standard. The method 600 commences with the analysis of the encoded content (step 610).
FIG. 14 illustrates the steps performed during analysis of the encoded content duringstep610. The encoding type that is used to encode the content is determined (step611). A modification scheme corresponding to the encoding type determined atstep611 is selected (step612). The locations of where modifications may be made to the encoded content in accordance with the modification scheme depend on the type of encoding employed. For example, in the case of MPEG-2-encoded video, modifications are made to visual content and/or parameters of the I-frames and/or difference frames. These regions of the encoded content are identified by parsing the encoded content, and are flagged at this stage.
The encoded content is then analyzed to determine candidate locations for modifications (step613). As it is desirable not to significantly affect the experience of an end-user, the encoded content is parsed and analyzed to identify candidate locations therein where modifications may be made in a manner that is not readily detectable to the end-user. These regions of the encoded content are flagged as candidate locations for modifications.
In particular, the method used to modify the encoded content in this embodiment is adjustment of the bias of frames. The bias of the frames provides a baseline against which image data is compared in order to characterize it during the encoding. Accordingly, video frames are identified as candidate locations for modifications.
Referring again toFIG. 13, once the encoded content has been analyzed, a brand and the Content ID are embedded into the encoded content (step620). The encoded content is then segmented (step630). The size of such segments is generally pre-determined, but can be modified depending on the position of candidate locations in the encoded content in order to ensure that each segment contains a desired number of such candidate locations. By providing a plurality of candidate locations for each segment, the variations possible of each segment through modification are numerous. Further, the probability of detection of such modifications by an end-user can be decreased.
FIGS. 15A and 15B represent encoded content 800 before and after segmentation respectively. The encoded content 800 has been divided into forty segments 804. Each segment 804 includes a set of video frames.
Referring again toFIG. 13, when the encoded content has been segmented, a set of distinct instances of each segment is generated (step640). Each segment instance is generated by modifying the segment in at least one of the candidate locations identified instep610. As there are a number of candidate locations for each segment, the segment instances are readily distinguished from each other.
In order to modify a segment to generate a particular segment instance thereof, the bias of one or more frames of the segment is modified. As there are a number of frames in each segment, and as the bias of the frames can be modified to varying degrees, a number of combinations of distinct instances can be generated for each segment.
FIG. 16 illustrates the generation of a plurality of instances 808 for each segment 804 of the encoded content 800. As shown, ninety-nine segment instances 808 of each of the forty segments 804 are generated. Segment instance 808a is the first instance of a second segment 804a of the encoded content. Segment instance 808b is the fourth instance of a second segment 804b of the encoded content. Each segment instance is provided two coordinates 812. The first coordinate refers to the number of the segment to which the segment instance is related. The second coordinate is an identifier for the particular instance of the segment. As segment instances are generated for a segment, they are numbered sequentially, starting at 1.
Once the segment instances have been generated, hash calculations are performed on each of the segment instances and the results are stored in the license database 60. These results are used to authenticate the encoded content in a manner similar to that described hereinabove.
Referring again toFIG. 13, once the plurality ofsegment instances808 has been generated, first holdbacks are extracted from each segment instance (step650). A first holdback refers to a portion of encoded content that is withheld so as to not provide all of the encoded content, and to damage it. In this particular case, a first holdback is a portion of a segment instance that is arbitrarily selected and extracted from the segment instance. The first holdback extracted from a segment instance can be a single, contiguous set of bits or, alternatively, can be a number of bits extracted from a variety of positions within the segment instance.
During extraction of the first holdbacks from the segment instances, the first holdbacks are registered in the content completion database 52, along with the Content ID associated with the encoded content, and the coordinates associated with the particular segment instance from which each first holdback has been extracted. In addition, a first holdback reintegration map for reintegrating each first holdback into the corresponding segment instance is also stored in the content completion database 52. Each first holdback reintegration map specifies how the corresponding first holdback is to be reintegrated with the segment instances. In particular, each first holdback reintegration map indicates a set of positions in the corresponding segment instance, and the length of the portion of the corresponding first holdback to be inserted at the particular positions.
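A reintegration map of positions and lengths can be exercised with the following minimal sketch. It assumes that positions are byte offsets into the original (complete) segment instance and that the spans are sorted and non-overlapping; the function names are hypothetical.

```python
def extract_holdback(data: bytes, spans):
    """Remove the (position, length) spans from data.

    Returns the incomplete remainder, the concatenated holdback and the
    reintegration map needed to undo the extraction.
    """
    remainder = bytearray()
    holdback = bytearray()
    cursor = 0
    for pos, length in spans:
        remainder += data[cursor:pos]
        holdback += data[pos:pos + length]
        cursor = pos + length
    remainder += data[cursor:]
    return bytes(remainder), bytes(holdback), list(spans)


def reintegrate(remainder: bytes, holdback: bytes, reintegration_map) -> bytes:
    """Insert each portion of the holdback back at its mapped position."""
    out = bytearray()
    r = h = 0
    for pos, length in reintegration_map:
        take = pos - len(out)  # bytes of remainder that precede this position
        out += remainder[r:r + take]
        r += take
        out += holdback[h:h + length]
        h += length
    out += remainder[r:]
    return bytes(out)


# Illustrative data only.
segment_instance = b"0123456789ABCDEF"
incomplete, first_holdback, rmap = extract_holdback(segment_instance, [(2, 3), (10, 2)])
restored = reintegrate(incomplete, first_holdback, rmap)
```

Without both the holdback and its map, the incomplete data has gaps at positions that cannot be inferred from the data itself.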
FIG. 17 illustrates the plurality ofsegment instances808 after extraction offirst holdbacks812. As shown, thefirst holdbacks812 can be extracted from differing positions in each of thesegment instances808. The segment instances represent deconstituted encoded content as they do not contain all of the information needed to decode the encoded content.
After extraction of the first holdbacks, theincomplete segment instances808 are each cipher block chain encrypted using separate symmetric encryption keys and second holdbacks are extracted (step660). Second holdback reintegration maps are generated to specify how the second holdbacks are to be reintegrated with the encrypted segment instances. As a result of the encryption, the segment instances are put into a non-standard file format and are given names and extensions that do not enable them to be readily associated with the encoded content to which they are related. By extracting the second holdback from the encrypted incomplete segment instances, their decryption without the second holdback is made infeasible. The encryption keys, the second holdbacks and the second holdback reintegration maps are then registered by theauthorization server40 in thesecurity database56, along with the Content ID of the encoded content and the two coordinates of the segment instance to which they correspond.
Referring back to FIG. 13, once the deconstituted encoded content is encrypted, it is placed onto an optical media disk and distributed to the end-user (step 670). The end-user requests access to the encoded content and is authorized (step 680). Steps 670 and 680 are performed in the same manner as steps 140 and 150, respectively.
Referring back to FIG. 13, if access to the encoded content is authorized, the authorization server 40 generates a distinct content map for the license and registers it in the license database 60 (step 690). The content map is a mapping of the segment instances to direct the PC 36 to recombine certain segment instances to generate a copy of the encoded content that is distinct for the particular end-user, albeit incomplete due to the absence of the first holdbacks.
An exemplary content map 900 is shown in FIG. 18. The content map is a sequenced list of segment instances that are to be combined to form a distinct customized copy of the encoded content. The order of the elements corresponds with the segments to which they apply and each element indicates which instance is to be used for the particular segment. In content map 900, the first element, “3”, indicates that the third segment instance is to be used for the first segment of the encoded content. Likewise, the second segment instance is to be used for the second segment of the encoded content as indicated by the “2” in the second position, and so on.
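The use of a content map to assemble a distinct copy can be sketched as follows. The dictionary keyed by the two coordinates (segment number, instance number), both counted from 1, and the placeholder byte strings are assumptions for illustration only.

```python
def assemble(instances: dict, content_map: list) -> bytes:
    """Concatenate, for each segment in order, the instance named by the content map."""
    return b"".join(
        instances[(segment, instance)]
        for segment, instance in enumerate(content_map, start=1)
    )


def count_content_maps(num_segments: int, instances_per_segment: int) -> int:
    """Number of distinct content maps when every segment has the same number of instances."""
    return instances_per_segment ** num_segments


# Illustrative data only: three segments with three instances each.
instances = {
    (s, i): f"S{s}I{i};".encode()
    for s in range(1, 4)
    for i in range(1, 4)
}
content_map = [3, 2, 1]
copy_for_user = assemble(instances, content_map)
```

With forty segments and ninety-nine instances of each, as in FIG. 16, count_content_maps(40, 99) gives the number of distinct content maps that could be issued.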
Theauthorization server40 registers the license and the content map in thelicense database60. Theauthorization server40 registers the End-User ID, the Content ID of the newly licensed encoded content, the content map and a set of license data. The license data can include particulars of the license terms, the date the license was issued, etc.
Theauthorization server40 then retrieves the first holdbacks, the first holdback reintegration maps, the encryption keys, the second holdbacks and the second holdback reintegration maps that correspond to the particular segment instances selected for the end-user32 from thecontent completion database52 and thesecurity database56 respectively, and creates and transmits a custom reconstitution package to the PC36 (step700). The custom reconstitution package includes the content map.
Upon receiving the custom reconstitution package, the PC 36 reconstitutes the encoded content (step 710).
FIG. 19 better illustrates the process of reconstituting the encoded content. ThePC36 receives the content map, the encryption keys, the second holdbacks, the second holdback reintegration maps, the first holdbacks and the first holdback reintegration maps (step711). ThePC36 reintegrates the second holdbacks with the encrypted incomplete segment instances on theoptical media disk28 specified by the content map using the second holdback reintegration map (step712). ThePC36 then decrypts the specified incomplete segment instances using the encryption keys (step713). Next, thePC36 reintegrates the first holdbacks into the specified incomplete segment instances in accordance with the first holdback reintegration maps (step714). Upon reintegration of the first holdbacks, the encoded content is reassembled from the segment instances in accordance with the content map (step715).
FIG. 20 illustrates an exemplary customized encoded content 1000. In particular, the completed encoded content 1000 was generated from the segment instances of FIG. 16 and the content map of FIG. 18.
The completed encoded content is thereafter stored on non-volatile storage of the PC 36 for accessing at a later time.
In a further embodiment, the encoded content in its entirety is stored by the authorization server. Upon authorization of an end-user to access content, a distinct customized copy of the encoded content is generated and made available to the user. The customized copy is generated at the time of authorization. The customization can be steganographically placed in the encoded content. In this case, pre-analysis to identify areas of the encoded content that are suitable for customization can enable rapid steganographic customization. The customization can also provide an identifier of the license via visual, audio or other suitable markings. The end-user can be provided an interface for selecting encoded content that they wish to access, such as a webpage. Upon selecting encoded content that the end-user wishes to access, a customized copy of the selected encoded content is generated and delivered to the end-user.
In the above-described embodiments, if the encoded content is copied by or from the licensee, the customization(s) to the encoded content can permit proving that the copy was made from an instance of the encoded content licensed to that particular end-user.
While the invention has been described with reference to the above-described embodiments, other embodiments will occur to those of skill in the art. For example, while, in the above-described embodiments where incomplete content is provided, it can be distributed on an optical media disk, other methods of such distribution will occur to those skilled in the art. The incomplete content can be distributed electronically via a content distribution network and downloaded to a PC. Incomplete content can be pre-delivered and cached on a PC prior to receiving a request to access the particular encoded content.
The decoder may be made aware of the customization and the end-user to which the encoded content is licensed (and can be made to permit/deny use if the registered end-user of the application is not the end-user licensed for the encoded content).
Data can be extracted and reintegrated any number of times and in any number of manners. It may be desirable simply to cipher block chain encrypt the encoded content and then extract data therefrom without extracting a holdback from the encoded content prior to the encoded content's encryption. Alternatively, data can be extracted from the encoded content without the encoded content being subsequently encrypted such that the encoded content is damaged and cannot be feasibly restored without the reintegration of the data extracted. Instead of specifying absolute addresses where the extracted data is to be reintegrated, relative addresses can be used.
The holdback can be extracted from the encoded content in other manners. For example, in extracting the holdback, an XOR function can be applied to portions of the encoded content, thereby leaving the size of the encoded content intact. In this case, reintegration of the holdback would comprise performing an XOR function on those portions of the encoded content specified by the holdback package.
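The XOR variant can be illustrated with the sketch below, in which the withheld data is a mask applied at a given offset. The function name, the offset and the mask bytes are hypothetical. Because XOR is its own inverse, the same function both damages and restores the content.

```python
def xor_region(data: bytes, offset: int, mask: bytes) -> bytes:
    """XOR the mask into data starting at offset; the length of data is unchanged."""
    out = bytearray(data)
    for i, m in enumerate(mask):
        out[offset + i] ^= m
    return bytes(out)


# Illustrative data only.
original = b"ENCODED-CONTENT!"
mask = bytes([0x5A, 0xC3, 0x0F, 0x99])  # withheld and delivered only upon authorization
damaged = xor_region(original, 4, mask)
restored = xor_region(damaged, 4, mask)
```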
Customization of encoded content can be performed by the PC or other locations. The authorization server can provide instructions to the PC to customize encoded content available to the PC. The instructions can be encrypted to retain the confidentiality of the information, the application executing on the PC to perform such customizations can be made robust so as to resist tampering and the encoded content can be stored in a secure manner prior to customization. In the scenario where there is a trusted client with knowledge of the embedding process that is licensed to generate copies of encoded content for end-users, unregistered copies may appear for authentication by the arbiter computer. In these cases, the arbiter computer can determine the CCTIDs generated and register each copy of the encoded content as it becomes aware of them. The trusted client embeds both the identity of the trusted client and the end-users, along with a serial number so that the number of copies generated can be counted. In this manner, the encoded content generated by the trusted client can be tracked.
In some cases, it may be desirable to simply deconstitute the encoded content and not customize it. The reconstitution data in this scenario can be delivered upon licensing of the encoded content.
The CCTID can be digitally signed by the authorization server with standard private encryption key technology, enabling authentication of the CCTID.
Instead of customizing the encoded content with a CCTID or the like, the transaction particulars can be embedded directly into the holdback and/or the encoded content itself.
By generating a hash calculation for the particular encoded content and generating and storing the results for each end-user in the license database as encoded content is licensed, the authorization server can quickly determine whether the encoded content upon which the hash calculation was performed matches the end-user. If the End-User ID and Content ID are not found in a record of the license database, or if the result of the hash calculation provided by the application does not match that for the end-user, the authorization server can generate a result indicating that the end-user is not authorized to access the encoded content and return the result to the application.
It can be desirable in some cases to perform hash calculations only on subsets of the encoded content where the information gathered by performing hash calculations on a subset of the encoded content is sufficient for the intended purpose.
Other characteristics of the encoded content can be modified in order to differentiate the encoded content. For example, the keystoning of video frames can be altered slightly.
Another method of modifying encoded content to create distinct instances while not significantly changing the end-user experience is by looking for and altering data in the encoded content that is superfluous. Many encoding standards call for compression algorithms whereby the resulting encoded content includes superfluous data. If little or no superfluous data exists in encoded content, it can be desirable to insert some otherwise superfluous space into which an identifier can be embedded.
Encoded content is usually compressed by the encoder, which makes choices affecting compression ratios when encoding the content (and, when lossy compression is used, these choices also trade off quality). Since these choices may all be made at compression time, it is possible to select the compression ratio such that there is always sufficient superfluous space remaining for customization.
Further, it is possible to embed a brand and Content ID into and/or customize encoded content by embedding one or more robust watermarks in the encoded content at compression time. This information can then be read from the encoded content by “partially trusted” agents, such as peer-to-peer (“P2P”) file-sharing services and the like. The information in the watermark can be date-stamped and digitally signed. A robust digital watermark may be embedded by any algorithm that satisfies an arbitrary set of requirements for robustness. For example, a few exemplary robust watermarking approaches are described in the document “Digital Watermarking Schemes for Multimedia Authentication”, Chang-Tsun Li, Department of Computer Science, University of Warwick, 15 Feb. 2004.
Other methods of inserting brands and/or Content IDs will occur to those skilled in the art. For example, the content can be altered in a manner that is relatively innocuous to an end-user by modifying least-significant bits within the file to satisfy a parity test.
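One possible reading of the parity approach is sketched below: the file is divided into fixed-size blocks, and at most one least-significant bit per block is flipped so that the parity of the block's least-significant bits equals the identifier bit to be carried. The block size, the choice of which bit to flip and the function names are assumptions made for this sketch.

```python
def block_parity(block: bytes) -> int:
    """Parity (0 or 1) of the least-significant bits of the bytes in a block."""
    return sum(b & 1 for b in block) % 2


def embed_by_parity(data: bytes, bits, block_size: int) -> bytes:
    """Flip at most one least-significant bit per block so each block's parity equals its bit."""
    out = bytearray(data)
    for i, bit in enumerate(bits):
        start = i * block_size
        if block_parity(out[start:start + block_size]) != bit:
            out[start] ^= 1  # flipping a single least-significant bit toggles the parity
    return bytes(out)


def read_by_parity(data: bytes, n_bits: int, block_size: int):
    """Recover the embedded bits by running the parity test on each block."""
    return [
        block_parity(data[i * block_size:(i + 1) * block_size])
        for i in range(n_bits)
    ]


# Illustrative data only.
content = bytes(range(64))
id_bits = [1, 0, 0, 1, 1, 0, 1, 0]
marked = embed_by_parity(content, id_bits, block_size=8)
```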
For video or still-image content, areas of relatively high complexity provide camouflage to minor modifications made. Similarly, in audio content, portions of relatively complex sounds enable modifications to be made without significantly impacting the experience for an end-user.
Where the encoded content is customized to embed a CCTID or the like, the customization could include redundant data to identify the CCTID so that the CCTID is still legible if some of the customizations in the encoded content are tampered with.
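As one simple illustration of such redundancy, each identifier bit can be repeated several times and recovered by majority vote. A repetition code is used here only as an example; the description does not specify the redundancy scheme, and the function names are hypothetical.

```python
def encode_repetition(bits, copies: int):
    """Repeat each identifier bit a fixed number of times."""
    return [b for b in bits for _ in range(copies)]


def decode_majority(coded, copies: int):
    """Recover each identifier bit by majority vote over its copies."""
    return [
        1 if sum(coded[i:i + copies]) * 2 > copies else 0
        for i in range(0, len(coded), copies)
    ]


# Illustrative data only: four identifier bits, five copies of each.
identifier_bits = [1, 0, 1, 1]
coded = encode_repetition(identifier_bits, copies=5)

# Simulate tampering with a minority of the copies of some bits.
tampered = list(coded)
for index in (0, 1, 7, 13):
    tampered[index] ^= 1

recovered = decode_majority(tampered, copies=5)
```

With five copies per bit, the identifier in this sketch survives as long as no more than two copies of any single bit are altered.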
Where the encoded content is divided into a set of segments, distinct instances can be generated for a subset of said segments, with the remainder of the segments being used in their original form. The encoded content is then reassembled from the remainder of the segments being used in their original form and one of the instances corresponding with each segment for which instances were generated. In this manner, storage requirements for the unassembled components (the original segments and segment instances) can be reduced.
While the method and apparatus disclosed above are described with respect to monitoring file-sharing traffic, encoded content registered in storage or communicated via another type of network can be scanned and authenticated in a similar manner.