COMPUTER-IMPLEMENTED METHOD FOR SELECTIVELY OBSCURING AND DE-OBSCURING IDENTITIES IN A MEDIA
TECHNICAL FIELD
The present invention relates to a computer-implemented method for selectively obscuring and de-obscuring identities in a media as well as a system configured to execute the method and a non-transitory computer readable medium containing instructions relating to executing said method.
PRIOR ART
US 2014/086493 A1 discloses a method for obscuring objects in photos, wherein the one or more objects include one or more of at least one face, text, and at least one logo. The user provides preferences associated with one or more photos, including an indication of which objects to obscure and optional additional preferences with an indication to select objects to selectively unobscure. Such indications to unobscure are given and stored in the social network system where the photo is stored.
US 11,157,646 B2 discloses a method for obscuring privacy-sensitive areas in photos, wherein the original photo is transformed into a modified image comprising one or more obfuscated privacy-sensitive areas as well as into an encrypted privacy-sensitive image having one or more unobfuscated privacy-sensitive image parts. If the receiver of the photo has the possibility, i.e. the necessary key, to decrypt the encrypted privacy-sensitive image, the obfuscated part(s) of the modified image are replaced by the decrypted unobfuscated privacy-sensitive image areas in a protected memory mode.
EP 2 260 469 A4 discloses a method and system of identity masking to obscure identities corresponding to face regions in an image. A face detector is applied to detect a set of possible face regions in the image. Then an identity masker is used to process the detected face regions by identity masking techniques in order to obscure identities corresponding to the regions. For example, a detected face region can be blurred by a motion blur algorithm as if it were in motion, such that the blurred region can no longer be recognized as the original identity. Alternatively, the detected face region can be replaced by a substitute facial image by a face replacement algorithm to obscure the corresponding identity.

US 10,628,922 B2 relates to an optional automatic obfuscation of a human or other subject in digital media based on a vector comprising features identifying the human or other subject, especially facial or object recognition.
US 11,482,041 B2 relates to a computerized image manipulation method in which an input image of a person is replaced with an updated image comprising a combination of a new facial image and the base face of the input image.
Finally, most related work has been trained on datasets of frontal mugshots, highlighting the need for future work dealing with images of real-world settings.
SUMMARY OF THE INVENTION
Embodiments of the prior art generally relate to providing privacy in a social network system. In some embodiments, a method includes recognizing one or more objects in at least one media. The method also includes determining one or more objects to be obscured in the at least one media based on one or more user preferences. The method also includes causing the at least one media to be displayed such that the determined one or more objects are obscured.
Based on the prior art it is an object of the present invention to improve the handling and security of the obfuscation for the user. Furthermore, it is an object for an owner or user to be able to distribute control access rights to media to other users when there are objects related to such other users in the media.
Such an object is achieved with a computer-implemented method for selectively obfuscating identities in a media on a computer of a user according to claim 1. Such a computer-implemented method for selectively obfuscating identities in a media on a computer by a user comprises the steps of receiving a media, detecting, in the media, a visual identity, creating an anonymized visual identity for the visual identity, wherein the anonymized visual identity is no longer recognizable as the human subject, creating metadata for each visual identity, creating an embedding for each anonymized visual identity, modifying the media with the embeddings to create an obfuscated media, sending the metadata for de-obfuscation to a broker, and sending the obfuscated media to a sharer. The media M can be from the group comprising an image, a video, a digital photorealistic avatar, a 3D model or a neural representation. The visual identity can comprise a set of visual information from which a person's real identity can be derived by a machine, an algorithm or another human without any technical or special domain knowledge. Visual information can be facial features, body parts and modifications, hair style, a person's appearance, accessories, gestures, and any combination of the above.
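The sequence of steps above can be illustrated with the following minimal Python sketch; all data structures and function names (Embedding, Metadata, egonymize, detect, anonymize) are illustrative assumptions rather than part of the claimed method:

```python
from dataclasses import dataclass
from typing import Any
import uuid

@dataclass
class Embedding:      # Ei: carried inside the obfuscated media A
    uid: str          # IDi, linking the embedding to its metadata Bi
    anonymized: Any   # Ai, the anonymized visual identity

@dataclass
class Metadata:       # Bi: sent to the broker for later de-obfuscation
    uid: str
    original: Any     # Fi, the identity-revealing information

def egonymize(media, detect, anonymize):
    """Turn a media M into an obfuscated media A plus per-identity metadata Bi."""
    embeddings, metadata = [], []
    for fi in detect(media):              # detect each visual identity Fi
        ai = anonymize(fi)                # create anonymized identity Ai
        uid = uuid.uuid4().hex            # unique identifier IDi
        embeddings.append(Embedding(uid, ai))
        metadata.append(Metadata(uid, fi))
    obfuscated = (media, embeddings)      # stand-in for the "modify media" step
    return obfuscated, metadata           # A goes to the sharer, Bi to the broker
```

The pairing of each embedding Ei with its metadata Bi via the shared IDi is what later allows selective de-obfuscation per identity.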
Usually, each of the at least one metadata Bi is connected to one of the at least one embedding Ei with a unique identifier IDi. Such a unique identifier IDi makes it possible to identify the metadata Bi at the broker, which is needed for de-obfuscation to obtain the visual identity Fi.
The step of generating an anonymized visual identity Ai based on a visual identity Fi can be executed deterministically or stochastically. A stochastically created anonymized visual identity Ai provides even more privacy, since different images of the human subject are related to different synthetic images as obfuscated images. Furthermore, repeating the step of generating an anonymized visual identity based on the same visual identity Fi yields a different anonymized visual identity Ai for every execution. Automatic comparison of such stochastically altered images of a human subject therefore does not point towards the same real identity of the human subject.
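The deterministic and stochastic variants can be sketched as follows, with a hash standing in for the actual generative anonymization model (an illustrative assumption, not the claimed technique):

```python
import hashlib
import secrets

def anonymize_deterministic(fi: bytes) -> bytes:
    # the same visual identity Fi always maps to the same Ai
    return hashlib.sha256(fi).digest()

def anonymize_stochastic(fi: bytes) -> bytes:
    # fresh randomness on every call: repeated executions on the same Fi
    # yield different Ai, so obfuscated images cannot be linked automatically
    return hashlib.sha256(secrets.token_bytes(16) + fi).digest()
```

The stochastic variant is what prevents an automatic comparison of two obfuscated images from revealing that they show the same subject.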
The step of generating an anonymized visual identity Ai based on a visual identity Fi can comprise a predetermined parameter for the level of obfuscation of the visual identity Fi. Such a choice allows the user to alter the different features of an obfuscated visual identity to be more or less similar to the appearance in the original media. The creation of a synthetic media can be considered one type of obfuscation, while placing a 2D code on the face of the human subject can be considered a second (stronger) type of obfuscation.
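One possible interpretation of such a level-of-obfuscation parameter is a feature-wise interpolation between the original and a fully synthetic identity; the following sketch is an assumption for illustration only, with lists of floats standing in for pixel or latent features:

```python
def blend_identity(fi, synthetic, level):
    """Interpolate feature-wise between Fi and a fully synthetic identity.

    level = 0.0 stays close to the original appearance,
    level = 1.0 is fully synthetic.
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("obfuscation level must be in [0, 1]")
    return [(1 - level) * f + level * s for f, s in zip(fi, synthetic)]
```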
Besides a 2D code on the face, the embedding can also take the form of watermarks, steganography, QR tags, Fourier-domain transformations, compressed pixel rings and other embedded codes which can be more or less visible to a human viewer. The embedding Ei comprises several elements, among them the anonymized identity Ai and the unique identifier IDi.
One example of how an embedding Ei is put into an image is a synthetic image displaying Ai; in such a case, the other data points in the embedding are hidden somewhere else but linked to Ai. Another example is the QR code, where the pixel values of the anonymized identity Ai are encrypted and placed around or along one or more edges of the QR code.
An obfuscated media A can take several further forms; they all have in common that the information on the visual identity Fi is either no longer present in the image or no longer accessible, e.g. embedded only as encrypted pixel values. Furthermore, media A can always be reconstructed into an anonymized media M*, which displays the anonymized visual identities Ai in pixel space and is still considered an obfuscated media based on media M. Finally, it is also possible to use media A directly as output media M*, i.e. M* = A.
Each of the at least one metadata Bi can be related to one of the at least one visual identity Fi by either pixels of this visual identity Fi or by a latent vector based on this visual identity Fi and/or the at least one anonymized visual identity Ai is related to the at least one visual identity Fi by either pixels of this visual identity Fi or by a latent vector based on this visual identity Fi.
The method can also use various encryption schemes. A key Ki is generated for each of the at least one visual identity Fi. This key Ki then encrypts this visual identity Fi yielding an encrypted visual identity EFi. Then according to one scheme, the key Ki can be included in the at least one embedding Ei and the encrypted visual identity EFi is included in the at least one metadata Bi. According to another scheme, the key Ki can be included in the at least one metadata Bi and the encrypted visual identity EFi is included in the at least one embedding Ei.
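The two encryption schemes can be sketched as follows; the XOR cipher is a deliberately simplified placeholder for a real symmetric cipher (e.g. AES-GCM), and all names are assumptions of this sketch:

```python
import secrets

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # placeholder symmetric cipher; a real system would use e.g. AES-GCM
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def scheme_key_in_embedding(fi: bytes):
    ki = secrets.token_bytes(16)            # generate key Ki for Fi
    efi = xor_cipher(fi, ki)                # encrypted visual identity EFi
    return {"key": ki}, {"efi": efi}        # Ki in embedding Ei, EFi in metadata Bi

def scheme_key_in_metadata(fi: bytes):
    ki = secrets.token_bytes(16)
    efi = xor_cipher(fi, ki)
    return {"efi": efi}, {"key": ki}        # EFi in embedding Ei, Ki in metadata Bi
```

In either scheme, Fi can only be recovered by combining the part carried with the media and the part held by the broker.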
The broker can be configured to allow the viewing permissions of an obfuscated media A related to a media M to be managed by determining access to the metadata Bi of the visual identities Fi. In the absence of such a permission, media M*, a synthetic reconstruction based on the obfuscated media A, is shown. It is possible that permissions relate only to a part of the visual identities present in a media A.
The method can comprise a step of giving or withdrawing additional ownership of at least one visual identity Fi by an owner of a visual identity Fi at the broker to another user registered at the broker. Such ownership can be distributed between organisations, friends, partners or e.g. family members. This step can be executed manually by a user, or it can be executed e.g. with an automated identification algorithm that auto-identifies and tags everyone in the contacts of an owner so that the ownership transfer is automatically requested.

A computer-implemented method for viewing an obfuscated media with obfuscated identities created according to the above-mentioned method, wherein the media is received from a sharer by a viewer using a retrieval device, is disclosed in claim 10.
Such a computer-implemented method for viewing an obfuscated media A comprises the steps of receiving the obfuscated media A from the sharer on the retrieval device, identifying and extracting, in the obfuscated media A, at least one embedding Ei corresponding to an anonymized visual identity Ai, parsing each of the at least one embedding Ei to obtain a unique identifier IDi and the anonymized visual identity Ai, using the unique identifier IDi to receive from a broker the metadata Bi relating to the associated anonymized visual identity Ai, recovering at least one visual identity Fi based on the metadata Bi and the anonymized visual identity Ai, and reconstructing the media to be displayed for the viewer based on the level of recovered visual identities Fi.
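These viewing steps can be sketched as follows, with dicts standing in for the embeddings Ei and a callback standing in for the broker request (both assumptions of this sketch):

```python
def degonymize(obfuscated, fetch_metadata, recover):
    """Reconstruct the viewable media from an obfuscated media A.

    Each embedding Ei is a dict {"uid": IDi, "ai": Ai};
    fetch_metadata(uid) returns Bi if the viewer is authorized, else None;
    recover(bi, ai) rebuilds the visual identity Fi.
    """
    media, embeddings = obfuscated
    shown = []
    for ei in embeddings:                 # identify and extract each Ei
        bi = fetch_metadata(ei["uid"])    # request Bi from the broker via IDi
        if bi is None:
            shown.append(ei["ai"])        # no permission: Ai remains visible
        else:
            shown.append(recover(bi, ei["ai"]))   # permission: Fi recovered
    return media, shown
```

Depending on how many metadata Bi the broker releases, the result corresponds to media M 100', M' 300 or M* 400.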
The parsing step is used for each anonymized visual identity in the media. Each Ei corresponds to one Ai, with Ai being part of Ei; therefore, Ei needs to be detected and then extracted first from media A. Once Ei is obtained, it is parsed and IDi, Ai and, depending on the variant, additional data are obtained (according to the arrow flow in Figs. 3, 5, 7, 9, 11 and 13).
It is noted that retrieval of Bi is only possible if the requesting user has viewing permission for Fi. Otherwise, Bi will not be returned by the broker or can be an empty data structure.
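This permission-gated retrieval can be sketched as follows (the data layout is an assumption for illustration):

```python
def fetch_metadata(store, permissions, uid, viewer):
    """Return Bi for IDi only if the viewer may see the identity Fi.

    store maps IDi -> Bi and permissions maps IDi -> set of authorized
    viewers. Without permission, nothing is returned and the caller
    falls back to the anonymized identity Ai.
    """
    if viewer in permissions.get(uid, set()):
        return store.get(uid)
    return None
```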
The level of recovered visual identities Fi can indicate two different situations. One is related to the number of visual identities Fi, e.g. the number of human subjects in the media, e.g. three out of six in Fig. 14B. The other is related to the level of obfuscation which can, according to various embodiments, be determined by the user.
Parts of the obfuscated media A can be encrypted. Then, after receiving the at least one metadata Bi from the broker 20 and after having identified at least one anonymized identity Ai in a parsing embedding step, a key Ki is extracted either from the metadata Bi or from the at least one embedding Ei related to the anonymized identity Ai, and an encrypted visual identity EFi is extracted from the at least one embedding Ei related to the anonymized identity Ai or from the metadata Bi, respectively, wherein the key Ki is used to decrypt the encrypted visual identity EFi into the related visual identity Fi.

Furthermore, it is an object for an owner or user to be able to control access rights to media published by other users on the system. A computer-implemented method of claiming visual identity ownership for a visual identity by a user at a broker in the framework of a stored visual identity, based on one or more media for which a plurality of metadata representing visual identities in such media were uploaded by a third party, is claimed in claim 13.
Such a method comprises the steps of claiming that a visual identity Fi should be owned by a different user, validating that the claim is legitimate and correct and if the claim is legitimate, transferring ownership of the corresponding metadata Bi of the visual identity Fi to the user.
Here, claiming ownership is initiated by a claimant. This claimant can be an automated system. A user as claimant can have such a system initiate an ownership claim on their behalf automatically. It is noted that the method transfers the ownership to the user whose real identity was assessed. It is possible to alter the method to restrict the transfer "to the user claiming it".
Furthermore, a user can claim an identity for another person (not necessarily another user): for example, a legal guardian claims an identity for a minor. In that case, the identity of the claimant differs from the visual identity being claimed. This can be covered by an additional step of validating that the claim is legitimate and correct when the claimant is not the user.
Moreover, a third user can be enabled to initiate an ownership claim for another user, for example if a claimant identifies a friend in someone else's photo. When the user tags that friend and specifies which user that is, the system is able to initiate the corresponding claim automatically.
In contrast to claim 9 (ownership transfer), which is always initiated by an owner or by an algorithm or automatic program with permission granted by the owner of a visual identity Fi, the claims 13 and 14 (identity/ownership claim) can be initiated by a user, algorithm or automatic program that is not the owner. Ownership can thus either be transferred by the current owner (or an algorithm or automatic program) to another or an additional registered user, or be claimed by a claimant, which is either i) an automatic program or ii) a registered user who claims it for themselves or for another registered user. As mentioned above, if the user claims it for themselves, this does not always mean that the claimed visual identity belongs to their real identity; it could also be the claim of a legal guardian for the visual identity of a child, for example.
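The claim-validate-transfer flow described above can be sketched as follows; the owner mapping and the validate callback are illustrative assumptions, not the claimed implementation:

```python
def claim_ownership(owners, uid, claimant, beneficiary, validate):
    """Transfer ownership of the metadata Bi identified by IDi after validation.

    owners maps IDi -> current owner; validate(claimant, beneficiary, uid)
    must return True only for a legitimate and correct claim. The transfer
    goes to the beneficiary, who may differ from the claimant, e.g. a legal
    guardian claiming the visual identity of a minor.
    """
    if not validate(claimant, beneficiary, uid):
        return False                      # illegitimate claim: nothing changes
    owners[uid] = beneficiary             # ownership of Bi is transferred
    return True
```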
Furthermore, any of the above-mentioned methods can be provided on a non-transitory computer readable medium containing instructions that, when executed by a processor, cause the processor to execute any one of said methods. These instructions can be provided and distributed on different computer systems, such as a broker, a sharer, a hosting platform, a peer-to-peer network, a smartphone of a user, or a camera which can be used to capture the media to be egonymized.
These methods improve privacy for online media by protecting visual identity, making media sharing safer and counteracting malicious media manipulation through selectively obfuscating the identities of people in posted media in a reversible way.
A registered user of such a system has, and remains in, full control of their visual identity, and especially their face, in media posted online, with the possibility of deciding who can view the face and for how long, since media can be "unposted". Since it is possible to define permissions per visual identity, not per media, the user remains in control of their visual identity also in media other people post. The visual identity in the media is no longer "donated" to the sharer: the media is stored with the sharer fully obfuscated, and the user decides how other viewers can view their visual identity. The method can process the original media fully locally and does not store it online. The method preferably includes the function of the applicant as a neutral broker for the metadata needed by others to view the corresponding visual identity. The ownership of a visual identity in posted media is known and can be traced, ensuring authenticity and counteracting deep fakes.
Although the above paragraph mentions faces as one possibility of a visual identity, other elements can be part of the object recognition, e.g. body, body parts, clothing and accessories. It is possible to apply obscuring algorithms to skin colour consistency, facial features, clothing etc. that can replace them with a different face or clothing item, for example. Besides full obfuscation, the method allows for partial obfuscation, in which a number of visual identities are allowed to be de-obfuscated so that the appearance of the modified image remains intact although the person can no longer be identified, since obscuring is achieved through modifying facial features. The obscurity can be adjustable.
Original visual identities in the media item are accessed and perturbed to generate a synthesized and anonymized visual identity which preserves at least some of the original attributes of the corresponding original visual identity.
It is possible to replace a region of a media item, e.g. an image, containing the representation of the visual identity of a user, e.g. an image of that region showing the body or face of the human subject. A machine-readable data code with the dimensions of that region, comprising the synthesized facial image or the image of the body, is created and replaces the image data of the region corresponding to the face or body of the human subject, creating a selectively obscured identity image.
The specification uses a number of terms which are defined as follows:
Further embodiments of the invention are laid down in the dependent claims.
BRIEF DESCRIPTION OF THE DRAWINGS
Preferred embodiments of the invention are described in the following with reference to the drawings, which are for the purpose of illustrating the present preferred embodiments of the invention and not for the purpose of limiting the same. In the drawings,
Fig. 1 shows an overview of an obfuscating, de-obfuscating and viewing system, called Egonym System, according to an embodiment of the invention;
Fig. 2 shows an overview of an obfuscating system according to an embodiment of the invention;
Fig. 3 shows an overview of a de-obfuscating system according to an embodiment of the invention;
Fig. 4 shows a detail view of an embodiment of the obfuscating system of Fig. 2;
Fig. 5 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3;
Fig. 6 shows a detail view of an embodiment of the obfuscating system of Fig. 2 with an encryption step of the Visual Identity;
Fig. 7 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3 with a decryption step of the Visual Identity;
Fig. 8 shows a detail view of a further embodiment of the obfuscating system of Fig. 2 with a further encryption step of the Visual Identity;
Fig. 9 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3 with a decryption step of the Visual Identity;
Fig. 10 shows a detail view of a further embodiment of the obfuscating system of Fig. 2 with the Visual Identities Fi encoded in latent space;
Fig. 11 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3 with Anonymized Identities Ai encoded in latent space;
Fig. 12 shows a detail view of a further embodiment of the obfuscating system of Fig. 2 with the Visual Identities Fi and Anonymized Identities Ai encoded in latent space in the Generate Metadata step;
Fig. 13 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3 with the Visual Identities Fi and Anonymized Identities Ai encoded in latent space in the Process Metadata step;
Fig. 14A shows an example of a media M with six visual identities;
Fig. 14B shows the image of Fig. 14A with obfuscated visual identities for three of the visual identities;
Fig. 14C shows the image of Fig. 14A with obfuscated visual identities for all six visual identities according to one type; and
Fig. 14D shows the image of Fig. 14A with obfuscated visual identities according to a second type.
DESCRIPTION OF PREFERRED EMBODIMENTS
A system according to an embodiment of the present invention is an end-to-end system and method, standalone or integrated into a larger platform, to protect people's visual identity. Fig. 1 gives a high-level overview of the system, which comprises two main stages: Egonymization (media obfuscation) and Degonymization (selective media de-obfuscation). The system as a whole, but also all individual components, are platform independent and able to run on mobile devices, desktop computers, laptops or cloud-hosted servers. An overview of the method comprises an approach for de-identification, which is called egonymization. A technique for segmentation augmentation follows to create variances of segmentation maps that are used for egonymization. The goal here is to allow the users, through segmentation augmentation, to control the egonymization process and therefore the degree to which the identity changes. Next, an identifier and optionally an encryption key are embedded together with the anonymized visual identity into an egonymized media so that the extracted original visual identity can be sent safely to a sharer with the possibility for users to receive the media. Finally, the process of re-identification, i.e. degonymization, is explained.
Fig. 1 shows an overview of an Egonym System according to an embodiment of the invention. A registered user 10 egonymizes media M 100, which yields two pieces of data: the egonymized media A 200 and some metadata Bi 125 for each visual identity Fi 121. Media and metadata are sent to different locations: the egonymized media A 200 is sent to a sharer 30 (i.e. hoster, cloud provider, social network, peer-to-peer network, local hard drive), while the metadata Bi 125 are sent to a broker system (i.e. Egonym, the sharer, a third party, or a peer-to-peer network). To recover the media M 100, a viewer needs to be authorized to access both media A 200 at the sharer 30 and each metadata Bi 125 for each visual identity Fi 121, which are combined through Degonymization to provide media M 100', not necessarily bit-level accurate but perceptually equivalent to the original media M 100 (for an Authorized Viewer 11). To achieve this, the de-obfuscating step as shown with box 60 accesses 68 the broker 20 system for receiving metadata Bi 125 to the extent the user is authorized.
In Figs. 1 to 3, thick lines between boxes indicate communication and execution lines of instructions which are used in the embodiments of the invention. Thin lines indicate optional features. In all drawings Fig. 1 to 13, dashed and dotted lines mainly indicate the permission paths of the different users.
For viewers 14 who are not authorized (Unauthorized Viewer), Degonymization will instead yield anonymized media M* 400. If a viewer 13 is partially authorized, i.e. the viewer is only authorized to access a subset and not every metadata Bi 125, then Degonymization will yield a partially degonymized version of media M in which only those visual identities Fi 121 are reconstructed for which the viewer 13 has authorized access 68 to the corresponding metadata Bi 125. All other visual identities Fi 121 will remain anonymized, that is, anonymized visual identities Ai 123.
The unauthorized viewer 14 is to be understood to be a user of the Egonym System trying to access a media A. Then, the degonymization method can alter media A 200 into media M* 400, e.g. removing embedding information from media A 200. Of course, any other viewer of media A, accessing media A directly at the sharer will just see media A as such, since no handling of media A occurs. In a further embodiment, it is possible to include the instructions, which cause a processor to execute the degonymization method, in the sharer's server software, so that this alteration of media A 200 into media M* 400 also happens when this sharer 30 is contacted for retrieval of media A 200 and it is not the degonymization part 60 asking for retrieval 69.
When sharer 30 or broker 20 is mentioned, this term relates to the computer system of this entity comprising inter alia storage for media and data and access rules for users 10 and viewers 11, 13, 14. The necessary connections between the sharer 30 or broker 20 and the computer system, e.g. a smartphone, of a registered user 10 or a viewer 11, 13, 14 are visualized in Fig. 1 by the arrows between the different features of said representation. The authorized viewer 11 is connected to the dashed line 61, the partially authorized viewer 13 is connected to the dashed and dotted line 63 and the unauthorized viewer 14 is connected to the dotted line 64.
The egonymization step, i.e. the media obfuscation of Fig. 2, and the degonymization step, i.e. the media de-obfuscation of Fig. 3, are presented in an agnostic way, in the information technology context, with respect to the different variants. It is important to know here that Fi and Ai can be encoded either in pixel space or in latent space. Latent space is used as an embedding of a set of items within a manifold in which items resembling each other are positioned closer to one another.
Fig. 2 shows an overview of an obfuscating system according to an embodiment of the invention. Egonymization 40, i.e. media obfuscation, allows registered users 10 to input media M 100, like images, videos, 3D assets, photorealistic avatars or neural representations, from which it extracts 110 one or more original visual identities Fi 121. The box 120 in Fig. 2 shows the obfuscating-related steps which are conducted for each visual identity Fi 121. The method de-identifies one or more original visual identities 121 by generating 122 a new unidentifiable anonymized visual identity Ai 123 for each original visual identity Fi 121 and then embeds 127 the anonymized visual identity Ai 123 with a unique identifier IDi 126' as embedding Ei 128 back into media M 100, yielding media A 200.
Anonymized visual identities Ai 123 (see Fig. 3) in pixel space may be visually aesthetic and realistic yet different visual identities, blurred or pixelated faces, or otherwise anonymized. The general composition, lighting and scenery of the modified media M' or M* after the de-identification process may remain the same as they were in the original input media M 100. In this context, the modified media M' or M* are modified in the image portion related to the user in a broad sense. This does not necessarily only comprise the face of the user, but can also comprise their body, body parts, skin properties, clothing etc. As a general concept, the modification can also comprise a portion of the image environment, such as information relating to the place etc.
This egonymized media A 200 may be stored online, e.g. with a sharer 30, and distributed; when viewed, the original visual identities Fi 121 are obfuscated. In addition, metadata Bi 125 is created 124 for each visual identity Fi 121, which is to be stored in a different location accessible only to authorized viewers, e.g. with the broker 20. Metadata Bi 125 consists of the identity-revealing information of Fi 121: either the actual visual identity Fi 121 in pixel or latent space, the latent space offset which, applied to the anonymized visual identity Ai' 123, reconstructs Fi 121' (see Fig. 11), or the crypto key Ki 136' (see Fig. 7) that encrypts the encrypted visual identity EFi 135 if it is part of the embedding Ei 128. Furthermore, it also stores the unique identifier IDi 126' to create a link between embedding Ei 128 and metadata Bi 125. Said unique identifier IDi 126' is created in a unique identifier generator 126 for each Fi and is integrated into metadata Bi 125 and also used in the generate embedding step 127. Therefore, the unique identifier IDi 126' is included with each embedding Ei 128 together with each anonymized identity Ai 123, being introduced into media M 100 in the modify media step 129 to create the egonymized media A 200, the obfuscated media.
Fig. 3 shows an overview of a de-obfuscating system according to an embodiment of the invention. To properly consume the egonymized media A 200, it must first be received from the sharer 30 system and then degonymized (de-obfuscated), shown in box 220 in Fig. 3. Again, this does not take place on a media level, i.e. for the entire image, as would be the case for an encryption-based system, but instead on an identity level. This means that for media containing multiple identities, as shown in Figs. 14A to 14D with six identities, the original visual identities may be recovered and displayed for some of them (authorized use case), while the anonymized visual identities may be shown for others (unauthorized use case). Therefore, each of the identities is handled according to Fig. 3; one after the other, they pass through box 220, which shows the way of handling any detected anonymized identity Ai 123.
One or more visual identities Fi 121 present in media M 100 may only be recovered by combining media A 200 and the corresponding metadata Bi 125. Degonymization as shown in Fig. 3 takes care of this combination. For each anonymized identity Ai 123, the embedding is extracted 221, providing embedding Ei 128, which is parsed 222 to access the anonymized identity Ai 123 as well as the unique identifier IDi 126' to access the broker 20. Given the unique user identifier of a viewer 11, 13 or 14 and the embeddings Ei 128 in egonymized media A 200, degonymization requests from the broker 20 the corresponding metadata Bi 125 using the unique identifier IDi 126'. The broker 20 validates whether the viewer 11, 13 (or 14) is authorized, and if so it will send the metadata Bi 125, allowing the original visual identity Fi 121 to be reconstructed by parsing metadata Bi 125 and anonymized identity Ai 123 in step 225 and presented to the viewer 11 (or partially to the viewer 13) after reconstruction 224. Otherwise, the broker 20 will not return metadata Bi 125 and the anonymized visual identity Ai 123 will be presented, which is media M* 400. Note that depending on the implementation, the original media M 100 may not be restored bit-level accurate, but perceptually equivalent, yielding media M 100'. In case only a subset of the visual identities Fi 121 has been reconstructed, this relates to media M' 300.
Viewers 14 either do not belong to the group of registered users 10 or are not authenticated. They will either not see the media at all, or get to see the modified media with de-identified visual identities. This also applies if the user management and/or broker 20 that stores one or more visual identities cannot be reached. It is noted that the dashed and dotted lines in Fig. 3 reflect this and have the authorization levels 61, 63, 64 as shown in Fig. 1.
The broker 20 system stores multiple metadata Bi 125 corresponding to multiple visual identities Fi 121 coming from an arbitrary media M 100, and authorizes requests from registered users 10 to access the metadata Bi 125. It comprises user management that stores relations between registered users 10, identity ownerships and identity viewing permissions. A registered user 10 can access the user management (through a web user interface, mobile application or a provided application programming interface (API)) to view all visual identities assigned to them, grant or revoke viewing permissions for each individual visual identity to one or more other registered users, and delete one or more visual identities assigned to them. The broker 20 may be represented either by a centralized entity, operated by the applicant, a third-party provider or the sharer 30 itself, or by a peer-to-peer network such as a blockchain.
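The broker's user management described above can be sketched as a minimal class; all method names and data structures are illustrative assumptions rather than the broker's actual interface:

```python
class Broker:
    """Sketch of the broker's user management for metadata Bi keyed by IDi."""

    def __init__(self):
        self.metadata = {}   # IDi -> Bi
        self.owners = {}     # IDi -> owning registered user
        self.viewers = {}    # IDi -> set of users with viewing permission

    def register_identity(self, uid, bi, owner):
        self.metadata[uid] = bi
        self.owners[uid] = owner
        self.viewers[uid] = {owner}

    def grant(self, uid, owner, viewer):
        if self.owners.get(uid) == owner:      # only the owner manages permissions
            self.viewers[uid].add(viewer)

    def revoke(self, uid, owner, viewer):
        if self.owners.get(uid) == owner:
            self.viewers[uid].discard(viewer)

    def delete_identity(self, uid, owner):
        if self.owners.get(uid) == owner:      # the identity is effectively "unposted"
            self.metadata.pop(uid, None)
            self.owners.pop(uid, None)
            self.viewers.pop(uid, None)

    def request(self, uid, viewer):
        # Bi is only returned to viewers with permission
        if viewer in self.viewers.get(uid, set()):
            return self.metadata.get(uid)
        return None
```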
Fig. 4 shows a detail view of an embodiment of the obfuscating system of Fig. 2. Identical features have received identical reference numerals. The general approach shown in Fig. 2 is exemplified in Fig. 4. The visual identity Fi 121 is delivered to the generate metadata step 124 and is also used to generate a new identity 122, creating an anonymized identity Ai 123, which is used in a generate embedding 127 step to generate the embedding Ei upon reception of a unique identifier IDi 126', which is also used in the metadata Bi 125.
Fig. 5 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3. Identical features have received identical reference numerals. The general approach shown in Fig. 3 is exemplified in Fig. 5. The parse embedding step 222 delivers the IDi 126', which is forwarded to the broker 20. Within the framework of the authorizations connected to IDi 126', the broker 20 delivers metadata Bi 125 for the re-identification process with viewing permission, which is delimited by box 230.
The processing of metadata 225' delivers the visual identities Fi 121, which are used, as well as the anonymized identities Ai 123, within the framework of the system as shown e.g. in Fig. 3.
Fig. 6 shows a detail view of an embodiment of the obfuscating system of Fig. 2 with an added encryption step of the visual identity. Identical features have received identical reference numerals. The general approach as shown in Fig. 2 is exemplified in Fig. 6 with an encryption step of the visual identity.
The steps for each visual identity as shown in box 120 are identical, from using the visual identities Fi 121 over generating a new identity 122 with an anonymized identity Ai 123 as output to generating embeddings 127, with the proviso that the generate metadata 124 from Fig. 2 is modified to a modified generate metadata 124".
This modified generate metadata 124" step comprises as input, as usual, the visual identities Fi 121, which are each encrypted with a key Ki 136' generated in a generate cryptokey 136 step. The encrypt visual identity 134 step creates the encrypted visual identity EFi 135, which is provided as metadata Bi 125 together with the unique identifier IDi 126'.
Fig. 7 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3 with an added decryption step of the visual identity. Identical features have received identical reference numerals. The general approach shown in Fig. 3 is exemplified in Fig. 7 with a decryption step of the visual identity, when encrypted according to Fig. 6.
The steps for each anonymized identity as shown in box 220 are identical, from parsing the embedding 222 to identify the unique identifier IDi 126' to sending it to the broker 20, which delivers the metadata Bi 125. The modified process metadata step 225" comprises delivering the encrypted visual identities EFi 135 from the metadata Bi 125 to the decrypt visual identity step 137, together with the key Ki 136' received from the parse embedding step 222, to re-establish the visual identity Fi 121.
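The encryption variant of Figs. 6/7 (EFi 135 stored with the broker, key Ki 136' travelling in the embedding) may be sketched as follows. This is only an illustrative fragment: the toy XOR keystream cipher stands in for whatever symmetric cipher (e.g. an authenticated cipher such as AES-GCM) a real implementation would use, and all function names are assumptions.

```python
import hashlib
import secrets


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream.
    A real implementation would use an authenticated cipher instead."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


def egonymize_identity(visual_identity: bytes, unique_id: str):
    """Fig. 6 variant: EFi 135 goes into metadata Bi, key Ki 136' into the embedding."""
    key = secrets.token_bytes(32)                        # generate cryptokey 136
    encrypted = keystream_xor(visual_identity, key)      # encrypt visual identity 134
    metadata_bi = {"IDi": unique_id, "EFi": encrypted}   # stored with broker 20
    embedding_ei = {"IDi": unique_id, "Ki": key}         # embedded in media A 200
    return metadata_bi, embedding_ei


def degonymize_identity(metadata_bi, embedding_ei) -> bytes:
    """Fig. 7 variant: decrypt EFi from Bi with Ki parsed from the embedding (step 137)."""
    return keystream_xor(metadata_bi["EFi"], embedding_ei["Ki"])
```

Swapping which of EFi 135 and Ki 136' goes into the metadata and which into the embedding yields the mirrored variant of Figs. 8/9 without any other change to the fragment.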
Fig. 8 shows a detail view of a further embodiment of the obfuscating system of Fig. 2 with an added further encryption step of the visual identity. Identical features have received identical reference numerals. The general approach as shown in Fig. 2 is exemplified in Fig. 8 with a different encryption step of the visual identity Fi 121.
This modified generate metadata 124'" step comprises as input, as usual, the visual identities Fi, which are each encrypted with a key Ki 136' generated in a generate cryptokey 136 step. The encrypt visual identity 134 step creates the encrypted visual identity EFi 135, which is delivered to the generate embedding 127 step, while the key Ki 136' is provided to the metadata Bi 125 together with the unique identifier IDi 126'.
Following Fig. 8, Fig. 9 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3 with an added decryption step of the visual identity. Identical features have received identical reference numerals. The general approach of Fig. 3 is similar to Fig. 7, with a decryption step of the visual identity when encrypted according to Fig. 8.
The steps for each anonymized identity as shown in box 220 are identical, from parsing the embedding 222 to identify the unique identifier IDi 126' to sending it to the broker 20, which delivers the metadata Bi 125. The modified process metadata step 225'" comprises receiving the encrypted visual identities EFi 135 from the parse embedding step 222, whereas the key Ki 136' stems from the metadata Bi 125. The decrypt visual identity step 137' re-establishes the visual identity Fi 121 for further perusal.
The difference between the embodiments of Figs. 6/7 and Figs. 8/9 lies in the opposite handling of the encrypted visual identity EFi 135 and the key Ki 136': one of the two is delivered to the generation of the embedding 127 while the other is included in the metadata 125, and vice versa. Consequently, the parse embedding step differs as well, i.e. it extracts either the encrypted visual identity EFi 135 or the key Ki 136', depending on the embodiment.
Fig. 10 shows a detail view of a further embodiment of the obfuscating system of Fig. 2 with the visual identities Fi 121 encoded in latent space. Identical features have received identical reference numerals.
Here, the modified generate metadata 124"" step comprises as input, as usual, the visual identities Fi 121 in latent space, i.e. Fi (latent) 121', as well as the anonymized visual identities Ai 123 in latent space, i.e. Ai (latent) 123'. The modified generate metadata 124"" step computes the latent offset 141 as latent offset Di 140, which is delivered to the metadata Bi 125. The embedding Ei 128 is generated as above with the latent space anonymized identities Ai (latent) 123'.
Following Fig. 10, Fig. 11 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3 with anonymized identities Ai encoded in latent space 123'. Identical features have received identical reference numerals.
The steps for each anonymized identity Ai as shown in box 220 are identical, from parsing the embedding 222 to identify the unique identifier IDi 126' to sending it to the broker 20, which delivers the metadata Bi 125. The modified process metadata step 225"" comprises receiving the anonymized identity Ai (latent) 123' from the parse embedding 222, whereas the latent offset Di 140 stems from the metadata Bi 125. The apply latent offset 142 step re-establishes the visual identity Fi (latent) 121' for further perusal.
Identity data of visual identities can be encoded into different representations while still adhering to the overall concept of the present system, that is, the selective separation of identity information from the rest of the media. Each representation has its own benefits and drawbacks. However, in all cases, a viewer needs information from both sources (the egonymized media A 200 and the metadata Bi 125) in order to recover the corresponding original visual identity Fi 121. A straightforward representation is the pixel (RGB) values of the original visual identity Fi (pixel) 121". Those pixels are stored directly in metadata Bi with the broker 20 (the generate metadata 124 function only combines Fi (pixel) 121" and IDi 126'). Alternatively, a visual identity Fi 121 can be represented by a proxy, a so-called latent vector Fi 121'. In this scenario, the extract visual identities 110 function converts the detected visual identities from pixel space to latent space using an inversion function (for example an encoder neural network).
To get an anonymized visual identity Ai 123, the system generates a new latent vector given Fi 121'. From Fi 121' and Ai 123' in latent space, a latent offset Di 140 is computed and stored in metadata Bi 125 (which is done using the generate metadata 124 function). Ai 123' as a latent vector is embedded and distributed with the egonymized media A 200. When degonymizing the egonymized media A 200, an authorized viewer 11 receives the correct latent offset Di 140. In case the viewer is an unauthorized viewer 14 (or only a partially authorized viewer 13), a fake latent offset Gi (not shown in Fig. 11) is generated (in the simplest form a zero offset). Together with the extracted latent vector Ai 123', the latent vector Fi 121' is retrieved, and using a generator function (a generative model such as a GAN, a diffusion model, etc.), the pixel values are reconstructed.
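The latent-offset arithmetic of steps 141 and 142 can be sketched as follows; latent vectors are represented here as plain lists of floats, which is an illustrative simplification of whatever latent representation the encoder actually produces.

```python
def compute_latent_offset(fi_latent, ai_latent):
    """Compute latent offset Di 140 as the element-wise difference
    Fi (latent) 121' minus Ai (latent) 123' (step 141)."""
    return [f - a for f, a in zip(fi_latent, ai_latent)]


def apply_latent_offset(ai_latent, di):
    """Apply latent offset (step 142): Fi (latent) = Ai (latent) + Di."""
    return [a + d for a, d in zip(ai_latent, di)]


def fake_latent_offset(dim):
    """Fake offset Gi for unauthorized viewers; in the simplest form all zeros,
    so that applying it simply reproduces the anonymized identity Ai."""
    return [0.0] * dim
```

An authorized viewer receiving the correct Di 140 thus recovers Fi (latent) 121' exactly, while a zero offset leaves the anonymized latent vector Ai 123' unchanged before the generator reconstructs the pixel values.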
Lastly, the anonymized visual identity Ai 123 can also be embedded into the egonymized media A 200 as pixels Ai (pixel) 123" instead of as the latent vector described previously. Those pixels are again generated by the generator function and then brought back to latent space using the inversion function to compute the latent offset Di, which is, as above, stored in metadata Bi 125. This has the advantage that for an unauthorized viewer 14 the system does not need to generate pixels to display the de-identified identities.
Fig. 12 shows a detail view of a further embodiment of the obfuscating system of Fig. 2 with the visual identities Fi (pixel) 121" and anonymized identities Ai (pixel) 123" from the pixel space being encoded in latent space in the generate metadata step 124 for each detected visual identity Fi. The visual identity Fi (pixel) 121" and the anonymized identity Ai (pixel) 123" from the pixel space are inverted to latent space 150 and are provided as visual identity Fi (latent) 121' and anonymized identity Ai (latent) 123' as input to the compute latent offset step 141, creating the latent offset Di 140, which is delivered to the metadata 125.
Fig. 13 shows a detail view of an embodiment of the de-obfuscating system of Fig. 3 with the anonymized identity Ai (pixel) 123" being delivered to the process metadata step 225 using the invert to latent space 150 step, thus creating the anonymized identity Ai (latent) 123'. On the other side, the broker 20 delivers metadata Bi 125 to provide the latent offset Di 140 for the process metadata step 225. The latent offset Di 140 is applied in the apply latent offset step 142 to the anonymized identity Ai (latent) 123', providing the visual identity Fi (latent) 121', which is subsequently converted in the convert to pixel space step 151 into the visual identity Fi (pixel) 121".
Fig. 14A to Fig. 14D are based on https://splash.com/photos/p74ndnYWRY4. Fig. 14A shows an example of a media M with six visual identities. Fig. 14B shows the image of Fig. 14A with obfuscated visual identities for three of the visual identities. Three faces 601, 602 and 603 have been obfuscated by changing some parts of the respective face; in particular, the eyes and the smile of these three faces have been altered. Fig. 14C shows the image of Fig. 14A with obfuscated visual identities of a first type for all faces. The three faces 601, 602 and 603 have been altered in the same way as in Fig. 14B, providing synthetic faces. The further faces 604, 605 and 606 have now been altered as well, especially in the eye region. Fig. 14D shows the image of Fig. 14A with obfuscated visual identities of a second type, i.e. where a significant region of the face is covered by a square 2D code.
As mentioned above, obfuscation can also relate to clothing and other body parts. The obfuscated face(s) 510 are shown as QR codes taking the place of the covered body part, so it is also possible to obfuscate and change clothing parts such as t-shirts 501, trousers 502, pullovers 503 or shoes, as well as body parts 504 such as arms, hands and optionally tattooed regions, and additionally items 505 such as jewellery, watches, bracelets, etc. Besides the shown obfuscation methods, other ways of altering the image can be applied, e.g. blurring.
The persons depicted in Figs. 14A to 14D can preferably claim ownership of their visual identity and request that the visual identity ownership be transferred to them. As a further registered user (such as user 10), such a person then also has the possibility to claim ownership of their visual identity in one or more media within the Egonym System. When a claim for a visual identity is made by a claimant, this claim has to undergo a verification process in order to verify that the claimed visual identity belongs to the user for whom it is claimed. Claim verification is conducted by the current owner and/or a neutral arbiter (i.e. Egonym). Verification proceedings are transparently defined and are intended to establish a link between the visual identity and the user for whom the visual identity is claimed when the proceedings are initiated. In this context, known identification processes, e.g. as used for online banking, can be employed. After positive verification of the claimed visual identity, the ownership of the claimed visual identity is revoked from the current owner and re-assigned to the user for whom the visual identity is claimed. Alternatively, an ownership transfer may be initiated by the current owner by identifying (tagging) a visual identity and a corresponding registered user; this requires a confirmation step from the receiving registered user. In case the broker 20 system is represented by a peer-to-peer network, the role of the broker 20 is that of a registry of users, visual identity ownerships and viewing permissions, while the actual metadata Bi lies with the current owner. Here, the visual identity claim is verified by the current owner on a trust basis. Upon successful verification, the current owner transfers the corresponding metadata Bi to the user for whom the visual identity is claimed and informs the broker about the visual identity ownership change.
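The effect of a positively verified claim on the broker's registry may be sketched, purely illustratively, as follows; the registry layout and the function name are assumptions, and the actual verification proceedings (arbiter, identification process) are deliberately reduced to a boolean input.

```python
def transfer_ownership(registry, identity_id, claimant, verified):
    """Sketch of the claim workflow: on positive verification, ownership of the
    claimed visual identity is revoked from the current owner and re-assigned
    to the claimant, who alone controls viewing permissions from then on."""
    if not verified:                                   # claim failed verification
        return False
    registry["owners"][identity_id] = claimant         # re-assign ownership
    registry["viewers"][identity_id] = {claimant}      # former owner loses control
    return True
```

In the peer-to-peer variant, the same state change would additionally be accompanied by the current owner handing the metadata Bi itself over to the claimant.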
As a result, the former owner no longer has any control over the identity viewing permissions; this ability lies solely with the user for whom the visual identity was claimed.
Encryption of the visual identity data is possible as an optional feature where the identity data is to be encrypted. A visual identity Fi is then encrypted with a key Ki, yielding an encrypted visual identity EFi, which is stored with the broker in the metadata Bi 125. The key Ki is embedded into the embedding Ei 128 and distributed within the egonymized media A 200. In the degonymization step, an authorized viewer 11 receives the metadata Bi 125 containing the encrypted visual identity EFi. Together with the key Ki parsed from the extracted embedding Ei 128, the system decrypts EFi to Fi. While the Egonym System does not depend on such an encryption scheme, it improves its security by not leaving the visual identity data in raw form.
Several use cases to protect visual identities can be contemplated; the following examples are by no means complete. In social media, user A wants to upload an image to their social media account that contains four visible faces: their own, those of two of their friends and one of another person in the background, henceforth friend B, friend C and user D, respectively. Before user A, as registered user 10, uploads the image I, they use the Egonym System to de-identify the faces by replacing them with generated, aesthetically realistic, yet anonymized faces, which yields image X. User A then uploads image X to their social media and grants viewing permissions to friend B, friend C and their extended friend circle, to which user D also belongs. User D one day sees the re-identified image I with their face in it and claims the visual identity. After a successful verification process, user D can now decide who has viewing permissions for their face, because they own it. The same can be done by friend B and friend C as well. Another user F, who does not belong to the extended friend circle of user A but is able to see image X in a post shared by a mutual friend, is not able to identify any person in the image, because they do not have viewing permissions. After befriending user A, they can now see the faces of user A, friend B and friend C, but still only the anonymized face of user D.
Criminal case records handled by a court contain sensitive visual content regarding the litigation. Several groups of people handle those case records and their content, and these groups should or should not have knowledge about the actual identity of the individual people involved. Therefore, de-identifying the content and granting viewing permissions can be an extra measure for protecting the individuals' privacy. Examples of such user groups are judges and jury members, plaintiff and defendant, their lawyers, members of the press and the public. The Egonym System would further help to conceal visual identities during the litigation if, for example, an involved individual turns out to be a minor protected by special legislation.
Medical data is highly sensitive but at the same time valuable for research and medical education. Through the Egonym System, patients can decide which users (e.g. doctors, nurses, medical students, etc.) can view their medical records that contain their visual identity. Even users without viewing permissions for the original visual identity can then make use of the data, while the privacy of the patient is protected. Training autonomous driving models requires a vast amount of real-life video footage. More and more cars are equipped with cameras to record traffic interactions. This data is either used by the car company, sold to third parties or made publicly available. Unfortunately, many people appear in it whose visual identity is revealed without their knowledge or consent. Replacing the original visual identities with generated, aesthetically realistic, yet anonymized identities allows the data to remain usable for machine learning while respecting privacy. Even when the default viewing permissions are set to all, running such a dataset through the Egonym System before uploading allows users to claim their visual identity when finding themselves in a publicly available dataset.
LIST OF REFERENCE SIGNS
10 registered user
11 authorized viewer
13 partially authorized viewer
14 unauthorized viewer
20 broker
30 sharer
40 egonymization / de-identification
60 degonymization / (partial / no) re-identification
61 permission for all visual identities
63 permission for some visual identities
64 no permission
68 accessing broker data
69 media received from sharer to access requesting user
100 media M
100' perceptually equivalent, but not necessarily bit-level accurate media M
110 extract visual identities
120 box for each visual identity Fi
121 visual identity Fi
121' visual identity Fi (latent)
121" visual identity Fi (pixel)
122 generate new identity
123 anonymized identity Ai
123' anonymized identity Ai (latent)
123" anonymized identity Ai (pixel)
124 generate metadata
124' generate modified metadata
124" generate modified metadata
124'" generate modified metadata
124"" generate modified metadata
125 metadata Bi
126 generate unique identifier IDi
126' unique identifier IDi
127 generate embedding
128 embedding Ei
129 modify media M
134 encrypt visual identity
135 encrypted visual identity EFi
136 generate cryptokey
136' key Ki
137 decrypt visual identity
140 latent offset Di
141 compute latent offset
142 apply latent offset
150 invert to latent space
151 convert to pixel space
200 obfuscated media M = media A
220 box for each detected anonymized identity Ai
221 extract embedding
222 parse embedding
224 reconstruct media
225 process metadata
225' process modified metadata
225" process modified metadata
225'" process modified metadata
225"" process modified metadata
230 re-identification with viewing permission
300 media M'
400 media M*
501 t-shirt
502 trousers
503 pullover
504 body part
505 item
510 obfuscated face with 2D code
601 obfuscated face
602 obfuscated face
603 obfuscated face
604 obfuscated face
605 obfuscated face
606 obfuscated face