GB2621390A

Movatterモバイル変換

Info

Publication number: GB2621390A
Application number: GB2211770.9A
Authority: GB
Inventors: Anwar Ahmed Mansoor
Original assignee: Openorigins Ltd
Current assignee: Openorigins Ltd
Priority date: 2022-08-11
Filing date: 2022-08-11
Publication date: 2024-02-14
Also published as: WO2024033476A1; GB202211770D0

Abstract

Method of detecting a picture in picture spoofing attack, comprising: determining depth information of a scene 502; and determining a spoofing attack if part of the scene comprises a substantially flat surface 506. An image of the scene 503 may undergo image processing (object recognition) to determine that part of the scene is a three-dimensional object i.e. not a substantially flat surface 505. Detecting a spoofing attack 506 may use both the depth information and the determination of the presence of a three-dimensional object i.e. when a three-dimensional object has been recognised and the depth information indicates a flat surface then a spoofing attack may be occurring. The substantially flat surface (208, Fig.2) may comprise a curved surface. The depth information may be obtained from a depth sensor such as, time-of-flight sensors, LiDAR, radar, sonar and a multi-perspective (stereographic) camera. Depth information may comprise a plurality of depth values and a substantially flat surface may be determined if a threshold percentage of depth values are within a pre-defined standard deviation range of each other.

Description

METHODS AND SYSTEMS FOR SCENE VERIFICATION

Background

[0001] Embodiments of the present invention relate to methods, apparatus and systems for verification in general, and, in particular, verification of media comprising visual information for the purpose of detecting spoofing attacks.

[0002] Digital media comprising visual information, such as images and videos, have wide applications in the information age. A viewer normally assumes that content of an image or video is genuine, especially if the image or video is captured by a device or application trusted by the viewer.

[0003] However, capturing and transmission of visual information are subject to various spoofing' attacks, in which visual information may be created, modified or reproduced by an unauthorized adversary or program to gain an unfair or even illegitimate advantage. These attacks attempt to present a false scene to a viewer. The false scene may be created by modifying captured media or synthesizing media. There has been an ongoing need for methods by which viewers can verify the origin of important visual information they receive.

[0004] The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known techniques for scene verification or spoofing detection.

Summary

[0005] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

[0006] The invention is set out in the appended set of claims.

[0007] A first aspect provides a computer-implemented method comprising determining, from depth information of a scene, whether at least a part of the scene comprises a substantially flat surface; and detecting a spoofing attack based on said determining, from the depth information of the scene, whether at least a part of the scene comprises a substantially flat surface.

[0008] The methods described herein may be performed by software in machine readable form on a tangible storage medium e.g. in the form of a computer program comprising computer program code means adapted to perform all the steps of any of the methods described herein when the program is run on a computer and where the computer program may be embodied on a computer readable medium. Examples of tangible (or non-transitory) storage media include disks, thumb drives, memory cards etc and do not include propagated signals. The software can be suitable for execution on a parallel processor or a serial processor such that the method steps may be carried out in any suitable order, or simultaneously.

[0009] This acknowledges that firmware and software can be valuable, separately tradable commodities. It is intended to encompass software, which runs on or controls "dumb" or standard hardware, to carry out the desired functions. It is also intended to encompass software which "describes" or defines the configuration of hardware, such as HDL (hardware description language) software, as is used for designing silicon chips, or for configuring universal programmable chips, to carry out desired functions.

[0010] The preferred features may be combined as appropriate, as would be apparent to a skilled person, and may be combined with any of the aspects of the invention.

Brief Description of the Drawings

[0011] Embodiments of the invention will be described, by way of example, with reference to the following drawings, in which: [0012] Figure 1 is a schematic diagram of an environment where a user device is taking an image or a video from a scene; [0013] Figure 2 is a schematic diagram of an environment where a user device is taking an image or a video of content displayed on a screen; [0014] Figure 3 is a block diagram of an exemplary set of components of a user device in which the embodiments of the present invention may be implemented; [0015] Figure 4 is a flow diagram of a method for scene verification according to some embodiments of the invention; [0016] Figure 5 is a flow diagram of a method for scene verification according to some embodiments of the invention.

[0017] Common reference numerals are used throughout the figures to indicate similar features.

Detailed Description

[0018] Embodiments of the present invention are described below by way of example only.

These examples represent the best ways of putting the invention into practice that are currently known to the Applicant although they are not the only ways in which this could be achieved. The description sets forth the functions of the example and the sequence of steps for constructing and operating the example. However, the same or equivalent functions and sequences may be accomplished by different examples.

[0019] A common spoofing technique is the Picture-of-Picture (PoP) attack. In this attack, an adversary first records an image or a video of a scene, possibly edits it, and displays it on a screen. Such a scene is typically a 3-dimensional environment containing visual features, which are expected by a viewer or application program. Then, using a capture device trusted by the viewer or application program, the adversary takes an image or a video of the screen displaying the pre-recorded image or video of the scene containing the visual features and uploads it for viewing by the viewer or for processing by the application program. Without any effective detection mechanism in place, even with the use of a trusted capture device, the viewer or application program will not be able to differentiate whether the scene is directly captured by the trusted device (meaning that authentic content is provided) or the scene is pre-recorded by a device untrusted by the viewer or application program, possibly with some modification, and then being displayed to the trusted device (meaning that the viewer or application program is spoofed with unauthentic content). Without any effective detection mechanism in place, the viewer or application program will not be able to differentiate the direct capture of 3-dimensional visual features in the real world from an indirect capture of such visual features displayed on a 2-dimensional screen. Accordingly, there is a need for a method and a system for verifying that content of the media has provable three-dimensional origins from a trusted capture device, for example, for verifying that the media is taken directly from the origin, instead of being pre-recorded, edited and replayed to deceive the viewer or application program.

[0020] In some of the embodiments / examples, a computer-implemented method for scene verification can be used to detect a spoofing attack. The method comprises determining, from depth information of a scene, whether at least a part of the scene comprises a substantially flat surface; and detecting a spoofing attack based on said determining, from the depth information of the scene, whether at least a part of the scene comprises a substantially flat surface.

[0021] Figure 1 illustrates an environment where a user is taking an image or a video of a scene. As shown in Figure 1, a system 100 comprises a user device 102, a communication network 104 and server device 106.

[0022] In the environment illustrated by Figure 1, the user device 102 is configured to capture visual information of a scene 108. In various embodiments, the scene 108 is a three-dimensional, real-world environment. Although the scene 108 in Figure 1 is depicted as a car, it can be appreciated that the car is only a non-limiting example and that the scene can be any three-dimensional, real-world environment and can include any three-dimensional, real-world object(s). In one embodiment, the user device 102 comprises at least one sensor for capturing media, for example, a camera for capturing images and/or videos. In an alternative embodiment, the user device does not necessarily comprise any sensor for capturing media, and is instead configured to receive data relating to media captured by a separate media capturing device (not illustrated). The user device 102 may also comprise at least one processor for processing the data relating to the captured images or videos and at least one memory for storing raw data and/or processed data relating to the captured images or videos. The user device 102 may also comprise a communication interface for sending data to and/or receiving data from the authenticator device 106 through the communication network 104.

Optionally, the user device can also comprise a display screen for displaying the scene 108 being captured.

[0023] The communication network 104 may include any wired or wireless connection, the internet, or any other form of communication. Although one network 104 is shown in FIG. 1, the communication network 104 may include any number of different communication networks between the user device 102 and the server device 106. The communication network 104 is configured to enable communication between the user device 102 and the server device 106. Various implementations of communication network 120 may employ different types of networks, for example, but not limited to, computer networks, telecommunications networks (e.g., cellular), mobile wireless data networks, and any combination of these and/or other networks.

[0024] The server device 106 is a computing device for displaying or processing the captured images or videos received by the server device 106 from the user device 102. The server device 106 comprises at least one processor and at least one memory for storing instructions and/or data to be processed by the at least one processor. The server device 106 may be configured to display the images or videos captured by the user device 102. Alternatively or additionally, the server device 106 can be configured to analyze the images or videos captured by the user device 102. The server device 106 may also be configured to display an outcome of the analysis of the images or videos captured by the user device 102. In some examples, the server device 106 may have access to information for authenticating a user, a device, an application and/or a process and/or to digital content. For example, the information may include pre-stored information, such as identity information and/or account information of authorised users. The server device 106 may also comprise a communication interface for sending data to and/or receiving data from the user device 102 through the communication network 104.

[0025] Figure 2 illustrates a scenario where a user device 202 is taking an image or a video of a scene being displayed on a screen. As shown in Figure 2, a system 200 comprises a user device 202, a communication network 204 and a server device 206. The user device 202, the communication network 204 and the authenticator device 206 may be identical to, or perform functions similar to those performed by, the user device 102, the communication network 104 and the authenticator device 106 respectively.

[0026] The system 200 also comprises a screen 208. The screen 208 is configured to display an image or a video. The image or video may represent a scene. Such image or video may be pre-recorded and displayed on the screen 208 by a malicious user, possibly with some modifications, in an attempt to deceive a user or an application of the server device. For example, the screen 208 may display an image or a video of a secondhand car being put up for sale online by a user. The user has a strong motive to tamper with the image or video to make it look nicer than it is in real life. Thus, instead of directly taking an image or video of the car using the user device 202 trusted by the server device 206, the user can modify an image or video of the car, display the modified image or video on the screen 208 and then take an image or video of the modified image or video on the screen 208 using the user device 202.

Even if the user device 202 is a device trusted by a user or an application of the server device 206, without any effective detection mechanism in place, the user or application of the server device 206 will not be able to determine whether the scene shown in the image / video are directly captured by the user device 202 or are pre-recorded and then displayed to the user device 202.

[0027] Figure 3 is a block diagram of an exemplary set of components for a system 300 in which the embodiments of the present invention may be implemented. The user device 102! 202 in system 100 / 200 may be implemented as system 300.

[0028] The system 300 may be implemented as one or more computing and/or electronic devices. The system 300 comprises one or more processors 302 which may be micro-processors, controllers or any other suitable type of processors for processing computer executable instructions to control the operation of the system 300. Platform software comprising an operating system 306 or any other suitable platform software may be provided on the system to enable application software 308 to be executed on the system. In some embodiments, the application software 308 may comprise a software program for processing images, deriving data from the images, and processing the data derived from the images according to various methods described herein. The components of the system 300 described herein may be enclosed in a casing.

[0029] Computer executable instructions may be provided using any computer-readable media that are accessible by the system 300. Computer-readable media may include, for example, computer storage media such as a memory 304 and communications media.

Computer storage media, such as the memory 304, include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device. In contrast, communication media may embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave, or other transport mechanism. Although the computer storage medium (the memory 304) is shown within the system 300, it will be appreciated, by a person skilled in the art, that at least a part of the storage may be distributed or located remotely and accessed via a network or other communication link (e.g. using a communication interface 312).

[0030] The system 300 may comprise an input/output controller 314 arranged to receive and process input from one or more input devices 318 which may be separate from or integral to the system 300 and may also be arranged to output information to one or more optional output devices 316 which may be separate from or integral to the system 300. In some embodiments, the input devices 318 may comprise input devices for controlling the operation of the system 300, such as a set of buttons or keys. For example, the input devices 318 may comprise keys for controlling at least one sensing device, such as adjusting an orientation and/or a zoom of a camera, and/or for manipulating an image being displayed on a screen. In some embodiments, the input devices may comprise at least one image sensing device, such as a camera.

[0031] The input devices 318 may further comprise at least one depth sensor. The at least one depth sensor can provide depth information relating to a surface or a scene from a viewpoint. The depth information may contain a plurality of depth values, each representing a distance between a viewpoint (e.g. a depth sensor for sensing the depth values) and a respective point in the scene. The depth information may be in the form of a depth map. The term "depth" is known as the "Z value" or "Z-depth". "Z" in these latter terms relates to a convention that the central axis of view of a camera is in the direction of the camera's Z axis.

[0032] In some embodiments, the at least one depth sensor comprises at least one radio wave based depth sensor, such as a radar sensor, at least one light based depth senor, such as a LiDAR sensor, at least one acoustic depth sensor, such as a sonar sensor, at least one multi-perspective camera setup system.

[0033] LiDAR, also known as "light detection and ranging" or "laser imaging, detection, and ranging", is a time-of-flight technique for determining ranges (variable distance) by targeting an object or a surface with a laser and measuring the time for the reflected light to return to a sensor, e.g. a LiDAR camera.

[0034] Radar, which stands for radio detection and ranging, is a detection technique that uses radio waves to determine the distance (ranging), angle, and radial velocity of objects relative to a site. A radar system consists of a transmitter producing electromagnetic waves in the radio or microwaves domain, a transmitting antenna, a receiving antenna (often the same antenna is used for transmitting and receiving) and a receiver and processor to determine properties of an object. Radio waves from the transmitter reflect off the object and return to the receiver, giving information about the object's location and speed. Radar signals can obtain a distance measurement based on the time-of-flight by transmitting a short pulse of radio signal and measure the time it takes for the reflection of the radio signal caused by a target object or surface to return.

[0036] Sonar, also known as sound navigation and ranging, is an acoustic location technique that uses sound propagation and reflection to measure distances to sound reflective surfaces or detect objects. Sonar can be used to derive a contour of a surface by emitting pulses of sounds and detect echoes reflected from the surface.

[0036] The time-of-flight sensor is a range imaging camera system employing time-of-flight techniques to calculate a distance between its camera and a point on a target surface, by measuring the round-trip time of a signal, such as an artificial light signal emitted by a light source, and reflected by the point on the target surface back to the camera. The time-of-flight sensor can be used to make a digital 3-D representation of a surface by collecting depth information from a plurality of points on the surface using the time of flight of the light (i.e., the time it takes each light signal from the light source to hit a target point and return to the camera).

[0037] A depth sensor based on the multi-perspective camera setup system measures depth by capturing images of the same scene from different perspectives using different cameras. It uses stereophotogrammetry where the depth data of the pixels are determined from data acquired using the multi-perspective camera setup system. To solve the depth measurement problem using the multi-perspective camera setup system, corresponding points in the different images captured by the different cameras are identified. A disparity map can be constructed to indicate the apparent pixel difference or motion between a pair of stereo images. Various techniques exist for deriving a depth map from a disparity map.

[0038] The optional output devices 316 may include a display screen. In some embodiments, the output device 316 may also act as an input device, for example, when the output device 316 is a touch screen. The input/output controller 314 may also output data to devices other than the output device, for example to a locally connected computing device. According to some embodiments, image processing and calculations based on data derived from images captured by the input device 318 and/or any other functionality as described in the embodiments, may be implemented by software or firmware, for example, the operating system 306 and the application software 308 working together and/or independently, and executed by the processor 202.

[0039] The communication interface 312 enables the system 300 to communicate with other devices and systems. The communication interface 312 may include any type of signal transceivers, such as 3G, 4G and/or SG wireless mobile telecommunications transceivers, WIFiTM signal transceivers and/or BluetoothTM transceivers.

[0040] The functionality described herein in the embodiments may be performed, at least in part, by one or more hardware logic components. According to an embodiment, the computing device 300 is configured by programs 306, 308 stored in the memory 304 when executed by the processor 302 to execute the embodiments of the operations and functionality described. Alternatively, or in addition, the functionality described herein may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), Graphics Processing Units (CPUs).

[0041] Figure 4 is a flow diagram of a method for verification according to some embodiments of the invention. The scene verification method provides increased assurance of authenticity and can be used to detect potential spoofing attacks.

[0042] Method 400 starts with step 402, in which depth information of a scene is obtained. The depth information may contain a plurality of depth values, each representing a distance from the viewpoint (e.g. a camera of the depth sensor) to a respective point in the scene (which can be a point on a surface in the scene reflecting a depth sensing signal back to the depth sensor). The depth information may be in the form of a depth map. A depth map typically consists of a two-dimensional array of pixels, each pixel having a value (i.e. the depth value) representing the depth at a location of a scene corresponding to the pixel in the depth map for the scene.

[0043] The depth information can be obtained from measurements taken by at least one depth sensor. In some embodiments, the at least one depth sensor comprises at least one radio wave based depth sensor, such as a radar sensor, at least one light based depth senor, such as a LiDAR sensor, at least one acoustic depth sensor, such as a sonar sensor and/or at least one multi-perspective camera setup depth sensor.

[0045] The term "substantially flat surface" means the surface being produced within the technical tolerance of the method used to manufacture it and the flatness of such a surface being measured within the technical tolerance of the depth sensor used to measure it. The skilled person can appreciate that the condition(s), threshold(s) and/or range(s) required for depth values on a depth map to indicate a substantially flat surface can differ, depending on tolerances of depth sensors and tolerance for manufacturing a flat surface, such as a flat panel display. In some embodiments, the substantially flat surface comprises a slightly curved surface, such as a display screen with a slight curvature. Such a slightly curved surface can be detected as a substantially flat surface, if the depth values measured from it satisfies the condition set for the depth values for determining a substantially flat surface.

[0046] The at least a part of the scene in step 404 takes up a threshold percentage of the area of the scene, and preferably is a significant proportion of the scene. In some embodiments, the at least a part of the scene takes up more than 50% or approximately 100% on a depth map or an image of the scene.

[0047] Then the method 400 proceeds to step 406, in which detection of a spoofing attack is performed based on the determining in step 404. In some embodiments, the detection of a spoofing attack comprises verifying live capture of a three-dimensional scene, i.e. a scene not being substantially flat. In other words, it verifies that an image or video representing a three-dimensional scene or environment is indeed being captured live by an imaging device in a real-world environment, instead of being reproduced, replayed or modified after an original capture or being artificially created and being displayed on a screen to the imaging device. If the image or video is not captured live, there is likely to be a spoofing attack, and in particular, a Picture-of-Picture attack. In some embodiments, the presence of a Picture-of-Picture attack is determined if it is determined in step 404 that at least a part of the scene comprises a substantially flat surface. This is because a substantially flat surface may be a display screen, which is typically flat and can be used by an adversary in a Picture-of-Picture attack to display an image or video pre-recorded from a scene to spoof a viewer or application program receiving this image or video, as illustrated in Figure 2. In some embodiments, the substantially flat surface may be a slightly curved surface. Such a slightly curved surface can represent a slightly curved screen which may also be used in a Picture-of-Picture attack. The at least a part of the scene in step 406 takes up a threshold percentage of the area of the scene, and preferably is a significant proportion of the scene. In some embodiments, the at least a part of the scene takes up more than 50% or approximately 100% on a depth map or an image of the scene.

[0048] In some embodiments, preferably there is an additional step of informing an administrator of the system of the result of the spoofing detection via, for example, an visual and/or audio alert, so that the administrator can take any appropriate actions, such as labelling an image or video of the scene with an informational tag that can be presented along with the media (such as "This media appears to be flat and may be a picture of a screen and not the scene that is purported"), not granting access to any information, data or application requested by a requesting party and/or asking for more information to verify the information provided by the requesting party. In some embodiments, instead of automatically determining a spoofing attack, step 406 may comprise issuing an alert to an administrator the system, so that the administrator can make a determination of whether there is a spoofing attack based on the depth information, an image of the scene and/or other information associated with the scene or a user using the user device 202. In some embodiments, the detection of a spoofing attack is performed based on the determining in step 404 in combination with at least one additional factor. The at least one additional factor may include a determination made based on at least one image of the scene, for example, according to the method of Figure 5.

[0049] Figure 5 is a flow diagram of a method for verification according to further embodiments of the invention. The scene verification method provides increased assurance of authenticity and can be used to detect potential spoofing attacks.

[0050] Steps 502 and 504 in Figure 5 are identical to steps 402 and 404 in Figure 4 respectively. As with method 400, method 500 starts with a step (step 502), in which depth information of a scene is obtained. As described with respect to Figure 4, the depth information can be obtained from measurements taken by one or more depth sensors. In some embodiments, the one or more depth sensors comprise at least one time-of-flight sensor, at least one radio wave based depth sensor, such as a radar sensor, at least one light based depth senor, such as a LiDAR sensor, at least one acoustic depth sensor, such as a sonar sensor, and/or at least one a multi-perspective camera setup depth sensor.

[0051] As with method 400, subsequent to step 502, the method 500 proceeds to step 504, which involves determining, from the depth information of the scene, whether at least a part of the scene comprises a substantially flat surface. The features and processes described above with reference to step 404 apply, mutatis mutandis, to step 504 and are therefore not described again for conciseness.

[0052] The method of Figure 5 additionally includes step 503, in which at least one image of the scene is obtained. Such an image may be captured by an optical camera. Such an image may be a digital image. In some embodiments, such an image may be a part of a digital video, such as a frame in a digital video. Step 503 can be performed before, after or simultaneously with either step 502 or 504.

[0053] Subsequent to step 503, the method 500 proceeds to step 505, which involves determining, from the at least one image taken from the scene, that the at least a part of the scene does not represent a substantially flat surface. In various embodiments, this can be achieved by determining, from the at least one image captured from the scene, the existence of at least one three-dimensional object. If it is determined from the image that a three-dimensional object exists in a part of the scene and/or the three-dimensional object is above a pre-determined size, it can be deduced that the part of the scene in the real world should not be substantially flat. Various techniques for recognizing three-dimensional object(s) from a two-dimensional image exist. Three-dimensional object recognition techniques may use pattern recognition approaches, which use appearance information gathered from pre-captured or pre-computed projections of an object to match the object in the potentially cluttered scene. Three-dimensional object recognition techniques may also use feature-based geometric approaches, which can more effectively detect objects with distinctive features. Three-dimensional objects with pronounced edge features or blob features can be more effectively recognized by detection algorithms, such as the Harris affine region detection and the scale-invariant feature transform (SIFT) method. There are also various open-source software programs for detecting three-dimensional objects, an example of which can be found at: httos://docs.opencv.org/4.4d5/d54/ciroup obidetect.html. Alternatively or additionally, a human observer can also recognize the presence of any three-dimensional object(s) in a two-dimensional image, in which case step 505 involves making a determination based on the human observer's input.

[0054] Step 505 can be performed before, after or simultaneously with either step 502 or 504, as long as step 505 is performed subsequent to step 503 and prior to step 506.

[0055] Once steps 504 and 505 are completed, the method 500 proceeds to step 506, in which detection of a spoofing attack is performed based on the outcome of step 504 and the outcome of step 505. In some embodiments, the presence of a spoofing attack, for example a Pictureof-Picture attack, can be determined if it is determined in step 504, from the depth information of the scene, that the at least a part of the scene comprises a substantially flat surface and if it is determined in step 505, from the at least one image taken from the scene, that the at least a part of the scene does not represent a substantially flat surface. This is because if the image taken from the scene shows that a part of the scene should be a 3-dimensional object or environment (such as a car or a landscape), whereas the depth information of the scene indicates that the part of the scene is substantially flat, then it is quite likely that there is a Picture-of-Picture attack, in which an adversary displays the 3-dimensional object or environment on a substantially flat display screen.

[0056] In some embodiments, instead of automatically determining a spoofing attack, performing the authorization in step 506 may comprise issuing an alert to an administrator of the system, so that the administrator can make a determination of whether there is a spoofing attack based on the depth information, an image of the scene and/or other information associated with the scene or a user using the user device 202. In some embodiments, the detection of a spoofing attack is performed in step 506 based on the determining in step 504, the determining in step 505 and at least one additional factor.

[0057] The term 'computer' or 'computing device' is used herein to refer to any device with processing capability such that it can execute instructions. Those skilled in the art will realize that such processing capabilities are incorporated into many different devices and therefore the term 'computer' includes PCs, servers, mobile telephones, personal digital assistants and many other devices.

[0058] Those skilled in the art will realize that storage devices utilized to store program instructions can be distributed across a network. For example, a remote computer may store an example of the process described as software. A local or terminal computer may access the remote computer and download a part or all of the software to run the program. Alternatively, the local computer may download pieces of the software as needed, or execute some software instructions at the local terminal and some at the remote computer (or computer network).

Those skilled in the art will also realize that by utilizing conventional techniques known to those skilled in the art that all, or a portion of the software instructions may be carried out by a dedicated circuit, such as a DSP, programmable logic array, or the like.

[0059] Any range or device value given herein may be extended or altered without losing the effect sought, as will be apparent to the skilled person.

[0060] It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages.

[0061] Any reference to 'an' item refers to one or more of those items. The term 'comprising' is used herein to mean including the method blocks or elements identified, but that such blocks or elements do not comprise an exclusive list and a method or apparatus may contain additional blocks or elements.

[0062] The steps of the methods described herein may be carried out in any suitable order, or simultaneously where appropriate. Additionally, individual blocks may be deleted from any of the methods without departing from the spirit and scope of the subject matter described herein.

Aspects of any of the examples described above may be combined with aspects of any of the other examples described to form further examples without losing the effect sought.

[0063] It will be understood that the above description of a preferred embodiment is given by way of example only and that various modifications may be made by those skilled in the art.

Although various embodiments have been described above with a certain degree of particularity, or with reference to one or more individual embodiments, those skilled in the art could make numerous alterations to the disclosed embodiments without departing from the spirit or scope of this invention.

Claims

Claims 1. 2. 3.
A computer-implemented method, comprising: determining, from depth information of a scene, whether at least a part of the scene comprises a substantially flat surface; and detecting a spoofing attack based on said determining, from the depth information of the scene, whether at least a part of the scene comprises a substantially flat surface.
The method of claim 1, wherein the depth information is obtained from at least one depth sensor, which comprises at least one of a time-of-flight sensor, a LiDAR sensor, a radar sensor, a sonar sensor, and a multi-perspective camera setup depth sensor.
The method of claim 1 or 2, wherein the depth information comprises a plurality of depth values, each representing a distance between a respective point in the scene and a depth sensor for sensing the depth value The method of claim 3, wherein said determining, from depth information of a scene, whether at least a part of the scene comprises a substantially flat surface comprises: determining whether more than a threshold percentage of all depth values within a depth map of the scene vary within a pre-defined range of depth values.
The method of claim 3 or 4, wherein said determining, from depth information of a scene, whether at least a part of the scene comprises a substantially flat surface comprises: determining whether a standard deviation of depth values for the at least a part of the scene is within a pre-determined threshold standard deviation.
The method of any of claims 1-5, wherein said detecting a spoofing attack comprises: determining the presence of a spoofing attack if it is determined that the at least a part of the scene comprises a substantially flat surface.
The method of any of claims 1-6, further comprising: obtaining the at least one image captured from the scene, and determining, from the at least one image taken from the scene, that the at least a part of the scene is not a substantially flat surface.
8. The method of claim 7, further comprising determining, from the at least one image taken from the scene, that the at least a part of the scene does not represent a substantially flat surface comprises determining, from the at least one image taken from the scene, whether the at least a part of the scene comprises a substantially three-dimensional object.
9. The method of claim 7 or 8, wherein the at least one image comprises a video.
10 The method of any of claims 7-9, wherein said detecting a spoofing attack comprises determining the presence of a spoofing attack, if it is determined, from the depth information of the scene, that the at least a part of the scene comprises a substantially flat surface and if it is determined, from the at least one image taken from the scene, that the at least a part of the scene is not a substantially flat surface.
11. The method of any of claims 1-10, wherein the substantially flat surface comprises a surface with a curvature.
12. The method of any of claims 1-11, wherein said detecting a spoofing attack comprises detecting a Picture-of-Picture, PoP, attack.
13 An apparatus, comprising at least one processor and at least one memory, the memory storing computer-implementable instructions, when executed by the at least one processor, causing the at least one processor to perform the method of any of claims 1-12.
14. The apparatus of claim 13, wherein the apparatus is a mobile phone.
15. The apparatus of claim 11 or 12, comprising at least one depth sensor, which comprises at least one of a time-of-flight depth sensor, a LiDAR sensor, a radar sensor, a sonar sensor and a multi-perspective camera setup depth sensor.
16. A computer-readable media storing computer-executable instructions configured to perform the method of any of claims 1-12.
17. A method substantially as described with reference to figures 1-5 of the drawings. 35
18. A system substantially as described with reference to figures 1-5 of the drawings.