PERCEPTUALLY OPTIMIZED IMMERSIVE VIDEO ENCODING
RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Patent Application No. 63/535,004, filed August 28, 2023, the disclosure of which is hereby incorporated herein by reference in its entirety.
TECHNICAL FIELD
[0002] Embodiments of the invention relate to the field of video encoding; and more specifically, to perceptually-optimized immersive video encoding.
BACKGROUND
[0003] Scanpath prediction involves predicting where a viewer will focus their gaze while viewing a piece of media. In the case of video media, a scanpath is a time-indexed series of points. Some existing scanpath prediction techniques use neural network-based methods for gaze prediction in panoramic videos, where existing gaze data is used to train a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM)-based model for this task. Zero-shot scanpath prediction departs from conventional scanpath prediction techniques in assuming that no such training data is available. One such zero-shot scanpath prediction technique assumes that the user's gaze is attracted to specific regions according to a model derived from gravitational mechanics. Another zero-shot approach, in the 2D image case, uses a multimodal model such as contrastive language-image pretraining (CLIP) to provide more informed gaze predictions when text captions are available.
[0004] Saliency prediction involves identifying which areas of a scene, either objects or regions, appear to stand out from their neighboring regions. One such saliency prediction technique integrates information about brightness, contrast, and patterns at different scales to produce a final saliency map. Recently, neural network-based approaches have been outperforming classical models on this task, where training data collected from human subjective experiments and specific training objectives are employed. One neural network-based technique uses deformable convolutions and patch embeddings from vision transformers with consistency losses to find interesting regions. However, saliency maps tend to account for a limited portion of the variance in actual scanpaths.
[0005] Mixed-scale encoding refers to encoding techniques that allow for the encoding of tiles in a frame at differing pixel densities and resolutions according to a set of one or more rules. Mixed-scale encoding reduces the bandwidth requirement of video transmission by taking advantage of foveation and reducing the resolution of tiles unlikely to be within the foveation area of a user's gaze.
[0006] There currently exist certain challenges with each of scanpath prediction, saliency prediction, and mixed-scale encoding.
Challenges of Scanpath Prediction Techniques
[0007] Scanpath prediction techniques can include non-deterministic/neural network-based techniques or deterministic/parametric techniques. The non-deterministic/neural network-based techniques require large amounts of training data, which are expensive to gather at scale and constitute a significant risk to user privacy since gaze data can be linked to media to infer what individuals were focusing on at any given time. These techniques also assume that virtual reality (VR) client devices have gaze tracking capabilities, which is not currently the state of the market for consumer-grade VR client devices and is unlikely to be so for the foreseeable future given their high cost. In the case of the deterministic/parametric scanpath prediction techniques, the focus of the existing techniques is on high-precision estimation of a user's gaze using one or a few features rather than robust estimation of the user's gaze for the purpose of media encoding. These methods require significant adaptation for usability in encoding pipelines.
Challenges of Saliency Prediction Techniques
[0008] Saliency indicators are useful as baseline estimators of gaze for further refinement, but they are not aligned with the goals of gaze estimation for encoding pipelines. Saliency prediction techniques produce broad saliency maps that require significant refinement for the estimation of gaze, and the process of transforming them into gaze predictors is non-trivial. Further, these techniques tend to work best on still images and do not produce predictions with enough temporal precision for encoding pipeline tasks, since they handle temporal dependence broadly without relation to gaze prediction aside from general interestingness.
Challenges with Mixed-Scale Encoding
[0009] Mixed-scale encoding provides a method for achieving higher effective content resolutions at a lower bandwidth cost, while also allowing for high-resolution VR experiences at lower bitrates. However, the existing mixed-scale encoding systems rely on a large set of potential content encodings across candidate resolutions and the constant transmission of gaze or headset inertial measurement unit (IMU) data from a client to determine which resolutions to splice together in the encoding pipeline.
SUMMARY
[0010] In one aspect, a method for a remote electronic device encoding a media stream is described. The method includes selecting a set of frames from the media stream. The method may further include determining a category for the media stream based on the selected set of frames. Determining the category may include, based on the selected set of frames, determining a clutter parameter for the media stream, where the clutter parameter is indicative of a degree of clutter of regions of interest for the media stream. Determining the category may include, based on the selected set of frames, determining a camera motion parameter for the media stream, where the camera motion parameter is indicative of a degree of motion of the camera for the media stream. The method further includes determining a set of one or more feature weights. The feature weights may be determined based on the determined category. The method further includes determining one or more feature maps for one of the frames of the media stream. Determining one or more feature maps may include determining at least one of a saliency map for the frame; an object detection map for the frame; an optical flow map for the frame; a center and/or horizon bias map for the frame; and a contrast and/or brightness map for the frame. The method further includes determining a gaze direction density map based on the feature weights and the feature maps. Determining the gaze direction density map may include summing scores across the one or more feature maps after multiplying by the associated feature weights. The gaze direction density map may be a probability density of gaze direction over the frame. The method further includes determining a tile-level foveation map based on the gaze direction density map. The method further includes encoding the frame using the tile-level foveation map into an encoded frame that includes encoded tiles of different resolutions. The different resolutions may include lower resolutions for tiles that have a low score and higher resolutions for tiles that have a relatively higher score. The method further includes transmitting the encoded frame to a user device. The method may further include transmitting the tile-level foveation map for the media stream to the user device.
[0011] In further aspects, one or more embodiments of a non-transitory computer-readable medium or distributed media containing computer-executable program instructions or code portions stored thereon are disclosed for performing one or more embodiments of the methods of the present invention when executed by a processor entity of an apparatus, an electronic device, or other computing device.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] The invention may best be understood by referring to the following description and accompanying drawings that are used to illustrate embodiments of the invention. In the drawings:
[0013] Figure 1 illustrates a block diagram of an exemplary system for perceptually optimized immersive video encoding, in accordance with some embodiments.
[0014] Figure 2 illustrates a flow diagram of exemplary operations that can be performed to obtain a gaze prediction for a media stream.
[0015] Figure 3A illustrates a flow diagram of exemplary operations performed for determining a set of feature weights for a media stream, in accordance with some embodiments.
[0016] Figure 3B illustrates a flow diagram of exemplary operations performed for determining a gaze path for the media stream, in accordance with some embodiments.
[0017] Figure 4A illustrates exemplary frames of the media stream and the Q* predicted gaze maps generated for those frames, in accordance with some embodiments.
[0018] Figure 4B illustrates different exemplary frames of the media stream and the Q* predicted gaze maps generated for those frames, in accordance with some embodiments.
[0019] Figure 5 shows an example of a communication system in accordance with some embodiments.
[0020] Figure 6 shows a UE in accordance with some embodiments.
[0021] Figure 7 shows a network node in accordance with some embodiments.
[0022] Figure 8 is a block diagram of a host, which may be an embodiment of the host of Figure 5, in accordance with various aspects described herein.
[0023] Figure 9 is a block diagram illustrating a virtualization environment in which functions implemented by some embodiments may be virtualized.
[0024] Figure 10 shows a communication diagram of a host communicating via a network node with a UE over a partially wireless connection in accordance with some embodiments.
DETAILED DESCRIPTION
[0025] Certain aspects of the disclosure and their embodiments may provide solutions to the challenges of gaze estimation techniques or other challenges. The embodiments herein describe a perceptually optimized zero-shot gaze estimation method for a media stream. In some embodiments, the media stream is a 360-degree video. According to these embodiments, a media stream is taken as input and a series of feature extraction, refinement, and estimation techniques are used to generate a set of user gaze likelihood estimates across tiles or subpictures for each frame or set of frames that can be passed onto the encoder to refine the encoding of media. In some embodiments, the feature extraction, refinement, and estimation modules used across different types of media streams are optimized via a pre-processing sequence.
[0026] In additional embodiments, a “few shot” gaze prediction optimization method is presented. This method uses a lightweight few-layer neural network located on the client device. This optimization takes zero-shot gaze prediction and the on-client gaze or headset viewport paths for each frame as inputs and sends back a revised series of weights for a user or small set of users across feature modules to the server (or an intermediate server for weight aggregation) to improve the estimation for any individual or group of users.
[0027] The embodiments described herein present a method using an adaptive consensus algorithm for gaze estimation that calibrates the feature extraction, estimator selection, and aggregation protocol for gaze estimation based on perceptual features of the media (preprocessing module discussed below).
[0028] The embodiments herein describe a media pre-processing module that calibrates the weights to apply when aggregating gaze-prediction estimators using text- and optical flow-based cues extracted from a sample of frames; an adaptive, highly modular gaze estimation mechanism that takes as input several feature-based estimators of saliency, movement, object classification, and subjective interestingness, weights them via the pre-processing module, and adapts them iteratively to the task of gaze estimation (e.g., through thresholding, dynamic weighting, and foveation); and a protocol that transforms the gaze estimation score into a weighted quality-gaze indicator (Q*) to feed to a media encoder for downstream media transport and processing.
[0029] Some embodiments present a client-side federated learning-based solution for providing privacy-preserving refinements to the feature weighting introduced in the gaze estimation protocol based on individual or aggregated user data.
[0030] Certain embodiments may provide one or more of the following technical advantage(s). For example, the embodiments disclosed herein provide an adaptive protocol transforming the information they glean from scanpath prediction methods into useful predictive gaze information to feed to the encoder. Gaze prediction and refinement methods described herein reduce the set of encodings needed for the process and allow optimizations to be made on which encodings can be stitched in the encoding pipeline several (to several hundred) frames ahead in the streaming process. This method enables mixed-scale encoding methods using gaze prediction and client-based refinement of gaze estimation instead of relying on client gaze or IMU data, which constitutes a significant privacy improvement on existing systems.
[0031] The embodiments herein enhance the efficiency of video encoding systems. The embodiments allow for the prediction of gaze without the use of eye-tracking or headset-tracking data, which constitutes a significant improvement over neural network-based methods that require significant training data to operate. Even with such data available, this method provides a preliminary gaze-tracking functionality for media assets for which no gaze data is yet available, which constitutes a significant advantage since it allows for the refinement of the encoding process before the threshold of gaze data needed to train the models is available for new media assets.
[0032] The embodiments herein enhance the performance of video encoding systems. The embodiments presented herein outperform off-the-shelf saliency and gaze predictors, including neural network-based methods, indicating substantial potential performance advantages. In addition, the modularity of the embodiments provides for the potential of stronger out-of-sample performance via the pre-processing module introduced below.
[0033] The embodiments herein enhance privacy preservation in video encoding systems. The embodiments allow for a more privacy-preserving implementation of gaze prediction by removing the need to collect and offload highly identifiable gaze or headset viewport information to a streaming server. Even in the few-shot implementation below, gaze information is used only on the client-side, thus preserving the privacy of the viewer even when gaze information is used to refine the estimator.
[0034] Some of the embodiments contemplated herein will now be described more fully with reference to the accompanying drawings. Embodiments are provided by way of example to convey the scope of the subject matter to those skilled in the art.
[0035] Figure 1 illustrates a block diagram of an exemplary system for perceptually optimized immersive video encoding, in accordance with some embodiments. System 100 includes one or more user devices 104A-N, remote electronic device 102, and network 105. System 100 is operative to record, generate, encode, decode, and/or display media streams for 2D and/or 3D (or 360°) video viewing on a user device. By way of illustration, system 100 includes user devices 104A-N. A user device, e.g., user device 104A, can be or include a computer 103A such as a laptop or a desktop and/or a tablet or smartphone 103B associated with head-mounted displays (HMDs) or headsets 103C. In some embodiments, user device 104A can be a standalone HMD that is operative to connect to a remote electronic device 102 through network 105 without an intermediary electronic device. One or more of user devices 104A-N are operative to receive encoded media streams, decode, and display the media streams. User devices 104A-N are operative to decode and render several types of 360° video content that may be encoded and bandwidth-optimized according to the embodiments described in additional detail below. In some embodiments, one or more of the user devices 104A-N is operative to run a lightweight learning model (for few-shot refinement). For example, the user device includes processing capabilities that allow it to run a lightweight learning model estimating weights across features for a given user experience.
[0036] System 100 further includes remote electronic device 102. Remote electronic device 102 is an electronic device that is remote from a user device, e.g., from user devices 104A-N (e.g., connected to the user device through a wide area network (WAN)). Alternatively, or additionally, remote electronic device 102 connects to the user device through a local area network. Remote electronic device 102 includes optional decoder 131, media streams 112, gaze map determiner 117, encoder 114, and transmitter 121. In some embodiments, the remote electronic device 102 includes a graphics processing unit (GPU). The GPU is operative to perform tensor-level operations and other graphics processing operations to accelerate one or more of the operations described below with respect to the gaze map determiner 117 and/or the encoder 114.
[0037] Decoder 131 is operative to decode and process video inputs. The video decoder may include a pipeline that takes an encoded video stream as input and decodes it into a manipulable/transmittable form. The output of the decoder can include media streams 112. In some embodiments, decoder 131 may not be included as the media stream can be received in a decoded form.
[0038] Remote electronic device 102 further includes encoder 114. Encoder 114 is operative to encode or compress the media stream using a codec according to one or more video encoding formats, e.g., H.264 or Advanced Video Coding (MPEG-4 AVC), High Efficiency Video Coding (HEVC) or H.265 (MPEG-H Part 2), H.262 (MPEG-2), MPEG-4 Part 2, Alliance for Open Media (AOMedia) Video 1 (AV1), H.266 or Versatile Video Coding (VVC), Future Video Coding (FVC), etc. In some embodiments, encoder 114 is operative to perform tile encoding. In some embodiments, encoder 114 is operative to generate encoded media streams of multiple bitrate representations of an input video stream corresponding to a 360° immersive video asset or program. Each bitrate representation has a certain video quality level and may be encoded to contain frames with appropriately modified tile, frame, and/or slice data that optimizes bandwidth, video quality, and/or latency of the media stream's distribution.
[0039] System 100 is operative to enable predictive mixed-scale encoding of media streams. Gaze map determiner 117 is operative to determine a Q* predicted gaze map. In some embodiments, the Q* predicted gaze map is generated as described in further detail with reference to Figures 2-4.
[0040] In some embodiments, an encoded scene is generated from the media stream and based on the Q* predicted gaze map. The encoded scene includes first encoded tiles determined based on the set of foveation weights of the Q* predicted gaze map. The encoded scene is transmitted to be displayed on a user device. In some embodiments, the set of foveation weights is a first set of foveation weights and the foveation weight map includes a second set of foveation weights. In these embodiments, the encoded scene further includes second encoded tiles determined based on the second set of foveation weights. The second encoded tiles are of lower resolution than the first encoded tiles.
[0041] The operations in the flow diagrams will be described with reference to the exemplary embodiments of the other figures. However, the operations of the flow diagrams can be performed by embodiments of the invention other than those discussed with reference to the other figures, and the embodiments of the invention discussed with reference to these other figures can perform operations different than those discussed with reference to the flow diagrams.
[0042] Figure 2 illustrates a flow diagram of exemplary operations that can be performed to obtain a gaze prediction for a media stream. A media stream 112A is input to the gaze map determiner 117. In some embodiments, the media stream 112A results from the decoding of an encoded video stream. In an exemplary embodiment, the media stream can include a sequence of RGB frames.
[0043] In some embodiments, the media stream is fed to a pre-processing module 220. Pre-processing module 220 is operative to categorize the media stream into one of several categories. Pre-processing module 220 further determines, based on the assigned category, a set of one or more weights. Each weight is associated with a feature type. The weights are to be used for combining the feature maps of the media stream. In some embodiments, pre-processing module 220 may include sampler 221. Sampler 221 is operative to select one or more frames from the media stream. In some embodiments, sampler 221 selects a sample of N frames (e.g., 20 or 30 frames) from the plurality of frames that form the media stream. The set of frames is used by the category determiner 222 to determine a category for the media stream. The category determiner 222 may determine the category to assign to or associate with the media stream based on one or more parameters. In some embodiments, determining a category includes determining a pair of parameters including a first and second parameter. The first parameter, which is also referred to as the clutter parameter, is indicative of a degree of clutter or dispersion of regions of interest in a frame. In a non-limiting example, a region of interest can include a person, an object, a type of vegetation or landscape, or any other type of object that can be of interest to the viewer in a scene/frame. The second parameter, which is also referred to as the camera motion parameter, is indicative of the degree of camera motion for the media stream. In some embodiments, the category for the media stream includes the pair (camera motion parameter, clutter parameter). In one embodiment, the camera motion parameter is a camera motion class from a low camera motion class, a medium camera motion class, or a high camera motion class; and the clutter parameter is a clutter class from a low clutter class, a medium clutter class, or a high clutter class. While three classes are described for each of the parameters, any number of classes can be defined for each one of the parameters without departing from the scope of the embodiments herein. While some embodiments are described where a single category (e.g., a pair of camera motion parameter and clutter parameter) is determined for the entire media stream, in other embodiments, multiple categories can be determined for the media stream, where each category is associated with a portion of the media stream (e.g., a scene, a frame, a set of frames, etc.).
[0044] In one embodiment, category determiner 222 determines the clutter parameter by classifying, based on the sampled frames, the media stream into one of multiple clutter classes (e.g., low clutter class, medium clutter class, or high clutter class), where each class indicates a different degree of clutter of the regions of interest in the media stream. In some embodiments, classifying the media stream is performed according to an object detection model (e.g., a YOLO model trained on the COCO dataset) to count the average number of interesting objects (e.g., humans) across the sampled frames, and classify based on set thresholds on the count. In some embodiments, classifying the media stream is performed according to a saliency prediction model (e.g., Itti et al. or PAVER) to obtain a map of regions of interest averaged over the sampled frames, and classify based on the overall magnitude and dispersion of regions of interest (mean and standard deviation).
[0045] In one embodiment, category determiner 222 determines the camera motion parameter by classifying, based on the sampled frames, the media stream into one of multiple camera motion classes (e.g., low camera motion class, medium camera motion class, or high camera motion class), where each class indicates a different degree of camera motion for the media stream. In some embodiments, the degree of camera motion can be determined according to an optical flow model (e.g., RAFT) to obtain the average flow magnitude across the sampled frames and classify based on set thresholds on the optical flow. In some embodiments, the degree of camera motion can be determined according to a visual Simultaneous Localization and Mapping (SLAM) or Structure from Motion (SfM) model (e.g., SfM-Learner) to estimate camera pose from the sampled frames and use the rate of change of camera pose for classification.
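The sketch below gives a minimal illustration of the categorization described in the two preceding paragraphs. It assumes that per-frame statistics (a count of detected objects of interest and a mean optical flow magnitude) have already been produced by whichever object detection and optical flow models are used; the threshold values and helper names are hypothetical placeholders rather than prescribed values.

```python
import numpy as np

# Hypothetical thresholds separating low/medium/high classes; in practice these
# would be tuned for the chosen object detection and optical flow models.
CLUTTER_THRESHOLDS = (2.0, 6.0)   # mean objects of interest per sampled frame
MOTION_THRESHOLDS = (0.5, 2.0)    # mean optical flow magnitude per sampled frame

def classify(value, thresholds, labels=("low", "medium", "high")):
    """Map a scalar statistic onto a low/medium/high class via set thresholds."""
    low, high = thresholds
    if value < low:
        return labels[0]
    return labels[1] if value < high else labels[2]

def categorize(object_counts, flow_magnitudes):
    """Return the (camera motion, clutter) category for a set of sampled frames."""
    clutter = classify(float(np.mean(object_counts)), CLUTTER_THRESHOLDS)
    motion = classify(float(np.mean(flow_magnitudes)), MOTION_THRESHOLDS)
    return motion, clutter

# Example: 20 sampled frames with moderate clutter and low camera motion.
category = categorize(object_counts=np.random.poisson(3, 20),
                      flow_magnitudes=np.random.uniform(0.1, 0.4, 20))
```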
[0046] The pre-processing module 220 may include the feature weights determiner 223. The feature weights determiner 223 determines, based on the category for the media stream, a set of one or more feature weights for the video stream. Each feature weight is associated with a feature parameter and is indicative of how much to emphasize the effect of that parameter in the determination of a gaze direction. In some embodiments, a feature weight lookup table can be used to determine, based on the category, a corresponding weight for each feature from a set of features. The weight selection may be based on functional relationships between certain types of media and the features that are likely to increase or decrease the relevance of those features for performance on that type of media.
[0047] In some embodiments, the feature weights can be determined according to one or more of the following relationships. When the camera's motion increases, optical flow is composed mostly of relative motion between the camera and its surroundings, and not absolute motion of objects. In this case, optical flow may be downweighted, and object detection is instead upweighted to find areas of interest. Further, in the same scenario, a user may have a sense of progressing towards a destination, which causes them to focus on the center of the frame. Thus, the center bias may be weighted highly in cases with high camera motion. When the clutter in the scene increases, the number of interesting points for the viewer to watch increases, which leads to more exploration. Hence, the weights for horizon bias and object detection maps increase with increasing clutter. When both camera motion and clutter are low, the weight determination may fall back to increasing the weights of the saliency models and the simple biases. When the optical flow-based method is used to judge camera motion, intermediate levels of optical flow suggest one to a few moving objects, which humans tend to track with their gaze. In such cases, optical flow is upweighted.
[0048] In some embodiments, the set of feature weights can be determined based on Table 1 below. Selecting the feature weights based on the category (clutter parameter, camera motion parameter) provides a significant performance improvement over merely adding the features together. This is because each parameter relates to a specific feature of human vision and therefore translates into different weights in the determination of gaze based on underlying features of the image stream.
Table 1
[0049] Table 1: Values of weights used in each category of content. The tuples in each cell refer to weights for saliency maps (both structural and neural), object detection, optical flow, center bias and/or horizon bias, and contrast/brightness, respectively. The categories on the horizontal axis represent the camera motion parameter, while those on the vertical axis represent the clutter parameter. In some embodiments, the weights are user-modified, optimized based on an objective function, or obtained over a series of runs for each feature (an "active learning" approach).
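One possible in-memory representation of such a lookup table is sketched below. The numeric weight tuples are illustrative placeholders only and do not reproduce the values of Table 1; the tuple ordering follows the caption above (saliency, object detection, optical flow, center/horizon bias, contrast/brightness).

```python
# Illustrative feature-weight lookup keyed by (camera motion, clutter) category.
# The weight values are placeholders, not the values of Table 1; in practice they
# would come from the table, from user modification, or from optimization.
FEATURE_NAMES = ("saliency", "object_detection", "optical_flow",
                 "center_horizon_bias", "contrast_brightness")

FEATURE_WEIGHTS = {
    ("low", "low"):       (0.5, 0.10, 0.10, 0.20, 0.10),  # fall back to saliency and simple biases
    ("low", "high"):      (0.2, 0.40, 0.10, 0.20, 0.10),  # high clutter: emphasize objects and horizon bias
    ("medium", "medium"): (0.2, 0.30, 0.30, 0.10, 0.10),  # few moving objects: track via optical flow
    ("high", "low"):      (0.3, 0.30, 0.05, 0.30, 0.05),  # camera motion: downweight flow, upweight center bias
    # ... remaining (camera motion, clutter) categories filled in analogously
}

def lookup_feature_weights(category):
    """Return a dict mapping each feature name to its weight for the given category."""
    return dict(zip(FEATURE_NAMES, FEATURE_WEIGHTS[category]))

weights = lookup_feature_weights(("medium", "medium"))
```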
[0050] Gaze map determiner 117 is operative to determine, based on the feature weights, a gaze path for the media stream. Gaze map determiner 117 initializes the gaze direction for a frame. In some embodiments, the gaze direction is initialized at the center of the frame. Gaze map determiner 117 further includes a feature maps determiner 230. Feature maps determiner 230 is operative to determine one or more feature maps for each frame of the media stream. In some embodiments, determining one or more feature maps for the frame includes determining one or more of a saliency map 231 for the frame, an object detection map 232 for the frame, an optical flow map 233 for the frame, a center bias and/or horizon bias map 234 for the frame, and/or a contrast/brightness map 235 for the frame.
[0051] In some embodiments, determining a saliency map 231 is based on structural information, e.g., using a model that uses intensity, color, and orientation at multiple scales to define points of interest. In some embodiments, determining a saliency map 231 for the frame is performed according to a neural network, using the output of a neural network explicitly trained to predict saliency.
[0052] In some embodiments, determining an object detection map 232 for the frame includes using an object detection model to obtain bounding boxes or instance segmentations of predefined objects of interest (e.g., humans). Using the bounding boxes, a binary mask is constructed where each pixel is labelled positive or negative based on whether it is part of an object of interest or not.
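As a minimal sketch of this step, the function below rasterizes detector bounding boxes, assumed to be given as (x0, y0, x1, y1) pixel coordinates for the predefined classes of interest, into the binary mask described above.

```python
import numpy as np

def object_detection_map(frame_height, frame_width, boxes):
    """Build a binary object-of-interest mask from detector bounding boxes.

    boxes: iterable of (x0, y0, x1, y1) pixel coordinates, assumed to come from
    an object detection model restricted to predefined classes of interest.
    """
    mask = np.zeros((frame_height, frame_width), dtype=np.float32)
    for x0, y0, x1, y1 in boxes:
        # Pixels inside any box are labelled positive; all others remain negative.
        mask[int(y0):int(y1), int(x0):int(x1)] = 1.0
    return mask

# Example: two detected persons in a 1080x1920 frame.
mask = object_detection_map(1080, 1920, [(100, 300, 220, 600), (900, 350, 1010, 640)])
```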
[0053] In some embodiments, determining an optical flow map 233 includes using a model for optical flow, deriving a map representing the motion (e.g., magnitude of motion) of each pixel in a frame.
[0054] In some embodiments, determining center bias includes adding a positive bias in the center of the frame (e.g., the 2x2 center in a 4x8 tiled frame). This bias may be uniform (constant) or weighted as per saliency in each quadrant of the frame. In some embodiments, determining horizon bias includes adding a positive bias in the center row of the frame (e.g., a 2x8 horizontal band in a 4x8 tiled frame). Each of these feature map determiners produces a feature map for each frame of the media stream that is of the same width and height as the frame, with rescaling applied if necessary.
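A minimal sketch of the uniform (constant) variant of these two bias maps is shown below, assuming the 4x8 tile grid of the example above; the bias magnitudes are illustrative placeholders.

```python
import numpy as np

def center_and_horizon_bias(frame_height, frame_width, rows=4, cols=8,
                            center_bias=1.0, horizon_bias=0.5):
    """Uniform center and horizon bias maps on a rows x cols tile grid.

    Following the example above, the center bias covers the 2x2 central tiles of
    a 4x8 tiled frame, and the horizon bias covers the central 2x8 horizontal
    band. Both maps have the same width and height as the frame.
    """
    tile_h, tile_w = frame_height // rows, frame_width // cols
    r0, c0 = rows // 2 - 1, cols // 2 - 1   # top-left tile of the 2x2 center block

    center = np.zeros((frame_height, frame_width), dtype=np.float32)
    center[r0 * tile_h:(r0 + 2) * tile_h, c0 * tile_w:(c0 + 2) * tile_w] = center_bias

    horizon = np.zeros((frame_height, frame_width), dtype=np.float32)
    horizon[r0 * tile_h:(r0 + 2) * tile_h, :] = horizon_bias  # central two-row band

    return center, horizon

center_map, horizon_map = center_and_horizon_bias(1080, 1920)
```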
[0055] Combiner 240 sums scores across the feature maps after multiplying by the associated feature weights to obtain a gaze direction density map 241, which can be considered a probability density of gaze direction over the frame. Raw gaze direction predictor 250 determines from the gaze direction density map 241 one or more predicted gaze directions for the frame. A predicted gaze direction is the direction in which most viewers are expected to gaze when viewing the frame of the media stream. Based on the one or more predicted gaze directions, a foveation area is created. The foveation area can be centered at the predicted gaze direction (when the predicted gaze direction is a single entity). Alternatively, the foveation area can be determined from several regions, each region being centered at one of the predicted gaze directions for the frame. A foveation area includes a set of foveation weights. A weight from the set of foveation weights is indicative of a resolution at which to encode a tile from one or more tiles forming a scene or a frame. Each weight of the foveation area is a score derived from the scores determined across the feature maps after being multiplied by the associated feature weights. Raw gaze direction predictor 250 selects the maximum of the foveation area as the updated predicted gaze direction.
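The sketch below illustrates these two steps: a weighted sum of per-pixel feature maps normalized into a gaze direction density, followed by a foveation window around the previous gaze direction whose maximum becomes the updated prediction. The Gaussian window shape and its spread are assumptions made for illustration, and the feature maps are assumed to be pre-scaled to a common range.

```python
import numpy as np

def gaze_density(feature_maps, weights):
    """Weighted sum of per-pixel feature maps, normalized to a probability density."""
    combined = sum(weights[name] * feature_maps[name] for name in feature_maps)
    total = combined.sum()
    return combined / total if total > 0 else np.full_like(combined, 1.0 / combined.size)

def foveated_prediction(density, prev_gaze, sigma=150.0):
    """Foveate the density around the previous gaze and pick the new gaze direction.

    The Gaussian window and its spread (sigma, in pixels) are illustrative
    assumptions; any window centered on the predicted gaze direction(s) could be used.
    """
    h, w = density.shape
    ys, xs = np.mgrid[0:h, 0:w]
    window = np.exp(-((xs - prev_gaze[1]) ** 2 + (ys - prev_gaze[0]) ** 2) / (2 * sigma ** 2))
    foveated = density * window                                        # raw foveation map
    new_gaze = np.unravel_index(np.argmax(foveated), foveated.shape)   # max as updated gaze
    return foveated, new_gaze

# Example with placeholder feature maps on a small equirectangular frame.
maps = {"saliency": np.random.rand(180, 360), "optical_flow": np.random.rand(180, 360)}
w = {"saliency": 0.7, "optical_flow": 0.3}
foveated_map, gaze = foveated_prediction(gaze_density(maps, w), prev_gaze=(90, 180))
```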
[0056] The operations described above are repeated for each frame of the media stream to obtain raw foveation maps for the media stream, each raw foveation map associated with a frame from the media stream.
[0057] Foveated Q* gaze predictor 260 aggregates, based on a type of tiling scheme (uniform or adaptive), the scores of the raw foveation map into a tile-level foveation map, where a tile includes a plurality of pixels. The foveation map includes a set of foveation weights. A foveation weight is associated with a tile of the frame. A foveation weight from the set of foveation weights is indicative of a resolution at which to encode a tile from one or more tiles forming the frame. In some embodiments, the foveation weight is the average of the pixels it covers from the raw foveation map. This forms the final Q* predicted gaze map 122.
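For the uniform tiling case, this aggregation reduces to block-averaging the raw foveation map over the tile grid, as in the following sketch (an adaptive scheme would average over non-uniform tile boundaries instead).

```python
import numpy as np

def tile_level_foveation_map(raw_map, rows=4, cols=8):
    """Aggregate a per-pixel raw foveation map into a uniform rows x cols tile map.

    Each tile's foveation weight is the average of the raw-map pixels that the
    tile covers, forming the Q* predicted gaze map for the frame.
    """
    h, w = raw_map.shape
    tile_h, tile_w = h // rows, w // cols
    trimmed = raw_map[:rows * tile_h, :cols * tile_w]      # drop any remainder pixels
    tiles = trimmed.reshape(rows, tile_h, cols, tile_w)
    return tiles.mean(axis=(1, 3))                         # one Q* weight per tile

q_star = tile_level_foveation_map(np.random.rand(180, 360))   # 4x8 tile-level map
```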
[0058] Referring back to Figure 1, the Q* predicted gaze map 122 is then used by the encoder 114 to encode the frame of the media stream into an encoded frame that includes encoded tiles of different resolutions. For example, if a given tile has a low score, a low-quality encoding is generated for that tile, as the chance of the user looking at that tile is low. Alternatively, when a given tile has a high score, a high-quality encoding is generated for that tile, as the chance of the user looking at the tile is high.
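One simple, purely illustrative way to map tile-level Q* weights to encoding tiers is sketched below; the normalization and the tier thresholds are assumptions, and an encoder could equally map the weights to quantization parameters or to pre-encoded tile representations.

```python
import numpy as np

def resolution_tiers(q_star, thresholds=(0.33, 0.66)):
    """Map tile-level Q* weights to illustrative encoding tiers (0=low ... 2=high)."""
    normalized = q_star / q_star.max()          # rescale so the most likely tile is 1.0
    return np.digitize(normalized, thresholds)  # 0 below 0.33, 1 in between, 2 above 0.66

tiers = resolution_tiers(np.array([[0.01, 0.05], [0.07, 0.09]]))
```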
Few-Shot Gaze Optimization (client-side federated learning architecture)
[0059] In some embodiments, when gaze data is available (e.g., a log of user gaze information), a privacy-aware few-shot refinement can be performed. At both the user device 104A and the remote electronic device 102, a lightweight neural network is initialized with identity weights. These networks will henceforth be called the refiner networks, as they refine the zero-shot gaze estimation (Q* predicted gaze map) based on real-world data. Whenever a user device 104A requests a media stream, both the encodings and the Q* predicted gaze maps for the media stream are delivered by the remote electronic device via the network 105.
[0060] On the user device 104A, as the user watches videos, gaze data for that specific user representing their unique viewing patterns is collected. The Q* predicted gaze maps received from remote electronic device 102 are fed into the lightweight refiner network of the user device 104A, whose outputs are refined Q* predicted gaze maps optimized for the specific user device 104A. The architecture of the refiner network is a lightweight image-to-image dense prediction model, such as a lightweight UNet. The refiner network is trained on the collected gaze data using an appropriate objective function, such as minimizing the mean square error. Periodically, remote electronic device 102 requests the weights of the refiner networks from a selection of user devices (e.g., a random selection). Remote electronic device 102 then averages the received weights and sets them as the weights of its own refiner network, which has the same architecture. According to these embodiments, no gaze data is communicated between the user devices and the remote electronic device 102, consequently preserving users' privacy. For further computations of the Q* predicted gaze maps, remote electronic device 102 first computes the raw zero-shot score and then passes it through its refiner network to obtain an optimized score, which it then uses for encoding the media stream before transmitting the encoded media streams to a user device.
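The weight-averaging step on the remote electronic device can be as simple as the sketch below, in which each refiner network's parameters are represented as a dictionary of numpy arrays; the refiner architecture itself (a lightweight UNet-like model) is not reproduced here, and only the weights, never the gaze data, are assumed to leave the user devices.

```python
import numpy as np

def average_refiner_weights(client_state_dicts):
    """Federated averaging of refiner-network weights collected from user devices.

    client_state_dicts: list of dicts mapping parameter names to numpy arrays,
    one dict per selected user device. Only these weights are transmitted;
    the gaze data used to train them stays on each device.
    """
    averaged = {}
    for name in client_state_dicts[0]:
        averaged[name] = np.mean([sd[name] for sd in client_state_dicts], axis=0)
    return averaged

# Example with two clients sharing a single (hypothetical) 3x3 convolution kernel.
clients = [{"conv1.weight": np.random.randn(3, 3)} for _ in range(2)]
server_weights = average_refiner_weights(clients)
```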
[0061] Figure 3A illustrates a flow diagram of exemplary operations performed for determining a set of feature weights for a media stream, in accordance with some embodiments. In some embodiments, the operations can be performed in a remote electronic device.
[0062] At operation 321, the remote electronic device selects one or more frames from the media stream. In some embodiments, prior to selecting the one or more frames, the remote electronic device is operative to receive the media stream. In some embodiments, the remote electronic device receives the media stream in an encoded format and is operative to decode the media stream prior to the selection of the frames. In some embodiments, the selection of the frames can be performed as described with respect to the sampler 221. The flow of operations moves to operation 322.
[0063] At operation 322, which is optional in some embodiments, the remote electronic device determines a category for the media stream. In some embodiments, determining the category for the media stream can include operations 322A and 322B. At operation 322A, the remote electronic device determines, based on the set of one or more frames, a camera motion parameter for the media stream. The camera motion parameter for the media stream is indicative of a degree of motion of the camera for the media stream. At operation 322B, the remote electronic device determines, based on the set of one or more frames, a clutter parameter for the media stream. The clutter parameter is indicative of a degree of clutter of regions of interest for the media stream. In some embodiments, operations 322, 322A, and 322B are performed as described with reference to Figure 2 and the category determiner 222, the camera motion parameter determiner 222A, and the clutter parameter determiner 222B. The flow of operations moves to operation 323.
[0064] At operation 323, the remote electronic device determines, based on the category for the media stream, a set of one or more feature weights. The category includes a pair of parameters including the clutter parameter and the camera motion parameter. Each feature weight is associated with a feature parameter and is indicative of how much to emphasize the effect of that parameter in the determination of a gaze direction. If the category is not used, the determination of the feature weight(s) can be done differently. As an example, the feature weight(s) may be deterministically selected, randomly selected, uniformly selected, or learned based on a model.
[0065] Figure 3B illustrates a flow diagram of exemplary operations performed for determining a gaze path for the media stream, in accordance with some embodiments. While the embodiments herein are described with respect to operations performed for a frame, the operations can be performed for a scene from the media stream, where a scene includes a portion of a frame, a frame, or one or more frames with similar visual content. In some embodiments, the operations are performed in a remote electronic device.
[0066] At operation 325, the remote electronic device initializes the gaze direction for the frame. The flow moves to operation 330. At operation 330, the remote electronic device 102 determines one or more feature maps for each frame of the media stream. In some embodiments, determining one or more feature maps includes operations 331-335. At operation 331, the remote electronic device 102 determines a saliency map for the frame. At operation 332, remote electronic device 102 determines an object detection map for the frame. At operation 333, remote electronic device 102 determines an optical flow map for the frame. At operation 334, remote electronic device 102 determines a center and/or horizon bias map for the frame. At operation 335, remote electronic device 102 determines a contrast and/or brightness map for the frame. The flow moves to operation 340. At operation 340, remote electronic device 102 determines, based on the feature weights and the feature maps, a gaze direction density map. The flow of operations moves to operation 350. At operation 350, the remote electronic device 102 determines, based on the gaze direction density map, a tile-level foveation map (e.g., Q* predicted gaze map) that is to be used for encoding the frame of the media stream into an encoded frame that includes encoded tiles of different resolutions.
[0067] Figure 4A illustrates exemplary frames of the media stream and the Q* predicted gaze maps generated for those frames, in accordance with some embodiments. Figure 4B illustrates different exemplary frames of the media stream and the Q* predicted gaze maps generated for those frames, in accordance with some embodiments. The leftmost column in each Figure represents the raw frame, the middle column in each Figure is the foveated gaze density prediction by the model, and the rightmost column in each Figure is the ground-truth gaze information used for testing. The ground-truth gaze information represents where real viewers have looked when shown the same frames, based on gaze tracking of the real viewers. A video of a drone flying over a lake, for example, is categorized by the pre-processing module as having high camera motion and low clutter. As a result, a high weight is assigned to central bias and saliency, and low weights to the other features. A video of a person walking their dog through a park, such as shown in Figures 4A and 4B, may be categorized by the pre-processing module as having medium camera motion and medium clutter. Such a frame includes a few moving objects, and human gaze is expected to track them. The weights are balanced among central bias, object detection, and optical flow. In another example (not illustrated), when the media stream is a video of a crowd of humans standing in front of the Eiffel tower, the pre-processing module would categorize it as having low camera motion and high clutter. In this case, humans are expected to explore the scene in a free-moving way. Thus, horizon bias and object detection receive high weights, with lower weights for the other modules.
[0068] Figure 5 shows an example of a communication system 500 in accordance with some embodiments.
[0069] In the example, the communication system 500 includes a telecommunication network 502 that includes an access network 504, such as a radio access network (RAN), and a core network 506, which includes one or more core network nodes 508. The access network 504 includes one or more access network nodes, such as network nodes 510a and 510b (one or more of which may be generally referred to as network nodes 510), or any other similar 3rd Generation Partnership Project (3GPP) access nodes or non-3GPP access points. Moreover, as will be appreciated by those of skill in the art, a network node is not necessarily limited to an implementation in which a radio portion and a baseband portion are supplied and integrated by a single vendor. Thus, it will be understood that network nodes include disaggregated implementations or portions thereof. For example, in some embodiments, the telecommunication network 502 includes one or more Open-RAN (ORAN) network nodes. An ORAN network node is a node in the telecommunication network 502 that supports an ORAN specification (e.g., a specification published by the O-RAN Alliance, or any similar organization) and may operate alone or together with other nodes to implement one or more functionalities of any node in the telecommunication network 502, including one or more network nodes 510 and/or core network nodes 508.
[0070] Examples of an ORAN network node include an open radio unit (O-RU), an open distributed unit (O-DU), an open central unit (O-CU), including an O-CU control plane (O-CU-CP) or an O-CU user plane (O-CU-UP), a RAN intelligent controller (near-real time or non-real time) hosting software or software plug-ins, such as a near-real time control application (e.g., xApp) or a non-real time control application (e.g., rApp), or any combination thereof (the adjective "open" designating support of an ORAN specification). The network node may support a specification by, for example, supporting an interface defined by the ORAN specification, such as an A1, F1, W1, E1, E2, X2, or Xn interface, an open fronthaul user plane interface, or an open fronthaul management plane interface. Moreover, an ORAN access node may be a logical node in a physical node. Furthermore, an ORAN network node may be implemented in a virtualization environment (described further below) in which one or more network functions are virtualized. For example, the virtualization environment may include an O-Cloud computing platform orchestrated by a Service Management and Orchestration Framework via an O2 interface defined by the O-RAN Alliance or comparable technologies. The network nodes 510 facilitate direct or indirect connection of user equipment (UE), such as by connecting UEs 512a, 512b, 512c, and 512d (one or more of which may be generally referred to as UEs 512) to the core network 506 over one or more wireless connections.
[0071] Example wireless communications over a wireless connection include transmitting and/or receiving wireless signals using electromagnetic waves, radio waves, infrared waves, and/or other types of signals suitable for conveying information without the use of wires, cables, or other material conductors. Moreover, in different embodiments, the communication system 500 may include any number of wired or wireless networks, network nodes, UEs, and/or any other components or systems that may facilitate or participate in the communication of data and/or signals whether via wired or wireless connections. The communication system 500 may include and/or interface with any type of communication, telecommunication, data, cellular, radio network, and/or other similar type of system.
[0072] The UEs 512 may be any of a wide variety of communication devices, including wireless devices arranged, configured, and/or operable to communicate wirelessly with the network nodes 510 and other communication devices. Similarly, the network nodes 510 are arranged, capable, configured, and/or operable to communicate directly or indirectly with the UEs 512 and/or with other network nodes or equipment in the telecommunication network 502 to enable and/or provide network access, such as wireless network access, and/or to perform other functions, such as administration in the telecommunication network 502.
[0073] In the depicted example, the core network 506 connects the network nodes 510 to one or more hosts, such as host 516. These connections may be direct or indirect via one or more intermediary networks or devices. In other examples, network nodes may be directly coupled to hosts. The core network 506 includes one or more core network nodes (e.g., core network node 508) that are structured with hardware and software components. Features of these components may be substantially similar to those described with respect to the UEs, network nodes, and/or hosts, such that the descriptions thereof are generally applicable to the corresponding components of the core network node 508. Example core network nodes include functions of one or more of a Mobile Switching Center (MSC), Mobility Management Entity (MME), Home Subscriber Server (HSS), Access and Mobility Management Function (AMF), Session Management Function (SMF), Authentication Server Function (AUSF), Subscription Identifier De-concealing Function (SIDF), Unified Data Management (UDM), Security Edge Protection Proxy (SEPP), Network Exposure Function (NEF), and/or a User Plane Function (UPF).
[0074] The host 516 may be under the ownership or control of a service provider other than an operator or provider of the access network 504 and/or the telecommunication network 502, and may be operated by the service provider or on behalf of the service provider. The host 516 may host a variety of applications to provide one or more services. Examples of such applications include live and pre-recorded audio/video content, data collection services such as retrieving and compiling data on various ambient conditions detected by a plurality of UEs, analytics functionality, social media, functions for controlling or otherwise interacting with remote devices, functions for an alarm and surveillance center, or any other such function performed by a server.
[0075] As a whole, the communication system 500 of Figure 5 enables connectivity between the UEs, network nodes, and hosts. In that sense, the communication system may be configured to operate according to predefined rules or procedures, such as specific standards that include, but are not limited to: Global System for Mobile Communications (GSM); Universal Mobile Telecommunications System (UMTS); Long Term Evolution (LTE), and/or other suitable 2G, 3G, 4G, 5G standards, or any applicable future generation standard (e.g., 6G); wireless local area network (WLAN) standards, such as the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standards (WiFi); and/or any other appropriate wireless communication standard, such as the Worldwide Interoperability for Microwave Access (WiMax), Bluetooth, Z-Wave, Near Field Communication (NFC), ZigBee, LiFi, and/or any low-power wide-area network (LPWAN) standards such as LoRa and Sigfox.
[0076] In some examples, the telecommunication network 502 is a cellular network that implements 3GPP standardized features. Accordingly, the telecommunications network 502 may support network slicing to provide different logical networks to different devices that are connected to the telecommunication network 502. For example, the telecommunications network 502 may provide Ultra Reliable Low Latency Communication (URLLC) services to some UEs, while providing Enhanced Mobile Broadband (eMBB) services to other UEs, and/or Massive Machine Type Communication (mMTC)/Massive IoT services to yet further UEs.
[0077] In some examples, the UEs 512 are configured to transmit and/or receive information without direct human interaction. For instance, a UE may be designed to transmit information to the access network 504 on a predetermined schedule, when triggered by an internal or external event, or in response to requests from the access network 504. Additionally, a UE may be configured for operating in single- or multi-RAT or multi-standard mode. For example, a UE may operate with any one or combination of Wi-Fi, NR (New Radio), and LTE, i.e., being configured for multi-radio dual connectivity (MR-DC), such as E-UTRAN (Evolved-UMTS Terrestrial Radio Access Network) New Radio - Dual Connectivity (EN-DC).
[0078] In the example, the hub 514 communicates with the access network 504 to facilitate indirect communication between one or more UEs (e.g., UE 512c and/or 512d) and network nodes (e.g., network node 510b). In some examples, the hub 514 may be a controller, router, content source and analytics, or any of the other communication devices described herein regarding UEs. For example, the hub 514 may be a broadband router enabling access to the core network 506 for the UEs. As another example, the hub 514 may be a controller that sends commands or instructions to one or more actuators in the UEs. Commands or instructions may be received from the UEs, network nodes 510, or by executable code, script, process, or other instructions in the hub 514. As another example, the hub 514 may be a data collector that acts as temporary storage for UE data and, in some embodiments, may perform analysis or other processing of the data. As another example, the hub 514 may be a content source. For example, for a UE that is a VR headset, display, loudspeaker or other media delivery device, the hub 514 may retrieve VR assets, video, audio, or other media or data related to sensory information via a network node, which the hub 514 then provides to the UE either directly, after performing local processing, and/or after adding additional local content. In still another example, the hub 514 acts as a proxy server or orchestrator for the UEs, in particular if one or more of the UEs are low-energy IoT devices.
[0079] The hub 514 may have a constant/persistent or intermittent connection to the network node 510b. The hub 514 may also allow for a different communication scheme and/or schedule between the hub 514 and UEs (e.g., UE 512c and/or 512d), and between the hub 514 and the core network 506. In other examples, the hub 514 is connected to the core network 506 and/or one or more UEs via a wired connection. Moreover, the hub 514 may be configured to connect to an M2M service provider over the access network 504 and/or to another UE over a direct connection. In some scenarios, UEs may establish a wireless connection with the network nodes 510 while still connected via the hub 514 via a wired or wireless connection. In some embodiments, the hub 514 may be a dedicated hub - that is, a hub whose primary function is to route communications to/from the UEs from/to the network node 510b. In other embodiments, the hub 514 may be a non-dedicated hub - that is, a device which is capable of operating to route communications between the UEs and network node 510b, but which is additionally capable of operating as a communication start and/or end point for certain data channels.
[0080] Figure 6 shows a UE 600 in accordance with some embodiments. As used herein, a UE refers to a device capable, configured, arranged and/or operable to communicate wirelessly with network nodes and/or other UEs. Examples of a UE include, but are not limited to, a smart phone, mobile phone, cell phone, voice over IP (VoIP) phone, wireless local loop phone, desktop computer, personal digital assistant (PDA), wireless cameras, gaming console or device, music storage device, playback appliance, wearable terminal device, wireless endpoint, mobile station, tablet, laptop, laptop-embedded equipment (LEE), laptop-mounted equipment (LME), smart device, wireless customer-premise equipment (CPE), vehicle, vehicle-mounted or vehicle embedded/integrated wireless device, etc. Other examples include any UE identified by the 3rd Generation Partnership Project (3GPP), including a narrow band internet of things (NB-IoT) UE, a machine type communication (MTC) UE, and/or an enhanced MTC (eMTC) UE.
[0081] A UE may support device-to-device (D2D) communication, for example by implementing a 3GPP standard for sidelink communication, Dedicated Short-Range Communication (DSRC), vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), or vehicle-to-everything (V2X). In other examples, a UE may not necessarily have a user in the sense of a human user who owns and/or operates the relevant device. Instead, a UE may represent a device that is intended for sale to, or operation by, a human user but which may not, or which may not initially, be associated with a specific human user (e.g., a smart sprinkler controller).
Alternatively, a UE may represent a device that is not intended for sale to, or operation by, an end user but which may be associated with or operated for the benefit of a user (e.g., a smart power meter).
[0082] The UE 600 includes processing circuitry 602 that is operatively coupled via a bus 604 to an input/output interface 606, a power source 608, a memory 610, a communication interface 612, and/or any other component, or any combination thereof. Certain UEs may utilize all or a subset of the components shown in Figure 6. The level of integration between the components may vary from one UE to another UE. Further, certain UEs may contain multiple instances of a component, such as multiple processors, memories, transceivers, transmitters, receivers, etc.
[0083] The processing circuitry 602 is configured to process instructions and data and may be configured to implement any sequential state machine operative to execute instructions stored as machine-readable computer programs in the memory 610. The processing circuitry 602 may be implemented as one or more hardware-implemented state machines (e.g., in discrete logic, field-programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), etc.); programmable logic together with appropriate firmware; one or more stored computer programs, general-purpose processors, such as a microprocessor or digital signal processor (DSP), together with appropriate software; or any combination of the above. For example, the processing circuitry 602 may include multiple central processing units (CPUs).
[0084] In the example, the input/output interface 606 may be configured to provide an interface or interfaces to an input device, output device, or one or more input and/or output devices. Examples of an output device include a speaker, a sound card, a video card, a display, a monitor, a printer, an actuator, an emitter, a smartcard, another output device, or any combination thereof. An input device may allow a user to capture information into the UE 600. Examples of an input device include a touch-sensitive or presence-sensitive display, a camera (e.g., a digital camera, a digital video camera, a web camera, etc.), a microphone, a sensor, a mouse, a trackball, a directional pad, a trackpad, a scroll wheel, a smartcard, and the like. The presence-sensitive display may include a capacitive or resistive touch sensor to sense input from a user. A sensor may be, for instance, an accelerometer, a gyroscope, a tilt sensor, a force sensor, a magnetometer, an optical sensor, a proximity sensor, a biometric sensor, etc., or any combination thereof. An output device may use the same type of interface port as an input device. For example, a Universal Serial Bus (USB) port may be used to provide an input device and an output device.
[0085] In some embodiments, the power source 608 is structured as a battery or battery pack. Other types of power sources, such as an external power source (e.g., an electricity outlet), photovoltaic device, or power cell, may be used. The power source 608 may further include power circuitry for delivering power from the power source 608 itself, and/or an external power source, to the various parts of the UE 600 via input circuitry or an interface such as an electrical power cable. Delivering power may be, for example, for charging of the power source 608. Power circuitry may perform any formatting, converting, or other modification to the power from the power source 608 to make the power suitable for the respective components of the UE 600 to which power is supplied.
[0086] The memory 610 may be or be configured to include memory such as random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), magnetic disks, optical disks, hard disks, removable cartridges, flash drives, and so forth. In one example, the memory 610 includes one or more application programs 614, such as an operating system, web browser application, a widget, gadget engine, or other application, and corresponding data 616. The memory 610 may store, for use by the UE 600, any of a variety of operating systems or combinations of operating systems.
[0087] The memory 610 may be configured to include a number of physical drive units, such as redundant array of independent disks (RAID), flash memory, USB flash drive, external hard disk drive, thumb drive, pen drive, key drive, high-density digital versatile disc (HD-DVD) optical disc drive, internal hard disk drive, Blu-Ray optical disc drive, holographic digital data storage (HDDS) optical disc drive, external mini-dual in-line memory module (DIMM), synchronous dynamic random access memory (SDRAM), external micro-DIMM SDRAM, smartcard memory such as tamper resistant module in the form of a universal integrated circuit card (UICC) including one or more subscriber identity modules (SIMs), such as a USIM and/or ISIM, other memory, or any combination thereof. The UICC may for example be an embedded UICC (eUICC), integrated UICC (iUICC) or a removable UICC commonly known as ‘SIM card.’ The memory 610 may allow the UE 600 to access instructions, application programs and the like, stored on transitory or non-transitory memory media, to off-load data, or to upload data. An article of manufacture, such as one utilizing a communication system may be tangibly embodied as or in the memory 610, which may be or comprise a device-readable storage medium.
[0088] The processing circuitry 602 may be configured to communicate with an access network or other network using the communication interface 612. The communication interface 612 may comprise one or more communication subsystems and may include or be communicatively coupled to an antenna 622. The communication interface 612 may include one or more transceivers used to communicate, such as by communicating with one or more remote transceivers of another device capable of wireless communication (e.g., another UE or a network node in an access network). Each transceiver may include a transmitter 618 and/or a receiver 620 appropriate to provide network communications (e.g., optical, electrical, frequency allocations, and so forth). Moreover, the transmitter 618 and receiver 620 may be coupled to one or more antennas (e.g., antenna 622) and may share circuit components, software or firmware, or alternatively be implemented separately.
[0089] In the illustrated embodiment, communication functions of the communication interface 612 may include cellular communication, Wi-Fi communication, LPWAN communication, data communication, voice communication, multimedia communication, short-range communications such as Bluetooth, near-field communication, location-based communication such as the use of the global positioning system (GPS) to determine a location, another like communication function, or any combination thereof. Communications may be implemented in accordance with one or more communication protocols and/or standards, such as IEEE 802.11, Code Division Multiplexing Access (CDMA), Wideband Code Division Multiple Access (WCDMA), GSM, LTE, New Radio (NR), UMTS, WiMax, Ethernet, transmission control protocol/internet protocol (TCP/IP), synchronous optical networking (SONET), Asynchronous Transfer Mode (ATM), QUIC, Hypertext Transfer Protocol (HTTP), and so forth.
[0090] Regardless of the type of sensor, a UE may provide an output of data captured by its sensors, through its communication interface 612, via a wireless connection to a network node. Data captured by sensors of a UE can be communicated through a wireless connection to a network node via another UE. The output may be periodic (e.g., once every 15 minutes if it reports the sensed temperature), random (e.g., to even out the load from reporting from several sensors), in response to a triggering event (e.g., when moisture is detected an alert is sent), in response to a request (e.g., a user initiated request), or a continuous stream (e.g., a live video feed of a patient).
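The reporting behaviors described in paragraph [0090] can be summarized, purely as an illustrative sketch, by the following Python snippet; the function name, interval, and jitter values are hypothetical examples and are not prescribed by this disclosure.

```python
# Illustrative sketch of the reporting modes described in [0090]; the function name,
# intervals, and jitter are hypothetical examples only.
import random

def next_report_delay(mode, event=False, requested=False,
                      base_interval=900.0, jitter=60.0):
    """Return seconds until the next sensor report (0.0 means report immediately)."""
    if requested or event:          # on-request or event-triggered (e.g., moisture alert)
        return 0.0
    if mode == "periodic":          # e.g., report the sensed temperature every 15 minutes
        return base_interval
    if mode == "randomized":        # spread reports in time to even out network load
        return base_interval + random.uniform(-jitter, jitter)
    return 0.0                      # continuous streaming: report as soon as data is available
```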
[0091] As another example, a UE comprises an actuator, a motor, or a switch, related to a communication interface configured to receive wireless input from a network node via a wireless connection. In response to the received wireless input, the states of the actuator, the motor, or the switch may change. For example, the UE may comprise a motor that adjusts the control surfaces or rotors of a drone in flight according to the received input, or a robotic arm performing a medical procedure according to the received input.
[0092] A UE, when in the form of an Internet of Things (IoT) device, may be a device for use in one or more application domains, these domains comprising, but not limited to, city wearable technology, extended industrial application and healthcare. Non-limiting examples of such an IoT device are a device which is or which is embedded in: a connected refrigerator or freezer, a TV, a connected lighting device, an electricity meter, a robot vacuum cleaner, a voice controlled smart speaker, a home security camera, a motion detector, a thermostat, a smoke detector, a door/window sensor, a flood/moisture sensor, an electrical door lock, a connected doorbell, an air conditioning system like a heat pump, an autonomous vehicle, a surveillance system, a weather monitoring device, a vehicle parking monitoring device, an electric vehicle charging station, a smart watch, a fitness tracker, a head-mounted display for Augmented Reality (AR) or Virtual Reality (VR), a wearable for tactile augmentation or sensory enhancement, a water sprinkler, an animal- or item-tracking device, a sensor for monitoring a plant or animal, an industrial robot, an Unmanned Aerial Vehicle (UAV), and any kind of medical device, like a heart rate monitor or a remote-controlled surgical robot. A UE in the form of an IoT device comprises circuitry and/or software in dependence of the intended application of the IoT device in addition to other components as described in relation to the UE 600 shown in Figure 6.
[0093] As yet another specific example, in an IoT scenario, a UE may represent a machine or other device that performs monitoring and/or measurements, and transmits the results of such monitoring and/or measurements to another UE and/or a network node. The UE may in this case be an M2M device, which may in a 3GPP context be referred to as an MTC device. As one particular example, the UE may implement the 3GPP NB-IoT standard. In other scenarios, a UE may represent a vehicle, such as a car, a bus, a truck, a ship and an airplane, or other equipment that is capable of monitoring and/or reporting on its operational status or other functions associated with its operation.
[0094] In practice, any number of UEs may be used together with respect to a single use case. For example, a first UE might be or be integrated in a drone and provide the drone’s speed information (obtained through a speed sensor) to a second UE that is a remote controller operating the drone. When the user makes changes from the remote controller, the first UE may adjust the throttle on the drone (e.g. by controlling an actuator) to increase or decrease the drone’s speed. The first and/or the second UE can also include more than one of the functionalities described above. For example, a UE might comprise the sensor and the actuator, and handle communication of data for both the speed sensor and the actuators.
[0095] Figure 7 shows a network node 700 in accordance with some embodiments. As used herein, network node refers to equipment capable, configured, arranged and/or operable to communicate directly or indirectly with a UE and/or with other network nodes or equipment, in a telecommunication network. Examples of network nodes include, but are not limited to, access points (APs) (e.g., radio access points), base stations (BSs) (e.g., radio base stations, Node Bs, evolved Node Bs (eNBs) and NR NodeBs (gNBs)), O-RAN nodes or components of an O-RAN node (e.g., O-RU, O-DU, O-CU).
[0096] Base stations may be categorized based on the amount of coverage they provide (or, stated differently, their transmit power level) and so, depending on the provided amount of coverage, may be referred to as femto base stations, pico base stations, micro base stations, or macro base stations. A base station may be a relay node or a relay donor node controlling a relay. A network node may also include one or more (or all) parts of a distributed radio base station such as centralized digital units, distributed units (e.g., in an O-RAN access node) and/or remote radio units (RRUs), sometimes referred to as Remote Radio Heads (RRHs). Such remote radio units may or may not be integrated with an antenna as an antenna integrated radio. Parts of a distributed radio base station may also be referred to as nodes in a distributed antenna system (DAS).
[0097] Other examples of network nodes include multiple transmission point (multi-TRP) 5G access nodes, multi-standard radio (MSR) equipment such as MSR BSs, network controllers such as radio network controllers (RNCs) or base station controllers (BSCs), base transceiver stations (BTSs), transmission points, transmission nodes, multi-cell/multicast coordination entities (MCEs), Operation and Maintenance (O&M) nodes, Operations Support System (OSS) nodes, Self-Organizing Network (SON) nodes, positioning nodes (e.g., Evolved Serving Mobile Location Centers (E-SMLCs)), and/or Minimization of Drive Tests (MDTs).
[0098] The network node 700 includes a processing circuitry 702, a memory 704, a communication interface 706, and a power source 708. The network node 700 may be composed of multiple physically separate components (e.g., a NodeB component and a RNC component, or a BTS component and a BSC component, etc.), which may each have their own respective components. In certain scenarios in which the network node 700 comprises multiple separate components (e.g., BTS and BSC components), one or more of the separate components may be shared among several network nodes. For example, a single RNC may control multiple NodeBs. In such a scenario, each unique NodeB and RNC pair may, in some instances, be considered a single separate network node. In some embodiments, the network node 700 may be configured to support multiple radio access technologies (RATs). In such embodiments, some components may be duplicated (e.g., separate memory 704 for different RATs) and some components may be reused (e.g., a same antenna 710 may be shared by different RATs). The network node 700 may also include multiple sets of the various illustrated components for different wireless technologies integrated into network node 700, for example GSM, WCDMA, LTE, NR, Wi-Fi, Zigbee, Z-wave, LoRaWAN, Radio Frequency Identification (RFID) or Bluetooth wireless technologies. These wireless technologies may be integrated into the same or different chip or set of chips and other components within network node 700.
[0099] The processing circuitry 702 may comprise a combination of one or more of a microprocessor, controller, microcontroller, central processing unit, digital signal processor, application-specific integrated circuit, field programmable gate array, or any other suitable computing device, resource, or combination of hardware, software and/or encoded logic operable to provide, either alone or in conjunction with other network node 700 components, such as the memory 704, network node 700 functionality.
[0100] In some embodiments, the processing circuitry 702 includes a system on a chip (SOC). In some embodiments, the processing circuitry 702 includes one or more of radio frequency (RF) transceiver circuitry 712 and baseband processing circuitry 714. In some embodiments, the radio frequency (RF) transceiver circuitry 712 and the baseband processing circuitry 714 may be on separate chips (or sets of chips), boards, or units, such as radio units and digital units. In alternative embodiments, part or all of RF transceiver circuitry 712 and baseband processing circuitry 714 may be on the same chip or set of chips, boards, or units.
[0101] The memory 704 may comprise any form of volatile or non-volatile computer-readable memory including, without limitation, persistent storage, solid-state memory, remotely mounted memory, magnetic media, optical media, random access memory (RAM), read-only memory (ROM), mass storage media (for example, a hard disk), removable storage media (for example, a flash drive, a Compact Disk (CD) or a Digital Video Disk (DVD)), and/or any other volatile or non-volatile, non-transitory device-readable and/or computer-executable memory devices that store information, data, and/or instructions that may be used by the processing circuitry 702. The memory 704 may store any suitable instructions, data, or information, including a computer program, software, an application including one or more of logic, rules, code, tables, and/or other instructions capable of being executed by the processing circuitry 702 and utilized by the network node 700. The memory 704 may be used to store any calculations made by the processing circuitry 702 and/or any data received via the communication interface 706. In some embodiments, the processing circuitry 702 and memory 704 are integrated.
[0102] The communication interface 706 is used in wired or wireless communication of signaling and/or data between a network node, access network, and/or UE. As illustrated, the communication interface 706 comprises port(s)/terminal(s) 716 to send and receive data, for example to and from a network over a wired connection. The communication interface 706 also includes radio front-end circuitry 718 that may be coupled to, or in certain embodiments a part of, the antenna 710. Radio front-end circuitry 718 comprises filters 720 and amplifiers 722. The radio front-end circuitry 718 may be connected to an antenna 710 and processing circuitry 702. The radio front-end circuitry may be configured to condition signals communicated between antenna 710 and processing circuitry 702. The radio front-end circuitry 718 may receive digital data that is to be sent out to other network nodes or UEs via a wireless connection. The radio front-end circuitry 718 may convert the digital data into a radio signal having the appropriate channel and bandwidth parameters using a combination of filters 720 and/or amplifiers 722. The radio signal may then be transmitted via the antenna 710. Similarly, when receiving data, the antenna 710 may collect radio signals which are then converted into digital data by the radio front-end circuitry 718. The digital data may be passed to the processing circuitry 702. In other embodiments, the communication interface may comprise different components and/or different combinations of components.
[0103] In certain alternative embodiments, the network node 700 does not include separate radio front-end circuitry 718; instead, the processing circuitry 702 includes radio front-end circuitry and is connected to the antenna 710. Similarly, in some embodiments, all or some of the RF transceiver circuitry 712 is part of the communication interface 706. In still other embodiments, the communication interface 706 includes one or more ports or terminals 716, the radio front-end circuitry 718, and the RF transceiver circuitry 712, as part of a radio unit (not shown), and the communication interface 706 communicates with the baseband processing circuitry 714, which is part of a digital unit (not shown).
[0104] The antenna 710 may include one or more antennas, or antenna arrays, configured to send and/or receive wireless signals. The antenna 710 may be coupled to the radio front-end circuitry 718 and may be any type of antenna capable of transmitting and receiving data and/or signals wirelessly. In certain embodiments, the antenna 710 is separate from the network node 700 and connectable to the network node 700 through an interface or port.
[0105] The antenna 710, communication interface 706, and/or the processing circuitry 702 may be configured to perform any receiving operations and/or certain obtaining operations described herein as being performed by the network node. Any information, data and/or signals may be received from a UE, another network node and/or any other network equipment. Similarly, the antenna 710, the communication interface 706, and/or the processing circuitry 702 may be configured to perform any transmitting operations described herein as being performed by the network node. Any information, data and/or signals may be transmitted to a UE, another network node and/or any other network equipment.
[0106] The power source 708 provides power to the various components of network node 700 in a form suitable for the respective components (e.g., at a voltage and current level needed for each respective component). The power source 708 may further comprise, or be coupled to, power management circuitry to supply the components of the network node 700 with power for performing the functionality described herein. For example, the network node 700 may be connectable to an external power source (e.g., the power grid, an electricity outlet) via an input circuitry or interface such as an electrical cable, whereby the external power source supplies power to power circuitry of the power source 708. As a further example, the power source 708 may comprise a source of power in the form of a battery or battery pack which is connected to, or integrated in, power circuitry. The battery may provide backup power should the external power source fail.
[0107] Embodiments of the network node 700 may include additional components beyond those shown in Figure 7 for providing certain aspects of the network node’s functionality, including any of the functionality described herein and/or any functionality necessary to support the subject matter described herein. For example, the network node 700 may include user interface equipment to allow input of information into the network node 700 and to allow output of information from the network node 700. This may allow a user to perform diagnostic, maintenance, repair, and other administrative functions for the network node 700.
[0108] Figure 8 is a block diagram of a host 800, which may be an embodiment of the host 516 of Figure 5, in accordance with various aspects described herein. The remote electronic device 102 may be provided by the host 800 in some aspects. As used herein, the host 800 may be or comprise various combinations of hardware and/or software, including a standalone server, a blade server, a cloud-implemented server, a distributed server, a virtual machine, a container, or processing resources in a server farm. The host 800 may provide one or more services to one or more UEs, such as providing video frames encoded as described herein.
[0109] The host 800 includes processing circuitry 802 that is operatively coupled via a bus 804 to an input/output interface 806, a network interface 808, a power source 810, and a memory 812. Other components may be included in other embodiments. Features of these components may be substantially similar to those described with respect to the devices of previous figures, such as Figures 6 and 7, such that the descriptions thereof are generally applicable to the corresponding components of host 800.
[0110] The memory 812 may include one or more computer programs including one or more host application programs 814 and data 816, which may include user data, e.g., data generated by a UE for the host 800 or data generated by the host 800 for a UE. Embodiments of the host 800 may utilize only a subset or all of the components shown. The host application programs 814 may be implemented in a container-based architecture and may provide support for video codecs (e.g., Versatile Video Coding (VVC), High Efficiency Video Coding (HEVC), Advanced Video Coding (AVC), MPEG, VP9) and audio codecs (e.g., FLAC, Advanced Audio Coding (AAC), MPEG, G.711), including transcoding for multiple different classes, types, or implementations of UEs (e.g., handsets, desktop computers, wearable display systems, heads-up display systems). The host application programs 814 may also provide for user authentication and licensing checks and may periodically report health, routes, and content availability to a central node, such as a device in or on the edge of a core network. Accordingly, the host 800 may select and/or indicate a different host for over-the-top services for a UE. The host application programs 814 may support various protocols, such as the HTTP Live Streaming (HLS) protocol, Real-Time Messaging Protocol (RTMP), Real-Time Streaming Protocol (RTSP), Dynamic Adaptive Streaming over HTTP (MPEG-DASH), etc.
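As a purely illustrative sketch of the kind of per-UE-class codec and streaming-protocol selection the host application programs 814 may perform when transcoding, consider the following Python snippet; the UE classes, codec choices, and default shown are hypothetical examples rather than requirements of this disclosure.

```python
# Hypothetical mapping from UE class to the codec/streaming-protocol pair used for transcoding.
# The classes, codecs, and defaults are illustrative only.
UE_PROFILES = {
    "handset":      {"codec": "HEVC", "protocol": "HLS"},
    "desktop":      {"codec": "AVC",  "protocol": "MPEG-DASH"},
    "wearable_hmd": {"codec": "VVC",  "protocol": "RTSP"},   # e.g., a VR head-mounted display
    "heads_up":     {"codec": "VP9",  "protocol": "MPEG-DASH"},
}

def select_delivery(ue_class: str) -> dict:
    """Return the codec/protocol pair for a given UE class (conservative default otherwise)."""
    return UE_PROFILES.get(ue_class, {"codec": "AVC", "protocol": "HLS"})

print(select_delivery("wearable_hmd"))  # {'codec': 'VVC', 'protocol': 'RTSP'}
```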
[0111] Figure 9 is a block diagram illustrating a virtualization environment 900 in which functions implemented by some embodiments may be virtualized. In the present context, virtualizing means creating virtual versions of apparatuses or devices which may include virtualizing hardware platforms, storage devices and networking resources. As used herein, virtualization can be applied to any device described herein, or components thereof, and relates to an implementation in which at least a portion of the functionality is implemented as one or more virtual components. Some or all of the functions described herein may be implemented as virtual components executed by one or more virtual machines (VMs) implemented in one or more virtual environments 900 hosted by one or more hardware nodes, such as a hardware computing device that operates as a network node, UE, core network node, or host. Further, in embodiments in which the virtual node does not require radio connectivity (e.g., a core network node or host), the node may be entirely virtualized. In some embodiments, the virtualization environment 900 includes components defined by the O-RAN Alliance, such as an O-Cloud environment orchestrated by a Service Management and Orchestration Framework via an O2 interface.
[0112] Applications 902 (which may alternatively be called software instances, virtual appliances, network functions, virtual nodes, virtual network functions, etc.) are run in the virtualization environment 900 to implement some of the features, functions, and/or benefits of some of the embodiments disclosed herein.
[0113] Hardware 904 includes processing circuitry, memory that stores software and/or instructions executable by hardware processing circuitry, and/or other hardware devices as described herein, such as a network interface, input/output interface, and so forth. Software may be executed by the processing circuitry to instantiate one or more virtualization layers 906 (also referred to as hypervisors or virtual machine monitors (VMMs)), provide VMs 908a and 908b (one or more of which may be generally referred to as VMs 908), and/or perform any of the functions, features and/or benefits described in relation with some embodiments described herein. The virtualization layer 906 may present a virtual operating platform that appears like networking hardware to the VMs 908.
[0114] The VMs 908 comprise virtual processing, virtual memory, virtual networking or interface and virtual storage, and may be run by a corresponding virtualization layer 906. Different embodiments of the instance of a virtual appliance 902 may be implemented on one or more of VMs 908, and the implementations may be made in different ways. Virtualization of the hardware is in some contexts referred to as network function virtualization (NFV). NFV may be used to consolidate many network equipment types onto industry standard high volume server hardware, physical switches, and physical storage, which can be located in data centers, and customer premise equipment.
[0115] In the context of NFV, a VM 908 may be a software implementation of a physical machine that runs programs as if they were executing on a physical, non-virtualized machine. Each of the VMs 908, and that part of hardware 904 that executes that VM, be it hardware dedicated to that VM and/or hardware shared by that VM with others of the VMs, forms a separate virtual network element. Still in the context of NFV, a virtual network function is responsible for handling specific network functions that run in one or more VMs 908 on top of the hardware 904 and corresponds to the application 902.
[0116] Hardware 904 may be implemented in a standalone network node with generic or specific components. Hardware 904 may implement some functions via virtualization.
Alternatively, hardware 904 may be part of a larger cluster of hardware (e.g. such as in a data center or CPE) where many hardware nodes work together and are managed via management and orchestration 910, which, among others, oversees lifecycle management of applications 902. In some embodiments, hardware 904 is coupled to one or more radio units that each include one or more transmitters and one or more receivers that may be coupled to one or more antennas. Radio units may communicate directly with other hardware nodes via one or more appropriate network interfaces and may be used in combination with the virtual components to provide a virtual node with radio capabilities, such as a radio access node or a base station. In some embodiments, some signaling can be provided with the use of a control system 912 which may alternatively be used for communication between hardware nodes and radio units.
[0117] Figure 10 shows a communication diagram of a host 1002 communicating via a network node 1004 with a UE 1006 over a partially wireless connection in accordance with some embodiments. Example implementations, in accordance with various embodiments, of the UE (such as a UE 512a of Figure 5 and/or UE 600 of Figure 6), network node (such as network node 510a of Figure 5 and/or network node 700 of Figure 7), and host (such as host 516 of Figure 5 and/or host 800 of Figure 8) discussed in the preceding paragraphs will now be described with reference to Figure 10.
[0118] Like host 800, embodiments of host 1002 include hardware, such as a communication interface, processing circuitry, and memory. The host 1002 also includes software, which is stored in or accessible by the host 1002 and executable by the processing circuitry. The software includes a host application that may be operable to provide a service to a remote user, such as the UE 1006 connecting via an over-the-top (OTT) connection 1050 extending between the UE 1006 and host 1002. In providing the service to the remote user, a host application may provide user data which is transmitted using the OTT connection 1050.
[0119] The network node 1004 includes hardware enabling it to communicate with the host 1002 and UE 1006. The connection 1060 may be direct or pass through a core network (like core network 506 of Figure 5) and/or one or more other intermediate networks, such as one or more public, private, or hosted networks. For example, an intermediate network may be a backbone network or the Internet.
[0120] The UE 1006 includes hardware and software, which is stored in or accessible by UE 1006 and executable by the UE’s processing circuitry. The software includes a client application, such as a web browser or operator-specific “app” that may be operable to provide a service to a human or non-human user via UE 1006 with the support of the host 1002. In the host 1002, an executing host application may communicate with the executing client application via the OTT connection 1050 terminating at the UE 1006 and host 1002. In providing the service to the user, the UE’s client application may receive request data from the host's host application and provide user data in response to the request data. The OTT connection 1050 may transfer both the request data and the user data. The UE's client application may interact with the user to generate the user data that it provides to the host application through the OTT connection 1050.
[0121] The OTT connection 1050 may extend via a connection 1060 between the host 1002 and the network node 1004 and via a wireless connection 1070 between the network node 1004 and the UE 1006 to provide the connection between the host 1002 and the UE 1006. The connection 1060 and wireless connection 1070, over which the OTT connection 1050 may be provided, have been drawn abstractly to illustrate the communication between the host 1002 and the UE 1006 via the network node 1004, without explicit reference to any intermediary devices and the precise routing of messages via these devices.
[0122] As an example of transmitting data via the OTT connection 1050, in step 1008, the host 1002 provides user data, which may be performed by executing a host application. In some embodiments, the user data is associated with a particular human user interacting with the UE 1006. In other embodiments, the user data is associated with a UE 1006 that shares data with the host 1002 without explicit human interaction. In step 1010, the host 1002 initiates a transmission carrying the user data towards the UE 1006. The host 1002 may initiate the transmission responsive to a request transmitted by the UE 1006. The request may be caused by human interaction with the UE 1006 or by operation of the client application executing on the UE 1006. The transmission may pass via the network node 1004, in accordance with the teachings of the embodiments described throughout this disclosure. Accordingly, in step 1012, the network node 1004 transmits to the UE 1006 the user data that was carried in the transmission that the host 1002 initiated, in accordance with the teachings of the embodiments described throughout this disclosure. In step 1014, the UE 1006 receives the user data carried in the transmission, which may be performed by a client application executed on the UE 1006 associated with the host application executed by the host 1002.
[0123] In some examples, the UE 1006 executes a client application which provides user data to the host 1002. The user data may be provided in reaction or response to the data received from the host 1002. Accordingly, in step 1016, the UE 1006 may provide user data, which may be performed by executing the client application. In providing the user data, the client application may further consider user input received from the user via an input/output interface of the UE 1006. Regardless of the specific manner in which the user data was provided, the UE 1006 initiates, in step 1018, transmission of the user data towards the host 1002 via the network node 1004. In step 1020, in accordance with the teachings of the embodiments described throughout this disclosure, the network node 1004 receives user data from the UE 1006 and initiates transmission of the received user data towards the host 1002. In step 1022, the host 1002 receives the user data carried in the transmission initiated by the UE 1006.
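The downlink and uplink exchanges of steps 1008-1022 can be mimicked, purely for illustration, by the following Python sketch; the Host, NetworkNode, and UE classes and their methods are hypothetical stand-ins for the entities of Figure 10 and do not implement any particular protocol stack.

```python
# Hypothetical sketch of the user-data flow of steps 1008-1022 (Figure 10).
class Host:
    def provide_user_data(self, request=None):
        # Step 1008: the host application produces user data (e.g., encoded video tiles).
        return {"payload": b"encoded-tiles", "request": request}

    def receive(self, user_data):
        # Step 1022: the host receives user data that the UE initiated.
        print("host received:", user_data)

class NetworkNode:
    def forward_downlink(self, user_data, ue):
        # Step 1012: the network node transmits host-initiated data toward the UE.
        ue.receive(user_data)

    def forward_uplink(self, user_data, host):
        # Step 1020: the network node relays UE-initiated data toward the host.
        host.receive(user_data)

class UE:
    def __init__(self, network_node, host):
        self.network_node, self.host = network_node, host

    def receive(self, user_data):
        # Step 1014: the client application consumes the user data.
        print("UE received:", user_data["payload"])
        # Steps 1016-1018: the client application provides user data in response.
        self.network_node.forward_uplink({"ack": True}, self.host)

host, node = Host(), NetworkNode()
ue = UE(node, host)
node.forward_downlink(host.provide_user_data(request="play"), ue)  # steps 1008-1014
```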
[0124] One or more of the various embodiments improve the performance of OTT services provided to the UE 1006 using the OTT connection 1050, in which the wireless connection 1070 forms the last segment.
[0125] In an example scenario, factory status information may be collected and analyzed by the host 1002. As another example, the host 1002 may process audio and video data which may have been retrieved from a UE for use in creating maps. As another example, the host 1002 may collect and analyze real-time data to assist in controlling vehicle congestion (e.g., controlling traffic lights). As another example, the host 1002 may store surveillance video uploaded by a UE. As another example, the host 1002 may store or control access to media content such as video, audio, VR or AR which it can broadcast, multicast or unicast to UEs. As other examples, the host 1002 may be used for energy pricing, remote control of non-time critical electrical load to balance power generation needs, location services, presentation services (such as compiling diagrams etc. from data collected from remote devices), or any other function of collecting, retrieving, storing, analyzing and/or transmitting data.
[0126] In some examples, a measurement procedure may be provided for the purpose of monitoring data rate, latency and other factors on which the one or more embodiments improve. There may further be an optional network functionality for reconfiguring the OTT connection 1050 between the host 1002 and UE 1006, in response to variations in the measurement results. The measurement procedure and/or the network functionality for reconfiguring the OTT connection may be implemented in software and hardware of the host 1002 and/or UE 1006. In some embodiments, sensors (not shown) may be deployed in or in association with other devices through which the OTT connection 1050 passes; the sensors may participate in the measurement procedure by supplying values of the monitored quantities exemplified above, or supplying values of other physical quantities from which software may compute or estimate the monitored quantities. The reconfiguring of the OTT connection 1050 may include changes to the message format, retransmission settings, preferred routing, etc.; the reconfiguring need not directly alter the operation of the network node 1004. Such procedures and functionalities may be known and practiced in the art. In certain embodiments, measurements may involve proprietary UE signaling that facilitates measurements of throughput, propagation times, latency and the like, by the host 1002. The measurements may be implemented in that the software causes messages to be transmitted, in particular empty or ‘dummy’ messages, using the OTT connection 1050 while monitoring propagation times, errors, etc.
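As a minimal sketch of the dummy-message measurement idea above, the following Python snippet estimates latency from round-trip times of empty messages and invokes a reconfiguration callback when latency degrades; the send/receive callables, the threshold, and the callback are illustrative assumptions rather than parts of this disclosure.

```python
# Illustrative sketch: estimate latency from round trips of empty 'dummy' messages sent
# over the OTT connection, and trigger reconfiguration when latency degrades.
# The send/receive callables, threshold, and callback are hypothetical.
import time
import statistics

def measure_latency(send, receive, samples=10):
    """Estimate one-way latency (in seconds) as half the median dummy-message round trip."""
    rtts = []
    for _ in range(samples):
        t0 = time.monotonic()
        send(b"")                      # empty 'dummy' message
        receive()                      # wait for the corresponding echo/acknowledgement
        rtts.append(time.monotonic() - t0)
    return statistics.median(rtts) / 2.0

def maybe_reconfigure(latency, threshold=0.050, reconfigure=lambda: None):
    """Invoke the reconfiguration callback when measured latency exceeds the threshold."""
    if latency > threshold:            # hypothetical 50 ms threshold
        reconfigure()                  # e.g., adjust preferred routing or retransmission settings
```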
[0127] Although the computing devices described herein (e.g., UEs, network nodes, hosts) may include the illustrated combination of hardware components, other embodiments may comprise computing devices with different combinations of components. It is to be understood that these computing devices may comprise any suitable combination of hardware and/or software needed to perform the tasks, features, functions and methods disclosed herein.
Determining, calculating, obtaining or similar operations described herein may be performed by processing circuitry, which may process information by, for example, converting the obtained information into other information, comparing the obtained information or converted information to information stored in the network node, and/or performing one or more operations based on the obtained information or converted information, and as a result of said processing making a determination. Moreover, while components are depicted as single boxes located within a larger box, or nested within multiple boxes, in practice, computing devices may comprise multiple different physical components that make up a single illustrated component, and functionality may be partitioned between separate components. For example, a communication interface may be configured to include any of the components described herein, and/or the functionality of the components may be partitioned between the processing circuitry and the communication interface. In another example, non-computationally intensive functions of any of such components may be implemented in software or firmware and computationally intensive functions may be implemented in hardware.
[0128] In certain embodiments, some or all of the functionality described herein may be provided by processing circuitry executing instructions stored in memory, which in certain embodiments may be a computer program product in the form of a non-transitory computer-readable storage medium. In alternative embodiments, some or all of the functionality may be provided by the processing circuitry without executing instructions stored on a separate or discrete device-readable storage medium, such as in a hard-wired manner. In any of those particular embodiments, whether executing instructions stored on a non-transitory computer-readable storage medium or not, the processing circuitry can be configured to perform the described functionality. The benefits provided by such functionality are not limited to the processing circuitry alone or to other components of the computing device, but are enjoyed by the computing device as a whole, and/or by end users and a wireless network generally.
[0129] While the flow diagrams in the figures show a particular order of operations performed by certain embodiments of the invention, it should be understood that such order is exemplary (e.g., alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, etc.).
[0130] While the invention has been described in terms of several embodiments, those skilled in the art will recognize that the invention is not limited to the embodiments described and can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus to be regarded as illustrative instead of limiting.
[0131] Some embodiments are as described in the following enumerated embodiments.
[0132] Embodiment 1. A method performed by a network node for encoding a media stream, the method comprising: selecting one or more frames from the media stream; determining a category for the media stream based on the one or more frames; determining, based on the category for the media stream, a set of one or more feature weights; determining one or more feature maps for a frame of the media stream; determining, based on the feature weights and the feature maps, a gaze direction density map; and determining, based on the gaze direction density map, a tile-level foveation map that is to be used to encode the frame of the media stream into an encoded frame that includes encoded tiles of different resolutions.
[0133] Embodiment 2. The method of the previous embodiment, wherein the determining a category for the media stream includes: determining, based on the set of one or more frames, a clutter parameter for the media stream, wherein the clutter parameter is indicative of a degree of clutter of regions of interest for the media stream.
[0134] Embodiment 3. The method of any of the previous embodiments, wherein the determining a category for the media stream includes: determining, based on the set of one or more frames, a camera motion parameter for the media stream, wherein the camera motion parameter is indicative of a degree of motion of the camera for the media stream.
[0135] Embodiment 4. The method of any of the previous embodiments, wherein the determining one or more feature maps includes: determining at least one of a saliency map for the frame; an object detection map for the frame; an optical flow map for the frame; a center and/or horizon bias map for the frame; and a contrast and/or brightness map for the frame.
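To make the flow of Embodiments 1 through 4 concrete, the following Python sketch combines hypothetical feature maps with category-dependent weights into a gaze direction density map and reduces it to a tile-level foveation map. All names, weights, thresholds, and the clutter/motion proxies are illustrative assumptions and do not define the claimed method; feature maps are assumed to be NumPy arrays of equal shape normalized to [0, 1].

```python
# Illustrative sketch of Embodiments 1-4; all weights, thresholds, and proxies are hypothetical.
import numpy as np

# Hypothetical category-dependent feature weights (Embodiment 1).
CATEGORY_WEIGHTS = {
    "static_sparse":    {"saliency": 0.5, "objects": 0.2, "flow": 0.1, "center_bias": 0.2},
    "moving_cluttered": {"saliency": 0.3, "objects": 0.2, "flow": 0.3, "center_bias": 0.2},
}

def categorize(frames):
    """Embodiments 2-3: derive clutter and camera-motion proxies from sampled frames."""
    clutter = np.mean([np.std(f) for f in frames])                         # clutter proxy
    motion = np.mean([np.mean(np.abs(frames[i + 1] - frames[i]))
                      for i in range(len(frames) - 1)])                    # camera-motion proxy
    return "moving_cluttered" if (clutter > 0.25 or motion > 0.05) else "static_sparse"

def gaze_density_map(feature_maps, weights):
    """Embodiment 1: weighted combination of the feature maps (Embodiment 4) into a density map."""
    combined = sum(weights[name] * fmap for name, fmap in feature_maps.items())
    return combined / (combined.sum() + 1e-9)

def tile_foveation_map(density, tile_size=64, levels=(1.0, 0.5, 0.25)):
    """Embodiment 1: map per-tile gaze probability mass to a resolution scale factor."""
    tiles_y, tiles_x = density.shape[0] // tile_size, density.shape[1] // tile_size
    fovea = np.zeros((tiles_y, tiles_x))
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            mass = density[ty * tile_size:(ty + 1) * tile_size,
                           tx * tile_size:(tx + 1) * tile_size].sum()
            # Hypothetical thresholds on gaze mass per tile.
            fovea[ty, tx] = levels[0] if mass > 0.002 else levels[1] if mass > 0.0005 else levels[2]
    return fovea

def foveation_for_frame(frame_feature_maps, sampled_frames):
    """End to end: frames -> category -> weights -> density map -> tile-level foveation map."""
    weights = CATEGORY_WEIGHTS[categorize(sampled_frames)]
    return tile_foveation_map(gaze_density_map(frame_feature_maps, weights))
```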
[0136] Embodiment 5. A network node for encoding a media stream, the network node comprising: processing circuitry configured to perform any of the steps of the preceding embodiments; and power supply circuitry configured to supply power to the processing circuitry.
REFERENCES
1. Y. Xu et al., "Gaze Prediction in Dynamic 360° Immersive Videos," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 5333-5342, doi: 10.1109/CVPR.2018.00559.
2. D. Zanca, S. Melacci and M. Gori, "Gravitational Laws of Focus of Attention," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 12, pp. 2983-2995, 1 Dec. 2020, doi: 10.1109/TPAMI.2019.2920636.
3. D. Zanca, A. Zugarini, S. Dietz, T. Altstidl, M. Ndjeuha, L. Schwinn and B. Eskofier, "Contrastive Language-Image Pretrained Models are Zero-Shot Human Scanpath Predictors," 2023.
4. L. Itti, C. Koch and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11, pp. 1254-1259, Nov. 1998, doi: 10.1109/34.730558.
5. H. Yun, S. Lee and G. Kim, "Panoramic Vision Transformer for Saliency Detection in 360° Videos."
6. T. Foulsham and G. Underwood, "What can saliency models predict about eye movements? Spatial and sequential aspects of fixations during encoding and recognition," Journal of Vision, vol. 8, no. 2, article 6, Feb. 2008, doi: 10.1167/8.2.6.
7. U.S. Patent No. 10,432,970 B1, "System and Method for Encoding 360 Degree Immersive Video," 2019.