MDN Web Docs

Using WebRTC Encoded Transforms

Limited availability

WebRTC Encoded Transforms provide a mechanism to inject a high-performance Stream API for modifying encoded video and audio frames into the incoming and outgoing WebRTC pipelines. This enables use cases such as end-to-end encryption of encoded frames by third-party code.

The API defines both main-thread and worker-side objects. The main-thread interface is an RTCRtpScriptTransform instance, which on construction specifies the Worker that is to implement the transformer code. The transform running in the worker is inserted into the incoming or outgoing WebRTC pipeline by assigning the RTCRtpScriptTransform to RTCRtpReceiver.transform or RTCRtpSender.transform, respectively.

A counterpart RTCRtpScriptTransformer object is created in the worker thread, which has a ReadableStream readable property, a WritableStream writable property, and an options object passed from the associated RTCRtpScriptTransform constructor. Encoded video frames (RTCEncodedVideoFrame) or audio frames (RTCEncodedAudioFrame) from the WebRTC pipeline are enqueued on readable for processing.

The RTCRtpScriptTransformer is made available to code as the transformer property of the rtctransform event, which is fired at the worker global scope whenever an encoded frame is enqueued for processing (and initially on construction of the corresponding RTCRtpScriptTransform). The worker code must implement a handler for the event that reads encoded frames from transformer.readable, modifies them as needed, and writes them to transformer.writable in the same order and without any duplication.

While the interface doesn't place any other restrictions on the implementation, a natural way to transform the frames is to create a pipe chain that sends frames enqueued on the event.transformer.readable stream through a TransformStream to the event.transformer.writable stream. We can use the event.transformer.options property to configure any transform code that depends on whether the transform is enqueuing incoming frames from the packetizer or outgoing frames from a codec.

The RTCRtpScriptTransformer interface also provides methods that can be used when sending encoded video to get the codec to generate a "key" frame, and when receiving video to request that a new key frame be sent. These may be useful to allow a recipient to start viewing the video more quickly, if (for example) they join a conference call when delta frames are being sent.

The following sections show more specifically how to use the framework, using a TransformStream-based implementation.

Test if encoded transforms are supported

Test if encoded transforms are supported by checking for the existence of RTCRtpSender.transform (or RTCRtpReceiver.transform):

js
const supportsEncodedTransforms =
  window.RTCRtpSender && "transform" in RTCRtpSender.prototype;

Adding a transform for outgoing frames

A transform running in a worker is inserted into the outgoing WebRTC pipeline by assigning its corresponding RTCRtpScriptTransform to the RTCRtpSender.transform for an outgoing track.

This example shows how you might stream video from a user's webcam over WebRTC, adding a WebRTC encoded transform to modify the outgoing streams. The code assumes that there is an RTCPeerConnection called peerConnection that is already connected to a remote peer.

First we get a MediaStreamTrack, using getUserMedia() to get a video MediaStream from a media device, and then the MediaStream.getTracks() method to get the first MediaStreamTrack in the stream.

The track is added to the peer connection using addTrack(), which starts streaming it to the remote peer. The addTrack() method returns the RTCRtpSender that is being used to send the track.

js
// Get Video stream and MediaTrack
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
const [track] = stream.getTracks();
const videoSender = peerConnection.addTrack(track, stream);

An RTCRtpScriptTransform is then constructed taking a worker script, which defines the transform, and an optional object that can be used to pass arbitrary messages to the worker (in this case we've used a name property with value "senderTransform" to tell the worker that this transform will be added to the outbound stream). We add the transform to the outgoing pipeline by assigning it to the RTCRtpSender.transform property.

js
// Create a worker containing a TransformStream
const worker = new Worker("worker.js");
videoSender.transform = new RTCRtpScriptTransform(worker, {
  name: "senderTransform",
});

The "Using separate sender and receiver transforms" section below shows how the name might be used in a worker.

Note that you can add the transform at any time, but by adding it immediately after calling addTrack() you ensure that the transform will get the first encoded frame that is sent.

Adding a transform for incoming frames

A transform running in a worker is inserted into the incoming WebRTC pipeline by assigning its corresponding RTCRtpScriptTransform to the RTCRtpReceiver.transform for an incoming track.

This example shows how you add a transform to modify an incoming stream. The code assumes that there is an RTCPeerConnection called peerConnection that is already connected to a remote peer.

First we add an RTCPeerConnection track event handler to catch the event when the peer starts receiving a new track. Within the handler we construct an RTCRtpScriptTransform and add it to event.receiver.transform (event.receiver is an RTCRtpReceiver). As in the previous section, the constructor takes an object with a name property, but here we use receiverTransform as the value to tell the worker that frames are incoming.

js
peerConnection.ontrack = (event) => {
  const worker = new Worker("worker.js");
  event.receiver.transform = new RTCRtpScriptTransform(worker, {
    name: "receiverTransform",
  });
  // received_video is assumed to be a <video> element defined elsewhere on the page
  received_video.srcObject = event.streams[0];
};

Note again that you can add the transform stream at any time. However, adding it in the track event handler ensures that the transform stream will get the first encoded frame for the track.

Worker implementation

The worker script must implement a handler for the rtctransform event, creating a pipe chain that pipes the event.transformer.readable (ReadableStream) stream through a TransformStream to the event.transformer.writable (WritableStream) stream.

A worker might support transforming incoming or outgoing encoded frames, or both, and the transform might be hard coded, or configured at run-time using information passed from the web application.

Basic WebRTC Encoded Transform

The example below shows a basic WebRTC Encoded transform, which negates all bits in queued frames. It does not use or need options passed in from the main thread because the same algorithm can be used in the sender pipeline to negate the bits and in the receiver pipeline to restore them.

The code implements an event handler for the rtctransform event. This constructs a TransformStream, pipes event.transformer.readable through it using ReadableStream.pipeThrough(), and finally pipes the result to event.transformer.writable using ReadableStream.pipeTo().

js
addEventListener("rtctransform", (event) => {
  const transform = new TransformStream({
    start() {}, // Called on startup.
    flush() {}, // Called when the stream is about to be closed.
    async transform(encodedFrame, controller) {
      // Reconstruct the original frame.
      const view = new DataView(encodedFrame.data);

      // Construct a new buffer
      const newData = new ArrayBuffer(encodedFrame.data.byteLength);
      const newView = new DataView(newData);

      // Negate all bits in the incoming frame
      for (let i = 0; i < encodedFrame.data.byteLength; ++i) {
        newView.setInt8(i, ~view.getInt8(i));
      }

      encodedFrame.data = newData;
      controller.enqueue(encodedFrame);
    },
  });
  event.transformer.readable
    .pipeThrough(transform)
    .pipeTo(event.transformer.writable);
});

The implementation of the WebRTC encoded transform is similar to a "generic" TransformStream, but with some important differences. Like the generic stream, its constructor takes an object that defines an optional start() method, which is called on construction, a flush() method, which is called as the stream is about to be closed, and a transform() method, which is called every time there is a chunk to be processed. Unlike the generic constructor, any writableStrategy or readableStrategy properties that are passed in the constructor object are ignored, and the queuing strategy is entirely managed by the user agent.

The transform() method also differs in that it is passed either an RTCEncodedVideoFrame or RTCEncodedAudioFrame rather than a generic "chunk". The actual code shown here for the method isn't notable, other than that it demonstrates how to convert the frame to a form where you can modify it and enqueue it afterwards on the stream.
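The reason one algorithm can serve both pipelines is that bit negation is its own inverse. The following is a minimal sketch of that property outside of any stream; negateBits is a hypothetical helper (not part of the API), and a plain ArrayBuffer stands in for the data property of an encoded frame.

```javascript
// Hypothetical helper: returns a new ArrayBuffer with every bit of the input flipped.
function negateBits(buffer) {
  const input = new Uint8Array(buffer);
  const output = new Uint8Array(input.length);
  for (let i = 0; i < input.length; ++i) {
    // Uint8Array stores the result modulo 256, so ~0x0f becomes 0xf0.
    output[i] = ~input[i];
  }
  return output.buffer;
}

// Stand-in for the payload of an encoded frame (encodedFrame.data is an ArrayBuffer).
const original = new Uint8Array([0x00, 0x0f, 0xff, 0x42]).buffer;

const sent = negateBits(original); // what a sender transform would enqueue
const restored = negateBits(sent); // what a receiver transform would enqueue

console.log(new Uint8Array(sent)); // bytes 0xff, 0xf0, 0x00, 0xbd
console.log(new Uint8Array(restored)); // bytes 0x00, 0x0f, 0xff, 0x42
```

Using Uint8Array rather than DataView is only a stylistic choice in this sketch; the resulting bytes are the same as those produced by the getInt8()/setInt8() loop above.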

Using separate sender and receiver transforms

The previous example works if the transform function is the same when sending and receiving, but in many cases the algorithms will be different. You could use separate worker scripts for the sender and receiver, or handle both cases in one worker as shown below.

If the worker is used for both sender and receiver, it needs to know whether the current encoded frame is outgoing from a codec, or incoming from the packetizer. This information can be specified using the second argument (the options object) of the RTCRtpScriptTransform constructor. For example, we can define a separate RTCRtpScriptTransform for the sender and receiver, passing the same worker, and an options object with a name property that indicates whether the transform is used in the sender or receiver (as shown in the previous sections). The information is then available in the worker in event.transformer.options.

In this example we implement the onrtctransform event handler on the global dedicated worker scope object. The value of the name property is used to determine which TransformStream to construct (the actual constructor methods are not shown).

js
// Code to instantiate transform and attach them to sender/receiver pipelines.
onrtctransform = (event) => {
  let transform;
  if (event.transformer.options.name === "senderTransform")
    transform = createSenderTransform(); // returns a TransformStream
  else if (event.transformer.options.name === "receiverTransform")
    transform = createReceiverTransform(); // returns a TransformStream
  else return;
  event.transformer.readable
    .pipeThrough(transform)
    .pipeTo(event.transformer.writable);
};

Note that the code to create the pipe chain is the same as in the previous example.
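The createSenderTransform() and createReceiverTransform() functions are not defined by the API, so the following is only one possible sketch of them. It assumes a toy, asymmetric scheme invented for illustration: the sender XORs each byte with a fixed value and appends a marker byte, and the receiver strips the marker and reverses the XOR. The names KEY_BYTE, MARKER, protectPayload, and unprotectPayload are hypothetical, and the scheme is not real encryption.

```javascript
const KEY_BYTE = 0x5a; // hypothetical fixed key byte, for illustration only
const MARKER = 0xa5; // hypothetical trailer byte marking a processed frame

// Sender side: XOR every byte and append the marker.
function protectPayload(buffer) {
  const input = new Uint8Array(buffer);
  const output = new Uint8Array(input.length + 1);
  for (let i = 0; i < input.length; ++i) output[i] = input[i] ^ KEY_BYTE;
  output[input.length] = MARKER;
  return output.buffer;
}

// Receiver side: strip the marker and undo the XOR.
// Payloads without the marker are passed through unchanged.
function unprotectPayload(buffer) {
  const input = new Uint8Array(buffer);
  const length = input.length - 1;
  if (length < 0 || input[length] !== MARKER) return buffer;
  const output = new Uint8Array(length);
  for (let i = 0; i < length; ++i) output[i] = input[i] ^ KEY_BYTE;
  return output.buffer;
}

// Wrap a payload function in a TransformStream that rewrites frame.data.
function createFrameTransform(processPayload) {
  return new TransformStream({
    transform(encodedFrame, controller) {
      encodedFrame.data = processPayload(encodedFrame.data);
      controller.enqueue(encodedFrame);
    },
  });
}

function createSenderTransform() {
  return createFrameTransform(protectPayload);
}

function createReceiverTransform() {
  return createFrameTransform(unprotectPayload);
}
```

Keeping the byte manipulation in plain functions, separate from the stream wiring, makes it possible to check the round trip without a peer connection.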

Runtime communication with the transform

The RTCRtpScriptTransform constructor allows you to pass options and transfer objects to the worker. In the previous example we passed static information, but sometimes you might want to modify the transform algorithm in the worker at runtime, or get information back from the worker. For example, a WebRTC conference call that supports encryption might need to add a new key to the algorithm used by the transform.

While it is possible to share information between the worker running the transform code and the main thread using Worker.postMessage(), it is generally easier to share a MessageChannel as an RTCRtpScriptTransform constructor option, because then the channel context is directly available in the event.transformer.options when you are handling a new encoded frame.

The code below creates a MessageChannel and transfers its second port to the worker. The main thread and transform can subsequently communicate using the first and second ports.

js
// Create a worker containing a TransformStream
const worker = new Worker("worker.js");

// Create a channel
// Pass channel.port2 to the transform as a constructor option
// and also transfer it to the worker
const channel = new MessageChannel();
const transform = new RTCRtpScriptTransform(
  worker,
  { purpose: "encrypt", port: channel.port2 },
  [channel.port2],
);

// Use the port1 to send a string.
// (we can send and transfer basic types/objects).
channel.port1.postMessage("A message for the worker");
channel.port1.start();

In the worker the port is available as event.transformer.options.port. The code below shows how you might listen on the port's message event to get messages from the main thread. You can also use the port to send messages back to the main thread.

js
event.transformer.options.port.onmessage = (event) => {
  // The message payload is in 'event.data';
  console.log(event.data);
};
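To change the behavior of a running transform, the message handler needs somewhere to store what it receives. The following is a minimal sketch of one way to do that, assuming a shared state object that the transform() method would read on each frame. The function names, the state shape, and the "key-updated" reply are all hypothetical.

```javascript
// Hypothetical state shared between the port handler and the transform() method.
function createTransformState() {
  return { key: null, messagesReceived: 0 };
}

// Update the state whenever the main thread posts an object with a 'key' property.
function listenForKeyUpdates(port, state) {
  port.onmessage = (messageEvent) => {
    const message = messageEvent.data;
    state.messagesReceived += 1;
    if (message !== null && typeof message === "object" && "key" in message) {
      state.key = message.key;
      // Reply on the same port so the main thread knows the key is in use.
      port.postMessage({ type: "key-updated" });
    }
  };
}

// Intended use inside the worker (not executed here):
// addEventListener("rtctransform", (event) => {
//   const state = createTransformState();
//   listenForKeyUpdates(event.transformer.options.port, state);
//   // ... build a TransformStream whose transform() reads state.key ...
// });
```

Messages that do not carry a key, such as the string posted in the main-thread example above, are counted but otherwise ignored.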

Triggering a key frame

Raw video is rarely sent or stored because it consumes a lot of space and bandwidth to represent each frame as a complete image. Instead, codecs periodically generate a "key frame" that contains enough information to construct a full image, and between key frames send "delta frames" that just include the changes since the previous frame. While this is far more efficient than sending raw video, it means that in order to display the image associated with a particular delta frame, you need the last key frame and all subsequent delta frames.

This can cause a delay for new users joining a WebRTC conference application, because they can't display video until they have received their first key frame. Similarly, if an encoded transform was used to encrypt frames, the recipient would not be able to display video until they get the first key frame encrypted with their key.
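The dependency of delta frames on a preceding key frame can be illustrated with a small sketch. It assumes frames arrive in order with none lost, and uses the "key" and "delta" values that RTCEncodedVideoFrame.type reports; decodableFrames is a hypothetical helper.

```javascript
// For each frame, in arrival order, report whether it could be displayed.
// A frame is displayable only once a key frame has been received.
function decodableFrames(frameTypes) {
  let haveKeyFrame = false;
  return frameTypes.map((type) => {
    if (type === "key") haveKeyFrame = true;
    return haveKeyFrame;
  });
}

// A recipient that joins while delta frames are being sent:
const arrivals = ["delta", "delta", "key", "delta", "delta"];
console.log(decodableFrames(arrivals)); // [false, false, true, true, true]
```

The first two frames are wasted from the viewer's point of view, which is the delay that an early key frame avoids.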

In order to ensure that a new key frame can be sent as early as possible when needed, the RTCRtpScriptTransformer object in event.transformer has two methods: RTCRtpScriptTransformer.generateKeyFrame(), which causes the codec to generate a key frame, and RTCRtpScriptTransformer.sendKeyFrameRequest(), which a receiver can use to request a key frame from the sender.

The example below shows how the main thread might pass an encryption key to a sender transform, and trigger the codec to generate a key frame. Note that the main thread doesn't have direct access to the RTCRtpScriptTransformer object, so it needs to pass the key and restriction identifier ("rid") to the worker (the "rid" is a stream id, which indicates the encoder that must generate the key frame). Here we do that with a MessageChannel, using the same pattern as in the previous section. The code assumes there is already a peer connection, and that videoSender is an RTCRtpSender.

js
const worker = new Worker("worker.js");
const channel = new MessageChannel();

videoSender.transform = new RTCRtpScriptTransform(
  worker,
  { name: "senderTransform", port: channel.port2 },
  [channel.port2],
);

// Post rid and new key to the sender
channel.port1.start();
channel.port1.postMessage({
  rid: "1",
  key: "93ae0927a4f8e527f1gce6d10bc6ab6c",
});

The rtctransform event handler in the worker gets the port and uses it to listen for message events from the main thread. If an event is received it gets the rid and key, and then calls generateKeyFrame().

js
// This code runs inside the rtctransform event handler, where 'event' is the rtctransform event.
event.transformer.options.port.onmessage = (messageEvent) => {
  const { rid, key } = messageEvent.data;
  // key is used by the transformer to encrypt frames (not shown)

  // Get codec to generate a new key frame using the rid
  event.transformer.generateKeyFrame(rid);
};

The code for a receiver to request a new key frame would be almost identical, except that "rid" isn't specified. Here is the code for just the port message handler:

js
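// This code runs inside the rtctransform event handler, where 'event' is the rtctransform event.
event.transformer.options.port.onmessage = (messageEvent) => {
  const { key } = messageEvent.data;
  // key is used by the transformer to decrypt frames (not shown)

  // Request sender to emit a key frame.
  event.transformer.sendKeyFrameRequest();
};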
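The sender and receiver handlers differ only in which method they call, so a worker that serves both could share one handler. The following is a sketch under that assumption: wireKeyFrameControl and its onKey callback are hypothetical, and the role is taken from the name option used in the earlier sections. Both generateKeyFrame() and sendKeyFrameRequest() return promises, so the sketch also reports rejections.

```javascript
// Hypothetical helper: connect the transferred port to the key frame methods
// of an RTCRtpScriptTransformer, choosing the method from options.name.
function wireKeyFrameControl(transformer, onKey) {
  const { name, port } = transformer.options;
  port.onmessage = (messageEvent) => {
    const { rid, key } = messageEvent.data;
    onKey(key); // hand the new key to the encrypt/decrypt code (not shown)

    const request =
      name === "senderTransform"
        ? transformer.generateKeyFrame(rid) // ask the local codec for a key frame
        : transformer.sendKeyFrameRequest(); // ask the remote sender for a key frame

    // Report failures instead of leaving the rejection unhandled.
    Promise.resolve(request).catch((error) => {
      console.error("Key frame request failed:", error);
    });
  };
}

// Intended use inside the worker (not executed here):
// addEventListener("rtctransform", (event) => {
//   wireKeyFrameControl(event.transformer, (key) => {
//     /* store the key for use by transform() */
//   });
// });
```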
