| RFC 9559 | Matroska Format | October 2024 |
| Lhomme, et al. | Standards Track | |
This document defines the Matroska audiovisual data container structure, including definitions of its structural elements, terminology, vocabulary, and application.
This document updates RFC 8794 to permit the use of a previously reserved Extensible Binary Meta Language (EBML) Element ID.
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9559.
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.
Matroska is an audiovisual data container format. It was derived from a project called [MCF] but diverges from it significantly because it is based on EBML (Extensible Binary Meta Language) [RFC8794], a binary derivative of XML. EBML provides significant advantages in terms of future format extensibility, without breaking file support in parsers reading the previous versions.
To avoid any misunderstandings, it is essential to clarify exactly what an audio/video container is:
It is NOT a video or audio compression format (codec).
It is an envelope in which there can be many audio, video, and subtitle streams, allowing the user to store a complete movie or CD in a single file.
Matroska is designed with the future in mind. It incorporates features such as:
Fast seeking in the file
Chapter entries
Full metadata (tags) support
Selectable subtitle/audio/video streams
Modularly expandable
Error resilience (can recover playback even when the stream is damaged)
Streamable over the Internet and local networks (HTTP [RFC9110], FTP [RFC0959], SMB [SMB-CIFS], etc.)
This document covers Matroska versions 1, 2, 3, and 4. Matroska version 4 is the current version. Matroska versions 1 to 3 are no longer maintained. No new elements are expected in files with version numbers 1, 2, or 3.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
This document defines the following terms in order to define the format and application of Matroska:
Matroska: A multimedia container format based on EBML (Extensible Binary Meta Language).
Matroska Reader: A data parser that interprets the semantics of a Matroska document and creates a way for programs to use Matroska.
Matroska Player: A Matroska Reader with the primary purpose of playing audiovisual files, including Matroska documents.
Matroska Writer: A data writer that creates Matroska documents.
Matroska is a Document Type of EBML. This specification is dependent on the EBML specification [RFC8794]. For an understanding of Matroska's EBML Schema, see in particular the sections of the EBML specification that cover EBML Element Types (Section 7), EBML Schema (Section 11.1), and EBML Structure (Section 3).
Because of an oversight, [RFC8794] reserved EBML ID 0x80, which is used by deployed Matroska implementations. For this reason, this specification updates [RFC8794] to make 0x80 a legal EBML ID. Additionally, this specification makes the following updates:
OLD:
One-octet Element IDs MUST be between 0x81 and 0xFE. These items are valuable because they are short, and they need to be used for commonly repeated elements. Element IDs are to be allocated within this range according to the "RFC Required" policy [RFC8126].
The following one-octet Element IDs are RESERVED: 0xFF and 0x80.
NEW:
One-octet Element IDs MUST be between 0x80 and 0xFE. These items are valuable because they are short, and they need to be used for commonly repeated elements. Element IDs are to be allocated within this range according to the "RFC Required" policy [RFC8126].
The following one-octet Element ID is RESERVED: 0xFF.
OLD:
+=========================+================+=================+
| Element ID Octet Length | Range of Valid | Number of Valid |
|                         | Element IDs    | Element IDs     |
+=========================+================+=================+
| 1                       | 0x81 - 0xFE    | 126             |
+-------------------------+----------------+-----------------+
NEW:
+=========================+================+=================+
| Element ID Octet Length | Range of Valid | Number of Valid |
|                         | Element IDs    | Element IDs     |
+=========================+================+=================+
| 1                       | 0x80 - 0xFE    | 127             |
+-------------------------+----------------+-----------------+
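The effect of this update on one-octet Element IDs can be illustrated with a minimal sketch. The function name and the `updated` flag are illustrative assumptions and are not defined by either specification.

```python
def is_valid_one_octet_id(value: int, updated: bool = True) -> bool:
    """Return True if `value` is a valid one-octet EBML Element ID.

    updated=False applies the original RFC 8794 range (0x81 to 0xFE).
    updated=True applies the range as revised by this document (0x80 to 0xFE).
    """
    lowest = 0x80 if updated else 0x81
    return lowest <= value <= 0xFE


# Count the valid IDs under both rules; the totals match the tables above.
old_count = sum(is_valid_one_octet_id(v, updated=False) for v in range(0x100))
new_count = sum(is_valid_one_octet_id(v, updated=True) for v in range(0x100))
print(old_count, new_count)  # 126 127
```

In both cases 0xFF stays reserved; only 0x80 changes status.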
As an EBML Document Type, Matroska adds the following constraints to the EBML specification [RFC8794]:
The Root Element and all Top-Level Elements MUST use 4 octets for their EBML Element ID -- i.e., Segment and direct children of Segment.
Legacy EBML/Matroska parsers did not handle Empty Elements properly; elements were present in the file but had a length of 0. They always assumed the value was 0 for integers/dates or 0x0p+0, the textual expression of floats using the format in [ISO9899], no matter the default value of the element that should have been used instead. Therefore, Matroska Writers MUST NOT use EBML Empty Elements if the element has a default value that is not 0 for integers/dates and 0x0p+0 for floats.
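The 4-octet constraint can be checked from the first octet of an Element ID, because EBML encodes the ID length in the position of the first set bit. The following is a minimal sketch; the function name is hypothetical, and the Element ID values are not taken from this section but are the commonly listed Matroska IDs, so treat them as illustrative inputs.

```python
def element_id_length(first_octet: int) -> int:
    """Length in octets of an EBML Element ID, derived from its first octet.

    The number of leading zero bits before the first set bit, plus one,
    gives the length (1 to 4 octets for Element IDs).
    """
    for length in range(1, 5):
        if first_octet & (0x80 >> (length - 1)):
            return length
    raise ValueError("not a valid first octet for an EBML Element ID")


# Assumed ID values for the Root Element and some Top-Level Elements.
ASSUMED_IDS = {
    "Segment": 0x18538067,
    "SeekHead": 0x114D9B74,
    "Info": 0x1549A966,
    "Tracks": 0x1654AE6B,
    "Cluster": 0x1F43B675,
    "Cues": 0x1C53BB6B,
}

for name, element_id in ASSUMED_IDS.items():
    first_octet = element_id.to_bytes(4, "big")[0]
    # Every ID listed here should satisfy the 4-octet constraint.
    assert element_id_length(first_octet) == 4, name
```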
When adding new elements to Matroska, these rules apply:
A Matroska file MUST be composed of at least one EBML Document using the Matroska Document Type. Each EBML Document MUST start with an EBML Header and MUST be followed by the EBML Root Element, defined as Segment in Matroska. Matroska defines several Top-Level Elements that may occur within the Segment.
As an example, a simple Matroska file consisting of a single EBML Document could be represented like this:
A more complex Matroska file consisting of an EBML Stream (consisting of two EBML Documents) could be represented like this:
The following diagram represents a simple Matroska file, comprised of an EBML Document with an EBML Header, a Segment element (the Root Element), and all eight Matroska Top-Level Elements. In the diagrams in this section, horizontal spacing expresses a parent-child relationship between Matroska elements (e.g., the Info element is contained within the Segment element), whereas vertical alignment represents the storage order within the file.
+-------------+
| EBML Header |
+---------------------------+
| Segment     | SeekHead    |
|             |-------------|
|             | Info        |
|             |-------------|
|             | Tracks      |
|             |-------------|
|             | Chapters    |
|             |-------------|
|             | Cluster     |
|             |-------------|
|             | Cues        |
|             |-------------|
|             | Attachments |
|             |-------------|
|             | Tags        |
+---------------------------+
The Matroska EBML Schema defines eight Top-Level Elements:
SeekHead (Section 6.3)
Info (Section 6.5)
Tracks (Section 18)
Chapters (Section 20)
Cluster (Section 10)
Cues (Section 22)
Attachments (Section 21)
Tags (Section 6.8)
The SeekHead element (also known as MetaSeek) contains an index of Top-Level Elements locations within the Segment. Use of the SeekHead element is RECOMMENDED. Without a SeekHead element, a Matroska parser would have to search the entire file to find all of the other Top-Level Elements. This is due to Matroska's flexible ordering requirements; for instance, it is acceptable for the Chapters element to be stored after the Cluster element(s).
+--------------------------------+
| SeekHead | Seek | SeekID       |
|          |      |--------------|
|          |      | SeekPosition |
+--------------------------------+

SeekHead Element

The Info element contains vital information for identifying the whole Segment. This includes the title for the Segment, a randomly generated unique identifier (UID), and the UID(s) of any linked Segment elements.
+-------------------------+
| Info | SegmentUUID      |
|      |------------------|
|      | SegmentFilename  |
|      |------------------|
|      | PrevUUID         |
|      |------------------|
|      | PrevFilename     |
|      |------------------|
|      | NextUUID         |
|      |------------------|
|      | NextFilename     |
|      |------------------|
|      | SegmentFamily    |
|      |------------------|
|      | ChapterTranslate |
|      |------------------|
|      | TimestampScale   |
|      |------------------|
|      | Duration         |
|      |------------------|
|      | DateUTC          |
|      |------------------|
|      | Title            |
|      |------------------|
|      | MuxingApp        |
|      |------------------|
|      | WritingApp       |
|-------------------------|

Info Element and Its Child Elements

The Tracks element defines the technical details for each track and can store the name, number, UID, language, and type (audio, video, subtitles, etc.) of each track. For example, the Tracks element MAY store information about the resolution of a video track or sample rate of an audio track.
The Tracks element MUST identify all the data needed by the codec to decode the data of the specified track. However, the data required is contingent on the codec used for the track. For example, a Track element for uncompressed audio only requires the audio bit rate to be present. A codec such as AC-3 would require that the CodecID element be present for all tracks, as it is the primary way to identify which codec to use to decode the track.
+------------------------------------+
| Tracks | TrackEntry | TrackNumber  |
|        |            |--------------|
|        |            | TrackUID     |
|        |            |--------------|
|        |            | TrackType    |
|        |            |--------------|
|        |            | Name         |
|        |            |--------------|
|        |            | Language     |
|        |            |--------------|
|        |            | CodecID      |
|        |            |--------------|
|        |            | CodecPrivate |
|        |            |--------------|
|        |            | CodecName    |
|        |            |----------------------------------+
|        |            | Video        | FlagInterlaced    |
|        |            |              |-------------------|
|        |            |              | FieldOrder        |
|        |            |              |-------------------|
|        |            |              | StereoMode        |
|        |            |              |-------------------|
|        |            |              | AlphaMode         |
|        |            |              |-------------------|
|        |            |              | PixelWidth        |
|        |            |              |-------------------|
|        |            |              | PixelHeight       |
|        |            |              |-------------------|
|        |            |              | DisplayWidth      |
|        |            |              |-------------------|
|        |            |              | DisplayHeight     |
|        |            |              |-------------------|
|        |            |              | AspectRatioType   |
|        |            |              |-------------------|
|        |            |              | Colour            |
|        |            |----------------------------------|
|        |            | Audio        | SamplingFrequency |
|        |            |              |-------------------|
|        |            |              | Channels          |
|        |            |              |-------------------|
|        |            |              | BitDepth          |
|--------------------------------------------------------|

Tracks Element and a Selection of Its Descendant Elements

The Chapters element lists all of the chapters. Chapters are a way to set predefined points to jump to in video or audio.
+-----------------------------------------+
| Chapters | Edition | EditionUID         |
|          | Entry   |--------------------|
|          |         | EditionFlagDefault |
|          |         |--------------------|
|          |         | EditionFlagOrdered |
|          |         |---------------------------------+
|          |         | ChapterAtom | ChapterUID        |
|          |         |             |-------------------|
|          |         |             | ChapterStringUID  |
|          |         |             |-------------------|
|          |         |             | ChapterTimeStart  |
|          |         |             |-------------------|
|          |         |             | ChapterTimeEnd    |
|          |         |             |-------------------|
|          |         |             | ChapterFlagHidden |
|          |         |             |-------------------------------+
|          |         |             | ChapterDisplay  | ChapString   |
|          |         |             |                 |--------------|
|          |         |             |                 | ChapLanguage |
+------------------------------------------------------------------+

Chapters Element and a Selection of Its Descendant Elements

Cluster elements contain the content for each track, e.g., video frames. A Matroska file SHOULD contain at least one Cluster element. In the rare case it doesn't, there should be a method for Segments to link together, possibly using Chapters; see Section 17.
The Cluster element helps to break up SimpleBlock or BlockGroup elements and helps with seeking and error protection. Every Cluster element MUST contain a Timestamp element. This SHOULD be the Timestamp element used to play the first Block in the Cluster element, unless a different value is needed to accommodate more Blocks; see Section 11.2.
Cluster elements contain one or more Block elements, such as BlockGroup or SimpleBlock elements. In some situations, a Cluster element MAY contain no Block element, for example, in a live recording when no data has been collected.
A BlockGroup element MAY contain a Block of data and any information relating directly to that Block.
+--------------------------+
| Cluster | Timestamp      |
|         |----------------|
|         | Position       |
|         |----------------|
|         | PrevSize       |
|         |----------------|
|         | SimpleBlock    |
|         |----------------|
|         | BlockGroup     |
+--------------------------+

Cluster Element and Its Immediate Child Elements

+----------------------------------+
| Block | Portion of | Data Type   |
|       | a Block    |  - Bit Flag |
|       |--------------------------+
|       | Header     | TrackNumber |
|       |            |-------------|
|       |            | Timestamp   |
|       |            |-------------|
|       |            | Flags       |
|       |            |  - Gap      |
|       |            |  - Lacing   |
|       |            |  - Reserved |
|       |--------------------------|
|       | Optional   | FrameSize   |
|       |--------------------------|
|       | Data       | Frame       |
+----------------------------------+

Block Element Structure

Each Cluster MUST contain exactly one Timestamp element. The Timestamp element value MUST be stored once per Cluster. The Timestamp element in the Cluster is relative to the entire Segment. The Timestamp element SHOULD be the first element in the Cluster it belongs to or the second element if that Cluster contains a CRC-32 element (Section 6.2).
Additionally, the Block contains an offset that, when added to the Cluster's Timestamp element value, yields the Block's effective timestamp. Therefore, the timestamp in the Block itself is relative to the Timestamp element in the Cluster. For example, if the Timestamp element in the Cluster is set to 10 seconds and a Block in that Cluster is supposed to be played 12 seconds into the clip, the timestamp in the Block would be set to 2 seconds.
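The arithmetic in the example above can be written out as a small sketch. The function names are hypothetical, and the values are expressed in milliseconds purely for illustration; the actual unit depends on the timestamp scaling in effect.

```python
def block_effective_timestamp(cluster_timestamp: int, block_offset: int) -> int:
    """Add the Block's relative timestamp to the Cluster's Timestamp value."""
    return cluster_timestamp + block_offset


def block_offset_for(cluster_timestamp: int, presentation_time: int) -> int:
    """Relative timestamp a writer would store in the Block."""
    return presentation_time - cluster_timestamp


# The example from the text: Cluster at 10 seconds, Block played at 12 seconds.
cluster_ts = 10_000
offset = block_offset_for(cluster_ts, 12_000)
print(offset)  # 2000
print(block_effective_timestamp(cluster_ts, offset))  # 12000
```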
The ReferenceBlock in the BlockGroup is used instead of the basic "P-frame"/"B-frame" description. Instead of simply saying that this Block depends on the Block directly before or directly after, the Timestamp of the necessary Block is used. Because there can be as many ReferenceBlock elements as necessary for a Block, it allows for some extremely complex referencing.
The Cues element is used to seek when playing back a file by providing a temporal index for some of the Tracks. It is similar to the SeekHead element but is used for seeking to a specific time when playing back the file. It is possible to seek without this element, but it is much more difficult because a Matroska Reader would have to "hunt and peck" through the file to look for the correct timestamp.
The Cues element SHOULD contain at least one CuePoint element. Each CuePoint element stores the position of the Cluster that contains the BlockGroup or SimpleBlock element. The timestamp is stored in the CueTime element, and the location is stored in the CueTrackPositions element.
The Cues element is flexible. For instance, the Cues element can be used to index every single timestamp of every Block or they can be indexed selectively.
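One way a reader might use such a temporal index is a binary search over the cue times. The sketch below assumes the CuePoint entries have already been parsed into `(cue_time, cluster_position)` pairs sorted by time; that representation and the function name are illustrative assumptions, not something this document defines.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def find_cue(cues: List[Tuple[int, int]], target_time: int) -> Optional[Tuple[int, int]]:
    """Return the last (cue_time, cluster_position) at or before target_time.

    `cues` must be sorted by cue_time. Returns None when the target lies
    before the first indexed time.
    """
    times = [cue_time for cue_time, _ in cues]
    index = bisect_right(times, target_time) - 1
    if index < 0:
        return None
    return cues[index]


# A selectively indexed file: one cue every 5000 time units.
cues = [(0, 100), (5000, 9000), (10000, 20000)]
print(find_cue(cues, 7000))  # (5000, 9000)
```

A reader would then start parsing at the returned Cluster position and scan forward to the exact timestamp.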
+-------------------------------------+
| Cues | CuePoint | CueTime           |
|      |          |-------------------|
|      |          | CueTrackPositions |
|      |------------------------------|
|      | CuePoint | CueTime           |
|      |          |-------------------|
|      |          | CueTrackPositions |
+-------------------------------------+

Cues Element and Two Levels of Its Descendant Elements

The Attachments element is for attaching files to a Matroska file, such as pictures, fonts, web pages, etc.
+------------------------------------------------+
| Attachments | AttachedFile | FileDescription   |
|             |              |-------------------|
|             |              | FileName          |
|             |              |-------------------|
|             |              | FileMediaType     |
|             |              |-------------------|
|             |              | FileData          |
|             |              |-------------------|
|             |              | FileUID           |
+------------------------------------------------+

Attachments Element

The Tags element contains metadata that describes the Segment and potentially its Tracks, Chapters, and Attachments. Each Track or Chapter that those tags apply to has its UID listed in the Tags. The Tags contain all extra information about the file: scriptwriters, singers, actors, directors, titles, edition, price, dates, genre, comments, etc. Tags can contain their values in multiple languages. For example, a movie's "TITLE" tag value might contain both the original English title as well as the German title.
+-------------------------------------------+
| Tags | Tag | Targets   | TargetTypeValue  |
|      |     |           |------------------|
|      |     |           | TargetType       |
|      |     |           |------------------|
|      |     |           | TagTrackUID      |
|      |     |           |------------------|
|      |     |           | TagEditionUID    |
|      |     |           |------------------|
|      |     |           | TagChapterUID    |
|      |     |           |------------------|
|      |     |           | TagAttachmentUID |
|      |     |------------------------------|
|      |     | SimpleTag | TagName          |
|      |     |           |------------------|
|      |     |           | TagLanguage      |
|      |     |           |------------------|
|      |     |           | TagDefault       |
|      |     |           |------------------|
|      |     |           | TagString        |
|      |     |           |------------------|
|      |     |           | TagBinary        |
|      |     |           |------------------|
|      |     |           | SimpleTag        |
+-------------------------------------------+

Tags Element and Three Levels of Its Children Elements

This specification includes an EBML Schema that defines the elements and structure of Matroska using the EBML Schema elements and attributes defined in Section 11.1 of [RFC8794].
Attributes using their default value (like minOccurs, minver, etc.) or attributes with undefined values (like length, maxver, etc.) are omitted.
The definitions for each Matroska element are provided below.
\Segment
Root Element that contains all other Top-Level Elements; see Section 4.5.

\Segment\SeekHead
Top-Level Elements; see Section 4.5.

\Segment\SeekHead\Seek

\Segment\SeekHead\Seek\SeekPosition
Segment Position (Section 16) of a Top-Level Element.

\Segment\Info
Segment.

\Segment\Info\SegmentUUID
Segment amongst many others (128 bits). It is equivalent to a Universally Unique Identifier (UUID) v4 [RFC9562] with all bits randomly (or pseudorandomly) chosen. An actual UUID v4 value, where some bits are not random, MAY also be used.
Segment is a part of a Linked Segment, then this element is REQUIRED. The value of the UID MUST contain at least one bit set to 1.

\Segment\Info\PrevUUID
Segment of a Linked Segment.
Segment is a part of a Linked Segment that uses Hard Linking (Section 17.1), then either the PrevUUID or the NextUUID element is REQUIRED. If a Segment contains a PrevUUID but not a NextUUID, then it MAY be considered as the last Segment of the Linked Segment. The PrevUUID MUST NOT be equal to the SegmentUUID.

\Segment\Info\PrevFilename
Linked Segment.
PrevUUID SHOULD be considered authoritative for identifying the previous Segment in a Linked Segment.

\Segment\Info\NextUUID
Segment of a Linked Segment.
Segment is a part of a Linked Segment that uses Hard Linking (Section 17.1), then either the PrevUUID or the NextUUID element is REQUIRED. If a Segment contains a NextUUID but not a PrevUUID, then it MAY be considered as the first Segment of the Linked Segment. The NextUUID MUST NOT be equal to the SegmentUUID.

\Segment\Info\SegmentFamily
Segments of a Linked Segment MUST share (128 bits). It is equivalent to a UUID v4 [RFC9562] with all bits randomly (or pseudorandomly) chosen. An actual UUID v4 value, where some bits are not random, MAY also be used.
Segment Info contains a ChapterTranslate element, this element is REQUIRED.

\Segment\Info\ChapterTranslate
Segment and a segment value in the given Chapter Codec.
SegmentUUIDs in Matroska. This allows remuxing a file with Chapter Codec without changing the content of the codec data, just the Segment mapping.

\Segment\Info\ChapterTranslate\ChapterTranslateID
Segment in the chapter codec data. The format depends on the ChapProcessCodecID used; see Section 5.1.7.1.4.15.

\Segment\Info\ChapterTranslate\ChapterTranslateCodec

\Segment\Info\ChapterTranslate\ChapterTranslateEditionUID
ChapterTranslate applies.
ChapterTranslateEditionUID is specified in the ChapterTranslate, the ChapterTranslate applies to all chapter editions found in the Segment using the given ChapterTranslateCodec.

\Segment\Info\TimestampScale
TimestampScale value of 1000000 means scaled timestamps in the Segment are expressed in milliseconds; see Section 11 on how to interpret timestamps.

\Segment\Cluster
Top-Level Element containing the (monolithic) Block structure.

\Segment\Cluster\Timestamp
TimestampScale; see Section 11.1.
Cluster it belongs to or the second if that Cluster contains a CRC-32 element (Section 6.2).

\Segment\Cluster\SimpleBlock
Block (see Section 10.1) but without all the extra information. Mostly used to reduce overhead when no extra feature is needed; see Section 10.2 on SimpleBlock Structure.

\Segment\Cluster\BlockGroup
Block and information specific to that Block.

\Segment\Cluster\BlockGroup\Block
Block containing the actual data to be rendered and a timestamp relative to the Cluster Timestamp; see Section 10.1 on Block Structure.

\Segment\Cluster\BlockGroup\BlockAdditions
Block element; see Section 4.1.5 of [MatroskaCodec] for more information. An EBML parser that has no knowledge of the Block structure could still see and use/skip these data.

\Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAddID
BlockAdditional data; see Section 4.1.5 of [MatroskaCodec] for more information. A value of 1 indicates that the BlockAdditional data is defined by the codec. Any other value indicates that the BlockAdditional data should be handled according to the BlockAddIDType that is located in the TrackEntry.
BlockAddID value MUST be unique between all BlockMore elements found in a BlockAdditions element. To keep MaxBlockAdditionID as low as possible, small values SHOULD be used.

\Segment\Cluster\BlockGroup\BlockDuration
Block, expressed in Track Ticks; see Section 11.1. The BlockDuration element can be useful at the end of a Track to define the duration of the last frame (as there is no subsequent Block available) or when there is a break in a track like for subtitle tracks.

| attribute | note |
|---|---|
| minOccurs | BlockDuration MUST be set (minOccurs=1) if the associated TrackEntry stores a DefaultDuration value. |
| default | If a value is not present and no DefaultDuration is defined, the value is assumed to be the difference between the timestamp of this Block and the timestamp of the next Block in "display" order (not coding order). |
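The two attribute notes for BlockDuration can be sketched as a small helper. This is only an illustration under stated assumptions: the function name, the list-based inputs, and the choice to raise an error when the minOccurs rule is violated are all hypothetical, and timestamps are assumed to be given in "display" order in Track Ticks.

```python
from typing import List, Optional, Sequence


def effective_durations(
    timestamps: Sequence[int],
    block_durations: Sequence[Optional[int]],
    track_has_default_duration: bool,
) -> List[Optional[int]]:
    """Resolve a duration for each Block, given in display order.

    An explicit BlockDuration wins. Without one, and when the track has no
    DefaultDuration, the duration is the gap to the next Block's timestamp.
    The last Block has no successor, so its duration stays unknown (None).
    """
    resolved: List[Optional[int]] = []
    for i, explicit in enumerate(block_durations):
        if explicit is not None:
            resolved.append(explicit)
        elif track_has_default_duration:
            # minOccurs note: BlockDuration must be set in this case.
            raise ValueError(f"Block {i} lacks a required BlockDuration")
        elif i + 1 < len(timestamps):
            resolved.append(timestamps[i + 1] - timestamps[i])
        else:
            resolved.append(None)
    return resolved


print(effective_durations([0, 40, 80], [None, 25, None], False))  # [40, 25, None]
```

The trailing `None` shows why an explicit BlockDuration is useful for the last frame of a track.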
\Segment\Cluster\BlockGroup\ReferencePriority

\Segment\Cluster\BlockGroup\ReferenceBlock
Block in this BlockGroup, expressed in Track Ticks; see Section 11.1. This is used to reference other frames necessary to decode this frame. The relative value SHOULD correspond to a valid Block that this Block depends on. Historically, Matroska Writers didn't write the actual Block(s) that this Block depends on, but they did write some Block(s) in the past.
The value "0" MAY also be used to signify that this Block cannot be decoded on its own, but the necessary reference Block(s) is unknown. In this case, other ReferenceBlock elements MUST NOT be found in the same BlockGroup. If the BlockGroup doesn't have a ReferenceBlock element, then the Block it contains can be decoded without using any other Block data.

\Segment\Cluster\BlockGroup\DiscardPadding
Block, expressed in Matroska Ticks -- i.e., in nanoseconds; see Section 11.1 (padding at the end of the Block for positive values and at the beginning of the Block for negative values). The duration of DiscardPadding is not calculated in the duration of the TrackEntry and SHOULD be discarded during playback.

\Segment\Tracks
Top-Level Element of information with many tracks described.

\Segment\Tracks\TrackEntry

\Segment\Tracks\TrackEntry\TrackType
TrackType defines the type of each frame found in the Track. The value SHOULD be stored on 1 octet.

| value | label | contents of each frame |
|---|---|---|
| 1 | video | An image. |
| 2 | audio | Audio samples. |
| 3 | complex | A mix of different other TrackType. The codec needs to define how the Matroska Player should interpret such data. |
| 16 | logo | An image to be rendered over the video track(s). |
| 17 | subtitle | Subtitle or closed caption data to be rendered over the video track(s). |
| 18 | buttons | Interactive button(s) to be rendered over the video track(s). |
| 32 | control | Metadata used to control the player of the Matroska Player. |
| 33 | metadata | Timed metadata that can be passed on to the Matroska Player. |
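A reader could model the TrackType values from the table above as an enumeration. The class and function names below are illustrative assumptions; the sketch relies only on the table and on EBML unsigned integers being stored big-endian.

```python
from enum import IntEnum


class TrackType(IntEnum):
    """TrackType values as listed in the table above."""
    VIDEO = 1
    AUDIO = 2
    COMPLEX = 3
    LOGO = 16
    SUBTITLE = 17
    BUTTONS = 18
    CONTROL = 32
    METADATA = 33


def parse_track_type(payload: bytes) -> TrackType:
    """Decode the element payload (a big-endian unsigned integer).

    Raises ValueError for a value that is not in the table.
    """
    return TrackType(int.from_bytes(payload, "big"))


print(parse_track_type(b"\x11").name)  # SUBTITLE
```

The 1-octet storage is only a SHOULD, so the sketch accepts longer payloads such as `b"\x00\x01"` as well.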
\Segment\Tracks\TrackEntry\FlagForced

\Segment\Tracks\TrackEntry\DefaultDuration

\Segment\Tracks\TrackEntry\DefaultDecodedFieldDuration

\Segment\Tracks\TrackEntry\TrackTimestampScale

\Segment\Tracks\TrackEntry\MaxBlockAdditionID
BlockAddID (Section 5.1.3.5.2.3). A value of 0 means there is no BlockAdditions (Section 5.1.3.5.2) for this track.

\Segment\Tracks\TrackEntry\BlockAdditionMapping
BlockAddID (Section 5.1.3.5.2.3), or to the track as a whole with BlockAddIDExtraData.

\Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDValue
BlockAddID (Section 5.1.3.5.2.3) value being described.
MaxBlockAdditionID as low as possible, small values SHOULD be used.

\Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDType
Block Additional Mapping to define how the BlockAdditional data should be handled.
BlockAddIDType is 0, the BlockAddIDValue and corresponding BlockAddID values MUST be 1.

\Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDExtraData
BlockAddIDType can use to interpret the BlockAdditional data. The interpretation of the binary data depends on the BlockAddIDType value and the corresponding Block Additional Mapping.

\Segment\Tracks\TrackEntry\Language
LanguageBCP47 element is used in the same TrackEntry.

\Segment\Tracks\TrackEntry\CodecDelay
Cluster.

\Segment\Tracks\TrackEntry\SeekPreRoll

\Segment\Tracks\TrackEntry\TrackTranslate
TrackEntry and a track value in the given Chapter Codec.

\Segment\Tracks\TrackEntry\TrackTranslate\TrackTranslateTrackID
TrackEntry in the chapter codec data. The format depends on the ChapProcessCodecID used; see Section 5.1.7.1.4.15.

\Segment\Tracks\TrackEntry\TrackTranslate\TrackTranslateCodec

\Segment\Tracks\TrackEntry\TrackTranslate\TrackTranslateEditionUID
TrackTranslate applies.
TrackTranslateEditionUID is specified in the TrackTranslate, the TrackTranslate applies to all chapter editions found in the Segment using the given TrackTranslateCodec.

\Segment\Tracks\TrackEntry\Video

\Segment\Tracks\TrackEntry\Video\FlagInterlaced

| value | label | definition |
|---|---|---|
| 0 | undetermined | Unknown status. This value SHOULD be avoided. |
| 1 | interlaced | Interlaced frames. |
| 2 | progressive | No interlacing. |
\Segment\Tracks\TrackEntry\Video\FieldOrder
If FlagInterlaced is not set to 1, this element MUST be ignored.

| value | label | definition |
|---|---|---|
| 0 | progressive | Progressive (non-interlaced) frames. This value SHOULD be avoided; setting FlagInterlaced to 2 is sufficient. |
| 1 | tff | Top field displayed first. Top field stored first. |
| 2 | undetermined | Unknown field order. This value SHOULD be avoided. |
| 6 | bff | Bottom field displayed first. Bottom field stored first. |
| 9 | tff (interleaved) | Top field displayed first. Fields are interleaved in storage with the top line of the top field stored first. |
| 14 | bff (interleaved) | Bottom field displayed first. Fields are interleaved in storage with the top line of the top field stored first. |
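The interaction between FlagInterlaced and FieldOrder can be sketched as follows. The function name is hypothetical, and mapping a value that is not in the table to "undetermined" is an assumption made for this example rather than a rule stated here.

```python
from typing import Optional

# FieldOrder labels as listed in the table above.
FIELD_ORDER_LABELS = {
    0: "progressive",
    1: "tff",
    2: "undetermined",
    6: "bff",
    9: "tff (interleaved)",
    14: "bff (interleaved)",
}


def effective_field_order(flag_interlaced: int, field_order: int) -> Optional[str]:
    """Return the FieldOrder label, or None when FieldOrder must be ignored."""
    if flag_interlaced != 1:
        # FieldOrder is only meaningful for interlaced content.
        return None
    # Assumption: treat values outside the table as an unknown field order.
    return FIELD_ORDER_LABELS.get(field_order, "undetermined")


print(effective_field_order(1, 6))  # bff
print(effective_field_order(2, 6))  # None
```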
\Segment\Tracks\TrackEntry\Video\StereoMode

| value | label |
|---|---|
| 0 | mono |
| 1 | side by side (left eye first) |
| 2 | top - bottom (right eye is first) |
| 3 | top - bottom (left eye is first) |
| 4 | checkboard (right eye is first) |
| 5 | checkboard (left eye is first) |
| 6 | row interleaved (right eye is first) |
| 7 | row interleaved (left eye is first) |
| 8 | column interleaved (right eye is first) |
| 9 | column interleaved (left eye is first) |
| 10 | anaglyph (cyan/red) |
| 11 | side by side (right eye first) |
| 12 | anaglyph (green/magenta) |
| 13 | both eyes laced in one Block (left eye is first) |
| 14 | both eyes laced in one Block (right eye is first) |
\Segment\Tracks\TrackEntry\Video\AlphaMode
BlockAdditional element with BlockAddID of "1" contains Alpha data as defined by the Codec Mapping for the CodecID. Undefined values (i.e., values other than 0 or 1) SHOULD NOT be used, as the behavior of known implementations is different.

| value | label | definition |
|---|---|---|
| 0 | none | The BlockAdditional element with BlockAddID of "1" does not exist or SHOULD NOT be considered as containing such data. |
| 1 | present | The BlockAdditional element with BlockAddID of "1" contains alpha channel data. |

\Segment\Tracks\TrackEntry\Video\OldStereoMode
StereoMode value used in old versions of [libmatroska].

| value | label |
|---|---|
| 0 | mono |
| 1 | right eye |
| 2 | left eye |
| 3 | both eyes |
\Segment\Tracks\TrackEntry\Video\DisplayWidth

| attribute | note |
|---|---|
| default | If the DisplayUnit of the same TrackEntry is 0, then the default value for DisplayWidth is equal to PixelWidth - PixelCropLeft - PixelCropRight; else, there is no default value. |

\Segment\Tracks\TrackEntry\Video\DisplayHeight

| attribute | note |
|---|---|
| default | If the DisplayUnit of the same TrackEntry is 0, then the default value for DisplayHeight is equal to PixelHeight - PixelCropTop - PixelCropBottom; else, there is no default value. |
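The default rules for DisplayWidth and DisplayHeight amount to a simple subtraction, sketched below. The function name is hypothetical, and defaulting the crop values to 0 when they are absent is an assumption of this example.

```python
from typing import Optional, Tuple


def default_display_size(
    pixel_width: int,
    pixel_height: int,
    crop_left: int = 0,
    crop_right: int = 0,
    crop_top: int = 0,
    crop_bottom: int = 0,
    display_unit: int = 0,
) -> Optional[Tuple[int, int]]:
    """Default (DisplayWidth, DisplayHeight), or None if no default exists.

    A default is only defined when DisplayUnit is 0 (pixels).
    """
    if display_unit != 0:
        return None
    width = pixel_width - crop_left - crop_right
    height = pixel_height - crop_top - crop_bottom
    return (width, height)


# A 1920x1080 frame with 140 pixels cropped from the top and the bottom.
print(default_display_size(1920, 1080, crop_top=140, crop_bottom=140))  # (1920, 800)
```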
\Segment\Tracks\TrackEntry\Video\DisplayUnit
DisplayWidth and DisplayHeight are interpreted.

| value | label |
|---|---|
| 0 | pixels |
| 1 | centimeters |
| 2 | inches |
| 3 | display aspect ratio |
| 4 | unknown |

\Segment\Tracks\TrackEntry\Video\UncompressedFourCC
Track's data as a FourCC. This value is similar in scope to the biCompression value of AVI's BITMAPINFO [AVIFormat]. There is neither a definitive list of FourCC values nor an official registry. Some common values for YUV pixel formats can be found at [MSYUV8], [MSYUV16], and [FourCC-YUV]. Some common values for uncompressed RGB pixel formats can be found at [MSRGB] and [FourCC-RGB].

| attribute | note |
|---|---|
| minOccurs | UncompressedFourCC MUST be set (minOccurs=1) in TrackEntry when the CodecID element of the TrackEntry is set to "V_UNCOMPRESSED". |
\Segment\Tracks\TrackEntry\Video\Colour\MatrixCoefficients
MatrixCoefficients are adopted from Table 4 of [ITU-H.273].

| value | label |
|---|---|
| 0 | Identity |
| 1 | ITU-R BT.709 |
| 2 | unspecified |
| 3 | reserved |
| 4 | US FCC 73.682 |
| 5 | ITU-R BT.470BG |
| 6 | SMPTE 170M |
| 7 | SMPTE 240M |
| 8 | YCoCg |
| 9 | BT2020 Non-constant Luminance |
| 10 | BT2020 Constant Luminance |
| 11 | SMPTE ST 2085 |
| 12 | Chroma-derived Non-constant Luminance |
| 13 | Chroma-derived Constant Luminance |
| 14 | ITU-R BT.2100-0 |
\Segment\Tracks\TrackEntry\Video\Colour\ChromaSubsamplingHorz
ChromaSubsamplingHorz SHOULD be set to 1.

\Segment\Tracks\TrackEntry\Video\Colour\ChromaSubsamplingVert
ChromaSubsamplingVert SHOULD be set to 1.

\Segment\Tracks\TrackEntry\Video\Colour\CbSubsamplingHorz
ChromaSubsamplingHorz. Example: For video with 4:2:1 chroma subsampling, the ChromaSubsamplingHorz SHOULD be set to 1, and CbSubsamplingHorz SHOULD be set to 1.

\Segment\Tracks\TrackEntry\Video\Colour\ChromaSitingHorz

| value | label |
|---|---|
| 0 | unspecified |
| 1 | left collocated |
| 2 | half |

\Segment\Tracks\TrackEntry\Video\Colour\ChromaSitingVert

| value | label |
|---|---|
| 0 | unspecified |
| 1 | top collocated |
| 2 | half |

\Segment\Tracks\TrackEntry\Video\Colour\Range

| value | label |
|---|---|
| 0 | unspecified |
| 1 | broadcast range |
| 2 | full range (no clipping) |
| 3 | defined by MatrixCoefficients / TransferCharacteristics |
\Segment\Tracks\TrackEntry\Video\Colour\TransferCharacteristics
TransferCharacteristics are adopted from Table 3 of [ITU-H.273].

| value | label |
|---|---|
| 0 | reserved |
| 1 | ITU-R BT.709 |
| 2 | unspecified |
| 3 | reserved2 |
| 4 | Gamma 2.2 curve - BT.470M |
| 5 | Gamma 2.8 curve - BT.470BG |
| 6 | SMPTE 170M |
| 7 | SMPTE 240M |
| 8 | Linear |
| 9 | Log |
| 10 | Log Sqrt |
| 11 | IEC 61966-2-4 |
| 12 | ITU-R BT.1361 Extended Colour Gamut |
| 13 | IEC 61966-2-1 |
| 14 | ITU-R BT.2020 10 bit |
| 15 | ITU-R BT.2020 12 bit |
| 16 | ITU-R BT.2100 Perceptual Quantization |
| 17 | SMPTE ST 428-1 |
| 18 | ARIB STD-B67 (HLG) |
\Segment\Tracks\TrackEntry\Video\Colour\Primaries
Primaries are adopted from Table 2 of [ITU-H.273].
| value | label |
|---|---|
0 | reserved |
1 | ITU-R BT.709 |
2 | unspecified |
3 | reserved2 |
4 | ITU-R BT.470M |
5 | ITU-R BT.470BG - BT.601 625 |
6 | ITU-R BT.601 525 - SMPTE 170M |
7 | SMPTE 240M |
8 | FILM |
9 | ITU-R BT.2020 |
10 | SMPTE ST 428-1 |
11 | SMPTE RP 432-2 |
12 | SMPTE EG 432-2 |
22 | EBU Tech. 3213-E - JEDEC P22 phosphors |
\Segment\Tracks\TrackEntry\Video\Projection\ProjectionType
| value | label |
|---|---|
0 | rectangular |
1 | equirectangular |
2 | cubemap |
3 | mesh |
\Segment\Tracks\TrackEntry\Video\Projection\ProjectionPrivate
If ProjectionType equals 0 (rectangular), then this element MUST NOT be present.
If ProjectionType equals 1 (equirectangular), then this element MUST be present and contain the same binary data that would be stored inside an ISOBMFF Equirectangular Projection Box ("equi").
If ProjectionType equals 2 (cubemap), then this element MUST be present and contain the same binary data that would be stored inside an ISOBMFF Cubemap Projection Box ("cbmp").
If ProjectionType equals 3 (mesh), then this element MUST be present and contain the same binary data that would be stored inside an ISOBMFF Mesh Projection Box ("mshp").
\Segment\Tracks\TrackEntry\Video\Projection\ProjectionPoseYaw
ProjectionPosePitch or ProjectionPoseRoll rotations. The value of this element MUST be in the -180 to 180 degree range, both inclusive.
Setting ProjectionPoseYaw to 180 or -180 degrees with ProjectionPoseRoll and ProjectionPosePitch set to 0 degrees flips the image horizontally.
\Segment\Tracks\TrackEntry\Video\Projection\ProjectionPosePitch
ProjectionPoseYaw rotation and before the ProjectionPoseRoll rotation. The value of this element MUST be in the -90 to 90 degree range, both inclusive.
\Segment\Tracks\TrackEntry\Video\Projection\ProjectionPoseRoll
ProjectionPoseYaw and ProjectionPosePitch rotations. The value of this element MUST be in the -180 to 180 degree range, both inclusive. Setting ProjectionPoseRoll to 180 or -180 degrees and ProjectionPoseYaw to 180 or -180 degrees with ProjectionPosePitch set to 0 degrees flips the image vertically. Setting ProjectionPoseRoll to 180 or -180 degrees with ProjectionPoseYaw and ProjectionPosePitch set to 0 degrees flips the image horizontally and vertically.
\Segment\Tracks\TrackEntry\Audio
\Segment\Tracks\TrackEntry\Audio\OutputSamplingFrequency
| attribute | note |
|---|---|
| default | The default value for OutputSamplingFrequency of the same TrackEntry is equal to the SamplingFrequency. |
\Segment\Tracks\TrackEntry\TrackOperation
\Segment\Tracks\TrackEntry\TrackOperation\TrackCombinePlanes\TrackPlane\TrackPlaneType
| value | label |
|---|---|
0 | left eye |
1 | right eye |
2 | background |
\Segment\Tracks\TrackEntry\ContentEncodings
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentEncodingOrder
ContentEncoding of the ContentEncodings. The decoder/demuxer MUST start with the ContentEncoding with the highest ContentEncodingOrder and work its way down to the ContentEncoding with the lowest ContentEncodingOrder. This value MUST be unique for each ContentEncoding found in the ContentEncodings of this TrackEntry.
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentEncodingScope
| value | label | definition |
|---|---|---|
0x1 | Block | All frame contents, excluding lacing data. |
0x2 | Private | The track's CodecPrivate data. |
0x4 | Next | The next ContentEncoding (next ContentEncodingOrder; the data inside ContentCompression and/or ContentEncryption). This value SHOULD NOT be used, as it's not supported by players. |
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentEncodingType
| value | label |
|---|---|
0 | Compression |
1 | Encryption |
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentCompression
ContentEncodingType is 0 and absent otherwise. Each block MUST be decompressible even if no previous block is available, so that seeking is not prevented.
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentCompression\ContentCompAlgo
A Matroska Writer SHOULD NOT use these compression methods by default. A Matroska Reader MAY support methods "1" and "2" and SHOULD support other methods.
| value | label | definition |
|---|---|---|
0 | zlib | zlib compression [RFC1950]. |
1 | bzlib | bzip2 compression [BZIP2] SHOULD NOT be used; see usage notes. |
2 | lzo1x | Lempel-Ziv-Oberhumer compression [LZO] SHOULD NOT be used; see usage notes. |
3 | Header Stripping | Octets in ContentCompSettings (Section 5.1.4.1.31.7) have been stripped from each frame. |
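As a rough illustration of the table above, the following Python sketch undoes a single ContentCompression step for one frame. The function name `undo_compression` and its parameters are hypothetical, and only methods 0 (zlib) and 3 (Header Stripping) are handled; for Header Stripping, it assumes the stripped octets are simply prepended to each frame.

```python
import zlib

# ContentCompAlgo values from the table above.
ZLIB, BZLIB, LZO1X, HEADER_STRIPPING = 0, 1, 2, 3


def undo_compression(frame: bytes, algo: int, settings: bytes = b"") -> bytes:
    """Return the original octets of one frame (hypothetical helper).

    `settings` stands for the content of ContentCompSettings.
    """
    if algo == ZLIB:
        # zlib-wrapped data as defined in RFC 1950.
        return zlib.decompress(frame)
    if algo == HEADER_STRIPPING:
        # The octets removed from the beginning of each frame are stored once.
        return settings + frame
    raise NotImplementedError(f"ContentCompAlgo {algo} is not handled in this sketch")
```

Each frame is restored on its own, which matches the requirement that a block be decompressible without any previous block.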
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentCompression\ContentCompSettings
For Header Stripping (ContentCompAlgo=3), the bytes that were removed from the beginning of each frame of the track.
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentEncryption
ContentEncodingType is 1 (encryption) and MUST be ignored otherwise. A Matroska Player MAY support encryption.
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentEncryption\ContentEncAlgo
| value | label | definition |
|---|---|---|
0 | Not encrypted | The data are not encrypted. |
1 | DES | Data Encryption Standard (DES) [FIPS46-3]. This value SHOULD be avoided. |
2 | 3DES | Triple Data Encryption Algorithm [SP800-67]. This value SHOULD be avoided. |
3 | Twofish | Twofish Encryption Algorithm [Twofish]. |
4 | Blowfish | Blowfish Encryption Algorithm [Blowfish]. This value SHOULD be avoided. |
5 | AES | Advanced Encryption Standard (AES) [FIPS197]. |
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentEncryption\ContentEncAESSettings
| attribute | note |
|---|---|
| maxOccurs | ContentEncAESSettings MUST NOT be set (maxOccurs=0) if ContentEncAlgo is not AES (5). |
\Segment\Tracks\TrackEntry\ContentEncodings\ContentEncoding\ContentEncryption\ContentEncAESSettings\AESSettingsCipherMode
| value | label | definition |
|---|---|---|
1 | AES-CTR | Counter [SP800-38A] |
2 | AES-CBC | Cipher Block Chaining [SP800-38A] |

| attribute | note |
|---|---|
| maxOccurs | AESSettingsCipherMode MUST NOT be set (maxOccurs=0) if ContentEncAlgo is not AES (5). |
\Segment\Cues
Top-Level Element to speed seeking access. All entries are local to the Segment.
| attribute | note |
|---|---|
| minOccurs | This element SHOULD be set when the Segment is not transmitted as a live stream; see Section 23.2. |
\Segment\Cues\CuePoint
Segment.
\Segment\Cues\CuePoint\CueTime
TimestampScale; see Section 11.1.
\Segment\Cues\CuePoint\CueTrackPositions
\Segment\Cues\CuePoint\CueTrackPositions\CueClusterPosition
Segment Position (Section 16) of the Cluster containing the associated Block.
\Segment\Cues\CuePoint\CueTrackPositions\CueDuration
TimestampScale; see Section 11.1. If missing, the track's DefaultDuration does not apply and no duration information is available in terms of the cues.
\Segment\Attachments
\Segment\Chapters
\Segment\Chapters\EditionEntry
Segment edition.
\Segment\Chapters\EditionEntry\+ChapterAtom
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterTimeStart
Chapter, expressed in Matroska Ticks -- i.e., in nanoseconds; see Section 11.1.
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterTimeEnd
Chapter (timestamp excluded), expressed in Matroska Ticks -- i.e., in nanoseconds; see Section 11.1. The value MUST be greater than or equal to the ChapterTimeStart of the same ChapterAtom.
ChapterTimeEnd timestamp value being excluded, it MUST take into account the duration of the last frame it includes, especially for the ChapterAtom using the last frames of the Segment.
| attribute | note |
|---|---|
| minOccurs | ChapterTimeEnd MUST be set (minOccurs=1) if the Edition is an ordered edition; see Section 20.1.3. If it's a Parent Chapter, see Section 20.2.3. |
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterFlagHidden
Chapter flags).
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterSegmentUUID
SegmentUUID of another Segment to play during this chapter.
SegmentUUID value of the Segment it belongs to.
| attribute | note |
|---|---|
| minOccurs | ChapterSegmentUUID MUST be set (minOccurs=1) if ChapterSegmentEditionUID is used; see Section 17.2 on Medium-Linking Segments. |
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterSegmentEditionUID
EditionUID to play from the Segment linked in ChapterSegmentUUID. If ChapterSegmentEditionUID is undeclared, then no Edition of the Linked Segment is used; see Section 17.2 on Medium-Linking Segments.
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterPhysicalEquiv
ChapterAtom, e.g., "DVD" (60) or "SIDE" (50); see Section 20.4 for a complete list of values.
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterDisplay\ChapLanguage
ChapLanguageBCP47 element is used within the same ChapterDisplay element.
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterDisplay\ChapLanguageBCP47
ChapString, in the form defined in [RFC5646]; see Section 12 on language codes. If a ChapLanguageBCP47 element is used, then any ChapLanguage and ChapCountry elements used in the same ChapterDisplay MUST be ignored.
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapterDisplay\ChapCountry
ChapLanguageBCP47 element is used within the same ChapterDisplay element.
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapProcess\ChapProcessCodecID
| value | label | definition |
|---|---|---|
0 | Matroska Script | Chapter commands using the Matroska Script codec. |
1 | DVD-menu | Chapter commands using the DVD-like codec. |
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapProcess\ChapProcessPrivate
ChapProcessCodecID information. For ChapProcessCodecID = 1, it is the "DVD level" equivalent; see Section 20.3 on DVD menus.
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapProcess\ChapProcessCommand\ChapProcessTime
| value | label |
|---|---|
0 | during the whole chapter |
1 | before starting playback |
2 | after playback of the chapter |
\Segment\Chapters\EditionEntry\+ChapterAtom\ChapProcess\ChapProcessCommand\ChapProcessData
ChapProcessCodecID value. For ChapProcessCodecID = 1, the data correspond to the binary DVD cell pre/post commands; see Section 20.3 on DVD menus.
\Segment\Tags
Tracks, Editions, Chapters, Attachments, or the Segment as a whole. A list of valid tags can be found in [MatroskaTags].
\Segment\Tags\Tag
\Segment\Tags\Tag\Targets
Segment.
\Segment\Tags\Tag\Targets\TargetTypeValue
TargetTypeValue values are meant to be compared. Higher values MUST correspond to a logical level that contains the lower logical level TargetTypeValue values.
| value | label | definition |
|---|---|---|
70 | COLLECTION | The highest hierarchical level that tags can describe. |
60 | EDITION / ISSUE / VOLUME / OPUS / SEASON / SEQUEL | A list of lower levels grouped together. |
50 | ALBUM / OPERA / CONCERT / MOVIE / EPISODE | The most common grouping level of music and video (e.g., an episode for TV series). |
40 | PART / SESSION | When an album or episode has different logical parts. |
30 | TRACK / SONG / CHAPTER | The common parts of an album or movie. |
20 | SUBTRACK / MOVEMENT / SCENE | Corresponds to parts of a track for audio, such as a movement or scene in a movie. |
10 | SHOT | The lowest hierarchy found in music or movies. |
\Segment\Tags\Tag\Targets\TargetType
| value | label |
|---|---|
COLLECTION | TargetTypeValue 70 |
EDITION | TargetTypeValue 60 |
ISSUE | TargetTypeValue 60 |
VOLUME | TargetTypeValue 60 |
OPUS | TargetTypeValue 60 |
SEASON | TargetTypeValue 60 |
SEQUEL | TargetTypeValue 60 |
ALBUM | TargetTypeValue 50 |
OPERA | TargetTypeValue 50 |
CONCERT | TargetTypeValue 50 |
MOVIE | TargetTypeValue 50 |
EPISODE | TargetTypeValue 50 |
PART | TargetTypeValue 40 |
SESSION | TargetTypeValue 40 |
TRACK | TargetTypeValue 30 |
SONG | TargetTypeValue 30 |
CHAPTER | TargetTypeValue 30 |
SUBTRACK | TargetTypeValue 20 |
MOVEMENT | TargetTypeValue 20 |
SCENE | TargetTypeValue 20 |
SHOT | TargetTypeValue 10 |
\Segment\Tags\Tag\Targets\TagTrackUID
Track(s) that the tags belong to.
Segment. If set to any other value, it MUST match the TrackUID value of a track found in this Segment.
\Segment\Tags\Tag\Targets\TagEditionUID
EditionEntry(s) that the tags belong to.
Segment. If set to any other value, it MUST match the EditionUID value of an edition found in this Segment.
\Segment\Tags\Tag\Targets\TagChapterUID
Chapter(s) that the tags belong to.
Segment. If set to any other value, it MUST match the ChapterUID value of a chapter found in this Segment.
\Segment\Tags\Tag\Targets\TagAttachmentUID
Segment. If set to any other value, it MUST match the FileUID value of an attachment found in this Segment.
\Segment\Tags\Tag\+SimpleTag
\Segment\Tags\Tag\+SimpleTag\TagLanguage
TagLanguageBCP47 element is used within the same SimpleTag element.
\Segment\Tags\Tag\+SimpleTag\TagLanguageBCP47
TagString, in the form defined in [RFC5646]; see Section 12 on language codes. If this element is used, then any TagLanguage elements used in the same SimpleTag MUST be ignored.
With the exceptions of the EBML Header and the CRC-32 element, the EBML specification [RFC8794] does not require any particular storage order for elements. However, this specification defines mandates and recommendations for ordering certain elements to facilitate better playback, seeking, and editing efficiency. This section describes and offers rationale for ordering requirements and recommendations for Matroska.
The Info element is the only REQUIRED Top-Level Element in a Matroska file. To be playable, a Matroska file MUST also contain at least one Tracks element and one Cluster element. The first Info element and the first Tracks element either MUST be stored before the first Cluster element or SHALL both be referenced by a SeekHead element occurring before the first Cluster element.
All Top-Level Elements MUST use a 4-octet EBML Element ID.
When using Medium Linking, chapters are used to reference other Segments to play in a given order (see Section 17.2). A Segment containing these Linked Chapters does not require a Tracks element or a Cluster element.
It is possible to edit a Matroska file after it has been created. For example, chapters, tags, or attachments can be added. When new Top-Level Elements are added to a Matroska file, the SeekHead element(s) MUST be updated so that the SeekHead element(s) itemize the identity and position of all Top-Level Elements.
Editing, removing, or adding elements to a Matroska file often requires that some existing elements be voided or extended. Transforming the existing elements into Void elements as padding can be used as a method to avoid moving large amounts of data around.
As noted by the EBML specification [RFC8794], if a CRC-32 element is used, then the CRC-32 element MUST be the first ordered element within its Parent Element.
In Matroska, all Top-Level Elements of an EBML Document SHOULD include a CRC-32 element as their first Child Element. The Segment element, which is the Root Element, SHOULD NOT have a CRC-32 element.
If used, the first SeekHead element MUST be the first non-CRC-32 Child element of the Segment element. If a second SeekHead element is used, then the first SeekHead element MUST reference the identity and position of the second SeekHead element.
Additionally, the second SeekHead element MUST only reference Cluster elements and not any other Top-Level Element already contained within the first SeekHead element.
The second SeekHead element MAY be stored in any order relative to the other Top-Level Elements. Whether one or two SeekHead elements are used, the SeekHead element(s) MUST collectively reference the identity and position of all Top-Level Elements except for the first SeekHead element.
The Cues element is RECOMMENDED to optimize seeking access in Matroska. It is programmatically simpler to add the Cues element after all Cluster elements have been written because this does not require a prediction of how much space to reserve before writing the Cluster elements. However, storing the Cues element before the Cluster elements can provide some seeking advantages. If the Cues element is present, then it SHOULD either be stored before the first Cluster element or be referenced by a SeekHead element.
The first Info element SHOULD occur before the first Tracks element and first Cluster element except when referenced by a SeekHead element.
The Chapters element SHOULD be placed before the Cluster element(s). The Chapters element can be used during playback even if the user does not need to seek. It immediately gives the user information about what section is being read and what other sections are available.
In the case of Ordered Chapters, it is RECOMMENDED to evaluate the logical linking before playing. The Chapters element SHOULD be placed before the first Tracks element and after the first Info element.
The Attachments element is not intended to be used by default when playing the file but could contain information relevant to the content, such as cover art or fonts. Cover art is useful even before the file is played, and fonts could be needed before playback starts for the initialization of subtitles. The Attachments element MAY be placed before the first Cluster element; however, if the Attachments element is likely to be edited, then it SHOULD be placed after the last Cluster element.
The Tags element is the element most subject to change after the file was originally created. For easier editing, the Tags element can be placed at the end of the Segment element, even after the Attachments element. On the other hand, it is inconvenient to have to seek in the Segment for tags, especially for network streams; thus, it's better if the Tags element is found early in the stream. When editing the Tags element, the original Tags element at the beginning can be overwritten with a Void element and a new Tags element written at the end of the Segment element. The file and Segment sizes will only marginally change.
Matroska is based on the principle that a reading application does not have to support 100% of the specifications in order to be able to play the file. Therefore, a Matroska file contains version indicators that tell a reading application what to expect.
It is possible and valid to have the version fields indicate that the file contains Matroska elements from a higher specification version number while signaling that a reading application MUST only support a lower version number properly in order to play it back (possibly with a reduced feature set).
The EBML Header of each Matroska document informs the reading application on what version of Matroska to expect. The elements within the EBML Header with jurisdiction over this information are DocTypeVersion and DocTypeReadVersion.
DocTypeVersion MUST be equal to or greater than the highest Matroska version number of any element present in the Matroska file. For example, a file using the SimpleBlock element (Section 5.1.3.4) MUST have a DocTypeVersion equal to or greater than 2. A file containing CueRelativePosition elements (Section 5.1.5.1.2.3) MUST have a DocTypeVersion equal to or greater than 4.
The DocTypeReadVersion MUST contain the minimum version number that a reading application needs to support in order to play the file back -- optionally with a reduced feature set. For example, if a file contains only elements of version 2 or lower except for CueRelativePosition (which is a version 4 Matroska element), then DocTypeReadVersion SHOULD still be set to 2 and not 4 because evaluating CueRelativePosition is not necessary for standard playback -- it makes seeking more precise if used.
A reading application supporting Matroska version V MUST NOT refuse to read a file with DocTypeReadVersion equal to or lower than V, even if DocTypeVersion is greater than V.
A reading application supporting at least Matroska version V and reading a file whose DocTypeReadVersion field is equal to or lower than V MUST skip Matroska/EBML elements it encounters but does not know about if that unknown element fits into the size constraints set by the current Parent Element.
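The version rules above can be sketched as a small check. This is only an illustration: the function `check_versions` and its return strings are hypothetical, and a real reader would also need to validate the EBML Header itself.

```python
def check_versions(supported: int, doc_type_version: int,
                   doc_type_read_version: int) -> str:
    """Classify a file for a reader supporting Matroska version `supported`.

    Hypothetical helper; the arguments mirror the DocTypeVersion and
    DocTypeReadVersion elements of the EBML Header.
    """
    if doc_type_read_version > doc_type_version:
        raise ValueError("DocTypeReadVersion cannot exceed DocTypeVersion")
    if doc_type_read_version > supported:
        # The rules above do not oblige the reader to accept this file.
        return "unsupported"
    if doc_type_version > supported:
        # MUST NOT refuse: unknown elements are skipped while reading.
        return "playable, unknown elements skipped"
    return "playable"
```

For instance, a version 2 reader given a file with DocTypeVersion 4 and DocTypeReadVersion 2 lands in the middle case.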
It is sometimes necessary to create a Matroska file from another Matroska file, for example, to add subtitles in another language or to edit out a portion of the content. Some values from the original Matroska file need to be kept the same in the destination file. For example, the SamplingFrequency of an audio track wouldn't change between the two files. Some other values may change between the two files, for example, the TrackNumber of an audio track when another track has been added.
An element is marked with a property "stream copy: True" when the values of that element need to be kept identical between the source and destination files. If that property is not set, elements may or may not keep the same value between the source and destination files.
The DefaultDecodedFieldDuration element can signal to the displaying application how often fields of a video sequence will be available for displaying. It can be used for both interlaced and progressive content.
If the video sequence is signaled as interlaced (Section 5.1.4.1.28.1), then DefaultDecodedFieldDuration equals the period between two successive fields at the output of the decoding process. For video sequences signaled as progressive, DefaultDecodedFieldDuration is half of the period between two successive frames at the output of the decoding process.
These values are valid at the end of the decoding process before post-processing (such as deinterlacing or inverse telecine) is applied.
Examples:
Blu-ray movie: 1000000000 ns/(48/1.001) = 20854167 ns
PAL broadcast/DVD: 1000000000 ns/(50/1.000) = 20000000 ns
N/ATSC broadcast: 1000000000 ns/(60/1.001) = 16683333 ns
Hard-telecined DVD: 1000000000 ns/(60/1.001) = 16683333 ns (60 encoded interlaced fields per second)
Soft-telecined DVD: 1000000000 ns/(60/1.001) = 16683333 ns (48 encoded interlaced fields per second, with "repeat_first_field = 1")
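The arithmetic in these examples can be reproduced with exact rational numbers. The following Python sketch is illustrative only; the function name is hypothetical, and rounding to the nearest nanosecond is an assumption inferred from the values listed above.

```python
from fractions import Fraction


def default_decoded_field_duration_ns(fields_per_second: Fraction) -> int:
    """Period between two successive decoded fields, in nanoseconds."""
    return round(Fraction(1_000_000_000) / fields_per_second)


# Field rates taken from the examples above (x/1.001 written as x*1000/1001).
BLU_RAY = Fraction(48_000, 1001)
PAL = Fraction(50)
NTSC = Fraction(60_000, 1001)
```

For progressive content, the field rate passed in would be twice the frame rate, since the element holds half of the frame period.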
Frames using references SHOULD be stored in "coding order" (i.e., the references first and then the frames referencing them). A consequence is that timestamps might not be consecutive. However, a frame with a past timestamp MUST reference a frame already known; otherwise, it is considered bad/void.
Matroska has two similar ways to store frames in a block:
The SimpleBlock is usually preferred unless some extra elements of the BlockGroup need to be used. A Matroska Reader MUST support both types of blocks.
Each block contains the same parts in the following order:
The block header starts with the number of the Track it corresponds to. The value MUST correspond to the TrackNumber (Section 5.1.4.1.1) of a TrackEntry of the Segment.
The TrackNumber is coded using the Variable-Size Integer (VINT) mechanism described in Section 4 of [RFC8794]. To save space, the shortest VINT form SHOULD be used. The value can be coded using up to 8 octets. This is the only element with a variable size in the block header.
The timestamp is expressed in Track Ticks; see Section 11.1. The value is stored as a signed value on 16 bits.
This section describes the binary data contained in the Block element (Section 5.1.3.5.1). Bit 0 is the most significant bit.
As the TrackNumber size can vary between 1 and 8 octets, there are 8 different sizes for the Block header. The definitions for TrackNumber sizes of 1 and 2 are provided; the other variants can be deduced by extending the size of the TrackNumber by multiples of 8 bits.
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |               |                               |       |I|LAC|U|
 | Track Number  |         Timestamp             | Rsvrd |N|ING|N|
 |               |                               |       |V|   |U|
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |         Track Number          |         Timestamp             |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |       |I|LAC|U|
 | Rsvrd |N|ING|N|  ...
 |       |V|   |U|
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
where:
LACING: 2 bits. Uses lacing mode.
The remaining data in the Block corresponds to the lacing data and frames usage as described in each respective lacing mode (see Section 10.3).
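As a rough illustration of the layout above, the following Python sketch decodes a Block header. The helper names `read_vint` and `parse_block_header` and the dictionary keys are illustrative and not part of this specification. The sketch assumes big-endian octet order, consistent with bit 0 being the most significant bit, and it covers only the header, not the lacing data.

```python
import struct


def read_vint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode an EBML VINT starting at data[pos]; return (value, octet length)."""
    first = data[pos]
    if first == 0:
        raise ValueError("VINT longer than 8 octets")
    length = 9 - first.bit_length()  # position of the marker bit gives the length
    if pos + length > len(data):
        raise ValueError("truncated VINT")
    value = first & (0xFF >> length)  # drop the length marker
    for octet in data[pos + 1 : pos + length]:
        value = (value << 8) | octet
    return value, length


def parse_block_header(block: bytes) -> dict:
    """Split the start of a Block payload into its header fields."""
    track_number, n = read_vint(block)
    (timestamp,) = struct.unpack_from(">h", block, n)  # signed value on 16 bits
    flags = block[n + 2]
    return {
        "track_number": track_number,
        "timestamp": timestamp,         # in Track Ticks
        "inv": bool(flags & 0x08),      # INV, bit 4 of the flags octet
        "lacing": (flags >> 1) & 0b11,  # LACING, bits 5 and 6
        "header_size": n + 3,           # TrackNumber + timestamp + flags
    }
```

With a 1-octet TrackNumber, the header occupies octets 0 to 3, which is why the lacing examples later in this section start at octet 4.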
This section describes the binary data contained in the SimpleBlock element (Section 5.1.3.4). Bit 0 is the most significant bit.
The SimpleBlock structure is inspired by the Block structure; see Section 10.1. The main differences are the added Keyframe flag and Discardable flag. Otherwise, everything is the same.
As the TrackNumber size can vary between 1 and 8 octets, there are 8 different sizes for the SimpleBlock header. The definitions for TrackNumber sizes of 1 and 2 are provided; the other variants can be deduced by extending the size of the TrackNumber by multiples of 8 bits.
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |               |                               |K|     |I|LAC|D|
 | Track Number  |         Timestamp             |E|Rsvrd|N|ING|I|
 |               |                               |Y|     |V|   |S|
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |         Track Number          |         Timestamp             |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |K|     |I|LAC|D|
 |E|Rsvrd|N|ING|I|  ...
 |Y|     |V|   |S|
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
where:
KEY: 1 bit. Keyframe flag; set when the Block contains only keyframes.
LACING: 2 bits. Uses lacing mode.
DIS: 1 bit. Discardable flag; set when the Block can be discarded during playing if needed.
The remaining data in the SimpleBlock corresponds to the lacing data and frames usage as described in each respective lacing mode (see Section 10.3).
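The flags octet of a SimpleBlock can be unpacked with a few bit masks. The sketch below is illustrative: the class and function names are hypothetical, and the masks follow from bit 0 being the most significant bit (KEY is 0x80, DIS is 0x01). The lacing mode names use the LACING values given for each mode in Section 10.3.

```python
from typing import NamedTuple

# LACING values as stated for each lacing mode.
LACING_NAMES = {0b00: "none", 0b01: "Xiph", 0b11: "EBML", 0b10: "fixed-size"}


class SimpleBlockFlags(NamedTuple):
    keyframe: bool     # KEY, bit 0
    inv: bool          # INV, bit 4
    lacing: str        # LACING, bits 5 and 6
    discardable: bool  # DIS, bit 7


def decode_simpleblock_flags(flags: int) -> SimpleBlockFlags:
    """Decode the flags octet that follows the 16-bit timestamp."""
    return SimpleBlockFlags(
        keyframe=bool(flags & 0x80),
        inv=bool(flags & 0x08),
        lacing=LACING_NAMES[(flags >> 1) & 0b11],
        discardable=bool(flags & 0x01),
    )
```

A plain Block uses the same octet layout, except that the KEY and DIS positions are reserved or unused there.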
Lacing is a mechanism to save space when storing data. It is typically used for small blocks of data (referred to as frames in Matroska). It packs multiple frames into a single Block or SimpleBlock.
Lacing MUST NOT be used to store a single frame in a Block or SimpleBlock.
There are three types of lacing:
Xiph, which is inspired by what is found in the Ogg container [RFC3533]
EBML, which is the same with sizes coded differently
Fixed-size, where the size is not coded
When lacing is not used, i.e., to store a single frame, the LACING bits (bits 5 and 6 of the header flags octet) of the Block or SimpleBlock MUST be set to zero.
For example, a user wants to store three frames of the same track. The first frame is 800 octets long, the second is 500 octets long, and the third is 1000 octets long. Because these frames are small, they can be stored in a lace to save space.
It is possible to not use lacing at all and just store a single frame without any extra data. When the FlagLacing (Section 5.1.4.1.12) is set to 0, all blocks of that track MUST NOT use lacing.
When no lacing is used, the number of frames in the lace is omitted, and only one frame can be stored in the Block. The LACING bits of the Block Header flags are set to 00b.
The Block for an 800-octet frame is as follows:
When a Block contains a single frame, it MUST use this "no lacing" mode.
The Xiph lacing uses the same coding of size as found in the Ogg container [RFC3533]. The LACING bits of the Block Header flags are set to 01b.
The Block data with laced frames is stored as follows:
Lacing Head on 1 Octet: Number of frames in the lace minus 1.
Lacing size of each frame except the last one.
Binary data of each frame consecutively.
Each lacing size is stored as a sequence of unsigned octets whose sum is the size: as many octets of value 255 as needed, followed by one octet with a value below 255 -- for example, 500 is coded 255;245 or [0xFF 0xF5]. A frame with a size multiple of 255 is coded with a 0 at the end of the size -- for example, 765 is coded 255;255;255;0 or [0xFF 0xFF 0xFF 0x00].
The size of the last frame is deduced from the size remaining in the Block after the other frames.
Because large sizes result in large coding of the sizes, it is RECOMMENDED to use Xiph lacing only with small frames.
In our example, the 800-, 500-, and 1000-octet frames are stored with Xiph lacing in a Block as follows:
| Block Octets | Value | Description |
|---|---|---|
| 4 | 0x02 | Number of frames minus 1 |
| 5-8 | 0xFF 0xFF 0xFF 0x23 | Size of the first frame (255;255;255;35) |
| 9-10 | 0xFF 0xF5 | Size of the second frame (255;245) |
| 11-810 | <frame1> | First frame data |
| 811-1310 | <frame2> | Second frame data |
| 1311-2310 | <frame3> | Third frame data |
The Block is 2311 octets, and the last frame starts at 1311, so we can deduce that the size of the last frame is 2311 - 1311 = 1000.
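The Xiph size coding and the deduction of the last frame size can be sketched in Python as follows. The function names are hypothetical, and the sketch operates on the Block data that follows the block header (octet 4 onward in the example above), not on the whole Block.

```python
def xiph_size(size: int) -> bytes:
    """Code one frame size as 255-valued octets plus a final octet below 255."""
    return b"\xff" * (size // 255) + bytes([size % 255])


def xiph_lace(frames: list[bytes]) -> bytes:
    """Build the lacing head, the sizes of all frames but the last, and the data."""
    if not 2 <= len(frames) <= 256:
        raise ValueError("a lace holds between 2 and 256 frames")
    head = bytes([len(frames) - 1])
    sizes = b"".join(xiph_size(len(f)) for f in frames[:-1])
    return head + sizes + b"".join(frames)


def xiph_unlace(payload: bytes) -> list[bytes]:
    """Recover the frames; the last size is what remains in the payload."""
    count = payload[0] + 1
    pos = 1
    sizes = []
    for _ in range(count - 1):
        size = 0
        while payload[pos] == 255:
            size += 255
            pos += 1
        size += payload[pos]
        pos += 1
        sizes.append(size)
    sizes.append(len(payload) - pos - sum(sizes))
    frames = []
    for size in sizes:
        frames.append(payload[pos : pos + size])
        pos += size
    return frames
```

Applied to the 800-, 500-, and 1000-octet frames, the first seven octets produced are the lacing head and sizes shown in the table.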
The EBML lacing encodes the frame size with an EBML-like encoding [RFC8794]. The LACING bits of the Block Header flags are set to 11b.
The Block data with laced frames is stored as follows:
Lacing Head on 1 Octet: Number of frames in the lace minus 1.
Lacing size of each frame except the last one.
Binary data of each frame consecutively.
The first frame size is encoded as an EBML VINT value. The remaining frame sizes are encoded as signed values using the difference between the frame size and the previous frame size. These signed values are encoded as VINT, with a mapping from signed to unsigned numbers. Decoding the unsigned number stored in the VINT to a signed number is done by subtracting 2^((7*n)-1) - 1, where n is the octet size of the VINT.
| Bit Representation of Signed VINT | Possible Value Range |
|---|---|
| 1xxx xxxx | 2^7 values from -(2^6-1) to 2^6 |
| 01xx xxxx xxxx xxxx | 2^14 values from -(2^13-1) to 2^13 |
| 001x xxxx xxxx xxxx xxxx xxxx | 2^21 values from -(2^20-1) to 2^20 |
| 0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx | 2^28 values from -(2^27-1) to 2^27 |
| 0000 1xxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx | 2^35 values from -(2^34-1) to 2^34 |
In our example, the 800-, 500-, and 1000-octet frames are stored with EBML lacing in a Block as follows:
| Block Octets | Value | Description |
|---|---|---|
| 4 | 0x02 | Number of frames minus 1 |
| 5-6 | 0x43 0x20 | Size of the first frame (800 = 0x320 + 0x4000) |
| 7-8 | 0x5E 0xD3 | Size of the second frame (500 - 800 = -300 = - 0x12C + 0x1FFF + 0x4000) |
| 9-808 | <frame1> | First frame data |
| 809-1308 | <frame2> | Second frame data |
| 1309-2308 | <frame3> | Third frame data |
The Block is 2309 octets, and the last frame starts at 1309, so we can deduce that the size of the last frame is 2309 - 1309 = 1000.
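The size coding of the EBML lacing example can be checked with a short Python sketch. All function names here are hypothetical. As a conservative choice that is an assumption of this sketch, the encoder never emits an all-ones VINT pattern, so the upper bound of each range in the table above is coded with one more octet.

```python
def encode_vint(value: int) -> bytes:
    """Shortest EBML VINT for an unsigned value (all-ones patterns avoided)."""
    for n in range(1, 9):
        if value < (1 << (7 * n)) - 1:
            return ((1 << (7 * n)) | value).to_bytes(n, "big")
    raise ValueError("value too large for an 8-octet VINT")


def encode_signed_vint(delta: int) -> bytes:
    """Map a signed difference to an unsigned VINT by adding 2^((7*n)-1) - 1."""
    for n in range(1, 9):
        bias = (1 << (7 * n - 1)) - 1
        if -bias <= delta <= bias:
            return ((1 << (7 * n)) | (delta + bias)).to_bytes(n, "big")
    raise ValueError("difference too large for an 8-octet VINT")


def decode_signed_vint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (signed value, octet length) of the signed VINT at data[pos]."""
    first = data[pos]
    if first == 0:
        raise ValueError("VINT longer than 8 octets")
    n = 9 - first.bit_length()
    raw = int.from_bytes(data[pos : pos + n], "big") & ((1 << (7 * n)) - 1)
    return raw - ((1 << (7 * n - 1)) - 1), n


def ebml_lace_head(sizes: list[int]) -> bytes:
    """Lacing head plus coded sizes for every frame except the last one."""
    out = bytearray([len(sizes) - 1])
    out += encode_vint(sizes[0])
    for prev, cur in zip(sizes, sizes[1:-1]):
        out += encode_signed_vint(cur - prev)
    return bytes(out)
```

For the sizes 800, 500, and 1000, this yields the five octets at positions 4 to 8 of the example: 0x02, 0x43 0x20, and 0x5E 0xD3.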
Fixed-size lacing doesn't store the frame size; rather, it only stores the number of frames in the lace. Each frame MUST have the same size. The frame size of each frame is deduced from the total size of the Block. The LACING bits of the Block Header flags are set to 10b.
The Block data with laced frames is stored as follows:
Lacing Head on 1 Octet: Number of frames in the lace minus 1.
Binary data of each frame consecutively.
For example, for three frames that are 800 octets each:
| Block Octets | Value | Description |
|---|---|---|
| 4 | 0x02 | Number of frames minus 1 |
| 5-804 | <frame1> | First frame data |
| 805-1604 | <frame2> | Second frame data |
| 1605-2404 | <frame3> | Third frame data |
This gives a Block of 2405 octets. When reading the Block, we find that there are three frames (Octet 4). The data start at Octet 5, so the size of each frame is (2405 - 5) / 3 = 800.
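The same deduction can be written as a short Python sketch. The function name is hypothetical, and the input is the Block data after the block header, so the lacing head is its first octet.

```python
def fixed_size_unlace(payload: bytes) -> list[bytes]:
    """Split fixed-size laced data into equally sized frames."""
    count = payload[0] + 1  # lacing head stores the number of frames minus 1
    data = payload[1:]
    if len(data) % count != 0:
        raise ValueError("data size is not a multiple of the frame count")
    size = len(data) // count
    return [data[i * size : (i + 1) * size] for i in range(count)]
```

Rejecting a remainder is a design choice of this sketch; it follows from the requirement that each frame have the same size.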
A Block only contains a single timestamp value. But when lacing is used, it contains more than one frame. Each frame originally has its own timestamp, or Presentation Timestamp (PTS). The timestamp of the Block applies to the first frame in the lace.
In the lace, each frame after the first one has an underdetermined timestamp. However, each of these frames MUST be contiguous -- i.e., the decoded data MUST NOT contain any gap between them. If there is a gap in the stream, the frames around the gap MUST NOT be in the same Block.
Lacing is only useful for small contiguous data to save space. This is usually the case for audio tracks and not the case for video (which use a lot of data) or subtitle tracks (which have long gaps). For audio, there is usually a fixed output sampling frequency for the whole track, so the decoder should be able to recover the timestamp of each sample, knowing each output sample is contiguous with a fixed frequency. For subtitles, this is usually not the case, so lacing SHOULD NOT be used.
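For audio, recovering the timestamps of the laced frames might look like the following Python sketch. It rests on assumptions not stated above: every frame in the lace decodes to the same number of samples, the Block timestamp has already been converted to nanoseconds, and the function and parameter names are purely illustrative.

```python
from fractions import Fraction


def laced_frame_timestamps_ns(block_timestamp_ns: int, frame_count: int,
                              samples_per_frame: int,
                              sampling_frequency: int) -> list[int]:
    """Timestamps of contiguous laced audio frames, in nanoseconds."""
    # Exact duration of one frame; rounding happens once per timestamp.
    frame_duration = Fraction(samples_per_frame * 1_000_000_000, sampling_frequency)
    return [block_timestamp_ns + round(i * frame_duration)
            for i in range(frame_count)]
```

Only the first value comes from the Block itself; the others rely on the frames being contiguous, as required above.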
Random Access Points (RAPs) are positions where the parser can seek to and start playback without decoding what was before. In Matroska, BlockGroups and SimpleBlocks can be RAPs. To seek to these elements, it is still necessary to seek to the Cluster containing them, read the Cluster Timestamp, and start playback from the BlockGroup or SimpleBlock that is a RAP.
Because a Matroska File is usually composed of multiple tracks playing at the same time -- video, audio, and subtitles -- to seek properly to a RAP, each selected track must be taken into account. Usually, all audio and subtitle BlockGroups or SimpleBlocks are RAPs. They are independent of each other and can be played randomly.
On the other hand, video tracks often use references to previous and future frames for better coding efficiency. Frames with such references MUST either contain one or more ReferenceBlock elements in their BlockGroup or MUST be marked as non-keyframe in a SimpleBlock; see Section 10.2.
<Cluster> <Timestamp>123456</Timestamp> <BlockGroup> <!-- References a Block 40 Track Ticks before this one --> <ReferenceBlock>-40</ReferenceBlock> <Block/> </BlockGroup> ...</Cluster>
<Cluster> <Timestamp>123456</Timestamp> <SimpleBlock/> (octet 3 bit 0 not set) ...</Cluster>
Frames that are RAPs (i.e., frames that don't depend on other frames)MUST set the keyframeflag if they are in aSimpleBlock or their parentBlockGroupMUST NOT containaReferenceBlock.¶
<Cluster> <Timestamp>123456</Timestamp> <BlockGroup> <!-- No ReferenceBlock allowed in this BlockGroup --> <Block/> </BlockGroup> ...</Cluster>
<Cluster> <Timestamp>123456</Timestamp> <SimpleBlock/> (octet 3 bit 0 set) ...</Cluster>
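As a rough illustration of the two RAP rules above, the following Python sketch checks the keyframe flag of a SimpleBlock and the absence of ReferenceBlock in a BlockGroup. The function names are hypothetical, and the sketch assumes a one-octet track number so that the flags sit in octet 3, with the keyframe flag as the most significant bit (bit 0 in the numbering used in the examples).

```python
from typing import Sequence


def simpleblock_is_keyframe(simple_block: bytes) -> bool:
    """Return True when octet 3 bit 0 (mask 0x80) of a SimpleBlock is set.

    Assumption: the track number occupies a single octet, so the header is
    track number (1 octet), relative timestamp (2 octets), flags (1 octet).
    """
    return bool(simple_block[3] & 0x80)


def blockgroup_is_rap(reference_blocks: Sequence[int]) -> bool:
    """A BlockGroup frame is a RAP only if it carries no ReferenceBlock at all."""
    return len(reference_blocks) == 0


keyframe = bytes([0x81, 0x00, 0x00, 0x80])      # flags octet with keyframe bit set
non_keyframe = bytes([0x81, 0x00, 0x00, 0x00])  # flags octet with keyframe bit clear
```

Note that a ReferenceBlock value of 0 still counts as a reference here, so a BlockGroup carrying it is not treated as a RAP.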
There may be cases where the use of BlockGroup is necessary, as the frame may need a BlockDuration, BlockAdditions, CodecState, or DiscardPadding element. For those cases, a SimpleBlock MUST NOT be used; the reference information SHOULD be recovered for non-RAP frames.

    <Cluster>
      <Timestamp>123456</Timestamp>
      <SimpleBlock/> (octet 3 bit 0 not set)
      ...
    </Cluster>

    <Cluster>
      <Timestamp>123456</Timestamp>
      <BlockGroup>
        <!-- ReferenceBlock value recovered based on the codec -->
        <ReferenceBlock>-40</ReferenceBlock>
        <BlockDuration>20</BlockDuration>
        <Block/>
      </BlockGroup>
      ...
    </Cluster>

BlockGroup to Add BlockDuration, with the EBML Tree Shown as XML
When a frame in a BlockGroup is not a RAP, the BlockGroup MUST contain at least a ReferenceBlock. The ReferenceBlocks MUST be used in one of the following ways:
each reference frame listed as a ReferenceBlock,
some referenced frames listed as a ReferenceBlock, even if the timestamp value is accurate, or
one ReferenceBlock with the timestamp value "0" corresponding to a self or unknown reference.
The lack of ReferenceBlock would mean such a frame is a RAP, and seeking on that frame that actually depends on other frames may create a bogus output or even crash.

    <Cluster>
      <Timestamp>123456</Timestamp>
      <BlockGroup>
        <!-- ReferenceBlock value not recovered from the codec -->
        <ReferenceBlock>0</ReferenceBlock>
        <BlockDuration>20</BlockDuration>
        <Block/>
      </BlockGroup>
      ...
    </Cluster>

BlockGroup, but the Reference Could Not Be Recovered, with the EBML Tree Shown as XML

    <Cluster>
      <Timestamp>123456</Timestamp>
      <BlockGroup>
        <!-- References a Block 80 Track Ticks before this one -->
        <ReferenceBlock>-80</ReferenceBlock>
        <!-- References a Block 40 Track Ticks after this one -->
        <ReferenceBlock>40</ReferenceBlock>
        <Block/>
      </BlockGroup>
      ...
    </Cluster>

BlockGroup with a Frame That References Two Other Frames, with the EBML Tree Shown as XML
Intra-only video frames, such as the ones found in AV1 or VP9, can be decoded without any other frame, but they don't reset the codec state. Thus, seeking to these frames is not possible, as the next frames may need frames that are not known from this seeking point. Such intra-only frames MUST NOT be considered as keyframes, so the keyframe flag MUST NOT be set in the SimpleBlock or a ReferenceBlock MUST be used to signify the frame is not a RAP. The timestamp value of the ReferenceBlock MUST be "0", meaning it's referencing itself.

    <Cluster>
      <Timestamp>123456</Timestamp>
      <BlockGroup>
        <!-- References itself to mark it should not be used as RAP -->
        <ReferenceBlock>0</ReferenceBlock>
        <Block/>
      </BlockGroup>
      ...
    </Cluster>

Because a video SimpleBlock has less information on references than a video BlockGroup, it is possible to remux a video track using BlockGroup into a SimpleBlock, as long as it doesn't use any other BlockGroup features than ReferenceBlock.
Historically, timestamps in Matroska were mistakenly called timecodes. The Timestamp element was called Timecode, the TimestampScale element was called TimecodeScale, the TrackTimestampScale element was called TrackTimecodeScale, and the ReferenceTimestamp element was called ReferenceTimeCode.
All timestamp values in Matroska are expressed in multiples of a tick. They are usually stored as integers. There are three types of ticks possible: Matroska Ticks, Segment Ticks, and Track Ticks.
For elements in Matroska Ticks, the timestamp value is stored directly in nanoseconds.
The elements storing values in Matroska Ticks/nanoseconds are:¶
TrackEntry\DefaultDuration; defined in Section 5.1.4.1.13
TrackEntry\DefaultDecodedFieldDuration; defined in Section 5.1.4.1.14
TrackEntry\SeekPreRoll; defined in Section 5.1.4.1.26
TrackEntry\CodecDelay; defined in Section 5.1.4.1.25
BlockGroup\DiscardPadding; defined in Section 5.1.3.5.7
ChapterAtom\ChapterTimeStart; defined in Section 5.1.7.1.4.3
ChapterAtom\ChapterTimeEnd; defined in Section 5.1.7.1.4.4
Elements in Segment Ticks involve the use of the TimestampScale element of the Segment to get the timestamp in nanoseconds of the element, with the following formula:
timestamp in nanoseconds = element value * TimestampScale
This allows for storage of smaller integer values in the elements.
When using the default value of "1,000,000" for TimestampScale, one Segment Tick represents one millisecond.
The elements storing values in Segment Ticks are:¶
Cluster\Timestamp; defined in Section 5.1.3.1
Info\Duration is stored as a floating-point value, but the same formula applies; defined in Section 5.1.2.10
CuePoint\CueTime; defined in Section 5.1.5.1.1
CuePoint\CueTrackPositions\CueDuration; defined in Section 5.1.5.1.2.4
CueReference\CueRefTime; defined in Section 5.1.5.1.2.8
Elements in Track Ticks involve the use of the TimestampScale element of the Segment and the TrackTimestampScale element of the Track to get the timestamp in nanoseconds of the element, with the following formula:
timestamp in nanoseconds = element value * TrackTimestampScale * TimestampScale
This allows for storage of smaller integer values in the elements. The resulting floating-point values of the timestamps are still expressed in nanoseconds.
When using the default values of "1,000,000" for TimestampScale and "1.0" for TrackTimestampScale, one Track Tick represents one millisecond.
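The Segment Tick and Track Tick formulas can be written out as two small Python helpers. The function names are hypothetical; the default arguments mirror the default values quoted above.

```python
def segment_ticks_to_ns(value: float, timestamp_scale: int = 1_000_000) -> float:
    """timestamp in nanoseconds = element value * TimestampScale"""
    return value * timestamp_scale


def track_ticks_to_ns(value: int,
                      track_timestamp_scale: float = 1.0,
                      timestamp_scale: int = 1_000_000) -> float:
    """timestamp in nanoseconds = element value * TrackTimestampScale * TimestampScale"""
    return value * track_timestamp_scale * timestamp_scale


one_segment_tick = segment_ticks_to_ns(1)   # one millisecond with the default scale
reference_offset = track_ticks_to_ns(-40)   # e.g., a ReferenceBlock value of -40
```

With the defaults, both helpers give the same result for the same input, which matches the statement that Track Ticks and Segment Ticks coincide when TrackTimestampScale is "1.0".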
The elements storing values in Track Ticks are:¶
Cluster\BlockGroup\Block and Cluster\SimpleBlock timestamps; detailed in Section 11.2
Cluster\BlockGroup\BlockDuration; defined in Section 5.1.3.5.3
Cluster\BlockGroup\ReferenceBlock; defined in Section 5.1.3.5.5
When the TrackTimestampScale is interpreted as "1.0", Track Ticks are equivalent to Segment Ticks and give an integer value in nanoseconds. This is the most common case as TrackTimestampScale is usually omitted.
A value of TrackTimestampScale other than "1.0" MAY be used to scale the timestamps more in tune with each Track sampling frequency. For historical reasons, a lot of Matroska Readers don't take the TrackTimestampScale value into account. Thus, using a value other than "1.0" might not work in many places.
A Block element and SimpleBlock element timestamp is the time when the decoded data of the first frame in the Block/SimpleBlock MUST be presented if the track of that Block/SimpleBlock is selected for playback. This is also known as the Presentation Timestamp (PTS).
The Block element and SimpleBlock element store their timestamps as signed integers, relative to the Cluster\Timestamp value of the Cluster they are stored in. To get the timestamp of a Block or SimpleBlock in nanoseconds, the following formula is used:
( Cluster\Timestamp + ( block timestamp * TrackTimestampScale ) ) * TimestampScale
The Block element and SimpleBlock element store their timestamps as 16-bit signed integers, allowing a range from "-32768" to "+32767" Track Ticks. Although these values can be negative, when added to the Cluster\Timestamp, the resulting frame timestamp SHOULD NOT be negative.
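To make the formula concrete, here is a minimal Python sketch that reads the 16-bit signed relative timestamp from a Block header and applies the formula. The function name is hypothetical, and the sketch assumes a one-octet track number, so the relative timestamp occupies octets 1 and 2 in big-endian order.

```python
import struct


def block_timestamp_ns(cluster_timestamp: int,
                       block_header: bytes,
                       track_timestamp_scale: float = 1.0,
                       timestamp_scale: int = 1_000_000) -> float:
    """Apply (Cluster\\Timestamp + block timestamp * TrackTimestampScale) * TimestampScale.

    Assumption: one-octet track number, so the signed 16-bit relative
    timestamp starts at octet 1 of the Block header.
    """
    (relative,) = struct.unpack_from(">h", block_header, 1)  # -32768..+32767
    return (cluster_timestamp + relative * track_timestamp_scale) * timestamp_scale


# Relative timestamp 0xFFD8 is -40 Track Ticks; the Cluster starts at 123456.
header = bytes([0x81, 0xFF, 0xD8, 0x80])
timestamp = block_timestamp_ns(123456, header)
```

The `">h"` format reads a big-endian signed 16-bit integer, which is where the "-32768" to "+32767" range comes from.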
When a CodecDelay element is set, its value MUST be subtracted from each Block timestamp of that track. To get the timestamp in nanoseconds of the first frame in a Block or SimpleBlock, the formula becomes:
( ( Cluster\Timestamp + ( block timestamp * TrackTimestampScale ) ) * TimestampScale ) - CodecDelay
The resulting frame timestamp SHOULD NOT be negative.
During playback, when a frame has a negative timestamp, the content MUST be decoded by the decoder but not played to the user.
The default Track Tick duration is one millisecond.¶
The TrackTimestampScale is a floating-point value that is usually "1.0". But when it's not, the multiplied Block Timestamp is a floating-point value in nanoseconds. The Matroska Reader SHOULD use the nearest rounding value in nanoseconds to get the proper nanosecond timestamp of a Block. This allows some clever TrackTimestampScale values to have a more refined timestamp precision per frame.
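The CodecDelay adjustment, the handling of negative timestamps, and the rounding rule can be combined in one hedged sketch. The function names and the sample values (a CodecDelay of 6,500,000 nanoseconds and a scale of one third) are made up for illustration and do not come from any particular file.

```python
def frame_timestamp_ns(cluster_timestamp: int,
                       block_timestamp: int,
                       codec_delay_ns: int = 0,
                       track_timestamp_scale: float = 1.0,
                       timestamp_scale: int = 1_000_000) -> int:
    """Timestamp of the first frame of a Block, rounded to the nearest nanosecond."""
    exact = (cluster_timestamp + block_timestamp * track_timestamp_scale) * timestamp_scale
    return round(exact) - codec_delay_ns


def should_present(timestamp_ns: int) -> bool:
    """Frames with a negative timestamp are decoded but not played to the user."""
    return timestamp_ns >= 0


# Hypothetical track with a CodecDelay of 6.5 ms.
first = frame_timestamp_ns(0, 0, codec_delay_ns=6_500_000)
later = frame_timestamp_ns(0, 20, codec_delay_ns=6_500_000)

# Hypothetical non-default scale; the product is not an integer before rounding.
scaled = frame_timestamp_ns(0, 2, track_timestamp_scale=1 / 3)
```

Rounding is applied before the CodecDelay subtraction here; since CodecDelay is already an integer number of nanoseconds, the order does not change the result.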
Matroska versions 1 through 3 use language codes that can be either the three-letter bibliographic ISO 639-2 form [ISO639-2] (like "fre" for French) or such a language code followed by a dash and a country code for specialities in languages (like "fre-ca" for Canadian French). The ISO 639-2 Language elements are Language element, TagLanguage element, and ChapLanguage element.
Starting in Matroska version 4, the forms defined in either [ISO639-2] or [RFC5646] MAY be used, although the form in [RFC5646] is RECOMMENDED. The Language elements in the [RFC5646] form are LanguageBCP47 element, TagLanguageBCP47 element, and ChapLanguageBCP47 element. If both an [ISO639-2] Language element and an [RFC5646] Language element are used within the same Parent Element, then the Language element in the [ISO639-2] form MUST be ignored and precedence given to the Language element in the [RFC5646] form.
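The precedence rule between the two forms is simple enough to state as code. In this sketch, the function name and the use of `None` for an absent element are assumptions; how a reader handles element default values is left out.

```python
from typing import Optional


def effective_language(iso639_2: Optional[str], bcp47: Optional[str]) -> Optional[str]:
    """Pick the language value to use for one Parent Element.

    When the [RFC5646] form is present, the [ISO639-2] form is ignored.
    """
    if bcp47 is not None:
        return bcp47
    return iso639_2


both = effective_language("fre", "fr-CA")   # the BCP47 element wins
legacy = effective_language("fre", None)    # only the ISO 639-2 element is present
```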
In this document, "BCP47" in element names refers specifically to [RFC5646], which is part of BCP 47.
Country codes are the [RFC5646] two-letter region subtags, without the UK exception.
This Matroska specification provides no interoperable solution for securing the data container with any assurances of confidentiality, integrity, authenticity, or authorization. The ContentEncryption element (Section 5.1.4.1.31.8) and associated sub-fields (Section 5.1.4.1.31.9 to Section 5.1.4.1.31.12) are defined only for the benefit of implementers to construct their own proprietary solution or as the basis for further standardization activities. How to use these fields to secure a Matroska data container is out of scope, as are any related issues such as key management and distribution.
A Matroska Reader who encounters containers that use the fields defined in this section MUST rely on out-of-scope guidance to decode the associated content.
Because encryption occurs within the Block element, it is possible to manipulate encrypted streams without decrypting them. The streams could potentially be copied, deleted, cut, appended, or any number of other possible editing techniques without decryption. The data can be used without having to expose it or go through the decrypting process.
Encryption can also be layered within Matroska. This means that two completely different types of encryption can be used, requiring two separate keys to be able to decrypt a stream.
Encryption information is stored in the ContentEncodings element under the ContentEncryption element.
For encryption systems sharing public/private keys, the creation of the keys and the exchange of keys are not covered by this document. They have to be handled by the system using Matroska.
The algorithms described in Table 24 support different modes of operations and key sizes. The specification of these parameters is required for a complete solution but is out of scope of this document and left to the proprietary implementations using them or subsequent profiles of this document.
The ContentEncodingScope element gives an idea of which part of the track is encrypted, but each ContentEncAlgo element and its sub-elements (like AESSettingsCipherMode) define exactly how the encrypted track should be interpreted.
An example of an extension that builds upon these security-related fields in this specification is [WebM-Enc]. It uses AES-CTR, ContentEncAlgo = 5 (Section 5.1.4.1.31.9), and AESSettingsCipherMode = 1 (Section 5.1.4.1.31.12).
A Matroska Writer MUST NOT use insecure cryptographic algorithms to create new archives or streams, but a Matroska Reader MAY support these algorithms to read previously made archives or streams.
The PixelCrop elements (PixelCropTop, PixelCropBottom, PixelCropRight, and PixelCropLeft) indicate when, and by how much, encoded video frames SHOULD be cropped for display. These elements allow edges of the frame that are not intended for display (such as the sprockets of a full-frame film scan or the Video ANCillary (VANC) area of a digitized analog videotape) to be stored but hidden. PixelCropTop and PixelCropBottom store an integer of how many rows of pixels SHOULD be cropped from the top and bottom of the image, respectively. PixelCropLeft and PixelCropRight store an integer of how many columns of pixels SHOULD be cropped from the left and right of the image, respectively.
For example, a pillar-boxed video that stores a 1440x1080 visual image within the center of a padded 1920x1080 encoded image may set both PixelCropLeft and PixelCropRight to "240", so a Matroska Player should crop off 240 columns of pixels from the left and right of the encoded image to present the image with the pillar-boxes hidden.
Cropping has to be performed before resizing, and the display dimensions given by DisplayWidth, DisplayHeight, and DisplayUnit apply to the already-cropped image.
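The pillar-box example can be checked with a short Python sketch. The function name and keyword arguments are hypothetical; they simply mirror the four PixelCrop elements, and the result is the size of the image before any resizing to the display dimensions.

```python
from typing import Tuple


def cropped_size(pixel_width: int, pixel_height: int,
                 crop_left: int = 0, crop_right: int = 0,
                 crop_top: int = 0, crop_bottom: int = 0) -> Tuple[int, int]:
    """Size of the encoded image after applying the PixelCrop values."""
    width = pixel_width - crop_left - crop_right
    height = pixel_height - crop_top - crop_bottom
    if width <= 0 or height <= 0:
        raise ValueError("crop values remove the whole image")
    return width, height


# 1920x1080 encoded image with 240 columns cropped on each side.
visible = cropped_size(1920, 1080, crop_left=240, crop_right=240)
```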
The ProjectionPoseRoll element (Section 5.1.4.1.28.46) can be used to indicate that the image from the associated video track SHOULD be rotated for presentation. For instance, the following example of the Projection element (Section 5.1.4.1.28.41) and the ProjectionPoseRoll element represents a video track where the image SHOULD be presented with a 90-degree counter-clockwise rotation, with the EBML tree shown as XML:

    <Projection>
      <ProjectionPoseRoll>90</ProjectionPoseRoll>
    </Projection>
The Segment Position of an element refers to the position of the first octet of the Element ID of that element, measured in octets, from the beginning of the Element Data section of the containing Segment element. In other words, the Segment Position of an element is the distance in octets from the beginning of its containing Segment element minus the size of the Element ID and Element Data Size of that Segment element. The Segment Position of the first Child Element of the Segment element is 0. An element that is not stored within a Segment element, such as the elements of the EBML Header, does not have a Segment Position.
Elements that are defined to store a Segment Position MAY define reserved values to indicate a special meaning.
This table presents an example of Segment Position by showing a hexadecimal representation of a very small Matroska file with labels to show the offsets in octets. The file contains a Segment element with an Element ID of "0x18538067" and a MuxingApp element with an Element ID of "0x4D80".
          0                             1                             2
          0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5  6  7  8  9  0
         +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       0 |1A|45|DF|A3|8B|42|82|88|6D|61|74|72|6F|73|6B|61|
          ^ EBML Header
       0 |                                               |18|53|80|67|
                                                          ^ Segment ID
      20 |93|
          ^ Segment Data Size
      20 |  |15|49|A9|66|8E|4D|80|84|69|65|74|66|57|41|84|69|65|74|66|
             ^ Start of Segment data
      20 |                 |4D|80|84|69|65|74|66|57|41|84|69|65|74|66|
                            ^ MuxingApp start
In the above example, the Element ID of the Segment element is stored at offset 16, the Element Data Size of the Segment element is stored at offset 20, and the Element Data of the Segment element is stored at offset 21.
The MuxingApp element is stored at offset 26. Since the Segment Position of an element is calculated by subtracting the position of the Element Data of the containing Segment element from the position of that element, the Segment Position of the MuxingApp element in the above example is "26 - 21" or "5".
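The offsets in this example can be reproduced with a small Python sketch that walks the 40 octets of the sample file. The helper names are hypothetical, and the parser is deliberately minimal: it only reads Element IDs and Element Data Sizes as EBML variable-size integers and does not handle unknown-size elements.

```python
from typing import Tuple


def vint_length(first_octet: int) -> int:
    """Length in octets of an EBML variable-size integer, from its first octet."""
    if first_octet == 0:
        raise ValueError("invalid first octet")
    length, mask = 1, 0x80
    while not (first_octet & mask):   # count leading zero bits
        length += 1
        mask >>= 1
    return length


def read_element_header(data: bytes, pos: int) -> Tuple[bytes, int, int]:
    """Return (Element ID octets, Element Data Size, offset of the Element Data)."""
    id_len = vint_length(data[pos])
    size_pos = pos + id_len
    size_len = vint_length(data[size_pos])
    size = data[size_pos] & (0xFF >> size_len)        # drop the length marker bit
    for octet in data[size_pos + 1:size_pos + size_len]:
        size = (size << 8) | octet
    return data[pos:size_pos], size, size_pos + size_len


example = bytes.fromhex(
    "1A45DFA3" "8B" "4282" "88" "6D6174726F736B61"  # EBML Header
    "18538067" "93"                                  # Segment ID and Data Size
    "1549A966" "8E"                                  # first Child Element header
    "4D80" "84" "69657466"                           # MuxingApp
    "5741" "84" "69657466"                           # last element
)

_, header_size, header_data = read_element_header(example, 0)
segment_offset = header_data + header_size            # where the Segment ID starts
_, _, segment_data = read_element_header(example, segment_offset)
_, _, muxing_app_offset = read_element_header(example, segment_data)
segment_position = muxing_app_offset - segment_data
```

The MuxingApp element is the first thing inside the first Child Element of the Segment, which is why its offset equals the data offset returned for that Child Element.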
Matroska provides several methods to link two or more Segment elements together to create a Linked Segment. A Linked Segment is a set of multiple Segments linked together into a single presentation by using Hard Linking or Medium Linking.
All Segments within a Linked Segment MUST have a SegmentUUID.
All Segments within a Linked Segment SHOULD be stored within the same directory or be quickly accessible based on their SegmentUUID in order to have a seamless transition between segments.
All Segments within a Linked Segment MAY set a SegmentFamily with a common value to make it easier for a Matroska Player to know which Segments are meant to be played together.
The SegmentFilename, PrevFilename, and NextFilename elements MAY also give hints on the original filenames that were used when the Segment links were created, in case some SegmentUUIDs are damaged.
Hard Linking, also called "splitting", is the process of creating a Linked Segment by linking multiple Segment elements using the NextUUID and PrevUUID elements.
All Segments within a Hard Linked Segment MUST use the same Tracks list and TimestampScale.
Within a Linked Segment, the timestamps of Block and SimpleBlock MUST consecutively follow the timestamps of Block and SimpleBlock from the previous Segment in linking order.
With Hard Linking, the chapters of any Segment within the Linked Segment MUST only reference the current Segment. The NextUUID and PrevUUID reference the respective SegmentUUID values of the next and previous Segments.
The first Segment of a Linked Segment MUST NOT have a PrevUUID element. The last Segment of a Linked Segment MUST NOT have a NextUUID element.
For each node of the chain of Segments of a Linked Segment, at least one Segment MUST reference the other Segment within the chain.
In a chain of Segments of a Linked Segment, the NextUUID always takes precedence over the PrevUUID. Thus, if SegmentA has a NextUUID to SegmentB and SegmentB has a PrevUUID to SegmentC, the link to use is NextUUID between SegmentA and SegmentB, and SegmentC is not part of the Linked Segment.
If SegmentB has a PrevUUID to SegmentA, but SegmentA has no NextUUID, then the Matroska Player MAY consider these two Segments linked as SegmentA followed by SegmentB.
As an example, three Segments can be Hard Linked as a Linked Segment through cross-referencing each other with SegmentUUID, PrevUUID, and NextUUID as shown in this table:
| file name | SegmentUUID | PrevUUID | NextUUID |
|---|---|---|---|
| start.mkv | 71000c23cd310998 53fbc94dd984a5dd | Invalid | a77b3598941cb803 eac0fcdafe44fac9 |
| middle.mkv | a77b3598941cb803 eac0fcdafe44fac9 | 71000c23cd310998 53fbc94dd984a5dd | 6c92285fa6d3e827 b198d120ea3ac674 |
| end.mkv | 6c92285fa6d3e827 b198d120ea3ac674 | a77b3598941cb803 eac0fcdafe44fac9 | Invalid |
An example where only the NextUUID element is used:
| file name | SegmentUUID | PrevUUID | NextUUID |
|---|---|---|---|
| start.mkv | 71000c23cd310998 53fbc94dd984a5dd | Invalid | a77b3598941cb803 eac0fcdafe44fac9 |
| middle.mkv | a77b3598941cb803 eac0fcdafe44fac9 | n/a | 6c92285fa6d3e827 b198d120ea3ac674 |
| end.mkv | 6c92285fa6d3e827 b198d120ea3ac674 | n/a | Invalid |
An example where only the PrevUUID element is used:
| file name | SegmentUUID | PrevUUID | NextUUID |
|---|---|---|---|
| start.mkv | 71000c23cd310998 53fbc94dd984a5dd | Invalid | n/a |
| middle.mkv | a77b3598941cb803 eac0fcdafe44fac9 | 71000c23cd310998 53fbc94dd984a5dd | n/a |
| end.mkv | 6c92285fa6d3e827 b198d120ea3ac674 | a77b3598941cb803 eac0fcdafe44fac9 | Invalid |
An example where only the middle.mkv is using the PrevUUID and NextUUID elements:
| file name | SegmentUUID | PrevUUID | NextUUID |
|---|---|---|---|
| start.mkv | 71000c23cd310998 53fbc94dd984a5dd | Invalid | n/a |
| middle.mkv | a77b3598941cb803 eac0fcdafe44fac9 | 71000c23cd310998 53fbc94dd984a5dd | 6c92285fa6d3e827 b198d120ea3ac674 |
| end.mkv | 6c92285fa6d3e827 b198d120ea3ac674 | n/a | Invalid |
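One possible way for a player to resolve such a chain is sketched below in Python. This is an assumption-laden illustration rather than a prescribed algorithm: the `hard_link_chain` name and the dictionary layout are hypothetical, "Invalid" and "n/a" entries are both modeled as `None`, and NextUUID links are registered first so that they take precedence over PrevUUID links.

```python
from typing import Dict, List, Optional, Tuple

# SegmentUUID -> (PrevUUID, NextUUID); None stands for an absent element.
Links = Dict[str, Tuple[Optional[str], Optional[str]]]


def hard_link_chain(segments: Links, start: str) -> List[str]:
    """Return the SegmentUUIDs of the Linked Segment containing `start`, in order."""
    succ: Dict[str, str] = {}
    pred: Dict[str, str] = {}
    for uuid, (_, nxt) in segments.items():          # NextUUID links first
        if nxt in segments and nxt not in pred:
            succ[uuid], pred[nxt] = nxt, uuid
    for uuid, (prv, _) in segments.items():          # PrevUUID only fills gaps
        if prv in segments and prv not in succ and uuid not in pred:
            succ[prv], pred[uuid] = uuid, prv
    head, seen = start, {start}
    while head in pred and pred[head] not in seen:   # walk back to the first Segment
        head = pred[head]
        seen.add(head)
    chain = [head]
    while chain[-1] in succ and succ[chain[-1]] not in chain:
        chain.append(succ[chain[-1]])
    return chain


START = "71000c23cd31099853fbc94dd984a5dd"
MIDDLE = "a77b3598941cb803eac0fcdafe44fac9"
END = "6c92285fa6d3e827b198d120ea3ac674"

# Last table above: only middle.mkv carries PrevUUID and NextUUID.
middle_only: Links = {START: (None, None), MIDDLE: (START, END), END: (None, None)}
order = hard_link_chain(middle_only, END)
```

The same function gives the same order for the other three tables, and it leaves out a Segment that is only reachable through a PrevUUID that conflicts with an existing NextUUID link.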
Medium Linking creates relationships between Segments using Ordered Chapters (Section 20.1.3) and the ChapterSegmentUUID element. A Chapter Edition with Ordered Chapters MAY contain Chapters elements that reference timestamp ranges from other Segments. The Segment referenced by the Ordered Chapter via the ChapterSegmentUUID element SHOULD be played as part of a Linked Segment.
The timestamps of Segment content referenced by Ordered Chapters MUST be adjusted according to the cumulative duration of the previous Ordered Chapters.
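The cumulative adjustment can be illustrated with a short Python sketch. The function names and the sample durations are hypothetical; each chapter is modeled only by its ChapterTimeStart and ChapterTimeEnd in nanoseconds, listed in playback order.

```python
from typing import List, Tuple


def ordered_chapter_offsets(chapters: List[Tuple[int, int]]) -> List[int]:
    """Presentation start of each Ordered Chapter on the combined timeline."""
    offsets = []
    elapsed = 0
    for start, end in chapters:
        offsets.append(elapsed)
        elapsed += end - start      # duration contributed by this chapter
    return offsets


def presentation_time(content_timestamp: int, chapter_start: int, chapter_offset: int) -> int:
    """Map a timestamp inside a referenced range onto the combined timeline."""
    return chapter_offset + (content_timestamp - chapter_start)


SECOND = 1_000_000_000
# A 30-second chapter followed by a chapter covering 10 s to 70 s of its Segment.
chapters = [(0, 30 * SECOND), (10 * SECOND, 70 * SECOND)]
offsets = ordered_chapter_offsets(chapters)
shown_at = presentation_time(15 * SECOND, chapters[1][0], offsets[1])
```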
As an example, a file named intro.mkv could have a SegmentUUID of "0xb16a58609fc7e60653a60c984fc11ead". Another file called program.mkv could use a Chapter Edition that contains two Ordered Chapters. The first chapter references the Segment of intro.mkv with the use of a ChapterSegmentUUID, ChapterSegmentEditionUID, ChapterTimeStart, and an optional ChapterTimeEnd element. The second chapter references content within the Segment of program.mkv. A Matroska Player SHOULD recognize the Linked Segment created by the use of ChapterSegmentUUID in an enabled Edition and present the reference content of the two Segments as a single presentation.
The ChapterSegmentUUID represents the Segment that holds the content to play in place of the Linked Chapter. The ChapterSegmentUUID MUST NOT be the SegmentUUID of its own Segment.
There are two ways to use a chapter link: Linked-Duration and Linked-Edition chapter linking.
With Linked-Duration, a Matroska Player MUST play the content of the Linked Segment from the ChapterTimeStart until the ChapterTimeEnd timestamp in place of the Linked Chapter.
ChapterTimeStart and ChapterTimeEnd represent timestamps in the Linked Segment matching the value of ChapterSegmentUUID. Their values MUST be in the range of the Linked Segment duration.
The ChapterTimeEnd value MUST be set when using Linked-Duration chapter linking. ChapterSegmentEditionUID MUST NOT be set.
With Linked-Edition, a Matroska Player MUST play the whole Linked Edition of the Linked Segment in place of the Linked Chapter.
ChapterSegmentEditionUID represents a valid Edition from the Linked Segment matching the value of ChapterSegmentUUID.
When using Linked-Edition chapter linking, ChapterTimeEnd is OPTIONAL.
The Default flag is a hint for a Matroska Player indicating that a given track SHOULD be eligible to be automatically selected as the default track for a given language. If no tracks in a given language have the Default flag set, then all tracks in that language are eligible for automatic selection. This can be used to indicate that a track provides "regular service" that is suitable for users with default settings, as opposed to specialized services, such as commentary, captions for users with hearing impairments, or descriptive audio.
The Matroska Player MAY override the Default flag for any reason, including user preferences to prefer tracks providing accessibility services.
The Forced flag tells the Matroska Player that it SHOULD display this subtitle track, even if user preferences usually would not call for any subtitles to be displayed alongside the audio track that is currently selected. This can be used to indicate that a track contains translations of on-screen text or dialogue spoken in a different language than the track's primary language.
The Hearing-Impaired flag tells the Matroska Player that it SHOULD prefer this track when selecting a default track for a user with a hearing impairment and that it MAY prefer to select a different track when selecting a default track for a user that is not hearing impaired.
The Visual-Impaired flag tells the Matroska Player that it SHOULD prefer this track when selecting a default track for a user with a visual impairment and that it MAY prefer to select a different track when selecting a default track for a user that is not visually impaired.
The Descriptions flag tells the Matroska Player that this track is suitable to play via a text-to-speech system for a user with a visual impairment and that it SHOULD NOT automatically select this track when selecting a default track for a user that is not visually impaired.
The Original flag tells the Matroska Player that this track is in the original language and that it SHOULD prefer this track if configured to prefer original-language tracks of this track's type.
The Commentary flag tells the Matroska Player that this track contains commentary on the content.
TrackOperation allows for the combination of multiple tracks to make a virtual one. It uses two separate systems to combine tracks: one to create a 3D "composition" (left/right/background planes) and one to simply join two tracks together to make a single track.
A track created with TrackOperation is a proper track with a UID and all its flags. However, the codec ID is meaningless because each "sub" track needs to be decoded by its own decoder before the "operation" is applied. The Cues elements corresponding to such a virtual track SHOULD be the union of the Cues elements for each of the tracks it's composed of (when the Cues are defined per track).
In the case of TrackJoinBlocks, the Block elements (from BlockGroup and SimpleBlock) of all the tracks SHOULD be used as if they were defined for this new virtual Track. When two Block elements have overlapping start or end timestamps, it's up to the underlying system to either drop some of these frames or render them the way they overlap. This situation SHOULD be avoided when creating such tracks, as you can never be sure of the end result on different platforms.
An overlay track SHOULD be rendered in the same channel as the track it's linked to. When content is found in such a track, it SHOULD be played on the rendering channel instead of the original track.
There are two different ways to compress 3D videos: have each eye in a separate track, or have one track with both eyes combined inside (which is more efficient compression-wise). Matroska supports both ways.
For the single-track variant, there is the StereoMode element, which defines how planes are assembled in the track (mono or left-right combined). Odd values of StereoMode mean the left plane comes first for more convenient reading. The pixel count of the track (PixelWidth/PixelHeight) is the raw number of pixels (for example, 3840x1080 for full HD side by side), and the DisplayWidth/DisplayHeight in pixels is the number of pixels for one plane (1920x1080 for that full HD stream). Old stereo 3D movies were displayed using anaglyph (cyan and red colors separated). For compatibility with such movies, there is a value of the StereoMode that corresponds to anaglyph.
There is also a "packed" mode (values 13 and 14) that consists of packing two frames together in a Block that uses lacing. The first frame is the left eye and the other frame is the right eye (or vice versa). The frames SHOULD be decoded in that order and are possibly dependent on each other (P and B frames).
For separate tracks, Matroska needs to define exactly which track does what. TrackOperation with TrackCombinePlanes does that. For more details, see Section 18.8 on how TrackOperation works.
The 3D support is still in its infancy and may evolve to support more features.
The StereoMode used to be part of Matroska v2, but it didn't meet the requirement for multiple tracks. There was also a bug in [libmatroska] prior to 0.9.0 that would save/read it as 0x53B9 instead of 0x53B8; see OldStereoMode (Section 5.1.4.1.28.5). Matroska Readers MAY support these legacy files by checking Matroska v2 or 0x53B9. The older values of StereoMode were 0 (mono), 1 (right eye), 2 (left eye), and 3 (both eyes); these are the only values that can be found in OldStereoMode. They are not compatible with the StereoMode values found in Matroska v3 and above.
This section provides some example sets of Tracks and hypothetical user settings, along with indications of which ones a similarly configured Matroska Player SHOULD automatically select for playback by default in such a situation. A player MAY provide additional settings with more detailed controls for more nuanced scenarios. These examples are provided as guidelines to illustrate the intended usages of the various supported Track flags and their expected behaviors.
Track names are shown in English for illustrative purposes; actual files may have titles in the language of each track or provide titles in multiple languages.
Example track set:¶
| No. | Type | Lang | Layout | Original | Default | Other Flags | Name |
|---|---|---|---|---|---|---|---|
| 1 | Video | und | N/A | N/A | N/A | None | |
| 2 | Audio | eng | 5.1 | 1 | 1 | None | |
| 3 | Audio | eng | 2.0 | 1 | 1 | None | |
| 4 | Audio | eng | 2.0 | 1 | 0 | Visual-Impaired | Descriptive audio |
| 5 | Audio | esp | 5.1 | 0 | 1 | None | |
| 6 | Audio | esp | 2.0 | 0 | 0 | Visual-Impaired | Descriptive audio |
| 7 | Audio | eng | 2.0 | 1 | 0 | Commentary | Director's Commentary |
| 8 | Audio | eng | 2.0 | 1 | 0 | None | Karaoke |
The table above shows a file with seven audio tracks -- five in English and two in Spanish.
The English tracks all have the Original flag, indicating that English is the original content language.
Generally, the player will first consider the track languages. If the player has an option to prefer original-language audio and the user has enabled it, then it should prefer one of the tracks with the Original flag. If the user has configured to specifically prefer audio tracks in English or Spanish, the player should select one of the tracks in the corresponding language. The player may also wish to prefer a track with the Original flag if no tracks matching any of the user's explicitly preferred languages are available.
Two of the tracks have the Visual-Impaired flag. If the player has been configured to prefer such tracks, it should select one; otherwise, it should avoid them if possible.
If selecting an English track, when other settings have left multiple possible options, it may be useful to exclude the tracks that lack the Default flag. Here, one provides descriptive service for individuals with visual impairments (which has its own flag and may be automatically selected by user configuration but is unsuitable for users with default-configured players), one is a commentary track (which has its own flag and the player may or may not have specialized handling for), and the last contains karaoke versions of the music that plays during the film (which is an unusual specialized audio service that Matroska has no built-in support for indicating, so it's indicated in the track name instead). By not setting the Default flag on these specialized tracks, the file's author hints that they should not be automatically selected by a default-configured player.
Having narrowed its choices down, the example player now may have to select between tracks 2 and 3. The only difference between these tracks is their channel layouts: 2 is 5.1 surround, while 3 is stereo. If the player is aware that the output device is a pair of headphones or stereo speakers, it may wish to prefer the stereo mix automatically. On the other hand, if it knows that the device is a surround system, it may wish to prefer the surround mix.
If the player finishes analyzing all of the available audio tracks and finds that more than one seem equally and maximally preferable, it SHOULD default to the first of the group.
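The narrowing steps described for this example can be sketched as a sequence of filters. Everything in the following Python block is a hypothetical player policy written for illustration: the `AudioTrack` class, the `select_audio` function, and its parameters are not defined by Matroska, and a real player may order or weight these criteria differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class AudioTrack:
    number: int
    lang: str
    layout: str
    original: bool
    default: bool
    visual_impaired: bool = False


def select_audio(tracks: Sequence[AudioTrack], preferred_langs: Sequence[str],
                 prefer_original: bool = False, prefer_visual_impaired: bool = False,
                 output_layout: Optional[str] = None) -> Optional[AudioTrack]:
    candidates: List[AudioTrack] = list(tracks)

    def narrow(keep) -> None:
        nonlocal candidates
        kept = [t for t in candidates if keep(t)]
        if kept:                      # never narrow down to nothing
            candidates = kept

    # 1. Language: Original flag if requested or if no preferred language matches.
    if prefer_original or not any(t.lang in preferred_langs for t in candidates):
        narrow(lambda t: t.original)
    else:
        narrow(lambda t: t.lang in preferred_langs)
    # 2. Visual-Impaired tracks are preferred or avoided as configured.
    narrow(lambda t: t.visual_impaired == prefer_visual_impaired)
    # 3. Exclude tracks lacking the Default flag when several remain.
    if len(candidates) > 1:
        narrow(lambda t: t.default)
    # 4. Prefer the layout that matches the output device, when known.
    if output_layout is not None:
        narrow(lambda t: t.layout == output_layout)
    # 5. Default to the first of the remaining group.
    return candidates[0] if candidates else None


tracks = [
    AudioTrack(2, "eng", "5.1", True, True),
    AudioTrack(3, "eng", "2.0", True, True),
    AudioTrack(4, "eng", "2.0", True, False, visual_impaired=True),
    AudioTrack(5, "esp", "5.1", False, True),
    AudioTrack(6, "esp", "2.0", False, False, visual_impaired=True),
    AudioTrack(7, "eng", "2.0", True, False),   # Director's Commentary
    AudioTrack(8, "eng", "2.0", True, False),   # Karaoke
]
stereo_choice = select_audio(tracks, ["eng"], output_layout="2.0")
```

Each filter is skipped when it would remove every remaining candidate, so the function always returns some track when the input is non-empty.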
Example track set:¶
| No. | Type | Lang | Original | Default | Forced | Other Flags | Name |
|---|---|---|---|---|---|---|---|
| 1 | Video | und | N/A | N/A | N/A | None | |
| 2 | Audio | fra | 1 | 1 | N/A | None | |
| 3 | Audio | por | 0 | 1 | N/A | None | |
| 4 | Subtitles | fra | 1 | 1 | 0 | None | |
| 5 | Subtitles | fra | 1 | 0 | 0 | Hearing-Impaired | Captions for users with hearing impairments |
| 6 | Subtitles | por | 0 | 1 | 0 | None | |
| 7 | Subtitles | por | 0 | 0 | 1 | None | Signs |
| 8 | Subtitles | por | 0 | 0 | 0 | Hearing-Impaired | SDH |
The table above shows two audio tracks and five subtitle tracks. As we can see, French is the original language.¶
We'll start by discussing the case where the user prefers French (or original-language) audio (or has explicitly selected the French audio track) and also prefers French subtitles.¶
In this case, if the player isn't configured to display captions when the audio matches their preferred subtitle languages, the player doesn't need to select a subtitle track at all.¶
If the user has indicated that they want captions to be displayed, the selection simply comes down to whether hearing-impaired subtitles are preferred.¶
The situation for a user who prefers Portuguese subtitles starts out somewhat analogous. If they select the original French audio (either by explicit audio language preference, preference for original-language tracks, or explicitly selecting that track), then the selection once again comes down to the hearing-impaired preference.¶
However, the case where the Portuguese audio track is selected has an important catch: a Forced track in Portuguese is present. This may contain translations of on-screen text from the video track or of portions of the audio that are not translated (music, for instance). This means that even if the user's preferences wouldn't normally call for captions here, the Forced track should be selected nonetheless, rather than selecting no track at all. On the other hand, if the user's preferences do call for captions, the non-Forced tracks should be preferred, as the Forced track will not contain captioning for the dialogue.¶
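The subtitle walkthrough above can be sketched in a few lines of Python. This is only an illustrative sketch, not a normative algorithm: the `SubTrack` record, the `pick_subtitle` helper, and the two preference parameters are hypothetical names introduced here, and the preference model (captions are wanted when the audio language differs from the subtitle language, or when the user asks for them) is an assumption drawn from the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubTrack:
    """Hypothetical record holding the subtitle columns of the example table."""
    number: int
    lang: str
    forced: bool = False
    hearing_impaired: bool = False


# Subtitle tracks 4 to 8 of the example track set.
EXAMPLE = [
    SubTrack(4, "fra"),
    SubTrack(5, "fra", hearing_impaired=True),
    SubTrack(6, "por"),
    SubTrack(7, "por", forced=True),
    SubTrack(8, "por", hearing_impaired=True),
]


def pick_subtitle(tracks: List[SubTrack], audio_lang: str, sub_lang: str,
                  captions_when_audio_matches: bool = False,
                  prefer_hearing_impaired: bool = False) -> Optional[SubTrack]:
    """Return the subtitle track to display, or None for no subtitles."""
    in_lang = [t for t in tracks if t.lang == sub_lang]
    # Assumption: captions are wanted whenever the audio is in another language.
    wants_captions = audio_lang != sub_lang or captions_when_audio_matches
    if wants_captions:
        # Forced tracks do not caption the dialogue, so skip them here.
        full = [t for t in in_lang if not t.forced]
        preferred = [t for t in full
                     if t.hearing_impaired == prefer_hearing_impaired]
        pool = preferred or full
        if pool:
            return pool[0]  # first of the equally preferable group
    # No captions wanted (or none available): still honor a Forced track.
    forced = [t for t in in_lang if t.forced]
    return forced[0] if forced else None
```

With the example set, Portuguese audio and Portuguese subtitle preferences yield track 7 (the Forced track) when captions are not requested and track 6 when they are.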
The Matroska Chapters system can have multiple Editions, and each Edition can consist of Simple Chapters where a chapter start time is used as a marker in the timeline only. An Edition can be more complex with Ordered Chapters where a chapter end timestamp is additionally used or much more complex with Linked Chapters. The Matroska Chapters system can also have a menu structure borrowed from the DVD-menu system [DVD-Video] or have its own built-in Matroska menu structure.¶
The EditionEntry is also called an Edition. An Edition contains a set of Edition flags and MUST contain at least one ChapterAtom element. Chapters are always inside an Edition (or a Chapter itself is part of an Edition). Multiple Editions are allowed. Some of these Editions MAY be ordered and others not.¶
Only one Edition SHOULD have an EditionFlagDefault flag set to true.¶
The Default Edition is the Edition that a Matroska Player SHOULD use for playback by default.¶
The first Edition with the EditionFlagDefault flag set to true is the Default Edition.¶
When all EditionFlagDefault flags are set to false, then the first Edition is the Default Edition.¶
| Edition | FlagDefault | Default Edition |
|---|---|---|
| Edition 1 | true | X |
| Edition 2 | true | |
| Edition 3 | true | |

| Edition | FlagDefault | Default Edition |
|---|---|---|
| Edition 1 | false | X |
| Edition 2 | false | |
| Edition 3 | false | |

| Edition | FlagDefault | Default Edition |
|---|---|---|
| Edition 1 | false | |
| Edition 2 | true | X |
| Edition 3 | false | |
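
The Default Edition rule illustrated by the three tables can be sketched as follows. This is a minimal sketch: `default_edition_index` is a hypothetical helper, and it assumes the EditionFlagDefault values are given as booleans in storage order.

```python
from typing import Sequence


def default_edition_index(flag_defaults: Sequence[bool]) -> int:
    """Return the index of the Default Edition.

    flag_defaults holds the EditionFlagDefault value of each Edition,
    in storage order (an assumption of this sketch).
    """
    if not flag_defaults:
        raise ValueError("at least one Edition is needed")
    for index, flag in enumerate(flag_defaults):
        if flag:
            return index  # first Edition with the flag set to true
    return 0  # all flags are false: the first Edition is the default
```

For the three tables above, the function returns 0, 0, and 1, respectively.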
The EditionFlagOrdered flag is a significant feature, as it enables an Edition of Ordered Chapters that defines and arranges a virtual timeline rather than simply labeling points within the timeline. For example, with Editions of Ordered Chapters, a single Matroska file can present multiple edits of a film without duplicating content. Alternatively, if a videotape is digitized in full, one Ordered Edition could present the full content (including colorbars, countdown, slate, a feature presentation, and black frames), while another Edition of Ordered Chapters can use Chapters that only mark the intended presentation with the colorbars and other ancillary visual information excluded. If an Edition of Ordered Chapters is enabled, then the Matroska Player MUST play those Chapters in their stored order from the timestamp marked in the ChapterTimeStart element to the timestamp marked in the ChapterTimeEnd element.¶
If the EditionFlagOrdered flag evaluates to "0", Simple Chapters are used and only the ChapterTimeStart of a Chapter is used as a chapter mark to jump to the predefined point in the timeline. With Simple Chapters, a Matroska Player MUST ignore certain elements inside a Chapters element. In that case, these elements are informational only.¶
The following list shows the different Chapters elements only found in Ordered Chapters.¶
ChapterAtom\ChapterSegmentUUID¶
ChapterAtom\ChapterSegmentEditionUID¶
ChapterAtom\ChapProcess¶
Info\ChapterTranslate¶
TrackEntry\TrackTranslate¶
Furthermore, there are other EBML elements that could be used if the EditionFlagOrdered evaluates to "1".¶
Ordered Chapters supersede Hard Linking.¶
Ordered Chapters are used in a normal way and can be combined with the ChapterSegmentUUID element, which establishes a link to another Segment.¶
See Section 17 on Linked Segments for more information about Hard Linking and Medium Linking.¶
The ChapterAtom is also called a Chapter.¶
ChapterTimeStart is the timestamp of the start of the Chapter with nanosecond accuracy and is not scaled by TimestampScale. For Simple Chapters, this is the position of the chapter markers in the timeline.¶
ChapterTimeEnd is the timestamp of the end of the Chapter with nanosecond accuracy and is not scaled by TimestampScale. The timestamp defined by the ChapterTimeEnd is not part of the Chapter. A Matroska Player calculates the duration of this Chapter using the difference between the ChapterTimeEnd and ChapterTimeStart. The end timestamp MUST be greater than or equal to the start timestamp.¶
When the ChapterTimeEnd timestamp is equal to the ChapterTimeStart timestamp, the timestamp is included in the Chapter. It can be useful to put markers in a file or add chapter commands with ordered chapter commands without having to play anything; see Section 5.1.7.1.4.14.¶
| Chapter | Start timestamp | End timestamp | Duration |
|---|---|---|---|
| Chapter 1 | 0 | 1000000000 | 1000000000 |
| Chapter 2 | 1000000000 | 5000000000 | 4000000000 |
| Chapter 3 | 6000000000 | 6000000000 | 0 |
| Chapter 4 | 9000000000 | 8000000000 | Invalid (-1000000000) |
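
The duration column of the table can be reproduced with a short calculation. The sketch below is illustrative only; `chapter_duration_ns` is a hypothetical helper name, and both arguments are assumed to be nanosecond values as stored in ChapterTimeStart and ChapterTimeEnd.

```python
def chapter_duration_ns(start_ns: int, end_ns: int) -> int:
    """Return ChapterTimeEnd - ChapterTimeStart in nanoseconds.

    Raises ValueError when the end timestamp is smaller than the start
    timestamp, which the text above does not allow.
    """
    if end_ns < start_ns:
        raise ValueError(
            f"invalid chapter: end {end_ns} is before start {start_ns}")
    return end_ns - start_ns
```

Applied to the table, Chapters 1 to 3 give 1000000000, 4000000000, and 0, while Chapter 4 is rejected.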
A ChapterAtom element can contain other ChapterAtom elements. That element is a Parent Chapter, and the ChapterAtom elements it contains are Nested Chapters.¶
Nested Chapters can be useful to tag small parts of a Segment that already have tags or add Chapter Codec commands on smaller parts of a Segment that already have Chapter Codec commands.¶
The ChapterTimeStart of a Nested Chapter MUST be greater than or equal to the ChapterTimeStart of its Parent Chapter.¶
If the Parent Chapter of a Nested Chapter has a ChapterTimeEnd, the ChapterTimeStart of that Nested Chapter MUST be smaller than or equal to the ChapterTimeEnd of the Parent Chapter.¶
The ChapterTimeEnd of the lowest level of Nested Chapters MUST be set for Ordered Chapters.¶
When used with Ordered Chapters, the ChapterTimeEnd value of a Parent Chapter is useless for playback, as the proper playback sections are described in its Nested Chapters. The ChapterTimeEnd SHOULD NOT be set in Parent Chapters and MUST be ignored for playback.¶
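The two ChapterTimeStart constraints on Nested Chapters can be checked as sketched below. This is a minimal sketch under stated assumptions: `nested_start_is_valid` is a hypothetical helper, timestamps are plain integers in nanoseconds, and a missing ChapterTimeEnd on the Parent Chapter is represented by `None`.

```python
from typing import Optional


def nested_start_is_valid(parent_start: int, parent_end: Optional[int],
                          nested_start: int) -> bool:
    """Check a Nested Chapter's ChapterTimeStart against its Parent Chapter."""
    # The nested start must not precede the parent's start.
    if nested_start < parent_start:
        return False
    # The upper bound only applies when the parent has a ChapterTimeEnd.
    if parent_end is not None and nested_start > parent_end:
        return False
    return True
```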
Each Chapter's ChapterFlagHidden flag works independently of Parent Chapters. A Nested Chapter with a ChapterFlagHidden flag that evaluates to "0" remains visible in the user interface even if the Parent Chapter's ChapterFlagHidden flag is set to "1".¶
| Chapter + Nested Chapter | ChapterFlagHidden | visible |
|---|---|---|
| Chapter 1 | 0 | yes |
| Nested Chapter 1.1 | 0 | yes |
| Nested Chapter 1.2 | 1 | no |
| Chapter 2 | 1 | no |
| Nested Chapter 2.1 | 0 | yes |
| Nested Chapter 2.2 | 1 | no |
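
The independence of ChapterFlagHidden from Parent Chapters can be sketched with a small tree walk. The `Chapter` record and `visible_names` helper are hypothetical names used for illustration; the example data mirrors the table above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chapter:
    """Hypothetical in-memory view of a ChapterAtom."""
    name: str
    hidden: bool = False  # ChapterFlagHidden
    nested: List["Chapter"] = field(default_factory=list)


def visible_names(chapters: List[Chapter]) -> List[str]:
    """List visible chapters; a hidden parent does not hide its children."""
    names: List[str] = []
    for chapter in chapters:
        if not chapter.hidden:
            names.append(chapter.name)
        # Always descend: each flag is evaluated on its own.
        names.extend(visible_names(chapter.nested))
    return names


TABLE = [
    Chapter("Chapter 1", False, [Chapter("Nested Chapter 1.1", False),
                                 Chapter("Nested Chapter 1.2", True)]),
    Chapter("Chapter 2", True, [Chapter("Nested Chapter 2.1", False),
                                Chapter("Nested Chapter 2.2", True)]),
]
```

Here `visible_names(TABLE)` lists Chapter 1, Nested Chapter 1.1, and Nested Chapter 2.1, matching the "yes" rows of the table.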
The menu features are handled like a chapter codec. That means each codec has a type, some private data, and some data in the chapters.¶
The type of the menu system is defined by the ChapProcessCodecID parameter. For now, only two values are supported: 0 (Matroska Script) and 1 (menu borrowed from the DVD [DVD-Video]). The private data stored in ChapProcessPrivate and ChapProcessData depends on the ChapProcessCodecID value.¶
The menu system, as well as Chapter Codecs in general, can perform actions on the Matroska Player, such as jumping to another Chapter or Edition, selecting different tracks, and possibly more. The scope of all the possibilities of Chapter Codecs is not covered in this document, as it depends on the Chapter Codec features and its integration in a Matroska Player.¶
Each level can have different meanings for audio and video. The ORIGINAL_MEDIA_TYPE tag [MatroskaTags] can be used to specify a string for ChapterPhysicalEquiv = 60. Here is the list of possible levels for both audio and video:¶
| Value | Audio | Video | Comment |
|---|---|---|---|
| 70 | SET / PACKAGE | SET / PACKAGE | the collection of different media |
| 60 | CD / 12" / 10" / 7" / TAPE / MINIDISC / DAT | DVD / VHS / LASERDISC | the physical medium like a CD or a DVD |
| 50 | SIDE | SIDE | when the original medium (LP/DVD) has different sides |
| 40 | - | LAYER | another physical level on DVDs |
| 30 | SESSION | SESSION | as found on CDs and DVDs |
| 20 | TRACK | - | as found on audio CDs |
| 10 | INDEX | - | the first logical level of the side/medium |
In this example, a movie is split into different chapters. It could also just be an audio file (album) in which each track corresponds to a chapter.¶
This translates to Matroska form, with the EBML tree shown as follows in XML:¶
<Chapters>
  <EditionEntry>
    <EditionUID>16603393396715046047</EditionUID>
    <ChapterAtom>
      <ChapterUID>1193046</ChapterUID>
      <ChapterTimeStart>0</ChapterTimeStart>
      <ChapterTimeEnd>5000000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Intro</ChapString>
      </ChapterDisplay>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>2311527</ChapterUID>
      <ChapterTimeStart>5000000000</ChapterTimeStart>
      <ChapterTimeEnd>25000000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Before the crime</ChapString>
      </ChapterDisplay>
      <ChapterDisplay>
        <ChapString>Avant le crime</ChapString>
        <ChapLanguage>fra</ChapLanguage>
      </ChapterDisplay>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>3430008</ChapterUID>
      <ChapterTimeStart>25000000000</ChapterTimeStart>
      <ChapterTimeEnd>27500000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>The crime</ChapString>
      </ChapterDisplay>
      <ChapterDisplay>
        <ChapString>Le crime</ChapString>
        <ChapLanguage>fra</ChapLanguage>
      </ChapterDisplay>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>4548489</ChapterUID>
      <ChapterTimeStart>27500000000</ChapterTimeStart>
      <ChapterTimeEnd>38000000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>After the crime</ChapString>
      </ChapterDisplay>
      <ChapterDisplay>
        <ChapString>Apres le crime</ChapString>
        <ChapLanguage>fra</ChapLanguage>
      </ChapterDisplay>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>5666960</ChapterUID>
      <ChapterTimeStart>38000000000</ChapterTimeStart>
      <ChapterTimeEnd>43000000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Credits</ChapString>
      </ChapterDisplay>
      <ChapterDisplay>
        <ChapString>Generique</ChapString>
        <ChapLanguage>fra</ChapLanguage>
      </ChapterDisplay>
    </ChapterAtom>
  </EditionEntry>
</Chapters>
In this example, an (existing) album is split into different chapters, and one of them contains another splitting.¶
00:00 - 12:28: Baby wants to Bleep/Rock¶
This translates to Matroska form, with the EBML tree shown as follows in XML:¶
<Chapters>
  <EditionEntry>
    <EditionUID>1281690858003401414</EditionUID>
    <ChapterAtom>
      <ChapterUID>1</ChapterUID>
      <ChapterTimeStart>0</ChapterTimeStart>
      <ChapterTimeEnd>748000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Baby wants to Bleep/Rock</ChapString>
      </ChapterDisplay>
      <ChapterAtom>
        <ChapterUID>2</ChapterUID>
        <ChapterTimeStart>0</ChapterTimeStart>
        <ChapterTimeEnd>278000000</ChapterTimeEnd>
        <ChapterDisplay>
          <ChapString>Baby wants to bleep (pt.1)</ChapString>
        </ChapterDisplay>
      </ChapterAtom>
      <ChapterAtom>
        <ChapterUID>3</ChapterUID>
        <ChapterTimeStart>278000000</ChapterTimeStart>
        <ChapterTimeEnd>432000000</ChapterTimeEnd>
        <ChapterDisplay>
          <ChapString>Baby wants to rock</ChapString>
        </ChapterDisplay>
      </ChapterAtom>
      <ChapterAtom>
        <ChapterUID>4</ChapterUID>
        <ChapterTimeStart>432000000</ChapterTimeStart>
        <ChapterTimeEnd>633000000</ChapterTimeEnd>
        <ChapterDisplay>
          <ChapString>Baby wants to bleep (pt.2)</ChapString>
        </ChapterDisplay>
      </ChapterAtom>
      <ChapterAtom>
        <ChapterUID>5</ChapterUID>
        <ChapterTimeStart>633000000</ChapterTimeStart>
        <ChapterTimeEnd>748000000</ChapterTimeEnd>
        <ChapterDisplay>
          <ChapString>Baby wants to bleep (pt.3)</ChapString>
        </ChapterDisplay>
      </ChapterAtom>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>6</ChapterUID>
      <ChapterTimeStart>750000000</ChapterTimeStart>
      <ChapterTimeEnd>1178500000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Bleeper_O+2</ChapString>
      </ChapterDisplay>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>7</ChapterUID>
      <ChapterTimeStart>1180500000</ChapterTimeStart>
      <ChapterTimeEnd>1340000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Baby wants to bleep (pt.4)</ChapString>
      </ChapterDisplay>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>8</ChapterUID>
      <ChapterTimeStart>1342000000</ChapterTimeStart>
      <ChapterTimeEnd>1518000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Bleep to bleep</ChapString>
      </ChapterDisplay>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>9</ChapterUID>
      <ChapterTimeStart>1520000000</ChapterTimeStart>
      <ChapterTimeEnd>2015000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Baby wants to bleep (k)</ChapString>
      </ChapterDisplay>
    </ChapterAtom>
    <ChapterAtom>
      <ChapterUID>10</ChapterUID>
      <ChapterTimeStart>2017000000</ChapterTimeStart>
      <ChapterTimeEnd>2668000000</ChapterTimeEnd>
      <ChapterDisplay>
        <ChapString>Bleeper</ChapString>
      </ChapterDisplay>
    </ChapterAtom>
  </EditionEntry>
</Chapters>
Matroska supports storage of related files and data in the Attachments element (a Top-Level Element). Attachments elements can be used to store related cover art, font files, transcripts, reports, error recovery files, pictures, text-based annotations, copies of specifications, or other ancillary files related to the Segment.¶
Matroska Readers MUST NOT execute files stored as Attachments elements.¶
This section defines a set of guidelines for the storage of cover art in Matroska files. A Matroska Reader MAY use embedded cover art to display a representational still-image depiction of the multimedia contents of the Matroska file.¶
Only [JPEG] and PNG [RFC2083] image formats SHOULD be used for cover art pictures.¶
There can be two different covers for a movie/album: a portrait style (e.g., a DVD case) and a landscape style (e.g., a wide banner ad).¶
There can be two versions of the same cover: the normal cover and the small cover. The dimension of the normal cover SHOULD be 600 pixels on the smallest side (e.g., 960x600 for landscape, 600x800 for portrait, or 600x600 for square). The dimension of the small cover SHOULD be 120 pixels on the smallest side (e.g., 192x120 or 120x160).¶
Versions of cover art can be differentiated by the filename, which is stored in the FileName element. The default filename of the normal cover in square or portrait mode is cover.(jpg|png). When stored, the normal cover SHOULD be the first Attachments element in storage order. The small cover SHOULD be prefixed with "small_", such as small_cover.(jpg|png). The landscape variant SHOULD be suffixed with "_land", such as cover_land.(jpg|png). The filenames are case-sensitive.¶
The following table provides examples of file names for cover art in Attachments.¶
| File Name | Image Orientation | Pixel Length of Smallest Side |
|---|---|---|
| cover.jpg | Portrait or square | 600 |
| small_cover.png | Portrait or square | 120 |
| cover_land.png | Landscape | 600 |
| small_cover_land.jpg | Landscape | 120 |
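
The naming scheme in the table can be recognized with a single case-sensitive pattern. The sketch below is illustrative; `classify_cover` and `COVER_RE` are hypothetical names, and the returned tuple simply mirrors the last two table columns.

```python
import re
from typing import Optional, Tuple

# Optional "small_" prefix, optional "_land" suffix, jpg or png extension.
# The match is case-sensitive, as the filenames are case-sensitive.
COVER_RE = re.compile(r"(small_)?cover(_land)?\.(jpg|png)")


def classify_cover(file_name: str) -> Optional[Tuple[str, int]]:
    """Return (orientation, smallest side in pixels), or None if no match."""
    match = COVER_RE.fullmatch(file_name)
    if match is None:
        return None
    orientation = "Landscape" if match.group(2) else "Portrait or square"
    smallest_side = 120 if match.group(1) else 600
    return orientation, smallest_side
```

For example, `classify_cover("small_cover_land.jpg")` gives `("Landscape", 120)`, while `classify_cover("Cover.jpg")` gives `None` because of the uppercase letter.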
Font files MAY be added to a Matroska file as Attachments so that the font file may be used to display an associated subtitle track. This allows the presentation of a Matroska file to be consistent in various environments where the needed fonts might not be available on the local system.¶
Depending on the font format in question, each font file can contain multiple font variants. Each font variant has a name that will be referred to as Font Name from now on. This Font Name can be different from the Attachment's FileName, even when disregarding the extension. In order to select a font for display, a Matroska Player SHOULD consider both the Font Name and the base name of the Attachment's FileName, preferring the former when there are multiple matches.¶
Subtitle codecs, such as SubStation Alpha (SSA) and Advanced SubStation Alpha (ASS), usually refer to a font by its Font Name, not by its filename. If none of the Attachments are a match for the Font Name, the Matroska Player SHOULD attempt to find a system font whose Font Name matches the one used in the subtitle track.¶
Since loading fonts temporarily can take a while, a Matroska Player usually loads or installs all the fonts found in attachments so they are ready to be used during playback. Failure to use the font attachment might result in incorrect rendering of the subtitles.¶
If a selected subtitle track has some AttachmentLink elements, the player MAY restrict its font rendering to use only these fonts.¶
A Matroska Player SHOULD handle the official font media types from [RFC8081] when the system can handle the type:¶
font/sfnt: Generic SFNT Font Type¶
font/ttf: TrueType Font (TTF) Font Type¶
font/otf: OpenType Layout (OTF) Font Type¶
font/collection: Collection Font Type¶
font/woff: WOFF 1.0¶
font/woff2: WOFF 2.0¶
Fonts in Matroska existed long before [RFC8081]. A few unofficial media types for fonts were used in existing files. Therefore, it is RECOMMENDED for a Matroska Player to support the following legacy media types for font attachments:¶
application/x-truetype-font: TrueType fonts, equivalent to font/ttf and sometimes font/otf¶
application/x-font-ttf: TrueType fonts, equivalent to font/ttf¶
application/vnd.ms-opentype: OpenType Layout fonts, equivalent to font/otf¶
application/font-sfnt: Generic SFNT Font Type, equivalent to font/sfnt¶
application/font-woff: WOFF 1.0, equivalent to font/woff¶
There may also be some font attachments with the application/octet-stream media type. In that case, the Matroska Player MAY try to guess the font type by checking the file extension of the AttachedFile\FileName string. Common file extensions for fonts are:¶
.ttf for TrueType fonts, equivalent to font/ttf¶
.otf for OpenType Layout fonts, equivalent to font/otf¶
.ttc for Collection fonts, equivalent to font/collection¶
The file extension check MUST be case-insensitive.¶
Matroska Writers SHOULD use a valid font media type from [RFC8081] in the AttachedFile\FileMediaType of the font attachment. They MAY use the media types found in older files when compatibility with older players is necessary.¶
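The media type handling described above can be sketched as a lookup from the stored media type and filename to one of the official [RFC8081] types. This is only a sketch: `font_media_type` and the three tables are hypothetical names, and mapping application/x-truetype-font to font/ttf is a simplification, since the text notes that it sometimes denotes font/otf.

```python
import os
from typing import Optional

OFFICIAL = {"font/sfnt", "font/ttf", "font/otf",
            "font/collection", "font/woff", "font/woff2"}

LEGACY = {
    # Simplification: this legacy type sometimes means font/otf instead.
    "application/x-truetype-font": "font/ttf",
    "application/x-font-ttf": "font/ttf",
    "application/vnd.ms-opentype": "font/otf",
    "application/font-sfnt": "font/sfnt",
    "application/font-woff": "font/woff",
}

EXTENSIONS = {".ttf": "font/ttf", ".otf": "font/otf",
              ".ttc": "font/collection"}


def font_media_type(media_type: str, file_name: str) -> Optional[str]:
    """Map an attachment to an official font media type, or None."""
    if media_type in OFFICIAL:
        return media_type
    if media_type in LEGACY:
        return LEGACY[media_type]
    if media_type == "application/octet-stream":
        # Guess from the extension; the check is case-insensitive.
        extension = os.path.splitext(file_name)[1].lower()
        return EXTENSIONS.get(extension)
    return None  # not recognized as a font attachment
```

For instance, an attachment stored as application/octet-stream with the filename `Arial.TTF` resolves to font/ttf.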
The Cues element provides an index of certain Cluster elements to allow for optimized seeking to absolute timestamps within the Segment. The Cues element contains one or many CuePoint elements, each of which MUST reference an absolute timestamp (via the CueTime element), a Track (via the CueTrack element), and a Segment Position (via the CueClusterPosition element). Additional non-mandated elements are part of the CuePoint element, such as CueDuration, CueRelativePosition, CueCodecState, and others that provide any Matroska Reader with additional information to use in the optimization of seeking performance.¶
The following recommendations are provided to optimize Matroska performance.¶
Unless Matroska is used as a live stream, it SHOULD contain a Cues element.¶
For each video track, each keyframe SHOULD be referenced by a CuePoint element.¶
It is RECOMMENDED to not reference non-keyframes of video tracks in Cues unless it references a Cluster element that contains a CodecState element but no keyframes.¶
For each subtitle track present, each subtitle frame SHOULD be referenced by a CuePoint element with a CueDuration element.¶
References to audio tracks MAY be skipped in CuePoint elements if a video track is present. When included, the CuePoint elements SHOULD reference audio keyframes once every 500 milliseconds at most.¶
If the referenced frame is not stored within the first SimpleBlock or first BlockGroup within its Cluster element, then the CueRelativePosition element SHOULD be written to reference where in the Cluster the reference frame is stored.¶
If a CuePoint element references a Cluster element that includes a CodecState element, then that CuePoint element MUST use a CueCodecState element.¶
CuePoint elements SHOULD be numerically sorted in storage order by the value of the CueTime element.¶
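One practical benefit of sorted CuePoint elements is that a reader can locate a seek target with a binary search. The sketch below is an assumption-laden illustration, not a required algorithm: `find_cue_index` is a hypothetical helper, the CueTime values are assumed to be already parsed into a sorted list, and the target is assumed to use the same unit as CueTime.

```python
import bisect
from typing import Optional, Sequence


def find_cue_index(cue_times: Sequence[int], target: int) -> Optional[int]:
    """Return the index of the last CuePoint whose CueTime is <= target.

    cue_times must be sorted in ascending order. Returns None when the
    target lies before the first CuePoint.
    """
    index = bisect.bisect_right(cue_times, target) - 1
    return index if index >= 0 else None
```

A reader that cannot rely on the ordering (it is a SHOULD, not a MUST) could sort a copy of the list first.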
In Matroska, there are two kinds of streaming: file access and livestreaming.¶
File access can simply be reading a file located on your computer, but it also includes accessing a file from an HTTP (web) server or Common Internet File System (CIFS) (Windows share) server. These protocols are usually safe from reading errors, and seeking in the stream is possible. However, when a file is stored far away or on a slow server, seeking can be an expensive operation and should be avoided. When followed, the guidelines in Section 25 help reduce the number of seeking operations for regular playback and also have the playback start quickly without needing to read a lot of data first (like a Cues element, Attachments element, or SeekHead element).¶
Matroska, having a small overhead, is well suited for storing music/videos on file servers without a big impact on the bandwidth used. Matroska does not require the index to be loaded before playing, which allows playback to start very quickly. The index can be loaded only when seeking is requested the first time.¶
Livestreaming is the equivalent of television broadcasting on the Internet. There are two families of servers for livestreaming: RTP / Real-Time Streaming Protocol (RTSP) and HTTP. Matroska is not meant to be used over RTP. RTP already has timing and channel mechanisms that would be wasted if doubled in Matroska. Additionally, having the same information at the RTP and Matroska level would be a source of confusion if they do not match. Livestreaming of Matroska over file-like protocols like HTTP, QUIC, etc., is possible.¶
A live Matroska stream is different from a file because it usually has no known end (only ending when the client disconnects). For this, all bits of the "size" portion of the Segment element MUST be set to 1. Another option is to concatenate Segment elements with known sizes, one after the other. This solution allows a change of codec/resolution between each segment. For example, this allows for a switch between 4:3 and 16:9 in a television program.¶
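As an illustration of the all-ones size, the sketch below builds and recognizes such a size field. It assumes the EBML variable-size integer layout from [RFC8794] (leading zero bits, a marker bit, then the data bits); `unknown_size_vint` and `is_unknown_size` are hypothetical helper names.

```python
def unknown_size_vint(length: int = 8) -> bytes:
    """Encode an element size of `length` octets with all data bits set to 1."""
    if not 1 <= length <= 8:
        raise ValueError("size length must be between 1 and 8 octets")
    # length - 1 leading zero bits, one marker bit, then 7 * length data
    # bits: setting the marker and every data bit gives 2**(7*length+1) - 1.
    value = (1 << (7 * length + 1)) - 1
    return value.to_bytes(length, "big")


def is_unknown_size(raw: bytes) -> bool:
    """Tell whether a complete size field has all of its data bits set to 1."""
    return 1 <= len(raw) <= 8 and raw == unknown_size_vint(len(raw))
```

A one-octet field is `0xFF`, and an eight-octet field is `0x01` followed by seven `0xFF` octets.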
When Segment elements are continuous, certain elements (like SeekHead, Cues, Chapters, and Attachments) MUST NOT be used.¶
It is possible for a Matroska Player to detect that a stream is not seekable. If the stream has neither a SeekHead list nor a Cues list at the beginning of the stream, it SHOULD be considered non-seekable. Even though it is possible to seek forward in the stream, it is NOT RECOMMENDED.¶
In the context of live radio or web TV, it is possible to "tag" the content while it is playing. The Tags element can be placed between Clusters each time it is necessary. In that case, the new Tags element MUST reset the previously encountered Tags elements and use the new values instead.¶
Tags allow tagging all kinds of Matroska parts with very detailed metadata in multiple languages.¶
Some Matroska elements also contain their own string value, like the track Name element (Section 5.1.4.1.18) or the ChapString element (Section 5.1.7.1.4.10).¶
The following Matroska elements can also be defined with tags:¶
The track Name element (Section 5.1.4.1.18) corresponds to a tag with the TagTrackUID (Section 5.1.8.1.1.3) set to the given track, a TagName of TITLE (Section 5.1.8.1.2.1), and a TagLanguage (Section 5.1.8.1.2.2) or TagLanguageBCP47 (Section 5.1.8.1.2.3) of "und".¶
The ChapString element (Section 5.1.7.1.4.10) corresponds to a tag with the TagChapterUID (Section 5.1.8.1.1.5) set to the same chapter UID, a TagName of TITLE (Section 5.1.8.1.2.1), and a TagLanguage (Section 5.1.8.1.2.2) or TagLanguageBCP47 (Section 5.1.8.1.2.3) matching the ChapLanguage (Section 5.1.7.1.4.11) or ChapLanguageBCP47 (Section 5.1.7.1.4.12), respectively.¶
The FileDescription element (Section 5.1.6.1.1) of an attachment corresponds to a tag with the TagAttachmentUID (Section 5.1.8.1.1.6) set to the given attachment, a TagName of TITLE (Section 5.1.8.1.2.1), and a TagLanguage (Section 5.1.8.1.2.2) or TagLanguageBCP47 (Section 5.1.8.1.2.3) of "und".¶
When both values exist in the file, the value found in Tags takes precedence over the value found in the original location of the element. For example, if you have a TrackEntry\Name element and a tag value TITLE for that track in a Matroska Segment, the tag value string SHOULD be used instead of the TrackEntry\Name string to identify the track.¶
As the Tag element is optional, a lot of Matroska Readers do not handle it and will not use the tags value when it's found. Thus, for maximum compatibility, it's usually better to put the strings in the TrackEntry, ChapterAtom, and Attachments elements and keep the tags matching these values if tags are also used.¶
Tag elements allow tagging information on multiple levels, with each level having a TargetTypeValue (Section 5.1.8.1.1.1). An element for a given TargetTypeValue also applies to the lower levels denoted by smaller TargetTypeValue values. If an upper value doesn't apply to a level but the actual value to use is not known, an empty TagString (Section 5.1.8.1.2.5) or an empty TagBinary (Section 5.1.8.1.2.6) MUST be used as the tag value for this level.¶
See [MatroskaTags] for more details on common tag names, types, and descriptions.¶
It is RECOMMENDED that each individual Cluster element contain no more than five seconds or five megabytes of content.¶
It is RECOMMENDED that the first SeekHead element be followed by a Void element to allow for the SeekHead element to be expanded to cover new Top-Level Elements that could be added to the Matroska file, such as Tags, Chapters, and Attachments elements.¶
The size of this Void element should be adjusted depending on the Tags, Chapters, and Attachments elements in the Matroska file.¶
While there can be Top-Level Elements in any order, some orderings of elements are better than others. The following subsections detail optimum layouts for different use cases.¶
This is the basic layout muxers should be using for an efficient playback experience:¶
When tags from the previous layout need to be extended, they are moved to the end with the extra information. The location where the old tags were located is voided.¶
Cues are usually a big chunk of data referencing a lot of locations in the file. Players that want to seek in the file need to seek to the end of the file to access these locations. It is often better if they are placed early in the file. On the other hand, that means players that don't intend to seek will have to read/skip these data no matter what.¶
Because the Cues reference locations further in the file, it's often complicated to allocate the proper space for that element before all the locations are known. Therefore, this layout is rarely used:¶
In livestreaming (Section 23.2), only a few elements make sense. For example, SeekHead and Cues are useless. All elements other than the Clusters MUST be placed before the Clusters.¶
Matroska inherits security considerations from EBML [RFC8794].¶
Attacks on a Matroska Reader could include:¶
Storage of arbitrary and potentially executable data within an Attachments element. Matroska Readers that extract or use data from Matroska Attachments SHOULD check that the data adheres to expectations or not use the attachment.¶
A Matroska Attachment with an inaccurate media type.¶
Damage to the Encryption and Compression fields (Section 14) that would result in bogus binary data interpreted by the decoder.¶
Chapter Codecs running unwanted commands on the host system.¶
The same error handling done for EBML applies to Matroska files. Particular error handling is not covered in this specification, as this depends on the goal of the Matroska Readers. Matroska Readers decide how to handle the errors whether or not they are recoverable in their code. For example, if the checksum of the \Segment\Tracks is invalid, some could decide to try to read the data anyway, some will just reject the file, and most will not even check it.¶
Matroska Reader implementations need to be robust against malicious payloads. Those related to denial of service are outlined in Section 2.1 of [RFC4732].¶
Although rarer, the same may apply to a Matroska Writer. Malicious stream data must not cause the Matroska Writer to misbehave, as this might allow an attacker access to transcoding gateways.¶
As an audio/video container format, a Matroska file or stream will potentially encapsulate numerous byte streams created with a variety of codecs. Implementers will need to consider the security considerations of these encapsulated formats.¶
IANA has created a new registry called the "Matroska Element IDs" registry.¶
To register a new Element ID in this registry, one needs an Element ID, an Element Name, a Change Controller, and an optional Reference to a document describing the Element ID.¶
Element IDs are encoded using the VINT mechanism described in Section 4 of [RFC8794] and can be between one and five octets long. Five-octet Element IDs are possible only if declared in the EBML Header.¶
Element IDs are described in Section 5 of [RFC8794], with the changes in [Err7189] and [Err7191].¶
One-octet Matroska Element IDs (range 0x80-0xFE) are to be allocated according to the "RFC Required" policy [RFC8126].¶
Two-octet Matroska Element IDs (range 0x407F-0x7FFE) are to be allocated according to the "Specification Required" policy [RFC8126].¶
Two-octet Matroska Element IDs between 0x0100 and 0x407E are not valid for use as an Element ID.¶
Three-octet (range 0x203FFF-0x3FFFFE) and four-octet Matroska Element IDs (range 0x101FFFFF-0x1FFFFFFE) are to be allocated according to the "First Come First Served" policy [RFC8126].¶
Three-octet Matroska Element IDs between 0x010000 and 0x203FFE are not valid for use as an Element ID.¶
Four-octet Matroska Element IDs between 0x01000000 and 0x101FFFFE are not valid for use as an Element ID.¶
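The ranges and policies listed above can be summarized as a small lookup. This is an illustrative sketch only: `allocation_policy` and `POLICY_RANGES` are hypothetical names, five-octet Element IDs are not covered, and the sketch does not check for EBML Header Element IDs, which cannot be used as Matroska Element IDs.

```python
from typing import Optional

# (lowest ID, highest ID, registration policy), taken from the text above.
POLICY_RANGES = [
    (0x80, 0xFE, "RFC Required"),                          # one octet
    (0x407F, 0x7FFE, "Specification Required"),            # two octets
    (0x203FFF, 0x3FFFFE, "First Come First Served"),       # three octets
    (0x101FFFFF, 0x1FFFFFFE, "First Come First Served"),   # four octets
]


def allocation_policy(element_id: int) -> Optional[str]:
    """Return the policy for an Element ID, or None if it is outside
    the one-octet to four-octet ranges listed above."""
    for lowest, highest, policy in POLICY_RANGES:
        if lowest <= element_id <= highest:
            return policy
    return None
```

For example, `allocation_policy(0xA3)` returns "RFC Required", while `allocation_policy(0x203FFE)` returns `None` because that value is not valid as an Element ID.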
The allowed values in the "Matroska Element IDs" registry are similar to the ones found in the "EBML Element IDs" registry defined in Section 17.1 of [RFC8794].¶
EBML Element IDs defined for the EBML Header -- as defined in Section 17.1 of [RFC8794] -- MUST NOT be used as Matroska Element IDs.¶
Given the scarcity of one-octet Element IDs, they should only be created to save space for elements found many times in a file (for example, BlockGroup or Chapters). The four-octet Element IDs are mostly for synchronization of large elements. They should only be used for such high-level elements. Elements that are not expected to be used often should use three-octet Element IDs.¶
Elements found in Appendix A have an assigned Matroska Element ID for historical reasons. These elements are not in use and SHOULD NOT be reused unless there are no other IDs available with the desired size. Such IDs are marked as "Reclaimed" in the "Matroska Element IDs" registry, as they could be used for other things in the future.¶
Table 53 shows the initial contents of the "Matroska Element IDs" registry. The Change Controller for the initial entries is the IETF.¶
| Element ID | Element Name | Reference |
|---|---|---|
| 0x80 | ChapterDisplay | RFC 9559, Section 5.1.7.1.4.9 |
| 0x83 | TrackType | RFC 9559, Section 5.1.4.1.3 |
| 0x85 | ChapString | RFC 9559, Section 5.1.7.1.4.10 |
| 0x86 | CodecID | RFC 9559, Section 5.1.4.1.21 |
| 0x88 | FlagDefault | RFC 9559, Section 5.1.4.1.5 |
| 0x8E | Slices | Reclaimed (RFC 9559, Appendix A.5) |
| 0x91 | ChapterTimeStart | RFC 9559, Section 5.1.7.1.4.3 |
| 0x92 | ChapterTimeEnd | RFC 9559, Section 5.1.7.1.4.4 |
| 0x96 | CueRefTime | RFC 9559, Section 5.1.5.1.2.8 |
| 0x97 | CueRefCluster | Reclaimed (RFC 9559, Appendix A.37) |
| 0x98 | ChapterFlagHidden | RFC 9559, Section 5.1.7.1.4.5 |
| 0x9A | FlagInterlaced | RFC 9559, Section 5.1.4.1.28.1 |
| 0x9B | BlockDuration | RFC 9559, Section 5.1.3.5.3 |
| 0x9C | FlagLacing | RFC 9559, Section 5.1.4.1.12 |
| 0x9D | FieldOrder | RFC 9559, Section 5.1.4.1.28.2 |
| 0x9F | Channels | RFC 9559, Section 5.1.4.1.29.3 |
| 0xA0 | BlockGroup | RFC 9559, Section 5.1.3.5 |
| 0xA1 | Block | RFC 9559, Section 5.1.3.5.1 |
| 0xA2 | BlockVirtual | Reclaimed (RFC 9559, Appendix A.3) |
| 0xA3 | SimpleBlock | RFC 9559, Section 5.1.3.4 |
| 0xA4 | CodecState | RFC 9559, Section 5.1.3.5.6 |
| 0xA5 | BlockAdditional | RFC 9559, Section 5.1.3.5.2.2 |
| 0xA6 | BlockMore | RFC 9559, Section 5.1.3.5.2.1 |
| 0xA7 | Position | RFC 9559, Section 5.1.3.2 |
| 0xAA | CodecDecodeAll | Reclaimed (RFC 9559, Appendix A.22) |
| 0xAB | PrevSize | RFC 9559, Section 5.1.3.3 |
| 0xAE | TrackEntry | RFC 9559, Section 5.1.4.1 |
| 0xAF | EncryptedBlock | Reclaimed (RFC 9559, Appendix A.15) |
| 0xB0 | PixelWidth | RFC 9559, Section 5.1.4.1.28.6 |
| 0xB2 | CueDuration | RFC 9559, Section 5.1.5.1.2.4 |
| 0xB3 | CueTime | RFC 9559, Section 5.1.5.1.1 |
| 0xB5 | SamplingFrequency | RFC 9559, Section 5.1.4.1.29.1 |
| 0xB6 | ChapterAtom | RFC 9559, Section 5.1.7.1.4 |
| 0xB7 | CueTrackPositions | RFC 9559, Section 5.1.5.1.2 |
| 0xB9 | FlagEnabled | RFC 9559, Section 5.1.4.1.4 |
| 0xBA | PixelHeight | RFC 9559, Section 5.1.4.1.28.7 |
| 0xBB | CuePoint | RFC 9559, Section 5.1.5.1 |
| 0xC0 | TrickTrackUID | Reclaimed (RFC 9559, Appendix A.28) |
| 0xC1 | TrickTrackSegmentUID | Reclaimed (RFC 9559, Appendix A.29) |
| 0xC4 | TrickMasterTrackSegmentUID | Reclaimed (RFC 9559, Appendix A.32) |
| 0xC6 | TrickTrackFlag | Reclaimed (RFC 9559, Appendix A.30) |
| 0xC7 | TrickMasterTrackUID | Reclaimed (RFC 9559, Appendix A.31) |
| 0xC8 | ReferenceFrame | Reclaimed (RFC 9559, Appendix A.12) |
| 0xC9 | ReferenceOffset | Reclaimed (RFC 9559, Appendix A.13) |
| 0xCA | ReferenceTimestamp | Reclaimed (RFC 9559, Appendix A.14) |
| 0xCB | BlockAdditionID | Reclaimed (RFC 9559, Appendix A.9) |
| 0xCC | LaceNumber | Reclaimed (RFC 9559, Appendix A.7) |
| 0xCD | FrameNumber | Reclaimed (RFC 9559, Appendix A.8) |
| 0xCE | Delay | Reclaimed (RFC 9559, Appendix A.10) |
| 0xCF | SliceDuration | Reclaimed (RFC 9559, Appendix A.11) |
| 0xD7 | TrackNumber | RFC 9559, Section 5.1.4.1.1 |
| 0xDB | CueReference | RFC 9559, Section 5.1.5.1.2.7 |
| 0xE0 | Video | RFC 9559, Section 5.1.4.1.28 |
| 0xE1 | Audio | RFC 9559, Section 5.1.4.1.29 |
| 0xE2 | TrackOperation | RFC 9559, Section 5.1.4.1.30 |
| 0xE3 | TrackCombinePlanes | RFC 9559, Section 5.1.4.1.30.1 |
| 0xE4 | TrackPlane | RFC 9559, Section 5.1.4.1.30.2 |
| 0xE5 | TrackPlaneUID | RFC 9559, Section 5.1.4.1.30.3 |
| 0xE6 | TrackPlaneType | RFC 9559, Section 5.1.4.1.30.4 |
| 0xE7 | Timestamp | RFC 9559,Section 5.1.3.1 |
| 0xE8 | TimeSlice | Reclaimed (RFC 9559,Appendix A.6) |
| 0xE9 | TrackJoinBlocks | RFC 9559,Section 5.1.4.1.30.5 |
| 0xEA | CueCodecState | RFC 9559,Section 5.1.5.1.2.6 |
| 0xEB | CueRefCodecState | Reclaimed (RFC 9559,Appendix A.39) |
| 0xED | TrackJoinUID | RFC 9559,Section 5.1.4.1.30.6 |
| 0xEE | BlockAddID | RFC 9559,Section 5.1.3.5.2.3 |
| 0xF0 | CueRelativePosition | RFC 9559,Section 5.1.5.1.2.3 |
| 0xF1 | CueClusterPosition | RFC 9559,Section 5.1.5.1.2.2 |
| 0xF7 | CueTrack | RFC 9559,Section 5.1.5.1.2.1 |
| 0xFA | ReferencePriority | RFC 9559,Section 5.1.3.5.4 |
| 0xFB | ReferenceBlock | RFC 9559,Section 5.1.3.5.5 |
| 0xFD | ReferenceVirtual | Reclaimed (RFC 9559,Appendix A.4) |
| 0xFF | Reserved | RFC 9559 |
| 0x0100-0x407E | Not valid for use as an Element ID | RFC 9559,Section 27.1 |
| 0x41A4 | BlockAddIDName | RFC 9559,Section 5.1.4.1.17.2 |
| 0x41E4 | BlockAdditionMapping | RFC 9559,Section 5.1.4.1.17 |
| 0x41E7 | BlockAddIDType | RFC 9559,Section 5.1.4.1.17.3 |
| 0x41ED | BlockAddIDExtraData | RFC 9559,Section 5.1.4.1.17.4 |
| 0x41F0 | BlockAddIDValue | RFC 9559,Section 5.1.4.1.17.1 |
| 0x4254 | ContentCompAlgo | RFC 9559,Section 5.1.4.1.31.6 |
| 0x4255 | ContentCompSettings | RFC 9559,Section 5.1.4.1.31.7 |
| 0x437C | ChapLanguage | RFC 9559,Section 5.1.7.1.4.11 |
| 0x437D | ChapLanguageBCP47 | RFC 9559,Section 5.1.7.1.4.12 |
| 0x437E | ChapCountry | RFC 9559,Section 5.1.7.1.4.13 |
| 0x4444 | SegmentFamily | RFC 9559,Section 5.1.2.7 |
| 0x4461 | DateUTC | RFC 9559,Section 5.1.2.11 |
| 0x447A | TagLanguage | RFC 9559,Section 5.1.8.1.2.2 |
| 0x447B | TagLanguageBCP47 | RFC 9559,Section 5.1.8.1.2.3 |
| 0x4484 | TagDefault | RFC 9559,Section 5.1.8.1.2.4 |
| 0x4485 | TagBinary | RFC 9559,Section 5.1.8.1.2.6 |
| 0x4487 | TagString | RFC 9559,Section 5.1.8.1.2.5 |
| 0x4489 | Duration | RFC 9559,Section 5.1.2.10 |
| 0x44B4 | TagDefaultBogus | Reclaimed (RFC 9559,Appendix A.43) |
| 0x450D | ChapProcessPrivate | RFC 9559,Section 5.1.7.1.4.16 |
| 0x45A3 | TagName | RFC 9559,Section 5.1.8.1.2.1 |
| 0x45B9 | EditionEntry | RFC 9559,Section 5.1.7.1 |
| 0x45BC | EditionUID | RFC 9559,Section 5.1.7.1.1 |
| 0x45DB | EditionFlagDefault | RFC 9559,Section 5.1.7.1.2 |
| 0x45DD | EditionFlagOrdered | RFC 9559,Section 5.1.7.1.3 |
| 0x465C | FileData | RFC 9559,Section 5.1.6.1.4 |
| 0x4660 | FileMediaType | RFC 9559,Section 5.1.6.1.3 |
| 0x4661 | FileUsedStartTime | Reclaimed (RFC 9559,Appendix A.41) |
| 0x4662 | FileUsedEndTime | Reclaimed (RFC 9559,Appendix A.42) |
| 0x466E | FileName | RFC 9559,Section 5.1.6.1.2 |
| 0x4675 | FileReferral | Reclaimed (RFC 9559,Appendix A.40) |
| 0x467E | FileDescription | RFC 9559,Section 5.1.6.1.1 |
| 0x46AE | FileUID | RFC 9559,Section 5.1.6.1.5 |
| 0x47E1 | ContentEncAlgo | RFC 9559,Section 5.1.4.1.31.9 |
| 0x47E2 | ContentEncKeyID | RFC 9559,Section 5.1.4.1.31.10 |
| 0x47E3 | ContentSignature | Reclaimed (RFC 9559,Appendix A.33) |
| 0x47E4 | ContentSigKeyID | Reclaimed (RFC 9559,Appendix A.34) |
| 0x47E5 | ContentSigAlgo | Reclaimed (RFC 9559,Appendix A.35) |
| 0x47E6 | ContentSigHashAlgo | Reclaimed (RFC 9559,Appendix A.36) |
| 0x47E7 | ContentEncAESSettings | RFC 9559,Section 5.1.4.1.31.11 |
| 0x47E8 | AESSettingsCipherMode | RFC 9559,Section 5.1.4.1.31.12 |
| 0x4D80 | MuxingApp | RFC 9559,Section 5.1.2.13 |
| 0x4DBB | Seek | RFC 9559,Section 5.1.1.1 |
| 0x5031 | ContentEncodingOrder | RFC 9559,Section 5.1.4.1.31.2 |
| 0x5032 | ContentEncodingScope | RFC 9559,Section 5.1.4.1.31.3 |
| 0x5033 | ContentEncodingType | RFC 9559,Section 5.1.4.1.31.4 |
| 0x5034 | ContentCompression | RFC 9559,Section 5.1.4.1.31.5 |
| 0x5035 | ContentEncryption | RFC 9559,Section 5.1.4.1.31.8 |
| 0x535F | CueRefNumber | Reclaimed (RFC 9559,Appendix A.38) |
| 0x536E | Name | RFC 9559,Section 5.1.4.1.18 |
| 0x5378 | CueBlockNumber | RFC 9559,Section 5.1.5.1.2.5 |
| 0x537F | TrackOffset | Reclaimed (RFC 9559,Appendix A.18) |
| 0x53AB | SeekID | RFC 9559,Section 5.1.1.1.1 |
| 0x53AC | SeekPosition | RFC 9559,Section 5.1.1.1.2 |
| 0x53B8 | StereoMode | RFC 9559,Section 5.1.4.1.28.3 |
| 0x53B9 | OldStereoMode | RFC 9559,Section 5.1.4.1.28.5 |
| 0x53C0 | AlphaMode | RFC 9559,Section 5.1.4.1.28.4 |
| 0x54AA | PixelCropBottom | RFC 9559,Section 5.1.4.1.28.8 |
| 0x54B0 | DisplayWidth | RFC 9559,Section 5.1.4.1.28.12 |
| 0x54B2 | DisplayUnit | RFC 9559,Section 5.1.4.1.28.14 |
| 0x54B3 | AspectRatioType | Reclaimed (RFC 9559,Appendix A.24) |
| 0x54BA | DisplayHeight | RFC 9559,Section 5.1.4.1.28.13 |
| 0x54BB | PixelCropTop | RFC 9559,Section 5.1.4.1.28.9 |
| 0x54CC | PixelCropLeft | RFC 9559,Section 5.1.4.1.28.10 |
| 0x54DD | PixelCropRight | RFC 9559,Section 5.1.4.1.28.11 |
| 0x55AA | FlagForced | RFC 9559,Section 5.1.4.1.6 |
| 0x55AB | FlagHearingImpaired | RFC 9559,Section 5.1.4.1.7 |
| 0x55AC | FlagVisualImpaired | RFC 9559,Section 5.1.4.1.8 |
| 0x55AD | FlagTextDescriptions | RFC 9559,Section 5.1.4.1.9 |
| 0x55AE | FlagOriginal | RFC 9559,Section 5.1.4.1.10 |
| 0x55AF | FlagCommentary | RFC 9559,Section 5.1.4.1.11 |
| 0x55B0 | Colour | RFC 9559,Section 5.1.4.1.28.16 |
| 0x55B1 | MatrixCoefficients | RFC 9559,Section 5.1.4.1.28.17 |
| 0x55B2 | BitsPerChannel | RFC 9559,Section 5.1.4.1.28.18 |
| 0x55B3 | ChromaSubsamplingHorz | RFC 9559,Section 5.1.4.1.28.19 |
| 0x55B4 | ChromaSubsamplingVert | RFC 9559,Section 5.1.4.1.28.20 |
| 0x55B5 | CbSubsamplingHorz | RFC 9559,Section 5.1.4.1.28.21 |
| 0x55B6 | CbSubsamplingVert | RFC 9559,Section 5.1.4.1.28.22 |
| 0x55B7 | ChromaSitingHorz | RFC 9559,Section 5.1.4.1.28.23 |
| 0x55B8 | ChromaSitingVert | RFC 9559,Section 5.1.4.1.28.24 |
| 0x55B9 | Range | RFC 9559,Section 5.1.4.1.28.25 |
| 0x55BA | TransferCharacteristics | RFC 9559,Section 5.1.4.1.28.26 |
| 0x55BB | Primaries | RFC 9559,Section 5.1.4.1.28.27 |
| 0x55BC | MaxCLL | RFC 9559,Section 5.1.4.1.28.28 |
| 0x55BD | MaxFALL | RFC 9559,Section 5.1.4.1.28.29 |
| 0x55D0 | MasteringMetadata | RFC 9559,Section 5.1.4.1.28.30 |
| 0x55D1 | PrimaryRChromaticityX | RFC 9559,Section 5.1.4.1.28.31 |
| 0x55D2 | PrimaryRChromaticityY | RFC 9559,Section 5.1.4.1.28.32 |
| 0x55D3 | PrimaryGChromaticityX | RFC 9559,Section 5.1.4.1.28.33 |
| 0x55D4 | PrimaryGChromaticityY | RFC 9559,Section 5.1.4.1.28.34 |
| 0x55D5 | PrimaryBChromaticityX | RFC 9559,Section 5.1.4.1.28.35 |
| 0x55D6 | PrimaryBChromaticityY | RFC 9559,Section 5.1.4.1.28.36 |
| 0x55D7 | WhitePointChromaticityX | RFC 9559,Section 5.1.4.1.28.37 |
| 0x55D8 | WhitePointChromaticityY | RFC 9559,Section 5.1.4.1.28.38 |
| 0x55D9 | LuminanceMax | RFC 9559,Section 5.1.4.1.28.39 |
| 0x55DA | LuminanceMin | RFC 9559,Section 5.1.4.1.28.40 |
| 0x55EE | MaxBlockAdditionID | RFC 9559,Section 5.1.4.1.16 |
| 0x5654 | ChapterStringUID | RFC 9559,Section 5.1.7.1.4.2 |
| 0x56AA | CodecDelay | RFC 9559,Section 5.1.4.1.25 |
| 0x56BB | SeekPreRoll | RFC 9559,Section 5.1.4.1.26 |
| 0x5741 | WritingApp | RFC 9559,Section 5.1.2.14 |
| 0x5854 | SilentTracks | Reclaimed (RFC 9559,Appendix A.1) |
| 0x58D7 | SilentTrackNumber | Reclaimed (RFC 9559,Appendix A.2) |
| 0x61A7 | AttachedFile | RFC 9559,Section 5.1.6.1 |
| 0x6240 | ContentEncoding | RFC 9559,Section 5.1.4.1.31.1 |
| 0x6264 | BitDepth | RFC 9559,Section 5.1.4.1.29.4 |
| 0x63A2 | CodecPrivate | RFC 9559,Section 5.1.4.1.22 |
| 0x63C0 | Targets | RFC 9559,Section 5.1.8.1.1 |
| 0x63C3 | ChapterPhysicalEquiv | RFC 9559,Section 5.1.7.1.4.8 |
| 0x63C4 | TagChapterUID | RFC 9559,Section 5.1.8.1.1.5 |
| 0x63C5 | TagTrackUID | RFC 9559,Section 5.1.8.1.1.3 |
| 0x63C6 | TagAttachmentUID | RFC 9559,Section 5.1.8.1.1.6 |
| 0x63C9 | TagEditionUID | RFC 9559,Section 5.1.8.1.1.4 |
| 0x63CA | TargetType | RFC 9559,Section 5.1.8.1.1.2 |
| 0x6624 | TrackTranslate | RFC 9559,Section 5.1.4.1.27 |
| 0x66A5 | TrackTranslateTrackID | RFC 9559,Section 5.1.4.1.27.1 |
| 0x66BF | TrackTranslateCodec | RFC 9559,Section 5.1.4.1.27.2 |
| 0x66FC | TrackTranslateEditionUID | RFC 9559,Section 5.1.4.1.27.3 |
| 0x67C8 | SimpleTag | RFC 9559,Section 5.1.8.1.2 |
| 0x68CA | TargetTypeValue | RFC 9559,Section 5.1.8.1.1.1 |
| 0x6911 | ChapProcessCommand | RFC 9559,Section 5.1.7.1.4.17 |
| 0x6922 | ChapProcessTime | RFC 9559,Section 5.1.7.1.4.18 |
| 0x6924 | ChapterTranslate | RFC 9559,Section 5.1.2.8 |
| 0x6933 | ChapProcessData | RFC 9559,Section 5.1.7.1.4.19 |
| 0x6944 | ChapProcess | RFC 9559,Section 5.1.7.1.4.14 |
| 0x6955 | ChapProcessCodecID | RFC 9559,Section 5.1.7.1.4.15 |
| 0x69A5 | ChapterTranslateID | RFC 9559,Section 5.1.2.8.1 |
| 0x69BF | ChapterTranslateCodec | RFC 9559,Section 5.1.2.8.2 |
| 0x69FC | ChapterTranslateEditionUID | RFC 9559,Section 5.1.2.8.3 |
| 0x6D80 | ContentEncodings | RFC 9559,Section 5.1.4.1.31 |
| 0x6DE7 | MinCache | Reclaimed (RFC 9559,Appendix A.16) |
| 0x6DF8 | MaxCache | Reclaimed (RFC 9559,Appendix A.17) |
| 0x6E67 | ChapterSegmentUUID | RFC 9559,Section 5.1.7.1.4.6 |
| 0x6EBC | ChapterSegmentEditionUID | RFC 9559,Section 5.1.7.1.4.7 |
| 0x6FAB | TrackOverlay | Reclaimed (RFC 9559,Appendix A.23) |
| 0x7373 | Tag | RFC 9559,Section 5.1.8.1 |
| 0x7384 | SegmentFilename | RFC 9559,Section 5.1.2.2 |
| 0x73A4 | SegmentUUID | RFC 9559,Section 5.1.2.1 |
| 0x73C4 | ChapterUID | RFC 9559,Section 5.1.7.1.4.1 |
| 0x73C5 | TrackUID | RFC 9559,Section 5.1.4.1.2 |
| 0x7446 | AttachmentLink | RFC 9559,Section 5.1.4.1.24 |
| 0x75A1 | BlockAdditions | RFC 9559,Section 5.1.3.5.2 |
| 0x75A2 | DiscardPadding | RFC 9559,Section 5.1.3.5.7 |
| 0x7670 | Projection | RFC 9559,Section 5.1.4.1.28.41 |
| 0x7671 | ProjectionType | RFC 9559,Section 5.1.4.1.28.42 |
| 0x7672 | ProjectionPrivate | RFC 9559,Section 5.1.4.1.28.43 |
| 0x7673 | ProjectionPoseYaw | RFC 9559,Section 5.1.4.1.28.44 |
| 0x7674 | ProjectionPosePitch | RFC 9559,Section 5.1.4.1.28.45 |
| 0x7675 | ProjectionPoseRoll | RFC 9559,Section 5.1.4.1.28.46 |
| 0x78B5 | OutputSamplingFrequency | RFC 9559,Section 5.1.4.1.29.2 |
| 0x7BA9 | Title | RFC 9559,Section 5.1.2.12 |
| 0x7D7B | ChannelPositions | Reclaimed (RFC 9559,Appendix A.27) |
| 0x7FFF | Reserved | RFC 9559 |
| 0x010000-0x203FFE | Not valid for use as an Element ID | RFC 9559,Section 27.1 |
| 0x22B59C | Language | RFC 9559,Section 5.1.4.1.19 |
| 0x22B59D | LanguageBCP47 | RFC 9559,Section 5.1.4.1.20 |
| 0x23314F | TrackTimestampScale | RFC 9559,Section 5.1.4.1.15 |
| 0x234E7A | DefaultDecodedFieldDuration | RFC 9559,Section 5.1.4.1.14 |
| 0x2383E3 | FrameRate | Reclaimed (RFC 9559,Appendix A.26) |
| 0x23E383 | DefaultDuration | RFC 9559,Section 5.1.4.1.13 |
| 0x258688 | CodecName | RFC 9559,Section 5.1.4.1.23 |
| 0x26B240 | CodecDownloadURL | Reclaimed (RFC 9559,Appendix A.21) |
| 0x2AD7B1 | TimestampScale | RFC 9559,Section 5.1.2.9 |
| 0x2EB524 | UncompressedFourCC | RFC 9559,Section 5.1.4.1.28.15 |
| 0x2FB523 | GammaValue | Reclaimed (RFC 9559,Appendix A.25) |
| 0x3A9697 | CodecSettings | Reclaimed (RFC 9559,Appendix A.19) |
| 0x3B4040 | CodecInfoURL | Reclaimed (RFC 9559,Appendix A.20) |
| 0x3C83AB | PrevFilename | RFC 9559,Section 5.1.2.4 |
| 0x3CB923 | PrevUUID | RFC 9559,Section 5.1.2.3 |
| 0x3E83BB | NextFilename | RFC 9559,Section 5.1.2.6 |
| 0x3EB923 | NextUUID | RFC 9559,Section 5.1.2.5 |
| 0x3FFFFF | Reserved | RFC 9559 |
| 0x01000000-0x101FFFFE | Not valid for use as an Element ID | RFC 9559,Section 27.1 |
| 0x1043A770 | Chapters | RFC 9559,Section 5.1.7 |
| 0x114D9B74 | SeekHead | RFC 9559,Section 5.1.1 |
| 0x1254C367 | Tags | RFC 9559,Section 5.1.8 |
| 0x1549A966 | Info | RFC 9559,Section 5.1.2 |
| 0x1654AE6B | Tracks | RFC 9559,Section 5.1.4 |
| 0x18538067 | Segment | RFC 9559,Section 5.1 |
| 0x1941A469 | Attachments | RFC 9559,Section 5.1.6 |
| 0x1C53BB6B | Cues | RFC 9559,Section 5.1.5 |
| 0x1F43B675 | Cluster | RFC 9559,Section 5.1.3 |
| 0x1FFFFFFF | Reserved | RFC 9559 |
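The range boundaries in the table above can be checked mechanically. The Python sketch below records, for each ID length, the first usable value (one past the end of the corresponding "Not valid" range) and the all-ones value marked Reserved. The names `ID_CLASSES` and `classify_element_id` are hypothetical, and treating 0x80 as the first usable one-octet ID is an assumption read off the table's first row rather than a rule stated in this section.

```python
# octet length -> (first usable ID, Reserved all-ones ID), read off the table.
ID_CLASSES = {
    1: (0x80, 0xFF),
    2: (0x407F, 0x7FFF),
    3: (0x203FFF, 0x3FFFFF),
    4: (0x101FFFFF, 0x1FFFFFFF),
}


def classify_element_id(element_id: int) -> str:
    """Classify a value as 'usable', 'reserved', or 'not valid'."""
    for first_usable, reserved in ID_CLASSES.values():
        if element_id == reserved:
            return "reserved"
        if first_usable <= element_id < reserved:
            return "usable"
    # Everything else, including the listed "Not valid" ranges.
    return "not valid"


print(classify_element_id(0xA0))        # BlockGroup
print(classify_element_id(0x407E))      # end of a "Not valid" range
print(classify_element_id(0x1FFFFFFF))  # Reserved
```

A "usable" result only means the value lies in a range where IDs can be assigned; whether a given ID is already assigned or Reclaimed still has to be looked up in the registry.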
IANA has created a new registry called the "Matroska Compression Algorithms" registry. The values correspond to the unsigned integer ContentCompAlgo value described in Section 5.1.4.1.31.6.
To register a new Compression Algorithm in this registry, one needs a Compression Algorithm value, a description, a Change Controller, and a Reference to a document describing the Compression Algorithm.
The Compression Algorithms are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 4-18446744073709551615.
Table 54 shows the initial contents of the "Matroska Compression Algorithms" registry. The Change Controller for the initial entries is the IETF.
| Compression Algorithm | Description | Reference |
|---|---|---|
| 0 | zlib | RFC 9559,Section 5.1.4.1.31.6 |
| 1 | bzlib | RFC 9559,Section 5.1.4.1.31.6 |
| 2 | lzo1x | RFC 9559,Section 5.1.4.1.31.6 |
| 3 | Header Stripping | RFC 9559,Section 5.1.4.1.31.6 |
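A reader of ContentCompAlgo values might distinguish the initial registry entries from values that are still available for registration. The Python sketch below is one way to do that; the names `COMPRESSION_ALGORITHMS` and `describe_comp_algo` are hypothetical, and returning the string "unassigned" for unregistered values is an illustrative choice, not behavior required by this document.

```python
MAX_UINT64 = 2**64 - 1  # 18446744073709551615, the top of the available range

# Initial contents of the "Matroska Compression Algorithms" registry.
COMPRESSION_ALGORITHMS = {
    0: "zlib",
    1: "bzlib",
    2: "lzo1x",
    3: "Header Stripping",
}


def describe_comp_algo(value: int) -> str:
    """Return the registry description, or 'unassigned' for values 4 and up."""
    if not 0 <= value <= MAX_UINT64:
        raise ValueError("ContentCompAlgo is an unsigned integer")
    return COMPRESSION_ALGORITHMS.get(value, "unassigned")


print(describe_comp_algo(3))
print(describe_comp_algo(4))
```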
IANA has created a new registry called the "Matroska Encryption Algorithms" registry. The values correspond to the unsigned integer ContentEncAlgo value described in Section 5.1.4.1.31.9.
To register a new Encryption Algorithm in this registry, one needs an Encryption Algorithm value, a description, a Change Controller, and an optional Reference to a document describing the Encryption Algorithm.
The Encryption Algorithms are to be allocated according to the "First Come First Served" policy [RFC8126]. Available values range from 6-18446744073709551615.
Table 55 shows the initial contents of the "Matroska Encryption Algorithms" registry. The Change Controller for the initial entries is the IETF.
| Encryption Algorithm | Description | Reference |
|---|---|---|
| 0 | Not encrypted | RFC 9559,Section 5.1.4.1.31.9 |
| 1 | DES | RFC 9559,Section 5.1.4.1.31.9 |
| 2 | 3DES | RFC 9559,Section 5.1.4.1.31.9 |
| 3 | Twofish | RFC 9559,Section 5.1.4.1.31.9 |
| 4 | Blowfish | RFC 9559,Section 5.1.4.1.31.9 |
| 5 | AES | RFC 9559,Section 5.1.4.1.31.9 |
IANA has created a new registry called the "Matroska AES Cipher Modes" registry. The values correspond to the unsigned integer AESSettingsCipherMode value described in Section 5.1.4.1.31.12.
To register a new AES Cipher Mode in this registry, one needs an AES Cipher Mode value, a description, a Change Controller, and an optional Reference to a document describing the AES Cipher Mode.
The AES Cipher Modes are to be allocated according to the "First Come First Served" policy [RFC8126]. Available values range from 3-18446744073709551615.
The value 0 is not valid for use as an AES Cipher Mode.
Table 56 shows the initial contents of the "Matroska AES Cipher Modes" registry. The Change Controller for the initial entries is the IETF.
| AES Cipher Mode | Description | Reference |
|---|---|---|
| 0 | Not valid for use as an AES Cipher Mode | RFC 9559,Section 5.1.4.1.31.12 |
| 1 | AES-CTR | RFC 9559,Section 5.1.4.1.31.12 |
| 2 | AES-CBC | RFC 9559,Section 5.1.4.1.31.12 |
IANA has created a new registry called the "Matroska Content Encoding Scopes" registry. The values correspond to the unsigned integer ContentEncodingScope value described in Section 5.1.4.1.31.3.
To register a new Content Encoding Scope in this registry, one needs a Content Encoding Scope value, a description, a Change Controller, and a Reference to a document describing the Content Encoding Scope.
The Content Encoding Scopes are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 0x8-0x8000000000000000.
The Content Encoding Scope is a bit-field value, so only power of 2 values can be registered.
The value 0 is not valid for use as a Content Encoding Scope.
Table 57 shows the initial contents of the "Matroska Content Encoding Scopes" registry. The Change Controller for the initial entries is the IETF.
| Content Encoding Scope | Description | Reference |
|---|---|---|
| 0x0 | Not valid for use as a Content Encoding Scope | RFC 9559,Section 5.1.4.1.31.3 |
| 0x1 | Block | RFC 9559,Section 5.1.4.1.31.3 |
| 0x2 | Private | RFC 9559,Section 5.1.4.1.31.3 |
| 0x4 | Next | RFC 9559,Section 5.1.4.1.31.3 |
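Because the Content Encoding Scope is a bit field, a stored value can combine several registered scopes, while each registry entry is a single bit. The Python sketch below shows both sides under that reading: a power-of-2 test for candidate registrations and a decoder for stored values. The names `SCOPE_NAMES`, `is_registrable_scope`, and `decode_scope` are hypothetical, and labeling unknown bits as "unassigned" is an illustrative choice.

```python
# Initial contents of the "Matroska Content Encoding Scopes" registry.
SCOPE_NAMES = {0x1: "Block", 0x2: "Private", 0x4: "Next"}


def is_registrable_scope(value: int) -> bool:
    """True if value is a single bit within the available range 0x8-0x8000000000000000."""
    in_range = 0x8 <= value <= 0x8000000000000000
    # A power of 2 has exactly one bit set, so clearing its lowest bit gives 0.
    return in_range and value & (value - 1) == 0


def decode_scope(value: int) -> list[str]:
    """Split a ContentEncodingScope bit field into its individual scopes."""
    if value <= 0:
        raise ValueError("0 is not valid for use as a Content Encoding Scope")
    names = []
    bit = 1
    while bit <= value:
        if value & bit:
            names.append(SCOPE_NAMES.get(bit, f"unassigned(0x{bit:X})"))
        bit <<= 1
    return names


print(decode_scope(0x3))           # Block and Private combined
print(is_registrable_scope(0x10))  # a single unassigned bit
print(is_registrable_scope(0xC))   # two bits set, so not a registrable value
```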
IANA has created a new registry called the "Matroska Content Encoding Types" registry. The values correspond to the unsigned integer ContentEncodingType value described in Section 5.1.4.1.31.4.
To register a new Content Encoding Type in this registry, one needs a Content Encoding Type value, a description, a Change Controller, and a Reference to a document describing the Content Encoding Type.
The Content Encoding Types are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 2-18446744073709551615.
Table 58 shows the initial contents of the "Matroska Content Encoding Types" registry. The Change Controller for the initial entries is the IETF.
| Content Encoding Type | Description | Reference |
|---|---|---|
| 0 | Compression | RFC 9559,Section 5.1.4.1.31.4 |
| 1 | Encryption | RFC 9559,Section 5.1.4.1.31.4 |
IANA has created a new registry called the "Matroska Stereo Modes" registry. The values correspond to the unsigned integer StereoMode value described in Section 5.1.4.1.28.3.
To register a new Stereo Mode in this registry, one needs a Stereo Mode value, a description, a Change Controller, and a Reference to a document describing the Stereo Mode.
The Stereo Modes are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 15-18446744073709551615.
Table 59 shows the initial contents of the "Matroska Stereo Modes" registry. The Change Controller for the initial entries is the IETF.
| Stereo Mode | Description | Reference |
|---|---|---|
| 0 | mono | RFC 9559,Section 5.1.4.1.28.3 |
| 1 | side by side (left eye first) | RFC 9559,Section 5.1.4.1.28.3 |
| 2 | top - bottom (right eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 3 | top - bottom (left eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 4 | checkboard (right eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 5 | checkboard (left eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 6 | row interleaved (right eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 7 | row interleaved (left eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 8 | column interleaved (right eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 9 | column interleaved (left eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 10 | anaglyph (cyan/red) | RFC 9559,Section 5.1.4.1.28.3 |
| 11 | side by side (right eye first) | RFC 9559,Section 5.1.4.1.28.3 |
| 12 | anaglyph (green/magenta) | RFC 9559,Section 5.1.4.1.28.3 |
| 13 | both eyes laced in one Block (left eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
| 14 | both eyes laced in one Block (right eye is first) | RFC 9559,Section 5.1.4.1.28.3 |
IANA has created a new registry called the "Matroska Alpha Modes" registry. The values correspond to the unsigned integer AlphaMode value described in Section 5.1.4.1.28.4.
To register a new Alpha Mode in this registry, one needs an Alpha Mode value, a description, a Change Controller, and an optional Reference to a document describing the Alpha Mode.
The Alpha Modes are to be allocated according to the "First Come First Served" policy [RFC8126]. Available values range from 2-18446744073709551615.
Table 60 shows the initial contents of the "Matroska Alpha Modes" registry. The Change Controller for the initial entries is the IETF.
| Alpha Mode | Description | Reference |
|---|---|---|
| 0 | none | RFC 9559,Section 5.1.4.1.28.4 |
| 1 | present | RFC 9559,Section 5.1.4.1.28.4 |
IANA has created a new registry called the "Matroska Display Units" registry. The values correspond to the unsigned integer DisplayUnit value described in Section 5.1.4.1.28.14.
To register a new Display Unit in this registry, one needs a Display Unit value, a description, a Change Controller, and a Reference to a document describing the Display Unit.
The Display Units are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 5-18446744073709551615.
Table 61 shows the initial contents of the "Matroska Display Units" registry. The Change Controller for the initial entries is the IETF.
| Display Unit | Description | Reference |
|---|---|---|
| 0 | pixels | RFC 9559,Section 5.1.4.1.28.14 |
| 1 | centimeters | RFC 9559,Section 5.1.4.1.28.14 |
| 2 | inches | RFC 9559,Section 5.1.4.1.28.14 |
| 3 | display aspect ratio | RFC 9559,Section 5.1.4.1.28.14 |
| 4 | unknown | RFC 9559,Section 5.1.4.1.28.14 |
IANA has created a new registry called the "Matroska Horizontal Chroma Sitings" registry. The values correspond to the unsigned integer ChromaSitingHorz value described in Section 5.1.4.1.28.23.
To register a new Horizontal Chroma Siting in this registry, one needs a Horizontal Chroma Siting value, a description, a Change Controller, and an optional Reference to a document describing the Horizontal Chroma Siting.
The Horizontal Chroma Sitings are to be allocated according to the "First Come First Served" policy [RFC8126]. Available values range from 3-18446744073709551615.
Table 62 shows the initial contents of the "Matroska Horizontal Chroma Sitings" registry. The Change Controller for the initial entries is the IETF.
| Horizontal Chroma Siting | Description | Reference |
|---|---|---|
| 0 | unspecified | RFC 9559,Section 5.1.4.1.28.23 |
| 1 | left collocated | RFC 9559,Section 5.1.4.1.28.23 |
| 2 | half | RFC 9559,Section 5.1.4.1.28.23 |
IANA has created a new registry called the "Matroska Vertical Chroma Sitings" registry. The values correspond to the unsigned integer ChromaSitingVert value described in Section 5.1.4.1.28.24.
To register a new Vertical Chroma Siting in this registry, one needs a Vertical Chroma Siting value, a description, a Change Controller, and an optional Reference to a document describing the Vertical Chroma Siting.
The Vertical Chroma Sitings are to be allocated according to the "First Come First Served" policy [RFC8126]. Available values range from 3-18446744073709551615.
Table 63 shows the initial contents of the "Matroska Vertical Chroma Sitings" registry. The Change Controller for the initial entries is the IETF.
| Vertical Chroma Siting | Description | Reference |
|---|---|---|
| 0 | unspecified | RFC 9559,Section 5.1.4.1.28.24 |
| 1 | top collocated | RFC 9559,Section 5.1.4.1.28.24 |
| 2 | half | RFC 9559,Section 5.1.4.1.28.24 |
IANA has created a new registry called the "Matroska Color Ranges" registry. The values correspond to the unsigned integer Range value described in Section 5.1.4.1.28.25.
To register a new Color Range in this registry, one needs a Color Range value, a description, a Change Controller, and a Reference to a document describing the Color Range.
The Color Ranges are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 4-18446744073709551615.
Table 64 shows the initial contents of the "Matroska Color Ranges" registry. The Change Controller for the initial entries is the IETF.
| Color Range | Description | Reference |
|---|---|---|
| 0 | unspecified | RFC 9559,Section 5.1.4.1.28.25 |
| 1 | broadcast range | RFC 9559,Section 5.1.4.1.28.25 |
| 2 | full range (no clipping) | RFC 9559,Section 5.1.4.1.28.25 |
| 3 | defined by MatrixCoefficients / TransferCharacteristics | RFC 9559,Section 5.1.4.1.28.25 |
IANA has created a new registry called the "Matroska Tags Target Types" registry. The values correspond to the unsigned integer TargetTypeValue value described in Section 5.1.8.1.1.1.
To register a new Tags Target Type in this registry, one needs a Tags Target Type value, a description, a Change Controller, and a Reference to a document describing the Tags Target Type.
The Tags Target Types are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 1-9, 11-19, 21-29, 31-39, 41-49, 51-59, 61-69, and 71-18446744073709551615.
The value 0 is not valid for use as a Tags Target Type.
Table 65 shows the initial contents of the "Matroska Tags Target Types" registry. The Change Controller for the initial entries is the IETF.
| Tags Target Type | Description | Reference |
|---|---|---|
| 70 | COLLECTION | RFC 9559,Section 5.1.8.1.1.1 |
| 60 | EDITION / ISSUE / VOLUME / OPUS / SEASON / SEQUEL | RFC 9559,Section 5.1.8.1.1.1 |
| 50 | ALBUM / OPERA / CONCERT / MOVIE / EPISODE | RFC 9559,Section 5.1.8.1.1.1 |
| 40 | PART / SESSION | RFC 9559,Section 5.1.8.1.1.1 |
| 30 | TRACK / SONG / CHAPTER | RFC 9559,Section 5.1.8.1.1.1 |
| 20 | SUBTRACK / MOVEMENT / SCENE | RFC 9559,Section 5.1.8.1.1.1 |
| 10 | SHOT | RFC 9559,Section 5.1.8.1.1.1 |
| 0 | Not valid for use as a Tags Target Type | RFC 9559,Section 5.1.8.1.1.1 |
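The available ranges quoted for this registry are exactly the gaps between the assigned multiples of 10. The Python sketch below encodes both the table and the ranges so that a TargetTypeValue can be classified; the names `ASSIGNED_TARGET_TYPES`, `AVAILABLE_RANGES`, and `target_type_status` are hypothetical and are used only for illustration.

```python
MAX_UINT64 = 2**64 - 1  # 18446744073709551615

# Initial contents of the "Matroska Tags Target Types" registry.
ASSIGNED_TARGET_TYPES = {
    70: "COLLECTION",
    60: "EDITION / ISSUE / VOLUME / OPUS / SEASON / SEQUEL",
    50: "ALBUM / OPERA / CONCERT / MOVIE / EPISODE",
    40: "PART / SESSION",
    30: "TRACK / SONG / CHAPTER",
    20: "SUBTRACK / MOVEMENT / SCENE",
    10: "SHOT",
}

# Ranges quoted in the allocation policy for this registry.
AVAILABLE_RANGES = [
    (1, 9), (11, 19), (21, 29), (31, 39),
    (41, 49), (51, 59), (61, 69), (71, MAX_UINT64),
]


def target_type_status(value: int) -> str:
    """Classify a TargetTypeValue as 'not valid', 'assigned', or 'available'."""
    if value == 0:
        return "not valid"
    if value in ASSIGNED_TARGET_TYPES:
        return "assigned"
    if any(low <= value <= high for low, high in AVAILABLE_RANGES):
        return "available"
    raise ValueError("value is outside the range of the registry")


print(target_type_status(50))
print(target_type_status(45))
```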
IANA has created a new registry called the "Matroska Chapter Codec IDs" registry. The values correspond to the unsigned integer ChapProcessCodecID, ChapterTranslateCodec, and TrackTranslateCodec values described in Section 5.1.7.1.4.15.
To register a new Chapter Codec ID in this registry, one needs a Chapter Codec ID value, a description, a Change Controller, and a Reference to a document describing the Chapter Codec ID.
The Chapter Codec IDs are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 2-18446744073709551615.
Table 66 shows the initial contents of the "Matroska Chapter Codec IDs" registry. The Change Controller for the initial entries is the IETF.
| Chapter Codec ID | Description | Reference |
|---|---|---|
| 0 | Matroska Script | RFC 9559,Section 5.1.7.1.4.15 |
| 1 | DVD-menu | RFC 9559,Section 5.1.7.1.4.15 |
IANA has created a new registry called the "Matroska Projection Types" registry. The values correspond to the unsigned integer ProjectionType value described in Section 5.1.4.1.28.42.
To register a new Projection Type in this registry, one needs a Projection Type value, a description, a Change Controller, and an optional Reference to a document describing the Projection Type.
The Projection Types are to be allocated according to the "First Come First Served" policy [RFC8126]. Available values range from 4-18446744073709551615.
Table 67 shows the initial contents of the "Matroska Projection Types" registry. The Change Controller for the initial entries is the IETF.
| Projection Type | Description | Reference |
|---|---|---|
| 0 | rectangular | RFC 9559,Section 5.1.4.1.28.42 |
| 1 | equirectangular | RFC 9559,Section 5.1.4.1.28.42 |
| 2 | cubemap | RFC 9559,Section 5.1.4.1.28.42 |
| 3 | mesh | RFC 9559,Section 5.1.4.1.28.42 |
IANA has created a new registry called the "Matroska Track Types" registry. The values correspond to the unsigned integer TrackType value described in Section 5.1.4.1.3.
To register a new Track Type in this registry, one needs a Track Type value, a description, a Change Controller, and a Reference to a document describing the Track Type.
The Track Types are to be allocated according to the "Specification Required" policy [RFC8126]. Available values range from 4-15, 19-31, and 34-18446744073709551615.
The value 0 is not valid for use as a Track Type.
Table 68 shows the initial contents of the "Matroska Track Types" registry. The Change Controller for the initial entries is the IETF.
| Track Type | Description | Reference |
|---|---|---|
| 0 | Not valid for use as a Track Type | RFC 9559,Section 5.1.4.1.3 |
| 1 | video | RFC 9559,Section 5.1.4.1.3 |
| 2 | audio | RFC 9559,Section 5.1.4.1.3 |
| 3 | complex | RFC 9559,Section 5.1.4.1.3 |
| 16 | logo | RFC 9559,Section 5.1.4.1.3 |
| 17 | subtitle | RFC 9559,Section 5.1.4.1.3 |
| 18 | buttons | RFC 9559,Section 5.1.4.1.3 |
| 32 | control | RFC 9559,Section 5.1.4.1.3 |
| 33 | metadata | RFC 9559,Section 5.1.4.1.3 |
IANA has created a new registry called the "Matroska Track Plane Types" registry. The values correspond to the unsigned integer TrackPlaneType value described in Section 5.1.4.1.30.4.
To register a new Track Plane Type in this registry, one needs a Track Plane Type value, a description, a Change Controller, and an optional Reference to a document describing the Track Plane Type.
The Track Plane Types are to be allocated according to the "First Come First Served" policy [RFC8126]. Available values range from 3-18446744073709551615.
Table 69 shows the initial contents of the "Matroska Track Plane Types" registry. The Change Controller for the initial entries is the IETF.
| Track Plane Type | Description | Reference |
|---|---|---|
| 0 | left eye | RFC 9559,Section 5.1.4.1.30.4 |
| 1 | right eye | RFC 9559,Section 5.1.4.1.30.4 |
| 2 | background | RFC 9559,Section 5.1.4.1.30.4 |
Matroska files and streams are found in three main forms: audio-video, audio-only, and (occasionally) stereoscopic video.
Historically, Matroska files and streams have used the following media types with an "x-" prefix. For better compatibility, a system SHOULD be able to handle both formats. Newer systems SHOULD NOT use the historic format and should instead use the format that follows [RFC6838].
IANA has registered three media types per the templates (see [RFC6838]) in the following subsections.
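A system that accepts both the historic and the registered forms could map the "x-" variants onto their registered counterparts before further processing. The Python sketch below assumes that the three registered types are "audio/matroska", "video/matroska", and "video/matroska-3d" and that the historic forms differ only by the "x-" prefix; these concrete names are assumptions, because the registration templates are not reproduced in this text. The names `HISTORIC_TO_REGISTERED` and `normalize_media_type` are hypothetical.

```python
# Assumed mapping from historic "x-" media types to the registered ones.
HISTORIC_TO_REGISTERED = {
    "audio/x-matroska": "audio/matroska",
    "video/x-matroska": "video/matroska",
    "video/x-matroska-3d": "video/matroska-3d",
}


def normalize_media_type(media_type: str) -> str:
    """Return the registered form of a Matroska media type.

    Media type names are case-insensitive, so the input is lowercased.
    Types that are not in the mapping are returned unchanged (lowercased).
    """
    key = media_type.strip().lower()
    return HISTORIC_TO_REGISTERED.get(key, key)


print(normalize_media_type("video/x-matroska"))
print(normalize_media_type("audio/matroska"))
```

With this approach, the rest of a system only needs to compare against the registered names while still accepting input that uses the historic ones.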
As Matroska has evolved since 2002, many parts that were considered for use in the format were never used and often incorrectly designed. Many of the elements that were defined then are not found in any known files but were part of public specs. DivX also had a few custom elements that were designed for custom features.
In this appendix, we list elements that have a known ID that SHOULD NOT be reused to avoid colliding with existing files. These might be reassigned by IANA in the future if there are no more IDs for a given size. A short description of what each ID was used for is included, but the text is not normative.
\Segment\Cluster\BlockGroup\Slices\TimeSlice\Delay
\Segment\Cluster\BlockGroup\Slices\TimeSlice\SliceDuration
\Segment\Cluster\BlockGroup\ReferenceFrame
\Segment\Cluster\BlockGroup\ReferenceFrame\ReferenceOffset
BlockGroup element for this Smooth FF/RW video track to the containing BlockGroup element. See [DivXTrickTrack].
\Segment\Cluster\BlockGroup\ReferenceFrame\ReferenceTimestamp
BlockGroup pointed to by ReferenceOffset, expressed in Track Ticks; see Section 11.1. See [DivXTrickTrack].
\Segment\Cluster\EncryptedBlock
SimpleBlock (see Section 10.2), but the data inside the Block are Transformed (encrypted and/or signed).
\Segment\Tracks\TrackEntry\TrackOffset
Block's Timestamp, expressed in Matroska Ticks -- i.e., in nanoseconds; see Section 11.1. This can be used to adjust the playback offset of a track.
\Segment\Tracks\TrackEntry\TrackOverlay
Track specified (in the u-integer). This means that when this track has a gap on SilentTracks, the overlay track should be used instead. The order of multiple TrackOverlay matters; the first one is the one that should be used. If the first one is not found, it should be the second, etc.
\Segment\Tracks\TrackEntry\TrickTrackUID
TrackUID of the Smooth FF/RW video in the paired EBML structure corresponding to this video track. See [DivXTrickTrack].
\Segment\Tracks\TrackEntry\TrickTrackSegmentUID
SegmentUUID of the Segment containing the track identified by TrickTrackUID. See [DivXTrickTrack].
\Segment\Tracks\TrackEntry\TrickTrackFlag
MasterTrackUID and MasterTrackSegUID should be present, and BlockGroups for this track must contain ReferenceFrame structures. Otherwise, TrickTrackUID and TrickTrackSegUID must be present if this track has a corresponding Smooth FF/RW track. See [DivXTrickTrack].
\Segment\Tracks\TrackEntry\TrickMasterTrackUID
TrackUID of the video track in the paired EBML structure that corresponds to this Smooth FF/RW track. See [DivXTrickTrack].
\Segment\Tracks\TrackEntry\TrickMasterTrackSegmentUID
SegmentUUID of the Segment containing the track identified by MasterTrackUID. See [DivXTrickTrack].
\Segment\Attachments\AttachedFile\FileUsedStartTime
TimestampScale. See [DivXWorldFonts].
\Segment\Attachments\AttachedFile\FileUsedEndTime
TimestampScale. See [DivXWorldFonts].
\Segment\Tags\Tag\+SimpleTag\TagDefaultBogus
TagDefault element with a bogus element ID; see Section 5.1.8.1.2.4.