FOREWORD PROVIDED BY ISO
The systems part of this Recommendation | International Standard addresses the combining of one or more elementary streams of video and audio, as well as other data, into single or multiple streams which are suitable for storage or transmission. Systems coding follows the syntactical and semantic rules imposed by this specification and provides information to enable synchronized decoding without either overflow or underflow of decoder buffers over a wide range of retrieval or receipt conditions.
System coding shall be specified in two forms: the Transport Stream and the Program Stream. Each is optimized for a different set of applications. Both the Transport Stream and Program Stream defined in this Recommendation | International Standard provide coding syntax which is necessary and sufficient to synchronize the decoding and presentation of the video and audio information, while ensuring that data buffers in the decoders do not overflow or underflow. Information is coded in the syntax using time stamps concerning the decoding and presentation of coded audio and visual data and time stamps concerning the delivery of the data stream itself. Both stream definitions are packet-oriented multiplexes.
The basic multiplexing approach for single video and audio elementary streams is illustrated in Figure 0-1. The video and audio data is encoded as described in ITU-T Rec. H.262 | ISO/IEC 13818-2 and ISO/IEC 13818-3. The resulting compressed elementary streams are packetized to produce PES packets. Information needed to use PES packets independently of either Transport Streams or Program Streams may be added when PES packets are formed. This information is not needed and need not be added when PES packets are further combined with system level information to form Transport Streams or Program Streams. This systems standard covers those processes to the right of the vertical dashed line.
Figure 0-1 -- Simplified overview of ITU-T Rec. H.222.0 | ISO/IEC 13818-1 scope
The Program Stream is analogous and similar to the ISO/IEC 11172 Systems layer. It results from combining one or more streams of PES packets, which have a common time base, into a single stream.
For applications that require the elementary streams which comprise a single program to be in separate streams which are not multiplexed, the elementary streams can also be encoded as separate Program Streams, one per elementary stream, with a common time base. In this case the values encoded in the SCR fields of the various streams shall be consistent.
As with a single Program Stream, all of the elementary streams can then be decoded in synchronization.
The Program Stream is designed for use in relatively error-free environments and is suitable for applications which may involve software processing of system information such as interactive multi-media applications. Program Stream packets may be of variable and relatively great length.
The Transport Stream combines one or more programs with one or more independent time bases into a single stream. PES packets made up of elementary streams that form a program share a common time base. The Transport Stream is designed for use in environments where errors are likely, such as storage or transmission in lossy or noisy media. Transport Stream packets are 188 bytes in length.
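As a minimal sketch of the fixed packet length described above, the following Python splits an already-aligned byte buffer into 188-byte Transport Stream packets. The function name `split_ts_packets` is hypothetical, and the check on the 0x47 sync byte relies on the packet syntax defined in the normative clauses rather than on anything stated in this introduction.

```python
TS_PACKET_SIZE = 188  # Transport Stream packet length in bytes
SYNC_BYTE = 0x47      # assumption: value taken from the packet syntax, not this introduction


def split_ts_packets(data: bytes) -> list[bytes]:
    """Split an already-aligned buffer into 188-byte packets.

    Raises ValueError if the buffer is not a whole number of packets
    or if any packet does not begin with the sync byte.
    """
    if len(data) % TS_PACKET_SIZE != 0:
        raise ValueError("buffer is not a whole number of 188-byte packets")
    packets = []
    for offset in range(0, len(data), TS_PACKET_SIZE):
        packet = data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"missing sync byte at offset {offset}")
        packets.append(packet)
    return packets
```

A receiver on a noisy channel would also need to search for packet alignment before splitting; that step is omitted here.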
Program and Transport Streams are designed for different applications and their definitions do not strictly follow a layered model. It is possible and reasonable to convert from one to the other; however, one is not a subset or superset of the other. In particular, extracting the contents of a program from a Transport Stream and creating a valid Program Stream is possible and is accomplished through the common interchange format of PES packets, but not all of the fields needed in a Program Stream are contained within the Transport Stream; some must be derived. The Transport Stream may be used to span a range of layers in a layered model, and is designed for efficiency and ease of implementation in high bandwidth applications.
The scope of the syntactical and semantic rules set forth in the systems specification differs: the syntactical rules apply to systems layer coding only, and do not extend to the compression layer coding of the video and audio specifications; by contrast, the semantic rules apply to the combined stream in its entirety.
The systems specification does not specify the architecture or implementation of encoders or decoders, nor those of multiplexors or demultiplexors. However, bit stream properties do impose functional and performance requirements on encoders, decoders, multiplexors and demultiplexors. For instance, encoders must meet minimum clock tolerance requirements. Notwithstanding this and other requirements, a considerable degree of freedom exists in the design and implementation of encoders, decoders, multiplexors, and demultiplexors.
0.1 Transport Stream
The Transport Stream is a stream definition which is tailored for communicating or storing one or more programs of coded data according to ITU-T Rec. H.262 | ISO/IEC 13818-2 and ISO/IEC 13818-3 and other data in environments in which significant errors may occur. Such errors may be manifested as bit value errors or loss of packets.
Transport Streams may be either fixed or variable rate. In either case the constituent elementary streams may either be fixed or variable rate. The syntax and semantic constraints on the stream are identical in each of these cases. The Transport Stream rate is defined by the values and locations of Program Clock Reference (PCR) fields, which in general are separate PCR fields for each program.
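The relationship between PCR fields and stream rate can be sketched as follows. This is an illustration under stated assumptions, not the normative definition: it assumes PCR values are expressed in ticks of the 27 MHz system clock and wrap at 2^33 x 300, and that the two PCRs belong to the same program. The function name and its parameters are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000    # assumed system clock frequency (27 MHz)
PCR_MODULUS = (1 << 33) * 300   # assumed wrap point: 33-bit base times 300


def transport_rate_bps(byte_index_a: int, pcr_a: int,
                       byte_index_b: int, pcr_b: int) -> float:
    """Estimate the stream rate in bits per second between two PCRs.

    byte_index_* are the positions in the stream of the two PCR fields;
    pcr_* are the corresponding PCR values in 27 MHz ticks.
    """
    ticks = (pcr_b - pcr_a) % PCR_MODULUS  # modular difference handles PCR wrap-around
    if ticks == 0:
        raise ValueError("PCR values are identical; rate is undefined")
    byte_count = byte_index_b - byte_index_a
    return byte_count * 8 * SYSTEM_CLOCK_HZ / ticks
```

For example, two PCRs that are 18 800 bytes apart in the stream and 2 700 000 ticks (0.1 s) apart in value give a rate of 1 504 000 bit/s.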
There are some difficulties with constructing and delivering a Transport Stream containing multiple programs with independent time bases such that the overall bit rate is variable. Refer to 188.8.131.52.
The Transport Stream may be constructed by any method that results in a valid stream. It is possible to construct Transport Streams containing one or more programs from elementary coded data streams, from Program Streams, or from other Transport Streams which may themselves contain one or more programs.
The Transport Stream is designed in such a way that several operations on a Transport Stream are possible with minimum effort. Among these are:
1. Retrieve the coded data from one program within the Transport Stream, decode it and present the decoded results as shown in Figure 0-2.
2. Extract the Transport Stream packets from one program within the Transport Stream and produce as output a different Transport Stream with only that one program as shown in Figure 0-3.
3. Extract the Transport Stream packets of one or more programs from one or more Transport Streams and produce as output a different Transport Stream (not illustrated).
4. Extract the contents of one program from the Transport Stream and produce as output a Program Stream containing that one program as shown in Figure 0-4.
5. Take a Program Stream, convert it into a Transport Stream to carry it over a lossy environment, and then recover a valid, and in certain cases, identical Program Stream.
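The packet extraction step in the second operation above can be sketched as a filter on packet identifiers. This is a simplified illustration only: it assumes the set of PIDs belonging to the program is already known, and it assumes the 13-bit PID occupies the low five bits of byte 1 and all of byte 2 of each packet, as given by the packet syntax. The function names are hypothetical. Producing a fully valid output stream also involves system level information and possibly PCR correction, which this sketch omits.

```python
from typing import Iterable


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID carried in bytes 1 and 2 of a Transport Stream packet."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def select_program_packets(packets: Iterable[bytes],
                           program_pids: Iterable[int]) -> list[bytes]:
    """Keep only the packets whose PID is in program_pids, preserving order."""
    wanted = set(program_pids)
    return [p for p in packets if packet_pid(p) in wanted]
```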
Figure 0-2 and Figure 0-3 illustrate prototypical demultiplexing and decoding systems which take as input a Transport Stream. Figure 0-2 illustrates the first case, where a Transport Stream is directly demultiplexed and decoded. Transport Streams are constructed in two layers: a system layer and a compression layer. The input stream to the Transport Stream decoder has a system layer wrapped about a compression layer. Input streams to the Video and Audio decoders have only the compression layer.
Operations performed by the prototypical decoder which accepts Transport Streams either apply to the entire Transport Stream ("multiplex-wide operations"), or to individual elementary streams ("stream-specific operations"). The Transport Stream system layer is divided into two sub-layers, one for multiplex-wide operations (the Transport Stream packet layer), and one for stream-specific operations (the PES packet layer).
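The split between the two sub-layers can be illustrated with a small header parser. This is a sketch under assumptions drawn from the packet syntax rather than from this introduction: a 4-byte packet header with the field positions shown in the comments, and a PES packet that begins with the start code prefix 0x000001. The names `TsHeader`, `parse_ts_header` and `starts_pes_packet` are hypothetical.

```python
from typing import NamedTuple


class TsHeader(NamedTuple):
    pid: int                    # multiplex-wide: identifies which stream the packet belongs to
    payload_unit_start: bool    # set when a new PES packet (or section) begins in this packet
    has_adaptation_field: bool
    has_payload: bool
    continuity_counter: int


def parse_ts_header(packet: bytes) -> TsHeader:
    """Parse the assumed 4-byte Transport Stream packet header."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte packet starting with the sync byte")
    return TsHeader(
        pid=((packet[1] & 0x1F) << 8) | packet[2],
        payload_unit_start=bool(packet[1] & 0x40),
        has_adaptation_field=bool(packet[3] & 0x20),
        has_payload=bool(packet[3] & 0x10),
        continuity_counter=packet[3] & 0x0F,
    )


def starts_pes_packet(packet: bytes) -> bool:
    """Return True if the payload begins with the PES start code prefix 0x000001."""
    header = parse_ts_header(packet)
    if not (header.payload_unit_start and header.has_payload):
        return False
    offset = 4
    if header.has_adaptation_field:
        offset += 1 + packet[4]  # skip the length byte and the adaptation field itself
    return packet[offset:offset + 3] == b"\x00\x00\x01"
```

The packet layer fields (PID, continuity counter) are handled for every packet in the multiplex, while the PES layer is only reached after the packets of one elementary stream have been selected.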
A prototypical decoder for Transport Streams, including audio and video, is also depicted in Figure 0-2 to illustrate the function of a decoder. The architecture is not unique -- some system decoder functions, such as decoder timing control, might equally well be distributed among elementary stream decoders and the channel specific decoder -- but this figure is useful for discussion. Likewise, indication of errors detected by the channel specific decoder to the individual audio and video decoders may be performed in various ways and such communication paths are not shown in the diagram. The prototypical decoder design does not imply any normative requirement for the design of a Transport Stream decoder. Indeed non-audio/video data is also allowed, but not shown.
Figure 0-2 -- Prototypical transport demultiplexing and decoding example
Figure 0-3 illustrates the second case, where a Transport Stream containing multiple programs is converted into a Transport Stream containing a single program. In this case the remultiplexing operation may necessitate the correction of Program Clock Reference (PCR) values to account for changes in the PCR locations in the bit stream.
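One way to picture the PCR correction mentioned above is as an adjustment for the change in delay that a PCR-bearing packet experiences in the remultiplexer. The sketch below is an assumption about how such a correction might be done, not a method prescribed by this text: it assumes PCR values and times are in 27 MHz ticks, that the PCR wraps at 2^33 x 300, and that the PCR is carried as a 33-bit base plus a 0..299 extension. All names are hypothetical.

```python
PCR_MODULUS = (1 << 33) * 300  # assumed wrap point: 33-bit base times 300


def restamp_pcr(pcr: int, input_time: int, output_time: int,
                nominal_delay: int) -> int:
    """Return a corrected PCR for a packet that was moved in the bit stream.

    input_time and output_time are the packet's arrival and departure times,
    and nominal_delay is the constant delay the remultiplexer is assumed to
    apply to every packet. Only the deviation from that constant delay is
    added to the PCR. All values are in 27 MHz ticks.
    """
    delay_change = (output_time - input_time) - nominal_delay
    return (pcr + delay_change) % PCR_MODULUS


def split_pcr(pcr: int) -> tuple[int, int]:
    """Split a PCR in 27 MHz ticks into (base, extension)."""
    return (pcr // 300) % (1 << 33), pcr % 300
```

A packet that leaves exactly `nominal_delay` ticks after it arrived keeps its original PCR value; any extra or missing delay is folded into the corrected value.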
Figure 0-3 -- Prototypical transport multiplexing example
Figure 0-4 illustrates a case in which a multi-program Transport Stream is first demultiplexed and then converted into a Program Stream.
Figure 0-4 -- Prototypical Transport Stream to Program Stream conversion
Figure 0-3 and Figure 0-4 indicate that it is possible and reasonable to convert between different types and configurations of Transport Streams. There are specific fields defined in the Transport Stream and Program Stream syntax which facilitate the conversions illustrated. There is no requirement that specific implementations of demultiplexors or decoders include all of these functions.