6.3.4 Extension and user data

extension_start_code -- The extension_start_code is the bit string ‘000001B5’ in hexadecimal. It identifies the beginning of extensions beyond ISO/IEC 11172-2.

6.3.4.1 User data

user_data_start_code -- The user_data_start_code is the bit string ‘000001B2’ in hexadecimal. It identifies the beginning of user data. The user data continues until receipt of another start code.

user_data -- This is an 8 bit integer, an arbitrary number of which may follow one another. User data is defined by users for their specific applications. In the series of consecutive user_data bytes there shall not be a string of 23 or more consecutive zero bits.
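The restriction on zero bits can be checked mechanically. The following is a minimal sketch in Python; the function name has_long_zero_run is hypothetical, and bits are assumed to be examined most significant bit first within each byte.

```python
def has_long_zero_run(user_data: bytes, limit: int = 23) -> bool:
    """Return True if user_data contains `limit` or more consecutive zero bits."""
    run = 0
    for byte in user_data:
        for shift in range(7, -1, -1):  # most significant bit first
            if (byte >> shift) & 1:
                run = 0  # a one bit ends the current run of zeros
            else:
                run += 1
                if run >= limit:
                    return True
    return False
```

A run of 23 zero bits followed by a one bit is the start code prefix, so user data that passes this check cannot emulate a start code.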

6.3.5 Sequence extension

extension_start_code_identifier -- This is a 4-bit integer which identifies the extension. See Table 6-2.

profile_and_level_indication -- This is an 8-bit integer used to signal the profile and level identification. The meaning of the bits is given in clause 8.

NOTE - In a scalable hierarchy the bitstreams of each layer may set profile_and_level_indication to a different value as specified in clause 8.

progressive_sequence -- When set to ‘1’ the coded video sequence contains only progressive frame-pictures. When progressive_sequence is set to ‘0’ the coded video sequence may contain both frame-pictures and field-pictures, and frame-pictures may be progressive or interlaced frames.

chroma_format -- This is a two bit integer indicating the chrominance format as defined in Table 6-5.

Table 6-5. Meaning of chroma_format

chroma_format	Meaning
00	reserved
01	4:2:0
11	4:4:4

horizontal_size_extension -- This 2 bit integer is the 2 most significant bits from horizontal_size.

vertical_size_extension -- This 2 bit integer is the 2 most significant bits from vertical_size.

bit_rate_extension -- This 12 bit integer is the 12 most significant bits from bit_rate.

vbv_buffer_size_extension -- This 8 bit integer is the 8 most significant bits from vbv_buffer_size.
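Each of the four extension fields above supplies the most significant bits of a quantity whose least significant bits are carried in the sequence header. A minimal sketch of the recombination follows. The widths of the sequence header fields (12, 12, 18 and 10 bits) are assumptions taken from the sequence header definition, which is outside this subclause; the names VALUE_BITS and combine are hypothetical.

```python
# Bit widths of the *_value fields carried in the sequence header.
# Assumption: these widths come from the sequence header definition,
# which is not part of this subclause.
VALUE_BITS = {
    "horizontal_size": 12,
    "vertical_size": 12,
    "bit_rate": 18,
    "vbv_buffer_size": 10,
}


def combine(name: str, value: int, extension: int) -> int:
    """Place `extension` above the `value` bits to form the full quantity."""
    bits = VALUE_BITS[name]
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}_value does not fit in {bits} bits")
    if extension < 0:
        raise ValueError("extension must be non-negative")
    return (extension << bits) | value
```

For example, combine("horizontal_size", 1920, 0) leaves the 12-bit value unchanged, while a non-zero extension adds multiples of 4096.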

low_delay -- This flag, when set to ‘1’, indicates that the sequence does not contain any B-pictures, that the frame reordering delay is not present in the VBV description and that the bitstream may contain “big pictures”, i.e. that C.7 of the VBV may apply.

When set to ‘0’, it indicates that the sequence may contain B-pictures, that the frame reordering delay is present in the VBV description and that the bitstream shall not contain big pictures, i.e. C.7 of the VBV does not apply.

This flag is not used during the decoding process and therefore can be ignored by decoders, but it is necessary to define and verify the compliance of low-delay bitstreams.

frame_rate_extension_n -- This is a 2 bit integer used to determine the frame_rate. See frame_rate_code.

frame_rate_extension_d -- This is a 5 bit integer used to determine the frame_rate. See frame_rate_code.
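As an illustration of how the two extension fields are used, the sketch below assumes the relationship given under frame_rate_code (outside this subclause): the frame rate is the value selected by frame_rate_code multiplied by (frame_rate_extension_n + 1) and divided by (frame_rate_extension_d + 1). The function name is hypothetical.

```python
from fractions import Fraction


def frame_rate(frame_rate_value: Fraction, extension_n: int, extension_d: int) -> Fraction:
    """Scale the frame rate selected by frame_rate_code by (n + 1) / (d + 1)."""
    if not 0 <= extension_n < 4:    # 2 bit integer
        raise ValueError("frame_rate_extension_n out of range")
    if not 0 <= extension_d < 32:   # 5 bit integer
        raise ValueError("frame_rate_extension_d out of range")
    return frame_rate_value * Fraction(extension_n + 1, extension_d + 1)
```

Using exact fractions avoids rounding errors for rates such as 30000/1001.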

6.3.6 Sequence display extension

This specification does not define the display process. The information in this extension does not affect the decoding process and may be ignored by decoders that conform to this specification.

video_format -- This is a three bit integer indicating the representation of the pictures before being coded in accordance with this specification. Its meaning is defined in Table 6-6. If the sequence_display_extension() is not present in the bitstream then the video format may be assumed to be “Unspecified video format”.

Table 6-6. Meaning of video_format

video_format	Meaning
000	component
001	PAL
111	reserved

colour_description -- A flag which if set to ‘1’ indicates the presence of colour_primaries, transfer_characteristics and matrix_coefficients in the bitstream.

colour_primaries -- This 8-bit integer describes the chromaticity coordinates of the source primaries, and is defined in Table 6-7.

Table 6-7. Colour Primaries

Value	Primaries
0	(forbidden)
1	Recommendation ITU-R BT.709: green x = 0,300, y = 0,600; blue x = 0,150, y = 0,060; red x = 0,640, y = 0,330; white D65 x = 0,3127, y = 0,3290
8-255	reserved

In the case that sequence_display_extension() is not present in the bitstream or colour_description is zero the chromaticity is assumed to be that corresponding to colour_primaries having the value 1.

transfer_characteristics -- This 8-bit integer describes the opto-electronic transfer characteristic of the source picture, and is defined in Table 6-8.

Table 6-8. Transfer Characteristics

Value	Transfer Characteristic
0	(forbidden)
1	Recommendation ITU-R BT.709: V = 1,099 L_c^0,45 - 0,099 for 1 ≥ L_c ≥ 0,018; V = 4,500 L_c for 0,018 > L_c ≥ 0
8	Linear transfer characteristics i.e. V = L_c
9-255	reserved

In the case that sequence_display_extension() is not present in the bitstream or colour_description is zero the transfer characteristics are assumed to be those corresponding to transfer_characteristics having the value 1.
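The default transfer characteristic (value 1 in Table 6-8) can be sketched as follows. This is an illustration only; the function name bt709_oetf is hypothetical, and L_c is assumed to be normalised to the range 0 to 1.

```python
def bt709_oetf(l_c: float) -> float:
    """Opto-electronic transfer characteristic for value 1 of Table 6-8."""
    if not 0.0 <= l_c <= 1.0:
        raise ValueError("L_c must lie between 0 and 1")
    if l_c >= 0.018:
        return 1.099 * l_c ** 0.45 - 0.099  # power-law segment
    return 4.5 * l_c                        # linear segment near black
```

The two segments agree closely, though not exactly, at L_c = 0,018.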

matrix_coefficients -- This 8-bit integer describes the matrix coefficients used in deriving luminance and chrominance signals from the green, blue, and red primaries, and is defined in Table 6-9.

In this table:

E’_Y is analogue with values between 0 and 1

E’_PB and E’_PR are analogue between the values -0,5 and 0,5

E’_R, E’_G and E’_B are analogue with values between 0 and 1

Y, Cb and Cr are related to E’_Y, E’_PB and E’_PR by the following formulae.

Y = ( 219 * E’_Y ) + 16.

Cb = ( 224 * E’_PB ) + 128.

Cr = ( 224 * E’_PR ) + 128.

NOTE - The decoding process given by this specification limits output sample values for Y, Cr and Cb to the range [0:255]. Thus sample values outside the range implied by the above equations may occasionally occur at the output of the decoding process. In particular the sample values 0 and 255 may occur.

Table 6-9. Matrix Coefficients

Value	Matrix
0	(forbidden)
1	Recommendation ITU-R BT.709: E’_Y = 0,7154 E’_G + 0,0721 E’_B + 0,2125 E’_R; E’_PB = -0,386 E’_G + 0,500 E’_B - 0,115 E’_R; E’_PR = -0,454 E’_G - 0,046 E’_B + 0,500 E’_R
8-255	reserved

In the case that sequence_display_extension() is not present in the bitstream or colour_description is zero the matrix coefficients are assumed to be those corresponding to matrix_coefficients having the value 1.
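The default matrix (value 1 in Table 6-9) and the formulae relating Y, Cb and Cr to E’_Y, E’_PB and E’_PR can be combined as in the sketch below. The function names are hypothetical, and rounding to the nearest integer with Python's round() is an assumption: the formulae above give real-valued results and do not state a rounding rule.

```python
from typing import Tuple


def bt709_matrix(e_r: float, e_g: float, e_b: float) -> Tuple[float, float, float]:
    """Derive E'_Y, E'_PB, E'_PR from E'_R, E'_G, E'_B (Table 6-9, value 1)."""
    e_y = 0.7154 * e_g + 0.0721 * e_b + 0.2125 * e_r
    e_pb = -0.386 * e_g + 0.500 * e_b - 0.115 * e_r
    e_pr = -0.454 * e_g - 0.046 * e_b + 0.500 * e_r
    return e_y, e_pb, e_pr


def to_digital(e_y: float, e_pb: float, e_pr: float) -> Tuple[int, int, int]:
    """Apply Y = 219 E'_Y + 16, Cb = 224 E'_PB + 128, Cr = 224 E'_PR + 128."""
    # Assumption: round to the nearest integer.
    y = round(219 * e_y + 16)
    cb = round(224 * e_pb + 128)
    cr = round(224 * e_pr + 128)
    return y, cb, cr
```

For instance, to_digital(*bt709_matrix(0.0, 0.0, 0.0)) gives the black level with both chrominance samples at their mid-point.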

display_horizontal_size -- See display_vertical_size.

display_vertical_size -- display_horizontal_size and display_vertical_size together define a rectangle which may be considered as the “intended display’s” active region. If this rectangle is smaller than the encoded frame size then the display process may be expected to display only a portion of the encoded frame. Conversely if the display rectangle is larger than the encoded frame size then the display process may be expected to display the reconstructed frames on a portion of the display device rather than on the whole display device.

display_horizontal_size shall be in the same units as horizontal_size (samples of the encoded frames).

display_vertical_size shall be in the same units as vertical_size (lines of the encoded frames).

display_horizontal_size and display_vertical_size do not affect the decoding process but may be used by the display process that is not standardised in this specification.

6.3.7 Sequence scalable extension

It is a syntactic restriction that if a sequence_scalable_extension() is present in the bitstream following a given sequence_extension() then sequence_scalable_extension() shall follow every other occurrence of sequence_extension(). Thus a bitstream is either scalable or it is not scalable. It is not possible to mix scalable and non-scalable coding within a sequence.

scalable_mode -- The scalable_mode indicates the type of scalability used in the video sequence. If no sequence_scalable_extension() is present in the bitstream then no scalability is used for that sequence. scalable_mode also indicates the macroblock_type tables to be used. However in the case of spatial scalability if no picture_spatial_scalable_extension() is present for a given picture then that picture shall be decoded in a non-scalable manner (i.e. as if sequence_scalable_extension() had not been present).

Table 6-10. Definition of scalable_mode

scalable_mode	Meaning	picture_spatial_scalable_extension()	macroblock_type tables
sequence_scalable_extension() not present				B-2, B-3 and B-4
00	data partitioning		B-2, B-3 and B-4
01	spatial scalability	present	B-5, B-6 and B-7
		not present	B-2, B-3 and B-4
10	SNR scalability		B-8
11	temporal scalability		B-2, B-3 and B-4

layer_id -- This is an integer which identifies the layers in a scalable hierarchy. The base layer always has layer_id = 0. However the base layer of a scalable hierarchy does not carry a sequence_scalable_extension() and hence layer_id, except in the case of data partitioning. Each successive layer has a layer_id which is one greater than the layer for which it is an enhancement.

In the case of data partitioning layer_id shall be zero for partition zero and layer_id shall be one for partition one.

lower_layer_prediction_horizontal_size -- This is a 14-bit integer indicating the horizontal size of the lower layer frame which is used for prediction. This shall contain the value contained in horizontal_size (horizontal_size_value and horizontal_size_extension) in the lower layer bitstream.

lower_layer_prediction_vertical_size -- This is a 14-bit integer indicating the vertical size of the lower layer frame which is used for prediction. This shall contain the value contained in vertical_size (vertical_size_value and vertical_size_extension) in the lower layer bitstream.

horizontal_subsampling_factor_m -- This affects the spatial scalable upsampling process, as defined in 7.7.2. The value zero is forbidden.

horizontal_subsampling_factor_n -- This affects the spatial scalable upsampling process, as defined in 7.7.2. The value zero is forbidden.

vertical_subsampling_factor_m -- This affects the spatial scalable upsampling process, as defined in 7.7.2. The value zero is forbidden.

vertical_subsampling_factor_n -- This affects the spatial scalable upsampling process, as defined in 7.7.2. The value zero is forbidden.

picture_mux_enable -- If set to 1, picture_mux_order and picture_mux_factor are used for remultiplexing prior to display.

mux_to_progressive_sequence -- This flag when set to ‘1’ indicates that the decoded pictures corresponding to the two layers shall be temporally multiplexed to generate a progressive sequence for display. When the temporal multiplexing is intended to generate an interlaced sequence this flag shall be ‘0’.

picture_mux_order -- This denotes the number of enhancement layer pictures prior to the first base layer picture. It thus assists remultiplexing of pictures prior to display as it contains information for inverting the demultiplexing performed at the encoder.

picture_mux_factor -- This denotes the number of enhancement layer pictures between consecutive base layer pictures to allow correct remultiplexing of base and enhancement layers for display. It also assists in remultiplexing of pictures prior to display as it contains information for inverting the temporal demultiplexing performed at the encoder. The value ‘000’ is reserved.

6.3.8 Group of pictures header

group_start_code -- The group_start_code is the bit string ‘000001B8’ in hexadecimal. It identifies the beginning of a group of pictures header.

time_code -- This is a 25-bit integer containing the following: drop_frame_flag, time_code_hours, time_code_minutes, marker_bit, time_code_seconds and time_code_pictures as shown in Table 6-11. The parameters correspond to those defined in the IEC standard publication 461 for “time and control codes for video tape recorders” (see Bibliography, Annex G). The time code refers to the first picture after the group of pictures header that has a temporal_reference of zero. The drop_frame_flag can be set to either ‘0’ or ‘1’. It may be set to ‘1’ only if the frame rate is 29,97 Hz. If it is ‘0’ then pictures are counted assuming rounding to the nearest integral number of pictures per second; for example, 29,97 Hz would be rounded to and counted as 30 Hz. If it is ‘1’ then picture numbers 0 and 1 at the start of each minute, except minutes 0, 10, 20, 30, 40 and 50, are omitted from the count.

NOTE - The information carried by time_code plays no part in the decoding process.

Table 6-11. time_code

time_code	range of value	No. of bits	Mnemonic
drop_frame_flag		1	uimsbf
time_code_hours	0 - 23	5	uimsbf
time_code_pictures	0 - 59	6	uimsbf
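The 25-bit time_code can be unpacked with shifts and masks. The sketch below follows the field order given in the description of time_code. The 6-bit widths of time_code_minutes and time_code_seconds are inferred from the 25-bit total, and the check that marker_bit is ‘1’ is an assumption; the names TimeCode, pack_time_code and parse_time_code are hypothetical.

```python
from typing import NamedTuple


class TimeCode(NamedTuple):
    drop_frame_flag: int
    hours: int
    minutes: int
    seconds: int
    pictures: int


def pack_time_code(tc: TimeCode) -> int:
    """Build the 25-bit integer, most significant field first."""
    return ((tc.drop_frame_flag << 24) | (tc.hours << 19) | (tc.minutes << 13)
            | (1 << 12)                      # marker_bit
            | (tc.seconds << 6) | tc.pictures)


def parse_time_code(value: int) -> TimeCode:
    """Split a 25-bit time_code into its fields."""
    if not 0 <= value < (1 << 25):
        raise ValueError("time_code must fit in 25 bits")
    if (value >> 12) & 1 != 1:               # assumption: marker_bit is '1'
        raise ValueError("marker_bit is not set")
    return TimeCode(
        drop_frame_flag=(value >> 24) & 0x01,
        hours=(value >> 19) & 0x1F,          # 5 bits
        minutes=(value >> 13) & 0x3F,        # 6 bits (inferred width)
        seconds=(value >> 6) & 0x3F,         # 6 bits (inferred width)
        pictures=value & 0x3F,               # 6 bits
    )
```

Packing and then parsing a time code returns the original fields, which gives a simple round-trip check.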

closed_gop -- This is a one-bit flag which indicates the nature of the predictions used in the first consecutive B-pictures (if any) immediately following the first coded I-frame following the group of pictures header.

closed_gop is set to ‘1’ to indicate that these B-pictures have been encoded using only backward prediction or intra coding.

This bit is provided for use during any editing which occurs after encoding. If the previous pictures have been removed by editing, broken_link may be set to ‘1’ so that a decoder may avoid displaying these B-Pictures following the first I-Picture following the group of pictures header. However if the closed_gop bit is set to ‘1’, then the editor may choose not to set the broken_link bit as these B-Pictures can be correctly decoded.

broken_link -- This is a one-bit flag which shall be set to ‘0’ during encoding. It is set to ‘1’ to indicate that the first consecutive B-Pictures (if any) immediately following the first coded I-frame following the group of pictures header may not be correctly decoded because the reference frame which is used for prediction is not available (because of the action of editing).

A decoder may use this flag to avoid displaying frames that cannot be correctly decoded.

6.3.9 Picture header

picture_start_code -- The picture_start_code is a string of 32 bits having the value 00000100 in hexadecimal.

temporal_reference -- The temporal_reference is a 10-bit unsigned integer associated with each coded picture.

The following specification applies when low_delay is equal to zero.

When a frame is coded as two field pictures, the temporal_reference associated with each coded picture shall be the same. The temporal_reference of each coded frame shall increment by one modulo 1024 when examined in display order at the output of the decoding process, except when a group of pictures header occurs. After a group of pictures header, the temporal_reference of the first frame in display order shall be set to zero.

The following specification applies when low_delay is equal to one.

When low_delay is equal to one, there may be situations where the VBV buffer shall be re-examined several times before removing a coded picture (referred to as a big picture) from the VBV buffer.

If there is a big picture, the temporal_reference of the picture immediately following the big picture shall be equal to the temporal_reference of the big picture incremented by N+1 modulo 1024, where N is the number of times that the VBV buffer is re-examined (N>0). If the big picture is immediately followed by a group of pictures header, the temporal_reference of the first coded picture after the group of pictures header shall be set to N.

The temporal_reference of a picture that does not immediately follow a big picture follows the specification for the case when low delay is equal to zero.

NOTE - If the big picture is the first field of a frame coded with field pictures, then the temporal_reference of the two field pictures of that coded frame are not identical.
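The rule for the picture that immediately follows a big picture can be sketched as follows. The function name and the gop_header_follows parameter are hypothetical.

```python
def temporal_reference_after_big_picture(big_picture_tr: int, n: int,
                                         gop_header_follows: bool = False) -> int:
    """temporal_reference of the picture immediately following a big picture.

    n is the number of times the VBV buffer is re-examined (N > 0).
    """
    if n <= 0:
        raise ValueError("N must be greater than zero for a big picture")
    if gop_header_follows:
        return n                              # counting restarts after the header
    return (big_picture_tr + n + 1) % 1024    # 10-bit wrap-around
```

The modulo keeps the result within the 10-bit range of temporal_reference.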

picture_coding_type -- The picture_coding_type identifies whether a picture is an intra-coded picture(I), predictive-coded picture(P) or bidirectionally predictive-coded picture(B). The meaning of picture_coding_type is defined in Table 6-12.

NOTE - Intra-coded pictures with only DC coefficients (D-pictures) that may be used in ISO/IEC 11172-2 are not supported by this specification.

Table 6-12. picture_coding_type

picture_coding_type	coding method
000	forbidden
001	intra-coded (I)
111	reserved

vbv_delay -- The vbv_delay is a 16-bit unsigned integer. In all cases other than when vbv_delay has the value hexadecimal FFFF, the value of vbv_delay is the number of periods of a 90 kHz clock derived from the 27 MHz system clock that the VBV shall wait after receiving the final byte of the picture start code before decoding the picture. vbv_delay shall be coded to represent the delay as specified above or it shall be coded with the value hexadecimal FFFF. If any vbv_delay field in a sequence is coded with hexadecimal FFFF then all of them shall be coded with this value. If vbv_delay takes the value hexadecimal FFFF, input of data to the VBV buffer is defined in C.3.2 of annex C, otherwise input to the VBV buffer is defined in clause C.3.1.

If low_delay is equal to ‘1’ and if the bitstream contains big pictures, the vbv_delay values encoded in the picture_header() of big pictures may be wrong if not equal to hexadecimal FFFF.

NOTE - There are several ways of calculating vbv_delay in an encoder.

In all cases it may be calculated by noting that the end-to-end delay through the encoder and decoder buffer is constant for all pictures. The encoder is capable of knowing the delay experienced by the relevant picture start code in the encoder buffer and the total end-to-end delay. Therefore the value encoded in vbv_delay (the decoder buffer delay of the picture start code) is calculated as the total delay less the delay of the corresponding picture start code in the encoder buffer measured in periods of a 90 kHz clock derived from the 27 MHz system clock.

Alternatively, for constant bitrate operation only, vbv_delay may be calculated from the state of the VBV as follows:

vbv_delay_n = 90 000 * B_n^* / R

where:

n > 0

B_n^* = VBV occupancy, measured in bits, immediately before removing picture n from the buffer but after removing any header(s), user data and stuffing that immediately precedes the data elements of picture n.

R = the actual bitrate (i.e. to full accuracy rather than the quantised value given by bit_rate in the sequence header.)
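For constant bitrate operation the formula above translates directly into code. In this sketch the function name is hypothetical, and rounding to the nearest integer is an assumption, since the formula gives a real-valued result.

```python
def vbv_delay_cbr(occupancy_bits: int, bitrate_bps: float) -> int:
    """vbv_delay_n = 90 000 * B_n^* / R, in periods of the 90 kHz clock."""
    if bitrate_bps <= 0:
        raise ValueError("bitrate must be positive")
    delay = round(90_000 * occupancy_bits / bitrate_bps)  # assumption: nearest integer
    # Hexadecimal FFFF is reserved as a special value, so a real delay must stay below it.
    if not 0 <= delay < 0xFFFF:
        raise ValueError("delay cannot be represented in vbv_delay")
    return delay
```

For example, an occupancy of 1 000 000 bits at 4 000 000 bit/s corresponds to a quarter of a second, or 22 500 clock periods.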

An equivalent method of calculating vbv_delay for variable bitrate streams can be derived from the equation in C.3.1. This will be in the form of a recurrence relation for the vbv_delay given the previous vbv_delay, the decoding times of the current and previous pictures, and the number of bytes in the previous picture. This method can be applied if, at the time vbv_delay is encoded, the average bitrate of the transfer of the picture data of the previous picture is known.

full_pel_forward_vector -- This flag that is used in ISO/IEC 11172-2 is not used by this specification. It shall have the value ‘0’.

forward_f_code -- This 3 bit string (which is used in ISO/IEC 11172-2) is not used by this specification. It shall have the value ‘111’.

full_pel_backward_vector -- This flag that is used in ISO/IEC 11172-2 is not used by this specification. It shall have the value ‘0’.

backward_f_code -- This 3 bit string (which is used in ISO/IEC 11172-2) is not used by this specification. It shall have the value ‘111’.

extra_bit_picture -- A bit that indicates the presence of the following extra information. If extra_bit_picture is set to ‘1’, extra_information_picture will follow it. If it is set to ‘0’, there are no data following it. extra_bit_picture shall be set to ‘0’; the value ‘1’ is reserved for possible future extensions defined by ITU-T|ISO/IEC.

extra_information_picture -- Reserved. A decoder conforming to this specification that encounters extra_information_picture in a bitstream shall ignore it (i.e. remove from the bitstream and discard). A bitstream conforming to this specification shall not contain this syntax element.

6.3.10 Picture coding extension

f_code[s][t] -- A 4 bit unsigned integer taking values 1 through 9, or 15. The value zero is forbidden and the values 10 through 14 are reserved. It is used in the decoding of motion vectors, see 7.6.3.1.

In an I-picture in which concealment_motion_vectors is zero f_code[s][t] is not used (since motion vectors are not used) and shall take the value 15 (all ones).

Similarly, in an I-picture or a P-picture f_code[1][t] is not used in the decoding process (since it refers to backwards motion vectors) and shall take the value 15 (all ones).

See Table 7-7 for the meaning of the indices s and t.

intra_dc_precision -- This is a 2-bit integer defined in Table 6-13.

Table 6-13 Intra DC precision

intra_dc_precision	Precision (bits)
00	8
01	9
11	11

The inverse quantisation process for the Intra DC coefficients is modified by this parameter as explained in 7.4.1.

picture_structure -- This is a 2-bit integer defined in Table 6-14.

Table 6-14 Meaning of picture_structure

picture_structure	Meaning
00	reserved
01	Top Field
11	Frame picture

When a frame is encoded in the form of two field pictures both fields must be of the same picture_coding_type, except where the first encoded field is an I-picture in which case the second may be either an I-picture or a P-picture.

The first encoded field of a frame may be a top-field or a bottom field, and the next field must be of opposite parity.

When a frame is encoded in the form of two field pictures the following syntax elements may be set independently in each field picture:

• f_code[0][0], f_code[0][1]

• f_code[1][0], f_code[1][1]

• intra_dc_precision, concealment_motion_vectors, q_scale_type

• intra_vlc_format, alternate_scan

top_field_first -- The meaning of this element depends upon picture_structure, progressive_sequence and repeat_first_field.

If progressive_sequence is equal to ‘0’, this flag indicates what field of a reconstructed frame is output first by the decoding process:

In a field picture top_field_first shall have the value ‘0’, and the only field output by the decoding process is the decoded field picture.

In a frame picture top_field_first being set to ‘1’ indicates that the top field of the reconstructed frame is the first field output by the decoding process. top_field_first being set to ‘0’ indicates that the bottom field of the reconstructed frame is the first field output by the decoding process.

If progressive_sequence is equal to ‘1’, this flag, combined with repeat_first_field, indicates how many times (one, two or three) the reconstructed frame is output by the decoding process.

If repeat_first_field is set to 0, top_field_first shall be set to ‘0’. In this case the output of the decoding process corresponding to this reconstructed frame consists of one progressive frame.

If top_field_first is set to 0 and repeat_first_field is set to ‘1’, the output of the decoding process corresponding to this reconstructed frame consists of two identical progressive frames.

If top_field_first is set to 1 and repeat_first_field is set to ‘1’, the output of the decoding process corresponding to this reconstructed frame consists of three identical progressive frames.

frame_pred_frame_dct -- If this flag is set to ‘1’ then only frame-DCT and frame prediction are used. In a field picture it shall be ‘0’. frame_pred_frame_dct shall be ‘1’ if progressive_frame is ‘1’. This flag affects the syntax of the bitstream.

concealment_motion_vectors -- This flag has the value ‘1’ to indicate that motion vectors are coded in intra macroblocks. This flag has the value ‘0’ to indicate that no motion vectors are coded in intra macroblocks.

q_scale_type -- This flag affects the inverse quantisation process as described in 7.4.2.2.

intra_vlc_format -- This flag affects the decoding of transform coefficient data as described in 7.2.1.

alternate_scan -- This flag affects the decoding of transform coefficient data as described in 7.3.

repeat_first_field -- This flag is applicable only in a frame picture; in a field picture it shall be set to zero and does not affect the decoding process.

If progressive_sequence is equal to 0 and progressive_frame is equal to 0, repeat_first_field shall be zero, and the output of the decoding process corresponding to this reconstructed frame consists of two fields.

If progressive_sequence is equal to 0 and progressive_frame is equal to 1:

If this flag is set to 0, the output of the decoding process corresponding to this reconstructed frame consists of two fields. The first field (top or bottom field as identified by top_field_first) is followed by the other field.

If it is set to 1, the output of the decoding process corresponding to this reconstructed frame consists of three fields. The first field (top or bottom field as identified by top_field_first) is followed by the other field, then the first field is repeated.

If progressive_sequence is equal to 1:

If this flag is set to 0, the output of the decoding process corresponding to this reconstructed frame consists of one frame.

If it is set to 1, the output of the decoding process corresponding to this reconstructed frame consists of two or three frames, depending on the value of top_field_first.
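The rules for top_field_first and repeat_first_field can be summarised in a single function that returns how many fields or frames the decoding process outputs for one coded picture. This is a sketch of the rules as stated in this subclause; the function name and the field_picture parameter are hypothetical.

```python
from typing import Tuple


def decoder_output(progressive_sequence: int, progressive_frame: int,
                   top_field_first: int, repeat_first_field: int,
                   field_picture: bool = False) -> Tuple[int, str]:
    """Return (count, unit) output by the decoding process for one coded picture."""
    if field_picture:
        if progressive_sequence or repeat_first_field:
            raise ValueError("a field picture requires progressive_sequence = 0 "
                             "and repeat_first_field = 0")
        return 1, "field"
    if progressive_sequence:
        if not repeat_first_field:
            if top_field_first:
                raise ValueError("top_field_first shall be 0 when repeat_first_field is 0")
            return 1, "frame"
        return (3 if top_field_first else 2), "frame"
    if not progressive_frame:
        if repeat_first_field:
            raise ValueError("repeat_first_field shall be 0 when progressive_frame is 0")
        return 2, "field"
    return (3 if repeat_first_field else 2), "field"
```

Combinations that the text forbids raise ValueError instead of returning a count.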

chroma_420_type -- If chroma_format is “4:2:0”, the value of chroma_420_type shall be the same as progressive_frame; else chroma_420_type has no meaning and shall be equal to zero. This flag exists for historical reasons.

progressive_frame -- If progressive_frame is set to 0 it indicates that the two fields of the frame are interlaced fields in which an interval of time of the field period exists between (corresponding spatial samples) of the two fields. In this case the following restriction applies:

• repeat_first_field shall be zero (two field duration).

If progressive_frame is set to 1 it indicates that the two fields (of the frame) are actually from the same time instant as one another. In this case a number of restrictions to other parameters and flags in the bitstream apply:

• picture_structure shall be “Frame”

• frame_pred_frame_dct shall be 1

progressive_frame is used when the video sequence is used as the lower layer of a spatial scalable sequence. Here it affects the up-sampling process used in forming a prediction in the enhancement layer from the lower layer.

composite_display_flag -- This flag is set to 1 to indicate the presence of the following fields, which are of use when the input pictures have been coded as (analogue) composite video prior to encoding into a bitstream that complies with this specification. If it is set to 0 then these parameters do not occur in the bitstream.

The information relates to the picture that immediately follows the extension. In the case that this picture is a frame picture the information relates to the first field of that frame. The equivalent information for the second field may be derived (there is no way to represent it in the bitstream).

NOTES

1 The various syntactic elements that are included in the bitstream if composite_display_flag is ‘1’ are not used in the decoding process.

2 repeat_first_field will cause a composite video field to be repeated out of the 4 field or 8 field sequence. It is recommended that repeat_first_field and composite_display_flag are not both set simultaneously.

v_axis -- A 1-bit integer used only when the bitstream represents a signal that had previously been encoded according to PAL systems. v_axis is set to 1 on a positive sign, v_axis is set to 0 otherwise.

field_sequence -- A 3-bit integer which defines the number of the field in the eight field sequence used in PAL systems or the four field sequence used in NTSC systems as defined in Table 6-15.

Table 6-15 Definition of field_sequence.

field sequence	frame	field
000	1	1
001	1	2
111	4	8

sub_carrier -- This is a 1-bit integer. Set to 0 means the sub-carrier/line frequency relationship is correct. When set to 1 the relationship is not correct.

burst_amplitude -- This is a 7-bit integer defining the burst amplitude (for PAL and NTSC only). The amplitude of the sub-carrier burst is quantised as a Recommendation ITU-R BT.601 luminance signal, with the MSB omitted.

sub_carrier_phase -- This is an 8-bit integer defining the phase of the reference sub-carrier at the field-synchronisation datum with respect to field start as defined in Recommendation ITU-R BT.470. See Table 6-16.

Table 6-16 Definition of sub_carrier_phase.

sub_carrier_phase	Phase
0	([360^o÷256] * 0)
1	([360^o÷256] * 1)
255	([360^o÷256] * 255)
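The mapping in Table 6-16 is linear, so the phase in degrees follows directly from the coded value. A minimal sketch, with a hypothetical function name:

```python
def sub_carrier_phase_degrees(sub_carrier_phase: int) -> float:
    """Phase in degrees for an 8-bit sub_carrier_phase value (Table 6-16)."""
    if not 0 <= sub_carrier_phase <= 255:
        raise ValueError("sub_carrier_phase is an 8-bit integer")
    return (360.0 / 256.0) * sub_carrier_phase
```

Each step of the coded value therefore corresponds to 1,40625 degrees.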

6.3.11 Quant matrix extension

Each quantisation matrix has a default set of values. When a sequence_header_code is decoded all matrices shall be reset to their default values. User defined matrices may be downloaded and this can occur in a sequence_header() or in a quant_matrix_extension().

With 4:2:0 data only two matrices are used, one for intra blocks the other for non-intra blocks.

With 4:2:2 or 4:4:4 data four matrices are used. Both an intra and a non-intra matrix are provided for both luminance blocks and for chrominance blocks. Note however that it is possible to download the same user defined matrix into both the luminance and chrominance matrix at the same time.

The default matrix for intra blocks (both luminance and chrominance) is:

8	16	19	22	26	27	29	34
16	16	22	24	27	29	34	37
27	29	35	38	46	56	69	83

The default matrix for non-intra blocks (both luminance and chrominance) is:

16	16	16	16	16	16	16	16
16	16	16	16	16	16	16	16
16	16	16	16	16	16	16	16

load_intra_quantiser_matrix -- This is a one-bit flag which is set to ‘1’ if intra_quantiser_matrix follows. If it is set to ‘0’ then there is no change in the values that shall be used.

intra_quantiser_matrix -- This is a list of sixty-four 8-bit unsigned integers. The new values, encoded in the default zigzag scanning order as described in 7.3.1, replace the previous values. The first value shall always be 8. For all of the 8-bit unsigned integers, the value zero is forbidden. With 4:2:2 and 4:4:4 data the new values shall be used for both the luminance intra matrix and the chrominance intra matrix. However the chrominance intra matrix may subsequently be loaded with a different matrix.

load_non_intra_quantiser_matrix -- This is a one-bit flag which is set to ‘1’ if non_intra_quantiser_matrix follows. If it is set to ‘0’ then there is no change in the values that shall be used.

non_intra_quantiser_matrix -- This is a list of sixty-four 8-bit unsigned integers. The new values, encoded in the default zigzag scanning order as described in 7.3.1, replace the previous values. For all the 8-bit unsigned integers, the value zero is forbidden. With 4:2:2 and 4:4:4 data the new values shall be used for both the luminance non-intra matrix and the chrominance non-intra matrix. However the chrominance non-intra matrix may subsequently be loaded with a different matrix.

load_chroma_intra_quantiser_matrix -- This is a one-bit flag which is set to ‘1’ if chroma_intra_quantiser_matrix follows. If it is set to ‘0’ then there is no change in the values that shall be used. If chroma_format is “4:2:0” this flag shall take the value ‘0’.

chroma_intra_quantiser_matrix -- This is a list of sixty-four 8-bit unsigned integers. The new values, encoded in the default zigzag scanning order as described in 7.3.1, replace the previous values. The first value shall always be 8. For all of the 8-bit unsigned integers, the value zero is forbidden.

load_chroma_non_intra_quantiser_matrix -- This is a one-bit flag which is set to ‘1’ if chroma_non_intra_quantiser_matrix follows. If it is set to ‘0’ then there is no change in the values that shall be used. If chroma_format is “4:2:0” this flag shall take the value ‘0’.

chroma_non_intra_quantiser_matrix -- This is a list of sixty-four 8-bit unsigned integers. The new values, encoded in the default zigzag scanning order as described in 7.3.1, replace the previous values. For all the 8-bit unsigned integers, the value zero is forbidden.
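The loading rules for the four matrices can be sketched as a small state holder. This is an illustration under stated assumptions, not a definitive implementation: the class and method names are hypothetical, the default intra matrix is supplied by the caller, matrices are kept in the order in which they are transmitted (no inverse zigzag scan is applied), and the default non-intra matrix is taken to be 16 in every position.

```python
from typing import List

FLAT_16 = [16] * 64  # default non-intra matrix (all entries 16)


class QuantMatrices:
    def __init__(self, default_intra: List[int], chroma_format: str):
        self._default_intra = list(default_intra)
        self.chroma_format = chroma_format
        self.reset()

    def reset(self) -> None:
        """Called when a sequence_header_code is decoded: restore all defaults."""
        self.intra = list(self._default_intra)
        self.non_intra = list(FLAT_16)
        self.chroma_intra = list(self._default_intra)
        self.chroma_non_intra = list(FLAT_16)

    @staticmethod
    def _check(values: List[int], first_must_be_8: bool) -> None:
        if len(values) != 64:
            raise ValueError("a matrix has sixty-four entries")
        if any(not 1 <= v <= 255 for v in values):
            raise ValueError("entries are 8-bit and the value zero is forbidden")
        if first_must_be_8 and values[0] != 8:
            raise ValueError("the first value of an intra matrix shall be 8")

    def load_intra(self, values: List[int]) -> None:
        self._check(values, True)
        self.intra = list(values)
        self.chroma_intra = list(values)      # also replaces the chrominance matrix

    def load_non_intra(self, values: List[int]) -> None:
        self._check(values, False)
        self.non_intra = list(values)
        self.chroma_non_intra = list(values)  # also replaces the chrominance matrix

    def load_chroma_intra(self, values: List[int]) -> None:
        if self.chroma_format == "4:2:0":
            raise ValueError("chroma matrices are not loaded with 4:2:0 data")
        self._check(values, True)
        self.chroma_intra = list(values)

    def load_chroma_non_intra(self, values: List[int]) -> None:
        if self.chroma_format == "4:2:0":
            raise ValueError("chroma matrices are not loaded with 4:2:0 data")
        self._check(values, False)
        self.chroma_non_intra = list(values)
```

Loading an intra matrix first and a chrominance intra matrix afterwards leaves the luminance matrix untouched, which matches the order of the flags described above.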

6.3.12 Picture display extension

This specification does not define the display process. The information in this extension does not affect the decoding process and may be ignored by decoders that conform to this specification.

The picture display extension allows the position of the display rectangle whose size is specified in sequence_display_extension() to be moved on a picture-by-picture basis. One application for this is the implementation of pan-scan.

frame_centre_horizontal_offset -- This is a 16-bit signed integer giving the horizontal offset in units of 1/16th sample. A positive value shall indicate that the centre of the reconstructed frame lies to the right of the centre of the display rectangle.

frame_centre_vertical_offset -- This is a 16-bit signed integer giving the vertical offset in units of 1/16th sample. A positive value shall indicate that the centre of the reconstructed frame lies below the centre of the display rectangle.

The dimensions of the display rectangular region are defined in the sequence_display_extension(). The coordinates of the region within the coded picture are defined in the picture_display_extension().

The centre of the reconstructed frame is the centre of the rectangle defined by horizontal_size and vertical_size.
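
As an illustration of the sign convention, the position of the display rectangle within the reconstructed frame can be derived from the two offsets. The C fragment below is a sketch under stated assumptions: the type and function names are hypothetical, all results are kept in units of 1/16th sample, and the display rectangle size is taken from display_horizontal_size and display_vertical_size.

```c
#include <stdint.h>

/* Hypothetical result type: top-left corner of the display rectangle,
 * measured from the top-left corner of the reconstructed frame,
 * in units of 1/16th sample. */
typedef struct {
    int32_t x16;
    int32_t y16;
} DisplayOrigin16;

DisplayOrigin16 display_rect_origin(int32_t horizontal_size,
                                    int32_t vertical_size,
                                    int32_t display_horizontal_size,
                                    int32_t display_vertical_size,
                                    int16_t frame_centre_horizontal_offset,
                                    int16_t frame_centre_vertical_offset)
{
    DisplayOrigin16 o;

    /* The frame centre lies at (size / 2) samples = (size * 8) sixteenths.
     * A positive offset means the frame centre is to the right of (below)
     * the display centre, so display centre = frame centre - offset.
     * Subtracting half the display size gives the top-left corner. */
    o.x16 = horizontal_size * 8 - frame_centre_horizontal_offset
            - display_horizontal_size * 8;
    o.y16 = vertical_size * 8 - frame_centre_vertical_offset
            - display_vertical_size * 8;
    return o;
}
```

For a 720-sample wide frame and a 540-sample wide display rectangle, a horizontal offset of zero centres the rectangle (x16 = 1440, i.e. 90 samples from the left edge).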

Since (in the case of an interlaced sequence) a coded picture may relate to one, two or three decoded fields the picture_display_extension() may contain up to three offsets.

The number of frame centre offsets in the picture_display_extension() shall be defined as follows:

if ( progressive_sequence == 1) {
    if ( repeat_first_field == ‘1’ ) {
        if ( top_field_first == ‘1’ )
            number_of_frame_centre_offsets = 3
        else
            number_of_frame_centre_offsets = 2
    } else {
        number_of_frame_centre_offsets = 1
    }
} else {
    if (picture_structure == “field”) {
        number_of_frame_centre_offsets = 1
    } else {
        if (repeat_first_field == ‘1’ )
            number_of_frame_centre_offsets = 3
        else
            number_of_frame_centre_offsets = 2
    }
}
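
The rule above can also be written as a small C function. This is a sketch only: the function name is hypothetical, and picture_structure is reduced to a boolean that is true for a field picture.

```c
#include <stdbool.h>

/* Hypothetical helper returning number_of_frame_centre_offsets.
 * is_field_picture stands in for (picture_structure == "field"). */
int frame_centre_offset_count(bool progressive_sequence,
                              bool is_field_picture,
                              bool repeat_first_field,
                              bool top_field_first)
{
    if (progressive_sequence) {
        if (!repeat_first_field)
            return 1;
        return top_field_first ? 3 : 2;
    }

    if (is_field_picture)
        return 1;

    return repeat_first_field ? 3 : 2;
}
```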

A picture_display_extension() shall not occur unless a sequence_display_extension() followed the previous sequence_header().

In the case that a given picture does not have a picture_display_extension() then the most recently decoded frame centre offset shall be used. Note that each of the missing frame centre offsets has the same value (even if two or three frame centre offsets would have been contained in the picture_display_extension() had it been present). Following a sequence_header() the value zero shall be used for all frame centre offsets until a picture_display_extension() defines non-zero values.

Figure 6-16 illustrates the picture display parameters. As shown the frame centre offsets contained in the picture_display_extension() shall specify the position of the centre of the reconstructed frame from the centre of the display rectangle.

NOTES -

1 The display rectangle may also be larger than the reconstructed frame.

2 Even in a field picture the frame_centre_vertical_offset still represents the offset of the centre of the frame in 1/16ths of a frame line (not a line in the field).

3 In the example of Figure 6-16 both frame_centre_horizontal_offset and frame_centre_vertical_offset have negative values.

Figure 6-16. Frame centre offset parameters

6.3.12.1 Pan-scan

The frame centre offsets may be used to implement pan-scan in which a rectangular region is defined which may be panned around the entire reconstructed frame.

By way of example only, this facility may be used to identify a 3/4 aspect ratio window in a 9/16 coded picture format. This would allow a decoder to produce usable pictures for a conventional definition television set from an encoded format intended for enhanced definition. The 3/4 aspect ratio region is intended to contain the “most interesting” region of the picture.

The 3/4 region is defined by display_horizontal_size and display_vertical_size. The 9/16 frame size is defined by horizontal_size and vertical_size.

6.3.13 Picture temporal scalable extension

NOTE - See also 7.9.

reference_select_code -- This is a 2-bit code that identifies reference frames or reference fields for prediction depending on the picture type.

forward_temporal_reference -- A 10 bit unsigned integer value which indicates temporal reference of the lower layer frame to be used to provide the forward prediction. If the lower layer indicates temporal reference with more than 10 bits, the least significant bits are encoded here. If the lower layer indicates temporal reference with fewer than 10 bits, all bits are encoded here and the more significant bits shall be set to zero.

backward_temporal_reference -- A 10 bit unsigned integer value which indicates temporal reference of the lower layer frame to be used to provide the backward prediction. If the lower layer indicates temporal reference with more than 10 bits, the least significant bits are encoded here. If the lower layer indicates temporal reference with fewer than 10 bits, all bits are encoded here and the more significant bits shall be set to zero.
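
The rule for fitting a lower layer temporal reference into the 10-bit field amounts to keeping the ten least significant bits. A minimal C sketch follows; the function name is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: value to place in a 10-bit temporal reference
 * field. Wider values keep only their 10 least significant bits;
 * narrower values pass through with the upper bits equal to zero. */
uint16_t temporal_reference_field(uint32_t lower_layer_temporal_reference)
{
    return (uint16_t)(lower_layer_temporal_reference & 0x3FFu);
}
```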

6.3.14 Picture spatial scalable extension

lower_layer_temporal_reference -- A 10 bit unsigned integer value which indicates temporal reference of the lower layer frame to be used to provide the prediction. If the lower layer indicates temporal reference with more than 10 bits, the least significant bits are encoded here. If the lower layer indicates temporal reference with fewer than 10 bits, all bits are encoded here and the more significant bits shall be set to zero.

lower_layer_horizontal_offset -- This 15 bit signed (twos complement) integer specifies the horizontal offset (of the top left hand corner) of the upsampled lower layer frame relative to the enhancement layer picture. It is expressed in units of the enhancement layer picture sample width. If the chrominance format is 4:2:0 or 4:2:2 then this parameter shall be an even number.

lower_layer_vertical_offset -- This 15 bit signed (twos complement) integer specifies the vertical offset (of the top left hand corner) of the upsampled lower layer picture relative to the enhancement layer picture. It is expressed in units of the enhancement layer picture sample height. If the chrominance format is 4:2:0 then this parameter shall be an even number.

spatial_temporal_weight_code_table_index -- This 2 bit integer indicates which table of spatial temporal weight codes is to be used as defined in 7.7. Permissible values of spatial_temporal_weight_code_table_index are defined in Table 7-21.

lower_layer_progressive_frame -- This flag shall be set to ‘0’ if the lower layer frame is interlaced and shall be set to ‘1’ if the lower layer frame is progressive. The use of this flag in the spatial scalable upsampling process is defined in 7.7.

lower_layer_deinterlaced_field_select -- This flag affects the spatial scalable upsampling process, as defined in 7.7.

6.3.15 Copyright extension

extension_start_code_identifier -- This is a 4-bit integer which identifies the extension. See Table 6-2.

copyright_flag -- This is a one bit flag. When copyright_flag is set to ‘1’, it indicates that the source video material encoded in all the coded pictures following the copyright extension, in coding order, up to the next copyright extension or end of sequence code, is copyrighted. The copyright_identifier and copyright_number identify the copyrighted work. When copyright_flag is set to ‘0’, it does not indicate whether the source video material encoded in all the coded pictures following the copyright extension, in coding order, is copyrighted or not.

copyright_identifier -- This is an 8-bit integer which identifies a Registration Authority as designated by ISO/IEC JTC1/SC29. Value zero indicates that this information is not available. The value of copyright_number shall be zero when copyright_identifier is equal to zero.

When copyright_flag is set to ‘0’, copyright_identifier has no meaning and shall have the value 0.

original_or_copy -- This is a one bit flag. It is set to ‘1’ to indicate that the material is an original, and set to ‘0’ to indicate that it is a copy.

reserved -- This is a 7-bit integer, reserved for future extension. It shall have the value zero.

copyright_number_1 -- This is a 20-bit integer, representing bits 44 to 63 of copyright_number.

copyright_number_2 -- This is a 22-bit integer, representing bits 22 to 43 of copyright_number.

copyright_number_3 -- This is a 22-bit integer, representing bits 0 to 21 of copyright_number.

copyright_number -- This is a 64-bit integer, derived from copyright_number_1, copyright_number_2, and copyright_number_3 as follows:

copyright_number = (copyright_number_1 << 44) + (copyright_number_2 << 22) + copyright_number_3.

The meaning of copyright_number is defined only when copyright_flag is set to ‘1’. In this case, the value of copyright_number uniquely identifies the copyrighted work marked by the copyright extension and is provided by the Registration Authority identified by copyright_identifier. The value 0 for copyright_number indicates that the identification number of the copyrighted work is not available.

When copyright_flag is set to ‘0’, copyright_number has no meaning and shall have the value 0.
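
The assembly of copyright_number from its three parts can be sketched in C as follows. The function name is hypothetical; the masks simply restrict each argument to its stated width (20, 22 and 22 bits).

```c
#include <stdint.h>

/* Hypothetical helper: combines the three coded fields into the 64-bit
 * copyright_number. Each part is widened to 64 bits before shifting,
 * because shifting a 32-bit value by 44 is undefined in C. */
uint64_t assemble_copyright_number(uint32_t copyright_number_1,
                                   uint32_t copyright_number_2,
                                   uint32_t copyright_number_3)
{
    uint64_t n1 = copyright_number_1 & 0xFFFFFu;   /* bits 44 to 63 */
    uint64_t n2 = copyright_number_2 & 0x3FFFFFu;  /* bits 22 to 43 */
    uint64_t n3 = copyright_number_3 & 0x3FFFFFu;  /* bits 0 to 21  */

    return (n1 << 44) + (n2 << 22) + n3;
}
```

Because the three fields occupy disjoint bit ranges, the additions never carry and are equivalent to bitwise OR.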

6.3.16 Slice

slice_start_code -- The slice_start_code is a string of 32 bits. The first 24 bits have the value 000001 in hexadecimal and the last 8 bits are the slice_vertical_position having a value in the range 01 through AF hexadecimal inclusive.

slice_vertical_position -- This is given by the last eight bits of the slice_start_code. It is an unsigned integer giving the vertical position in macroblock units of the first macroblock in the slice.

In large pictures (when the vertical size of the frame is greater than 2800 lines) the slice vertical position is extended by the slice_vertical_position_extension.

The macroblock row may be calculated as follows:

if ( vertical_size > 2800 )
    mb_row = (slice_vertical_position_extension << 7) + slice_vertical_position - 1;
else
    mb_row = slice_vertical_position - 1;

The slice_vertical_position of the first row of macroblocks is one. Some slices may have the same slice_vertical_position, since slices may start and finish anywhere. The maximum value of slice_vertical_position is 175 unless slice_vertical_position_extension is present in which case slice_vertical_position shall be in the range [1:128].
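
The recognition of a slice start code and the macroblock row calculation can be sketched in C as below. The function names are hypothetical, and the 32-bit start code is assumed to have been read into an unsigned integer with the first bit of the bitstream as the most significant bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the 32-bit code is a slice_start_code,
 * i.e. prefix 000001 (hex) followed by a value in 01..AF (hex). */
bool is_slice_start_code(uint32_t code)
{
    uint32_t prefix = code >> 8;
    uint32_t slice_vertical_position = code & 0xFFu;

    return prefix == 0x000001u &&
           slice_vertical_position >= 0x01u &&
           slice_vertical_position <= 0xAFu;
}

/* Hypothetical helper: macroblock row of the first macroblock in the
 * slice. The extension is only used for pictures taller than 2800 lines. */
int mb_row_from_slice(int vertical_size,
                      unsigned slice_vertical_position,
                      unsigned slice_vertical_position_extension)
{
    if (vertical_size > 2800)
        return (int)((slice_vertical_position_extension << 7)
                     + slice_vertical_position) - 1;

    return (int)slice_vertical_position - 1;
}
```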

priority_breakpoint -- This is a 7-bit integer that indicates the point in the syntax where the bitstream shall be partitioned. The allowed values and their semantic interpretation are given in Table 7-30. priority_breakpoint shall take the value zero in partition 1.

quantiser_scale_code -- A 5 bit unsigned integer in the range 1 to 31. The decoder shall use this value until another quantiser_scale_code is encountered either in slice() or macroblock(). The value zero is forbidden.

intra_slice_flag -- This flag shall be set to ‘1’ to indicate the presence of intra_slice and reserved_bits in the bitstream.

intra_slice -- This flag shall be set to ‘0’ if any of the macroblocks in the slice are non-intra macroblocks. If all of the macroblocks are intra macroblocks then intra_slice may be set to ‘1’. intra_slice may be omitted from the bitstream (by setting intra_slice_flag to ‘0’) in which case it shall be assumed to have the value zero.

intra_slice is not used by the decoding process. intra_slice is intended to aid a DSM application in performing FF/FR (see D.12).

reserved_bits -- This is a 7 bit integer. It shall have the value zero; other values are reserved.

extra_bit_slice -- This flag indicates the presence of the following extra information. If extra_bit_slice is set to ‘1’, extra_information_slice will follow it. If it is set to ‘0’, there are no data following it. extra_bit_slice shall be set to ‘0’; the value ‘1’ is reserved for possible future extensions defined by ITU-T|ISO/IEC.

extra_information_slice -- Reserved. A decoder conforming to this specification that encounters extra_information_slice in a bitstream shall ignore it (i.e. remove from the bitstream and discard). A bitstream conforming to this specification shall not contain this syntax element.

6.3.17 Macroblock

NOTE - “macroblock_stuffing”, which is supported in ISO/IEC 11172-2, shall not be used in a bitstream defined by this specification.

macroblock_escape -- The macroblock_escape is a fixed bit-string ‘0000 0001 000’ which is used when the difference between macroblock_address and previous_macroblock_address is greater than 33. It causes the value of macroblock_address_increment to be 33 greater than the value that will be decoded by subsequent macroblock_escape and the macroblock_address_increment codewords.

For example, if there are two macroblock_escape codewords preceding the macroblock_address_increment, then 66 is added to the value indicated by macroblock_address_increment.

macroblock_address_increment -- This is a variable length coded integer coded as per Annex B Table B-1 which indicates the difference between macroblock_address and previous_macroblock_address. The maximum value of macroblock_address_increment is 33. Values greater than this can be encoded using the macroblock_escape codeword.
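
The combined effect of the escape codewords and the variable length code can be sketched in C as follows. The function and parameter names are hypothetical; vlc_increment stands for the value (1 to 33) decoded from Table B-1.

```c
/* Hypothetical helper: total address increment after escape_count
 * macroblock_escape codewords followed by one Table B-1 codeword.
 * Each escape adds 33 to the decoded value. */
int total_address_increment(int escape_count, int vlc_increment)
{
    return 33 * escape_count + vlc_increment;
}
```

With two escape codewords and a decoded value of 5, the result is 66 + 5 = 71.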

The macroblock_address is a variable defining the absolute position of the current macroblock. The macroblock_address of the top-left macroblock is zero.

The previous_macroblock_address is a variable defining the absolute position of the last non-skipped macroblock (see 7.6.6 for the definition of skipped macroblocks) except at the start of a slice. At the start of a slice previous_macroblock_address is reset as follows:

previous_macroblock_address = (mb_row * mb_width) -1

The horizontal spatial position in macroblock units of a macroblock in the picture (mb_column) can be computed from the macroblock_address as follows:

mb_column = macroblock_address % mb_width

where mb_width is the number of macroblocks in one row of the picture.
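
The address bookkeeping described above can be sketched in C as below. The function names are hypothetical. Signed integers are used because previous_macroblock_address is -1 at the start of a slice in the first macroblock row.

```c
/* Hypothetical helper: value of previous_macroblock_address at the
 * start of a slice whose first macroblock is in row mb_row. */
int previous_address_at_slice_start(int mb_row, int mb_width)
{
    return mb_row * mb_width - 1;
}

/* Hypothetical helper: macroblock_address of the current macroblock,
 * given the previous address and the decoded address increment. */
int next_macroblock_address(int previous_macroblock_address, int increment)
{
    return previous_macroblock_address + increment;
}

/* Hypothetical helper: horizontal position in macroblock units. */
int mb_column_of(int macroblock_address, int mb_width)
{
    return macroblock_address % mb_width;
}
```

For a picture 45 macroblocks wide, a slice starting in row 2 resets the previous address to 89; an increment of 4 then yields address 93, which is column 3.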

Except at the start of a slice, if the value of macroblock_address recovered from macroblock_address_increment and the macroblock_escape codes (if any) differs from the previous_macroblock_address by more than one then some macroblocks have been skipped. It is a requirement that:

• There shall be no skipped macroblocks in I-pictures except when

either picture_spatial_scalable_extension() follows the picture_header() of the current picture.

or sequence_scalable_extension() is present in the bitstream and scalable_mode = “SNR scalability”.

• The first and last macroblock of a slice shall not be skipped.

• In a B-picture there shall be no skipped macroblocks immediately following a macroblock in which macroblock_intra is one.

6.3.17.1 Macroblock modes

macroblock_type -- Variable length coded indicator which indicates the method of coding and content of the macroblock according to the Tables B-2 through B-8, selected by picture_coding_type and scalable_mode.

macroblock_quant -- Derived from macroblock_type according to the Tables B-2 through B-8. This is set to 1 to indicate that quantiser_scale_code is present in the bitstream.

macroblock_motion_forward -- Derived from macroblock_type according to the Tables B-2 through B-8. This flag affects the bitstream syntax and is used by the decoding process.

macroblock_motion_backward -- Derived from macroblock_type according to the Tables B-2 through B-8. This flag affects the bitstream syntax and is used by the decoding process.

macroblock_pattern -- Derived from macroblock_type according to the Tables B-2 through B-8. This is set to 1 to indicate that coded_block_pattern() is present in the bitstream.

macroblock_intra -- Derived from macroblock_type according to the Tables B-2 through B-8. This flag affects the bitstream syntax and is used by the decoding process.

spatial_temporal_weight_code_flag -- Derived from the macroblock_type. This indicates whether the spatial_temporal_weight_code is present in the bitstream.

When spatial_temporal_weight_code_flag is ‘0’ (indicating that spatial_temporal_weight_code is not present in the bitstream) the spatial_temporal_weight_class is derived from Tables B-5 to B-7. When spatial_temporal_weight_code_flag is ‘1’ spatial_temporal_weight_class is derived from Table 7-20.

spatial_temporal_weight_code -- This is a two bit code which indicates, in the case of spatial scalability, how the spatial and temporal predictions shall be combined to form the prediction for the macroblock. A full description of how to form the spatial scalable prediction is given in 7.7.

frame_motion_type -- This is a two bit code indicating the macroblock prediction type, defined in Table 6-17.

If frame_pred_frame_dct is equal to 1 then frame_motion_type is omitted from the bitstream. In this case motion vector decoding and prediction formation shall be performed as if frame_motion_type had indicated “Frame-based prediction”.

In the case of intra macroblocks (in a frame picture) when concealment_motion_vectors is equal to 1 frame_motion_type is not present in the bitstream. In this case motion vector decoding and update of the motion vector predictors shall be performed as if frame_motion_type had indicated “Frame-based”. See 7.6.3.9.

Table 6-17 Meaning of frame_motion_type

code	spatial_temporal_weight_class	prediction type	motion_vector_count	mv_format	dmv
00		reserved
01	0,1	Field-based	2	field	0
11	0,2,3	Dual-Prime	1	field	1

field_motion_type -- This is a two bit code indicating the macroblock prediction type, defined in Table 6-18.

In the case of intra macroblocks (in a field picture) when concealment_motion_vectors is equal to 1 field_motion_type is not present in the bitstream. In this case motion vector decoding and update of the motion vector predictors shall be performed as if field_motion_type had indicated “Field-based”. See 7.6.3.9.

Table 6-18 Meaning of field_motion_type

code	spatial_temporal_weight_class	prediction type	motion_vector_count	mv_format	dmv
00		reserved
01	0,1	Field-based	1	field	0
11	0	Dual-Prime	1	field	1

dct_type -- This is a flag indicating whether the macroblock is frame DCT coded or field DCT coded. If this is set to ‘1’, the macroblock is field DCT coded.

In the case that dct_type is not present in the bitstream then the value of dct_type (used in the remainder of the decoding process) shall be derived as shown in Table 6-19.

Table 6-19. Value of dct_type if dct_type is not in the bitstream.

Condition	dct_type
picture_structure == “field”	unused because there is no frame/field distinction in a field picture.
frame_pred_frame_dct == 1	0 (“frame”)
macroblock is skipped	unused - macroblock is not coded

6.3.17.2 Motion vectors

motion_vector_count is derived from field_motion_type or frame_motion_type as indicated in Table 6-17 and Table 6-18.

mv_format is derived from field_motion_type or frame_motion_type as indicated in the Table 6-17 and Table 6-18. mv_format indicates if the motion vector is a field-motion vector or a frame-motion vector. mv_format is used in the syntax of the motion vectors and in the process of motion vector prediction.

dmv is derived from field_motion_type or frame_motion_type as indicated in Table 6-17 and Table 6-18.

motion_vertical_field_select[r][s] -- This flag indicates which reference field shall be used to form the prediction. If motion_vertical_field_select[r][s] is zero then the top reference field shall be used, if it is one then the bottom reference field shall be used. (See Table 7-7 for the meaning of the indices; r and s.)

6.3.17.3 Motion vector

motion_code[r][s][t] -- This is a variable length code, as defined in Table B-10, which is used in motion vector decoding as described in 7.6.3.1. (See Table 7-7 for the meaning of the indices; r, s and t.)

motion_residual[r][s][t] -- This is an integer which is used in motion vector decoding as described in 7.6.3.1. (See Table 7-7 for the meaning of the indices; r, s and t.) The number of bits in the bitstream for motion_residual[r][s][t], r_size, is derived from f_code[s][t] as follows:

r_size = f_code[s][t] - 1

NOTE - The number of bits for both motion_residual[0][s][t] and motion_residual[1][s][t] is denoted by f_code[s][t].

dmvector[t] -- This is a variable length code, as defined in Table B-11, which is used in motion vector decoding as described in 7.6.3.1. (See Table 7-7 for the meaning of the index; t.)

6.3.17.4 Coded block pattern

coded_block_pattern_420 -- A variable length code that is used to derive the variable cbp according to Table B-9.

coded_block_pattern_1 --

coded_block_pattern_2 -- For 4:2:2 and 4:4:4 data the coded block pattern is extended by the addition of either a two bit or six bit fixed length code, coded_block_pattern_1 or coded_block_pattern_2. Then the pattern_code[i] is derived using the following:

for (i=0; i<12; i++) {
    if (macroblock_intra)
        pattern_code[i] = 1;
    else
        pattern_code[i] = 0;
}
if (macroblock_pattern) {
    for (i=0; i<6; i++)
        if ( cbp & (1<<(5-i)) ) pattern_code[i] = 1;
    if (chroma_format == “4:2:2”)
        for (i=6; i<8; i++)
            if ( coded_block_pattern_1 & (1<<(7-i)) ) pattern_code[i] = 1;
    if (chroma_format == “4:4:4”)
        for (i=6; i<12; i++)
            if ( coded_block_pattern_2 & (1<<(11-i)) ) pattern_code[i] = 1;
}

If pattern_code[i] is equal to 1, i=0 to (block_count-1), then the block number i defined in Figures 6-8, 6-9 and 6-10 is contained in this macroblock.

The number “block_count” which determines the number of blocks in the macroblock is derived from the chrominance format as shown in Table 6-20.

Table 6-20 block_count as a function of chroma_format

chroma_format	block_count
4:2:0	6
4:2:2	8
4:4:4	12
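
The derivation of pattern_code[] and block_count can be sketched in C as below. This is an illustration only: the enumeration and function names are hypothetical, and the 4:4:4 loop is assumed to cover blocks 6 to 11 so that all six bits of coded_block_pattern_2 are used.

```c
#include <stdbool.h>

/* Hypothetical enumeration of the chrominance formats. */
enum chroma_format { CHROMA_420, CHROMA_422, CHROMA_444 };

/* Number of blocks in a macroblock, per Table 6-20. */
int block_count_for(enum chroma_format cf)
{
    switch (cf) {
    case CHROMA_420: return 6;
    case CHROMA_422: return 8;
    case CHROMA_444: return 12;
    }
    return 0;   /* not reached for valid input */
}

/* Fills pattern_code[0..11]. cbp is the value derived from
 * coded_block_pattern_420; cbp_1 and cbp_2 are the two bit and six bit
 * extensions (ignored unless the format calls for them). */
void derive_pattern_code(bool macroblock_intra, bool macroblock_pattern,
                         enum chroma_format cf, unsigned cbp,
                         unsigned cbp_1, unsigned cbp_2,
                         int pattern_code[12])
{
    int i;

    /* Intra macroblocks contain every block; others start empty. */
    for (i = 0; i < 12; i++)
        pattern_code[i] = macroblock_intra ? 1 : 0;

    if (!macroblock_pattern)
        return;

    for (i = 0; i < 6; i++)
        if (cbp & (1u << (5 - i)))
            pattern_code[i] = 1;

    if (cf == CHROMA_422)
        for (i = 6; i < 8; i++)
            if (cbp_1 & (1u << (7 - i)))
                pattern_code[i] = 1;

    if (cf == CHROMA_444)
        for (i = 6; i < 12; i++)
            if (cbp_2 & (1u << (11 - i)))
                pattern_code[i] = 1;
}
```

Only the first block_count entries of pattern_code[] are meaningful for a given chrominance format.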

6.3.18 Block

The semantics of block() are described in clause 7.
