This clause specifies the additional decoding process required for the spatial scalable extensions.
Both the lower layer and the enhancement layer shall use the “restricted slice structure” (no gaps between slices).
Figure 7-13 is a diagram of the video decoding process with spatial scalability. The diagram is simplified for clarity.
Figure 7-13. Simplified motion compensation process for spatial scalability
7.7.1 Higher syntactic structures
In general the base layer of a spatial scalable hierarchy can conform to any coding standard, including Recommendation ITU-T H.261, ISO/IEC 11172-2 and this specification. Note, however, that within this specification the decodability of a spatial scalable hierarchy is only considered in the case that the base layer conforms to this specification or to ISO/IEC 11172-2.
Due to the "loose coupling" of the layers, only one syntactic restriction is needed in the enhancement layer when both the lower layer and the enhancement layer are interlaced: picture_structure shall take the same value as in the reference frame used for prediction from the lower layer. See 7.7.3.1 for how to identify this reference frame.
7.7.2 Prediction in the enhancement layer
A motion compensated temporal prediction is made from reference frames in the enhancement layer as described in 7.6. In addition, a spatial prediction is formed from the lower layer decoded frame (dlower[y][x]), as described in 7.7.3. These predictions are selected individually or combined to form the actual prediction.
In general up to four separate predictions are formed for each macroblock which are combined together to form the final prediction macroblock p[y][x].
In the case that a macroblock is not coded, either because the entire macroblock is skipped or because the specific macroblock is not coded, there is no coefficient data. In this case f[y][x] is zero and the decoded samples are simply the prediction, p[y][x].
Forming the spatial prediction requires identification of the correct reference frame and definition of the spatial resampling process; these are specified in the following clauses.
The resampling process is defined for a whole frame. However, for decoding of a macroblock, only the 16x16 region of the upsampled frame that corresponds to the position of that macroblock is needed.
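As an informative sketch (the function name and the list-of-lists frame representation are illustrative, not part of this specification), extracting the co-located 16x16 window from the upsampled frame amounts to:

```python
def colocated_block(upsampled, mb_row, mb_col, size=16):
    # Extract the size x size region of the upsampled lower layer frame
    # that is co-located with macroblock (mb_row, mb_col).
    y0, x0 = mb_row * size, mb_col * size
    return [row[x0:x0 + size] for row in upsampled[y0:y0 + size]]
```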
7.7.3.1 Selection of reference frame
The spatial prediction is made from the reconstructed frame of the lower layer referenced by the lower_layer_temporal_reference. However, if the lower and enhancement layer bitstreams are embedded in a Recommendation ITU-T H.222.0 | ISO/IEC 13818-1 (Systems) multiplex, this information is overridden by the timing information given by the decoding time stamps (DTS) in the PES headers.
NOTE - If group_of_pictures_header() occurs often in the lower layer bitstream then the temporal reference in the lower layer may be ambiguous (because temporal_reference is reset after a group_of_pictures_header()).
The reconstructed picture from which the spatial prediction is made shall be one of the following:
• The coincident or most recently decoded lower layer picture
• The coincident or most recently decoded lower layer I-picture or P-picture
• The second most recently decoded lower layer I-picture or P-picture, provided that the lower layer does not have low_delay set to '1'.
NOTE - Spatial scalability will only work efficiently when predictions are formed from frames in the lower layer which are coincident (or very close) in display time with the predicted frame in the enhancement layer.
7.7.3.2 Resampling process
The spatial prediction is made by resampling the lower layer reconstructed frame to the same sample grid as the enhancement layer. This grid is defined in terms of frame coordinates, even if a lower-layer interlaced frame was actually coded with a pair of field pictures.
This resampling process is illustrated in Figure 7-14.
Figure 7-14. Formation of the “spatial” prediction by interpolation of the lower layer picture
Spatial predictions shall only be made for macroblocks in the enhancement layer that lie wholly within the upsampled lower layer reconstructed frame.
The upsampling process depends on whether the lower layer reconstructed frame is interlaced or progressive, as indicated by lower_layer_progressive_frame, and on whether the enhancement layer frame is interlaced or progressive, as indicated by progressive_frame.
When lower_layer_progressive_frame is ‘1’, the lower layer reconstructed frame (renamed to prog_pic) is resampled vertically as described in 7.7.3.4. The resulting frame is considered to be progressive if progressive_frame is ‘1’ and interlaced if progressive_frame is ‘0’. The resulting frame is resampled horizontally as described in 7.7.3.6. lower_layer_deinterlaced_field_select shall have the value ‘1’.
When lower_layer_progressive_frame is ‘0’ and progressive_frame is ‘0’, each lower layer reconstructed field is deinterlaced as described in 7.7.3.4, to produce a progressive field (prog_pic). This field is resampled vertically as described in 7.7.3.5. The resulting field is resampled horizontally as described in 7.7.3.6. Finally the resulting field is subsampled to produce an interlaced field. lower_layer_deinterlaced_field_select shall have the value ‘1’.
When lower_layer_progressive_frame is ‘0’ and progressive_frame is ‘1’, each lower layer reconstructed field is deinterlaced as described in 7.7.3.4, to produce a progressive field (prog_pic). Only one of these fields is required. When lower_layer_deinterlaced_field_select is ‘0’ the top field is used, otherwise the bottom field is used. The one that is used is resampled vertically as described in 7.7.3.5. The resulting frame is resampled horizontally as described in 7.7.3.6.
For interlaced frames, if the current frame (and implicitly the lower layer frame) is encoded as field pictures, the deinterlacing process described in 7.7.3.4 is performed within the field.
lower_layer_vertical_offset and lower_layer_horizontal_offset, defining the position of the lower layer frame within the current frame, shall be taken into account in the resampling definitions in 7.7.3.5 and 7.7.3.6 respectively. The lower layer offsets are limited to even values when the chrominance in the enhancement layer is subsampled in that dimension in order to align the chrominance samples between the two layers.
The upsampling process is summarised in Table 7-15.
Table 7-15 Upsampling process
| lower_layer_deinterlaced_field_select | lower_layer_progressive_frame | progressive_frame | Apply deinterlace process | Entity used for prediction |
|---|---|---|---|---|
| 0 | 0 | 1 | yes | top field |
| 1 | 0 | 1 | yes | bottom field |
| 1 | 0 | 0 | yes | both fields |
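The three cases of 7.7.3.2 and Table 7-15 can be summarised in a small dispatch function. This is an informative helper, not normative text; the function name and return values are illustrative. It reports whether the deinterlace process of 7.7.3.4 applies and which entity is used for prediction:

```python
def upsampling_mode(lower_layer_progressive_frame, progressive_frame,
                    lower_layer_deinterlaced_field_select):
    # Progressive lower layer: no deinterlacing, the whole frame is resampled.
    if lower_layer_progressive_frame == 1:
        return ("no deinterlace", "frame")
    # Interlaced lower layer, interlaced enhancement: deinterlace both fields.
    if progressive_frame == 0:
        return ("deinterlace", "both fields")
    # Interlaced lower layer, progressive enhancement: one deinterlaced field,
    # selected by lower_layer_deinterlaced_field_select (0 = top, 1 = bottom).
    return ("deinterlace",
            "top field" if lower_layer_deinterlaced_field_select == 0
            else "bottom field")
```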
7.7.3.3 Colour component processing
Due to the different sampling grids of luminance and chrominance components, some variables used in 7.7.3.4 to 7.7.3.6 take different values for luminance and chrominance resampling. Furthermore it is permissible for the chrominance formats in the lower layer and the enhancement layer to be different from one another.
Table 7-16 defines the values of the variables used in 7.7.3.4 to 7.7.3.6.
Table 7-16 Local variables used in 7.7.3.4 to 7.7.3.6
| variable | value for luminance processing | value for chrominance processing |
|---|---|---|
| ll_h_size | lower_layer_prediction_horizontal_size | lower_layer_prediction_horizontal_size / chroma_ratio_horizontal[lower] |
| ll_v_size | lower_layer_prediction_vertical_size | lower_layer_prediction_vertical_size / chroma_ratio_vertical[lower] |
| v_subs_n | vertical_subsampling_factor_n | vertical_subsampling_factor_n * format_ratio_vertical |
Tables 7-17 and 7-18 give additional definitions.
Table 7-17 chrominance subsampling ratios for layer = {lower, enhance}
| chrominance format | chroma_ratio_horizontal[layer] | chroma_ratio_vertical[layer] |
|---|---|---|
| 4:2:0 | 2 | 2 |
| 4:2:2 | 2 | 1 |
| 4:4:4 | 1 | 1 |
Table 7-18 chrominance format ratios
| chrominance format lower layer | chrominance format enhancement layer | format_ratio_horizontal | format_ratio_vertical |
|---|---|---|---|
| 4:2:0 | 4:2:0 | 1 | 1 |
| 4:2:0 | 4:2:2 | 1 | 2 |
| 4:4:4 | 4:4:4 | 1 | 1 |
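The chrominance variables of Tables 7-16 to 7-18 can be computed mechanically. The sketch below is informative; the dictionary and function names are illustrative, and deriving format_ratio_vertical as the quotient of the two layers' chroma ratios is an assumption that is consistent with the rows listed in Table 7-18:

```python
# Per Table 7-17: (chroma_ratio_horizontal, chroma_ratio_vertical) per format.
CHROMA_RATIO = {
    "4:2:0": (2, 2),
    "4:2:2": (2, 1),
    "4:4:4": (1, 1),
}

def chroma_sizes(ll_h_size, ll_v_size, lower_format):
    # ll_h_size / ll_v_size for chrominance processing (Table 7-16).
    rh, rv = CHROMA_RATIO[lower_format]
    return ll_h_size // rh, ll_v_size // rv

def format_ratio_vertical(lower_format, enh_format):
    # Assumed derivation, consistent with Table 7-18:
    # e.g. 4:2:0 lower with 4:2:2 enhancement gives 2 // 1 = 2.
    return CHROMA_RATIO[lower_format][1] // CHROMA_RATIO[enh_format][1]
```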
7.7.3.4 Deinterlacing
If deinterlacing is not required (according to Table 7-15), the lower layer reconstructed frame (dlower[y][x]) is renamed prog_pic.
First, each lower layer field is padded with zeros to form a progressive grid with a frame rate equal to the field rate of the lower layer, and with the same number of lines and samples per line as the lower layer frame. Table 7-19 specifies the filters to be applied next. The luminance component is filtered using the relevant two field aperture filter if picture_structure == "Frame-Picture", or else using the one field aperture filter. The chrominance component is filtered using the one field aperture filter.
The temporal and vertical columns of the table indicate the relative temporal and spatial coordinates of the samples to which the filter taps defined in the other columns apply. An intermediate sum is formed by multiplying each sample by the corresponding tap and adding the products together.
Table 7-19. Deinterlacing Filter
| Temporal | Vertical | Filter for first field (two field aperture) | Filter for second field (two field aperture) | Filter, both fields (one field aperture) |
|---|---|---|---|---|
| -1 | -2 | 0 | -1 | 0 |
| -1 | 0 | 0 | 2 | 0 |
| 1 | +2 | -1 | 0 | 0 |
The output of the filter (sum) is then scaled according to the following formula:
prog_pic[y][x] = sum // 16
and saturated to lie in the range [0:255].
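The scaling and saturation step can be sketched as follows. This is informative; the helper names are illustrative, and the sketch assumes that "//" denotes integer division with rounding to the nearest integer (halves rounded away from zero), so negative intermediate sums must be handled explicitly:

```python
def rdiv16(s):
    # '// 16' with rounding to nearest; halves are assumed to round
    # away from zero, so negative sums are mirrored.
    if s >= 0:
        return (s + 8) // 16
    return -((-s + 8) // 16)

def saturate(v):
    # Clip the scaled result to the sample range [0:255].
    return 0 if v < 0 else 255 if v > 255 else v

def scale_output(filter_sum):
    # prog_pic[y][x] = saturate(sum // 16)
    return saturate(rdiv16(filter_sum))
```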
The filter aperture can extend outside the coded picture size. In this case the samples of the lines outside the active picture shall take the value of the closest neighbouring existing sample (below or above) of the same field as defined below.
For all samples [y][x]:

    if (y < 0 && (y & 1) == 1)
        y = 1
    if (y < 0 && (y & 1) == 0)
        y = 0
    if (y >= ll_v_size && ((y - ll_v_size) & 1) == 1)
        y = ll_v_size - 1
    if (y >= ll_v_size && ((y - ll_v_size) & 1) == 0)
        y = ll_v_size - 2
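The same-parity clamping above can be collected into one helper (informative; the function name is illustrative). Lines outside the active picture are replaced by the nearest existing line of the same field parity:

```python
def clamp_line(y, ll_v_size):
    # Clamp a line index that falls outside the active picture to the
    # closest existing line of the same field (same parity).
    if y < 0:
        return y & 1                       # odd -> line 1, even -> line 0
    if y >= ll_v_size:
        # Keep the parity implied by (y - ll_v_size): odd difference maps
        # to the last line, even difference to the second-to-last line.
        return ll_v_size - 1 if (y - ll_v_size) & 1 else ll_v_size - 2
    return y                               # inside the picture: unchanged
```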
7.7.3.5 Vertical resampling
The frame subject to vertical resampling, prog_pic, is resampled to the enhancement layer vertical sampling grid using linear interpolation between the sample sites according to the following formula, where vert_pic is the resulting field:
vert_pic[yh + ll_v_offset][x] = (16 - phase) * prog_pic[y1][x] + phase * prog_pic[y2][x]

where:
    yh + ll_v_offset = output sample coordinate in vert_pic
    y1 = (yh * v_subs_m) / v_subs_n
    y2 = y1 + 1 if y1 < ll_v_size - 1, otherwise y2 = y1
    phase = (16 * ((yh * v_subs_m) % v_subs_n)) // v_subs_n
Samples which lie outside the lower layer reconstructed frame which are required for upsampling are obtained by border extension of the lower layer reconstructed frame.
NOTE - The calculation of phase assumes that the sample position in the enhancement layer at yh = 0 is spatially coincident with the first sample position of the lower layer. It is recognised that this is an approximation for the chrominance component if the chroma_format == 4:2:0.
7.7.3.6 Horizontal resampling
The frame subject to horizontal resampling, vert_pic, is resampled to the enhancement layer horizontal sampling grid using linear interpolation between the sample sites according to the following formula, where hor_pic is the resulting field:
hor_pic[y][xh + ll_h_offset] = ((16 - phase) * vert_pic[y][x1] + phase * vert_pic[y][x2]) // 256
where:
    xh + ll_h_offset = output sample coordinate in hor_pic
    x1 = (xh * h_subs_m) / h_subs_n
    x2 = x1 + 1 if x1 < ll_h_size - 1, otherwise x2 = x1
    phase = (16 * ((xh * h_subs_m) % h_subs_n)) // h_subs_n
Samples which lie outside the lower layer reconstructed frame which are required for upsampling are obtained by border extension of the lower layer reconstructed frame.
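Taken together, 7.7.3.5 and 7.7.3.6 amount to separable linear interpolation with 16x integer weights, where the single division by 256 is deferred to the horizontal stage. The sketch below is informative: it assumes non-negative sample values, takes "/" as truncating integer division and "//" as integer division with rounding, and omits the lower layer offsets and border extension for brevity; the function names are illustrative:

```python
def rdiv(a, b):
    # '//' of this clause: integer division with rounding to nearest.
    # The +b//2 bias assumes non-negative operands.
    return (a + b // 2) // b

def resample_1d(src, n_out, subs_m, subs_n):
    # One-dimensional linear interpolation of 7.7.3.5 / 7.7.3.6,
    # returning unscaled (16x-weighted) values.
    size = len(src)
    out = []
    for i in range(n_out):
        p1 = (i * subs_m) // subs_n                # truncating '/'
        p2 = p1 + 1 if p1 < size - 1 else p1       # clamp at the border
        phase = rdiv(16 * ((i * subs_m) % subs_n), subs_n)
        out.append((16 - phase) * src[p1] + phase * src[p2])
    return out

def spatial_upsample(prog_pic, out_h, out_w, v_m, v_n, h_m, h_n):
    # Vertical pass (unscaled), then horizontal pass followed by '// 256'.
    ll_h = len(prog_pic[0])
    cols = [[row[x] for row in prog_pic] for x in range(ll_h)]
    vert = [resample_1d(col, out_h, v_m, v_n) for col in cols]   # vert[x][y]
    hor = []
    for y in range(out_h):
        line = [vert[x][y] for x in range(ll_h)]
        hor.append([rdiv(v, 256) for v in resample_1d(line, out_w, h_m, h_n)])
    return hor
```

For 2:1 upsampling (subs_m/subs_n = 1/2), phase alternates between 0 and 8, i.e. full-weight copies interleaved with half-sample averages.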
7.7.3.7 Reinterlacing
If reinterlacing is not required, the result of the resampling process, hor_pic, is renamed spat_pred_pic.
If hor_pic was derived from the top field of a lower layer interlaced frame, the even lines of hor_pic are copied to the even lines of spat_pred_pic.
If hor_pic was derived from the bottom field of a lower layer interlaced frame the odd lines of hor_pic are copied to the odd lines of spat_pred_pic.
If hor_pic was derived from a lower layer progressive frame, hor_pic is copied to spat_pred_pic.
The spatial and temporal predictions can be selected or combined to form the actual prediction. The macroblock_type (Tables B-5, B-6 and B-7) and the additional spatial_temporal_weight_code (Table 7-21) indicate, by use of the spatial_temporal_weight_class, whether the prediction is temporal-only, spatial-only, or a weighted combination of temporal and spatial predictions. The classes are defined in the following way:
Class 0 indicates temporal-only prediction
Class 1 indicates that neither field has spatial-only prediction
Class 2 indicates that the top field is spatial-only prediction
Class 3 indicates that the bottom field is spatial-only prediction
Class 4 indicates spatial-only prediction
In intra pictures, if spatial_temporal_weight_class is 0, normal intra coding is performed, otherwise the prediction is spatial-only. In predicted and interpolated pictures, if the spatial_temporal_weight_class is 0, prediction is temporal-only, if the spatial_temporal_weight_class is 4, prediction is spatial-only, otherwise one or a pair of prediction weights is used to combine the spatial and temporal predictions.
The possible spatial_temporal_weights are given in a weight table which is selected in the picture spatial scalable extension. Up to four different weight tables are available for use depending on whether the current and lower layers are interlaced or progressive, as indicated in Table 7-20 (allowed, yet not recommended values given in brackets).
Table 7-20. Intended (allowed) spatial_temporal_weight_code_table_index values
| Lower layer format | Enhancement layer format | spatial_temporal_weight_code_table_index |
|---|---|---|
| Progressive or interlaced | Progressive | 00 |
| Progressive coincident with enhancement layer top fields | Interlaced | 10 (00; 01; 11) |
| Interlaced (picture_structure != Frame-Picture) | Interlaced | 00 |
In macroblock_modes(), a two bit code, spatial_temporal_weight_code, is used to describe the prediction for each field (or frame), as shown in Table 7-21. In this table spatial_temporal_integer_weight identifies those spatial_temporal_weight_codes that can also be used with dual prime prediction (see Tables 7-22 and 7-23).
Table 7-21 spatial_temporal_weights and spatial_temporal_weight_classes for the spatial_temporal_weight_code_table_index and spatial_temporal_weight_codes
| spatial_temporal_weight_code_table_index | spatial_temporal_weight_code | spatial_temporal_weight(s) | spatial_temporal_weight_class | spatial_temporal_integer_weight |
|---|---|---|---|---|
| 00* | - | (0,5) | 1 | 0 |
| 01 | 00 | (0; 1) | 3 | 1 |
|  | 01 | (0; 0,5) | 1 | 0 |
|  | 11 | (0,5; 0,5) | 1 | 0 |
| 10 | 00 | (1; 0) | 2 | 1 |
|  | 01 | (0,5; 0) | 1 | 0 |
|  | 11 | (0,5; 0,5) | 1 | 0 |
| 11 | 00 | (1; 0) | 2 | 1 |
|  | 01 | (1; 0,5) | 2 | 0 |
|  | 11 | (0,5; 0,5) | 1 | 0 |

* For spatial_temporal_weight_code_table_index == 00 no spatial_temporal_weight_code is transmitted.
NOTE - Spatial-only prediction (weight_class == 4) is signalled by different values of macroblock_type (see tables B-5 to B-7).
When the spatial_temporal_weight combination is given in the form (a; b), “a” gives the proportion of the prediction for the top field which is derived from the spatial prediction and “b” gives the proportion of the prediction for the bottom field which is derived from the spatial prediction for that field.
When the spatial_temporal_weight is given in the form (a), “a” gives the proportion of the prediction for the picture which is derived from the spatial prediction for that picture.
The precise method for predictor calculation is as follows:
pel_pred_temp[y][x] is used to denote the temporal prediction (formed within the enhancement layer) as defined for pel_pred[y][x] in 7.6. pel_pred_spat[y][x] is used to denote the prediction formed from the lower layer by extracting the appropriate samples, co-located with the current macroblock position, from spat_pred_pic.
If the spatial_temporal_weight is zero then no prediction is made from the lower layer. Therefore;
pel_pred[y][x] = pel_pred_temp[y][x];
If the spatial_temporal_weight is one then no prediction is made from the enhancement layer. Therefore;
pel_pred[y][x] = pel_pred_spat[y][x];
If the weight is one half then the prediction is the average of the temporal and spatial predictions. Therefore;
pel_pred[y][x] = (pel_pred_temp[y][x] + pel_pred_spat[y][x])//2;
When progressive_frame == 0 chrominance is treated as interlaced, that is, the first weight is used for the top field chrominance lines and the second weight is used for the bottom field chrominance lines.
Addition of prediction and coefficient data is then done as in 7.6.8.
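The three cases above can be sketched in one informative helper (the function name is illustrative, not from this specification). The average implements the spec's rounding division "//2" with a +1 bias, which assumes non-negative sample values:

```python
def combine_prediction(pel_pred_temp, pel_pred_spat, weight):
    # weight 0  -> temporal-only prediction
    # weight 1  -> spatial-only prediction
    # weight 1/2 -> rounded average of temporal and spatial predictions
    if weight == 0:
        return pel_pred_temp
    if weight == 1:
        return pel_pred_spat
    return (pel_pred_temp + pel_pred_spat + 1) // 2   # '//2' with rounding
```

For interlaced chrominance (progressive_frame == 0), this selection would be applied line by line with the top-field weight on top-field lines and the bottom-field weight on bottom-field lines.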
In frame pictures where field prediction is used the possibility exists that one of the fields is predicted using spatial-only prediction. In this case no motion vector is present in the bitstream for the field which has spatial-only prediction. For the case where both fields of a frame have spatial-only prediction, the macroblock_type is such that no motion vectors are present in the bitstream for that macroblock.
The spatial_temporal_weight_class also indicates the number of motion vectors which are present in the coded bitstream and how the motion vector predictors are updated, as defined in Table 7-22 and Table 7-23.
Table 7-22. Updating of motion vector predictors in Field Pictures
| field_motion_type | macroblock_motion_forward | macroblock_motion_backward | macroblock_intra | spatial_temporal_weight_class | Predictors to update |
|---|---|---|---|---|---|
| Field-based‡ | - | - | 1 | 0 | PMV[1][0][1:0] = PMV[0][0][1:0]◊ |
| Field-based | 1 | 1 | 0 | 0 | PMV[1][0][1:0] = PMV[0][0][1:0], PMV[1][1][1:0] = PMV[0][1][1:0] |
| Field-based | 1 | 0 | 0 | 0,1 | PMV[1][0][1:0] = PMV[0][0][1:0] |
| Dual prime | 1 | 0 | 0 | 0 | PMV[1][0][1:0] = PMV[0][0][1:0] |

NOTE - PMV[r][s][1:0] = PMV[u][v][1:0] means that PMV[r][s][1] = PMV[u][v][1] and PMV[r][s][0] = PMV[u][v][0].
◊ If concealment_motion_vectors is zero then PMV[r][s][t] is set to zero (for all r, s and t).
‡ field_motion_type is not present in the bitstream but is assumed to be Field-based.
§ PMV[r][s][t] is set to zero (for all r, s and t). See 7.6.3.4.
Table 7-23. Updating of motion vector predictors in Frame Pictures
| frame_motion_type | macroblock_motion_forward | macroblock_motion_backward | macroblock_intra | spatial_temporal_weight_class | Predictors to update |
|---|---|---|---|---|---|
| Frame-based‡ | - | - | 1 | 0 | PMV[1][0][1:0] = PMV[0][0][1:0]◊ |
| Frame-based | 1 | 1 | 0 | 0 | PMV[1][0][1:0] = PMV[0][0][1:0], PMV[1][1][1:0] = PMV[0][1][1:0] |
| Frame-based | 1 | 0 | 0 | 0,1,2,3 | PMV[1][0][1:0] = PMV[0][0][1:0] |
| Dual prime@ | 1 | 0 | 0 | 0,2,3 | PMV[1][0][1:0] = PMV[0][0][1:0] |

NOTE - PMV[r][s][1:0] = PMV[u][v][1:0] means that PMV[r][s][1] = PMV[u][v][1] and PMV[r][s][0] = PMV[u][v][0].
◊ If concealment_motion_vectors is zero then PMV[r][s][t] is set to zero (for all r, s and t).
‡ frame_motion_type is not present in the bitstream but is assumed to be Frame-based.
§ PMV[r][s][t] is set to zero (for all r, s and t). See 7.6.3.4.
@ Dual prime can not be used when spatial_temporal_integer_weight = '0'.
7.7.5.1 Resetting motion vector predictors
In addition to the cases identified in 7.6.3.4 the motion vector predictors shall be reset in the following cases;
• In a P-picture when a macroblock is purely spatially predicted (spatial_temporal_weight_class == 4)
• In a B-picture when a macroblock is purely spatially predicted (spatial_temporal_weight_class == 4)
NOTE - In the case of spatial_temporal_weight_class == 2 in a frame picture when field-based prediction is used, the transmitted vector is applied to the bottom field (see Table 7-25). However this vector'[0][s][1:0] is predicted from PMV[0][s][1:0]. PMV[1][s][1:0] is then updated as shown in Table 7-23.
Table 7-24. Predictions and motion vectors in field pictures
| field_motion_type | macroblock_motion_forward | macroblock_motion_backward | macroblock_intra | spatial_temporal_weight_class | Motion vector | Prediction formed for |
|---|---|---|---|---|---|---|
| Field-based‡ | - | - | 1 | 0 | vector'[0][0][1:0]◊ | None (motion vector is for concealment) |
| Field-based | 1 | 1 | 0 | 0 | vector'[0][0][1:0] | whole field, forward |
|  |  |  |  |  | vector'[0][1][1:0] | whole field, backward |
| Field-based | 1 | 0 | 0 | 0,1 | vector'[0][0][1:0] | whole field, forward |
| 16x8 MC | 1 | 1 | 0 | 0 | vector'[0][0][1:0] | upper 16x8 field, forward |
|  |  |  |  |  | vector'[0][1][1:0] | upper 16x8 field, backward |
|  |  |  |  |  | vector'[1][0][1:0] | lower 16x8 field, forward |
|  |  |  |  |  | vector'[1][1][1:0] | lower 16x8 field, backward |
| 16x8 MC | 1 | 0 | 0 | 0,1 | vector'[0][0][1:0] | upper 16x8 field, forward |
|  |  |  |  |  | vector'[1][0][1:0] | lower 16x8 field, forward |
| 16x8 MC | 0 | 1 | 0 | 0,1 | vector'[0][1][1:0] | upper 16x8 field, backward |
|  |  |  |  |  | vector'[1][1][1:0] | lower 16x8 field, backward |
| Dual prime | 1 | 0 | 0 | 0 | vector'[0][0][1:0] | whole field, same parity, forward |
|  |  |  |  |  | vector'[2][0][1:0]*† | whole field, opposite parity, forward |

NOTE - Motion vectors are listed in the order they appear in the bitstream.
◊ The motion vector is only present if concealment_motion_vectors is one.
‡ field_motion_type is not present in the bitstream but is assumed to be Field-based.
* These motion vectors are not present in the bitstream.
† These motion vectors are derived from vector'[0][0][1:0] as described in 7.6.3.6.
§ The motion vector is taken to be (0; 0) as explained in 7.6.3.5.
Table 7-25. Predictions and motion vectors in frame pictures
| frame_motion_type | macroblock_motion_forward | macroblock_motion_backward | macroblock_intra | spatial_temporal_weight_class | Motion vector | Prediction formed for |
|---|---|---|---|---|---|---|
| Frame-based‡ | - | - | 1 | 0 | vector'[0][0][1:0]◊ | None (motion vector is for concealment) |
| Frame-based | 1 | 1 | 0 | 0 | vector'[0][0][1:0] | frame, forward |
|  |  |  |  |  | vector'[0][1][1:0] | frame, backward |
| Frame-based | 1 | 0 | 0 | 0,1,2,3 | vector'[0][0][1:0] | frame, forward |
| Field-based | 1 | 1 | 0 | 0 | vector'[0][0][1:0] | top field, forward |
|  |  |  |  |  | vector'[0][1][1:0] | top field, backward |
|  |  |  |  |  | vector'[1][0][1:0] | bottom field, forward |
|  |  |  |  |  | vector'[1][1][1:0] | bottom field, backward |
| Field-based | 1 | 0 | 0 | 0,1 | vector'[0][0][1:0] | top field, forward |
|  |  |  |  |  | vector'[1][0][1:0] | bottom field, forward |
| Field-based | 1 | 0 | 0 | 2 | - | top field, spatial |
|  |  |  |  |  | vector'[0][0][1:0] | bottom field, forward |
| Field-based | 1 | 0 | 0 | 3 | vector'[0][0][1:0] | top field, forward |
|  |  |  |  |  | - | bottom field, spatial |
| Field-based | 0 | 1 | 0 | 0,1 | vector'[0][1][1:0] | top field, backward |
|  |  |  |  |  | vector'[1][1][1:0] | bottom field, backward |
| Field-based | 0 | 1 | 0 | 2 | - | top field, spatial |
|  |  |  |  |  | vector'[0][1][1:0] | bottom field, backward |
| Field-based | 0 | 1 | 0 | 3 | vector'[0][1][1:0] | top field, backward |
|  |  |  |  |  | - | bottom field, spatial |
| Dual prime@ | 1 | 0 | 0 | 0,2,3 | vector'[0][0][1:0] | top field, same parity, forward |
|  |  |  |  |  | vector'[0][0][1:0]* | bottom field, same parity, forward |
|  |  |  |  |  | vector'[2][0][1:0]*† | top field, opposite parity, forward |
|  |  |  |  |  | vector'[3][0][1:0]*† | bottom fld., opposite parity, forward |

NOTE - Motion vectors are listed in the order they appear in the bitstream.
◊ The motion vector is only present if concealment_motion_vectors is one.
‡ frame_motion_type is not present in the bitstream but is assumed to be Frame-based.
* These motion vectors are not present in the bitstream.
† These motion vectors are derived from vector'[0][0][1:0] as described in 7.6.3.6.
§ The motion vector is taken to be (0; 0) as explained in 7.6.3.5.
@ Dual prime can not be used when spatial_temporal_integer_weight = '0'.
7.7.6 Skipped macroblocks
In all cases, a skipped macroblock is the result of a prediction only, and all the DCT coefficients are considered to be zero.
If sequence_scalable_extension is present and scalable_mode == "spatial scalability", the following rules apply in addition to those given in 7.6.6.
In I-pictures, skipped macroblocks are allowed. These are defined as spatial-only predicted.
In P-pictures and B-pictures, the skipped macroblock is temporal-only predicted.
In B-pictures a skipped macroblock shall not follow a spatial-only predicted macroblock.
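The per-picture-type rules above can be summarised in a tiny informative helper (the function name and string values are illustrative, not bitstream syntax):

```python
def skipped_macroblock_prediction(picture_coding_type):
    # In I-pictures a skipped macroblock is spatial-only predicted;
    # in P- and B-pictures it is temporal-only predicted.
    return ("spatial-only" if picture_coding_type == "I"
            else "temporal-only")
```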
In the case of spatial scalability, VBV buffer underflow in the lower layer may cause problems. This is because of possible uncertainty in precisely which frames will be repeated by a particular decoder.