|  |  |
| --- | --- |
| **Joint Collaborative Team on Video Coding (JCT-VC)**  **of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11**  9th Meeting: Geneva, Switzerland, 27 April – 07 May, 2012 | Document: JCTVC-I0425  M24674 |

|  |  |  |  |
| --- | --- | --- | --- |
| *Title:* | **AHG7: A combined study on JCTVC-I0216 and JCTVC-I0107** | | |
| *Status:* | Input Document to JCT-VC | | |
| *Purpose:* | Proposal | | |
| *Author(s) or Contact(s):* | Minhua Zhou, Texas Instruments Inc., USA | Tel: Email: | +1-214-480-3816 [zhou@ti.com](mailto:zhou@ti.com) |
| *Source:* | Texas Instruments Inc. | | |

\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

# Abstract

This contribution reports the results of a combined study on JCTVC-I0216 and JCTVC-I0107 and proposes to combine both solutions for further coding loss reduction. In the proposal, 4x4 inter PUs are permanently disabled (same as JCTVC-I0216), and 8x4 and 4x8 inter PUs are restricted to either unidirectional merge mode (same as JCTVC-I0107) or unidirectional predictive mode (same as JCTVC-I0216 and JCTVC-I0107). The inter prediction direction flag is not signaled for 8x4 and 4x8 inter PUs in B-slices (same as JCTVC-I0216), and the merge mode signaling remains the same as in HM6.0 (same as JCTVC-I0107). The merging candidate list derivation is modified so that bi-predictive merging candidates are converted into uni-predictive candidates for 8x4 and 4x8 PUs. The conversion is performed after the completion of the current HM6.0 merging candidate derivation process to minimize changes to the current design. It is based on the values of the reference picture indices: the prediction direction (list 0 or list 1) whose reference picture index has the smaller value is chosen. Experimental results show that the coding loss of the combined design is reduced to 0.4/0.3/0.6/0.4 (% in RA-Main/RA-HE10/LB-Main/LB-HE10) when compared to 0.4/0.3/0.6/0.4 in JCTVC-I0216 and 0.4/0.3/0.4/0.3 in JCTVC-I0107.

# Introduction

Motion compensation bandwidth is an increasing bottleneck for video coding as video resolutions move to UHD. JCTVC-I0216 and JCTVC-I0107 have proposed similar methods to address this issue. In this contribution, it is proposed to combine the two solutions into one to further reduce the coding loss caused by the memory bandwidth restrictions.

The key parts of JCTVC-I0216 are as follows:

1. 4x4 inter PUs are permanently disabled
2. Merge mode is disabled for 8x4 and 4x8 PUs of B-slices; accordingly, the merge flag and merge index are not signaled for these PUs. The merging candidate derivation process is unchanged.
3. Normal bi-prediction mode is disabled for 8x4 and 4x8 PUs, and the inter prediction direction flag is NOT signaled in the bitstream.

For the same restriction, the key parts of JCTVC-I0107 are as follows:

1. 4x4 inter PUs are disabled
2. For 8x4 and 4x8 PUs, bi-predictive merging candidates are converted to list 0 uni-predictive candidates in the merging candidate list. The conversion is interleaved into the current HM6.0 merging candidate list derivation process. No changes are introduced to the merge mode signaling.
3. Normal bi-prediction mode is disabled for 8x4 and 4x8 PUs, but the inter prediction direction flag is STILL signaled in the bitstream

Both methods have room for improvement. In JCTVC-I0216, disabling merge mode for 8x4 and 4x8 PUs in B-slices leads to additional coding loss. In JCTVC-I0107, there is no need to transmit the inter prediction direction flag for 8x4 and 4x8 PUs of B-slices if they are restricted to uni-prediction mode. Also, it is not desirable to interleave the conversion of bi-predictive merging candidates to uni-predictive ones into every step of the current HM6.0 merging candidate list derivation process.

# Proposed combination of JCTVC-I0216 and JCTVC-I0107

It is proposed to harmonize the two solutions and combine them as follows:

1. 4x4 inter PUs are permanently disabled (same as JCTVC-I0216)
2. For 8x4 and 4x8 PUs of B-slices, bi-predictive merging candidates are converted to uni-predictive candidates (list 0 or list 1) in the merging candidate list. The current HM6.0 merging candidate list derivation process remains unchanged; the conversion is performed after the HM6.0 merging candidate list derivation is completed and is based on the values of the reference picture indices. No changes are introduced to the merge mode signaling. (simplified from JCTVC-I0107)
3. Normal bi-prediction mode is disabled for 8x4 and 4x8 PUs, and the inter prediction direction flag is NOT signaled in the bitstream. (same as JCTVC-I0216)

Fig. 1. Modified merging candidate list derivation process for memory bandwidth restriction of small block size PUs

Fig. 1 illustrates the modified merging candidate list derivation process. After the merging candidate list derivation process currently defined in HM6.0 is completed, it is checked whether the current PU is subject to the restriction. If the current PU is restricted to uni-directional prediction, the inter prediction mode of each merging candidate in the merging candidate list is checked, and the bi-predictive merging candidates are converted into uni-predictive candidates based on the values of their reference picture indices. For a bi-predictive merging candidate in the list, if the value of its list 0 reference picture index is less than or equal to the value of its list 1 reference picture index, the candidate is converted to a list 0 predicted candidate by discarding its motion data (motion vector, reference index) in the list 1 direction, and the inter prediction direction is changed to list 0 prediction for the PU. Otherwise, the candidate is converted to a list 1 predicted candidate by discarding its motion data (motion vector, reference index) in the list 0 direction, and the inter prediction direction is changed to list 1 prediction for the PU. Because the conversion happens at the last stage of the derivation process, it is fully parallelizable across the candidates in the merging candidate list, and the changes to the current HM6.0 merging candidate list derivation process are kept to a minimum.
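The conversion rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not HM6.0 code: the `MergeCandidate` type, its field names, and the helper functions `to_uni_predictive` and `restrict_merge_list` are hypothetical names introduced here, and the decision of whether a PU is restricted is assumed to be made by the caller.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

MotionVector = Tuple[int, int]


@dataclass(frozen=True)
class MergeCandidate:
    """Hypothetical container for the motion data of one merging candidate."""
    pred_flag_l0: bool
    pred_flag_l1: bool
    ref_idx_l0: int = -1  # -1 marks an unused list, as in the CD text
    ref_idx_l1: int = -1
    mv_l0: Optional[MotionVector] = None
    mv_l1: Optional[MotionVector] = None


def to_uni_predictive(cand: MergeCandidate) -> MergeCandidate:
    """Convert a bi-predictive candidate; leave uni-predictive ones untouched."""
    if not (cand.pred_flag_l0 and cand.pred_flag_l1):
        return cand
    if cand.ref_idx_l0 <= cand.ref_idx_l1:
        # Keep list 0: discard the list 1 motion vector and reference index.
        return replace(cand, pred_flag_l1=False, ref_idx_l1=-1, mv_l1=None)
    # Otherwise keep list 1: discard the list 0 motion data.
    return replace(cand, pred_flag_l0=False, ref_idx_l0=-1, mv_l0=None)


def restrict_merge_list(cands: List[MergeCandidate],
                        pu_is_restricted: bool) -> List[MergeCandidate]:
    """Post-process a fully derived merging candidate list.

    Each candidate is converted independently of the others, which is why
    this final stage can be run in parallel over the list.
    """
    if not pu_is_restricted:
        return list(cands)
    return [to_uni_predictive(c) for c in cands]
```

Because the conversion is a pure per-candidate mapping applied after the list is built, the sketch leaves the list derivation itself untouched, mirroring the intent of keeping changes to the HM6.0 process minimal.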

# Test Settings and Conditions

The simulations in this document used the HM6.0 software. The simulation platform is LSF, equipped with 64-bit Linux machines with Intel(R) Xeon(R) X5570 CPUs of different frequencies. The common test conditions and reference configurations specified in [1] are followed.

# Experimental results

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
|  | **Random Access Main** | | | **Random Access HE10** | | |
|  | Y | U | V | Y | U | V |
| Class A | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! |
| Class B | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! |
| Class C | 0.4% | 0.4% | 0.4% | 0.3% | 0.3% | 0.3% |
| Class D | 0.5% | 0.4% | 0.5% | 0.5% | 0.5% | 0.3% |
| Class E |  |  |  |  |  |  |
| **Overall** | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! |
|  | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! |
| Class F | 0.1% | 0.2% | 0.2% | #VALUE! | #VALUE! | #VALUE! |
| Enc Time[%] | #VALUE! | | | #VALUE! | | |
| Dec Time[%] | #NUM! | | | #NUM! | | |
|  |  |  |  |  |  |  |
|  | **Low delay B Main** | | | **Low delay B HE10** | | |
|  | Y | U | V | Y | U | V |
| Class A |  |  |  |  |  |  |
| Class B | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! |
| Class C | 0.4% | 0.5% | 0.3% | 0.3% | 0.5% | 0.4% |
| Class D | 0.5% | 0.5% | 1.0% | 0.5% | 0.4% | 0.1% |
| Class E | 0.2% | -0.1% | 0.4% | #VALUE! | #VALUE! | #VALUE! |
| **Overall** | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! |
|  | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! |
| Class F | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! | #VALUE! |
| Enc Time[%] | #VALUE! | | | #VALUE! | | |
| Dec Time[%] | #NUM! | | | #NUM! | | |

Table 1. Experimental results for the proposed combination

# Conclusions

# References

[1] F. Bossen, “Common test conditions and software reference configurations,” JCT-VC Document, JCTVC-G1100, San Jose, CA, USA, February 2012.

[2] B. Bross, W.-J. Han, J.-R. Ohm, G. J. Sullivan, T. Wiegand, “High Efficiency Video Coding (HEVC) Test Model 6 (HM 6) Encoder Description,” JCT-VC Document, JCTVC-G1003, San Jose, CA, USA, February 2012.

[3] T. Hellman, W. Wan, “Reducing HEVC worst-case memory bandwidth by restricting bi-directional 8x4 and 4x8 prediction units,” JCT-VC Document, JCTVC-I0216, 9th Meeting: Geneva, Switzerland, 27 April – 07 May, 2012.

[4] K. Kondo, T. Suzuki, T. Yamamoto, “AHG7: Modification of merge candidate derivation to reduce MC memory bandwidth,” JCT-VC Document, JCTVC-I0107, 9th Meeting: Geneva, Switzerland, 27 April – 07 May, 2012.

# Patent rights declaration(s)

**Texas Instruments, Inc. may have IPR relating to the technology described in this contribution and, conditioned on reciprocity, is prepared to grant licenses under reasonable and non-discriminatory terms as necessary for implementation of the resulting ITU-T Recommendation |ISO/IEC International Standard (per box 2 of the ITU-T/ITU-R/ISO/IEC patent statement and licensing declaration form).**

# CD text

Changes marked as yellow

## Section 7.3.2.1 Sequence parameter set RBSP syntax

|  |  |
| --- | --- |
| **temporal\_id\_nesting\_flag** | u(1) |
| if( log2\_min\_coding\_block\_size\_minus3 = = 0 ) |  |
| **~~inter\_4x4\_enabled\_flag~~** | ~~u(1)~~ |
| **disable\_inter\_4x8\_8x4\_bidir\_flag** | u(1) |
| **num\_short\_term\_ref\_pic\_sets** | ue(v) |

## Section 7.4.2.1- Sequence parameter set semantics

**Replace:**

**~~inter\_4x4\_enabled\_flag~~** ~~specifies whether inter prediction can be applied to blocks having the size of 4 by 4 luma samples~~.

**With:**

**disable\_inter\_4x8\_8x4\_bidir\_flag** specifies whether bi-directional inter prediction can be applied to blocks having the size of 4 by 8 and 8 by 4 luma samples. When equal to 1, bi-directional inter prediction is disabled for these blocks. When not present, the value of this flag is inferred to be 0.

## Section 7.3.7 Prediction unit syntax

|  |  |
| --- | --- |
| } |  |
| } else { /\* MODE\_INTER \*/ |  |
| **merge\_flag[** x0 **][** y0 **]** | ae(v) |
| if( merge\_flag[ x0 ][ y0 ] ) { |  |
| if( MaxNumMergeCand > 1 ) |  |
| **merge\_idx[** x0 **][** y0 **]** | ae(v) |
| } else { |  |
| disable\_bidir = ( log2CbSize = = 3 && disable\_inter\_4x8\_8x4\_bidir\_flag && ( PartMode = = PART\_Nx2N \|\| PartMode = = PART\_2NxN ) ) ? 1 : 0 |  |
| if( slice\_type = = B && !disable\_bidir ) |  |
| **inter\_pred\_flag[** x0 **][** y0 **]** | ae(v) |
| if( inter\_pred\_flag[ x0 ][ y0 ] = = Pred\_LC ) { |  |
| if( num\_ref\_idx\_lc\_active\_minus1 > 0 ) |  |
| **ref\_idx\_lc[** x0 **][** y0 **]** | ae(v) |
| mvd\_coding(mvd\_lc[ x0 ][ y0 ][ 0 ],   mvd\_lc[ x0 ][ y0 ][ 1 ]) |  |
| **mvp\_lc\_flag[ x0 ][ y0 ]** | ae(v) |
| } else { /\* Pred\_L0 or Pred\_BI \*/ |  |
| if( num\_ref\_idx\_l0\_active\_minus1 > 0 ) |  |
| **ref\_idx\_l0**[ x0 ][ y0 ] | ae(v) |
| mvd\_coding(mvd\_l0[ x0 ][ y0 ][ 0 ],   mvd\_l0[ x0 ][ y0 ][ 1 ]) |  |
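The parsing condition for inter\_pred\_flag in the syntax table above can be illustrated with a short Python sketch. The enum `PartMode` and the function names `disable_bidir` and `parses_inter_pred_flag` are hypothetical and introduced only for illustration; the numeric enum values and the string slice types are assumptions of this sketch, and the value inferred for inter\_pred\_flag when it is not parsed is not modeled here.

```python
from enum import Enum


class PartMode(Enum):
    """Illustrative partition modes; the numeric values are an assumption."""
    PART_2Nx2N = 0
    PART_2NxN = 1
    PART_Nx2N = 2
    PART_NxN = 3


def disable_bidir(log2_cb_size: int,
                  disable_inter_4x8_8x4_bidir_flag: int,
                  part_mode: PartMode) -> bool:
    """True when the PU is an 8x4 or 4x8 partition of an 8x8 CU
    and the SPS flag disables bi-prediction for such PUs."""
    return (log2_cb_size == 3
            and bool(disable_inter_4x8_8x4_bidir_flag)
            and part_mode in (PartMode.PART_2NxN, PartMode.PART_Nx2N))


def parses_inter_pred_flag(slice_type: str,
                           log2_cb_size: int,
                           disable_inter_4x8_8x4_bidir_flag: int,
                           part_mode: PartMode) -> bool:
    """True when inter_pred_flag is read from the bitstream for a
    non-merge inter PU, following the condition in the syntax table."""
    return slice_type == "B" and not disable_bidir(
        log2_cb_size, disable_inter_4x8_8x4_bidir_flag, part_mode)
```

For example, under these assumptions a PART\_Nx2N PU in an 8x8 CU of a B-slice does not parse the flag when the SPS flag is 1, whereas the same partition mode in a 16x16 CU still does.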

## Section 8.5.2.1.1 Derivation process for luma motion vectors for merge mode

1. The following assignments are made with N being the candidate at position merge\_idx[ xP][ yP ] in the merging candidate list mergeCandList ( N = mergeCandList[ merge\_idx[ xP][ yP ] ] ) and X being replaced by 0 or 1:

mvLX[ 0 ] = mvLXN[ 0 ] (8‑99)

mvLX[ 1 ] = mvLXN[ 1 ] (8‑100)

refIdxLX =  refIdxLXN (8‑101)

predFlagLX = predFlagLXN (8‑102)

1. For each merging candidate in the list, if log2CbSize is equal to 3, disable\_inter\_4x8\_8x4\_bidir\_flag is equal to 1, PartMode is equal to PART\_Nx2N or PART\_2NxN, and both predFlagL0 and predFlagL1 are equal to 1, the following conversion is made:

If (refIdxL0 ≤ refIdxL1)

predFlagL1 = 0 and refIdxL1 = -1;

Otherwise,

predFlagL0 = 0 and refIdxL0 = -1;