From 1afa161f59d9fe766713d1d21d524998801edcf2 Mon Sep 17 00:00:00 2001
From: Philipp Hancke <phancke@microsoft.com>
Date: Tue, 25 Oct 2022 16:00:39 +0200
Subject: [PATCH] doc: align VLA documentation with code

Clarify that the temporal layer count field is limited to a
single byte, and move the format description from the source
file to the document.

Also includes drive-by editorial fixes.

BUG=webrtc:12000

Change-Id: I33f85e0a81e1dc16ef762171c52a79919080e048
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/279940
Commit-Queue: Philipp Hancke <phancke@microsoft.com>
Reviewed-by: Harald Alvestrand <hta@webrtc.org>
Reviewed-by: Per Kjellander <perkj@webrtc.org>
Cr-Commit-Position: refs/heads/main@{#38523}
---
 .../video-layers-allocation00/README.md       | 28 +++++-----
 .../rtp_video_layers_allocation_extension.cc  | 55 +------------------
 2 files changed, 17 insertions(+), 66 deletions(-)
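
For reviewers, the layout described in the updated README can be sketched
roughly as below. This is a minimal illustration, not the code in
rtp_video_layers_allocation_extension.cc: the names HeaderByte,
ParseHeaderByte, UnpackTemporalLayerCounts and ReadLeb128 are hypothetical,
and the most-significant-bit-first field order is an assumption read off the
diagram.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fields of the first extension byte, laid out as |RID| NS| sl_bm |.
// Assumption: fields are packed most-significant-bit first, as drawn.
struct HeaderByte {
  int rid;                    // RTP stream index this allocation is sent on.
  int num_rtp_streams;        // NS + 1, so 1..4.
  uint8_t shared_sl_bitmask;  // sl_bm; 0 means per-stream bitmasks follow.
};

HeaderByte ParseHeaderByte(uint8_t b) {
  HeaderByte h;
  h.rid = (b >> 6) & 0x3;
  h.num_rtp_streams = ((b >> 4) & 0x3) + 1;
  h.shared_sl_bitmask = static_cast<uint8_t>(b & 0xF);
  return h;
}

// Unpacks the 2-bit "#tl" values from one byte. Each stored value is the
// number of temporal layers minus one; unused low bits are zero padding.
std::vector<int> UnpackTemporalLayerCounts(uint8_t b, int num_spatial_layers) {
  std::vector<int> counts;
  for (int i = 0; i < num_spatial_layers && i < 4; ++i) {
    counts.push_back(((b >> (6 - 2 * i)) & 0x3) + 1);
  }
  return counts;
}

// Reads one unsigned leb128 value (a target bitrate in kbps) starting at
// data[*pos] and advances *pos past it. Returns false if the input ends
// before a byte without the continuation bit (0x80) is seen, or if the
// value would need more than 64 bits.
bool ReadLeb128(const std::vector<uint8_t>& data, size_t* pos, uint64_t* out) {
  uint64_t value = 0;
  for (int shift = 0; shift < 64 && *pos < data.size(); shift += 7) {
    const uint8_t byte = data[(*pos)++];
    value |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {
      *out = value;
      return true;
    }
  }
  return false;
}
```

A parser built on these helpers would also need to handle the special case
of an empty allocation, which the README encodes as a single 0 byte.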

diff --git a/docs/native-code/rtp-hdrext/video-layers-allocation00/README.md b/docs/native-code/rtp-hdrext/video-layers-allocation00/README.md
index f367adab4c..c4454d8ee1 100644
--- a/docs/native-code/rtp-hdrext/video-layers-allocation00/README.md
+++ b/docs/native-code/rtp-hdrext/video-layers-allocation00/README.md
@@ -2,7 +2,7 @@
 
 The goal of this extension is for a video sender to provide information about
 the target bitrate, resolution and frame rate of each scalability layer in order
-to aid a middle box to decide which layer to relay.
+to help a selective forwarding middlebox decide which layer to relay.
 
 **Name:** "Video layers allocation version 0"
 
@@ -18,7 +18,7 @@ layers and a middle box can choose a layer to relay for each receiver.
 
 This extension support temporal layers, multiple spatial layers sent on a single
 rtp stream (SVC), or independent spatial layers sent on multiple rtp streams
-(Simulcast).
+(simulcast).
 
 ## RTP header extension format
 
@@ -32,9 +32,8 @@ rtp stream (SVC), or independent spatial layers sent on multiple rtp streams
 //   up to 2 bytes           |---------------|
 //   when sl_bm == 0         |sl2_bm |sl3_bm |
 //                           +-+-+-+-+-+-+-+-+
-//   Number of temporal      |#tl|#tl|#tl|#tl|
-// layers per spatial layer  :---------------:
-//    up to 4 bytes          |      ...      |
+// Number of temporal layers |#tl|#tl|#tl|#tl|
+// per spatial layer         |   |   |   |   |
 //                           +-+-+-+-+-+-+-+-+
 //  Target bitrate in kpbs   |               |
 //   per temporal layer      :      ...      :
@@ -56,23 +55,24 @@ rtp stream (SVC), or independent spatial layers sent on multiple rtp streams
 
 RID: RTP stream index this allocation is sent on, numbered from 0. 2 bits.
 
-NS: Number of RTP streams - 1. 2 bits, thus allowing up-to 4 RTP streams.
+NS: Number of RTP streams minus one. 2 bits, thus allowing up to 4 RTP streams.
 
 sl_bm: BitMask of the active Spatial Layers when same for all RTP streams or 0
-otherwise. 4 bits thus allows up to 4 spatial layers per RTP streams.
+otherwise. 4 bits, thus allowing up to 4 spatial layers per RTP stream.
 
 slX_bm: BitMask of the active Spatial Layers for RTP stream with index=X.
-byte-aligned. When NS < 2, takes one byte, otherwise uses two bytes.
+When NS < 2, takes one byte, otherwise uses two bytes. Zero-padded to byte
+alignment.
 
 \#tl: 2-bit value of number of temporal layers-1, thus allowing up-to 4 temporal
-layer per spatial layer. One per spatial layer per RTP stream. values are stored
-in (RTP stream id, spatial id) ascending order. zero-padded to byte alignment.
+layers. Values are stored in ascending order of spatial id. Zero-padded to byte
+alignment.
 
-Target bitrate in kbps. Values are stored using leb128 encoding. one value per
-temporal layer. values are stored in (RTP stream id, spatial id, temporal id)
+Target bitrate in kbps. Values are stored using leb128 encoding [1]. One value per
+temporal layer. Values are stored in (RTP stream id, spatial id, temporal id)
 ascending order. All bitrates are total required bitrate to receive the
 corresponding layer, i.e. in simulcast mode they include only corresponding
-spatial layer, in full-svc all lower spatial layers are included. All lower
+spatial layers; in full-svc all lower spatial layers are included. All lower
 temporal layers are also included.
 
 Resolution and framerate. Optional. Presence is inferred from the rtp header
@@ -82,3 +82,5 @@ id, spatial id) ascending order.
 
 An empty layer allocation (i.e nothing sent on ssrc) is encoded as
 special case with a single 0 byte.
+
+[1] https://aomediacodec.github.io/av1-spec/#leb128
diff --git a/modules/rtp_rtcp/source/rtp_video_layers_allocation_extension.cc b/modules/rtp_rtcp/source/rtp_video_layers_allocation_extension.cc
index 6816a6277f..5172ed4ce7 100644
--- a/modules/rtp_rtcp/source/rtp_video_layers_allocation_extension.cc
+++ b/modules/rtp_rtcp/source/rtp_video_layers_allocation_extension.cc
@@ -150,59 +150,8 @@ SpatialLayersBitmasks SpatialLayersBitmasksPerRtpStream(
 
 }  // namespace
 
-//                           +-+-+-+-+-+-+-+-+
-//                           |RID| NS| sl_bm |
-//                           +-+-+-+-+-+-+-+-+
-// Spatial layer bitmask     |sl0_bm |sl1_bm |
-//   up to 2 bytes           |---------------|
-//   when sl_bm == 0         |sl2_bm |sl3_bm |
-//                           +-+-+-+-+-+-+-+-+
-//   Number of temporal      |#tl|#tl|#tl|#tl|
-// layers per spatial layer  :---------------:
-//    up to 4 bytes          |      ...      |
-//                           +-+-+-+-+-+-+-+-+
-//  Target bitrate in kpbs   |               |
-//   per temporal layer      :      ...      :
-//    leb128 encoded         |               |
-//                           +-+-+-+-+-+-+-+-+
-// Resolution and framerate  |               |
-// 5 bytes per spatial layer + width-1 for   +
-//      (optional)           | rid=0, sid=0  |
-//                           +---------------+
-//                           |               |
-//                           + height-1 for  +
-//                           | rid=0, sid=0  |
-//                           +---------------+
-//                           | max framerate |
-//                           +-+-+-+-+-+-+-+-+
-//                           :      ...      :
-//                           +-+-+-+-+-+-+-+-+
-//
-// RID: RTP stream index this allocation is sent on, numbered from 0. 2 bits.
-// NS: Number of RTP streams - 1. 2 bits, thus allowing up-to 4 RTP streams.
-// sl_bm: BitMask of the active Spatial Layers when same for all RTP streams or
-//     0 otherwise. 4 bits thus allows up to 4 spatial layers per RTP streams.
-// slX_bm: BitMask of the active Spatial Layers for RTP stream with index=X.
-//     byte-aligned. When NS < 2, takes ones byte, otherwise uses two bytes.
-// #tl: 2-bit value of number of temporal layers-1, thus allowing up-to 4
-//     temporal layer per spatial layer. One per spatial layer per RTP stream.
-//     values are stored in (RTP stream id, spatial id) ascending order.
-//     zero-padded to byte alignment.
-// Target bitrate in kbps. Values are stored using leb128 encoding.
-//     one value per temporal layer.  values are stored in
-//     (RTP stream id, spatial id, temporal id) ascending order.
-//     All bitrates are total required bitrate to receive the corresponding
-//     layer, i.e. in simulcast mode they include only corresponding spatial
-//     layer, in full-svc all lower spatial layers are included. All lower
-//     temporal layers are also included.
-// Resolution and framerate.
-//     Optional. Presense is infered from the rtp header extension size.
-//     Encoded (width - 1), 16-bit, (height - 1), 16-bit,  max frame rate 8-bit
-//     per spatial layer per RTP stream.
-//     Values are stored in (RTP stream id, spatial id) ascending order.
-//
-// An empty layer allocation (i.e nothing sent on ssrc) is encoded as
-// special case with a single 0 byte.
+// See /docs/native-code/rtp-hdrext/video-layers-allocation00/README.md
+// for the description of the format.
 
 bool RtpVideoLayersAllocationExtension::Write(
     rtc::ArrayView<uint8_t> data,