From 8181b4f1e072645e21057bab9f604bd2ab8c2481 Mon Sep 17 00:00:00 2001 From: Jakob Ivarsson Date: Wed, 14 Apr 2021 11:08:52 +0200 Subject: [PATCH] Add conceptual documentation for NetEq. Many things are omitted in this doc and it can definitely be improved, but I hope it captures the most important parts. Bug: webrtc:12568 Change-Id: I13097d633ca19cecc9dd43bdb777b0ca48f151dd Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/215142 Commit-Queue: Jakob Ivarsson Reviewed-by: Minyue Li Reviewed-by: Artem Titov Cr-Commit-Position: refs/heads/master@{#33724} --- g3doc/sitemap.md | 1 + modules/audio_coding/neteq/g3doc/index.md | 102 ++++++++++++++++++++++ 2 files changed, 103 insertions(+) create mode 100644 modules/audio_coding/neteq/g3doc/index.md diff --git a/g3doc/sitemap.md b/g3doc/sitemap.md index 81e3b0dda1..e58bf9d7ce 100644 --- a/g3doc/sitemap.md +++ b/g3doc/sitemap.md @@ -13,6 +13,7 @@ * [SCTP](/pc/g3doc/sctp_transport.md) * Congestion control and bandwidth estimation * Audio + * [NetEq](/modules/audio_coding/neteq/g3doc/index.md) * AudioEngine * [ADM](/modules/audio_device/g3doc/audio_device_module.md) * [Audio Coding](/modules/audio_coding/g3doc/index.md) diff --git a/modules/audio_coding/neteq/g3doc/index.md b/modules/audio_coding/neteq/g3doc/index.md new file mode 100644 index 0000000000..d0624f46ef --- /dev/null +++ b/modules/audio_coding/neteq/g3doc/index.md @@ -0,0 +1,102 @@ + + + +# NetEq + +NetEq is the audio jitter buffer and packet loss concealer. The jitter buffer is +an adaptive jitter buffer, meaning that the buffering delay is continuously +optimized based on the network conditions. Its main goal is to ensure a smooth +playout of incoming audio packets from the network with a low amount of audio +artifacts (alterations to the original content of the packets) while at the same +time keep the delay as low as possible. + +## API + +At a high level, the NetEq API has two main functions: +[`InsertPacket`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=198;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72) +and +[`GetAudio`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=219;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72). + +### InsertPacket + +[`InsertPacket`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=198;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72) +delivers an RTP packet from the network to NetEq where the following happens: + +1. The packet is discarded if it is too late for playout (for example if it was + reordered). Otherwize it is put into the packet buffer where it is stored + until it is time for playout. If the buffer is full, discard all the + existing packets (this should be rare). +2. The interarrival time between packets is analyzed and statistics is updated + which is used to derive a new target playout delay. The interarrival time is + measured in the number of GetAudio ‘ticks’ and thus clock drift between the + sender and receiver can be accounted for. + +### GetAudio + +[`GetAudio`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=219;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72) +pulls 10 ms of audio from NetEq for playout. A much simplified decision logic is +as follows: + +1. If there is 10 ms audio in the sync buffer then return that. +2. If the next packet is available (based on RTP timestamp) in the packet + buffer then decode it and append the result to the sync buffer. + 1. Compare the current delay estimate (filtered buffer level) with the + target delay and time stretch (accelerate or decelerate) the contents of + the sync buffer if the buffer level is too high or too low. + 2. Return 10 ms of audio from the sync buffer. +3. If the last decoded packet was a discontinuous transmission (DTX) packet + then generate comfort noise. +4. If there is no available packet for decoding due to the next packet having + not arrived or been lost then generate packet loss concealment by + extrapolating the remaining audio in the sync buffer or by asking the + decoder to produce it. + +In summary, the output is the result one of the following operations: + +* Normal: audio decoded from a packet. +* Acceleration: accelerated playout of a decoded packet. +* Preemptive expand: decelerated playout of a decoded packet. +* Expand: packet loss concealment generated by NetEq or the decoder. +* Merge: audio stitched together from packet loss concealment to decoded data + in case of a loss. +* Comfort noise (CNG): comfort noise generated by NetEq or the decoder between + talk spurts due to discontinuous transmission of packets (DTX). + +## Statistics + +There are a number of functions that can be used to query the internal state of +NetEq, statistics about the type of audio output and latency metrics such as how +long time packets have waited in the buffer. + +* [`NetworkStatistics`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=273;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72): + instantaneous values or stats averaged over the duration since last call to + this function. +* [`GetLifetimeStatistics`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=280;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72): + cumulative stats that persist over the lifetime of the class. +* [`GetOperationsAndState`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=284;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72): + information about the internal state of NetEq (is only inteded to be used + for testing and debugging). + +## Tests and tools + +* [`neteq_rtpplay`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/audio_coding/neteq/tools/neteq_rtpplay.cc;drc=cee751abff598fc19506f77de08bea7c61b9dcca): + Simulate NetEq behavior based on either an RTP dump, a PCAP file or an RTC + event log. A replacement audio file can also be used instead of the original + payload. Outputs aggregated statistics and optionally an audio file to + listen to. +* [`neteq_speed_test`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/audio_coding/neteq/test/neteq_speed_test.cc;drc=2ab97f6f8e27b47c0d9beeb8b6ca5387bda9f55c): + Measure performance of NetEq, used on perf bots. +* Unit tests including bit exactness tests where RTP file is used as an input + to NetEq, the output is concatenated and a checksum is calculated and + compared against a reference. + +## Other responsibilities + +* Dual-tone multi-frequency signaling (DTMF): receive telephone events and + produce dual tone waveforms. +* Forward error correction (RED or codec inband FEC): split inserted packets + and prioritize the payloads. +* NACK (negative acknowledgement): keep track of lost packets and generate a + list of packets to NACK. +* Audio/video sync: NetEq can be instructed to increase the latency in order + to keep audio and video in sync.