Add conceptual documentation for NetEq.

Many things are omitted in this doc and it can definitely be improved, but I
hope it captures the most important parts.

Bug: webrtc:12568
Change-Id: I13097d633ca19cecc9dd43bdb777b0ca48f151dd
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/215142
Commit-Queue: Jakob Ivarsson <jakobi@webrtc.org>
Reviewed-by: Minyue Li <minyue@webrtc.org>
Reviewed-by: Artem Titov <titovartem@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#33724}

parent a743303211
commit 8181b4f1e0
@ -13,6 +13,7 @@
 * [SCTP](/pc/g3doc/sctp_transport.md)
 * Congestion control and bandwidth estimation
 * Audio
+* [NetEq](/modules/audio_coding/neteq/g3doc/index.md)
 * AudioEngine
 * [ADM](/modules/audio_device/g3doc/audio_device_module.md)
 * [Audio Coding](/modules/audio_coding/g3doc/index.md)
102
modules/audio_coding/neteq/g3doc/index.md
Normal file
@ -0,0 +1,102 @@
<?% config.freshness.reviewed = '2021-04-13' %?>
<?% config.freshness.owner = 'jakobi' %?>

# NetEq

NetEq is the audio jitter buffer and packet loss concealer. The jitter buffer is
adaptive, meaning that the buffering delay is continuously optimized based on
the network conditions. Its main goal is to ensure smooth playout of incoming
audio packets from the network with few audio artifacts (alterations to the
original content of the packets) while keeping the delay as low as possible.

## API

At a high level, the NetEq API has two main functions:
[`InsertPacket`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=198;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72)
and
[`GetAudio`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=219;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72).

### InsertPacket

[`InsertPacket`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=198;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72)
delivers an RTP packet from the network to NetEq, where the following happens:

1. The packet is discarded if it is too late for playout (for example if it was
   reordered). Otherwise it is put into the packet buffer, where it is stored
   until it is time for playout. If the buffer is full, all the existing
   packets are discarded (this should be rare).
2. The interarrival time between packets is analyzed and statistics are
   updated, which are used to derive a new target playout delay. The
   interarrival time is measured in the number of `GetAudio` ‘ticks’, and thus
   clock drift between the sender and receiver can be accounted for.
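
The two `InsertPacket` steps can be illustrated with a toy model. This is a minimal sketch and not NetEq's implementation: the class `ToyPacketBuffer`, its fields, the capacity, and the returned status strings are all hypothetical, RTP timestamp wrap-around is ignored, and recording the interarrival time for every arriving packet (including discarded ones) is a simplification chosen for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyPacketBuffer:
    """Hypothetical toy model of the InsertPacket steps (not NetEq code)."""

    max_packets: int = 3        # assumed capacity, for illustration only
    next_playout_ts: int = 0    # RTP timestamp that playout expects next
    tick: int = 0               # number of simulated GetAudio calls so far
    last_arrival_tick: Optional[int] = None
    packets: Dict[int, bytes] = field(default_factory=dict)
    interarrival_ticks: List[int] = field(default_factory=list)

    def on_get_audio(self) -> None:
        # The receiver's playout clock advances once per GetAudio call.
        self.tick += 1

    def insert_packet(self, timestamp: int, payload: bytes) -> str:
        # Step 2: measure the interarrival time in GetAudio ticks rather
        # than wall-clock time.
        if self.last_arrival_tick is not None:
            self.interarrival_ticks.append(self.tick - self.last_arrival_tick)
        self.last_arrival_tick = self.tick

        # Step 1: discard packets that are too late for playout.
        if timestamp < self.next_playout_ts:
            return "discarded_late"

        # Step 1 (continued): if the buffer is full, drop everything in it.
        flushed = len(self.packets) >= self.max_packets
        if flushed:
            self.packets.clear()
        self.packets[timestamp] = payload
        return "flushed_then_stored" if flushed else "stored"
```

A real target-delay estimator would consume `interarrival_ticks`; how NetEq derives the target delay from these statistics is not covered by this sketch.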

### GetAudio

[`GetAudio`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=219;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72)
pulls 10 ms of audio from NetEq for playout. A much simplified decision logic is
as follows:

1. If there is 10 ms of audio in the sync buffer then return that.
2. If the next packet is available (based on RTP timestamp) in the packet
   buffer then decode it and append the result to the sync buffer.
   1. Compare the current delay estimate (filtered buffer level) with the
      target delay and time stretch (accelerate or decelerate) the contents of
      the sync buffer if the buffer level is too high or too low.
   2. Return 10 ms of audio from the sync buffer.
3. If the last decoded packet was a discontinuous transmission (DTX) packet
   then generate comfort noise.
4. If there is no available packet for decoding due to the next packet having
   not arrived or been lost then generate packet loss concealment by
   extrapolating the remaining audio in the sync buffer or by asking the
   decoder to produce it.
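
The simplified decision logic above can be written down as a single function. This is a sketch under stated assumptions, not NetEq's decision code: `toy_get_audio`, its parameters, the operation names, and the `tolerance_ms` threshold are hypothetical. Audio is tracked only as a duration in milliseconds, packets are assumed to decode to at least 10 ms, and the change in buffer length caused by time stretching is not modeled.

```python
from typing import Optional, Tuple

FRAME_MS = 10  # each GetAudio call returns 10 ms of audio


def toy_get_audio(
    sync_ms: int,
    next_packet_ms: Optional[int],
    buffer_level_ms: float,
    target_delay_ms: float,
    last_packet_was_dtx: bool,
    tolerance_ms: float = 20.0,  # assumed threshold, for illustration only
) -> Tuple[str, int]:
    """Return (operation, remaining sync buffer in ms) for one call."""
    # Step 1: enough decoded audio is already in the sync buffer.
    if sync_ms >= FRAME_MS:
        return "normal", sync_ms - FRAME_MS

    # Step 2: decode the next packet and append it to the sync buffer.
    if next_packet_ms is not None:
        sync_ms += next_packet_ms
        # Step 2.1: compare the filtered buffer level with the target delay.
        if buffer_level_ms > target_delay_ms + tolerance_ms:
            operation = "accelerate"         # too much delay: play faster
        elif buffer_level_ms < target_delay_ms - tolerance_ms:
            operation = "preemptive_expand"  # too little delay: play slower
        else:
            operation = "normal"
        # Step 2.2: return 10 ms from the sync buffer.
        return operation, sync_ms - FRAME_MS

    # Step 3: between talk spurts, generate comfort noise.
    if last_packet_was_dtx:
        return "comfort_noise", sync_ms

    # Step 4: nothing to decode, so conceal the loss.
    return "expand", sync_ms
```

For example, `toy_get_audio(0, 20, 100, 40, False)` decodes a 20 ms packet while the buffer level is well above the target, so it selects acceleration and leaves 10 ms in the sync buffer.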

In summary, the output is the result of one of the following operations:

* Normal: audio decoded from a packet.
* Acceleration: accelerated playout of a decoded packet.
* Preemptive expand: decelerated playout of a decoded packet.
* Expand: packet loss concealment generated by NetEq or the decoder.
* Merge: audio stitched together from packet loss concealment to decoded data
  in case of a loss.
* Comfort noise (CNG): comfort noise generated by NetEq or the decoder between
  talk spurts due to discontinuous transmission of packets (DTX).

## Statistics

There are a number of functions that can be used to query the internal state of
NetEq, statistics about the type of audio output, and latency metrics such as
how long packets have waited in the buffer.

* [`NetworkStatistics`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=273;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72):
  instantaneous values or stats averaged over the duration since the last call
  to this function.
* [`GetLifetimeStatistics`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=280;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72):
  cumulative stats that persist over the lifetime of the class.
* [`GetOperationsAndState`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/neteq/neteq.h;l=284;drc=4461f059d180fe8c2886d422ebd1cb55b5c83e72):
  information about the internal state of NetEq (only intended to be used for
  testing and debugging).
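
The difference between stats averaged since the last call and cumulative lifetime stats can be shown with a small model. This is a hedged sketch only: `ToyStatistics`, its methods, and the field names in the returned dictionaries are hypothetical and do not mirror the real NetEq structs.

```python
class ToyStatistics:
    """Hypothetical model of interval versus lifetime statistics."""

    def __init__(self) -> None:
        self._lifetime_total = 0
        self._lifetime_concealed = 0
        self._interval_total = 0
        self._interval_concealed = 0

    def on_output(self, samples: int, concealed: bool) -> None:
        # Every output frame updates both sets of counters.
        self._lifetime_total += samples
        self._interval_total += samples
        if concealed:
            self._lifetime_concealed += samples
            self._interval_concealed += samples

    def interval_stats(self) -> dict:
        # Averaged over the period since the previous call; reading the
        # stats starts a new interval.
        total = self._interval_total
        rate = self._interval_concealed / total if total else 0.0
        self._interval_total = 0
        self._interval_concealed = 0
        return {"concealment_rate": rate}

    def lifetime_stats(self) -> dict:
        # Cumulative counters that are never reset.
        return {
            "total_samples": self._lifetime_total,
            "concealed_samples": self._lifetime_concealed,
        }
```

One consequence of the reset-on-read design is that two independent callers polling the interval stats would each see only part of the picture, whereas cumulative counters can be polled by any number of callers, each computing its own deltas.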

## Tests and tools

* [`neteq_rtpplay`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/audio_coding/neteq/tools/neteq_rtpplay.cc;drc=cee751abff598fc19506f77de08bea7c61b9dcca):
  Simulate NetEq behavior based on either an RTP dump, a PCAP file or an RTC
  event log. A replacement audio file can also be used instead of the original
  payload. Outputs aggregated statistics and optionally an audio file to
  listen to.
* [`neteq_speed_test`](https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/audio_coding/neteq/test/neteq_speed_test.cc;drc=2ab97f6f8e27b47c0d9beeb8b6ca5387bda9f55c):
  Measure performance of NetEq, used on perf bots.
* Unit tests, including bit exactness tests where an RTP file is used as input
  to NetEq, the output is concatenated, and a checksum is calculated and
  compared against a reference.
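
The idea behind a bit exactness test can be sketched in a few lines. This is an illustration, not the actual test code: the function name `output_checksum` is hypothetical, and SHA-256 is an assumption, since the text above does not say which checksum the unit tests use.

```python
import hashlib
from typing import Iterable


def output_checksum(frames: Iterable[bytes]) -> str:
    """Hash the concatenation of all output frames, in order."""
    digest = hashlib.sha256()  # assumed algorithm, for illustration only
    for frame in frames:
        # Feeding frames one by one is equivalent to hashing their
        # concatenation, without holding the whole output in memory.
        digest.update(frame)
    return digest.hexdigest()
```

A test would then compare the returned string against a stored reference value. Any change to a single output bit, or to the order of the frames, produces a different digest.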

## Other responsibilities

* Dual-tone multi-frequency signaling (DTMF): receive telephone events and
  produce dual tone waveforms.
* Forward error correction (RED or codec inband FEC): split inserted packets
  and prioritize the payloads.
* NACK (negative acknowledgement): keep track of lost packets and generate a
  list of packets to NACK.
* Audio/video sync: NetEq can be instructed to increase the latency in order
  to keep audio and video in sync.
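
To make the DTMF item concrete, a dual tone waveform is the sum of two sinusoids, one from a low-frequency group and one from a high-frequency group. The sketch below uses the standard DTMF keypad frequencies. The function `dtmf_samples`, its parameters, and the floating-point output format are hypothetical, and this is not how NetEq's own tone generator is implemented.

```python
import math
from typing import List

# Standard DTMF keypad frequencies in Hz: (low group, high group).
DTMF_FREQS = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477), "A": (697, 1633),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477), "B": (770, 1633),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477), "C": (852, 1633),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477), "D": (941, 1633),
}


def dtmf_samples(
    key: str,
    duration_ms: int = 10,
    sample_rate_hz: int = 8000,
    amplitude: float = 0.5,
) -> List[float]:
    """Return float samples in [-amplitude, amplitude] for one DTMF key."""
    low, high = DTMF_FREQS[key]
    num_samples = sample_rate_hz * duration_ms // 1000
    samples = []
    for i in range(num_samples):
        t = i / sample_rate_hz
        # Sum the two tones and halve the result so the peak stays
        # within the requested amplitude.
        tone = math.sin(2 * math.pi * low * t) + math.sin(2 * math.pi * high * t)
        samples.append(amplitude * 0.5 * tone)
    return samples
```

With the defaults, `dtmf_samples("5")` yields 80 samples, which is one 10 ms frame at 8 kHz.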