mirror of https://github.com/sipwise/rtpengine.git
# Transcoding

Currently, transcoding is supported for audio streams. The feature can be disabled at compile time
and is enabled by default.

Even though the transcoding feature is available by default, it is not automatically engaged for
normal calls. Normally *rtpengine* leaves codec negotiation up to the clients involved in the call
and does not interfere. In this case, if the clients fail to agree on a codec, the call will fail.

To engage the transcoding feature for a call, instruct *rtpengine* to do so by using
one of the transcoding options in the *ng* control protocol, such as `transcode` or `ptime` (see below).
If a codec is requested via the `transcode` option that was not originally offered, transcoding will
be engaged for that call.
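
As a rough illustration of how such a request might look on the wire, the sketch below builds an *ng* `offer` message that asks for `PCMA` to be added via `transcode` and also sets `ptime`. It assumes the usual *ng* framing (a cookie, a space, then a bencoded dictionary). The helper names, the cookie, the call identifiers and the SDP body are made up for the example, and the integer type of the `ptime` value is an assumption. Nothing is sent over the network.

```python
def bencode(value):
    """Encode int, str, bytes, list and dict values in bencode format."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are emitted in sorted order.
        items = sorted((k.encode("utf-8"), v) for k, v in value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def build_offer(sdp, call_id, from_tag, transcode=(), ptime=None):
    """Hypothetical helper: assemble an offer dictionary with transcoding options."""
    message = {"command": "offer", "call-id": call_id, "from-tag": from_tag, "sdp": sdp}
    if transcode:
        # Codecs listed here that were not originally offered engage transcoding.
        message["codec"] = {"transcode": list(transcode)}
    if ptime is not None:
        # Requesting repacketization also engages transcoding.
        message["ptime"] = ptime
    return message


def ng_packet(cookie, message):
    """Prefix the bencoded dictionary with a cookie, separated by one space."""
    return cookie.encode("ascii") + b" " + bencode(message)


# Placeholder SDP offering only PCMU (payload type 0).
sdp = (
    "v=0\r\no=- 1 1 IN IP4 192.0.2.1\r\ns=-\r\nc=IN IP4 192.0.2.1\r\n"
    "t=0 0\r\nm=audio 4000 RTP/AVP 0\r\n"
)
offer = build_offer(sdp, "call-1", "tag-a", transcode=["PCMA"], ptime=20)
packet = ng_packet("1_abc", offer)
print(packet)
```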

With transcoding active for a call, all unsupported codecs will be removed from the SDP. Transcoding
happens in userspace only, so in-kernel packet forwarding will not be available for transcoded codecs.
However, even if the transcoding feature has been engaged for a call, not all codecs will necessarily
end up being transcoded. Codecs that are supported by both sides will simply be passed through
transparently (unless repacketization is active). In-kernel packet forwarding will still be available
for these codecs.

The following codecs are supported by *rtpengine*:

* G.711 (a-Law and µ-Law)
* G.722
* G.723.1
* G.729
* Speex
* GSM
* iLBC
* Opus
* AMR (narrowband and wideband)
* EVS (if supplied -- see below)

Codec support is dependent on support provided by the `ffmpeg` codec libraries, which may vary from
version to version. Use the `--codecs` command line option to have *rtpengine* print a list of codecs
and their supported status. The list includes some codecs that are not listed above. Some of these
are not actual VoIP codecs (such as MP3), while others lack support for encoding by *ffmpeg* at the
time of writing (such as QCELP or ATRAC). If encoding support for these codecs becomes available
in *ffmpeg*, *rtpengine* will be able to support them.

Audio format conversion including resampling and mono/stereo up/down-mixing happens automatically
as required by the codecs involved. For example, one side could be using stereo Opus at 48 kHz
sampling rate, and the other side could be using mono G.711 at 8 kHz, and *rtpengine* will perform
the necessary conversions.
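
To make the two conversion steps in that example concrete, here is a deliberately simplified sketch of down-mixing interleaved stereo samples to mono and reducing the sampling rate from 48 kHz to 8 kHz. It is not how *rtpengine* performs the conversion. The function names are invented, and block averaging stands in for a proper resampling filter.

```python
def downmix_to_mono(interleaved, channels=2):
    """Average each frame of interleaved integer samples into one mono sample."""
    return [
        sum(interleaved[i:i + channels]) // channels
        for i in range(0, len(interleaved) - channels + 1, channels)
    ]


def decimate(samples, factor):
    """Lower the sampling rate by an integer factor.

    Each block of `factor` samples is averaged, which acts as a crude
    low-pass filter before the rate reduction.
    """
    return [
        sum(samples[i:i + factor]) // factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


# Twelve stereo frames at 48 kHz (left=100, right=300 in every frame).
stereo_48k = [100, 300] * 12
mono_48k = downmix_to_mono(stereo_48k)        # 12 mono samples
mono_8k = decimate(mono_48k, 48000 // 8000)   # 12 / 6 = 2 samples
print(mono_48k, mono_8k)
```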

If repacketization (using the `ptime` option) is requested, the transcoding feature will also be
engaged for the call, even if no additional codecs were requested.

## G.729 support

As *ffmpeg* does not currently provide an encoder for G.729, transcoding support for it is available
via the [bcg729](https://www.linphone.org/technical-corner/bcg729/) library
(mirror on [GitHub](https://github.com/BelledonneCommunications/bcg729)). The build system looks for
the *bcg729* headers in a few locations and uses the library if found. If the library is located
elsewhere, see `daemon/Makefile` to control where the build system is looking for it.

In a Debian build environment, `debian/control` lists a build-time dependency
on *bcg729*. Newer Debian releases (currently *bullseye*, *bookworm*, *sid*)
include *bcg729* as a package so nothing needs to be done there. Older Debian
releases do not currently include a *bcg729* package, but one can be built
locally using these instructions on
[GitHub](https://github.com/ossobv/bcg729-deb). *Sipwise* provides a
pre-packaged version of this as part of our
[C5 CE](https://www.sipwise.com/products/class-5-softswitch-carrier-grade-for-voice-over-ip/)
product which is
[available here](https://deb.sipwise.com/spce/mr6.2.1/pool/main/b/bcg729/).

Alternatively, the build dependency can be dropped, either by removing it from `debian/control`
or by switching to a different Debian build profile. To use the build profile, set the environment
variable with `export DEB_BUILD_PROFILES="pkg.ngcp-rtpengine.nobcg729"` (or use the `-P` flag to the
*dpkg* tools) and then build the *rtpengine* packages.

## DTMF transcoding

*Rtpengine* supports transcoding between RFC 2833/4733 DTMF event packets (`telephone-event` payloads)
and in-band DTMF audio tones. When enabled, *rtpengine* translates DTMF event packets to in-band DTMF
audio by generating DTMF tones and injecting them into the audio stream, and translates in-band DTMF
tones by running the audio stream through a DSP, and generating DTMF event packets when a DTMF tone
is detected.
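
The two directions of this translation can be illustrated with a small, self-contained sketch: generating a dual-tone signal for a digit, detecting a digit in audio with the Goertzel algorithm, and packing the 4-byte `telephone-event` payload defined by RFC 4733. This is not *rtpengine*'s DSP. All function names are invented for the example, and a real detector would also apply level and timing thresholds.

```python
import math
import struct

# (low, high) frequency pair in Hz for each DTMF digit.
DTMF_FREQS = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477), "A": (697, 1633),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477), "B": (770, 1633),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477), "C": (852, 1633),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477), "D": (941, 1633),
}
# RFC 4733 event codes: 0-9, then * = 10, # = 11, A-D = 12-15.
EVENT_CODES = {digit: code for code, digit in enumerate("0123456789*#ABCD")}


def event_payload(digit, duration, end=False, volume=10):
    """Pack event code, E bit + volume, and duration (in timestamp units)."""
    flags = (0x80 if end else 0) | (volume & 0x3F)
    return struct.pack("!BBH", EVENT_CODES[digit], flags, duration)


def tone(digit, n_samples, rate=8000, amplitude=0.4):
    """Generate in-band audio for a digit as the sum of two sine waves."""
    low, high = DTMF_FREQS[digit]
    return [
        amplitude * (math.sin(2 * math.pi * low * n / rate)
                     + math.sin(2 * math.pi * high * n / rate))
        for n in range(n_samples)
    ]


def goertzel_power(samples, freq, rate=8000):
    """Signal power at one frequency, computed with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect(samples, rate=8000):
    """Pick the strongest low and high frequency and map the pair to a digit.

    No silence or threshold check is done, so this always returns a digit.
    """
    lows = sorted({pair[0] for pair in DTMF_FREQS.values()})
    highs = sorted({pair[1] for pair in DTMF_FREQS.values()})
    low = max(lows, key=lambda f: goertzel_power(samples, f, rate))
    high = max(highs, key=lambda f: goertzel_power(samples, f, rate))
    return next(d for d, pair in DTMF_FREQS.items() if pair == (low, high))


audio = tone("5", 400)                        # 50 ms at 8 kHz
digit = detect(audio)                         # in-band tone -> digit
payload = event_payload(digit, 400, end=True) # digit -> event packet payload
print(digit, payload.hex())
```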

Support for DTMF transcoding can be enabled in one of two ways:

* In the forward direction, DTMF transcoding is enabled by adding the codec `telephone-event` to the
  list of codecs offered for transcoding. Specifically, if the incoming SDP body doesn't yet list
  `telephone-event` as a supported codec, adding the option *codec → transcode → telephone-event* would
  enable DTMF transcoding. The receiving RTP client can then accept this codec and start sending DTMF
  event packets, which *rtpengine* would translate into in-band DTMF audio. If the receiving RTP client
  also offers `telephone-event` on its part, *rtpengine* would then detect in-band DTMF audio coming
  from the originating RTP client and translate it to DTMF event packets.

* In the reverse direction, DTMF transcoding is enabled by adding the option `always transcode` to the
  `flags` if the incoming SDP body offers `telephone-event` as a supported codec. If the receiving RTP
  client then rejects the offered `telephone-event` codec, DTMF transcoding is then enabled and is
  performed in the same way as described above.
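
The choice between the two ways can be summarised in a few lines. The sketch below only returns the option dictionary that would be merged into an *ng* offer. The function name is hypothetical, and the dictionary layout follows the *codec → transcode* and `flags` notation used in this section.

```python
def dtmf_transcoding_options(offer_lists_telephone_event):
    """Return the options that would enable DTMF transcoding for an offer."""
    if offer_lists_telephone_event:
        # Reverse direction: the offer already contains telephone-event, so
        # request the flag and let a rejection by the receiver trigger it.
        return {"flags": ["always transcode"]}
    # Forward direction: add telephone-event to the codecs offered for
    # transcoding because the incoming SDP does not list it yet.
    return {"codec": {"transcode": ["telephone-event"]}}


print(dtmf_transcoding_options(False))
print(dtmf_transcoding_options(True))
```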

Enabling DTMF transcoding (in one of the two ways described above) implicitly enables the flag
`always transcode` for the call and forces all of the audio to pass through the transcoding engine.
Therefore, for performance reasons, this should only be done when really necessary.

## T.38

*Rtpengine* can translate between fax endpoints that speak T.38 over UDPTL and fax endpoints that speak
T.30 over regular audio channels. Any audio codec can theoretically be used for T.30 transmissions, but
codecs that are too compressed will make the fax transmission fail. The most commonly used audio codecs
for fax are the G.711 codecs (`PCMU` and `PCMA`), which are the default codecs *rtpengine* will use in
this case if no other codecs are specified.

For further information, see the section on the `T.38` dictionary key below.

## AMR and AMR-WB

As AMR supports dynamically adapting the encoder bitrate, as well as restricting the available bitrates,
there are some slight peculiarities about its usage when transcoding.

When setting the bitrate, for example as `AMR-WB/16000/1/23850` in either the `codec-transcode` or the
`codec-set` options, that bitrate will be used as the highest permitted bitrate for the encoder. If
no `mode-set` parameter is communicated in the SDP, then that is the bitrate that will be used.

If a `mode-set` is present, then the highest bitrate from that mode set which is lower than or equal
to the given bitrate will be used. If only higher bitrates are allowed by the mode set, then the next
higher bitrate will be used.

To produce an SDP that includes the `mode-set` option (when adding AMR to the codec list via
`codec-transcode`), the full format parameter string can be appended to the codec specification, e.g.
`codec-transcode-AMR-WB/16000/1/23850//mode-set=0,1,2,3,4,5;octet-align=1`. In this example, the bitrate
23850 won't actually be used, as the highest permitted mode is 5 (18250 bps) and so that bitrate will
be used.
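
The selection rule above can be written down as a short function. This is a sketch of the rule as described here, not code taken from *rtpengine*. The function name is invented, and the mode tables are the standard AMR and AMR-WB bitrates indexed by mode number.

```python
# Bitrates in bps, indexed by mode number.
AMR_NB_BITRATES = [4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200]
AMR_WB_BITRATES = [6600, 8850, 12650, 14250, 15850, 18250, 19850, 23050, 23850]


def select_bitrate(requested, bitrates, mode_set=None):
    """Pick the encoder bitrate for a requested maximum and an optional mode-set."""
    if mode_set is None:
        # No mode-set communicated in the SDP: use the requested bitrate.
        return requested
    allowed = sorted(bitrates[mode] for mode in mode_set)
    not_higher = [rate for rate in allowed if rate <= requested]
    if not_higher:
        # Highest allowed bitrate that does not exceed the requested one.
        return not_higher[-1]
    # Only higher bitrates are allowed: fall back to the next higher one.
    return allowed[0]


# The example from the text: 23850 requested, modes 0-5 permitted.
print(select_bitrate(23850, AMR_WB_BITRATES, mode_set=[0, 1, 2, 3, 4, 5]))
```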

If a literal `=` cannot be used due to parsing constraints (i.e. being wrongly interpreted as a
key-value pair), it can be escaped by using two dashes instead, e.g.
`codec-transcode-AMR-WB/16000/1/23850//mode-set--0,1,2,3,4,5;octet-align--1`
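
When such flags are generated by a script, the escaping is a plain string substitution. The helper below is hypothetical and simply reproduces the layout of the examples in this section, including the empty field between the bitrate and the format parameters.

```python
def transcode_flag(codec_spec, format_params, escape_equals=False):
    """Join a codec specification and a format parameter string into one flag."""
    # The empty field produced by "//" is kept exactly as in the examples.
    flag = f"codec-transcode-{codec_spec}//{format_params}"
    if escape_equals:
        # Replace every literal "=" with two dashes.
        flag = flag.replace("=", "--")
    return flag


params = "mode-set=0,1,2,3,4,5;octet-align=1"
print(transcode_flag("AMR-WB/16000/1/23850", params))
print(transcode_flag("AMR-WB/16000/1/23850", params, escape_equals=True))
```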

The default (highest) bitrates for AMR and AMR-WB are 6700 and 14250, respectively.

If a Codec Mode Request (CMR) is received from the AMR peer, then *rtpengine* will adhere to the request
and switch encoder bitrate unconditionally, even if it's a higher bitrate than originally desired.

To enable sending CMRs to the AMR peer, the codec-specific option `CMR-interval` is provided. It takes
a number of milliseconds as argument. Throughout each interval, *rtpengine* will track which AMR frame
types were received from the peer, and then based on that will make a decision at the end of the
interval. If a higher bitrate is allowed by the mode set that was not received from the AMR peer at all,
then *rtpengine* will request switching to that bitrate via a CMR. Only the next-highest bitrate mode
that was not received will ever be requested, and a CMR will be sent only once per interval. A full
example that specifies a CMR interval of 500 milliseconds (with `=` escapes):
`codec-transcode-AMR-WB/16000/1/23850//mode-set--0,1,2/CMR-interval--500`
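
The end-of-interval decision can be sketched as follows. This reflects one possible reading of the description above, namely that the mode requested is the lowest permitted mode above the highest mode received during the interval. The function name and the handling of an interval without any received frames are assumptions.

```python
def cmr_to_send(mode_set, received_modes):
    """Return the mode to request via CMR at the end of an interval, or None."""
    if not received_modes:
        # Assumption: without any received frames there is nothing to base a request on.
        return None
    highest_received = max(received_modes)
    # Permitted modes above everything the peer sent during this interval.
    higher_unseen = [mode for mode in sorted(mode_set) if mode > highest_received]
    # Only the next-highest mode is requested, and only one CMR per interval.
    return higher_unseen[0] if higher_unseen else None


print(cmr_to_send([0, 1, 2], {0}))   # peer only sent mode 0
print(cmr_to_send([0, 1, 2], {2}))   # peer already uses the top mode
```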

Similar to the `CMR-interval` option, *rtpengine* can optionally attempt to periodically increase the
outgoing bitrate without being requested to by the peer via a CMR. To enable this, set the option
`mode-change-interval` to the desired interval in milliseconds. If the last CMR from the AMR peer was
received longer ago than this interval, *rtpengine* will increase the bitrate by one step if possible.
Afterwards, the interval starts over.
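
A minimal sketch of this periodic check, under the assumption that "one step" means the next higher mode permitted by the mode set and that the interval restarts at the time of the check. The function name and the millisecond timestamps are illustrative only.

```python
def periodic_mode_change(current_mode, mode_set, now_ms, reference_ms, interval_ms):
    """Return (mode, reference_ms) after one periodic check.

    `reference_ms` is the time of the last CMR from the peer, or of the
    previous step, whichever came later.
    """
    if now_ms - reference_ms <= interval_ms:
        # The interval has not elapsed yet: keep everything unchanged.
        return current_mode, reference_ms
    higher = [mode for mode in sorted(mode_set) if mode > current_mode]
    new_mode = higher[0] if higher else current_mode
    # The interval starts over, whether or not a step was possible.
    return new_mode, now_ms


print(periodic_mode_change(1, [0, 1, 2], now_ms=1500, reference_ms=0, interval_ms=1000))
```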

## EVS

Enhanced Voice Services (EVS) is a patent-encumbered codec for which (at the
time of writing) no implementation exists which can be freely used and
distributed. As such, support for EVS is only available if an implementation is
supplied separately. Currently the only implementation supported is the
ETSI/3GPP reference implementation (either floating-point or fixed-point). Any
licensing issues that might result from such usage are the responsibility of
the user of this software.

The EVS codec implementation can be provided as a shared object library (*.so*)
which is loaded at runtime (during startup). The supported implementations
can be seen as subdirectories within the `evs/` directory. Currently supported
are version 17.0.0 of the ETSI/3GPP reference implementation, [*126.442*](https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1464) for the
fixed-point implementation and [*126.443*](https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1465) for the floating-point implementation.
(The floating-point implementation seems to be significantly faster, but is not
bit-precise.)

To supply the codec implementation as a shared object at runtime, extract
the reference implementation's *.zip* file and apply the provided `patch`
([from here](https://github.com/sipwise/rtpengine/tree/master/evs)) that is
appropriate for the chosen implementation. Run the build using `make`
(suggested build flags are `RELEASE=1 make`) and it should produce a file
`lib3gpp-evs.so`. Point *rtpengine* to this file using the `evs-lib-path=`
option to enable support for EVS.