Prosody application note: using wideband sampling rates

Introduction

This application note describes how to set up a channel to operate at 16kHz. By default, the Prosody X firmware operates at a sample rate of 8kHz. However, it is possible to operate at a higher rate in order to support the use of wideband codecs such as AMR-WB.

Echo cancellation, tone detection and tone regeneration

Note that the echo canceller and tone and grunt detectors are designed to operate on 8kHz audio ONLY. These features will not operate correctly if applied to a channel where the input data is 16kHz sampled audio.

Note also that tone regeneration from RFC2833 packets at a VMP[rx] operates correctly only at 8kHz.

Setting up the channel

To make a channel generate 16kHz sampled audio, the channel's default output sampling rate must be adjusted.

Allocate your channel in the normal way. Once the channel is created, use sm_channel_set_output_rate() to set the new sampling rate on the channel.

Note that only channels outputting at 8kHz can be connected to TDM. If you need to supply the same audio stream to both a 16kHz wideband codec and to TDM, you will need to resample the audio to the appropriate rate. See "Setting up a resampling path" below for details.

Setting up the RTP endpoints

The RTP endpoints - VMP[tx] and VMP[rx] - must each also be set to operate at the new sampling rate. In particular, failure to set the VMP[tx] rate correctly will affect the generation of RTP timestamps, causing the packets to be decoded incorrectly at the far end.
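To illustrate why the VMP[tx] rate matters, the sketch below computes how far an RTP timestamp would be expected to advance per packet. It assumes the usual RTP convention that the timestamp counts samples at the codec's clock rate, and that this clock rate equals the audio sampling rate (as it does for AMR-WB at 16kHz). The helper name rtp_timestamp_step is hypothetical and is not part of the Prosody API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not a Prosody API function.
   Returns the number of timestamp units covered by one packet, assuming
   the RTP clock rate equals the audio sampling rate. */
uint32_t rtp_timestamp_step(uint32_t sample_rate_hz, uint32_t packet_ms)
{
    assert(sample_rate_hz > 0 && packet_ms > 0);
    /* Use 64-bit arithmetic so the multiplication cannot overflow. */
    return (uint32_t)(((uint64_t)sample_rate_hz * packet_ms) / 1000u);
}
```

Under these assumptions a 20ms packet advances the timestamp by 160 at 8kHz but by 320 at 16kHz, so a VMP[tx] left at 8kHz would stamp 16kHz audio with half the expected increment.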

Create your VMP[rx] and VMP[tx] as normal. Using sm_vmptx_config_sample_rate() and sm_vmprx_config_sample_rate() respectively, set the new sampling rate on the RTP endpoints. The VMP[rx] and VMP[tx] can now be configured with a codec and connected to your 16kHz channel in the normal way.

A note on record and replay

The sampling_rate parameters given to the functions sm_record_start() and sm_replay_start(), whether directly or via the high-level API functions, do not affect the internal working rate of the channel. They are used to calculate a resampling ratio to scale between the file sample rate and the channel sample rate. To ensure correct operation, always configure your channel's sample rate BEFORE starting any replay or record on the channel.

The sampling_rate passed to sm_record_start() is used to calculate a ratio assuming the channel will be given data at a sample rate of 8kHz. It is recommended that the sampling_rate be set to 0, which will cause the recording rate to be the same as the sampling rate of data supplied to the channel.

Other supported rates for a channel operating at 16kHz are:

recording file sample rate    sampling_rate value
16kHz                         8000
12kHz                         6000
22kHz                         11000

The sampling_rate given to sm_replay_start() is used to calculate a resampling ratio relative to the current working rate of the channel that the replay is being configured on. Supported rates for a channel operating at 16kHz are thus:

replay file sample rate       sampling_rate value
16kHz                         16000
12kHz                         12000
22kHz                         22000
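The record and replay conventions differ, which is easy to get wrong. The sketch below shows one way to derive the sampling_rate value for each case. The record formula is inferred from the table for a 16kHz channel (the ratio is calculated as if the channel were at 8kHz) and is an assumption, not a documented formula. Both helper names are hypothetical and are not part of the Prosody API.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not Prosody API functions. */

/* Value to pass as sampling_rate to sm_record_start(). The ratio is taken
   against an assumed 8kHz channel, so the desired file rate is scaled by
   8000 / (actual channel rate). Inferred from the record table. */
unsigned record_sampling_rate_value(unsigned file_rate_hz,
                                    unsigned channel_rate_hz)
{
    assert(channel_rate_hz > 0);
    return (unsigned)(((unsigned long long)file_rate_hz * 8000u)
                      / channel_rate_hz);
}

/* Value to pass as sampling_rate to sm_replay_start(). The ratio is taken
   against the channel's current working rate, so the file's own sample
   rate is passed unchanged. */
unsigned replay_sampling_rate_value(unsigned file_rate_hz)
{
    return file_rate_hz;
}
```

For a 12kHz file on a 16kHz channel, this gives 6000 for record and 12000 for replay, matching the two tables.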

A note on conferencing

A channel that is a low level conference output will produce audio at the channel's configured rate. Only those conference inputs that are supplied with audio at that rate will be added to the output.

Therefore, when conferencing multiple channels together, ALL the conferenced channels MUST operate at the same sampling rate. To ensure sane operation, the required rate must be set before the channels are passed to any conferencing API function.

If a conference must mix participants operating at different rates, choose ONE sampling rate as the conference rate. For each participant operating at a different rate, use resampling paths to convert the participant's inputs and outputs onto a new channel operating at the conference rate. This new channel can then be included in the conference in the usual way.

Setting up a resampling path

Sometimes it is necessary to change the rate of an audio stream - e.g. playing an 8kHz recording out to a 16kHz channel, or gatewaying between wideband RTP and TDM. To do this, the audio must be passed through a path configured to resample to the appropriate rate.

Create a path in the usual manner, and then use sm_path_resample() to configure the resampling ratio to use. The configured path can now be used to interconnect the two channels that are using differing sampling rates.
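As a rough illustration of the ratio involved, the sketch below reduces a pair of sampling rates to the smallest up/down ratio. This note does not describe how sm_path_resample() expects the ratio to be expressed, so the struct and function names here are hypothetical and only the arithmetic is shown; consult the API reference for the actual parameters.

```c
#include <assert.h>

/* Hypothetical type and helpers for illustration only; not Prosody API. */
struct rate_ratio {
    unsigned up;    /* output-rate factor */
    unsigned down;  /* input-rate factor  */
};

/* Greatest common divisor by Euclid's algorithm. */
static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Ratio needed to take audio from in_rate_hz to out_rate_hz, in lowest terms. */
struct rate_ratio resample_ratio(unsigned in_rate_hz, unsigned out_rate_hz)
{
    struct rate_ratio r;
    unsigned g;

    assert(in_rate_hz > 0 && out_rate_hz > 0);
    g = gcd_u(in_rate_hz, out_rate_hz);
    r.up = out_rate_hz / g;
    r.down = in_rate_hz / g;
    return r;
}
```

Playing an 8kHz recording out to a 16kHz channel corresponds to a ratio of 2:1, and gatewaying 16kHz RTP audio to TDM corresponds to 1:2.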