Prosody application note: configuring universal tone detection

Introduction

This application note describes how to extend Prosody's default tone detection capability. The application note DTMF Detection Issues covers some frequent problems encountered when detecting tones.

Types of Tone Detection

There are two types of tone detection - simple and complex. A simple tone is detected if the signal meets detection criteria instantaneously. A complex tone (such as a call progress tone) is defined as a sequence of simple tones, with specified cadence.

For both types of tone detection, the maximum range of the detection algorithm is 250Hz to 3406.25Hz, but the range may be further limited using sm_adjust_input_tone_set(). The two pre-loaded tonesets have default range limits as follows:

Simple tones, Top-Down

When starting simple tone detection sm_listen_for() is given an active_tone_set_id.

There are two default tonesets, and more tonesets can be defined using sm_add_input_tone_set().

A toneset defines the set of frequencies, frequency pairing (in the case of dual tones) and various signal parameters, such as signal-to-noise ratio, which must be satisfied.

typedef struct sm_input_tone_set_parms {
	tSMModuleId	module;
	tSM_INT	band1_first_freq_coeffs_id;
	tSM_INT	band1_freq_count;
	tSM_INT	band2_first_freq_coeffs_id;
	tSM_INT	band2_freq_count;

	double	req_third_peak;
	double	req_signal_to_noise_ratio;
	double	req_minimum_power;
	double	req_twist_for_dual_tone;
	tSM_INT	id;
} SM_INPUT_TONE_SET_PARMS;

Frequencies such as band1_first_freq_coeffs_id are defined with reference to the Input Frequency Table, which specifies lower and upper frequency boundaries for individual tones. There is one Input Frequency Table. Extra coefficients (tones) can be added using sm_add_input_freq_coeffs(). A band may be defined to have zero elements. This is done when a tone set is defined for single component tone detection. A zero length band2 indicates that the tones to be detected are simply the tones indicated by band 1 with no second frequency component.

Here is an illustration of how the default two tonesets relate to the table of frequencies.

[Figure: diagram of frequency table and tonesets]

The default toneset 0 has four frequencies in both bands, so it detects all the combinations of these - the sixteen DTMF tones.

Aside - DTMF Detection

DTMF signals each have two tones, a low tone and a high tone. There are four possible low tones, and four possible high tones. The digits are defined to use tone indexes in the following order:

DIGIT   low index   high index   centre frequencies (Hz)
'1'     0           0            697 + 1209
'2'     0           1            697 + 1336
'3'     0           2            697 + 1477
'A'     0           3            697 + 1633
'4'     1           0            770 + 1209
'5'     1           1            770 + 1336
'6'     1           2            770 + 1477
'B'     1           3            770 + 1633
'7'     2           0            852 + 1209
'8'     2           1            852 + 1336
'9'     2           2            852 + 1477
'C'     2           3            852 + 1633
'*'     3           0            941 + 1209
'0'     3           1            941 + 1336
'#'     3           2            941 + 1477
'D'     3           3            941 + 1633
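
The table above can also be written as a small lookup. The following is a minimal sketch only: the function name dtmf_digit() is hypothetical and is not part of the Prosody API, and it assumes the caller already holds a low index and a high index in the sense used by the table.

```c
/* Hypothetical helper (not part of the Prosody API): map a DTMF
 * low-tone index and high-tone index, as listed in the table above,
 * to the corresponding digit character.
 * Returns '\0' if either index is outside the range 0..3. */
char dtmf_digit(int low_index, int high_index)
{
	/* Rows are the low tones 697, 770, 852, 941 Hz;
	 * columns are the high tones 1209, 1336, 1477, 1633 Hz. */
	static const char digits[4][4] = {
		{ '1', '2', '3', 'A' },
		{ '4', '5', '6', 'B' },
		{ '7', '8', '9', 'C' },
		{ '*', '0', '#', 'D' }
	};

	if (low_index < 0 || low_index > 3 || high_index < 0 || high_index > 3)
		return '\0';
	return digits[low_index][high_index];
}
```

For example, dtmf_digit(3, 1) gives '0' (941 + 1336 Hz).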

When detecting DTMF tones on Prosody, using sm_listen_for():

DTMF Detection Toneset

The toneset for DTMF detection specifies two bands of tones.

Band 1 is the set of 4 low tones, Band 2 is the set of 4 high tones. The tones in each band are defined with reference to the Input Frequency Table. Band 1 starts at tone id 0 (in the table of C2.1), Band 2 starts at tone id 4.

That is:

DTMFToneSet.band1_first_freq_coeffs_id = 0; // DTMF Frequency 1 (684.5 to 709.5Hz)
DTMFToneSet.band1_freq_count = 4;
DTMFToneSet.band2_first_freq_coeffs_id = 4; // DTMF Frequency 5 (1188.9 to 1229.1Hz)
DTMFToneSet.band2_freq_count = 4;

Other parameters are explained in the description of sm_add_input_tone_set().

Call Progress Toneset

Call-progress detection is enabled by setting enable_cptone_recognition in a call to sm_listen_for(). When this is enabled, the parameter active_tone_set_id is ignored.

The toneset used by default is toneset 1. This can be changed - see below.

Defining a Toneset, Bottom-Up

As an example, we will add a tone set to detect R2 CAS Forward tones, defined according to ITU Q.455 as follows:

Any pairing of frequencies 1980, 1860, 1740, 1620, 1500, 1380 Hz. Frequency ranges of +/- 10Hz must be detected.

In order to define a toneset from scratch, we must first define frequency detection limits in the frequency table.

Centre (Hz)   Low cut-off (Hz)   High cut-off (Hz)
1380          1370               1390
1500          1490               1510
1620          1610               1630
1740          1730               1750
1860          1850               1870
1980          1970               1990
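
The cut-off values in the table are simply the centre frequencies plus and minus the 10Hz tolerance. As a minimal sketch, they can be generated rather than typed in by hand. The function name r2_forward_limits() is hypothetical and not part of the Prosody API; the centre frequencies and tolerance are those quoted above.

```c
/* Hypothetical helper (not part of the Prosody API): fill two arrays
 * with the lower and upper detection limits for the six R2 forward
 * frequencies, using the +/- 10Hz tolerance quoted above. */
void r2_forward_limits(double lower_hz[6], double upper_hz[6])
{
	static const double centre_hz[6] = {
		1380.0, 1500.0, 1620.0, 1740.0, 1860.0, 1980.0
	};
	const double tolerance_hz = 10.0;
	int i;

	for (i = 0; i < 6; i++) {
		lower_hz[i] = centre_hz[i] - tolerance_hz;
		upper_hz[i] = centre_hz[i] + tolerance_hz;
	}
}
```

Generating the limits this way keeps the two arrays consistent if the tolerance is later tuned.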

These frequency limits should be added to the tone frequency table as follows. They will each be assigned an id, which here is stored in the array r2idx:

int define_r2_freqs(tSMModuleId module, int r2idx[6])
{
	int i, rc;
	SM_INPUT_FREQ_COEFFS_PARMS ifcp;
	double lowfreqs[6] = {1370, 1490, 1610, 1730, 1850, 1970 };
	double highfreqs[6] = {1390, 1510, 1630, 1750, 1870, 1990 };

	for(i=0; i<6; i++) {	// six frequencies: indexes 0..5
		memset(&ifcp, 0, sizeof(ifcp));
		ifcp.module = module;
		ifcp.lower_limit = lowfreqs[i];
		ifcp.upper_limit = highfreqs[i];
		rc = sm_add_input_freq_coeffs(&ifcp);
		if (rc) return rc;	// error
		r2idx[i] = ifcp.id;
	}
	return rc;
}

The R2 Toneset is then created as follows:

int add_tone_set(tSMModuleId module, int r2idx[6], int *r2tone_set_id)
{
	SM_INPUT_TONE_SET_PARMS itsp;
	int rc;
	
	memset(&itsp, 0, sizeof(itsp));
	itsp.module = module;
	itsp.band1_first_freq_coeffs_id = r2idx[0];
	itsp.band1_freq_count = 6;
	itsp.band2_first_freq_coeffs_id = r2idx[0];
	itsp.band2_freq_count = 6;
	
	itsp.req_third_peak = 0.01;		// Max disturbing frequency -20dB
	itsp.req_signal_to_noise_ratio = 0;	// None specified: rely on disturb. frequency
	itsp.req_minimum_power = -30;		// approx -30dBm0
	itsp.req_twist_for_dual_tone = 5.0;	// approx 7dB
	
	rc = sm_add_input_tone_set(&itsp);
	if (!rc) {
		*r2tone_set_id = itsp.id;
	}
	return rc;
}

Any pairing of the component frequencies is allowed, so both Band 1 and Band 2 start at r2idx[0], and have 6 component frequencies. The floating-point parameters specified are approximate, and if accurate detect/reject boundaries are required, the parameters must be empirically optimised.

In TiNG the above calls provide tonesets that are persistent for the life of the process they were called in. This is different from Prosody 1, when the tonesets were persistent until the firmware was downloaded again.

Tones are detected per channel using sm_listen_for() and sm_get_recognised() in a similar way to detection of DTMF tones, as follows:

int listen_for_r2(tSMChannelId channel, int r2tone_set_id)
{
	SM_LISTEN_FOR_PARMS smlfp;
	int rc;
	
	memset(&smlfp, 0, sizeof(smlfp));
	smlfp.channel = channel;
	smlfp.active_tone_set_id = r2tone_set_id;
	smlfp.tone_detection_mode = kSMToneDetectMinDuration64;
	
	return sm_listen_for(&smlfp);
}

Events may be set, as required, as in the example DTMF.C. There are 15 possible outcomes of the tone detection, corresponding to the 15 combinations of R2 frequencies. ITU Q.441 states that the 'value' of each combination of tones is as follows:

Tone pair (indexes)   Frequencies (Hz)   value
0 + 1                 1380 + 1500        1
0 + 2                 1380 + 1620        2
1 + 2                 1500 + 1620        3
0 + 3                 1380 + 1740        4
1 + 3                 1500 + 1740        5
2 + 3                 1620 + 1740        6
0 + 4                 1380 + 1860        7
1 + 4                 1500 + 1860        8
2 + 4                 1620 + 1860        9
3 + 4                 1740 + 1860        10
0 + 5                 1380 + 1980        11
1 + 5                 1500 + 1980        12
2 + 5                 1620 + 1980        13
3 + 5                 1740 + 1980        14
4 + 5                 1860 + 1980        15

The result is obtained as follows:

int get_r2_result(tSMChannelId channel, int *result)
{
	SM_RECOGNISED_PARMS recog;
	int rc;
	static int map_parms_to_value[6][6] = {	{-1,  1,  2,  4,  7, 11},
						{ 1, -1,  3,  5,  8, 12},
						{ 2,  3, -1,  6,  9, 13},
						{ 4,  5,  6, -1, 10, 14},
						{ 7,  8,  9, 10, -1, 15},
						{11, 12, 13, 14, 15, -1} };
	
	memset(&recog, 0, sizeof(recog));
	recog.channel = channel;
	
	rc = sm_get_recognised(&recog);
	if (!rc) {
		*result = map_parms_to_value[recog.param0][recog.param1];
	}
	return rc;
}
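
The values in the lookup table follow a regular pattern: for a pair of indexes lo < hi, the value is hi*(hi-1)/2 + lo + 1. This formula is inferred from the table in this note, not quoted from the ITU recommendation, so treat the following as a sketch. The function name r2_value() is hypothetical.

```c
/* Hypothetical helper (not part of the Prosody API): compute the R2
 * 'value' for a pair of frequency indexes in the range 0..5.
 * The order of the two indexes does not matter.
 * Returns -1 if an index is out of range or both indexes are equal. */
int r2_value(int a, int b)
{
	int lo, hi;

	if (a < 0 || a > 5 || b < 0 || b > 5 || a == b)
		return -1;

	lo = (a < b) ? a : b;
	hi = (a < b) ? b : a;

	/* Pairs are numbered in order of the higher index, then the lower. */
	return hi * (hi - 1) / 2 + lo + 1;
}
```

A function like this also rejects out-of-range indexes explicitly, which a direct array lookup does not.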

Adding single tones to the default DTMF Toneset

As explained in an earlier section, the default DTMF toneset consists of two bands, each of four different sets of frequencies. In this section we add extra tones to the DTMF toneset, so that they can be detected in parallel with DTMF detection. Further, the extra tones to be detected are single tones, as opposed to dual tones. The two tones to be added in this case are Fax CNG (1100Hz) and Fax CED (2100Hz).

The DTMF toneset has two bands, each of four frequencies, making up 16 tone pairs. We wish to add an extra 2 tone pairs, i.e. (1100 + 0) and (2100 + 0). We must therefore add two frequencies (1100 and 2100) to one band and a null frequency (0) to the other band. Also, we have decided not to detect the digits 'A' 'B' 'C' and 'D', so the 1633 frequency is removed from the higher band. The bands are now as follows ('L' symbolises DTMF low tone, 'H' symbolises DTMF high tone):

Band 1 (Hz):   697    770    852    941    1100   2100
Tone:          L0     L1     L2     L3     CNG    CED

Band 2 (Hz):   1209   1336   1477   0
Tone:          H0     H1     H2     CNG/CED

First, we establish that the default pre-loaded DTMF tone frequencies cannot be used. This is because bands of tones must have consecutive IDs. That is, 1100 must come immediately after 941. Therefore, we must add all these tones, in the correct order, to the end of the tone frequency table.

 
int add_tone_freqs(tSMModuleId module, int *first_freq_id)
{
	SM_INPUT_FREQ_COEFFS_PARMS coeff;
	int rc, i;
	
	// Order: four DTMF low tones, CNG, CED, three DTMF high tones, null frequency
	static float lolimit[10] = { 679.6875f, 742.1875f, 835.9375f, 914.0625f, 990.0f, 
	1880.0f, 1179.6875f, 1304.6875f, 1445.3125f, 0.0f };
	static float hilimit[10] = { 710.9375f, 789.0625f, 867.1875f, 960.9375f, 1210.0f, 
	2320.0f, 1242.1875f, 1367.1875f, 1507.8125f, 0.0f };
	
	for(i=0; i<10; i++) {
		memset(&coeff, 0, sizeof(coeff));
		coeff.module = module;
		coeff.lower_limit = lolimit[i];
		coeff.upper_limit = hilimit[i];
		
		rc = sm_add_input_freq_coeffs(&coeff); 
		if (rc) return rc;
		if (!i) {
			*first_freq_id = coeff.id;
		}
	}
	return 0;
}

The important index to remember is that of the first tone in the series, here stored in first_freq_id. The frequency bands are then defined as follows. The req_* coefficients are taken from the default DTMF recognition toneset.

int add_tone_set(tSMModuleId module, int first_freq_id, int *tone_set_id)
{
	SM_INPUT_TONE_SET_PARMS tone_set;
	int rc;
	
	memset(&tone_set, 0, sizeof(tone_set));
	
	tone_set.module = module;
	tone_set.band1_first_freq_coeffs_id = first_freq_id;
	tone_set.band1_freq_count = 6;
	tone_set.band2_first_freq_coeffs_id = first_freq_id+6;
	tone_set.band2_freq_count = 4;
	
	tone_set.req_third_peak = 0.0794;
	tone_set.req_signal_to_noise_ratio = 0.756;
	tone_set.req_minimum_power = -30;
	tone_set.req_twist_for_dual_tone = 10;
	
	rc = sm_add_input_tone_set(&tone_set);
	if (rc) return rc;
	*tone_set_id = tone_set.id;
	return 0;
}

Recognised tones will have the following param0 and param1 values:

                       param0
            0       1       2       3       4       5
param1  0   '1'     '4'     '7'     '*'     - (a)   - (a)
        1   '2'     '5'     '8'     '0'     - (a)   - (a)
        2   '3'     '6'     '9'     '#'     - (a)   - (a)
        3   - (b)   - (b)   - (b)   - (b)   CNG     CED

The invalid combinations of param0 and param1 will occur if the following tones appear on the line:

These combinations will be detected, and reported to the application, and should be ignored.

int get_detected(tSMChannelId channel, char *result)
{
	SM_RECOGNISED_PARMS rp;
	int rc;
	static char decode_detected[6][4] = {	{ '1', '2', '3', '-' },
						{ '4', '5', '6', '-' },
						{ '7', '8', '9', '-' },
						{ '*', '0', '#', '-' },
						{ '-', '-', '-', 'N' },
						{ '-', '-', '-', 'E' } };
	memset(&rp, 0, sizeof(rp));
	
	rp.channel = channel;
	
	rc = sm_get_recognised(&rp);
	if (rc) return rc;
	if (rp.type == kSMRecognisedTone) {
		if (rp.param0 > 5 || rp.param1 > 3) {
			// should never happen
		} else {
			*result = decode_detected[rp.param0][rp.param1];
			if (*result != '-') return 0;
		}
	}
	return -1;
}

Tone detection signal parameters

There are four parameters in the tone detection table which specify signal properties that must be satisfied by the incoming signal. A brief explanation of these is given in prospapi.pdf appendix C.3, but more information on how to calculate these is provided below.

The Prosody signal processing algorithm measures six entities, and performs various comparisons in order to validate the tone, as follows:

req_minimum_power is the measured energy in the two dominant frequencies. It is simply the value of (P1 + P2).

req_signal_to_noise_ratio is the minimum value of (P1+P2) / P0.

It is in fact the ratio (signal power) : (signal + noise power).

The scaling factor for k dB is 10^(k/10), therefore a required SNR of 6dB would be as follows:

req_twist_for_dual_tone is the maximum value of P1 / P2

To allow a maximum twist of 9dB:

req_third_peak is the maximum value of P3 / P2.

It controls the maximum energy of an interfering frequency, and can be used to control harmonic distortion, or interfering signals.

To reject a signal with an interfering frequency 20dB below the secondary frequency:

Please note that if a system is required to detect to rigorous limits, these coefficients should be optimised empirically.

Note also that the toneset can be adjusted using sm_adjust_input_tone_set(). In particular, the MinOnTime and MinOffTime can be modified only by using sm_adjust_input_tone_set().

Complex Tone Detection - Call Progress Detection

The term 'Call Progress Detection' refers to detection of any tone that is identified by both frequency and cadence. That is, the durations of the tones and pauses, and possibly changes of tone, need to be detected.

Call-progress detection involves an extra level of definition above simple tone detection (described above). This level is the Call Progress Tone Table (CPTonetable). By default a set of tones is pre-loaded. Entries in the CPTonetable specify cadences (i.e. sequences, with durations) of tones that are already defined for detection in the tone frequency table. The subset of the frequency table to be used, and the signal parameters are defined in a toneset. The default toneset for Call-Progress detection is toneset 1.

That is:

The application developer may wish to:

Adding a call-progress tone with existing CP frequencies

This is the simplest modification that can be made to the call-progress detection table. Here we use the example of adding UK ring tone (this CP tone is already loaded by default):

The default toneset 1 has the following relevant values:

U.K. Ring tone is specified as follows (repeated):

frequency (Hz)         400Hz   OFF     400Hz   OFF
time (msec)  nominal   400     200     400     2000
             min       288     128     288     1472
             max       512     256     512     2496

In the two 'ON' states, the frequency id is 1. The freq_id is the offset of the frequency from band1_first_freq_coeffs_id of the active toneset (in this case the default, toneset 1), PLUS ONE. A freq_id of 0 means silence (OFF).
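
The 'plus one' rule is easy to get wrong, so it can help to compute freq_id from the Input Frequency Table index. The following is a minimal sketch; cp_freq_id() is a hypothetical helper, not a Prosody API function.

```c
/* Hypothetical helper (not part of the Prosody API): convert an index
 * into the Input Frequency Table to the freq_id used in a call-progress
 * state, for a toneset whose band 1 starts at band1_first_id and holds
 * band1_count frequencies.
 * Returns -1 if the index is not part of band 1.
 * Note that freq_id 0 is reserved for silence and is never returned. */
int cp_freq_id(int table_index, int band1_first_id, int band1_count)
{
	if (table_index < band1_first_id ||
	    table_index >= band1_first_id + band1_count)
		return -1;

	return table_index - band1_first_id + 1;
}
```

For example, in a toneset whose band 1 starts at table index 8 and holds 9 frequencies, table index 8 maps to freq_id 1 and table index 16 maps to freq_id 9.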

The following code segment shows how this tone would be added.

Note: Two different versions of the cadence are loaded, both with the same result id. This is for speed of detection - the actual ring tone may start in any phase of its repeated cadence, and if only one cadence was specified in the CP tone table, the whole of the cadence would have to be visited before a detection occurred.

The entire cadence is not specified in either table entry. This is because the partial cadences specified are considered sufficient to detect this tone, and are unique with respect to other cadences in the table. It is important that no cadences in the table are subsets of other cadences, as the subset cadence will be detected unintentionally. The first entry detects a little after 400 + 200 + 400 = 1000 ms, while the second detects a little after 400 + 2000 + 400 = 2800 ms, whereas the complete cadence takes 400 + 200 + 400 + 2000 = 3000 ms.

int add_uk_ring_tone(tSMModuleId module, int uk_ring_id)
{
	SM_INPUT_CPTONE_PARMS cp;
	int rc;
	
	memset(&cp, 0, sizeof(cp));
	cp.module = module;
	cp.id = uk_ring_id;
	cp.state_count = 3;
	cp.states[0].freq_id = 1;	// On for 400ms
	cp.states[0].minimum_cadence = 288;
	cp.states[0].maximum_cadence = 512;
	cp.states[1].freq_id = 0;	// Off for 200ms
	cp.states[1].minimum_cadence = 128;
	cp.states[1].maximum_cadence = 256;
	cp.states[2].freq_id = 1;	// On for 400ms
	cp.states[2].minimum_cadence = 288;
	cp.states[2].maximum_cadence = 512;

	rc = sm_add_input_cptone(&cp);
	if (rc) return rc;

	memset(&cp, 0, sizeof(cp));
	cp.module = module;
	cp.id = uk_ring_id;
	cp.state_count = 3;
	cp.states[0].freq_id = 1; 	// On for 400ms
	cp.states[0].minimum_cadence = 288;
	cp.states[0].maximum_cadence = 512;
	cp.states[1].freq_id = 0;	// Off for 2000ms
	cp.states[1].minimum_cadence = 1472;
	cp.states[1].maximum_cadence = 2496;
	cp.states[2].freq_id = 1; 	// On for 400ms
	cp.states[2].minimum_cadence = 288;
	cp.states[2].maximum_cadence = 512;

	rc = sm_add_input_cptone(&cp);
	if (rc) return rc;
	return 0;
}
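
To see why two table entries are loaded, it can help to model the matching off-line. The sketch below is an illustration only and makes no claim about how the Prosody firmware performs cadence detection. The struct and function names are hypothetical. It checks a sequence of observed segments against a cadence definition using the same freq_id, minimum and maximum values as the entries above.

```c
/* Hypothetical model of one cadence state: frequency id and the
 * permitted duration range in milliseconds. */
struct cadence_state {
	int freq_id;
	int min_ms;
	int max_ms;
};

/* Hypothetical model of one observed segment on the line. */
struct observed_segment {
	int freq_id;
	int duration_ms;
};

/* Return 1 if every observed segment has the expected frequency id and
 * a duration within the permitted range for that state, otherwise 0. */
int cadence_matches(const struct cadence_state *states,
		    const struct observed_segment *seen, int count)
{
	int i;

	for (i = 0; i < count; i++) {
		if (seen[i].freq_id != states[i].freq_id)
			return 0;
		if (seen[i].duration_ms < states[i].min_ms ||
		    seen[i].duration_ms > states[i].max_ms)
			return 0;
	}
	return 1;
}
```

With this model, a line that starts ON 400, OFF 200, ON 400 matches only the first entry, and a line that starts ON 400, OFF 2000, ON 400 matches only the second, which is why both are needed to cover either starting phase.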

Adding a call progress tone with a new frequency

Addition of a new frequency to those in the call-progress frequency set involves four stages:

  1. Add the new input frequency coefficients.
  2. Define a toneset which includes the added input frequency coefficients
  3. Reset call progress detection, so that the new toneset is the Call Progress toneset
  4. Add ALL call progress tones, with parameters relative to the new toneset.

For example, given the default Call-Progress tone set, add a detector for the following tone (repeated):

frequency (Hz)         1000Hz  OFF     3000Hz  OFF
time (msec)  nominal   400     200     400     200
             min       288     128     288     128
             max       512     256     512     256

Add the new input frequency coefficients

Assuming 10% tolerance in frequency, add the 1000Hz and 3000Hz tones, using sm_add_input_freq_coeffs(), as described above. If no other tone frequencies have been added, the indices for these new tones will follow from 'Fax CED'.

Define a toneset which includes the added input frequency coefficients

The new toneset includes the default call-progress frequencies (7 frequencies, starting at index 8), and the new tones. Therefore:

    ...
    SM_INPUT_TONE_SET_PARMS tone_set;
    ...
    tone_set.module = module;
    tone_set.band1_first_freq_coeffs_id = 8;
    tone_set.band1_freq_count = 9;
    tone_set.band2_first_freq_coeffs_id = 0; 
    tone_set.band2_freq_count = 0;
    tone_set.req_third_peak = 0.1;
    tone_set.req_signal_to_noise_ratio = 0.5;
    tone_set.req_minimum_power = -23;
    ...
    rc = sm_add_input_tone_set(&tone_set);
    new_tone_set_id = tone_set.id;

Reset call progress detection, so that the new toneset is the Call Progress toneset

There is always one call-progress toneset. By default, this is toneset 1. We wish to use a different toneset, which has just been loaded. The index of this toneset was returned by the call to sm_add_input_tone_set().

Do this using sm_reset_input_cptones(), as documented.

Add ALL call progress tones, with parameters relative to the new toneset.

Because all of the tones in the pre-loaded call-progress table were defined relative to toneset 1, they are no longer valid. They must all be re-loaded, using sm_add_input_cptone(), as described above.

Finally, add the new call-progress tone as follows. Remember that the frequency IDs are relative to the start of the section of tone table used in our toneset. We are using 8..16, where 15 and 16 are the two new tones, so their IDs are 8 and 9.

 
int add_custom_tone(tSMModuleId module, int custom_id)
{
	SM_INPUT_CPTONE_PARMS cp;
	int rc;
	
	memset(&cp, 0, sizeof(cp));
	cp.module = module;
	cp.id = custom_id;
	cp.state_count = 3;
	cp.states[0].freq_id = 8;		// 1000Hz
	cp.states[0].minimum_cadence = 288;	// On for 400ms +/- 28%
	cp.states[0].maximum_cadence = 512;
	cp.states[1].freq_id = 0; // Silence
	cp.states[1].minimum_cadence = 128;	// Off for 200ms
	cp.states[1].maximum_cadence = 256;
	cp.states[2].freq_id = 9;		// 3000Hz
	cp.states[2].minimum_cadence = 288;	// On for 400ms
	cp.states[2].maximum_cadence = 512;

	rc = sm_add_input_cptone(&cp);
	if (rc) return rc;

	memset(&cp, 0, sizeof(cp));
	cp.module = module;		// cleared by memset, so set it again
	cp.id = custom_id;		// same result id as the first entry
	cp.state_count = 3;
	cp.states[0].freq_id = 9;		// 3000Hz
	cp.states[0].minimum_cadence = 288;	// On for 400ms
	cp.states[0].maximum_cadence = 512;
	cp.states[1].freq_id = 0;		// Silence
	cp.states[1].minimum_cadence = 128;	// Off for 200ms
	cp.states[1].maximum_cadence = 256;
	cp.states[2].freq_id = 8;		// 1000Hz
	cp.states[2].minimum_cadence = 288;	// On for 400ms
	cp.states[2].maximum_cadence = 512;

	rc = sm_add_input_cptone(&cp);
	if (rc) return rc;
	return 0;
}

Once again we have included two alternative definitions of this cadence, to detect the tone quickly, whichever phase it starts in.

Application Control of Cadence Detection

It is possible, using the tone detection mode kSMToneLenDetectionMinDurationxx to retrieve the duration of a tone from a call to sm_get_recognised(). This will allow a certain degree of cadence detection to be carried out by the user application. For example, if the combined DTMF/CNG/CED tone table is used, as described above, the application may be required to validate that the CNG tone lasts 0.5s and/or the CED tone lasts 3s. The application can simply check the param1 (duration) parameter when the tone is detected.
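
An application-level duration check of this kind might look like the sketch below. It is illustrative only. The function name is hypothetical, and it assumes the reported duration has already been converted to milliseconds; this note does not state the units of the duration parameter, so check the API documentation before relying on it. The tolerance is an application choice.

```c
/* Hypothetical helper (not part of the Prosody API): return 1 if a
 * measured tone duration lies within nominal_ms +/- tolerance_ms.
 * Assumption: all three values are in milliseconds. */
int duration_is_plausible(long duration_ms, long nominal_ms, long tolerance_ms)
{
	return duration_ms >= nominal_ms - tolerance_ms &&
	       duration_ms <= nominal_ms + tolerance_ms;
}
```

For example, duration_is_plausible(d, 500, 50) could be used to accept a CNG tone of roughly 0.5s, and duration_is_plausible(d, 3000, 300) a CED tone of roughly 3s.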

TiNG Modules for Tone Detection

The TiNG module required for tone detection is td (used by sm_listen_for()).


Document reference: AN 1352