Prosody speech processing: Notes on adding call progress tones

There is only one active repertoire of call-progress tones at any time.
The set of call-progress tones uses a given tone-set (modified using sm_add_input_tone_set()).
The default call-progress repertoire uses tone set 1, defined in "Prosody speech processing: pre-loaded input tones".
A tone can be added to the call-progress repertoire using sm_add_input_cptone(), as long as it uses this same tone set.
If use of a different tone-set is required, the entire current repertoire of call-progress tones must be discarded and a new repertoire created based on the new tone-set.
To do this, create the new tone set using sm_add_input_tone_set(), call sm_reset_input_cptones() with the new tone-set id, and then build the new call-progress repertoire from scratch.
Call progress tones are specified as a sequence of tones, each with a frequency and a duration. The algorithm wakes up every time a tone starts or stops, then checks whether the detected sequence matches any of the sequences specified in the call-progress repertoire.
If the cadence is not symmetrical, the state sequence must be given in reverse order, i.e. states[0] is later in time than states[1]. An example of this is the S.I.T. tone, where the sequence is tone 2 - silence - tone 3 - silence - tone 4.
The tone-ids used in a call-progress tone definition must be incremented by 1, because a tone-id of zero is used to specify silence.
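The two rules above (reverse state order, and tone-ids offset by 1 with zero meaning silence) can be sketched as follows. This is only an illustration: struct cp_state and the helper names are hypothetical and are not the Prosody API structures. It also assumes that the S.I.T. sequence quoted in the text is written in time order and that its tone numbers are already call-progress tone-ids.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state record, for illustration only.
 * It is NOT the structure used by sm_add_input_cptone(). */
struct cp_state {
    unsigned tone_id;   /* 0 = silence, otherwise tone index + 1 */
};

#define CP_SILENCE 0u

/* Convert a zero-based tone index into the id used in a
 * call-progress definition (zero is reserved for silence). */
static unsigned cp_tone_id(unsigned tone_index)
{
    return tone_index + 1u;
}

/* Copy a sequence written in time order into latest-first order,
 * so that states[0] is later in time than states[1]. */
static void cp_reverse_states(const struct cp_state *chronological,
                              struct cp_state *states, size_t count)
{
    assert(chronological != NULL && states != NULL);
    for (size_t k = 0; k < count; k++) {
        states[k] = chronological[count - 1 - k];
    }
}
```

Given the time-ordered sequence tone 2, silence, tone 3, silence, tone 4, cp_reverse_states() produces tone 4, silence, tone 3, silence, tone 2.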
Durations are stored internally as multiples of 32ms, so it is useful to round durations before specifying them. This way you can be sure of what durations are actually used.
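A minimal helper for pre-rounding a duration to the 32 ms granularity might look like the sketch below. The name cp_round_ms is hypothetical. The sketch rounds to the nearest multiple, which is an assumption: the text does not say whether the internal conversion rounds or truncates, which is a further reason to pass in values that are already exact multiples.

```c
#include <assert.h>
#include <limits.h>

#define CP_TICK_MS 32u   /* granularity stated in the text */

/* Round a duration in milliseconds to the nearest multiple of 32 ms
 * (halfway cases round up). */
static unsigned cp_round_ms(unsigned ms)
{
    assert(ms <= UINT_MAX - CP_TICK_MS / 2u);   /* avoid overflow */
    return ((ms + CP_TICK_MS / 2u) / CP_TICK_MS) * CP_TICK_MS;
}
```

For example, a nominal 500 ms becomes 512 ms and a nominal 330 ms becomes 320 ms.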
If the minimum duration for a state is zero, then the tone will be recognised even if that state never occurs.
Only on Prosody version 1: In order to detect long periods of silence efficiently, the maximum duration for a silent state is 2528ms (0x4f x 32ms). After this duration, the algorithm will wake up and move to the next state. The next state may also be silence.
Only on Prosody version 1: The previous rule also applies to any tone with index 1. Maximum duration in this case is 1248ms (0x27 x 32ms).
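One consequence of these version 1 limits is that a silence (or a tone with index 1) longer than the limit is seen as several consecutive states. The sketch below shows the arithmetic for splitting a long duration into pieces that respect a limit. It is an illustration only: the function name is hypothetical, and how the minimum and maximum durations of each resulting state should be chosen is not covered by the text.

```c
#include <assert.h>
#include <stddef.h>

#define CP_V1_MAX_SILENCE_MS (0x4fu * 32u)   /* 2528 ms */
#define CP_V1_MAX_TONE1_MS   (0x27u * 32u)   /* 1248 ms */

/* Split total_ms into consecutive pieces no longer than limit_ms.
 * Stores at most 'capacity' piece lengths in 'out' and returns the
 * number of pieces needed, which may exceed 'capacity'. */
static size_t cp_split_duration(unsigned total_ms, unsigned limit_ms,
                                unsigned *out, size_t capacity)
{
    size_t n = 0;
    assert(limit_ms > 0u);
    while (total_ms > 0u) {
        unsigned piece = total_ms > limit_ms ? limit_ms : total_ms;
        if (n < capacity) {
            out[n] = piece;
        }
        n++;
        total_ms -= piece;
    }
    return n;
}
```

A 4000 ms silence, for instance, splits into two silent states of 2528 ms and 1472 ms.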
Only on Prosody version 2 (TiNG): If a maximum duration is specified as ~0U (the largest unsigned integer value) this means that there is no maximum duration. When checking a tone specified in this way, the software considers the tone to have matched as soon as it has persisted for the minimum duration. For all other maximum values, the software must wait until the end of the tone so that it can check that its duration does not exceed the maximum.
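The difference between the two cases can be modelled with a small decision function. This is a sketch of the behaviour described above, not Prosody's implementation. All names are hypothetical.

```c
#include <assert.h>

#define CP_NO_MAX (~0U)   /* "no maximum duration", as in the text */

enum cp_verdict { CP_PENDING, CP_MATCH, CP_NO_MATCH };

/* Decide whether one state has matched.
 * elapsed_ms: how long the tone has lasted so far
 * ended:      non-zero once the tone has stopped */
static enum cp_verdict cp_check_state(unsigned elapsed_ms, int ended,
                                      unsigned min_ms, unsigned max_ms)
{
    assert(max_ms == CP_NO_MAX || min_ms <= max_ms);
    if (max_ms == CP_NO_MAX) {
        /* No maximum: match as soon as the minimum is reached. */
        if (elapsed_ms >= min_ms) {
            return CP_MATCH;
        }
        return ended ? CP_NO_MATCH : CP_PENDING;
    }
    /* Finite maximum: the verdict is only known when the tone ends. */
    if (!ended) {
        return CP_PENDING;
    }
    return (elapsed_ms >= min_ms && elapsed_ms <= max_ms)
               ? CP_MATCH : CP_NO_MATCH;
}
```

With a finite maximum the result stays CP_PENDING until the tone stops, which is why such definitions are reported later than those using ~0U.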

Only on Prosody version 2 (TiNG): An entry in the call progress tone table will match an incoming tone which contains the correct sequence of tones even if the sequence does not start at the first item in the table. For example, the S.I.T. tone, whose pattern is:

tone 2, silence, tone 3, silence, tone 4, silence

will match any of these six incoming sequences:

No	Sequence
1	`tone 2,`	`silence,`	`tone 3,`	`silence,`	`tone 4,`	`silence,`
2		`silence,`	`tone 3,`	`silence,`	`tone 4,`	`silence,`	`tone 2,`
3			`tone 3,`	`silence,`	`tone 4,`	`silence,`	`tone 2,`	`silence,`
4				`silence,`	`tone 4,`	`silence,`	`tone 2,`	`silence,`	`tone 3,`
5					`tone 4,`	`silence,`	`tone 2,`	`silence,`	`tone 3,`	`silence,`
6						`silence,`	`tone 2,`	`silence,`	`tone 3,`	`silence,`	`tone 4,`
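The six sequences in the table are the cyclic rotations of the pattern. A minimal sketch of such a check, using a hypothetical function name and ignoring durations entirely, could be:

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if 'observed' equals 'pattern' rotated by some offset,
 * otherwise 0. Both arrays hold 'count' tone-ids (0 = silence). */
static int cp_is_rotation(const unsigned *pattern,
                          const unsigned *observed, size_t count)
{
    assert(pattern != NULL && observed != NULL);
    if (count == 0) {
        return 1;
    }
    for (size_t shift = 0; shift < count; shift++) {
        size_t k = 0;
        while (k < count && observed[k] == pattern[(shift + k) % count]) {
            k++;
        }
        if (k == count) {
            return 1;   /* every element matched at this offset */
        }
    }
    return 0;
}
```

With the pattern {2, 0, 3, 0, 4, 0}, row 3 of the table, {3, 0, 4, 0, 2, 0}, is accepted, while {2, 0, 4, 0, 3, 0} is rejected because the tones are in the wrong order.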

Only on Prosody version 2 (TiNG): If a tone set is used which has both Nl tones in band 1, denoted L[0]..L[Nl-1], and Nh tones in band 2, denoted H[0]..H[Nh-1], then any combination of tones, L[i] with H[j], will have a freq_id value of i + Nl * j + 1 (the added 1 is the increment that reserves zero for silence). For example, if there are two tones in band 1, 350 Hz and 900 Hz, and three tones in band 2, 500 Hz, 700 Hz, and 1234 Hz, then the possible combinations are:

	tones with 500 Hz	tones with 700 Hz	tones with 1234 Hz
tones with 350 Hz	1 is 350 + 500	3 is 350 + 700	5 is 350 + 1234
tones with 900 Hz	2 is 900 + 500	4 is 900 + 700	6 is 900 + 1234
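The values in this table correspond to i + Nl * j + 1, that is, the combination index plus the increment of 1 that keeps zero free for silence. A small sketch that reproduces them (the function name is hypothetical):

```c
#include <assert.h>

/* Call-progress id for the dual tone L[i] + H[j] in a tone set with
 * nl tones in band 1 and nh tones in band 2. Zero is never returned,
 * because zero denotes silence. */
static unsigned cp_dual_tone_id(unsigned i, unsigned j,
                                unsigned nl, unsigned nh)
{
    assert(i < nl && j < nh);
    return i + nl * j + 1u;
}
```

With nl = 2 and nh = 3, the pair 900 Hz + 700 Hz (i = 1, j = 1) gives 4, as in the table.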