Prosody speech processing: API

a structure of the following type:

typedef struct sm_ans_listen_for_parms {
	tSMChannelId channel;					/* in */
	enum kSMANSMode {
		kSMANSModeDisable,
		kSMANSModeDetect,
	} detection_mode;					/* in */
} SM_ANS_LISTEN_FOR_PARMS;

Description

This call controls detection of the ITU-T V.8 tones ANS and ANSam, which are used in the preliminary stages of modem negotiation.

When a tone is recognised, the recognition event associated with the channel is set and the application can then retrieve a tone identifier for the recognised tone by calling sm_get_recognised().

The module ansam is required by this call.

Fields

channel

The channel on which to listen.

detection_mode

The detection mode to use. One of these values:

kSMANSModeDisable: Stop detecting ANS and ANSam tones.
kSMANSModeDetect: Initiate ANS and ANSam tone detection.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - illegal mode

Prosody speech processing: API: sm_beep_listen_for

Prototype Definition

int sm_beep_listen_for(struct sm_beep_listen_for_parms *listenp)

Parameters

a structure of the following type:

typedef struct sm_beep_listen_for_parms {
	tSMChannelId channel;					/* in */
	tSM_INT min_duration;					/* in */
	double upper_limit;					/* in */
	double lower_limit;					/* in */
} SM_BEEP_LISTEN_FOR_PARMS;

Description

A beep will be recognised if it lasts for at least min_duration milliseconds and the frequency is between lower_limit and upper_limit.

When a beep is recognised, the recognition event associated with the channel is set and the application can then retrieve the details for the recognised beep by calling sm_get_recognised().

Setting min_duration to zero disables the detection.

The modules td and beepdet are required.
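
The recognition rule above can be sketched as a small predicate. This is an illustration only: beep_would_match() is a hypothetical helper, not part of the Prosody API, and it assumes that both frequency limits are inclusive, which the text does not state.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of the Prosody API: applies the rule
 * described above to one measured beep (duration in ms, frequency in Hz). */
bool beep_would_match(int min_duration_ms, double lower_limit_hz,
                      double upper_limit_hz, int beep_duration_ms,
                      double beep_freq_hz)
{
    /* A min_duration of zero disables detection entirely. */
    if (min_duration_ms == 0)
        return false;

    /* Assumption: both frequency limits are inclusive. */
    return beep_duration_ms >= min_duration_ms &&
           beep_freq_hz >= lower_limit_hz &&
           beep_freq_hz <= upper_limit_hz;
}
```

For example, with min_duration = 200, lower_limit = 900 and upper_limit = 1100, a 250 ms beep at 1000 Hz would qualify, while a 150 ms beep at the same frequency would not.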

Fields

channel: The channel on which to listen.
min_duration: The minimum beep duration (in milliseconds).
upper_limit: The upper frequency boundary (in Hz).
lower_limit: The lower frequency boundary (in Hz).

Returns

0 if call completed successfully, otherwise a standard error.

Prosody speech processing: API: sm_catsig_listen_for

Prototype Definition

int sm_catsig_listen_for(struct sm_catsig_listen_for_parms *listenp)

Parameters

a structure of the following type:

typedef struct sm_catsig_listen_for_parms {
	tSMChannelId channel;					/* in */
	enum kBESPCatSigAlg {
		kBESPCatSigAlgLiveSpeaker=1,
		kBESPCatSigAlgLiveSpeakerTone,
		kBESPCatSigAlgLiveSpeakerToneReport,
	} catsig_alg_id;						/* in */
	tSM_INT abort_catsig_alg;				/* in */
} SM_CATSIG_LISTEN_FOR_PARMS;

Description

If this call is invoked with abort_catsig_alg set to zero and catsig_alg_id set to the identifier of a signal categorisation algorithm, it runs that algorithm on the given channel's input. Once enough of the signal has been processed to classify it into a definite category, the application is notified and can retrieve an indication of the signal category by calling sm_get_recognised().

If a recognition event has been previously associated with channel (see sm_channel_set_event()), then the driver will notify the application with that event whenever a signal has been categorised.

If the signal cannot be categorised, then no event will occur. Thus an application would normally timeout if no categorisation event occurs within a reasonable time. In order to cancel a signal categorisation algorithm job, the call should be invoked with abort_catsig_alg set to 1.

Fields

channel

The channel whose input is to be analysed.

catsig_alg_id

The signal categorisation algorithm to use. One of these values:

kBESPCatSigAlgLiveSpeaker: Distinguish a signal coming from a live speaker from one coming from an answering machine. In the result returned by sm_get_recognised() the param1 field contains the value 0 when a machine has been detected and 1 when a live speaker has been detected. Requires the module ansdet to have been downloaded.
kBESPCatSigAlgLiveSpeakerTone: Distinguish a signal coming from a live speaker from one coming from an answering machine, ignoring any initial tones. In the result returned by sm_get_recognised() the param1 field contains the value 0 when a machine has been detected and 1 when a live speaker has been detected. Requires the module td to have been downloaded.
kBESPCatSigAlgLiveSpeakerToneReport: Distinguish a signal coming from a live speaker from one coming from an answering machine, ignoring any initial tones. In the result returned by sm_get_recognised() the param1 field contains the value 0 when a machine has been detected, 1 when a live speaker has been detected, 2 when a tone start has been detected and 3 when a tone end has been detected. Requires the module td to have been downloaded.
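
The param1 values listed above can be turned into readable labels with a small lookup. This is a minimal sketch: catsig_result_name() is a hypothetical helper, and it assumes the application has already copied param1 out of the result returned by sm_get_recognised().

```c
#include <stddef.h>

/* Hypothetical helper: maps the param1 value reported for a signal
 * categorisation result onto a readable label. */
const char *catsig_result_name(int param1)
{
    switch (param1) {
    case 0:  return "answering machine";
    case 1:  return "live speaker";
    case 2:  return "tone start";  /* kBESPCatSigAlgLiveSpeakerToneReport only */
    case 3:  return "tone end";    /* kBESPCatSigAlgLiveSpeakerToneReport only */
    default: return NULL;          /* value not described in this section */
    }
}
```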

abort_catsig_alg

Indicator of whether to abort signal categorisation on this channel (non-zero) or not (zero).

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - abort field is inconsistent with channel state

Prosody speech processing: API: sm_channel_set_input_threshold

Prototype Definition

int sm_channel_set_input_threshold(struct sm_channel_set_input_threshold_parms *thp)

Parameters

*thp

a structure of the following type:

typedef struct sm_channel_set_input_threshold_parms {
	tSMChannelId channel;					/* in */
	tSM_INT minimum_bits;					/* in */
} SM_CHANNEL_SET_INPUT_THRESHOLD_PARMS;

Description

A channel is considered to be ready for you to fetch data from it when there is enough data. This call allows you to specify how much is 'enough'.

While there is enough data to make a channel ready, the channel's associated read event (as configured with sm_channel_set_event()) remains set.

Fields

channel: The channel whose threshold is to be set.
minimum_bits: The new threshold. When this much data is available to be read, the channel is ready.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_channel_set_output_threshold

Prototype Definition

int sm_channel_set_output_threshold(struct sm_channel_set_output_threshold_parms *thp)

Parameters

*thp

a structure of the following type:

typedef struct sm_channel_set_output_threshold_parms {
	tSMChannelId channel;					/* in */
	tSM_INT minimum_bits;					/* in */
} SM_CHANNEL_SET_OUTPUT_THRESHOLD_PARMS;

Description

A channel is considered to be ready for you to supply data to it when there is enough space. This call allows you to specify how much is 'enough'.

While there is enough space for more data to make a channel ready, the channel's associated write event (as configured with sm_channel_set_event()) remains set.

Fields

channel: The channel whose threshold is to be set.
minimum_bits: The new threshold. If the threshold is greater than zero, the channel is ready when this much space is available for new data. If it is less than zero, the channel becomes ready when the amount of buffered data falls below the magnitude of that value. For example, the value -1024 means that notification is to happen when only 128 octets (1024 bits) of data are waiting to be sent.
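
Because minimum_bits is expressed in bits and its sign selects the mode, a conversion helper can make call sites easier to read. The sketch below is an assumption-laden illustration: output_threshold_bits() is a hypothetical name, and plain int stands in for tSM_INT, whose underlying type is not given here.

```c
/* Hypothetical helper: builds a minimum_bits value from a size in octets.
 * wait_for_drain == 0: ready when at least `octets` of space is free.
 * wait_for_drain != 0: ready when buffered data has drained to `octets`. */
int output_threshold_bits(int octets, int wait_for_drain)
{
    int bits = octets * 8;   /* the field is expressed in bits */
    return wait_for_drain ? -bits : bits;
}
```

With this helper, output_threshold_bits(128, 1) yields -1024, matching the example in the field description.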

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_condition_adjust

Prototype Definition

int sm_condition_adjust(struct sm_condition_adjust_parms *condp)

Parameters

*condp

a structure of the following type:

typedef struct sm_condition_adjust_parms {
	tSMChannelId channel;					/* in */
	enum kSMInputCondAdjust {
		kSMInputCondAdjustNonLinearWithMuting,
		kSMInputCondAdjustNonLinearWithCNG,
		kSMInputCondAdjustAGC,
		kSMInputCondAdjustFixGain,
	} adjust_type;						/* in */
	tSM_INT adjust_value;					/* in */
} SM_CONDITION_ADJUST_PARMS;

Description

Adjusts the input conditioning currently being performed on a channel.

Fields

channel

The channel to which conditioning is being applied.

adjust_type

What sort of adjustment to perform. One of these values:

kSMInputCondAdjustNonLinearWithMuting: Select whether linear or non-linear echo cancellation with muting is performed. This adjustment is only valid if the input conditioning is currently kSMInputCondEchoCancelation. The adjust_value field selects which to use, with the value zero selecting linear echo cancellation, and other values selecting non-linear cancellation with muting. Non-linear echo cancellation with muting suppresses the signal when it calculates that, after having the echo removed, there is no significant signal remaining. This mode of operation is suitable for large-scale conferencing and some IVR applications, where it is important that the signal is completely muted whenever the caller is not speaking. Negative values represent the lower limit in dBm0. Positive values cause the default value of -42dBm0 to be used.
kSMInputCondAdjustNonLinearWithCNG: Select whether linear or non-linear echo cancellation with comfort noise generation is performed. This adjustment is only valid if the input conditioning is currently kSMInputCondEchoCancelation. The adjust_value field selects which to use, with the value zero selecting linear echo cancellation, and other values selecting non-linear cancellation with comfort noise generation (CNG). In this mode the echo canceller replaces any residual traces of echo with "comfort noise". This is most appropriate to IP gateway applications where background noise is desirable to maintain the illusion of continuity. Negative values represent the threshold in dBm0. Positive values cause the default value of -42dBm0 to be used.
kSMInputCondAdjustAGC: Selects whether or not AGC should be enabled according to the adjust_value field (zero is disable, one is enable). If AGC is disabled, the signal level is modified only to cancel the echo. If AGC is enabled, after the echo has been removed the signal is adjusted by applying a gain which varies in order to amplify weak signals more than strong ones.
kSMInputCondAdjustFixGain: Selects, based on the adjust_value field, whether or not the AGC gain can adapt (0) or is fixed (1) at its current value. This can be used after the AGC gain factor has adjusted itself to prevent further changes.

adjust_value

The value to use for this adjustment. The interpretation of this value depends on the adjustment type being performed with the exception of the value zero. For any particular adjustment type, there is a default which is used if no adjustment has been made. The value zero adjusts the conditioning back to that default.
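
For the two non-linear adjustment types, the interpretation of adjust_value described above can be summarised in code. This is only a sketch of the documented rule: nonlinear_threshold_dbm0() and NLP_DEFAULT_THRESHOLD_DBM0 are hypothetical names, and int stands in for tSM_INT.

```c
/* Default threshold quoted in the text for positive adjust_value. */
#define NLP_DEFAULT_THRESHOLD_DBM0 (-42)

/* Hypothetical helper: returns 0 if adjust_value selects linear echo
 * cancellation, otherwise returns 1 and stores the threshold (in dBm0)
 * that the non-linear mode would use. */
int nonlinear_threshold_dbm0(int adjust_value, int *threshold_dbm0)
{
    if (adjust_value == 0)
        return 0;   /* zero selects linear echo cancellation */

    /* Negative values are the threshold itself; positive values
     * select the default of -42 dBm0. */
    *threshold_dbm0 = (adjust_value < 0) ? adjust_value
                                         : NLP_DEFAULT_THRESHOLD_DBM0;
    return 1;
}
```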

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_WRONG_CHANNEL_STATE - if channel input signal not currently being conditioned.
ERR_SM_BAD_PARAMETER - illegal parameter value

Prosody speech processing: API: sm_condition_adjust_span

Prototype Definition

int sm_condition_adjust_span(struct sm_condition_adjust_span_parms *condp)

Parameters

*condp

a structure of the following type:

typedef struct sm_condition_adjust_span_parms {
	tSMChannelId channel;					/* in */
	tSM_INT span;						/* in */
} SM_CONDITION_ADJUST_SPAN_PARMS;

Description

Adjusts the input conditioning currently being performed on a channel to use the specified span (also called tail length). A side effect of this is that the input conditioning may be re-initialised.

Fields

channel: The channel to which conditioning is being applied.
span: The new span to use (in milliseconds).

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_WRONG_CHANNEL_STATE - if channel input signal not currently being conditioned.

Prosody speech processing: API: sm_condition_input

Prototype Definition

int sm_condition_input(struct sm_condition_input_parms *condp)

Parameters

*condp

a structure of the following type:

typedef struct sm_condition_input_parms {
	tSMChannelId channel;					/* in */
	tSMChannelId reference;					/* in */
	enum kSMInputCondRef {
		kSMInputCondRefNone,
		kSMInputCondRefUseInput,
		kSMInputCondRefUseOutput,
	} reference_type;					/* in */
	enum kSMInputCond {
		kSMInputCondNone,
		kSMInputCondEchoCancelation,
	} conditioning_type;					/* in */
	tSM_INT conditioning_param;				/* in */
	tSMChannelId alt_data_dest;				/* in */
	enum kSMInputCondAltDest {
		kSMInputCondAltDestNone,
		kSMInputCondAltDestInput,
		kSMInputCondAltDestOutput,
	} alt_dest_type;						/* in */
	int ectest_type;					/* in */
	float ectest_gain;					/* in */
	int ectest_delay;					/* in */
	unsigned ectest_flen;					/* in */
	float *ectest_filt;					/* in */
} SM_CONDITION_INPUT_PARMS;

Description

Applies or disables conditioning of the signal input to channel channel with respect to the reference signal on channel reference. The input signal to be conditioned is called the primary. The reference may either be the input to a channel or the output from a channel. In particular, it can be the output from channel (but not its input). Note that Prosody switching functions (such as sm_switch_channel_input() or sm_channel_datafeed_connect()) must not be used on a reference while it is in use.

If input signal conditioning is enabled, the conditioned version of the input is generated and is directed to one of several places.

If alt_dest_type is kSMInputCondAltDestNone then when other speech processing functions such as conferencing or recording are performed on channel, the conditioned version of the input signal will be used.
If alt_dest_type is kSMInputCondAltDestInput then when such speech processing functions are performed on alt_data_dest, the conditioned version of the input signal will be used. (If sm_condition_input() is used on such a channel, the result is undefined).
If alt_dest_type is kSMInputCondAltDestOutput then the conditioned signal appears as the output from the channel alt_data_dest, which may be the same channel as channel.

Note that Prosody switching functions (such as sm_switch_channel_output() or sm_channel_datafeed_connect()) must not be used on the destination while echo cancellation is being performed.

All channels specified by channel, alt_data_dest, and reference will need to be processed by the same module. This can be ensured through the use of sm_channel_alloc_placed().

The two commonest configurations are:

Cancelling the echo from a prompt being played: In this situation you use one channel, playing the prompt on it, using its output as the reference (so reference = channel and reference_type = kSMInputCondRefUseOutput), and recording its input.
Stand alone echo cancellation: Here you want to send the conditioned signal to an external destination, rather than recording it. You need at least two channels because you need two inputs - one for the primary and one for the reference - and one output, while a single Prosody channel can have at most one input and one output. One way to arrange this is to have channel, whose input is the primary, be specified as the output (i.e. alt_data_dest = channel and alt_dest_type = kSMInputCondAltDestOutput), with a different channel being used for reference.
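
The two configurations above might be set up as follows. This is a sketch under stated assumptions: the typedefs and struct are stand-ins copied from the definition above so that the example compiles without the Prosody headers, the underlying types of tSMChannelId and tSM_INT are guessed as int, the two fill_ functions are hypothetical helpers, and the choice of kSMInputCondRefUseInput for the stand-alone case is an assumption. Real code would include the vendor header and pass the filled structure to sm_condition_input().

```c
/* Stand-in declarations (assumptions) mirroring the structure above. */
typedef int tSMChannelId;
typedef int tSM_INT;

enum kSMInputCondRef { kSMInputCondRefNone, kSMInputCondRefUseInput,
                       kSMInputCondRefUseOutput };
enum kSMInputCond { kSMInputCondNone, kSMInputCondEchoCancelation };
enum kSMInputCondAltDest { kSMInputCondAltDestNone, kSMInputCondAltDestInput,
                           kSMInputCondAltDestOutput };

typedef struct sm_condition_input_parms {
    tSMChannelId channel;
    tSMChannelId reference;
    enum kSMInputCondRef reference_type;
    enum kSMInputCond conditioning_type;
    tSM_INT conditioning_param;
    tSMChannelId alt_data_dest;
    enum kSMInputCondAltDest alt_dest_type;
    int ectest_type;
    float ectest_gain;
    int ectest_delay;
    unsigned ectest_flen;
    float *ectest_filt;
} SM_CONDITION_INPUT_PARMS;

/* Prompt echo cancellation: one channel, its own output is the reference. */
void fill_prompt_echo_cancel(SM_CONDITION_INPUT_PARMS *p, tSMChannelId ch)
{
    SM_CONDITION_INPUT_PARMS zero = {0};  /* unused and ectest_* fields stay zero */
    *p = zero;
    p->channel = ch;
    p->reference = ch;
    p->reference_type = kSMInputCondRefUseOutput;
    p->conditioning_type = kSMInputCondEchoCancelation;
    p->alt_dest_type = kSMInputCondAltDestNone;   /* not redirected */
}

/* Stand-alone echo cancellation: the conditioned signal leaves on the
 * output of ch, with a different channel supplying the reference. */
void fill_standalone_echo_cancel(SM_CONDITION_INPUT_PARMS *p,
                                 tSMChannelId ch, tSMChannelId ref)
{
    SM_CONDITION_INPUT_PARMS zero = {0};
    *p = zero;
    p->channel = ch;
    p->reference = ref;
    p->reference_type = kSMInputCondRefUseInput;  /* assumption: reference is an input */
    p->conditioning_type = kSMInputCondEchoCancelation;
    p->alt_data_dest = ch;
    p->alt_dest_type = kSMInputCondAltDestOutput;
}
```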

Note for Prosody S: Not supported by Prosody S.

Fields

channel

The channel on whose input conditioning is to be performed.

reference

The reference channel, if any, which is to be used for conditioning.

reference_type

What sort of reference to use. One of these values:

kSMInputCondRefNone: No reference signal.
kSMInputCondRefUseInput: Use reference input signal.
kSMInputCondRefUseOutput: Use reference output signal.

conditioning_type

The type of conditioning to perform. One of these values:

kSMInputCondNone: Disable input conditioning. This also disables any redirection of a signal such as may have been set up using alt_data_dest.
kSMInputCondEchoCancelation: Remove echo from the input with respect to the specified reference signal. Requires the modules echocan and passthru to have been downloaded.

conditioning_param

Unused.

alt_data_dest

A channel to receive the resulting conditioned signal, if any form of conditioning is enabled.

alt_dest_type

What kind of alternative destination to use. One of these values:

kSMInputCondAltDestNone: Conditioned signal not redirected - conditioned signal replaces signal input to channel.
kSMInputCondAltDestInput: Conditioned signal replaces signal input to alt_data_dest.
kSMInputCondAltDestOutput: Conditioned signal replaces signal being output on alt_data_dest.

ectest_type

For Aculab use only: selects an echo cancellation test mode. If non-zero, creates simulated echo. This is not part of the official API and may be changed arbitrarily.

ectest_gain

For Aculab use only: the amount of signal to mix in as simulated echo. Only used when ectest_type is 1.

ectest_delay

For Aculab use only: the delay (in samples) for the simulated echo. Only used when ectest_type is 1.

ectest_flen

For Aculab use only: the length of the echo generation filter. Only used when ectest_type is 2.

ectest_filt

For Aculab use only: the echo generation filter. Only used when ectest_type is 2.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - illegal parameter value

Prosody speech processing: API: sm_condition_reinit

Prototype Definition

int sm_condition_reinit(tSMChannelId channel)

Parameters

channel: The channel to which conditioning is being applied.

Description

Re-initialises the input conditioning algorithm currently being applied to the signal on the specified input channel.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if channel input signal not currently being conditioned.

Prosody speech processing: API: sm_conf_prim_abort

Prototype Definition

int sm_conf_prim_abort(tSMChannelId channel)

Parameters

channel: The channel on which the conference output has been started which is to be aborted.

Description

Aborts the conference on the specified channel, which will revert to outputting silence.

This function waits for the conference output to be stopped, and is equivalent to calling sm_conf_prim_stop() with a zero nowait field.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if no conference started on channel

Prosody speech processing: API: sm_conf_prim_add

Prototype Definition

int sm_conf_prim_add(struct sm_conf_prim_add_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_add_parms {
	tSMChannelId channel;					/* in */
	tSMChannelId participant;				/* in */
	tSM_INT id;						/* out */
	float factor;						/* in */
} SM_CONF_PRIM_ADD_PARMS;

Description

Adds a new conference participant to the set of input channels whose conferenced sum is currently being output on output channel channel. All channels in a conference must have been allocated on a single Prosody processor module.

The participant must be a channel which has been attached to conferencing with sm_conf_prim_attach() unless the conference type is kSMConfTypeStandard in which case the channel input is implicitly attached if necessary.

On return id will be set to a value which is an identifier for this conference participant. This identifier can be used in the call for the participant to leave the conference (see sm_conf_prim_leave()).

Note that a particular participant input channel is assigned the same id for every conference it is added into (or cloned into) while attached. If a channel is detached and attached again, it may be allocated a different id value. This requires the module inchan to have been downloaded.

If the participant has any kind of tone detection enabled through a call to sm_listen_for() then tones detected will be suppressed from entering the conference. This means that as soon as the detector discovers that a tone is present, this participant will be temporarily suspended from the conference and restored when the tone detector determines that the tone has finished. Note, however, that this may permit a very short initial burst of tone to be audible in the conference as, to keep the transmission latency low, the detector cannot rewind back to the start of a tone. Any such short burst of tone will be shorter than the tone detector's minimum tone criterion.

Fields

channel: The channel on which the conference output has been started to which a new participant is to be added.
participant: The input channel which is to be added to the conference output channel.
id: An identifier which identifies this participant.
factor: This field is present for backwards compatibility and should be zero.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - illegal agc value, or attempt to use channels on different Prosody modules
ERR_SM_WRONG_CHANNEL_STATE - if no conference started on channel

Prosody speech processing: API: sm_conf_prim_adj_input

Prototype Definition

int sm_conf_prim_adj_input(struct sm_conf_prim_adj_input_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_adj_input_parms {
	tSMChannelId channel;					/* in */
	tSM_INT volume;						/* in */
	tSM_INT agc;						/* in */
} SM_CONF_PRIM_ADJ_INPUT_PARMS;

Description

Enable or disable automatic-gain-control/noise-reduction for an input channel which is a conference participant in one or more conference summed output channels.

The volume parameter may be set to the gain (in dB) or to the value kSMConfAdjInputVolumeMute, which causes the input to be completely muted. The range of gain supported is at least +8 to -22 dB.

The default input conference settings for a channel are 0 dB volume adjustment with AGC disabled.

Note: all input settings are lost when the channel is no longer a conference input unless the channel has been explicitly attached for conferencing by calling sm_conf_prim_attach().

Fields

channel: The input channel which is attached to conferencing and which is to be adjusted.
volume: The new value for the volume (in dB).
agc: The new value for the indicator of whether automatic gain control is to be enabled (non-zero) or disabled (zero).

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - unknown participant id or bad agc value
ERR_SM_WRONG_CHANNEL_STATE - if channel not in a conference

Prosody speech processing: API: sm_conf_prim_adj_input_settings

Prototype Definition

int sm_conf_prim_adj_input_settings(struct sm_conf_prim_adj_input_settings_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_adj_input_settings_parms {
	tSMChannelId channel;					/* in */
	float max_level_decay;					/* in */
	float target_level;					/* in */
} SM_CONF_PRIM_ADJ_INPUT_SETTINGS_PARMS;

Description

Note: all input settings are lost when the channel is no longer a conference input unless the channel has been explicitly attached for conferencing by calling sm_conf_prim_attach().

Fields

channel: The input channel which is attached to conferencing and which is to be adjusted.
max_level_decay: The rate at which the AGC decays the estimate of the loudest signal, as a fraction.
target_level: The target output level in dBm0.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if channel not in a conference

Prosody speech processing: API: sm_conf_prim_adj_output

Prototype Definition

int sm_conf_prim_adj_output(struct sm_conf_prim_adj_output_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_adj_output_parms {
	tSMChannelId channel;					/* in */
	tSM_INT volume;						/* in */
	tSM_INT agc;						/* in */
} SM_CONF_PRIM_ADJ_OUTPUT_PARMS;

Description

Adjust output level for conference being output on channel channel. The volume and agc parameters should be set as for sm_conf_prim_start().

Fields

channel: The channel on which the conference output has been started which is to be adjusted.
volume: The new value for the volume (in dB).
agc: The new value for the indicator of whether automatic gain control is to be enabled.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - illegal volume or agc value
ERR_SM_WRONG_CHANNEL_STATE - if no conference started on channel

Prosody speech processing: API: sm_conf_prim_adj_tracking

Prototype Definition

int sm_conf_prim_adj_tracking(struct sm_conf_prim_adj_tracking_parms *trackp)

Parameters

*trackp

a structure of the following type:

typedef struct sm_conf_prim_adj_tracking_parms {
	tSMChannelId channel;					/* in */
	double min_noise_level;					/* in */
	double speech_thresh;					/* in */
} SM_CONF_PRIM_ADJ_TRACKING_PARMS;

Description

Adjusts two parameters for the designated input channel that control the criteria by which the channel is reported as having an active input when it is included as one of the participants in a conference. An input is only added to a conference when it is considered to be active.

The speech detection algorithm assumes a fairly constant level of background noise, over which is the speech. It also assumes that there are some pauses in the speech.

The signal on an incoming timeslot is analysed to produce two measurements that determine the eventual noise threshold. These measurements are Lmin, which is the lowest energy monitored, and Lmax, which is the highest energy monitored. Since the speech is assumed to have pauses, Lmin is the quietest level of noise. To allow for some variation in the level of noise, the noise threshold is set a little above the Lmin level. The signal is assumed to contain speech when it is above this threshold. The exact threshold value used is:

	Lmin + (Lmax - Lmin) * speech_thresh

This means that speech_thresh specifies the proportion of the distance between Lmin and Lmax that the threshold is above Lmin. The diagram illustrates this:

[Diagram: speech threshold]

The default value of speech_thresh is 0.01, which means it raises the threshold above Lmin by 1% of the difference between the loudest and quietest sounds in the signal. To make the detector less sensitive, this value should be increased, though values above 0.03 usually make it too insensitive.

The other adjustable parameter, min_noise_level, specifies the smallest value permitted for Lmin. If the value calculated from the signal is below this, then this value is used instead. This prevents the threshold from being set too low when there is no noise, such as when the caller has muted their phone. The default value for this level is -53 dBm0. To make the detector less sensitive, this value should be increased, though values above -34 usually make it too insensitive.

When speech_thresh is zero, if the signal level is above min_noise_level then the signal is considered to be active. In this case, setting min_noise_level to -90 dBm0 or lower will cause the input to be considered active always.
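
The threshold calculation described above can be written out as a short function. This is a sketch, not the detector's actual implementation: speech_activity_threshold() is a hypothetical helper, and it assumes Lmin, Lmax and min_noise_level are all expressed on the same scale (for example dBm0).

```c
/* Hypothetical helper: returns the level above which the input would be
 * treated as active, following the rules described above. */
double speech_activity_threshold(double lmin, double lmax,
                                 double min_noise_level, double speech_thresh)
{
    /* With speech_thresh of zero, min_noise_level alone decides. */
    if (speech_thresh == 0.0)
        return min_noise_level;

    /* Lmin is never allowed to fall below min_noise_level. */
    double floor_level = (lmin < min_noise_level) ? min_noise_level : lmin;

    /* Lmin + (Lmax - Lmin) * speech_thresh */
    return floor_level + (lmax - floor_level) * speech_thresh;
}
```

With the defaults (min_noise_level = -53, speech_thresh = 0.01), a measured Lmin of -60 is raised to -53, and an Lmax of -10 then places the threshold 1% of the way from -53 towards -10.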

Note: all input settings are lost when the channel is no longer a conference input unless the channel has been explicitly attached for conferencing by calling sm_conf_prim_attach().

Fields

channel: The input channel which has been attached to conferencing and which is to be adjusted.
min_noise_level: The new value for the minimum noise level (in dBm0).
speech_thresh: The new value for the speech threshold ratio.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - illegal volume or agc value
ERR_SM_WRONG_CHANNEL_STATE - if no conference started on channel

Prosody speech processing: API: sm_conf_prim_attach

Prototype Definition

int sm_conf_prim_attach(struct sm_conf_prim_attach_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_attach_parms {
	tSMChannelId channel;					/* in */
	enum kSMConfType conf_type;				/* in */
} SM_CONF_PRIM_ATTACH_PARMS;

Description

Sets up an input channel channel ready to be added as a participant of one or more conferences through calls to sm_conf_prim_add().

The channel input is kept continuously ready for conferencing until sm_conf_prim_detach() is used on it.

Standard conferencing can implicitly attach a channel input for conferencing when sm_conf_prim_add() adds it to the first conference, and the channel is then also implicitly detached when sm_conf_prim_leave() removes it from the last conference. This implicit attaching and detaching does not apply to conferences with individual volume control.

When a channel input is detached (whether explicitly or implicitly), all of the input conference settings are lost (such as the input ID and volume) and all resources used by that input for conferencing are freed. This means that it is usually more convenient to attach explicitly as this allows the input to be set up before it is a participant in any conference and it retains the settings during any period when it is temporarily not a participant in any conference.

Fields

channel

The channel to be used for conference input.

conf_type

The type of conferencing that will be used. One of these values:

kSMConfTypeStandard: Standard conferencing.
kSMConfTypeIndividualVolume: Deprecated: Obsolete mode that permitted individual adjustment of the volume of each participant's contribution to each output.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if already attached

Prosody speech processing: API: sm_conf_prim_clone

Prototype Definition

int sm_conf_prim_clone(struct sm_conf_prim_clone_parms *clonep)

Parameters

*clonep

a structure of the following type:

typedef struct sm_conf_prim_clone_parms {
	tSMChannelId channel;					/* in */
	tSMChannelId model;					/* in */
} SM_CONF_PRIM_CLONE_PARMS;

Description

Sets up an output channel channel on which will be output the same conferenced sum as currently being output on channel model. Each current participant of model is added to the set of participants for channel, and the output volume and AGC values are copied across.

The conferences on channel and model will be completely independent of each other. For instance, if a new participant is added at a later stage to model, it will not be automatically added to channel.

Both channel and model will need to be on the same module.

If model is set to kSMNullChannelId, this call is equivalent to sm_conf_prim_start() with zero volume and agc parameters.

Fields

channel: The channel on which a conference output is to be started.
model: The channel on which a conference output has already been started and which is to serve as a model.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if already in use, or model is not conferencing

Prosody speech processing: API: sm_conf_prim_config_activity_reporting

Prototype Definition

int sm_conf_prim_config_activity_reporting(struct sm_conf_prim_config_activity_reporting_parms *activityp)

Parameters

*activityp

a structure of the following type:

typedef struct sm_conf_prim_config_activity_reporting_parms {
	tSMChannelId channel;					/* in */
	tSM_UT32 delay;						/* in */
	tSM_UT32 sensitivity;					/* in */
} SM_CONF_PRIM_CONFIG_ACTIVITY_REPORTING_PARMS;

Description

Configures the active speaker reporting for a conference output. Once configured, a conference output will report changes in the active inputs via sm_conf_prim_status(). The delay is the minimum time between these reports. Specifying a delay of zero will disable active input reporting. Reports will be generated if the ranking of the active inputs changes or the measured input power varies significantly. The sensitivity determines how much the power must have changed by before a report is generated. The valid range is 0 to 100. A value of 100 will cause any change in the input power to produce a report.

The activity reporting configuration is not copied to conferences created with sm_conf_prim_clone(). Activity reporting is disabled by default.

Fields

channel: The channel which is a conference output.
delay: The minimum time (in ms) between active input reports.
sensitivity: The sensitivity to input power variation (0 to 100).

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_conf_prim_detach

Prototype Definition

int sm_conf_prim_detach(struct sm_conf_prim_detach_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_detach_parms {
	tSMChannelId channel;					/* in */
	enum kSMConfType conf_type;				/* in */
} SM_CONF_PRIM_DETACH_PARMS;

Description

Detaches the channel input from conferencing. The channel must have previously been attached with sm_conf_prim_attach() and not be a participant in any conference.

Fields

channel

The channel to be detached.

conf_type

The type of conferencing to detach. One of these values:

kSMConfTypeStandard: Standard conferencing.
kSMConfTypeIndividualVolume: Deprecated: Obsolete mode that permitted individual adjustment of the volume of each participant's contribution to each output.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if not attached or in a conference

Prosody speech processing: API: sm_conf_prim_info

Prototype Definition

int sm_conf_prim_info(struct sm_conf_prim_info_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_info_parms {
	tSMChannelId channel;					/* in */
	tSM_INT participant_count;				/* out */
	char speakers[8];					/* out */
} SM_CONF_PRIM_INFO_PARMS;

Description

Returns information regarding the conference currently being output on channel channel.

On return, the parameter participant_count is set to the number of input channels being summed together in order to produce the conferenced output, and speakers is a bit mask with a bit set for each participating input channel in the conference which is currently active. Bits set in speakers correspond to the participant ids returned by sm_conf_prim_add(), with bit b of speakers[N] corresponding to participant id b + 8 * N. Note that the speakers field is always zero on Prosody X.
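The relationship between participant ids and the speakers bit mask can be sketched as a small lookup helper. This is only an illustration and not part of the Prosody API: the name speaker_is_active is hypothetical, and the sketch assumes that bit 0 is the least significant bit of each octet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Prosody API call): returns 1 if participant
 * `id` is flagged in an 8-octet speakers bit mask, otherwise 0.
 * Assumption: bit 0 is the least significant bit of each octet. */
int speaker_is_active(const char speakers[8], int id)
{
    assert(speakers != NULL);
    if (id < 0 || id >= 64) {
        return 0;   /* outside the range the mask can represent */
    }
    /* id = b + 8 * N, so N = id / 8 and b = id % 8 */
    return ((unsigned char)speakers[id / 8] >> (id % 8)) & 1;
}
```

An application would pass the speakers array filled in by sm_conf_prim_info() together with an id previously returned by sm_conf_prim_add(). On Prosody X the mask is always zero, so such a helper would report every participant as inactive.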

Fields

channel: The conference output channel about which information is required.
participant_count: The number of active (i.e. non silent) participants in this conference.
speakers: A bitmap of participants, indicating which are active. This functionality is not available on Prosody X.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if no conference started on channel

Prosody speech processing: API: sm_conf_prim_leave

Prototype Definition

int sm_conf_prim_leave(struct sm_conf_prim_leave_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_leave_parms {
	tSMChannelId channel;					/* in */
	tSM_INT id;						/* in */
} SM_CONF_PRIM_LEAVE_PARMS;

Description

Removes a conference participant (identified by id) from the set of input channels whose conferenced sum is currently being output on the output channel channel.

The parameter id should be the value assigned to this conference participant in an earlier call to sm_conf_prim_add().

Fields

channel: The conference output channel from which a participant is to be removed.
id: The identifier of the participant to be removed from the conference.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - unknown participant id
ERR_SM_WRONG_CHANNEL_STATE - if no conference started on channel

Prosody speech processing: API: sm_conf_prim_start

Prototype Definition

int sm_conf_prim_start(struct sm_conf_prim_start_parms *confp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_start_parms {
	tSMChannelId channel;					/* in */
	tSM_INT volume;						/* in */
	tSM_INT agc;						/* in */
	enum kSMConfType {
		kSMConfTypeStandard,
		kSMConfTypeIndividualVolume,
	} conf_type;						/* in */
} SM_CONF_PRIM_START_PARMS;

Description

Sets up an output channel channel on which will be output the conferenced sum of all participating input channels (each participant is added to the conference through a call to sm_conf_prim_add()). The volume and agc parameters control the output level, and are specified as for sm_replay_start().

The channel and all the participating input channels will all need to be processed by the same module. This can be ensured by using sm_channel_alloc_placed().

This requires the module conf to have been downloaded.

The channel output is reserved for conferencing until sm_conf_prim_abort() or sm_conf_prim_stop() stops the channel output from being used. No other output activity can take place on the channel during this time.

Fields

channel

The channel on which a conference output is to be started.

volume

The volume adjustment (in dB).

agc

Indicator of whether automatic gain control is to be enabled (non-zero) or not (zero).

conf_type

The type of conferencing that will be used. One of these values:

kSMConfTypeStandard: Standard conferencing.
kSMConfTypeIndividualVolume: Deprecated: Obsolete mode that permitted individual adjustment of the volume of each participant's contribution to each output.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - illegal volume/agc value
ERR_SM_WRONG_CHANNEL_STATE - if already in use

Prosody speech processing: API: sm_conf_prim_status

Prototype Definition

int sm_conf_prim_status(struct sm_conf_prim_status_parms *statusp)

Parameters

a structure of the following type:

typedef struct sm_conf_prim_status_parms {
	tSMChannelId channel;					/* in */
	enum kSMConfStatus {
		kSMConfStatusRunning,
		kSMConfStatusStopped,
		kSMConfStatusActiveInputs,
	} status;						/* out */
	union {
		struct {
			struct conf_active_input {
				tSM_INT id;			/* out */
				tSM_INT power;			/* out */
			} input[4];				/* out */
		} active_inputs;				/* out */
	} u;							/* out */
} SM_CONF_PRIM_STATUS_PARMS;

Description

Returns the current status of the conference or an error to indicate a problem.

When the write event is signalled the user must call this function to determine the nature of the status change.

Fields

channel

The conference channel to interrogate.

status

The current status of the conference. One of these values:

kSMConfStatusRunning: Indicates that there is nothing significant to report.
kSMConfStatusStopped: Indicates that the conference output has been stopped.
kSMConfStatusActiveInputs: Indicates that there is active input information in the active_inputs field.

u

Additional information relating to the current status of the conference

active_inputs

This field is only valid if the status is kSMConfStatusActiveInputs.

input

The most significant inputs in order of significance.

id: The input identifier. An id of -1 means that none of the inputs are ranked in this position.
power: A measure of the input power level.
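As a rough sketch of how the input array might be consumed when the status is kSMConfStatusActiveInputs, the helper below counts the positions that hold a ranked input. The typedef and struct are local stand-ins that mirror the definition above so the example compiles on its own; a real application would use the Prosody headers instead. The function name count_ranked_inputs is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int tSM_INT;            /* stand-in for the real Prosody type */

struct conf_active_input {      /* mirrors the struct in the definition above */
    tSM_INT id;
    tSM_INT power;
};

/* Hypothetical helper: counts the positions in input[0..3] that hold a
 * ranked input, i.e. whose id is not -1. */
int count_ranked_inputs(const struct conf_active_input input[4])
{
    int count = 0;
    assert(input != NULL);
    for (int i = 0; i < 4; i++) {
        if (input[i].id != -1) {
            count++;
        }
    }
    return count;
}
```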

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_conf_prim_stop

Prototype Definition

int sm_conf_prim_stop(struct sm_conf_prim_stop_parms *stopp)

Parameters

*stopp

a structure of the following type:

typedef struct sm_conf_prim_stop_parms {
	tSMChannelId channel;					/* in */
	tSM_UT32 no_wait;					/* in */
} SM_CONF_PRIM_STOP_PARMS;

Description

Stops the conference on the specified channel, which will revert to outputting silence.

Fields

channel: The conference output channel that is to be stopped.
no_wait: Indicates if the function should return without waiting for the conference to stop. In this case the function sm_conf_prim_status() must be used to determine when the conference has stopped.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if no conference started on channel

Prosody speech processing: API: sm_discard_recognised

Prototype Definition

int sm_discard_recognised(tSMChannelId channel)

Parameters

channel: The channel for which recognition results are to be discarded.

Description

This call discards all buffered but as yet uncollected (by sm_get_recognised()) recognised items for the channel channel.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_get_recognised

Prototype Definition

int sm_get_recognised(struct sm_recognised_parms *recogp)

Parameters

*recogp

a structure of the following type:

typedef struct sm_recognised_parms {
	tSMChannelId channel;					/* inout */
	enum kSMRecognition {
		kSMRecognisedNothing,
		kSMRecognisedTrainingDigit,
		kSMRecognisedDigit,
		kSMRecognisedTone,
		kSMRecognisedCPTone,
		kSMRecognisedGruntStart,
		kSMRecognisedGruntEnd,
		kSMRecognisedASRResult,
		kSMRecognisedASRUncertain,
		kSMRecognisedASRRejected,
		kSMRecognisedASRTimeout,
		kSMRecognisedCatSig,
		kSMRecognisedOverrun,
		kSMRecognisedANS,
		kSMRecognisedBeep,
		kSMRecognisedOnHook,
	} type;							/* out */
	tSM_INT param0;						/* out */
	tSM_INT param1;						/* out */
} SM_RECOGNISED_PARMS;

Description

This call, typically invoked in response to a recognition event being signalled, allows an application to determine what item, if any, was detected. This includes simple tones, call-progress tones and grunts.

In order to poll a specific input channel, the application should set channel to specify the input channel concerned. On successful completion, the type parameter will have been set to indicate the status of detections on that channel.

If the type returned is kSMRecognisedTone, then param0 and param1 may be used to determine the two component frequencies that together made up the recognised simple tone. Normally param0 will be the zero-based index into the set of band 1 frequencies of the active tone set, and param1 will be the zero-based index into the set of band 2 frequencies of the active tone set. For example, if there are 4 frequencies in band 1 for the active tone set, param0 may have any value between 0 and 3. Note that it does not reflect the actual id of the input frequency, just its offset in the enumerated band 1 set of input frequencies.

When the band 2 set of frequencies is empty in the active tone set then param1 will be the zero based index into the set of band 1 frequencies of the active tone set, and param0 will be zero.

However if a tone detection mode of type kSMToneLen... was specified in sm_listen_for() then param0 will contain identifiers for the two component frequencies packed into a single integer as follows:

param0 = normal-param0 + 256 * normal-param1

and param1 will contain the duration in milliseconds of the detected tone (granularity of 32 mS).
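The packing formula above can be reversed with integer division and remainder. The following is a minimal sketch, assuming both indices are non-negative and below 256 (which the formula implies but the text does not state); unpack_tone_param0 is a hypothetical name, not a Prosody function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: splits a packed param0 (as reported in the
 * kSMToneLen... modes) back into the two indices it was built from.
 * Assumption: both indices are in the range 0..255. */
void unpack_tone_param0(int param0, int *band1_index, int *band2_index)
{
    assert(band1_index != NULL && band2_index != NULL);
    *band1_index = param0 % 256;   /* normal-param0 */
    *band2_index = param0 / 256;   /* normal-param1 */
}
```

For instance, a packed value built from normal-param0 = 2 and normal-param1 = 3 is 2 + 256 * 3 = 770, which this helper splits back into 2 and 3.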

If the type returned is kSMRecognisedCPTone, then no part of the call-progress tone being reported can be recognised as part of a later call-progress tone, but any signal after the call-progress tone will be analysed and may trigger recognition of another call-progress tone. For example, if a ringing signal is being received, and this matches a cadence in the call-progress table, then each complete cadence of ringing received will be reported as a separate call-progress tone.

This function can also be used for 'any channel' operation. This mode of operation is a legacy feature and is not recommended for new applications. See Prosody TiNG: any channel operation for more details.

This function may report that nothing has been detected even if a wait done on an event associated with this channel has woken up. This is because sm_get_recognised() has decided that, although something happened, it was not one of the events which is 'interesting'. This is typically noticed when tone detection has been enabled, which will wake the event periodically (between about once per second to once per minute) to keep the library informed of the channel status. These extra wakeups only cause a tone to be reported if the current status is that a continuous tone is being received and this matches a tone with unlimited duration.

Fields

channel

The channel on which recognition is being checked.

type

The recognition result. One of these values:

kSMRecognisedNothing: No digit, simple tone or call-progress tone has been recognised.
kSMRecognisedTrainingDigit
kSMRecognisedDigit: A pulse dialled or DTMF dialled digit has been recognised and a character representation for it has been stored in param0. In param1 will be an indication of the digit type (kSMPulseDigits or kSMDTMFDigits) unless a tone detection mode of type kSMToneLen... was specified in which case it will contain the duration in milliseconds of the detected DTMF digit.
kSMRecognisedTone: A simple tone has been recognised from the active set of input tones for the channel. The parameters param0 and param1 are assigned values as described above.
kSMRecognisedCPTone: A call-progress tone has been recognised and the corresponding identifier has been stored in param0.
kSMRecognisedGruntStart: The beginning of a grunt has been detected. param0 is set to the duration of the preceding silence in milliseconds.
kSMRecognisedGruntEnd: The end of a grunt has been detected. param0 is set to the grunt duration in milliseconds, and param1 to the grunt average energy in negative dBm0 (the average is calculated only over periods during which signal is present).
kSMRecognisedASRResult: Obsolete
kSMRecognisedASRUncertain: Obsolete
kSMRecognisedASRRejected: Obsolete
kSMRecognisedASRTimeout: Obsolete
kSMRecognisedCatSig: A signal has been categorised. The parameter param0 indicates the algorithm id (see sm_catsig_listen_for()) and param1 is a value indicating the signal category with respect to this algorithm (e.g. live speaker or answer machine).
kSMRecognisedOverrun: The recognition FIFO has been overrun because it has not been polled frequently enough through calls to sm_get_recognised().
kSMRecognisedANS: An ANS or ANSam tone has been detected (see sm_ans_listen_for()). The parameter param0 will describe the tone detected: 0 for end of tone, 1 for an ordinary ANS tone, or 2 for the modulated ANSam tone. The parameter param1 will be non-zero if the ANS or ANSam tone has phase reversals.
kSMRecognisedBeep: A beep has been recognised. The parameter param0 will be the beep frequency and param1 will be zero at the start of the beep and non-zero when the beep ends.
kSMRecognisedOnHook: An 'on-hook' state has been recognised.

param0

A parameter giving details of what was detected. The interpretation of this depends on the type field.

param1

Another parameter giving details of what was detected. The interpretation of this also depends on the type field.
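As an illustration of how param0 and param1 depend on the type field, the sketch below decodes the values documented for kSMRecognisedANS and kSMRecognisedBeep. All three function names are hypothetical, and the "unknown" result for undocumented param0 values is an assumption of this sketch rather than documented behaviour.

```c
/* Hypothetical helper: names the tone reported in param0 for a
 * kSMRecognisedANS result. */
const char *ans_tone_name(int param0)
{
    switch (param0) {
    case 0:  return "end of tone";
    case 1:  return "ANS";
    case 2:  return "ANSam";
    default: return "unknown";   /* not a documented value */
    }
}

/* For kSMRecognisedANS, a non-zero param1 means phase reversals were seen. */
int ans_has_phase_reversals(int param1)
{
    return param1 != 0;
}

/* For kSMRecognisedBeep, param1 is zero at the start of the beep and
 * non-zero when the beep ends. */
int beep_has_ended(int param1)
{
    return param1 != 0;
}
```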

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_SUCH_CHANNEL - if channel was invalid or no longer allocated, or call invoked in any channel mode when no suitable channels

Prosody speech processing: API: sm_get_recorded_data

Prototype Definition

int sm_get_recorded_data(struct sm_ts_data_parms *datap)

Parameters

a structure of the following type:

typedef struct sm_ts_data_parms {
	tSMChannelId channel;					/* in */
	char *data;						/* in */
	tSM_INT length;						/* out */
} SM_TS_DATA_PARMS;

Description

This call retrieves a buffer of data recorded by a channel.

Before making a call to this function, the application should set the data parameter to point to a buffer of capacity kSMMaxRecordDataBufferSize octets. The channel from which data is to be fetched must be specified by the channel field.

On return from a successful invocation of this call, if any data was available for collection, channel will indicate the input channel concerned, and length will be set to the number of octets of valid data written by the device driver into the buffer data. The error ERR_SM_NO_DATA_AVAILABLE is never generated. Either the length returned is zero or the function blocks until some data is available.

Fields

channel: The channel which is recording.
data: The recorded data.
length: The amount of recorded data in data.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_DATA_AVAILABLE - not enough data recorded yet to give to application
ERR_SM_NO_RECORD_IN_PROGRESS - no recording currently in progress on timeslot

Prosody speech processing: API: sm_listen_for

Prototype Definition

int sm_listen_for(struct sm_listen_for_parms *listenp)

Parameters

a structure of the following type:

typedef struct sm_listen_for_parms {
	tSMChannelId channel;					/* in */
	enum kSMToneDetection {
		kSMToneDetectionNone,
		kSMToneDetectionNoMinDuration,
		kSMToneDetectionMinDuration64,
		kSMToneDetectionMinDuration40,
		kSMToneEndDetectionNoMinDuration,
		kSMToneEndDetectionMinDuration64,
		kSMToneEndDetectionMinDuration40,
		kSMToneLenDetectionNoMinDuration,
		kSMToneLenDetectionMinDuration64,
		kSMToneLenDetectionMinDuration40,
		kSMToneDetectionAsListenFor,
	} tone_detection_mode;					/* in */
	tSM_INT active_tone_set_id;				/* in */
	enum kSMDigitMapping {
		kSMNoDigitMapping,
		kSMDTMFToneSetDigitMapping,
	} map_tones_to_digits;					/* in */
	tSM_INT enable_cptone_recognition;			/* in */
	tSM_INT enable_grunt_detection;				/* in */
	tSM_INT grunt_latency;					/* in */
	double min_noise_level;					/* in */
	double grunt_threshold;					/* in */
	tSM_UT32 grunt_holdoff;					/* in */
} SM_LISTEN_FOR_PARMS;

Description

This call controls the simple tones, call-progress tones and digits that may be recognised on the specified channel channel.

It may be called at any time to alter the tone and digit recognition properties for a particular channel.

Contact Aculab technical support for details of an application library which can detect pulse-dialled digits.

The parameters tone_detection_mode and active_tone_set_id determine if and by what criteria simple tones are recognised on the input channel. If tone detection is enabled then any simple tone that occurs on the channel and that meets the recognition criteria will be notified to the application.

In order to be recognised, a tone must be a member of the input tone set active_tone_set_id (see Prosody speech processing: pre-loaded input tones for predefined tone sets, and sm_add_input_tone_set() for application defined tone sets). It must also fulfil the criteria for the specified mode (see Prosody speech processing: pre-loaded input tones for more details).

When a tone is recognised, the recognition event associated with the channel is set and the application can then retrieve a tone identifier for the recognised tone by calling sm_get_recognised(). However if map_tones_to_digits is set to a value of kSMDTMFToneSetDigitMapping then when a tone occurs on the channel corresponding to a DTMF digit, sm_get_recognised() reports the digit directly, with the mapping between DTMF tones and DTMF digits already done.

If enable_cptone_recognition is set to a non-zero value, then any call-progress tone that occurs on the channel and that corresponds to a member of the set of call-progress tones currently recognisable by the module will be notified to the application. See Prosody speech processing: pre-loaded call-progress tones, for the default set of call-progress tones recognisable by the module. To alter the default set of recognisable call-progress tones, see the calls sm_reset_input_cptones() and sm_add_input_cptone().

Note that call-progress tone detection may not be used simultaneously with tone or digit detection on the same channel.

If enable_grunt_detection is set to a non-zero value, then the application will be notified when the signal energy on the input channel goes above an adaptive threshold, which is grunt_threshold above the estimated ambient background noise level. The application will be notified again when this signal burst ends. The grunt_latency parameter, if non-zero, enables holding back of the "end of grunt" notification by grunt_latency milliseconds so if the signal restarts during this period, a premature "end of grunt" notification is not given. The grunt detection algorithm makes the assumption that there is activity on the line at initialisation. Therefore, the first notification will always be an "end of grunt". If the line is silent when grunt detection is enabled, an "end of grunt" notification will happen within grunt_latency milliseconds from the start. For natural speech grunt_latency should be set to 1000 milliseconds or longer.

If a recognition event has been previously associated with channel (see sm_channel_set_event()), then the driver will notify the application with that event whenever one of the above is recognised on the input channel.

Fields

channel

The channel on which to listen.

tone_detection_mode

The tone detection to enable, if any. If tone detection is to be used, the module td must have been downloaded. One of these values:

kSMToneDetectionNone: Simple tones are never recognised.
kSMToneDetectionNoMinDuration: Simple tone detection enabled, no minimum period. If the correct frequencies are detected with the correct signal to noise ratio, twist, etc. for however short a duration, the tone is considered to be present and is recognised.
kSMToneDetectionMinDuration64: Simple tone detection enabled, tone must be valid for a minimum period to be detected. If the tone is valid for 64mS it will definitely be detected. Tones of shorter duration between 32mS and 64mS may be detected but cannot be guaranteed. The minimum duration of a tone can be increased by setting the parameter kAdjustToneSetIntParamIdMinOnTime with sm_adjust_input_tone_set().
kSMToneDetectionMinDuration40: This mode uses a slightly more complex algorithm for analysing the duration of a valid tone, and enables robust detection of tones with duration as short as 40mS.
kSMToneEndDetectionNoMinDuration: This mode is like kSMToneDetectionNoMinDuration but the application is notified when the end of the tone is detected.
kSMToneEndDetectionMinDuration64: This mode is like kSMToneDetectionMinDuration64 but the application is notified when the end of the tone is detected.
kSMToneEndDetectionMinDuration40: This mode is like kSMToneDetectionMinDuration40 but the application is notified when the end of the tone is detected.
kSMToneLenDetectionNoMinDuration: This mode is like kSMToneEndDetectionNoMinDuration but returns additional tone duration information to the application.
kSMToneLenDetectionMinDuration64: This mode is like kSMToneEndDetectionMinDuration64 but returns additional tone duration information to the application.
kSMToneLenDetectionMinDuration40: This mode is like kSMToneEndDetectionMinDuration40 but returns additional tone duration information to the application.
kSMToneDetectionAsListenFor: This mode is only valid when specified in the parameters for sm_record_start() and a tone detection mode is currently active on the same channel, started by sm_listen_for(). Any tones detected on the same channel as the recording will be eliminated from the recorded data.

active_tone_set_id

The tone set for tone detection.

map_tones_to_digits

Indicator of whether tone detection should convert the result into a digit or not. One of these values:

kSMNoDigitMapping: Report tone IDs.
kSMDTMFToneSetDigitMapping: Report tones as digit codes.

enable_cptone_recognition

Indicator of whether detection of call-progress tones is to be enabled (non-zero) or not (zero).

enable_grunt_detection

Indicator of whether grunt detection is to be enabled (non-zero) or not (zero). If enabled, this requires the module grunt to have been downloaded.

grunt_latency

The duration of silence (in mS) required before a signal is considered to be silent.

min_noise_level

The minimum level, in dBm0, that the noise estimate of the grunt detector may reach. The default is -55 dBm0. Only used if enable_grunt_detection is non-zero. Requires the module grunt.

grunt_threshold

The threshold, in dB, above the noise estimate of the grunt detector at which a signal is considered present. The default is 15 dB. Only used if min_noise_level is non-zero. Requires the module grunt.

grunt_holdoff

The period, in ms, following start of speech, to disable updating the estimate of the background noise energy (a non-zero period, typically 1000ms, can be required when long periods of uninterrupted speech are expected). Requires the module grunt.
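The way min_noise_level and grunt_threshold combine can be illustrated with a little arithmetic. The sketch below is one reading of the field descriptions above, not a statement of how the grunt module computes its threshold internally: it assumes the noise estimate is simply limited from below by min_noise_level and that grunt_threshold is then added. The function name grunt_trigger_level is hypothetical.

```c
/* Hypothetical illustration: the signal level (in dBm0) above which a
 * signal would be considered present, given a noise estimate in dBm0.
 * Assumption: the noise estimate is limited from below by
 * min_noise_level, and grunt_threshold (in dB) is added to the result. */
double grunt_trigger_level(double noise_estimate,
                           double min_noise_level,
                           double grunt_threshold)
{
    double limited = noise_estimate < min_noise_level
                         ? min_noise_level
                         : noise_estimate;
    return limited + grunt_threshold;
}
```

Under these assumptions and with the documented defaults (-55 dBm0 and 15 dB), the trigger level on a very quiet line would never fall below -40 dBm0, while a noise estimate of -48 dBm0 would give a trigger level of -33 dBm0.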

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - illegal tone or digit mode

Prosody speech processing: API: sm_onhook_listen_for

Prototype Definition

int sm_onhook_listen_for(struct sm_onhook_listen_for_parms *listenp)

Parameters

a structure of the following type:

typedef struct sm_onhook_listen_for_parms {
	tSMChannelId channel;					/* in */
	tSM_INT enable;						/* in */
	double pre_pulse_max_power;				/* in */
	double pulse_min_power;					/* in */
	double post_pulse_floor;				/* in */
	double latitude;					/* in */
	double count_ratio;					/* in */
	double total_ratio;					/* in */
	tSM_INT max_duration;					/* in */
} SM_ONHOOK_LISTEN_FOR_PARMS;

Description

When the detector has determined that the analogue telephone has gone 'on-hook' the recognition event associated with the channel is set and a subsequent call to sm_get_recognised() will return the state kSMRecognisedOnHook.

The module onhook is required.

Fields

channel: The channel on which to listen.
enable: Indicator of whether on-hook detection is to be enabled (non-zero) or disabled (zero).
pre_pulse_max_power: The pre pulse maximum power (in dBm0). If zero is specified, the default of -48.0 is used.
pulse_min_power: The pulse minimum power (in dBm0). If zero is specified, the default of -18.0 is used.
post_pulse_floor: The post pulse power floor (in dBm0). If zero is specified, the default of -42.0 is used.
latitude: The convergence latitude (in dB). If zero is specified, the default of -6.0 is used.
count_ratio: The minimum count ratio. If zero is specified, the default of 4.0 is used.
total_ratio: The minimum total ratio. If zero is specified, the default of 4.0 is used.
max_duration: The maximum post pulse duration (in mS). If zero is specified, the default of 60 is used.
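The "zero selects the default" convention described for each field can be sketched as follows. The struct onhook_settings is a local stand-in containing only the tunable fields (it is not the real SM_ONHOOK_LISTEN_FOR_PARMS), and the function names are hypothetical; the default values are those listed above.

```c
#include <assert.h>
#include <stddef.h>

typedef int tSM_INT;            /* stand-in for the real Prosody type */

struct onhook_settings {        /* stand-in holding only the tunable fields */
    double pre_pulse_max_power;
    double pulse_min_power;
    double post_pulse_floor;
    double latitude;
    double count_ratio;
    double total_ratio;
    tSM_INT max_duration;
};

/* Returns def when value is zero, otherwise value. */
static double or_default(double value, double def)
{
    return value == 0.0 ? def : value;
}

/* Hypothetical helper: computes the settings that would take effect
 * according to the documented defaults. */
void onhook_effective_settings(const struct onhook_settings *in,
                               struct onhook_settings *out)
{
    assert(in != NULL && out != NULL);
    out->pre_pulse_max_power = or_default(in->pre_pulse_max_power, -48.0);
    out->pulse_min_power     = or_default(in->pulse_min_power, -18.0);
    out->post_pulse_floor    = or_default(in->post_pulse_floor, -42.0);
    out->latitude            = or_default(in->latitude, -6.0);
    out->count_ratio         = or_default(in->count_ratio, 4.0);
    out->total_ratio         = or_default(in->total_ratio, 4.0);
    out->max_duration        = in->max_duration == 0 ? 60 : in->max_duration;
}
```

One consequence of this convention is that a value of exactly zero cannot be requested for any of these fields, since zero always selects the default.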

Returns

0 if call completed successfully, otherwise a standard error.

Prosody speech processing: API: sm_play_cptone

Prototype Definition

int sm_play_cptone(struct sm_play_cptone_parms *cptonep)

Parameters

*cptonep

a structure of the following type:

typedef struct sm_play_cptone_parms {
	tSMChannelId channel;					/* in */
	tSM_UT32 duration;					/* in */
	tSM_INT wait_for_completion;				/* in */
	enum kSMPlayCPToneType {
		kSMPlayCPToneTypeOneShot,
		kSMPlayCPToneTypeRepeat,
		kSMPlayCPToneTypeContinuous,
	} type;							/* in */
	tSM_INT tone_count;					/* in */
	struct sm_cadence {
		tSM_INT tone_id;				/* in */
		tSM_INT on_cadence;				/* in */
		tSM_INT off_cadence;				/* in */
	} cadences[kSMMaxPlayCPToneCadences];			/* in */
} SM_PLAY_CPTONE_PARMS;

Description

This call allows an application to generate a call-progress tone on specified output channel channel.

If the call-progress tone is to be output continuously (or until interrupted by sm_play_cptone_abort()), the parameter duration should be set to zero. Otherwise duration should be set to the required call-progress tone duration in milliseconds (the duration parameter is ignored if type is kSMPlayCPToneTypeOneShot).

Each element of cadences specifies an on-period on_cadence and an off-period off_cadence, both specified in milliseconds, and also a tone_id referring to one of the module's currently defined simple output tones (see Prosody speech processing: pre-loaded output tones, for the list of ids for output tones downloaded with module firmware, and see the description of sm_add_output_tone() for how an application may define its own simple output tones). Here are some examples of call-progress tones which use kSMPlayCPToneTypeRepeat:

Name	tone_count	pos	tone_id	on_cadence	off_cadence
U.K. ring tone	2	0	17	400	208
U.K. ring tone	2	1	17	400	2000
U.K. busy	1	0	16	384	384
E.C. busy	1	0	18	512	512
S.I.T.	3	0	19	336	32
S.I.T.	3	1	20	336	32
S.I.T.	3	2	21	336	1008
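To make the table concrete, the sketch below expresses the U.K. ring tone as an array of cadence elements and adds up the length of one repetition. The typedef and struct are local stand-ins mirroring the definition above so that the example compiles on its own; cadence_cycle_ms and uk_ring_tone are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef int tSM_INT;            /* stand-in for the real Prosody type */

struct sm_cadence {             /* mirrors the struct in the definition above */
    tSM_INT tone_id;
    tSM_INT on_cadence;
    tSM_INT off_cadence;
};

/* U.K. ring tone from the table: two bursts of tone 17 per repetition. */
const struct sm_cadence uk_ring_tone[2] = {
    { 17, 400, 208 },
    { 17, 400, 2000 },
};

/* Hypothetical helper: total length in milliseconds of one pass through
 * the first tone_count elements of cadences. */
long cadence_cycle_ms(const struct sm_cadence *cadences, int tone_count)
{
    long total = 0;
    assert(cadences != NULL || tone_count == 0);
    for (int i = 0; i < tone_count; i++) {
        total += cadences[i].on_cadence + cadences[i].off_cadence;
    }
    return total;
}
```

One repetition of the U.K. ring tone therefore lasts 400 + 208 + 400 + 2000 = 3008 milliseconds, which may help when choosing a duration that covers a whole number of repetitions.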

The wait_for_completion flag may be set by the application, in which case the API call will not complete until the tone has been completely output. However, no other Prosody API function can be performed on the channel during this waiting period. Setting this flag is not useful when the tone has been specified as a continuous tone with no fixed duration, since there would then be no way to stop the tone. See the document Prosody application note: waiting for completion for examples of how to wait without blocking other functions.

Alternatively the application can wait to be notified by an event that tone generation of a given duration has completed. When a write event has been associated with channel (see sm_channel_set_event()), then the driver will notify the application with that event whenever it needs to invoke sm_play_cptone_status().

This requires the module tonegen to have been downloaded.

The channel is reserved for playing the tone until the API has reported completion. If the wait_for_completion flag is set, then the API considers that completion has been reported when this API function returns, otherwise completion is reported only by sm_play_cptone_status() returning the status kSMPlayCPToneStatusComplete. In this case the application must call sm_play_cptone_status() periodically and should use an event on the channel to notify it when to check the status. No other output activity can take place on the channel until the completion of the tone has been reported. Note that the event itself does not indicate completion of the tone. It is possible for the event to be signalled even if the tone has not yet completed, so it is essential that the application checks the status and continues waiting if the tone has not completed.
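The wait-then-check pattern described above can be sketched as a loop that treats the event only as a prompt to check the status. To keep the example self-contained, the event wait and the status query are passed in as function pointers and exercised with a fake channel. Every name here is hypothetical: in a real application, get_status would wrap sm_play_cptone_status() and wait_event would wrap whatever wait mechanism is used for the channel's write event.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the two values of kSMPlayCPToneStatus. */
enum tone_status { TONE_COMPLETE, TONE_ONGOING };

typedef int (*wait_event_fn)(void *ctx);
typedef int (*get_status_fn)(void *ctx, enum tone_status *status);

/* Waits for the event, then checks the status, and repeats until the
 * status is "complete". Returns 0 on completion or the first non-zero
 * error code from either callback. */
int wait_for_tone_completion(wait_event_fn wait_event,
                             get_status_fn get_status,
                             void *ctx)
{
    assert(wait_event != NULL && get_status != NULL);
    for (;;) {
        enum tone_status status;
        int rc = wait_event(ctx);
        if (rc != 0) {
            return rc;
        }
        rc = get_status(ctx, &status);
        if (rc != 0) {
            return rc;
        }
        if (status == TONE_COMPLETE) {
            return 0;
        }
        /* Still ongoing: the event was signalled early, so wait again. */
    }
}

/* Fake channel for demonstration only: reports "ongoing" until
 * polls_until_done status checks have been made. */
struct fake_channel {
    int polls_until_done;
    int polls_made;
};

int fake_wait_event(void *ctx)
{
    (void)ctx;      /* the fake event is always signalled immediately */
    return 0;
}

int fake_get_status(void *ctx, enum tone_status *status)
{
    struct fake_channel *ch = ctx;
    ch->polls_made++;
    *status = ch->polls_made >= ch->polls_until_done ? TONE_COMPLETE
                                                     : TONE_ONGOING;
    return 0;
}
```

The loop stops checking as soon as completion has been reported, in keeping with the point that the channel is no longer reserved for the tone once completion has been reported.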

Note that the only way to stop a continuous tone is by calling sm_play_cptone_abort().

Fields

channel

The channel on which the tone is to be played.

duration

The duration of the tone (mS), or the value zero for indefinite length. This parameter is ignored if type is kSMPlayCPToneTypeOneShot.

wait_for_completion

An indicator of whether this function should return as soon as it has set up the generation of the tone (zero), or wait until the end of the tone (non-zero). If this value is zero, the application must have an event associated with this channel and invoke sm_play_cptone_status() whenever the event is set.

type

The style of playing to use. One of these values:

kSMPlayCPToneTypeOneShot: Output sequence of tone_count cadences just once.
kSMPlayCPToneTypeRepeat: Repeatedly output sequence of tone_count cadences.
kSMPlayCPToneTypeContinuous: Continuously output the tone specified by tone_id in the first element of cadences. The on_cadence and off_cadence values of that element, and all other elements of cadences, are ignored. Since this is the same as using sm_play_tone(), the use of this value is deprecated in favour of sm_play_tone().

tone_count

The number of items in cadences.

cadences

The sequence of tones.

tone_id: A tone identifier.
on_cadence: The duration (in mS) that this tone is on for.
off_cadence: The duration (in mS) of silence after this tone.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - unknown tone type, inconsistent parameters

Prosody speech processing: API: sm_play_cptone_abort

Prototype Definition

int sm_play_cptone_abort(tSMChannelId channel)

Parameters

channel: The channel which is playing a call-progress tone.

Description

This call enables an application to abort a previously initiated call-progress tone generation job on the specified channel (as long as the wait_for_completion flag was not used in the previous call to sm_play_cptone()). The channel will revert to outputting silence.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_play_cptone_status

Prototype Definition

int sm_play_cptone_status(struct sm_play_cptone_status_parms *statusp)

Parameters

a structure of the following type:

typedef struct sm_play_cptone_status_parms {
	tSMChannelId channel;					/* inout */
	enum kSMPlayCPToneStatus {
		kSMPlayCPToneStatusComplete,
		kSMPlayCPToneStatusOngoing,
	} status;						/* out */
} SM_PLAY_CPTONE_STATUS_PARMS;

Description

This call, typically invoked in response to a write event being signalled, allows an application to determine the status of a specific on-going call-progress tone generation job.

In order to determine the status of a specific call-progress tone generation job on a particular output channel, the application should set channel to specify the job concerned. On successful completion, the status parameter will indicate the status of that channel.

When this function reports that the channel status is kSMPlayCPToneStatusComplete, this also marks the end of the use of the channel for playing a tone, returning the channel output to an idle state ready to start a new replay or other output operation. Note that this means that if sm_play_cptone_status() is used again on the channel before starting a new tone, then it will report the error ERR_SM_WRONG_CHANNEL_STATE.

Fields

channel

The channel which is playing a call-progress tone.

status

The status of the channel. One of these values:

kSMPlayCPToneStatusComplete: The call-progress tone generation job has completed.
kSMPlayCPToneStatusOngoing: The call-progress tone generation job is still ongoing.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_play_digits

Prototype Definition

int sm_play_digits(struct sm_play_digits_parms *digitsp)

Parameters

*digitsp

a structure of the following type:

typedef struct sm_play_digits_parms {
	tSMChannelId channel;					/* in */
	tSM_INT wait_for_completion;				/* in */
	struct sm_digits {
		enum kSMDigitType {
			kSMDTMFDigits,
		} type;						/* in */
		tSM_INT qualifier;				/* in */
		char digit_string[kSMMaxDigits_plus1];		/* in */
		tSM_INT inter_digit_delay;			/* in */
		tSM_INT digit_duration;				/* in */
	} digits;						/* in */
} SM_PLAY_DIGITS_PARMS;

Description

This call outputs a sequence of DTMF digits in-band on the output channel specified. The digits structure contains details of the digits to be dialled. The type parameter determines the way digits contained in the zero terminated string digit_string are output on the timeslot.

The qualifier parameter is not currently used and should be set to zero.

The inter_digit_delay and digit_duration parameters are specified in milliseconds. Set either parameter to zero to use its default value.

The characters permitted in digit_string depend on the type parameter specified. For kSMDTMFDigits, only '0'..'9', '*', '#', and 'A'..'D' are permitted.
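An application may wish to check a digit string before passing it to this call. The following is a minimal sketch of such a check for kSMDTMFDigits. The function name and the capacity argument are hypothetical; capacity is assumed to be the size of the digit_string array, including room for the terminating zero.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if digits fits in a buffer of capacity chars (including the
 * terminating zero) and contains only '0'..'9', '*', '#' and 'A'..'D'.
 * Returns 0 otherwise. An empty string is accepted by this check. */
int is_valid_dtmf_string(const char *digits, size_t capacity)
{
    static const char permitted[] = "0123456789*#ABCD";
    size_t i;

    if (strlen(digits) >= capacity)
        return 0;                       /* no room for the terminator */

    for (i = 0; digits[i] != '\0'; i++) {
        if (strchr(permitted, digits[i]) == NULL)
            return 0;                   /* e.g. lower case 'a' is rejected */
    }
    return 1;
}
```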

The wait_for_completion flag may be set by the application, in which case the API call will not return until the digits have been completely output. No other Prosody API function can be performed on the channel during this waiting period. See the document Prosody application note: waiting for completion for examples of how to wait without blocking other functions.

Alternatively the application can wait to be notified by an event indicating that the digits have been completely output. This requires the module tonegen to have been downloaded.

The channel is reserved for playing the digits until the API has reported completion. If the wait_for_completion flag is set, then the API considers that completion has been reported when this API function returns, otherwise completion is reported only by sm_play_digits_status() returning the status kSMPlayDigitsStatusComplete. In this case the application must call sm_play_digits_status() periodically and should use a write event on the channel to notify it when to check the status. No other output activity can take place on the channel until the completion of the digits has been reported. Note that the event itself does not indicate completion of the digits. It is possible for the event to be signalled even if the digits have not yet completed, so it is essential that the application checks the status and continues waiting if the digits have not completed.

Fields

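channel

The channel on which the digits are to be played.

wait_for_completion

An indicator of whether this function returns as soon as it has set up the output of the digits (0), or waits until the digits have been completely output (non-zero).

digits

The details of the digits to play.

type

The type of digits to play. One of these values: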

kSMDTMFDigits

qualifier

Unused. Set to zero.

digit_string

The digits to play.

inter_digit_delay

The amount of silence (in mS) to leave between digits.

digit_duration

The duration (in mS) of each digit.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - illegal type or digit for type

Prosody speech processing: API: sm_play_digits_status

Prototype Definition

int sm_play_digits_status(struct sm_play_digits_status_parms *statusp)

Parameters

a structure of the following type:

typedef struct sm_play_digits_status_parms {
	tSMChannelId channel;					/* inout */
	enum kSMPlayDigitsStatus {
		kSMPlayDigitsStatusComplete,
		kSMPlayDigitsStatusOngoing,
	} status;						/* out */
} SM_PLAY_DIGITS_STATUS_PARMS;

Description

This call, typically invoked in response to a write event being signalled, allows an application to determine the status of a specific on-going DTMF dialling job.

In order to determine the status of a specific dialling job on a particular output channel, the application should set channel to specify the job concerned. On successful completion, the status parameter indicates the status.

When this function reports that the channel status is kSMPlayDigitsStatusComplete, this also marks the end of the use of the channel for playing digits, returning the channel output to an idle state ready to start a new replay or other output operation. Note that this means that if sm_play_digits_status() is used again on the channel before starting a new dialling job, then it will report the error ERR_SM_WRONG_CHANNEL_STATE.

Fields

channel

The channel which is playing digits.

status

The channel status. One of these values:

kSMPlayDigitsStatusComplete: The dialling has completed.
kSMPlayDigitsStatusOngoing: The dialling is still ongoing.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_play_tone

Prototype Definition

int sm_play_tone(struct sm_play_tone_parms *tonep)

Parameters

*tonep

a structure of the following type:

typedef struct sm_play_tone_parms {
	tSMChannelId channel;					/* in */
	tSM_UT32 duration;					/* in */
	tSM_INT wait_for_completion;				/* in */
	tSM_INT tone_id;					/* in */
} SM_PLAY_TONE_PARMS;

Description

This call allows an application to generate a simple output tone, specified by tone_id, on the output channel given by channel, either continuously or for a given duration.

The parameter tone_id references one of the pre-loaded simple output tones, listed in Prosody speech processing: pre-loaded output tones, or one previously defined through a call to sm_add_output_tone().

If the tone is to be output continuously (or until aborted with sm_play_tone_abort()), the parameter duration should be set to zero. Otherwise duration should be set to the required tone duration in milliseconds.

The wait_for_completion flag may be set by the application, in which case this API call will not return until the tone has been completely output. No other Prosody API function can be performed on the channel during this waiting period. Setting this flag is not useful for a continuous tone, since there would then be no way to stop the tone. See the document Prosody application note: waiting for completion for examples of how to wait without blocking other functions.

This requires the module tonegen to have been downloaded.

The channel is reserved for playing the tone until the API has reported completion. If the wait_for_completion flag is set, then the API considers that completion has been reported when this API function returns, otherwise completion is reported only by sm_play_tone_status() returning the status kSMPlayToneStatusComplete. In this case the application must call sm_play_tone_status() repeatedly until it reports completion. It should use an event on the channel to notify it when to check the status. No other output activity can take place on the channel until the completion of the tone has been reported. Note that the event itself does not indicate completion of the tone. It is possible for the event to be signalled even if the tone has not yet completed, so it is essential that the application checks the status and continues waiting if the tone has not completed.

Fields

channel: The channel on which the tone is to be generated.
duration: The duration of the tone (in mS), or the value zero to indicate a tone that will continue until explicitly aborted with sm_play_tone_abort().
wait_for_completion: An indicator of whether this function returns as soon as it has set up the generation of the tone (0), or waits until the end of the tone (non-zero).
tone_id: The tone to generate.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - unknown tone or inconsistent parameters

Prosody speech processing: API: sm_play_tone_abort

Prototype Definition

int sm_play_tone_abort(tSMChannelId channel)

Parameters

channel: The channel which is playing a tone.

Description

This call enables an application to abort a previously initiated tone generation job on the specified channel (as long as the wait_for_completion flag was not used in the previous call to sm_play_tone()). The channel will revert to outputting silence.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_play_tone_list

Prototype Definition

int sm_play_tone_list(struct sm_play_tone_list_parms *tonep)

Parameters

*tonep

a structure of the following type:

typedef struct sm_play_tone_list_parms {
	tSMChannelId channel;					/* in */
	struct sm_play_tone_item {
		enum kSMToneOperation {
			kSMToneOperationStop,
			kSMToneOperationSum,
			kSMToneOperationModulate,
		} operation;					/* in */
		tSM_UT32 duration;				/* in */
		double frequency1;				/* in */
		double amplitude1;				/* in */
		double frequency2;				/* in */
		double amplitude2;				/* in */
	} *tones;						/* in */
	tSM_INT tone_count;					/* in */
} SM_PLAY_TONE_LIST_PARMS;

Description

This call allows an application to generate multiple simple tones on the output channel given by channel.

The application can wait to be notified by an event that tone generation of a given duration has completed. When a write event has been associated with channel (see sm_channel_set_event()), then the driver will notify the application with that event whenever it needs to invoke sm_play_tone_list_status().

The sm_play_tone_list_abort() call may be used to stop an ongoing tone generation.

This call offers a superset of the functionality provided by sm_play_tone(), sm_play_cptone() and sm_play_digits().

Fields

channel

The channel on which the tones are to be generated.

tones

Pointer to an array containing the list of tones to play back.

operation

The operation to perform at this stage of the tone generation. One of these values:

kSMToneOperationStop: Stop the tone generation. sm_play_tone_list_status() will report kSMPlayToneListStatusComplete. All other parameters are ignored and should be zero.
kSMToneOperationSum: Combine the first and second signals by summing their output. This is the method used for generating DTMF tones.
kSMToneOperationModulate: Combine the first and second signals by treating the first as a carrier wave, over which the second signal is modulated. The amplitude2 parameter specifies the amplitude of the modulation. The output waveform is amplitude1 * sinewave(frequency1) * (1 + amplitude2 * sinewave(frequency2)) (after all parameters have been scaled appropriately).

duration

The duration of this tone, specified in milliseconds.

frequency1

The first component frequency of the tone to generate, specified in Hz. If operation is kSMToneOperationModulate then this tone specifies the carrier.

amplitude1

The amplitude of the first component frequency, specified in dBm0 (according to CCITT G.711). It must be in the range from -35 dBm0 to +3 dBm0.

frequency2

The second component frequency of the tone to generate, specified in Hz. If operation is kSMToneOperationModulate then this tone specifies the modulating signal.

amplitude2

If operation is kSMToneOperationSum then this is the amplitude of the second component frequency, specified in dBm0 (according to CCITT G.711). It must be in the range from -35 dBm0 to +3 dBm0. If operation is kSMToneOperationModulate then this is the amplitude relative to the carrier wave, with 0 dB corresponding to 100% modulation. For example, a 50% modulation would be specified as 20 * log10(0.5) = -6.0206 dB.

tone_count

The number of entries in the tones array.
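As an illustration of how a tones array might be filled in, the sketch below builds a two-entry list for a single DTMF digit: one kSMToneOperationSum entry followed by a terminating stop entry. The struct and enum are local mirrors of the documented layout using plain C types, so that the sketch compiles without the Prosody headers, and the function name is hypothetical. The frequency pairs are the standard DTMF keypad frequencies. Using the same level for both components is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Local mirrors of the documented types (plain C types assumed). */
enum tone_operation { TONE_OP_STOP, TONE_OP_SUM, TONE_OP_MODULATE };

struct tone_item {
    enum tone_operation operation;
    unsigned long duration;   /* mS */
    double frequency1;        /* Hz */
    double amplitude1;        /* dBm0 */
    double frequency2;        /* Hz */
    double amplitude2;        /* dBm0 for a sum */
};

/* Fill items[0] with the digit and items[1] with a stop entry.
 * Returns the number of entries written (the tone_count to use), or 0
 * if the digit is not a DTMF digit or the level is outside the
 * documented range of -35 dBm0 to +3 dBm0. */
size_t build_dtmf_tone_list(char digit, unsigned long duration_ms,
                            double level_dbm0, struct tone_item items[2])
{
    static const char keypad[] = "123A456B789C*0#D";
    static const double row_hz[4] = { 697.0, 770.0, 852.0, 941.0 };
    static const double col_hz[4] = { 1209.0, 1336.0, 1477.0, 1633.0 };
    const struct tone_item stop = { TONE_OP_STOP, 0, 0.0, 0.0, 0.0, 0.0 };
    const char *pos;
    size_t index;

    if (digit == '\0' || level_dbm0 < -35.0 || level_dbm0 > 3.0)
        return 0;
    pos = strchr(keypad, digit);
    if (pos == NULL)
        return 0;
    index = (size_t)(pos - keypad);

    items[0] = stop;
    items[0].operation = TONE_OP_SUM;
    items[0].duration = duration_ms;
    items[0].frequency1 = row_hz[index / 4];   /* keypad row frequency */
    items[0].amplitude1 = level_dbm0;
    items[0].frequency2 = col_hz[index % 4];   /* keypad column frequency */
    items[0].amplitude2 = level_dbm0;

    items[1] = stop;   /* stop entry: all other fields are zero */
    return 2;
}
```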

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - unknown tone or inconsistent parameters

Prosody speech processing: API: sm_play_tone_list_abort

Prototype Definition

int sm_play_tone_list_abort(tSMChannelId channel)

Parameters

channel: The channel which is playing a tone.

Description

This call enables an application to abort the previously initiated playing of a list of tones. The channel stops generating tones as soon as possible, causing the status kSMPlayToneListStatusComplete to be reported by sm_play_tone_list_status() when the tone generation has stopped.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_play_tone_list_phase_reverse

Prototype Definition

int sm_play_tone_list_phase_reverse(struct sm_play_tone_list_phase_reverse_parms *pp)

Parameters

*pp

a structure of the following type:

typedef struct sm_play_tone_list_phase_reverse_parms {
	tSMChannelId channel;					/* in */
	tSM_UT32 period;					/* in */
} SM_PLAY_TONE_LIST_PHASE_REVERSE_PARMS;

Description

If period is non-zero, this call makes the tone generated on channel reverse phase every period milliseconds.

Fields

channel: The channel playing a tone list.
period: The period between phase reversals (in mS).

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_play_tone_list_status

Prototype Definition

int sm_play_tone_list_status(struct sm_play_tone_list_status_parms *statusp)

Parameters

a structure of the following type:

typedef struct sm_play_tone_list_status_parms {
	tSMChannelId channel;					/* inout */
	enum kSMPlayToneListStatus {
		kSMPlayToneListStatusOngoing,
		kSMPlayToneListStatusComplete,
		kSMPlayToneListStatusHasCapacity,
		kSMPlayToneListStatusUnderrun,
	} status;						/* out */
} SM_PLAY_TONE_LIST_STATUS_PARMS;

Description

This call, typically invoked in response to a write event being signalled, allows an application to determine the status of a specific on-going tone list generation job.

In order to determine the status of a specific tone generation job on a particular output channel, the application should set channel to specify the job concerned.

When this function reports that the channel status is kSMPlayToneListStatusComplete, this also marks the end of the use of the channel for playing tones, returning the channel output to an idle state ready to start a new replay or other output operation. Note that this means that if sm_play_tone_list_status() is used again on the channel before starting a new tone, then it will report the error ERR_SM_WRONG_CHANNEL_STATE.

Fields

channel

The channel which is playing a tone.

status

The status of the tone generation. One of these values:

kSMPlayToneListStatusOngoing: The tone generation job is still ongoing.
kSMPlayToneListStatusComplete: The tone generation job has completed. This is signalled when the kSMToneOperationStop code is encountered in the tone list.
kSMPlayToneListStatusHasCapacity: The tone generation job is still ongoing and the module has capacity to buffer further data for the job.
kSMPlayToneListStatusUnderrun: Data has not been supplied sufficiently frequently to the generation job and the output has been padded out with silence.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_play_tone_status

Prototype Definition

int sm_play_tone_status(struct sm_play_tone_status_parms *statusp)

Parameters

a structure of the following type:

typedef struct sm_play_tone_status_parms {
	tSMChannelId channel;					/* inout */
	enum kSMPlayToneStatus {
		kSMPlayToneStatusComplete,
		kSMPlayToneStatusOngoing,
	} status;						/* out */
} SM_PLAY_TONE_STATUS_PARMS;

Description

This call, typically invoked in response to a write event being signalled, allows an application to determine the status of a specific on-going tone generation job.

When this function reports that the channel status is kSMPlayToneStatusComplete, this also marks the end of the use of the channel for playing a tone, returning the channel output to an idle state ready to start a new replay or other output operation. Note that this means that if sm_play_tone_status() is used again on the channel before starting a new tone, then it will report the error ERR_SM_WRONG_CHANNEL_STATE.

Fields

channel

The channel which is playing a tone.

status

The status of the tone generation. One of these values:

kSMPlayToneStatusComplete: The tone generation job has completed.
kSMPlayToneStatusOngoing: The tone generation job is still ongoing.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_put_audio_data

Prototype Definition

int sm_put_audio_data(struct sm_audio_data_parms *datap)

Parameters

a structure of the following type:

typedef struct sm_audio_data_parms {
	tSMChannelId channel;					/* in */
	char *data;						/* in */
	tSM_INT max_length;					/* in */
	tSM_INT done_length;					/* out */
} SM_AUDIO_DATA_PARMS;

Description

Following a call to sm_channel_set_output_threshold() with a negative minimum_bits threshold, and a call to sm_replay_start(), the data to be replayed is supplied to the module via successive invocations of this function, one each time sm_replay_status() indicates that the channel is ready for more data. The data parameter is a pointer to a buffer of data to replay in the appropriate format, and the max_length parameter gives the number of octets of valid data in the buffer.

The data should be presented in lengths which are multiples of four bytes because this is more efficiently handled than other lengths.

If the module is not yet ready to buffer data, then no data is transferred and the call will return with done_length set to zero.

The driver can send an event to notify the application when capacity becomes available on a channel (see sm_channel_set_event()).
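The way an application might offer data in multiples of four octets and advance by done_length can be sketched as follows. The fake_sink type and fake_put_audio_data() are hypothetical stand-ins that simulate a module with limited buffer space; a real application would fill in an SM_AUDIO_DATA_PARMS structure and call sm_put_audio_data() instead.

```c
#include <stddef.h>

/* Simulated module buffer standing in for the real channel. */
struct fake_sink {
    size_t capacity;   /* total octets the simulated module can hold */
    size_t accepted;   /* octets accepted so far */
};

/* Stand-in for sm_put_audio_data(): accepts as much as fits, reports
 * the amount through *done_length and returns 0. */
int fake_put_audio_data(struct fake_sink *sink, const char *data,
                        size_t max_length, size_t *done_length)
{
    size_t room = sink->capacity - sink->accepted;
    size_t n = max_length < room ? max_length : room;

    (void)data;        /* the simulation does not keep the octets */
    sink->accepted += n;
    *done_length = n;
    return 0;
}

/* Offer data in chunks whose length is a multiple of four octets (apart
 * from a possible short final chunk). Stops when nothing is accepted,
 * at which point the application would wait for its next event.
 * Returns the number of octets consumed. */
size_t feed_audio(struct fake_sink *sink, const char *data,
                  size_t length, size_t chunk)
{
    size_t offset = 0;

    chunk -= chunk % 4;            /* round down to a multiple of four */
    if (chunk == 0)
        chunk = 4;

    while (offset < length) {
        size_t remaining = length - offset;
        size_t want = remaining < chunk ? remaining : chunk;
        size_t done = 0;

        if (fake_put_audio_data(sink, data + offset, want, &done) != 0)
            break;                 /* error: give up */
        if (done == 0)
            break;                 /* module not ready for more data */
        offset += done;
    }
    return offset;
}
```

The caller keeps the returned offset and resumes from it when the channel next reports that it is ready for data.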

Fields

channel: The channel which is replaying.
data: The data being provided.
max_length: The length of the data being provided.
done_length: The amount of data actually written.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_REPLAY_IN_PROGRESS - current replay complete or aborted
ERR_SM_BAD_DATA_LENGTH - invalid length parameter

Prosody speech processing: API: sm_put_last_replay_data

Prototype Definition

int sm_put_last_replay_data(struct sm_ts_data_parms *datap)

Parameters

a structure of the following type:

typedef struct sm_ts_data_parms {
	tSMChannelId channel;					/* in */
	char *data;						/* in */
	tSM_INT length;						/* in */
} SM_TS_DATA_PARMS;

Description

Supplies the last data to a replay started with sm_replay_start(). See sm_put_replay_data() for further details.

The length field can be zero if there is no remaining data.

Fields

channel: The channel which is replaying.
data: The data being provided.
length: The length of the data being provided.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_REPLAY_IN_PROGRESS - current replay complete or aborted

Prosody speech processing: API: sm_put_replay_data

Prototype Definition

int sm_put_replay_data(struct sm_ts_data_parms *datap)

Parameters

a structure of the following type:

typedef struct sm_ts_data_parms {
	tSMChannelId channel;					/* in */
	char *data;						/* in */
	tSM_INT length;						/* in */
} SM_TS_DATA_PARMS;

Description

Following a call to sm_replay_start(), as sm_replay_status() indicates that the channel is ready for successive amounts of data, the actual data to be replayed is supplied to the module via successive invocations of this function. The data parameter is a pointer to a buffer of data to replay in the appropriate format, and the length parameter gives the number of octets of valid data in the buffer.

The data should be presented in lengths which are multiples of four bytes because this is more efficiently handled than other lengths.

If the module has insufficient capacity to buffer all the given data, then some data may be transferred and the call may return the status ERR_SM_NO_CAPACITY or it may block until space is available.

The driver can send an event to notify the application when capacity becomes available on a channel (see sm_channel_set_event()).

Fields

channel: The channel which is replaying.
data: The data being provided.
length: The length of the data being provided.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_CAPACITY - module does not currently have capacity to buffer this data
ERR_SM_NO_REPLAY_IN_PROGRESS - current replay complete or aborted
ERR_SM_BAD_DATA_LENGTH - invalid length parameter

Prosody speech processing: API: sm_record_abort

Prototype Definition

int sm_record_abort(struct sm_record_abort_parms *abortp)

Parameters

*abortp

a structure of the following type:

typedef struct sm_record_abort_parms {
	tSMChannelId channel;					/* in */
	tSM_INT discard;					/* in */
} SM_RECORD_ABORT_PARMS;

Description

This call allows an application to terminate a record job on a given input channel prematurely with the option to discard or retain data uncollected by the application.

If discard is non-zero, any uncollected data is discarded. If it is zero, the uncollected data is retained for collection by calls to sm_get_recorded_data().

Invoking this call will cause a final record event to be notified to the application.

Fields

channel: The channel which is recording.
discard: An indicator of whether data not yet collected from the channel should be discarded (non-zero) or delivered as normal (0).

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_record_agc_adjust

Prototype Definition

int sm_record_agc_adjust(struct sm_record_agc_adjust_parms *recadjp)

Parameters

*recadjp

a structure of the following type:

typedef struct sm_record_agc_adjust_parms {
	tSMChannelId channel;					/* in */
	tSM_INT gain;						/* in */
} SM_RECORD_AGC_ADJUST_PARMS;

Description

Sets the gain of a recording which is using automatic gain control (AGC) to a specified value. This value will then be modified by the AGC algorithm to adapt to the strength of the signal being received.

It may be useful to set the gain when you know that the signal strength has suddenly changed and the standard AGC adaptation is not fast enough. However, if the gain is set too large, the signal will be distorted by clipping, whereas if the gain is too small, the signal will be attenuated to the point where it disappears. Therefore the gain should only be adjusted when there is good reason to believe the AGC algorithm will be inadequate.

When the AGC algorithm starts, it passes the signal through unchanged (i.e. a gain value of 0dB). As it monitors the signal, it adjusts this gain as necessary to make the signal approach a target of approximately -12dBm0.
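As a worked example of choosing a gain value, the sketch below computes the whole number of dB that would move a signal from a known level to a target level. It is only an illustration: the function name is hypothetical, and it assumes that the application already knows the new signal level by some other means. Rounding to a whole number of dB is used because gain is an integer field.

```c
/* Gain, rounded to the nearest whole dB, that would move a signal at
 * measured_dbm0 to target_dbm0 (for example a target of -12 dBm0). */
int agc_gain_for_level(double measured_dbm0, double target_dbm0)
{
    double gain = target_dbm0 - measured_dbm0;

    /* Round half away from zero without needing the maths library. */
    return (int)(gain >= 0.0 ? gain + 0.5 : gain - 0.5);
}
```

For a signal that has dropped to -30 dBm0 and a target of -12 dBm0, this gives a gain of 18 dB. The warnings above about clipping and excessive attenuation still apply to any value chosen in this way.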

Fields

channel: The channel which is recording.
gain: The gain value to set AGC to (in dB).

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_RECORD_IN_PROGRESS - no recording in progress

Prosody speech processing: API: sm_record_agc_adjust_settings

Prototype Definition

int sm_record_agc_adjust_settings(struct sm_record_agc_adjust_settings_parms *recadjp)

Parameters

*recadjp

a structure of the following type:

typedef struct sm_record_agc_adjust_settings_parms {
	tSMChannelId channel;					/* in */
	float max_level_decay;					/* in */
	float target_level;					/* in */
} SM_RECORD_AGC_ADJUST_SETTINGS_PARMS;

Description

Fields

channel: The channel which is recording.
max_level_decay: The rate at which the AGC decays the estimate of the loudest signal, as a fraction.
target_level: The target output level in dBm0.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_RECORD_IN_PROGRESS - no recording in progress

Prosody speech processing: API: sm_record_start

Prototype Definition

int sm_record_start(struct sm_record_parms *recordp)

Parameters

*recordp

a structure of the following type:

typedef struct sm_record_parms {
	tSMChannelId channel;					/* in */
	tSMChannelId alt_data_source;				/* in */
	enum kSMDataFormat type;				/* in */
	tSM_UT32 silence_elimination;				/* in */
	enum kSMToneDetection tone_elimination_mode;		/* in */
	tSM_UT32 tone_elimination_set_id;			/* in */
	tSM_UT32 max_octets;					/* in */
	tSM_UT32 max_elapsed_time;				/* in */
	tSM_UT32 max_silence;					/* in */
	tSM_INT agc;						/* in */
	tSM_INT volume;						/* in */
	enum kSMRecordAltSource {
		kSMRecordAltSourceDefault,
		kSMRecordAltSourceInput,
		kSMRecordAltSourceOutput,
	} alt_data_source_type;					/* in */
	tSM_UT32 sampling_rate;					/* in */
	double min_noise_level;					/* in */
	double grunt_threshold;					/* in */
	tSM_UT32 grunt_holdoff;					/* in */
	tSM_UT32 max_initial_silence;				/* in */
} SM_RECORD_PARMS;

Description

This call starts a new recording job using the specified channel.

Normally alt_data_source is set to kSMNullChannelId and the data that will be recorded will be that switched to this input channel. If, however, alt_data_source is set to the channel id of another existing channel, then the data source for the recording will be determined by the value of alt_data_source_type. Note that the channel specified in alt_data_source must not be reconfigured while this recording is in progress. When alt_data_source_type selects the output of a channel, the output datafeed from that channel must be referenced by calling sm_channel_get_datafeed() (or the legacy sm_switch_channel_output()) before starting the recording.

The PCM data received will be encoded into buffers in the format specified by the type parameter, which takes a value from the same range of values permitted in the type parameter of sm_replay_start().

Note that, for compatibility with earlier releases of Prosody, many other values are permitted for the type field. These compatibility values specify a combination of data type and sampling rate. When one of these is used in the type field, the sampling_rate field must be zero, and the actual rate used will be as listed here. They are:

compatibility code	new type code	new sampling rate
kSMDataFormat8KHzALawPCM	kSMDataFormatALawPCM	8000
kSMDataFormat8KHzULawPCM	kSMDataFormatULawPCM	8000
kSMDataFormat8KHzOKIADPCM	kSMDataFormatOKIADPCM	8000
kSMDataFormat8KHzACUBLKPCM	kSMDataFormatACUBLKPCM	8000
kSMDataFormat6KHzALawPCM	kSMDataFormatALawPCM	6000
kSMDataFormat6KHzULawPCM	kSMDataFormatULawPCM	6000
kSMDataFormat6KHzOKIADPCM	kSMDataFormatOKIADPCM	6000
kSMDataFormat6KHzACUBLKPCM	kSMDataFormatACUBLKPCM	6000
kSMDataFormat8KHz16bitMono	kSMDataFormat16bit	8000
kSMDataFormat8KHz8bitMono	kSMDataFormat8bit	8000
kSMDataFormat8KHzSigned8bitMono	kSMDataFormatSigned8bit	8000
kSMDataFormatIMAADPCM	kSMDataFormatIMAADPCM	8000

Any form of record requires the module inchan to have been downloaded in addition to the module that is required for the specific type of record, and any module required for the sampling rate:

record type	extra firmware required
kSMDataFormatALawPCM	recA
kSMDataFormatULawPCM	recmu
kSMDataFormatOKIADPCM	recoki
kSMDataFormatACUBLKPCM	recablk
kSMDataFormatSigned8bit	rec8b
kSMDataFormat8bit	recms8b
kSMDataFormat16bit	rec16b
kSMDataFormatIMAADPCM	recima
kSMDataFormatSpeex	speexrp

The sampling rate firmware:

sampling rate	extra firmware required
8000	-
6000	sixkin
11000	8_to_11

See Prosody application note: speech processing replay and record data formats for more details on data formats supported by Prosody and their appropriate use.

The volume parameter is the change in volume compared to the level of the data (i.e. set this to -6 to attenuate by 6dB). If AGC and volume are both applied, the change in volume requested is applied after AGC.

The agc parameter controls whether automatic gain control is applied to the recorded data. If agc is non-zero then automatic gain control is applied. Even if this is the case, the recording level is still governed by volume. The behaviour of the AGC algorithm may be controlled by changing its parameters, see sm_record_agc_adjust_settings() for more details.

The recorded data may be retrieved by the application through periodic calls to sm_get_recorded_data(). The amount of data recorded is determined by the termination criteria specified in the parameters:

max_octets	maximum number of octets of data to record, 0 if no limit
max_elapsed_time	maximum recording period in mS, 0 if no limit
max_silence	maximum period of silence in mS before recording is terminated, 0 if no limit (see also max_initial_silence)

and also by the function sm_record_abort() which will terminate a recording directly.

If an event has been previously associated with a channel (see sm_channel_set_event()), then the driver will notify the application with that event whenever (for that channel):

recorded data becomes newly available for collection by sm_get_recorded_data()
recorded data remains available for collection by sm_get_recorded_data()
recording terminates due to one of the termination criteria being met

The channel is reserved for recording until sm_record_status() returns the status kSMRecordStatusComplete. No other recording activity can take place on the channel during this time.

Fields

channel

The channel to perform the record.

alt_data_source

kSMNullChannelId, or another channel whose input or output is to be recorded. If this specifies a channel, that channel must not be reconfigured while recording is taking place.

type

The format in which to record. (See the main text above for compatibility codes that can also be used in this field.) One of these values:

kSMDataFormatNone: Special value for test purposes only. This indicates that the channel should prepare as if it was about to play or record data, but not actually transfer any data.
kSMDataFormatALawPCM: G.711 A-law. This uses 8 bits per sample.
kSMDataFormatULawPCM: G.711 mu-law. This uses 8 bits per sample.
kSMDataFormatOKIADPCM: A 4-bit coding scheme.
kSMDataFormatACUBLKPCM: This format is obsolete, as cards fitted with SHARC DSPs are no longer supported. It has never been implemented for Prosody X cards.
kSMDataFormat16bit: 16-bit linear coding, where each sample is a signed value (-32768 to 32767). The first octet of each sample is the less significant one.
kSMDataFormat8bit: 8-bit unsigned linear coding, where each sample is an unsigned value (0 to 255). This is Microsoft's 8-bit format.
kSMDataFormatSigned8bit: 8-bit linear coding, where each sample is a signed value (-128 to 127).
kSMDataFormatIMAADPCM: A 4-bit coding scheme standardised by the Interactive Multimedia Association (IMA).
kSMDataFormatSpeex: A patent and royalty-free speech compression codec. Use of the functions sm_replay_start() and sm_record_start() only allows playback and recording using the default "narrowband" Speex configuration. Other operating modes and parameters will be made available via new API calls.

silence_elimination

The maximum duration (in mS) of silence to record. Silences longer than this are truncated to this length. The value zero disables silence elimination. Requires the module grunt.

What types of tones to eliminate from the recording. This allows the same tone detection as sm_listen_for(). Requires the module td unless the value is kSMToneDetectionNone. One of these values:

kSMToneDetectionNone: Simple tones never recognised.
kSMToneDetectionNoMinDuration: Simple tone detection enabled, no minimum period. If the correct frequencies are detected with the correct signal to noise ratio, twist, etc. for however short a duration, the tone is considered to be present and is recognised.
kSMToneDetectionMinDuration64: Simple tone detection enabled, tone must be valid for minimum period to be detected. If the tone is valid for 64mS it will definitely be detected. Tones of shorter duration between 32mS and 64mS may be detected but cannot be guaranteed. The minimum duration of a tone can be increased by setting the parameter kAdjustToneSetIntParamIdMinOnTime with sm_adjust_input_tone_set().
kSMToneDetectionMinDuration40: This mode uses a slightly more complex algorithm for analysing duration of a valid tone, and enables robust detection of tones with duration as short as 40mS.
kSMToneEndDetectionNoMinDuration: This mode is like kSMToneDetectionNoMinDuration but application notified when end of tone detected.
kSMToneEndDetectionMinDuration64: This mode is like kSMToneDetectionMinDuration64 but application notified when end of tone detected.
kSMToneEndDetectionMinDuration40: This mode is like kSMToneDetectionMinDuration40 but application notified when end of tone detected.
kSMToneLenDetectionNoMinDuration: This mode is like kSMToneEndDetectionNoMinDuration but returns additional tone duration information to application.
kSMToneLenDetectionMinDuration64: This mode is like kSMToneEndDetectionMinDuration64 but returns additional tone duration information to application.
kSMToneLenDetectionMinDuration40: This mode is like kSMToneEndDetectionMinDuration40 but returns additional tone duration information to application.
kSMToneDetectionAsListenFor: This mode is only valid when specified in the parameters for sm_record_start() and a tone detection mode is currently active on the same channel, started by sm_listen_for(). Any tones detected on the same channel as the recording will be eliminated from the recorded data.

tone_elimination_set_id

The tone set to use (only relevant if tone_elimination_mode is not kSMToneDetectionNone). See sm_listen_for() for details of how to select an input tone set.

max_octets

The maximum amount of data to record. The value zero indicates no maximum.

max_elapsed_time

The maximum duration of the recording in mS. The value zero indicates no maximum. Requires the module timerx.

max_silence

The maximum silence permitted (in mS). The value zero indicates no maximum. Silences longer than this cause the recording to terminate. Requires the module grunt.

agc

Indicates whether automatic gain control is enabled (non-zero) or not (zero). Requires the module gainbg.

volume

The desired adjustment to the volume (dB). The range of gain supported is at least +8 to -22 dB. Requires the module gainbg.

alt_data_source_type

If an alt_data_source channel is specified, which kind of data associated with that channel should be recorded. One of these values:

kSMRecordAltSourceDefault: If alt_data_source is an input only channel, then data switched to this channel input will be recorded, otherwise the data being generated on this channel output will be recorded (this feature is normally used to record conferenced outputs). This value is deprecated since it is equivalent to either kSMRecordAltSourceInput or kSMRecordAltSourceOutput which could be used instead.
kSMRecordAltSourceInput: Data switched to alt_data_source input will be recorded. This value is deprecated since several channels can take input from the same timeslot and that is a more straightforward way of achieving the same result.
kSMRecordAltSourceOutput: Data generated on alt_data_source output will be recorded.

sampling_rate

The sampling rate at which to record the data. Currently supported values are:

0 - record at the rate reported via sm_record_status().
8000 - the typical rate for telephony, since it is the rate at which telephone networks themselves operate.
6000 - a rate which reduces file sizes at the cost of lower quality.
11000 - a rate convenient for use with typical PC soundcards. This is sufficiently close to a quarter of the rate used by CDs (44100 Hz) that the difference is not significant, allowing almost universal compatibility with cheap PC soundcards which can handle 11025 Hz sampling.

Note that when you specify a non-zero value here, this function assumes that the source of the data to be recorded is providing data at 8000 samples per second. The use of data at other rates is not supported and will cause the data to be recorded at an incorrect sampling rate. Consequently, the use of a non-zero value in this field is deprecated.

min_noise_level

The minimum level, in dBm0, that the noise estimate of the grunt detector may reach. The default is -55 dBm0. Only used if silence_elimination or max_silence is non-zero. Requires the module grunt.

grunt_threshold

grunt_holdoff

max_initial_silence

If both max_silence and this parameter are non-zero, then this parameter specifies the maximum period of silence allowed, in mS, before the start of speech, and max_silence specifies the maximum period of silence allowed after the start of speech. Requires the module grunt.
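The interaction between max_silence and max_initial_silence can be sketched as a small decision function. This is only an interpretation of the text above, not the firmware's actual logic; the function name effective_silence_limit is hypothetical. It assumes that when max_silence is zero no silence limit applies at all, since the text only defines max_initial_silence for the case where both values are non-zero.

```c
/* Hypothetical helper (not part of the Prosody API).
 * Returns the silence limit in milliseconds that applies at this point in
 * the recording, or 0 if no silence limit applies.
 * speech_started is non-zero once speech has been detected. */
unsigned effective_silence_limit(unsigned max_silence,
                                 unsigned max_initial_silence,
                                 int speech_started)
{
    if (max_silence == 0)
        return 0;                     /* assumption: no silence termination */
    if (max_initial_silence != 0 && !speech_started)
        return max_initial_silence;   /* limit before the start of speech */
    return max_silence;               /* limit after speech, or the only limit */
}
```

With max_silence set to 2000 and max_initial_silence set to 5000, a caller would be allowed five seconds to begin speaking, after which a two second pause ends the recording.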

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_WRONG_CHANNEL_STATE - if already recording
ERR_SM_WRONG_CHANNEL_TYPE - if attempt to record using output channel
ERR_SM_NOT_SAME_MODULE - alt_data_source channel not located on same module

Prosody speech processing: API: sm_record_status

Prototype Definition

int sm_record_status(struct sm_record_status_parms *statusp)

Parameters

*statusp

a structure of the following type:

typedef struct sm_record_status_parms {
	tSMChannelId channel;					/* inout */
	enum kSMRecordStatus {
		kSMRecordStatusComplete,
		kSMRecordStatusCompleteData,
		kSMRecordStatusOverrun,
		kSMRecordStatusData,
		kSMRecordStatusNoData,
		kSMRecordStatusRecognition,
	} status;						/* out */
	enum kSMRecognition recog_type;				/* out */
	tSM_INT param0;						/* out */
	tSM_INT param1;						/* out */
	enum kSMRecordHowTerminated {
		kSMRecordHowTerminatedNotYet,
		kSMRecordHowTerminatedLength,
		kSMRecordHowTerminatedMaxTime,
		kSMRecordHowTerminatedSilence,
		kSMRecordHowTerminatedAborted,
		kSMRecordHowTerminatedError,
	} termination_reason;					/* out */
	tSM_UT32 termination_octets;				/* out */
	tSM_UT32 sample_rate;					/* out */
} SM_RECORD_STATUS_PARMS;

Description

This call, typically invoked in response to a read event being signalled, allows an application to determine the status of a specific on-going record job.

In order to determine the status of a specific record job, the application should set channel to specify the job concerned.

A channel ceases to be recording when this function returns a status of kSMRecordStatusComplete. Until this happens, the channel input is reserved for the record and no other recording activity can take place on the channel during this time. After this happens, the channel input returns to being idle and consequently if this function is used again it will return the error ERR_SM_NO_RECORD_IN_PROGRESS.
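A typical application reacts to each status value in a fixed way, which can be sketched as a mapping from status to next action. The enumerations below are stand-ins declared in the same order as kSMRecordStatus so that the example compiles on its own; a real application would use the definitions from the Prosody header. The action names and the function next_record_action are hypothetical, and the handling chosen for an overrun (note the loss and carry on) is an assumption.

```c
/* Stand-in for enum kSMRecordStatus, in the order declared above. */
enum rec_status {
    REC_COMPLETE,
    REC_COMPLETE_DATA,
    REC_OVERRUN,
    REC_DATA,
    REC_NO_DATA,
    REC_RECOGNITION
};

/* Hypothetical actions an application loop might take. */
enum rec_action {
    REC_ACT_STOP,              /* job finished, channel input is idle again  */
    REC_ACT_COLLECT,           /* call sm_get_recorded_data()                */
    REC_ACT_WAIT,              /* wait for the next event on the channel     */
    REC_ACT_NOTE_LOSS,         /* some data was lost, log it and continue    */
    REC_ACT_READ_RECOGNITION   /* inspect recog_type, param0 and param1      */
};

enum rec_action next_record_action(enum rec_status status)
{
    switch (status) {
    case REC_COMPLETE:      return REC_ACT_STOP;
    case REC_COMPLETE_DATA: return REC_ACT_COLLECT;
    case REC_OVERRUN:       return REC_ACT_NOTE_LOSS;
    case REC_DATA:          return REC_ACT_COLLECT;
    case REC_NO_DATA:       return REC_ACT_WAIT;
    case REC_RECOGNITION:   return REC_ACT_READ_RECOGNITION;
    }
    return REC_ACT_WAIT;       /* unknown value: do nothing until next event */
}
```

The loop would keep calling sm_record_status() and acting on the result until the action is REC_ACT_STOP, since only kSMRecordStatusComplete releases the channel.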

Fields

channel

The channel which is recording.

status

The channel's status. One of these values:

kSMRecordStatusComplete: The recording job has completed and all the recorded data has been passed to the application.
kSMRecordStatusCompleteData: The recording job has completed but there still remains recorded data for the application to collect.
kSMRecordStatusOverrun: Data has not been retrieved sufficiently frequently by the application and now some has been lost due to module buffer overrun.
kSMRecordStatusData: The record job is still ongoing and data is available for collection by the application.
kSMRecordStatusNoData: The record job is still ongoing however not enough data is buffered in the module to justify collection by the application.
kSMRecordStatusRecognition: A recognition event has occurred in the recording. The recog_type, param0 and param1 fields contain a report of what was detected. This status can occur when sm_record_start() specified an option such as tone elimination.

recog_type

The recognition event which has occurred. This field, with the param0 and param1 fields, has the same meaning as the type field returned by sm_get_recognised() with its corresponding param0 and param1 fields. This field is only valid when the status field is kSMRecordStatusRecognition. One of these values:

kSMRecognisedNothing: No digit, simple or call-progress tone has been recognised
kSMRecognisedTrainingDigit
kSMRecognisedDigit: A pulse dialled or DTMF dialled digit has been recognised and a character representation for it has been stored in param0. In param1 will be an indication of the digit type (kSMPulseDigits or kSMDTMFDigits) unless a tone detection mode of type kSMToneLen... was specified in which case it will contain the duration in milliseconds of the detected DTMF digit.
kSMRecognisedTone: A simple tone has been recognised from the active set of input tones for the channel. The parameters param0 and param1 are assigned values as described above.
kSMRecognisedCPTone: A call-progress tone has been recognised and the corresponding identifier has been stored in param0.
kSMRecognisedGruntStart: The beginning of a grunt has been detected. param0 is set to the duration of the preceding silence in milliseconds.
kSMRecognisedGruntEnd: The end of a grunt has been detected. param0 is set to the grunt duration in milliseconds, and param1 to the grunt average energy in negative dBm0 (the average is calculated only over periods during which signal is present).
kSMRecognisedASRResult: Obsolete
kSMRecognisedASRUncertain: Obsolete
kSMRecognisedASRRejected: Obsolete
kSMRecognisedASRTimeout: Obsolete
kSMRecognisedCatSig: A signal has been categorised, the parameter param0 indicates the algorithm id (see sm_catsig_listen_for()) and param1 is a value indicating the signal category with respect to this algorithm (eg. live speaker or answer machine).
kSMRecognisedOverrun: The recognition FIFO has been overrun because it has not been polled frequently enough through calls to sm_get_recognised().
kSMRecognisedANS: An ANS or ANSam tone has been detected (see sm_ans_listen_for()). The parameter param0 will describe the tone detected: 0 for end of tone, 1 for an ordinary ANS tone, or 2 for the modulated ANSam tone. The parameter param1 will be non-zero if the ANS or ANSam tone has phase reversals.
kSMRecognisedBeep: A beep has been recognised. The parameter param0 will be the beep frequency and param1 will be zero at the start of the beep and non-zero when the beep ends.
kSMRecognisedOnHook: An 'on-hook' state has been recognised
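The param0 and param1 conventions for the ANS and beep reports can be decoded with a few small helpers. This is a minimal sketch based only on the values listed above; the function names are hypothetical and not part of the Prosody API.

```c
/* Hypothetical helpers (not part of the Prosody API). */

/* Describes param0 of a kSMRecognisedANS report. */
const char *describe_ans_tone(int param0)
{
    switch (param0) {
    case 0:  return "end of tone";
    case 1:  return "ANS";
    case 2:  return "ANSam";
    default: return "unknown";
    }
}

/* param1 of a kSMRecognisedANS report: non-zero means phase reversals. */
int ans_has_phase_reversals(int param1)
{
    return param1 != 0;
}

/* param1 of a kSMRecognisedBeep report: zero at the start of the beep,
 * non-zero when the beep ends. */
int beep_has_ended(int param1)
{
    return param1 != 0;
}
```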

param0

A parameter giving details of what was detected. The interpretation of this depends on the recog_type field. This field is only valid when the status field is kSMRecordStatusRecognition.

param1

Another parameter giving details of what was detected. The interpretation of this also depends on the recog_type field. This field is only valid when the status field is kSMRecordStatusRecognition.

termination_reason

The reason why a recording has terminated. One of these values:

kSMRecordHowTerminatedNotYet: Recording not yet completed.
kSMRecordHowTerminatedLength: The max_octets criterion specified to sm_record_start() was satisfied.
kSMRecordHowTerminatedMaxTime: The max_elapsed_time criterion specified to sm_record_start() was satisfied.
kSMRecordHowTerminatedSilence: The max_silence criterion specified to sm_record_start() was satisfied. The termination_octets field will indicate approximately how many octets of recorded silence were present at the end of the recording.
kSMRecordHowTerminatedAborted: sm_record_abort() was invoked.
kSMRecordHowTerminatedError: An error occurred.

termination_octets

The amount of data representing silence at the end of the recording (only valid if the recording has completed and termination_reason was kSMRecordHowTerminatedSilence).
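One possible use of termination_octets is to trim the trailing silence from a stored recording. The sketch below assumes the application has kept its own count of the octets it collected. The enumeration is a stand-in declared in the same order as kSMRecordHowTerminated, and the function name is hypothetical. Because the reported value is approximate, the result is clamped so that it never goes below zero.

```c
#include <stdint.h>

/* Stand-in for enum kSMRecordHowTerminated, in the order declared above. */
enum how_terminated {
    HT_NOT_YET,
    HT_LENGTH,
    HT_MAX_TIME,
    HT_SILENCE,
    HT_ABORTED,
    HT_ERROR
};

/* Hypothetical helper: number of octets to keep once trailing silence
 * has been removed from a recording of total_octets octets. */
uint32_t octets_before_trailing_silence(uint32_t total_octets,
                                        enum how_terminated reason,
                                        uint32_t termination_octets)
{
    if (reason != HT_SILENCE)
        return total_octets;          /* termination_octets is not valid */
    if (termination_octets > total_octets)
        return 0;                     /* guard: the value is approximate */
    return total_octets - termination_octets;
}
```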

sample_rate

The sample rate of data. This is the rate for data not yet collected.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_RECORD_IN_PROGRESS - no recording in progress (perhaps following abort with discard specified)

Prosody speech processing: API: sm_replay_abort

Prototype Definition

int sm_replay_abort(struct sm_replay_abort_parms *abortp)

Parameters

*abortp

a structure of the following type:

typedef struct sm_replay_abort_parms {
	tSMChannelId channel;					/* in */
	tSM_UT32 offset;					/* out */
	tSM_UT32 nowait;					/* in */
} SM_REPLAY_ABORT_PARMS;

Description

This call allows an application to abort prematurely a replay job on the output channel specified. After a replay has been aborted, a final replay event will be notified to the application and silence will be output on the channel.

If the call completes successfully, the parameter offset will be set to a value between 0 and the number of octets of data actually played, indicating the point at which replay was aborted. This will be a value up to the total number of octets already supplied via sm_put_replay_data() and sm_put_last_replay_data().

Fields

channel: The channel which is replaying.
offset: The number of octets which had been played when the abort completed. Not valid if nowait is non-zero.
nowait: Indicates that the function should return instantly without waiting to determine the offset where the play stops. The place where the play stops can always be determined from the offset reported by sm_replay_status() when it reports that the replay has completed.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_replay_adjust

Prototype Definition

int sm_replay_adjust(struct sm_replay_adjust_parms *adjustp)

Parameters

*adjustp

a structure of the following type:

typedef struct sm_replay_adjust_parms {
	tSMChannelId channel;					/* in */
	tSMChannelId background;				/* in */
	tSM_INT volume;						/* in */
	tSM_INT agc;						/* in */
	tSM_INT speed;						/* in */
} SM_REPLAY_ADJUST_PARMS;

Description

This call allows an application to alter the replay parameters of the current replay job on the specified output channel. The background, volume, agc, and speed parameters are as for the sm_replay_start() call.

Fields

channel: The channel which is replaying.
background: A channel which is producing a signal to be added to the replay signal or kSMNullChannelId.
volume: The volume adjustment (dB).
agc: Indicates whether automatic gain control is enabled (non-zero) or not (zero).
speed: The percentage of full speed at which the replay should work. The value zero also represents full speed.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_REPLAY_IN_PROGRESS - current replay complete or aborted
ERR_SM_BAD_PARAMETER - out of range volume, speed etc

Prosody speech processing: API: sm_replay_start

Prototype Definition

int sm_replay_start(struct sm_replay_parms *replayp)

Parameters

*replayp

a structure of the following type:

typedef struct sm_replay_parms {
	tSMChannelId channel;					/* in */
	tSMChannelId background;				/* in */
	tSM_INT volume;						/* in */
	tSM_INT agc;						/* in */
	tSM_INT speed;						/* in */
	enum kSMDataFormat {
		kSMDataFormatNone=0,
		kSMDataFormatALawPCM=30,
		kSMDataFormatULawPCM=31,
		kSMDataFormatOKIADPCM=32,
		kSMDataFormatACUBLKPCM=33,
		kSMDataFormat16bit=34,
		kSMDataFormat8bit=35,
		kSMDataFormatSigned8bit=36,
		kSMDataFormatIMAADPCM=17,
		kSMDataFormatSpeex=37,
	} type;							/* in */
	tSM_UT32 data_length;					/* in */
	tSM_UT32 sampling_rate;					/* in */
} SM_REPLAY_PARMS;

Description

Prepares output channel channel for replay of data (a replay job).

Normally the background parameter is set to kSMNullChannelId. If, however, it is set to the channel id of another output channel, then the signal generated on the output channel channel will be combined with the data currently being output on background. Note that a channel must not be reconfigured while it is in use as a background channel for other channels.

To keep a continuous background signal while playing a mixture of different recorded signals and silence, start the replay only once and use sm_put_replay_data() as necessary to feed it data for each signal. To produce silence, simply stop sending data for a suitable period (the channel will report an underrun when it starts sending silence).

The volume parameter determines the gain applied to the replayed data and has a range of at least -24 to +8 in dB.

The agc parameter controls whether automatic gain control is applied to the replayed data. If agc is non-zero then automatic gain control is applied. Even if this is the case, the output level is still governed by volume.

The speed parameter determines rate of replay and is a percentage value expressing the rate of replay compared with the normal replay rate. Only certain speeds are supported, so the speed specified is rounded to the nearest supported value. The following speeds are currently supported:

Ratio	Percentage
2:1	200
3:2	150
4:3	133
1:1	100
3:4	75
2:3	67
1:2	50

An application should not rely on a speed being rounded to a specific value. Other speeds may also be supported, so a requested speed may round to a value nearer than expected.

If speed is set to zero, the data is replayed at an unadjusted rate.
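The rounding described above can be illustrated with a nearest-value search over the listed percentages. This is only a sketch of the documented behaviour, not the module's implementation: as noted, other speeds may also be supported, so an application should not depend on the result. The function name nearest_listed_speed is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of the Prosody API): returns the listed
 * replay speed nearest to the requested percentage. A request of zero
 * means unadjusted replay, reported here as 100. */
int nearest_listed_speed(int requested)
{
    static const int listed[] = { 200, 150, 133, 100, 75, 67, 50 };
    int best = 100;
    size_t i;

    if (requested == 0)
        return 100;

    for (i = 0; i < sizeof listed / sizeof listed[0]; i++) {
        if (abs(listed[i] - requested) < abs(best - requested))
            best = listed[i];
    }
    return best;
}
```

For instance, a request for 140 lands on 133 in this sketch, because 133 is closer to 140 than 150 is.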

The type parameter determines the format of data that will be played back over the timeslot.

compatibility code	new type	new sampling rate
kSMDataFormat8KHzALawPCM	kSMDataFormatALawPCM	8000
kSMDataFormat8KHzULawPCM	kSMDataFormatULawPCM	8000
kSMDataFormat8KHzOKIADPCM	kSMDataFormatOKIADPCM	8000
kSMDataFormat8KHzACUBLKPCM	kSMDataFormatACUBLKPCM	8000
kSMDataFormat6KHzALawPCM	kSMDataFormatALawPCM	6000
kSMDataFormat6KHzULawPCM	kSMDataFormatULawPCM	6000
kSMDataFormat6KHzOKIADPCM	kSMDataFormatOKIADPCM	6000
kSMDataFormat6KHzACUBLKPCM	kSMDataFormatACUBLKPCM	6000
kSMDataFormat8KHz16bitMono	kSMDataFormat16bit	8000
kSMDataFormat8KHz8bitMono	kSMDataFormat8bit	8000
kSMDataFormat8KHzSigned8bitMono	kSMDataFormatSigned8bit	8000
kSMDataFormatIMAADPCM	kSMDataFormatIMAADPCM	8000

Any form of replay requires the module outchan to have been downloaded in addition to the module that is required for the specific type of replay, and any module required for the sampling rate:

replay type	extra firmware required
kSMDataFormatALawPCM	playA
kSMDataFormatULawPCM	playmu
kSMDataFormatOKIADPCM	playoki
kSMDataFormatACUBLKPCM	playablk
kSMDataFormatSigned8bit	play8b
kSMDataFormat8bit	playms8b
kSMDataFormat16bit	play16b
kSMDataFormatIMAADPCM	playima
kSMDataFormatSpeex	speexrp

The sampling rate firmware:

sampling rate	extra firmware required
8000	-
6000	sixkout
11000	11_to_8
12000	sixkout
16000	-
22000	11_to_8
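An application that checks its firmware requirements before starting a replay could encode the sampling rate table as a lookup. The sketch below simply transcribes the table above; the function name and its return convention are hypothetical and not part of the Prosody API.

```c
#include <stddef.h>

/* Hypothetical helper: looks up the extra firmware module needed for a
 * replay sampling rate, as listed in the table above.
 * Returns 0 and sets *module (NULL when no extra module is needed) if the
 * rate is listed, or -1 if the rate does not appear in the table. */
int replay_rate_firmware(unsigned sampling_rate, const char **module)
{
    switch (sampling_rate) {
    case 8000:
    case 16000:
        *module = NULL;
        return 0;
    case 6000:
    case 12000:
        *module = "sixkout";
        return 0;
    case 11000:
    case 22000:
        *module = "11_to_8";
        return 0;
    default:
        *module = NULL;
        return -1;
    }
}
```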

See document Prosody application note: speech processing replay and record data formats for more details on data formats supported by Prosody and their appropriate use.

The data_length parameter indicates the total number of octets of speech data that the application intends to supply to the driver for replay on the given timeslot. If data_length is set to zero, the replay will be of indefinite length (in this case the replay job can be completed with sm_put_last_replay_data()).

If an event has been previously associated with channel (see sm_channel_set_event()), then the driver will notify the application with that event whenever:

capacity becomes available on the given channel to buffer up replay data
replay completes after all data_length octets were output (or last buffer of indefinite replay was output)
replay completes following abort

The channel is reserved for replaying until sm_replay_status() returns the status kSMReplayStatusComplete. No other output activity can take place on the channel during this time.

Fields

channel

The channel to perform the replay.

background

A channel which is producing a signal to be added to the replay signal. Requires the module gainbg.

volume

The volume adjustment (dB). Requires the module gainbg.

agc

Indicates whether automatic gain control is enabled (non-zero) or not (zero). Requires the module gainbg.

speed

The percentage of full speed at which the replay should work. The value zero also represents full speed. Speeds faster than 100% require the module fast to have been downloaded, while speeds slower than 100% require the module slow.

type

The type of data to replay. (See the main text above for compatibility codes that can also be used in this field.) One of these values:

kSMDataFormatNone: Special value for test purposes only. This indicates that the channel should prepare as if it was about to play or record data, but not actually transfer any data.
kSMDataFormatALawPCM: G.711 A-law. This uses 8 bits per sample.
kSMDataFormatULawPCM: G.711 mu-law. This uses 8 bits per sample.
kSMDataFormatOKIADPCM: A 4-bit coding scheme.
kSMDataFormatACUBLKPCM: This format is obsolete, as cards fitted with SHARC DSPs are no longer supported. It has never been implemented for Prosody X cards.
kSMDataFormat16bit: 16-bit linear coding, where each sample is a signed value (-32768 to 32767). The first octet of each sample is the less significant one.
kSMDataFormat8bit: 8-bit unsigned linear coding, where each sample is an unsigned value (0 to 255). This is Microsoft's 8-bit format.
kSMDataFormatSigned8bit: 8-bit linear coding, where each sample is a signed value (-128 to 127).
kSMDataFormatIMAADPCM: A 4-bit coding scheme standardised by the Interactive Multimedia Association (IMA).
kSMDataFormatSpeex: A patent and royalty-free speech compression codec. Use of the functions sm_replay_start() and sm_record_start() only allows playback and recording using the default "narrowband" Speex configuration. Other operating modes and parameters will be made available via new API calls.
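The byte order of kSMDataFormat16bit and the relationship between the two 8-bit linear formats can be shown with two short conversion routines. This is a minimal sketch with hypothetical function names. The 8-bit conversion assumes that the unsigned format places silence at 128, as is conventional for Microsoft's 8-bit format.

```c
#include <stdint.h>

/* Hypothetical helper: decodes one kSMDataFormat16bit sample from two
 * octets, where the first octet is the less significant one. */
int16_t decode_16bit_sample(const uint8_t *octets)
{
    int32_t value = (int32_t)octets[0] | ((int32_t)octets[1] << 8);
    if (value >= 32768)
        value -= 65536;          /* map 32768..65535 onto -32768..-1 */
    return (int16_t)value;
}

/* Hypothetical helper: converts an unsigned 8-bit linear sample (0 to 255)
 * to the signed 8-bit linear range (-128 to 127). */
int8_t unsigned8_to_signed8(uint8_t sample)
{
    return (int8_t)((int)sample - 128);
}
```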

data_length

The length of the replay (in octets) or zero for indefinite length replay.

sampling_rate

The sampling rate of the data to be played. Currently supported values are:

8000 - the typical rate for telephony, since it is the rate at which telephone networks themselves operate.
6000 - a rate which reduces file sizes at the cost of lower quality. Data will be upsampled to 8000 Hz.
11000 - a rate convenient for use with typical PC soundcards. This is sufficiently close to a quarter of the rate used by CDs (44100 Hz) that the difference is not significant, allowing almost universal compatibility with cheap PC soundcards which can handle 11025 Hz sampling. Data will be downsampled to 8000 Hz.
16000 - the typical rate for use with wideband RTP codecs.
12000 - will be upsampled to 16000 Hz (unlikely to be useful).
22000 - approximately half the CD rate of 44100 Hz. Data will be downsampled to 16000 Hz, allowing use with wideband RTP codecs.
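The resampling targets listed above can be summarised as a lookup from the supplied sampling rate to the rate at which the data is actually output. The function name replay_output_rate is hypothetical; the sketch transcribes the list and returns 0 for any rate that is not listed.

```c
/* Hypothetical helper (not part of the Prosody API): returns the rate in Hz
 * to which replay data of the given sampling rate is converted, according
 * to the list above, or 0 if the rate is not listed. */
unsigned replay_output_rate(unsigned sampling_rate)
{
    switch (sampling_rate) {
    case 8000:                   /* played as supplied   */
    case 6000:                   /* upsampled            */
    case 11000:                  /* downsampled          */
        return 8000;
    case 16000:                  /* played as supplied   */
    case 12000:                  /* upsampled            */
    case 22000:                  /* downsampled          */
        return 16000;
    default:
        return 0;
    }
}
```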

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_BAD_PARAMETER - out of range volume, speed or unknown type
ERR_SM_WRONG_CHANNEL_TYPE - attempt to replay on input channel
ERR_SM_WRONG_CHANNEL_STATE - replay or other non-interruptible job still in progress

Prosody speech processing: API: sm_replay_status

Prototype Definition

int sm_replay_status(struct sm_replay_status_parms *statusp)

Parameters

*statusp

a structure of the following type:

typedef struct sm_replay_status_parms {
	tSMChannelId channel;					/* inout */
	enum kSMReplayStatus {
		kSMReplayStatusComplete,
		kSMReplayStatusCompleteData,
		kSMReplayStatusUnderrun,
		kSMReplayStatusHasCapacity,
		kSMReplayStatusNoCapacity,
	} status;						/* out */
	tSM_UT32 offset;					/* out */
} SM_REPLAY_STATUS_PARMS;

Description

This call, typically invoked in response to a write event being signalled, allows an application to determine the status of replay jobs.

In order to determine the status of a specific replay job on a particular output channel, the application should set channel to specify the job concerned. On successful completion, the status field indicates the current status of the channel.

A channel ceases to be replaying when this function returns a status of kSMReplayStatusComplete. Until this happens, the channel output is reserved for the replay and cannot be used for anything else. After this happens, the channel output returns to being idle and consequently if this function is used again it will return the error ERR_SM_NO_REPLAY_IN_PROGRESS.
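The way an application reacts to each replay status can be sketched as a mapping from status to next action. The enumerations below are stand-ins declared in the same order as kSMReplayStatus so that the example compiles on its own; a real application would use the definitions from the Prosody header. The action names and the function next_replay_action are hypothetical.

```c
/* Stand-in for enum kSMReplayStatus, in the order declared above. */
enum rep_status {
    REP_COMPLETE,
    REP_COMPLETE_DATA,
    REP_UNDERRUN,
    REP_HAS_CAPACITY,
    REP_NO_CAPACITY
};

/* Hypothetical actions an application loop might take. */
enum rep_action {
    REP_ACT_STOP,     /* job finished, channel output is idle again        */
    REP_ACT_SUPPLY,   /* supply more data, e.g. with sm_put_replay_data()  */
    REP_ACT_WAIT      /* wait for the next event on the channel            */
};

enum rep_action next_replay_action(enum rep_status status)
{
    switch (status) {
    case REP_COMPLETE:      return REP_ACT_STOP;
    case REP_COMPLETE_DATA: return REP_ACT_WAIT;    /* buffered data draining */
    case REP_UNDERRUN:      return REP_ACT_SUPPLY;  /* silence is being sent  */
    case REP_HAS_CAPACITY:  return REP_ACT_SUPPLY;
    case REP_NO_CAPACITY:   return REP_ACT_WAIT;
    }
    return REP_ACT_WAIT;    /* unknown value: do nothing until next event */
}
```

The loop would keep calling sm_replay_status() until the action is REP_ACT_STOP, since only kSMReplayStatusComplete releases the channel output.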

Fields

channel

The channel which is replaying.

status

The channel's status. One of these values:

kSMReplayStatusComplete: the replay job has completed, all module buffered data has been transmitted on the channel
kSMReplayStatusCompleteData: the replay job is still ongoing, all required data for replay has been supplied, the replay job will complete once all module buffered data has been transmitted
kSMReplayStatusUnderrun: data has not been supplied sufficiently frequently to replay job and the output has been padded out with silence
kSMReplayStatusHasCapacity: the replay job is still ongoing and the module has capacity to buffer further data for the job
kSMReplayStatusNoCapacity: the replay job is still ongoing and either all the replay data has now been received by the module or the module temporarily does not have capacity to buffer further data for the job

offset

The number of octets which had been played when the replay completed.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error

Prosody speech processing: API: sm_reset_input_cptones

Prototype Definition

int sm_reset_input_cptones(struct sm_reset_input_cptones_parms *resetp)

Parameters

*resetp

a structure of the following type:

typedef struct sm_reset_input_cptones_parms {
	tSMModuleId module;					/* in */
	tSM_INT tone_set_id;					/* in */
} SM_RESET_INPUT_CPTONES_PARMS;

Description

Initially, each module has a predefined default set of recognisable call-progress tones. If additional recognisable progress tones are to be defined, it may be necessary to discard the default set first. This may be the case because, for example, the default predefined set assigns the same identifier to two variants of a call-progress tone which actually need to be distinguished by the application.

This call resets to empty the set of all call-progress tones recognised by the designated module module. When new input call-progress tones to be recognised for the module are defined through calls to sm_add_input_cptone(), frequency identifiers specified will be with respect to the set of input tones referenced by tone_set_id.

See Prosody speech processing: pre-loaded input tones for predefined sets of input tones, and see sm_add_input_tone_set() on how to define new ones.

This call can only be made when no channel is allocated on the given module.

Fields

module: The module whose call-progress tone set is to be cleared.
tone_set_id: The id of the input tone set in which future call-progress tones are defined.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_SUCH_MODULE - if no such module exists
ERR_SM_CHANNEL_ALLOCATED - channel still allocated

Prosody speech processing: API: sm_add_input_vocab

This function is deprecated.

Prototype Definition

int sm_add_input_vocab(struct sm_input_vocab_parms *vocabp)

Parameters

*vocabp

a structure of the following type:

typedef struct sm_input_vocab_parms {
	tSMModuleId module;					/* in */
	char *filename;						/* in */
	tSM_UT32 item_id;					/* out */
} SM_INPUT_VOCAB_PARMS;

Description

Download an ASR vocabulary item to the specified module module. The vocabulary is contained in the file whose name is referenced by filename. The file must be in ".sas" format.

When this ASR vocabulary item is to be included in the active vocabulary for a specific channel, the returned vocabulary identifier item_id should be included in the list of identifiers specified in a call to sm_asr_listen_for() for that channel.

If the specified vocabulary item has already been loaded onto the specified module, no loading takes place, but item_id is returned with the same value returned by the call to sm_add_input_vocab() which performed the original loading operation. This allows separate applications to make use of the same module without explicitly co-ordinating their vocabulary requirements. Note that the item_id is local to the process invoking sm_add_input_vocab(). Requires the module iwr to have been downloaded.

Fields

module (Deprecated): The module whose vocabulary is to be amended.
filename (Deprecated): The name of the file containing a vocabulary item.
item_id (Deprecated): The identifier which can be used later to refer to this item.

Returns

0 if call completed successfully, otherwise a standard error such as:

ERR_SM_DEVERR - device error
ERR_SM_NO_SUCH_MODULE - if module was invalid
ERR_SM_FILE_ACCESS - problem encountered reading or opening file.
ERR_SM_FILE_FORMAT - file is wrong type.
ERR_SM_DOWNLOAD - problem occurred during download
ERR_SM_NO_RESOURCES - no room to store this vocabulary item.

Prosody speech processing: API: sm_asr_listen_for

This function is deprecated.

Prototype Definition

int sm_asr_listen_for(struct sm_asr_listen_for_parms *listenp)

Parameters