Version: Next

Wordsets

See ASRaaS Wordsets for more information.

Wordsets are used for personalization.

A wordset is a set of words that customizes the vocabulary used by an application at runtime. For example, an application might use wordsets to fetch user-specific information and add it to a grammar as recognizable values (such as the bank account information for a specific user). The Krypton recognition engine and NLE use wordsets for dynamic content injection.

You can specify one or more .json wordsets that are passed along inline, and/or one or more compiled wordset resources with URIs.
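
As a rough sketch of the "and/or" case, a single asr block might carry both kinds of wordset at once, using the wordset_path and wordset_uri parameters described in the sections below. The file path and URI are placeholders, and combining both parameters in one block is an assumption based on the description above rather than a confirmed configuration:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_path": "/path/to/your/wordset.json",
		"wordset_uri": "urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag>/eng-USA/mix.asr",
		...
	}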

Inline Wordset

To use an inline wordset, include the path to the wordset file with the wordset_path parameter:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_path": "/path/to/your/wordset.json",
		...
	}

You can also use more than one inline wordset. To do so, pass a list of file paths with the wordset_path parameter:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_path": [
			"/path/to/your/wordset_1.json",
			"/path/to/your/wordset_2.json",
			"/path/to/your/wordset_3.json"
		],
		...
	}

There is no maximum number of inline wordsets that can be used, but using more than 10 is not recommended for performance reasons (see wordset limits).

Weights can be specified for the inline wordset(s) by setting the inline_wordset_weight parameter:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_path": "/path/to/your/wordset.json",
		"inline_wordset_weight": 0.1,
		...
	}

If inline_wordset_weight is not specified, the inline wordset weight defaults to 0.1.

  • To specify individual weights for each inline wordset, use a list of numeric weights that correspond to each wordset in order with the inline_wordset_weight parameter.
  • If you are using multiple inline wordsets and only one weight is given via inline_wordset_weight, that weight will apply to all inline wordsets.
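
For example, a configuration with three inline wordsets and one weight per wordset might look like the following sketch. The weight values are illustrative only, and each weight is assumed to apply to the path at the same position in the wordset_path list:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_path": [
			"/path/to/your/wordset_1.json",
			"/path/to/your/wordset_2.json",
			"/path/to/your/wordset_3.json"
		],
		"inline_wordset_weight": [0.1, 0.2, 0.1],
		...
	}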

Compiled Wordsets

To use one compiled wordset, specify the URI of the compiled wordset resource with the wordset_uri parameter:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_uri": "urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag>/eng-USA/mix.asr",
		...
	}

Multiple compiled wordset URIs can be specified (up to 5; see wordset limits). To do so, pass a list of strings with the wordset_uri parameter:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_uri": [
			"urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag_1>/eng-USA/mix.asr",
			"urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag_2>/eng-USA/mix.asr",
			"urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag_3>/eng-USA/mix.asr"
		],
		...
	}

The weights of the compiled wordsets can be set using the compiled_wordset_weight parameter:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_uri": "urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag>/eng-USA/mix.asr",
		"compiled_wordset_weight": 0.2,
		...
	}

If compiled_wordset_weight is not specified, the weight of a compiled wordset defaults to 0.1.

Notes:

  • To specify individual weights for each compiled wordset, use a list of numeric weights that correspond to each wordset in order with the compiled_wordset_weight parameter.
  • If you are using multiple compiled wordset resources and only one weight is given via compiled_wordset_weight, that weight will apply to all compiled wordsets.
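
For example, a configuration with three compiled wordsets and one weight per wordset might look like the following sketch. The weight values are illustrative only, and each weight is assumed to apply to the URI at the same position in the wordset_uri list:

	"asr": {
		"topic": "GEN",
		"auto_punctuate": false,
		"wordset_uri": [
			"urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag_1>/eng-USA/mix.asr",
			"urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag_2>/eng-USA/mix.asr",
			"urn:nuance-mix:tag:wordset:lang/<companion_DLM_context_tag>/<wordset_tag_3>/eng-USA/mix.asr"
		],
		"compiled_wordset_weight": [0.2, 0.1, 0.1],
		...
	}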

See the ASRaaS Wordset Training gRPC API for more information about creating and interacting with ASRaaS compiled wordset resources.