We often use dynamic grammars in our applications. In fact, most of our applications use some form of dynamic grammar. This is why we long ago came to the conclusion that a complete grammar development and deployment solution had to be able to support both static and dynamic grammars.
From many interesting discussions we’ve had lately (in particular since the introduction of the NuGram Platform at SpeechTEK 2008), however, we have come to realize that people who develop grammars (VUI designers, speech scientists, application developers) do not always fully leverage dynamic grammars. For this reason, we thought it could be interesting to share our thoughts on use cases for dynamic grammars.
In this article, we will focus on motivations and usage scenarios. The next article will focus on describing a number of specific examples of dynamic grammars commonly – and perhaps not so commonly – used in speech applications. So let’s start with motivations. The main ones we see are the following:
- The grammar content is only known at run-time — This, of course, is the obvious case. Many situations require grammars to be generated on-the-fly based on information obtained during the call, either from an outside source (e.g., through a web service or a database query) or directly from the user (e.g., from the recognition results of a previous interaction).
- To improve recognition accuracy — Dynamic grammars can significantly improve recognition accuracy by making it possible to constrain the recognition grammar based on information available at run-time. It’s important to emphasize that this is often a much better solution than applying the same constraints while post-processing the recognition result (e.g., using a combination of SWI_vars and SWI_disallow with OSR or Nuance 9). Indeed, constraining the grammar prior to recognizing the utterance will almost always provide faster recognition and, more importantly, more accurate results than removing “disallowed” hypotheses as a post-processing step. It’s easy to understand why. An insufficiently constrained grammar has two costs. First, the recognizer wastes computational resources searching hypotheses that will be thrown away during post-processing. Second, and more seriously, the unnecessary alternatives in the grammar will often cause the correct hypothesis to be pruned from the N-best list, which reduces accuracy.
- To avoid using proprietary engine features — For instance, although SWI_vars and SWI_disallow may sometimes offer an acceptable alternative to using dynamic grammars, one should not forget that relying on them restricts the application to specific recognition engines. The use of dynamic grammars provides a much more portable solution, while being more accurate.
- To solve maintenance problems — Let’s say that, for accuracy reasons, we want the date grammar used by an application to be constrained to the current year or the next. This, for instance, would be the case for a travel application asking about a departure date. If the grammar is static, someone has to modify it once a year. That is a dangerous proposition: if for any reason the update doesn’t get done, the application may completely stop working at some point. A better solution is to use a dynamic grammar that is always generated from the current year. This completely solves the maintenance problem while making sure that the grammar used always provides optimal performance.
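To make the date example above more concrete, here is a minimal sketch of a year grammar generated from the current date. It assumes the recognizer accepts W3C SRGS XML grammars with SISR-style semantic tags. The helper names (`two_digits_to_words`, `year_to_words`, `build_year_grammar`) are ours for illustration only, and the spoken forms are deliberately limited to "two thousand ..." for the years 2000 to 2099; a real date grammar would cover many more phrasings.

```python
import datetime
from xml.sax.saxutils import escape

ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits_to_words(n: int) -> str:
    """Spell out 1..99 in English; returns an empty string for 0."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def year_to_words(year: int) -> str:
    """Spell out a year between 2000 and 2099 (illustrative range only)."""
    if not 2000 <= year <= 2099:
        raise ValueError("this sketch only covers the years 2000-2099")
    return ("two thousand " + two_digits_to_words(year - 2000)).strip()


def build_year_grammar(today: datetime.date) -> str:
    """Return an SRGS XML grammar that accepts only this year or the next."""
    items = "\n".join(
        f"      <item>{escape(year_to_words(y))}<tag>out = {y};</tag></item>"
        for y in (today.year, today.year + 1)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"\n'
        '         xml:lang="en-US" mode="voice" root="year"\n'
        '         tag-format="semantics/1.0">\n'
        '  <rule id="year" scope="public">\n'
        '    <one-of>\n'
        f"{items}\n"
        '    </one-of>\n'
        '  </rule>\n'
        '</grammar>\n'
    )


if __name__ == "__main__":
    # Generated at request time, so the grammar never goes stale.
    print(build_year_grammar(datetime.date.today()))
```

Because the grammar is derived from the date at the moment it is requested, nobody has to remember the yearly update.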
We should point out, however, that it’s sometimes much easier to implement complex constraints with a combination of ECMAScript, SWI_vars, and SWI_disallow (when possible) than to dynamically generate a grammar that has the same constraints built-in. For instance, dynamically generating a grammar that only supports numbers between arbitrary lower and upper bounds is not a trivial matter, while enforcing the same bounds with ECMAScript is straightforward. In some cases, the best solution is a combination of both techniques.
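As a rough illustration of why the post-processing route is so easy for bounded numbers, here is the same idea expressed at the application level in Python. This is not the SWI_disallow mechanism itself, only an analogous filter applied to an N-best list; the `Hypothesis` shape and the `first_allowed` name are assumptions made for this sketch.

```python
from typing import List, Optional, Tuple

# Hypothetical N-best entry: (recognized number, confidence score).
Hypothesis = Tuple[int, float]


def first_allowed(nbest: List[Hypothesis], low: int, high: int) -> Optional[Hypothesis]:
    """Return the best-ranked hypothesis whose value lies in [low, high].

    The list is assumed to be ordered from most to least likely.
    Returns None when no hypothesis satisfies the bounds.
    """
    for value, confidence in nbest:
        if low <= value <= high:
            return (value, confidence)
    return None


if __name__ == "__main__":
    nbest = [(150, 0.71), (115, 0.64), (50, 0.40)]
    print(first_allowed(nbest, 100, 120))
```

The filter is only a few lines long, but it can only pick among hypotheses that survived recognition. If the correct number was pruned before reaching the N-best list, no amount of post-processing will bring it back, which is the accuracy argument made earlier.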
Now let’s discuss usage scenarios. There are actually many ways in which dynamic grammars can be used. For instance:
- On-the-fly — Dynamic grammars can be used on-the-fly to generate grammars that are based on data specific to a given call. This is the most dynamic situation, in which almost every call ends up using different instances of the same dynamic grammar. In this case, a new grammar instance must be generated, loaded, and compiled for every single call, which may introduce latency if grammars are large.
- Offline (triggered) — The generation of a new grammar can be triggered by an event occurring outside of the IVR application. For instance, the generation of a new speech attendant grammar could be triggered by a change in the company’s corporate directory.
- Offline (scheduled) — Dynamic grammars can be used offline, as part of a regularly scheduled grammar maintenance process. For instance, dynamic grammars could be used in order to provide a biweekly stock quote grammar maintenance service in which new (static) grammars are delivered every other week based on an updated list of companies.
- Offline (build time) — Dynamic grammars can also be used as an integral part of the application build process, where some of the grammars are generated based on company-specific data. For instance, a grammar used to recognize branch names and addresses would need to be produced based on branch data provided by the company. In this case it’s probably necessary to also have a scheduled maintenance process in order to make sure that the application remains up-to-date with changes at the company.
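The triggered scenario above can be sketched as a small regeneration step that rewrites the grammar only when the underlying data has actually changed. This is a minimal sketch under stated assumptions: the directory is a plain list of names, the output is an SRGS XML file, and change detection uses a SHA-256 fingerprint stored next to the grammar. All function names and the file layout are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterable
from xml.sax.saxutils import escape


def directory_fingerprint(names: Iterable[str]) -> str:
    """Hash the sorted names so that reordering alone does not trigger a rebuild."""
    payload = json.dumps(sorted(names), ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_names_grammar(names: Iterable[str]) -> str:
    """Return an SRGS XML grammar with one alternative per directory name."""
    items = "\n".join(f"      <item>{escape(n)}</item>" for n in sorted(names))
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"\n'
        '         xml:lang="en-US" mode="voice" root="name">\n'
        '  <rule id="name" scope="public">\n'
        '    <one-of>\n'
        f"{items}\n"
        '    </one-of>\n'
        '  </rule>\n'
        '</grammar>\n'
    )


def regenerate_if_changed(names: Iterable[str], grammar_path: Path) -> bool:
    """Rewrite the grammar file only when the directory content changed.

    Returns True if a new grammar was written, False if the existing one
    is still up to date.
    """
    names = list(names)
    stamp_path = grammar_path.with_name(grammar_path.name + ".sha256")
    fingerprint = directory_fingerprint(names)
    if (grammar_path.exists() and stamp_path.exists()
            and stamp_path.read_text(encoding="utf-8") == fingerprint):
        return False
    grammar_path.write_text(build_names_grammar(names), encoding="utf-8")
    stamp_path.write_text(fingerprint, encoding="utf-8")
    return True
```

The same function could be called from a directory-change event, a scheduled job, or a build script. Any engine-specific compilation step would run only when it returns True.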
Note that in order to avoid an undesirable delay caused by the compilation of the grammar during a call, grammars generated offline could also be pre-compiled before they are used by the application. This is particularly important, if not mandatory, for very large grammars, some of which might take several seconds – if not minutes – to compile. Note also that a change in a grammar used by an application often requires other portions of the application to be updated as well. For instance, an updated grammar may call for new confirmation prompts.
Next post: A bunch of dynamic grammar examples. If you have any examples to suggest, let us know.
