July 15th, 2010

by Dominique Boucher

An ABNF primer

Interestingly, many hits on our NuGram web site come from people searching for “ABNF tutorial” on one of the major search engines. Although we provide great tools for working with ABNF grammars, we don’t provide any introductory text on the ABNF syntax. That’s a shame!

To remedy this situation, I have just posted on Slideshare a presentation, extracted from our training material, that covers the basic concepts of ABNF grammars.

Remember that ABNF is the native syntax for many speech recognition (ASR) engines. And if your ASR doesn’t support it, let NuGram IDE handle the conversion to XML for you!

June 16th, 2010

by Dominique Boucher

NuGram IDE 2.2 available now!

Together with the introduction of NuGram Server Free Developer Edition, the Nu Echo team is pleased to announce the release of NuGram IDE 2.2. With this new release, designing and testing dynamic grammars has never been easier.

The most important feature introduced in this release is support for using Java to populate dynamic grammars. With NuGram IDE, you test and tune your grammar with the exact same code that will run in production, but without the long cycle of stopping, deploying, and restarting a Java web application. And deploying your grammars in NuGram Server is as simple as deploying JSP pages.

The free Basic Edition can be downloaded directly from within your Eclipse environment. Simply follow the download instructions. Or contact us for the Professional Edition.

We’re sometimes asked to compare the performance of different speech recognition engines on an identical task (same grammar, same set of test utterances). To do so effectively, we rely on three features of NuGram Server that we use extensively in our tuning environment: on-the-fly conversion of grammars to any format, semantic interpretation of textual sentences, and a NuGram-specific meta value that removes all semantic tags from generated grammars.

One powerful aspect of our tuning environment is that, no matter which recognition engine we use, we perform speech recognition experiments and then score and analyze the results in exactly the same way. It’s all completely transparent. This makes it easy to run the exact same experiment using different recognition engines and then compare results using metrics, graphs, and other tools that are used consistently across all engines.

A big challenge when comparing different engines is that we usually can’t use the same grammar since different engines often use incompatible grammar tag formats. For instance, let’s say we have a recognition grammar for the Loquendo LASR speech recognition engine and we would like to compare the performance we get with this grammar using three different engines: Loquendo LASR, OSR 3.0, and Nuance 8.5. In that case, we have three different tag formats: Loquendo uses SISR, OSR 3.0 uses swi-semantics and Nuance 8.5 uses the Nuance GSL proprietary tag format. So in principle, we would need to convert the grammars for each recognition engine, which can be a significant effort for complex grammars.

No need for manual grammar conversion

It is, however, possible to compare different recognition engines without having to manually convert the grammars. The approach we use is quite simple: with each engine that is not compatible with the original grammar’s tag format, we perform speech recognition using a grammar from which the semantic tags have been removed, and we then add the semantic information back to the recognition result as a post-processing step.

This is all done using NuGram Server, as follows.

We start with the original grammar in ABNF format (here credit-card.abnf), which we use as is for the recognition test with Loquendo LASR. Then, we add a special-purpose NuGram meta directive to the grammar, which tells NuGram Server to omit the semantic tags when generating the grammar:

#ABNF 1.0 ISO-8859-1;
language en-US;
mode voice;
tag-format <semantics/1.0>;

meta "com.nuecho.generation.omit-tags" is "true";

root $main;

When we perform the recognition test with OSR, we tell NuGram Server that we want to use credit-card.grxml (note the extension). NuGram Server then automatically converts credit-card.abnf to the SRGS XML format, while omitting the semantic tags from the grammar. Recognition then proceeds without a hitch, but results are returned without any semantic slots.
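To give a rough idea of what omitting the semantic tags amounts to, here is a minimal Python sketch. It is not how NuGram Server does it: it is a toy that assumes tags use the two delimiter forms of the ABNF syntax, { ... } and {!{ ... }!}, and it ignores corner cases such as braces inside quoted tokens. The name strip_tags is made up for this illustration.

```python
import re

# Matches both ABNF tag forms: the {!{ ... }!} form is tried first,
# then the plain { ... } form (which cannot contain a closing brace).
TAG_PATTERN = re.compile(r"\{!\{.*?\}!\}|\{[^}]*\}", re.DOTALL)


def strip_tags(abnf_text: str) -> str:
    """Return the grammar text with its semantic tags removed (toy version)."""
    without_tags = TAG_PATTERN.sub("", abnf_text)
    # Collapse the runs of spaces left behind by the removed tags.
    return re.sub(r"[ \t]{2,}", " ", without_tags)


rule = '$digit = one {out = "1"} | two {!{ out = "2"; }!};'
print(strip_tags(rule))  # prints: $digit = one | two ;
```

The words and the rule structure are untouched, so the engine recognizes exactly the same sentences. Only the slot-filling instructions are gone.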

Similarly, when we perform the recognition test with Nuance 8.5, we tell NuGram Server that we want to use credit-card.gsl, which tells NuGram Server to automatically convert credit-card.abnf into a GSL grammar (still without semantic tags). Recognition once again proceeds without a hitch and results are returned without semantic slots.

Finally, in order to get recognition results with semantic slots, we simply send the original credit-card.abnf grammar and the recognition results to NuGram Server in order to add semantic slots to the recognition results. In other words, semantic interpretation is done as a post-processing step by NuGram Server based on the SISR tags in the original grammar.
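The idea of this post-processing step can be sketched in a few lines of Python. This is only an illustration, not the NuGram Server API: the "tagged grammar" is reduced to a hypothetical word-to-digit table (DIGIT_TAGS), and the names interpret and add_semantics, as well as the shape of a hypothesis, are assumptions made for the example.

```python
# Toy stand-in for the original tagged grammar: each word maps to the
# value its semantic tag would contribute to the "number" slot.
DIGIT_TAGS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}


def interpret(text):
    """Re-parse a recognized sentence and rebuild its semantic slots.

    Returns None when the sentence is not covered by the toy grammar.
    """
    digits = []
    for word in text.lower().split():
        if word not in DIGIT_TAGS:
            return None
        digits.append(DIGIT_TAGS[word])
    return {"number": "".join(digits)}


def add_semantics(nbest):
    """Attach slots to every hypothesis of a slot-less N-best list."""
    return [dict(hyp, slots=interpret(hyp["text"])) for hyp in nbest]


# Hypotheses as they might come back from an engine that was given
# the tag-free grammar: text and confidence, but no slots.
raw_results = [{"text": "four one one one", "confidence": 0.82}]
print(add_semantics(raw_results))
```

Because the interpretation depends only on the recognized text and the original grammar, the same step applies to the output of any engine.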

Note that if the original grammar had been a GSL grammar or an OSR grammar, NuGram Server could still have computed the semantic interpretation based on the semantic tags in the original grammar (NuGram understands many different tag formats).

Dealing with engine-proprietary features

Some engine-proprietary features might make results more difficult to compare. For instance, OSR and Nuance 9 provide the special-purpose SWI_disallow key, which can be used to remove hypotheses from the N-best list of recognition hypotheses returned by the engine. This could be used, for instance, to remove credit card numbers that don’t have a valid checksum, thereby improving recognition accuracy.
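As an illustration of the kind of check involved, here is a short Python sketch of a checksum test. It assumes the checksum in question is the Luhn algorithm commonly used for credit card numbers. The function names are made up for the example, and this is not the code used inside any engine.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def drop_invalid_numbers(candidates):
    """Keep only the candidate numbers with a valid checksum."""
    return [number for number in candidates if luhn_valid(number)]


# 4111111111111111 is a widely used test number; the second candidate
# differs by one digit and fails the checksum.
print(drop_invalid_numbers(["4111111111111111", "4111111111111112"]))
```

A single misrecognized digit almost always breaks the checksum, which is why this filter is so effective at pruning wrong hypotheses.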

This useful feature could make recognition results difficult to compare if some engines have it and others don’t (in which case an equivalent result could be obtained by removing invalid hypotheses in the application). Fortunately, in our recognition tests we can tell NuGram Server to remove from the N-best list those hypotheses that match a specified slot pattern (e.g., SWI_disallow=1). This once again allows fair and accurate comparisons between OSR or Nuance 9 and other engines.
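The filtering itself is simple. Here is a minimal Python sketch, assuming each hypothesis is a dictionary with a "slots" entry and that a pattern is written as key=value. These names and data shapes are assumptions for the example, not the NuGram Server interface.

```python
def parse_pattern(spec):
    """Turn a pattern such as 'SWI_disallow=1' into {'SWI_disallow': '1'}."""
    key, _, value = spec.partition("=")
    return {key.strip(): value.strip()}


def remove_matching(nbest, pattern):
    """Drop the hypotheses whose slots contain every pair in `pattern`."""
    def matches(hyp):
        slots = hyp.get("slots") or {}
        # An empty pattern matches nothing, so nothing is removed.
        return bool(pattern) and all(
            key in slots and str(slots[key]) == str(value)
            for key, value in pattern.items()
        )
    return [hyp for hyp in nbest if not matches(hyp)]


nbest = [
    {"text": "four one one two", "slots": {"number": "4112", "SWI_disallow": 1}},
    {"text": "four one one one", "slots": {"number": "4111"}},
]
# Only the second hypothesis survives the filter.
print(remove_matching(nbest, parse_pattern("SWI_disallow=1")))
```

Applying the same filter to the results of every engine puts them all on an equal footing, whether or not the engine honors the key natively.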

May 25th, 2010

by Dominique Boucher

The NuGram approach to dynamic grammars

I have just uploaded to Slideshare a short presentation about the Nu Echo approach to dynamic grammars.

For text-based applications too!

Remember that NuGram Server is not only for speech-enabled applications. You can use it to parse text-based sentences, too. So it is an ideal complement to your preferred cloud-based SMS or IM application platform, such as Tropo, Twilio, or Teleku, to name just a few.

Try it now!

It’s free for development use, so don’t be shy. Give it a try! You simply need to register, upload your grammars, and use one of the many APIs we provide.

March 30th, 2010

by Dominique Boucher

NuGram 2.1 available now!

The Nu Echo team is proud to announce the availability of NuGram 2.1.

The noteworthy new features in this release are:

  • Enhanced sentence generation tool (NuGram IDE)
    The sentence generation algorithm has been further improved and a new strategy (Rule examples) has been added. Sentences can now be generated from specific sentence patterns. Also, the generation process can be stopped.
  • New sentence explorer (NuGram IDE)
    The user interface of the sentence explorer has been completely redesigned. It is now much more intuitive and easier to use. It also allows sentence patterns to be added to the sentence generation tool.
  • Semantic interpretation optimizations
    All supported semantic tag languages based on ECMAScript have been optimized (compiled scripts are now properly cached). This dramatically increases the performance of the coverage test tool.
  • Complete rewrite of the underlying parsing algorithm
    The algorithm that matches sentences with the grammar rules has been completely rewritten. It is now much more efficient in terms of both speed and memory consumption.
  • Post-processing API (NuGram Server SDK / Professional Edition only)
    NuGram Server now provides an API to implement and deploy application-specific post-processing routines.
  • Initial support for different target ASR engines (NuGram IDE)
    It is now possible to specify the target ASR engine in the preferences. This affects the way words in grammars are normalized, and also how grammars are converted to GrXML.
  • Small enhancements to the ABNF dynamic grammar templating language
    The templating language now supports new forms, like optional grammar headers.
  • Many small improvements and bug fixes

The free Basic Edition can be downloaded directly from within your Eclipse environment. Simply follow the download instructions. Or contact us for the Professional Edition.

Please let us know what you think of these new features!