Tag Archives: abnf

An ABNF primer

Interestingly, a lot of hits on our NuGram web site come from people looking for the words ABNF tutorial on one of the major search engines. And although we provide great tools for working with ABNF grammars, we don’t provide any introductory text on the ABNF syntax. That’s a shame!

To remedy this situation, I just posted a presentation on SlideShare, extracted from our training material, that covers the basic concepts of ABNF grammars.

Remember that ABNF is the native syntax for many speech recognition (ASR) engines. And if your ASR doesn’t support it, let NuGram IDE handle the conversion to XML for you!

Refactoring tools and grammar development

Refactoring tools are incredibly popular in the programming community. Most modern programming environments provide refactoring tools of various degrees of sophistication.

But what are refactoring tools? In short, they are tools that modify programs without changing their runtime semantics. In other words, refactoring tools must not introduce an observable difference in the execution of the program. They help abstract common code, change variable names, rename procedures or methods, etc.

Refactoring tools help developers perform repetitive code restructuring tasks that would otherwise be highly error-prone if done by hand. Without such tools, even the simplest form of refactoring – renaming a variable in a file – can easily cause unexpected problems if done using a simple search and replace. Now imagine renaming a public method in an object-oriented language, where the method can be invoked from many different places in the whole project source code…

Refactoring applied to speech recognition grammars

Like refactoring tools for programming languages, grammar refactoring tools help modify grammars without changing the language they accept or the values they return when interpreting sentences. A number of common tasks involved in writing speech recognition grammars can benefit from refactoring tools. Here are a few:

  • Rule renaming. Naming things is hard. I am a programmer myself and I always find it hard to come up with the most precise name for a class, a variable, a procedure, or method. Naming grammar rules is just as hard. The programming environment should make it easy to rename a rule when we find a better name, in such a way that we don’t break the grammar. In other words, the renaming tool must rename the rule definition, as well as all its references (and potentially the root header). But just as important, semantic tags must also be taken into account when renaming a rule. How many times have you forgotten to modify the semantic tag after renaming a rule? A proper refactoring tool must therefore ensure that references to the rule in all semantic tags be modified as well.
  • Slot renaming. Likewise, slot names are often renamed. A renaming tool must ensure that all references in the defining rule as well as the references in other rules be changed at once.
  • Rule extraction. Another common task for the grammar writer is the extraction of a rule expansion to create a new rule. Grammars are often built incrementally. The grammar writer begins by coding a few rules, discovers potential for reuse, and creates new rules encapsulating these reusable parts. If the extracted parts contain semantic tags, it can be tricky (and highly error-prone) to modify them by hand and make sure that the semantic slots computed by the new rule are properly propagated to the referencing rule.
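To make the rule renaming requirement concrete, here is a minimal Python sketch. It is not how NuGram IDE works internally (the IDE operates on an abstract representation of the grammar, not on text); it only illustrates which occurrences have to change together. The sample grammar and the function name rename_rule are hypothetical, and the sketch assumes the semantics/1.0 tag format, where a tag refers to the result of a rule as rules.<name>.

```python
import re

# Hypothetical sample grammar (semantics/1.0 tag format).
GRAMMAR = """\
#ABNF 1.0 UTF-8;
language en-US;
root $main;
tag-format <semantics/1.0>;

public $main = $digit {out = rules.digit;} | $digits {out = rules.digits;};
$digit = one {out = 1;} | two {out = 2;};
$digits = $digit {out = rules.digit;} $digit {out = out * 10 + rules.digit;};
"""


def rename_rule(grammar: str, old: str, new: str) -> str:
    """Rename rule `old` to `new` in the definition, the references,
    the root header, and `rules.<old>` occurrences in semantic tags.

    Regex-based and for illustration only: it would also rewrite matches
    inside string literals, which a real tool must not do.
    """
    name = re.escape(old)
    # `$old` covers the rule definition, local references, and the root header.
    # The negative lookahead leaves `$digits` alone when renaming `$digit`.
    grammar = re.sub(r"\$" + name + r"(?!\w)", lambda m: "$" + new, grammar)
    # `rules.old` covers references to the rule inside semantic tags.
    grammar = re.sub(r"\brules\." + name + r"(?!\w)", lambda m: "rules." + new, grammar)
    return grammar


if __name__ == "__main__":
    print(rename_rule(GRAMMAR, "digit", "number"))
```

A plain search and replace of "digit" would also have turned $digits into $numbers and rules.digits into rules.numbers, which is exactly the kind of breakage a refactoring tool is meant to prevent.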

Challenges

SRGS grammars pose a number of significant challenges for refactoring tools:

  • They combine two different languages, namely the SRGS language itself for expressing the valid sequences of words, and the semantic tag language. These two languages have very different semantics.
  • The most common semantic tag languages are based on ECMAScript, a highly dynamic scripting language. The refactoring tools must thus understand the ECMAScript language and its various constructs to properly do their job.
  • The semantic tag language can vary from one ASR engine to the other.

Refactoring support in NuGram IDE

The refactorings described above are all supported by NuGram IDE. Moreover, they are aware of the semantic tag language declared by the grammar – they behave differently depending on whether the tag-format header is semantics/1.0 or swi-semantics/1.0 (the Nuance tag format is not yet supported). This, BTW, is the kind of thing that cannot be done by a generic XML editor.

To rename a rule, put the cursor on a rule name (the definition or a reference), and press Alt-Shift-R. You should see something like:

As you can see, all the references that must be changed at once are surrounded by a gray rectangle, even in the semantic tags.

To rename a semantic slot, put the cursor on a reference to the slot and press the same key sequence (Alt-Shift-R):

All the definitions and references will be modified at once when you change the slot’s name (here the semantic tags are in the swi-semantics/1.0 tag format). Note that all the references to the slot will be changed in the other rules as well, not only in the defining rule.

Finally, to extract an expansion in a new rule, simply select the expansion:

and type Alt-Shift-T:

You see that a new private rule has been created (the default visibility for newly created rules can be configured in the preferences), and a new tag has also been created to propagate the slots returned by the new rule to the calling rule.

These were very simple examples. Consider this (somewhat contrived) rule:

If I want to rename the $digit local rule, should the tool also rename the rules.digit property? That’s not clear. If the rule $<special.abnf#digit> is matched, rules.digit will contain the semantic value returned by that rule. Otherwise, it will contain the semantic value returned by the last match to $digit. There is an ambiguity here. The same identifier may refer to two different things.
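The kind of check a tool has to make here can be sketched in a few lines of Python. The rule text below is a hypothetical stand-in that exhibits the ambiguity just described (it is not the rule from the original example), and the function name is an assumption made for illustration. The sketch only detects the situation; deciding what to do about it is left to the user.

```python
import re

# Hypothetical rule: rules.digit may come from the external rule or the local one.
RULE = "$number = ($<special.abnf#digit> | $digit) {out = rules.digit;};"


def is_rename_ambiguous(rule_text: str, name: str) -> bool:
    """Return True if `rules.<name>` in this rule may refer either to the
    local rule `$<name>` or to an external rule with the same name."""
    n = re.escape(name)
    # Local reference: `$name` (an external reference starts with `$<`).
    local_ref = re.search(r"\$" + n + r"(?!\w)", rule_text)
    # External reference: `$<some-uri#name>`.
    external_ref = re.search(r"\$<[^>#]*#" + n + r">", rule_text)
    # Use of the rule's value in a semantic tag.
    tag_ref = re.search(r"\brules\." + n + r"(?!\w)", rule_text)
    return bool(local_ref and external_ref and tag_ref)


if __name__ == "__main__":
    print(is_rename_ambiguous(RULE, "digit"))
```

When the check fires, blindly renaming rules.digit could change the value returned when the external rule matches, so the safe behavior is to ask rather than guess.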

Fortunately, if I try to rename the $digit rule using NuGram IDE, it won’t blindly attempt to rename the slot. It will instead pop up the following dialog:

Of course, in practice grammars are rarely that hairy and complex. But refactoring tools must be correct 100% of the time. Otherwise, people would not use them for fear of breaking their programs or grammars.

Finally, note that all NuGram IDE refactoring tools are available not only for plain ABNF grammars but also for the dynamic extensions. It is possible to rename variables, rename macros, and extract macros.

If you think of other repetitive grammar-related tasks that could be automated that way, please let us know. We strongly believe in powerful tools that help make applications more robust!

4 (not so good) reasons to author grammars in XML

At SpeechTEK University this summer, Judi Halperin from Avaya and Jenni McKienzie from Travelocity gave a very good introduction to grammar writing. The slides are definitely worth reading. They did a good job of addressing the most common sources of problems with speech recognition grammars.

However, two things struck me in their presentation: (1) They use the SRGS XML Form as the authoring language for speech recognition grammars, and (2) They mention JSP or ASP pages as the most common way of dynamically generating grammars. I’ll keep the latter point for another post, but let me address the first point here.

Having long ago abandoned the XML Form in favor of ABNF in our own practice, we’re always intrigued by the fact that a large number of grammar developers – including expert developers like Judi and Jenni – continue using the XML Form. (In the case of Judi and Jenni’s presentation, I can see why, in a teaching situation with time constraints, they would choose GRXML for the examples: more people are familiar with that format, and those who aren’t can read it easily. Their choice was certainly a conscious decision.) Indeed, there is just no question in our minds that ABNF, being so much more compact, readable, and easy to manipulate than the XML Form, is by far the better choice.

I therefore tried to put myself in the shoes of those developers using the XML Form and understand their motivations. So here’s what I came up with:

  1. XML is the native format for the ASR engine. It’s true that some ASR engines – Nuance’s OSR and Nuance 9 in particular – only support the XML Form. It’s also true that support for the SRGS XML format is required by the specification, while support for ABNF is only optional. But there are format converters out there, so even on these platforms, the ABNF format can be used to author the grammar.
  2. It’s painful having to convert from ABNF to XML all the time. That’s a good point. Many testing tools provided with ASR engines (e.g., parseTool) will require you to convert the grammar to the XML form, which can indeed be painful. This is especially true if conversion tools are not well integrated with the environment in which grammars are being edited.
  3. XML is the format for all documents in the project. I heard this a few times. Some hard-core developers like XML. But that implies that the VUI designer, the speech scientist, or whoever authors the grammars, actually is a software developer. Quite often, that’s not the case.
  4. There is no good ABNF editor. I think this is the crux of the problem. Kind of a chicken and egg situation. No one uses ABNF because there is no good editor and no one provides a good ABNF editor because there is no demand for it. At least, with a decent XML editor, you get syntax coloring, code assist based on the document schema, etc. Unfortunately, an XML editor doesn’t know anything about grammars and therefore cannot provide advanced features like syntax checking of semantic tags, or refactoring capabilities (expansion extraction, rule renaming, semantic slot renaming, etc.).
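To give a feel for what a format converter does (point 1 above) and for how the two forms compare in size, here is a toy Python sketch. It is an assumption-laden illustration, not NuGram IDE's converter: the function name is made up, and it only handles a single rule made of plain word alternatives, with no semantic tags, rule references, weights, or repeats.

```python
import re
import xml.etree.ElementTree as ET


def abnf_rule_to_grxml(rule: str) -> str:
    """Convert one ABNF rule of the form `[public] $name = a | b | c;`
    into the corresponding SRGS XML <rule> element (as a string)."""
    m = re.fullmatch(r"\s*(public\s+)?\$(\w+)\s*=\s*(.+?)\s*;\s*", rule)
    if m is None:
        raise ValueError("unsupported rule: " + rule)
    scope, name, body = m.groups()
    rule_el = ET.Element("rule", {"id": name})
    if scope:
        rule_el.set("scope", "public")
    # Each `|`-separated alternative becomes an <item> inside <one-of>.
    one_of = ET.SubElement(rule_el, "one-of")
    for alternative in body.split("|"):
        ET.SubElement(one_of, "item").text = alternative.strip()
    return ET.tostring(rule_el, encoding="unicode")


if __name__ == "__main__":
    abnf = "public $yesno = yes | no;"
    xml = abnf_rule_to_grxml(abnf)
    print(abnf)
    print(xml)
    print(len(abnf), "characters in ABNF,", len(xml), "in XML")
```

Even for this trivial rule, the XML form is several times longer than the ABNF form, and the gap widens once tags and nested expansions are involved.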

However valid these points might have been at some point, now that there is a complete environment for developing, testing, and debugging recognition grammars in ABNF format (and exporting them to any target ASR engine), I don’t think there is any remaining reason not to switch to ABNF. Like, immediately.

Am I missing something? Are there other more fundamental reasons I did not see? Let me know!

I am deeply convinced that once you try authoring your grammars in ABNF using NuGram IDE, you won’t want to go back to your old habits of coding grammars in the XML Form. Give it a try! It’s free. And, by the way, remember that more and more speech recognition engines support ABNF natively.

The best time to migrate to NuGram IDE is NOW

You are at the start of a new VoiceXML project. Or you’ve just completed a project and you are slowly entering maintenance mode. Better yet, you’re in the middle of a large project involving speech recognition grammars. Whatever situation you’re in, now is the best time to migrate to NuGram IDE. You may find that this is one of the best moves you’ve made in a long time. Here is why:

  1. It’s easy. If you haven’t already done so, downloading and installing NuGram IDE takes only a few minutes. Then, converting existing grammars to ABNF (assuming that you don’t already use the ABNF format) is a matter of seconds. In the Navigator view, right-click a .grxml file to open the contextual menu and select “Grammar Tools > Convert to ABNF”. It’s as simple as that. You’re using GSL grammars? Don’t despair! The next release, due real soon, will provide a GSL to ABNF converter.
  2. You’ll increase productivity. Yes, installing NuGram IDE and converting grammars will cost you a few minutes of your time. But you will rapidly recover this investment many times over through increased productivity:
    • NuGram IDE provides many powerful tools to help you edit, debug and maintain your grammars in the same environment as your preferred Eclipse-based service creation environment, be it VoiceObjects Desktop, Cisco CVP Studio, etc.
    • NuGram IDE provides a “builder” that automatically converts ABNF grammars to the format of your choice as soon as you save them. No need to manually convert each grammar one at a time.
  3. You’ll increase quality. NuGram IDE was designed to maximize grammar quality by:
    • Helping you find grammar problems quickly and fix them easily. For instance, the grammar editor instantaneously flags syntax errors with meaningful diagnostic information and the coverage tool enables you to make sure that the grammar hasn’t been accidentally broken.
    • Providing powerful transformation and refactoring tools that always preserve the integrity of the grammar, therefore avoiding tedious and error-prone manipulations. This directly results from the fact that all NuGram IDE tools truly understand the underlying grammar structure since they work on an abstract representation level, not on the textual level.
  4. It’s free. We provide the beta version completely free of charge. And once we reach GA, the Basic Edition will remain free. You just need to register to be able to download new versions of NuGram IDE and be notified of new releases.
  5. There’s no risk. You don’t like using NuGram IDE? Easy. Just export the grammars to your preferred format and go back to using your old tools. But frankly, we don’t believe you’ll ever want to do that.

So why wait? Register and download NuGram IDE now! Start using it and give us feedback. Help us provide you with the best tools ever for grammar development.

Why a grammar platform

On effective grammar tools

Why are there so many VoiceXML “Service Creation Environments” (also called “dialog designers” or “dialog builders”) available – some of them actually quite good – but no decent Grammar Development Environment? Over the past several years, we’ve often asked ourselves that question.

Indeed, we’ve always seen our grammar development tools not only as an essential component of our speech practice, but also as a key competitive differentiator. This is why we’ve invested so much effort in constantly improving them based on the feedback of the most demanding grammar developers: our own!

Judging from all the requests we got over the years regarding the availability of our grammar tools, it looks like a large number of people have also asked themselves the very same question.

One obvious reason, of course, is that you can’t make much money selling grammar tools, so why bother? Another, perhaps not-so-obvious reason is that it’s really not trivial to build tools that truly and effectively support the grammar development process. For instance, graphical grammar editing tools may appear appealing at first glance, but in practice they just make grammars more cumbersome and difficult to manipulate without really addressing the most difficult challenges faced by grammar developers.

What grammar developers really need are tools that:

  • Really help accelerate the grammar authoring process – with an editor that provides all the advanced features developers should expect;
  • Can test grammar coverage and semantic interpretation correctness – to make sure grammars give the expected result (and that we don’t accidentally break them); and
  • Provide powerful grammar analysis, visualization, and debugging capabilities – to help pinpoint and fix problems in the grammar.
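As a rough idea of what a coverage test involves, here is a small Python sketch. Everything in it is hypothetical: the grammar is a toy represented as a Python dictionary rather than an ABNF file, and the matcher is a naive recursive one that would loop forever on left-recursive rules. It only shows the principle of keeping a list of sentences the grammar must accept and checking them automatically after every change.

```python
# Toy grammar: each rule maps to a list of alternatives; each alternative
# is a sequence of words or rule references (marked with a leading "$").
GRAMMAR = {
    "order": [["i", "want", "$size", "$drink"], ["$size", "$drink", "please"]],
    "size": [["small"], ["large"]],
    "drink": [["coffee"], ["tea"]],
}


def matches(grammar, symbols, words):
    """Return True if the list of words can be derived from the list of symbols."""
    if not symbols:
        return not words
    head, rest = symbols[0], symbols[1:]
    if head.startswith("$"):
        # Try every alternative of the referenced rule.
        return any(matches(grammar, alt + rest, words) for alt in grammar[head[1:]])
    return bool(words) and words[0] == head and matches(grammar, rest, words[1:])


def uncovered(grammar, root, sentences):
    """Return the sentences from the test set that the grammar does not accept."""
    return [s for s in sentences if not matches(grammar, ["$" + root], s.split())]


if __name__ == "__main__":
    test_set = ["i want small coffee", "large tea please", "coffee"]
    print(uncovered(GRAMMAR, "order", test_set))
```

Run after each edit, a check like this flags sentences that used to be covered and no longer are. A real coverage tool would also compare the semantic interpretation of each sentence against an expected value.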

The dynamic grammar challenge

This, in fact, is what our grammar development tools have provided for a long time. There was, however, one important problem: We very often have to build applications that require grammars to be dynamically generated at run time based on input data. Although there are many ways of doing this, the bottom line was that we had a very sophisticated grammar development environment that we just couldn’t use for dynamic grammars. To us, this just made no sense. What’s the point of having great tools if you can only use them for half your grammars?

The fact that grammar development/tuning and dialog implementation require very different skill sets only made this situation worse. A great Java developer is not necessarily a great grammar developer (and vice versa). But the traditional approaches to dynamic grammars typically mean that the grammar developer only ends up developing static grammars while the dynamic grammars have to be developed by whoever implements the application. Again, this makes no sense.

A complete grammar solution

Clearly, a complete grammar solution needs to effectively deal with all grammars – static and dynamic. This is why we created the NuGram Platform, whose key foundations are:

  1. The ABNF Template Language. This is essentially the ABNF format, as specified in the W3C Speech Recognition Grammar Specification, with the addition of dynamic grammar extensions, used to add dynamic content to grammars.
  2. NuGram IDE, an integrated grammar development environment that supports the development of static and dynamic grammars in a uniform and consistent way.
  3. A set of Grammar Services, used for instance to instantiate a grammar (based on a grammar template and an instantiation context), to generate the grammar in the required format (e.g., GrXML, GSL, ABNF), or to parse a text string using the grammar.
  4. NuGram Server, the dynamic grammar run-time component of the platform, designed to be easily integrated with any speech application or service creation environment.
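To illustrate what instantiating a grammar from a template and an instantiation context means, here is a plain Python stand-in. It does not use the ABNF Template Language (whose syntax is not shown in this post) or any NuGram API; the function name, the shape of the context, and the generated grammar are all assumptions made for the example. The output uses the semantics/1.0 tag format.

```python
def instantiate_contacts_grammar(contacts):
    """Build an ABNF grammar whose $contact rule lists the given contacts.

    `contacts` plays the role of the instantiation context: a list of
    dictionaries with a "name" and an "id". Names are assumed to contain
    only letters and spaces; a real implementation would need to escape
    or normalize anything else.
    """
    if not contacts:
        raise ValueError("at least one contact is required")
    # One alternative per contact, returning the contact id as the semantic value.
    alternatives = " |\n    ".join(
        '{} {{out = "{}";}}'.format(c["name"].lower(), c["id"]) for c in contacts
    )
    return (
        "#ABNF 1.0 UTF-8;\n"
        "language en-US;\n"
        "mode voice;\n"
        "root $call;\n"
        "tag-format <semantics/1.0>;\n"
        "\n"
        "public $call = call $contact {out = rules.contact;};\n"
        "$contact = " + alternatives + ";\n"
    )


if __name__ == "__main__":
    context = [
        {"name": "Alice Smith", "id": "u1"},
        {"name": "Bob Jones", "id": "u2"},
    ]
    print(instantiate_contacts_grammar(context))
```

Hand-rolled generators like this one are the traditional approach to dynamic grammars: the grammar text ends up buried in application code, out of reach of the grammar developer's tools. That is the problem a dedicated template language and a uniform development environment are meant to solve.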

Why do we believe this is a very significant step forward for speech application developers? For several reasons. For instance:

  • All grammars required by an application can now be developed, debugged, and tested using a single, consistent development environment.
  • The ability to use a single grammar authoring language regardless of the target recognition engine eliminates the need to learn about new grammar tools when switching to a new ASR engine and makes grammars much more portable.
  • The ability to develop dynamic grammars in a way that is independent from any application runtime infrastructure also makes them much more portable and reusable.

What’s the catch?

So, if this is so valuable, why do we make it available for free? Simply because we believe this will create business opportunities for Nu Echo. Frankly, we’re convinced that grammar developers who start using NuGram IDE will just not want to go back to their old tools. If that’s the case, then, at some point, if they need outside help to develop or tune their grammars, we hope they’ll think of us.

Also, while the NuGram IDE Basic Edition will remain free, we plan to offer a Professional Edition with more advanced features. While many developers will undoubtedly be quite happy with the Basic Edition, we hope that some users will want to pay for the more advanced features. And, of course, there’s a catch: There will be a runtime license associated with NuGram Server for deploying dynamic grammars. Of course, if you don’t use dynamic grammars, then this has no impact. But we think you will at some point and, when that time comes, you’ll decide that you really want to use one of our runtime solutions.

Give us feedback

You can get yourself a free copy of NuGram IDE at http://www.grammarserver.com. Over the next several months, this blog will discuss a number of topics related to grammar development and the NuGram Platform. We certainly hope you’ll give us some feedback.