Tag Archives: dynamic grammars

Testing dynamic grammars

In my post on NuGram and CouchDB, I neglected to mention how the dynamic grammar was authored and, most importantly, tested. Having a repeatable process for testing grammars is very important when developing a speech application, as most grammars change and get more complex over time.

Of course, the grammar was authored with NuGram IDE. NuGram IDE has some great features for testing grammars, especially dynamic grammars. Dynamic grammars (like the streets grammar) have always been more difficult to debug than static grammars. They can be very easy to write for small applications or prototypes (or blog posts…), but in real applications their coverage tests should be run in batch as part of an automated build process. This is often too cumbersome in practice. For instance, a dynamic grammar implemented as a JSP page requires a web application server to run, and if the JSP page queries a database, the database must be running somewhere too. This greatly complicates the setup for batch coverage tests. Moreover, writing and testing the dynamic grammar requires programming skills that speech scientists don’t always have (at least not in large organizations).

With NuGram’s template language, a dynamic grammar can be tested in NuGram IDE Basic Edition in two different ways:

  • Using predefined data encoded as a JSON object (a JSON context), or
  • Using some custom Java code (a Java context).

Both ways require the creation of an instantiation context. It’s simply a mapping between variable names and values. An instantiation context must provide a value for each and every variable used in the grammar template. The values are used to populate the template and produce the resulting (ABNF or XML) grammar. The way the instantiation context is created depends on the type of context. For a JSON context, the instantiation context is the JSON document itself. For the Java context, some Java code populates a map from strings to objects.
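
To make the idea concrete, here is a minimal Ruby sketch of an instantiation context for the streets grammar: a plain mapping from variable names to values, printed as the JSON document a JSON context would contain. The list of template variables and the `missing_variables` helper are illustrative assumptions, not part of NuGram IDE.

```ruby
require 'json'

# Variables referenced by the template (assumption: listed by hand here).
TEMPLATE_VARIABLES = %w[zipcode].freeze

# An instantiation context is a mapping from variable names to values.
CONTEXT = {
  'zipcode' => {
    'streets' => [
      { 'name' => 'First', 'type' => 'Avenue' },
      { 'name' => 'Grant', 'type' => 'Avenue' }
    ]
  }
}.freeze

# Hypothetical helper: template variables that the context does not define.
def missing_variables(variables, context)
  variables.reject { |name| context.key?(name) }
end

missing = missing_variables(TEMPLATE_VARIABLES, CONTEXT)
raise "Missing variables: #{missing.join(', ')}" unless missing.empty?

# For a JSON context, this document is the instantiation context itself.
puts JSON.pretty_generate(CONTEXT)
```

Checking for missing variables up front mirrors the rule that every variable used in the template must have a value.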

The following video shows how to create a JSON context for the street grammar:

This one shows the steps required to create and use a Java context:

Note: there was a subtle (uncovered) bug in the previous version of NuGram IDE. If you want to create Java contexts like in the video above, please make sure to download the latest version.

The whole project used in the videos is available on GitHub. The Java context initializers use the following open-source libraries:

In the next post, I will show how to use the Java context initializer to deploy the streets grammar on the Java-based version of NuGram Server.

And you, how do you test your dynamic grammars?

Bridging NuGram and CouchDB

Mark Headd, a new member of the Voxeo family, published a blog post last week on how to build speech recognition applications with Tropo. Since his post covered things like SRGS grammars and dynamic grammars, I couldn’t resist. I had to enter the fray and show how the dynamic SRGS grammar could be built using NuGram Hosted Server. And while I’m at it, let’s use CouchDB instead of an SQL database.

The Database

Mark’s example is a simple address capture dialog. It consists of asking for the zip code, and then asking for the civic number and street name/type. The grammar for the second question is built dynamically based on the entered zip code. All street names/types and their associated zip codes are stored in a SQL database and retrieved by some PHP code.

In my case, I decided to store all the data in a CouchDB database called “zipcode”. (CouchDB is a nice RESTful, HTTP-based document-oriented database, where documents are stored as plain JSON strings.) Once CouchDB is up and running (I assume here it’s running on the local host, on port 5984, but it could be on any hosting service, like CouchOne), we simply create the database and populate it using the curl command-line tool:

% curl -X PUT http://localhost:5984/zipcode
{"ok":true}
% curl -X POST http://localhost:5984/zipcode/_bulk_docs \
       -H 'Content-Type: application/json' \
       -d "`cat zipcodes.json`"

where the file zipcodes.json contains the following data:

{"docs": [
{
    "_id": "18752",
    "type": "zipcode",
    "streets" : [
       {"name":"First", "type":"Avenue"},
       {"name":"Grant", "type":"Avenue"},
       {"name":"Josiah", "type":"Parkway"},
       {"name":"Murphy", "type":"Lane"},
       {"name":"Cherry Blossom", "type":"Circle"}
    ]
},
{
    "_id": "19752",
    "type": "zipcode",
    "streets" : [
       {"name":"Milberry", "type":"Extension"},
       {"name":"Jones", "type":"Street"},
       {"name":"Martin Luther King", "type":"Boulevard"},
       {"name":"Halsey", "type":"Place"}
    ]
}
]}

Each document (whose ID is a zip code) contains an attribute streets that lists all street names/types for the given zip code. Here there are only a few streets for two zip codes.

(Of course there are other ways to model the data, but that’s the simplest I could think of.)
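
Before uploading, it can be worth checking the shape of the file, since a stray character in a key silently produces a street without a type. The Ruby sketch below is one possible way to do that; the `invalid_streets` helper is hypothetical, and the inline sample stands in for the contents of zipcodes.json.

```ruby
require 'json'

# Inline sample with the same shape as zipcodes.json (in practice, read the
# file with File.read('zipcodes.json') instead).
SAMPLE = <<~JSON
  {"docs": [
    {"_id": "19752", "type": "zipcode",
     "streets": [{"name": "Jones", "type": "Street"},
                 {"name": "Halsey", "type": "Place"}]}
  ]}
JSON

# Hypothetical helper: returns [zip code, street] pairs for every street whose
# keys are not exactly "name" and "type".
def invalid_streets(payload)
  payload.fetch('docs').flat_map do |doc|
    doc.fetch('streets')
       .reject { |street| street.keys.sort == %w[name type] }
       .map { |street| [doc.fetch('_id'), street] }
  end
end

problems = invalid_streets(JSON.parse(SAMPLE))
puts problems.empty? ? 'All streets are well-formed' : problems.inspect
```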

The grammar template

Instead of using code to create the streets grammar dynamically, we create a grammar template and push it to www.grammarserver.com (NuGram Hosted Server). The template will later be populated with data from the database and rendered in GrXML (or ABNF).

To do that, we just need to register an account (don’t worry, it’s absolutely free).

So here is the grammar template:

#ABNF 1.0 ISO-8859-1;

language en-US;
mode voice;
tag-format <semantics/1.0>;

root $streets;

public $streets =
    $civicNumber $name [$direction]
    {out = rules.civicNumber.number + "," + rules.name + "," + rules.direction}
;

$civicNumber =
    {out.number = ''} ($number {out.number += rules.number}) <1->
;

$name =
    @alt
        @for (street : zipcode.streets)
           (@word street.name @word street.type)
        @end
    @end
;

$number =
     (zero | oh) {out = "0"}
   | one   {out = "1"}
   | two   {out = "2"}
   | three {out = "3"}
   | four  {out = "4"}
   | five  {out = "5"}
   | six   {out = "6"}
   | seven {out = "7"}
   | eight {out = "8"}
   | nine  {out = "9"}
;

$direction =
     north (west {out = 'nw'} | east {out = 'ne'} )
   | south (west {out = 'sw'} | east {out = 'se'} )
;

As you can see, it’s plain ABNF, with the exception of a few simple dynamic directives in the $name rule (the @alt and @for block). It is also a bit more involved than Mark’s: it contains semantic tags to better format the recognized utterance.
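
To give a rough idea of what the dynamic part produces, the Ruby sketch below imitates the expansion of the $name rule for a given list of streets. It assumes that @alt joins the bodies generated by @for with `|` and that @word inserts the value as grammar words; this is an approximation of the directives’ behavior, not NuGram’s actual implementation, and `expand_name_rule` is a hypothetical helper.

```ruby
# Hypothetical helper that imitates the @alt/@for/@word block of the $name rule.
def expand_name_rule(streets)
  # One parenthesized alternative per street: "(<name> <type>)".
  alternatives = streets.map { |s| "(#{s['name']} #{s['type']})" }
  "$name =\n    " + alternatives.join("\n  | ") + "\n;"
end

STREETS = [
  { 'name' => 'Jones',  'type' => 'Street' },
  { 'name' => 'Halsey', 'type' => 'Place' }
].freeze

puts expand_name_rule(STREETS)
```

Under these assumptions, the two streets above yield a rule with two alternatives, `(Jones Street)` and `(Halsey Place)`.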

To publish the grammar, we use curl again:

% curl -X PUT http://www.grammarserver.com/api/grammar/streets.abnf \
       -u username:password \
       -d "`cat streets.abnf`"

We are now ready to write the application.

Connecting the dots

Now that the database is set up and the template published on NuGram Hosted Server, the only thing we need to do is create a simple app that bridges the two. For this, I decided to use Tropo’s web API, and more specifically the Ruby webapi gem (as well as the couchrest and nugramserver-api gems). The app mimics Mark’s. The lines related to CouchDB and NuGram Hosted Server are the ones that use couch_server, database, session, and grammar:

require 'rubygems'
require 'sinatra'
require 'tropo-webapi-ruby'
require 'nugramserver-ruby'
require 'couchrest'

couch_server = CouchRest.new "http://localhost:5984"
database = couch_server.database "zipcode"

post '/start.json' do
  tropo = Tropo::Generator.new do
    on :event => 'continue', :next => '/ask_street.json'
    on :event => 'hangup', :next => '/hangup.json'
    ask({ :name => 'zip_code',
          :bargein => 'true' }) do
      say     :value => "Say your 5 digit zip code"
      choices :value => "[5 DIGITS]"
    end
  end
  tropo.response
end

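post '/ask_street.json' do
  # Open a session on NuGram Hosted Server
  session = GrammarServer.new.create_session "username", "password"

  # Extract the zip code recognized in the previous step
  tropo_event = Tropo::Generator.parse request.env["rack.input"].read
  zipcode = tropo_event.result.actions.zip_code.value

  # Fetch the zip code document from CouchDB and use it to populate the template
  grammar = session.instantiate "streets.abnf",
                                :zipcode => database.get(zipcode)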

  tropo = Tropo::Generator.new do
    on :event => 'continue', :next => '/say_street.json'
    on :event => 'hangup', :next => '/hangup.json'
    ask({ :name => 'street',
          :bargein => 'true' }) do
      say :value => "What is your street address, beginning with your street number?"
      choices :value => grammar.get_url("grxml")
    end
  end
  tropo.response
end

#...

The app is not complete; some handlers (such as /say_street.json and /hangup.json) are missing. But you get the idea.

A final note

Of course, this post just covers the basics of integrating a dynamic grammar in a speech app. A real address capture application is certainly a bit more complex than that. For instance, given the large number of streets covered by a single zip code, it may not be desirable to generate grammars dynamically. They may have to be compiled in advance, with a periodic update process. Or you may want to implement some clever grammar caching strategies. Either way, you may instead consider the Java version of NuGram Server (not the hosted one).
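
As an illustration of the caching idea, here is a minimal in-memory cache keyed by zip code with a time-to-live. `GrammarCache` and its block-based `fetch` are hypothetical names; the block stands in for whatever call actually instantiates the grammar, and a production setup would more likely rely on the server’s own caching.

```ruby
# Minimal TTL cache: remembers the value computed for each zip code until it expires.
class GrammarCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @entries = {}
  end

  # Returns the cached value for zipcode, or runs the block and caches its result.
  def fetch(zipcode, now: Time.now)
    entry = @entries[zipcode]
    return entry.value if entry && entry.expires_at > now

    value = yield(zipcode)
    @entries[zipcode] = Entry.new(value, now + @ttl)
    value
  end
end

cache = GrammarCache.new(3600)
# The block is a stand-in for the real instantiation call.
puts cache.fetch('18752') { |zip| "grammar-url-for-#{zip}" }
```

Passing `now:` explicitly keeps the expiry logic easy to test without waiting for real time to pass.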

Getting started with NuGram Server Dev Edition

Today we announced the availability of the free NuGram Server Developer Edition. With NuGram Server, deploying dynamic grammars is now as simple as writing JSP or PHP pages, but designing them and debugging them becomes so much easier! Let’s see how to use NuGram Server in practice in 4 easy steps.

(The steps below assume the use of a Unix or Unix-like environment. On Windows, you can use Cygwin or MinGW. An upcoming post will show the same steps for Windows users who don’t already have such an environment installed.)

What is it, exactly?

So what exactly is NuGram Server? It’s basically a set of Java servlets offering speech recognition grammar-related services. The servlets can be used standalone or deployed as part of another Java web application.

Step 1 — Download NuGram Server

Of course, the first step is to download NuGram Server and request a free license. We will ask you for your name and an email address to which we will send the information to download the license. All you have to do then is save the license to a file (typically nugram-lic.nlb in $HOME/nuecho).

Once NuGram Server is downloaded, unzip the archive in some temporary directory:

[~] cd ~/tmp
[tmp] unzip ~/Downloads/nugram-server.zip

This should create a directory nugram-server-2.2.0-sdk:

[tmp] ls
nugram-server-2.2.0-sdk
[tmp] cd nugram-server-2.2.0-sdk
[nugram-server-2.2.0-sdk] ls
bin  conf  lib  webapp

These directories provide a skeleton NuGram Server instance. The bin directory contains scripts to start the server in standalone mode (using the Jetty application server), and the webapp/grammars directory is where the grammars go.

Step 2 — Download the sample projects

A Git repository hosted on GitHub contains sample projects to experiment with NuGram Server. It currently provides a single project, a dynamic grammar for a bill payee list. (Note that the projects can be downloaded without using Git at all. Simply go to the GitHub repository page, click on Download Source, and select Zip. You can then skip the second line below.)

On my machine, I simply do:

[~] cd ~/git
[git] git clone http://github.com/nuecho/nugram-server-samples.git
[git] cd nugram-server-samples/projects/bill-payee-list
[bill-payee-list]

Step 3 — Set up NuGram Server

The next thing to do is copy the NuGram Server main directories into the project:

[bill-payee-list] cp -R ~/tmp/nugram-server-2.2.0-sdk/* .
[bill-payee-list] ls
bin  conf  lib  README.md  src  webapp

We must now configure the license in webapp/WEB-INF/web.xml. Search for the com.nuecho.application.grammarserver.license-directory context initialization parameter and change its value to the name of the directory containing your free license (in my case /home/dboucher/nuecho):

<context-param>
 <param-name>com.nuecho.application.grammarserver.license-directory</param-name>
 <param-value>/home/dboucher/nuecho</param-value>
</context-param>

Finally, we must configure the context initializer for the dynamic grammar webapp/grammars/billpayees.abnf. (The context initializer is the piece of Java code that extracts the HTTP parameters and creates the global variables that will be available to the grammar template. More on this in an upcoming post.) We thus locate the initialization parameter com.nuecho.application.grammarserver.context-initializers for the /grammars servlet and replace it with:

<init-param>
  <param-name>com.nuecho.application.grammarserver.context-initializers</param-name>
  <param-value>
   billpayees.abnf=com.nuecho.samples.grammars.BillPayeeList
  </param-value>
</init-param>

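Conceptually, a context initializer turns HTTP parameters into the variables seen by the template. The real initializer is a Java class; the Ruby sketch below only illustrates the idea. The `payees` parameter name, its comma-separated format, and the `initialize_context` helper are all made up for this illustration and are not taken from the sample project.

```ruby
require 'uri'

# Hypothetical stand-in for a context initializer: it reads HTTP query
# parameters and returns the variables the grammar template will see.
def initialize_context(query_string)
  params = URI.decode_www_form(query_string).to_h
  # Split the (made-up) comma-separated "payees" parameter into a list.
  { 'payees' => params.fetch('payees', '').split(',').map(&:strip) }
end

puts initialize_context('payees=Electric+Co,Phone+Co').inspect
```
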
Step 4 — Test your setup

To test that everything works fine, you just need to start the server in standalone mode:

[bill-payee-list] sh bin/server.sh
2010-06-16 13:51:37.735::INFO:  Logging to STDERR via org.mortbay.log.StdErrLog
2010-06-16 13:51:37.823::WARN:  Deprecated configuration used for ...
2010-06-16 13:51:37.937::INFO:  jetty-6.1.3
2010-06-16 13:51:38.397::INFO:  NO JSP Support for /webapp, ...
[NuGram Server] ----------------------------------------------
[NuGram Server] NuGram Server v2.2.0
[NuGram Server] ----------------------------------------------
2010-06-16 13:51:39.573::INFO:  NO JSP Support for /lib, ...
2010-06-16 13:51:39.704::INFO:  NO JSP Support for /conf, ...
2010-06-16 13:51:39.826::INFO:  NO JSP Support for /bin, ...
2010-06-16 13:51:39.861::INFO:  Started SocketConnector @ 0.0.0.0:8765

You then use a program like curl or wget to instantiate the dynamic grammar template using URLs like:

Can that be simpler?

What next?

You are now ready to experiment with your own dynamic grammars. If you’ve not already done so, download NuGram IDE to get a complete development environment with which you will be able to design and test your grammars without even having to start NuGram Server. You can even test your Java context initializers directly within it.

You can also consult the NuGram dynamic grammar language reference on Slideshare, as well as the reference manual.

My upcoming posts will explain in greater detail how to develop Java context initializers, NuGram IDE’s support for them, and how to make efficient use of the caching features of NuGram Server. Stay tuned!

And please, share your experience with dynamic grammars with us!

Introducing the NuGram Server Free Developer Edition

The Nu Echo team is pleased to announce the immediate availability of NuGram Server Free Developer Edition, which will finally enable developers to download a completely free version of NuGram Server and immediately take advantage of its complete set of advanced capabilities.

For over a year, hundreds of speech application developers worldwide have taken advantage of NuGram IDE’s powerful features in order to develop better grammars faster. In particular:

  • The grammar editor’s advanced features (syntax coloring, on-the-fly validation, content-assist, sophisticated refactoring tools, etc.) greatly accelerate development and increase quality by detecting a wide range of grammar errors and problems on-the-fly.
  • Its integrated suite of analysis, testing, and debugging tools makes it easy to find problems early – and fix them.
  • Its coverage tool helps ensure grammar integrity and makes sure that no problem is ever accidentally introduced during development or maintenance.
  • The use of a single development environment regardless of the target speech engine minimizes the learning curve and enhances portability.

One of the revolutionary features of the NuGram Platform is the ability to develop dynamic grammars just as easily as static grammars, using the same powerful environment and set of tools, and to deploy them as simply as JSP pages. This means that there is no longer any need for the traditionally complex, error-prone, and difficult-to-test approaches to developing dynamic grammars. Until now, however, developers could not easily experiment with the dynamic grammar features of the NuGram Platform since, in order to do so, they were required to purchase a license of NuGram Server. With the introduction of a Free Developer Edition, this is no longer the case.

Download NuGram Server Developer Edition now!

And make sure to check our repository of sample dynamic grammars on GitHub.

The NuGram approach to dynamic grammars

I have just uploaded to Slideshare a short presentation about the Nu Echo approach to dynamic grammars.

For text-based applications too!

Remember that NuGram Server is not only for speech-enabled applications. You can use it to parse text-based sentences, too. So it is an ideal complement to your preferred cloud-based SMS or IM application platform, such as Tropo, Twilio, or Teleku, to name a few.

Try it now!

It’s free for development use, so don’t be shy. Give it a try! You simply need to register, upload your grammars, and use one of the many APIs we provide.