Introducing Hircine: a Stand-alone Index Builder for RavenDB

Here at MarkedUp, we rely on a variety of technologies to help produce meaningful analytics and reports for WinRT developers who want to know how their applications actually get used in the marketplace.

One of the technologies we use is RavenDB, a “second generation” document database built on top of C#, Lucene.NET, and Esent.

We use Raven for some of our “origin storage” – the unprocessed, raw data we receive from apps that have installed our WinRT client.

For the TL;DR crowd: if you’re tired of having RavenDB build its indexes on application startup, which is slow and painful, use Hircine to build your indexes at compile-time. Check out “Getting started with Hircine” if you need help figuring out how to use Hircine for the first time.

How RavenDB Normally Builds Indexes, and Why It Sucks

RavenDB indexes are C# classes that derive from Raven’s AbstractIndexCreationTask class.
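As a rough illustration, a minimal index class might look like the sketch below. The Visit document type and its AppId property are hypothetical names invented for this example, and the generic AbstractIndexCreationTask<T> base class is used so the map can be written as a LINQ expression.

```csharp
using System.Linq;
using Raven.Client.Indexes;

// Hypothetical document type, invented for this example.
public class Visit
{
    public string Id { get; set; }
    public string AppId { get; set; }
}

// An index is an ordinary class that derives from AbstractIndexCreationTask.
public class SimpleIndex : AbstractIndexCreationTask<Visit>
{
    public SimpleIndex()
    {
        // Index every Visit document by its AppId field.
        Map = visits => from visit in visits
                        select new { visit.AppId };
    }
}
```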

When the RavenDB client builds indexes on your remote server, it takes instances of your index classes and serializes them over HTTP into a format that the remote RavenDB server can use to define a searchable / sortable index internally.

Most people use the following RavenDB client method to build all of their indexes at once, usually on application startup (Global.asax):

Raven.Client.Indexes.IndexCreation.CreateIndexes(typeof(SimpleIndex).Assembly, documentStore);

In this instance, SimpleIndex is a type derived from AbstractIndexCreationTask and documentStore is an initialized IDocumentStore instance connected to a live RavenDB database.

When this method is called, Raven iterates over all of the classes defined in SimpleIndex’s assembly and creates an instance of every type that derives from AbstractIndexCreationTask. Once it has that full list of indexes, it synchronously builds them against the IDocumentStore instance, which can be really slow and can even fail sometimes.
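To make that behavior concrete, here is a minimal sketch of the same discover-then-build loop. It is not the RavenDB client's actual source, and the real method may discover types differently. StartupIndexing and BuildAllIndexes are hypothetical names, and the sketch assumes ExecuteIndex is available on the document store.

```csharp
using System;
using System.Linq;
using System.Reflection;
using Raven.Client;
using Raven.Client.Indexes;

// Hypothetical helper, for illustration only.
public static class StartupIndexing
{
    public static void BuildAllIndexes(Assembly assembly, IDocumentStore documentStore)
    {
        // Find every concrete index class with a parameterless constructor.
        var indexTypes = assembly.GetTypes()
            .Where(t => typeof(AbstractIndexCreationTask).IsAssignableFrom(t))
            .Where(t => t.IsClass && !t.IsAbstract
                        && t.GetConstructor(Type.EmptyTypes) != null);

        foreach (var type in indexTypes)
        {
            var index = (AbstractIndexCreationTask)Activator.CreateInstance(type);

            // Synchronous call: the loop does not move on to the next index
            // until this definition has been sent to the server.
            documentStore.ExecuteIndex(index);
        }
    }
}
```

Every index costs at least one blocking round trip, so the total time grows with the number of indexes and the number of servers.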

Imagine doing this every time your application starts up, particularly if you have a large number of indexes or have to connect to multiple RavenDB servers. Global.asax is not a great place to build your indexes.

How RavenDB Indexes Are Built by Hircine

Hircine does things a little differently, but let’s talk about what it has in common with the RavenDB Client approach:

  • Hircine also looks for index definitions contained inside user-defined .NET / C# assemblies.
  • Hircine uses the same method (ExecuteIndex) as the RavenDB client for building each individual index, so the instructions being sent to the RavenDB server are consistent with what the RavenDB Client does.

So what does Hircine do differently?

  • Hircine runs in its own stand-alone executable (hircine.exe), rather than inside your application.
  • Hircine builds all of its indexes in parallel by default, using multiple HTTP requests and threads to get the job done faster.
  • Hircine can build indexes found in multiple user defined assemblies at the same time, rather than relying on successive calls to IndexCreation.CreateIndexes.
  • Hircine can build indexes against multiple RavenDB servers in parallel, rather than doing them one at a time like the RavenDB client.
  • Hircine works asynchronously against both remote databases and embedded ones for rapid testing.
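The parallel approach described above can be sketched roughly as follows. This is not Hircine's actual implementation, only an illustration of the idea using the Task Parallel Library. ParallelIndexing and BuildInParallel are hypothetical names, and the sketch assumes ExecuteIndex is available on the document store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Raven.Client;
using Raven.Client.Indexes;

// Hypothetical helper, for illustration only.
public static class ParallelIndexing
{
    public static void BuildInParallel(
        IEnumerable<IDocumentStore> stores,
        IEnumerable<Type> indexTypes)
    {
        var types = indexTypes.ToList();
        var jobs = new List<Task>();

        foreach (var store in stores)
        {
            foreach (var type in types)
            {
                // Local copies so each task captures its own store and type.
                var targetStore = store;
                var indexType = type;

                // One task per (server, index) pair.
                jobs.Add(Task.Factory.StartNew(() =>
                {
                    var index = (AbstractIndexCreationTask)Activator.CreateInstance(indexType);
                    targetStore.ExecuteIndex(index);
                }));
            }
        }

        // Wait for every build; failures surface together as an AggregateException.
        Task.WaitAll(jobs.ToArray());
    }
}
```

Because the index types are passed in as a plain sequence, they can be gathered from any number of assemblies before the builds start.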

From this set of behaviors, you start to appreciate Hircine’s goals:

  1. To fully decouple RavenDB index-building from application startup;
  2. To make it trivial and painless to build indexes from multiple assemblies against multiple servers;
  3. To make the process of index building really, really fast; and
  4. To provide a simple interface that developers can integrate into their own build or continuous integration processes.

Hircine in Action

Want to see what Hircine looks like in action? Well here’s a peek, which you can see by running the rake command inside the root directory of the Hircine repository on Github.

[Screenshot: console output from running rake in the Hircine repository]

You can learn more about Hircine’s command-line options from the getting started page.

Installing via Nuget

Hircine ships in two different Nuget packages:

Nuget – Hircine – stand-alone executable that runs directly from the command-line.

Nuget – Hircine.Core – core Hircine engine in a callable assembly, for people who want to integrate Hircine into their own projects.

Contributing to Hircine

Hircine is licensed under Apache 2.0 and the source can be found in Hircine’s Github repository.

MarkedUp happily accepts pull requests on its open source projects, so long as any significant changes are accompanied by an acceptable level of test coverage.
