Easy Mode: Synchronizing Multiple Processes with File Locks

Thanks to the introduction of new language features and frameworks such as C#’s async keyword, more and more developers are learning how to use parallelism and multi-processing in production contexts. One of the consequences of this, however, is having to learn how to use synchronization mechanisms to serialize access to shared memory and other resources.

In mobile app development the entire conversation around synchronization is focused on synchronizing access to resources within the context of a single running application – because there are no major mobile app platforms that allow you to run multiple parallel instances of the same application.

In desktop app development, however, it’s totally common for end-users to be able to launch multiple instances of your application and therefore it becomes more important to understand how to synchronize access to shared resources across multiple distinct processes.

There are two common use cases for process synchronization:

  1. The “single application instance” pattern – this means using a synchronization mechanism to prevent the user from launching more than one instance of your application. If you’re writing a Windows tray client that runs in the background and does file synchronization, remote backup, or anything else like that then you need to make sure that the user can’t accidentally kick off multiple instances of your backup software in parallel.
  2. The “shared access to resource” pattern – suppose you have multiple processes that can all write some critical data to the user’s registry on Windows or perhaps an important settings file; you have to pick some method for synchronizing reads and writes to this data if your application can have multiple instances running in parallel.

In the case of MarkedUp, we need interprocess synchronization between our customers’ apps for the following two cases:

  1. Ensure that only one instance of a customer’s MarkedUp-enabled desktop application can empty its “retry” queue – this is specific to a single customer’s application; if an end-user’s machine goes offline or if there’s an issue with our API, the MarkedUp SDK will save all of its undeliverable messages into a retry queue on-disk and retry sending them the next time the application runs OR if Internet connectivity is restored. If there are multiple processes running, we need to make sure that only one process can do this at a time.
  2. Ensure that only one In-app Marketing message can be displayed to a customer at a time, across multiple MarkedUp-enabled applications from different software publishers – this is trickier, because we have to synchronize the In-app Marketing TrayClient across all MarkedUp-enabled applications running on a single user’s desktop.

Given these requirements, we decided that the safest approach to inter-process locking was to use an old-fashioned file lock!

How File Locks Work

The concept of a file lock is simple and it’s fairly ubiquitous in the real-world – anyone who’s done a lot of work with Ruby (Gemfile.lock) or Linux has used a file lock at some point in time.

A file lock is implemented by having multiple processes look for a file with the same name at a well-known location on the file system – if the file exists, then each process knows that the lock is held by someone, or was held at some point and never released. The lock is acquired when the file is written to disk and the lock is released when it is deleted.

 

Typically the lock file will contain some data about the ID of the process that last held the lock and the time when the process acquired it – that way processes can have “re-entrant locking” and lock timeouts, in the event that a process was forcibly killed before it had a chance to explicitly release the lock.

Here’s an illustrated example:

file-lock example

Process 1 acquires the lock by writing markedup.lock to the file system – and then it does its work against the shared resource. Process 2 attempts to acquire the lock, but since the file already exists, it has to wait until Process 1 releases the lock.

file-lock example step 2

Process 1 releases the lock by deleting the markedup.lock file. Process 2 attempts to acquire the lock later and discovers that the markedup.lock file is missing, so it successfully acquires the lock by recreating the markedup.lock file and does its work against the shared resource.
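The acquire, release, and timeout steps above can be sketched in a few lines. This is a minimal illustration in Python, not the API of our .NET file-lock library: the FileLock class, its method names, and the JSON layout of the lock file are all assumptions made for the example, and so is the use of an exclusive create (O_EXCL) so that two processes cannot both believe they wrote the file first.

```python
import json
import os
import time


class FileLock:
    """Hypothetical lock-file helper, for illustration only."""

    def __init__(self, path, timeout_seconds=30.0):
        self.path = path
        self.timeout_seconds = timeout_seconds

    def _read_holder(self):
        # Returns the recorded holder, or None if the file is missing,
        # half-written, or not in the expected format.
        try:
            with open(self.path, "r", encoding="utf-8") as f:
                data = json.load(f)
        except (OSError, ValueError):
            return None
        return data if isinstance(data, dict) else None

    def try_acquire(self):
        holder = self._read_holder()
        if holder is not None:
            if holder.get("pid") == os.getpid():
                return True  # re-entrant: this process already holds the lock
            age = time.time() - holder.get("acquired_at", 0)
            if age > self.timeout_seconds:
                # The holder probably died without releasing; discard the stale
                # lock. Two waiters can race on this step, so a production
                # version needs more care here.
                try:
                    os.remove(self.path)
                except OSError:
                    pass
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False  # someone else holds the lock
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"pid": os.getpid(), "acquired_at": time.time()}, f)
        return True

    def release(self):
        # Only the recorded holder may delete the lock file.
        holder = self._read_holder()
        if holder is not None and holder.get("pid") == os.getpid():
            os.remove(self.path)
```

A caller would loop on try_acquire() with a short sleep until it returns True, do its work against the shared resource, and call release() in a finally block.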

File Locks in Practice (in .NET)

We picked a file lock over an OS-level mutex lock primarily because we couldn’t guarantee that the same thread that acquired a lock would be the same one that released it (required on Windows) – plus we wanted to have our own recovery mechanism to time out locks.

We weren’t able to find a C# implementation of a file lock, so we wrote and open sourced a simple .NET 3.5+ file lock implementation on GitHub – it’s licensed under Apache V2.0 so it should be safe for you to use in commercial software.

Here’s a quick demo of it in action, synchronizing 50 .NET processes serially.

Want to use this inside your own application? Make sure you check out the source for file-lock on GitHub!

Using Cassandra for Real-time Analytics: Part 2

In part 1 of this series, we talked about the speed-consistency-volume trade-offs that come along with implementation choices you make in data analytics and why Cassandra is a great choice for real-time analytics. In this post, we’re going to dive a little deeper into the basics of the Cassandra data model and illustrate it with the help of MarkedUp’s own model, followed by a short discussion about our read and write strategies.

Once again, let’s start from our LA Cassandra User Group meetup presentation deck on SlideShare. Slides 12-18 are relevant for this post.

Cassandra Data Model

The Cassandra data model consists of a keyspace (analogous to a database), column families (analogous to tables in the relational model), keys and columns. Here’s what the basic Cassandra table (also known as a column family) structure looks like:

Cassandra Column Family structure

  Figure 1. Structure of a column family in Cassandra

Cassandra’s API also refers to this structure as a map within a map, where the outer map key is the row key and the inner map key is the column key. In reality, a typical Cassandra keyspace for, say, an analytics platform, might also contain what’s known as a super column family.
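To make the “map within a map” idea concrete, here is a small Python sketch that models a column family and a super column family as nested dictionaries. The DailyAppLogs and AppX names come from this post; the counter values, the AppLogsByType name, and the get_column helper are made up for illustration, and a plain dict does not reproduce Cassandra’s sorted column storage.

```python
# Column family: row key -> column key -> value.
daily_app_logs = {
    "AppX": {"2013-01-01": 3, "2013-01-02": 7},
}

# Super column family: row key -> super column key -> column key -> value.
# The name and contents are hypothetical.
app_logs_by_type = {
    "AppX": {
        "RuntimeException": {"2013-01-01": 2},
        "Crash": {"2013-01-01": 1},
    },
}


def get_column(column_family, row_key, column_key):
    """Look up one value: the outer map is keyed by row, the inner by column."""
    return column_family[row_key][column_key]
```

Reading a value is two lookups: get_column(daily_app_logs, "AppX", "2013-01-02") first finds the row for AppX and then the column for that date. A super column family simply adds one more level of nesting.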

 

Cassandra Super column family structure

Figure 2. Structure of a super column family in Cassandra

 

Evan Weaver’s blog post has a good illustration of the Twitter keyspace as a real-world example.

MarkedUp’s keyspace has column families such as DailyAppLogs (that count the number of times a particular log or event triggered per app) and Logs (that capture information about each log entry). These are also illustrated in figure 1 below.

The DataStax post about modeling a time series with Cassandra is particularly helpful in deciding upon the schema design. We index our columns on the basis of dates.

Note that since we use the RandomPartitioner, where rows are ordered by the MD5 of their keys, using dates as column keys helps in storing data points in a sorted manner within each row. Other analytics applications might prefer indexing by hours or even minutes, if, for example, the exact time of day when the app activity peaks needs to be measured and reported. The only drawback would be more data points and more columns in each row. With a limit of about 2 billion columns per row in Cassandra, though, it’s almost impossible to exceed the limit. Thus, the fact that Cassandra offers really wide rows leaves us with enough legroom.

The row key in a Cassandra column family is also the “shard” key, which implies that columns for a particular row key are always stored contiguously and on the same node. If you are worried that some of your shards will keep growing at a faster rate than others, resulting in “hotspot” nodes that store those shards, you can further shard your rows by means of composite keys. For example, (App1, 1) and (App1, 2) can be two shards for App1.

The counters for all events of a particular type coming from apps using MarkedUp are recorded in the same shard. (“What about hotspots then?”, you might wonder! Well, Cassandra offers semi-automatic load balancing, so we load balance if a node starts becoming a hotspot. Refer to the Cassandra wiki for more on load balancing.)
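As a rough sketch of the composite-key idea, the Python below spreads writes for one app across a fixed number of shards. The shard_row_key and all_shards helpers, and the choice of a CRC32 hash of an event identifier to pick the shard, are assumptions for the example rather than anything Cassandra or MarkedUp prescribes.

```python
import zlib


def shard_row_key(app_id, event_id, shard_count):
    """Return a composite row key such as ("App1", 2) for one write.

    The shard number is derived from a stable hash of the (hypothetical)
    event id, so the same event always lands in the same shard.
    """
    shard = zlib.crc32(event_id.encode("utf-8")) % shard_count + 1
    return (app_id, shard)


def all_shards(app_id, shard_count):
    """List every composite row key a read for this app has to visit."""
    return [(app_id, n) for n in range(1, shard_count + 1)]
```

The trade-off is on the read side: a query for App1 now has to fetch every key returned by all_shards("App1", shard_count) and merge the results.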

MarkedUp’s Read/Write Strategy

Now that we have a better understanding of the Cassandra data model, let’s look at how we handle writes in MarkedUp. Logs from the Windows 8 apps that use MarkedUp arrive randomly on a daily basis. For incoming logs, we leverage the batch_mutate method.

As you might have guessed, a batch_mutate operation groups calls on several keys into a single call. Each incoming log, therefore, triggers updates or inserts in multiple column families, as shown in figure 3. For example, a RuntimeException in AppX on Jan 1, 2013 will update the DailyAppLogs CF with key AppX by incrementing the counter stored in the column key corresponding to Jan 1, 2013, as well as the Logs CF by inserting a new key LogId.

MarkedUp write strategy

Figure 3. MarkedUp’s write strategy
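The grouping that a batch write performs can be sketched against an in-memory stand-in for the keyspace. This Python example is not Cassandra client code: build_mutations and apply_batch are hypothetical helpers, the mutation dictionaries only loosely imitate the row key to column family to mutation list shape of a batch, and the "level" and "app" columns in Logs are invented for the illustration.

```python
from collections import defaultdict


def build_mutations(app_id, log_id, level, date):
    """Collect every change triggered by one incoming log into one batch.

    Shape: {row_key: {column_family: [mutation, ...]}}
    """
    return {
        app_id: {
            "DailyAppLogs": [{"op": "increment", "column": date, "by": 1}],
        },
        log_id: {
            "Logs": [
                {"op": "insert", "column": "level", "value": level},
                {"op": "insert", "column": "app", "value": app_id},
            ],
        },
    }


def apply_batch(store, mutations):
    """Apply a whole batch to the in-memory store in a single call."""
    for row_key, by_column_family in mutations.items():
        for column_family, ops in by_column_family.items():
            row = store[column_family].setdefault(row_key, {})
            for m in ops:
                if m["op"] == "increment":
                    row[m["column"]] = row.get(m["column"], 0) + m["by"]
                else:
                    row[m["column"]] = m["value"]


# store maps column family name -> row key -> columns
store = defaultdict(dict)
apply_batch(store, build_mutations("AppX", "log-1", "RuntimeException", "2013-01-01"))
```

After that call, the DailyAppLogs row for AppX holds a counter of 1 under the 2013-01-01 column, and Logs has a new row keyed by log-1.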

 

MarkedUp’s read strategy leverages Cassandra’s get_slice query, which allows you to read a wide range of data focused on the intended query, reducing waste (a “slice” indicates a range of columns within a row). A query to count a wide range of columns can be performed with minimal disk I/O. Setting up a get_slice query is as simple as specifying which keyspace and column family you want to use and then setting up the slice predicate by defining which columns within the row you need.

The slice predicate itself can be set up in two ways. You can either specify exactly which columns you need, or you can specify a range of “contiguous” columns using a slice range. Using column keys that can be sorted meaningfully is thus critical.

Figure 4 below illustrates the query “Get all Crash and Error logs for App1 between Date1 and DateN”. A get_slice query with a slice range can easily read the counters as a complete block from the AppLogsByLevel CF because the columns within each row are sorted by date.

MarkedUp read strategy

Figure 4. MarkedUp’s read strategy
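The two kinds of slice predicate can be illustrated on a single in-memory row. The Python below is a sketch of the idea only, not the Cassandra API: slice_by_names and slice_by_range are hypothetical helpers, the row contents are invented, and the range is treated as inclusive at both ends. It relies on ISO-formatted date strings sorting in chronological order.

```python
from bisect import bisect_left, bisect_right


def slice_by_names(row, column_names):
    """Predicate style 1: ask for an explicit list of columns."""
    return {name: row[name] for name in column_names if name in row}


def slice_by_range(row, start, finish):
    """Predicate style 2: ask for a contiguous range of sorted columns."""
    keys = sorted(row)  # Cassandra keeps columns sorted; a dict does not
    lo = bisect_left(keys, start)
    hi = bisect_right(keys, finish)
    return {key: row[key] for key in keys[lo:hi]}


# One row of daily counters for a hypothetical App1.
row = {"2013-01-03": 4, "2013-01-01": 2, "2013-01-02": 5, "2013-01-05": 1}
crashes = slice_by_range(row, "2013-01-02", "2013-01-04")
total = sum(crashes.values())
```

Because the columns are ordered by date, the range lookup touches one contiguous block of the row, which is why sortable column keys matter so much for this kind of query.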

 

If you’ve read our previous blog post closely, you might be wondering if the returned information is even correct, given the fact that Cassandra compromises on consistency in favor of speed and volume (remember the SCV triangle?). Cassandra guarantees what is known as eventual consistency, which means that at some given point (milliseconds away from the triggering of the write operation), some nodes may still have the stale value, although by the end of the operation, every node will have been updated.

Luckily, Cassandra offers tunable consistency levels for queries. So, depending on your appetite for consistent output vis-a-vis speed, you can configure the desired consistency level by choosing different levels of “quorum”. MarkedUp uses ONE for writes and TWO for reads, to keep the web front-end as fluid as possible.

In part 3 of this series, we’ll talk about some best practices of working with Cassandra and choosing a schema that fits your needs. Stay tuned for more!
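A common rule of thumb for reasoning about these levels is that a read is guaranteed to overlap the latest write only when the number of replicas written plus the number of replicas read exceeds the replication factor. The Python sketch below encodes that rule; the helper names are hypothetical, and the replication factor of 3 used in the example is an assumption, since this post does not state the one MarkedUp uses.

```python
def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: " + level)


def read_overlaps_latest_write(write_level, read_level, replication_factor):
    """True when every read must touch at least one replica that has the write."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return w + r > replication_factor


# Assumed replication factor of 3, for illustration only.
one_two = read_overlaps_latest_write("ONE", "TWO", 3)
quorum_quorum = read_overlaps_latest_write("QUORUM", "QUORUM", 3)
```

Under that assumed replication factor, ONE for writes plus TWO for reads does not guarantee overlap (1 + 2 is not greater than 3), so reads can briefly return stale counters; QUORUM for both would guarantee overlap at the cost of slower operations.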

MetroAppSite: Free, Open Source Metro-Style Website Templates for Your Windows Store Apps

Getting customers to notice and discover your Windows Store apps is hard, but you can reach users who aren’t inside the Windows Store using simple websites designed to promote your apps.

In addition, if your Windows Store app requires access to the Internet, you are required by Windows Store policy to publish and link to a privacy policy hosted online (section 4.1.1).

We decided to make life a little easier for Windows Store developers and built MetroAppSite – a fully responsive Metro-style website that uses Twitter Bootstrap and other standard frameworks to help developers promote their Windows Store apps.

And like most of our customers, we’re a .NET shop, so we built an ASP.NET MVC4 version of MetroAppSite too!

Features

Here are some of the great features that you get with MetroAppSite:

Metro theming and branding

Give your promotional website the same Metro look-and-feel that your users experience when they download your app from the Windows Store.

We even include a Microsoft Surface screenshot carousel for you to use to show off your Windows Store app’s look-and-feel.

metro-branding-metroappsite

MetroAppSite uses BootMetro and Twitter Bootstrap to give Windows Store developers an easy-to-modify, brandable template they can use to their own ends.

Fully responsive and touch/mobile-friendly

MetroAppSite’s CSS and design is fully responsive and touch-optimized out of the box. It looks great in full-sized web browsers and on mobile devices too!

metroappsite-mobile

Integrates seamlessly with third party services like Google Analytics and UserVoice

Unfortunately there isn’t a MarkedUp Analytics for websites yet, but in the meantime we made it dead-simple to integrate MetroAppSite with Google Analytics so you can measure your pageviews and visitors.

uservoice-logo

Additionally, we added hooks to integrate UserVoice directly into your app’s site so you can collect feedback and support tickets from users easily and seamlessly. UserVoice is what we use for our customer support at MarkedUp and we’ve had a great experience with it!

Templated privacy policy in order to make it easy for you to satisfy Windows Store certification requirements

Writing privacy policies can be a pain, so we made it easy for you to generate a privacy policy for your app using PrivacyChoice.org. You can paste these right into MetroAppSite and meet Windows Store certification requirements easily and thoroughly.

Demo Sites

We created some simple MetroAppSite deployments so you can see what they look like in production:

Download

MetroAppSite is licensed under the Apache 2.0 license and is free for you to use in commercial or non-commercial projects.

Contribution

We happily accept pull requests via GitHub.

Introducing Hircine: a Stand-alone Index Builder for RavenDB

RavenDB

Here at MarkedUp, we rely on a variety of technologies to help produce meaningful analytics and reports for WinRT developers who want to know how their applications actually get used in the marketplace.

One of the technologies we use is RavenDB, a “second generation” document database built on top of C#, Lucene.NET, and Esent.

We use Raven for some of our “origin storage” – the unprocessed, raw data we receive from apps that have installed our WinRT client.

For the TL;DR crowd: if you’re tired of having RavenDB build its indexes on application startup, which is slow and painful, use Hircine to build your indexes at compile-time. Check out “Getting started with Hircine” if you need help figuring out how to use Hircine for the first time.

How RavenDB Normally Builds Indexes, and Why It Sucks

RavenDB indexes are C# classes which derive from Raven’s AbstractIndexCreationTask object.

When the RavenDB client builds indexes on your remote server, what it’s really doing is taking instances of your index classes and serializing them over HTTP into a format that the remote RavenDB server can use to define a searchable / sortable index internally.

Most people typically use this method of the RavenDB client to build all of their indexes at once, and usually this happens on application startup (Global.asax):

Raven.Client.Indexes.IndexCreation.CreateIndexes(typeof(SimpleIndex).Assembly, documentStore);

In this instance SimpleIndex is a type derived from AbstractIndexCreationTask and documentStore is an initialized IDocumentStore instance connecting to a live RavenDB database.

When this method is called, Raven iterates over all of the classes defined in SimpleIndex’s assembly and creates instances of every type that derives from AbstractIndexCreationTask – once it has that full list of indexes, it synchronously builds those indexes against the IDocumentStore instance, which can be really slow and can even fail sometimes.

Imagine doing this every time your application starts up, particularly if you have a large number of indexes or have to connect to multiple RavenDB servers. Global.asax is not a great place to build your indexes.

How RavenDB Indexes Are Built by Hircine

Hircine does things a little differently, but let’s talk about what it has in common with the RavenDB Client approach:

  • Hircine also looks for index definitions contained inside user-defined .NET / C# assemblies.
  • Hircine uses the same method (ExecuteIndex) as the RavenDB client for building each individual index, so the instructions being sent to the RavenDB server are consistent with what the RavenDB Client does.

So what does Hircine do differently?

  • Hircine runs in its own stand-alone executable (hircine.exe), rather than inside your application.
  • Hircine builds all of its indexes in parallel by default, using multiple HTTP requests and threads to get the job done faster.
  • Hircine can build indexes found in multiple user defined assemblies at the same time, rather than relying on successive calls to IndexCreation.CreateIndexes.
  • Hircine can build indexes against multiple RavenDB servers in parallel, rather than doing them one at a time like the RavenDB client.
  • Hircine works asynchronously against both remote databases and embedded ones for rapid testing.
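The parallel fan-out described in this list can be sketched generically. The Python below is not Hircine’s code and does not talk to RavenDB: build_all is a hypothetical helper, and build_index stands in for whatever call actually sends one index definition to one server. It only shows the shape of the idea, which is to submit every (server, index) pair to a thread pool and collect failures instead of stopping at the first one.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def build_all(index_names, servers, build_index, max_workers=8):
    """Build every index against every server in parallel.

    build_index(server, index_name) is a caller-supplied, hypothetical
    callable. Returns {(server, index_name): None or the raised exception}.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(build_index, server, name): (server, name)
            for server in servers
            for name in index_names
        }
        for future in as_completed(futures):
            job = futures[future]
            try:
                future.result()
                results[job] = None  # this index built successfully
            except Exception as error:
                results[job] = error  # record the failure and keep going
    return results
```

Returning a per-job report rather than raising on the first failure suits a build step: a build or continuous integration script can print every failed index and then decide whether to fail the build.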

From this set of behaviors, you start to appreciate Hircine’s goals:

  1. To fully decouple RavenDB index-building from application startup;
  2. To make it trivial and painless to build indexes from multiple assemblies against multiple servers;
  3. To make the process of index building really, really fast; and
  4. To provide a simple interface that developers can integrate into their own build or continuous integration processes.

Hircine in Action

Want to see what Hircine looks like in action? Well, here’s a peek, which you can see by running the rake command inside the root directory of the Hircine repository on GitHub.

image

You can learn more about Hircine’s command-line options from the getting started page.

Installing via NuGet

Hircine ships in two different NuGet packages:

NuGet – Hircine – stand-alone executable that runs directly from the command-line.

NuGet – Hircine.Core – core Hircine engine in a callable assembly, for people who want to integrate Hircine into their own projects.

Contributing to Hircine

Hircine is licensed under Apache 2.0 and the source can be found in Hircine’s GitHub repository.

MarkedUp happily accepts pull requests on its open source projects, so long as any significant changes are accompanied by an acceptable level of test coverage.