Azure DocumentDB and FSharp query style examples

Some of the core services at OnCall rely on Azure DocumentDB. While refactoring those services, I started looking into different ways of accessing DocumentDB from an application written in F#.

There are several approaches when working with DocumentDB, and while they differ mostly in appearance, I found at least one pitfall to avoid.

Example code

While working on this post I decided to wrap the code up as a small example of using DocumentDB with F#. The code provides you with a starting point for the following set of operations.

  • Getting or creating a database
  • Getting or creating a collection
  • Adding a document to DocumentDB (set)
  • Getting a document from DocumentDB (get, in four variations)
  • Deleting a document from DocumentDB (delete, in two variations)

The code is not intended to be copied and pasted directly into your code base, but rather to serve as a starting point if you want to take DocumentDB for a spin using F#.

I needed a theme for the code and thought an app for managing a music collection would be suitable. Since nobody listens to CDs anymore, the demo is an app for managing your cool retro vinyl records. We will call it "LpLibrary".

The code is available on GitHub at https://github.com/clausasbjorn/LpLibrary. I am not posting the code here in its entirety, so head over to GitHub if you want the complete example code.

Executing the code will

  • Add three records to your DocumentDB, creating a database and a collection in the process
  • Query for one of the records using four different query styles
  • Delete the three records

Take a look at the main function for an idea of what is going on.

[<EntryPoint>]
let main argv = 

    // Create three records (of the Lp record type)
    let beachy = { id = "beachy"; Name = "Pet Sounds"; Artist = "The Beach Boys"; LentTo = "Greg" }
    let groovy = { id = "groovy"; Name = "What's Going On"; Artist = "Marvin Gaye"; LentTo = "Julia" }
    let punky = { id = "punky"; Name = "London Calling"; Artist = "The Clash"; LentTo = null }

    // Let's add some records using the set-function
    Async.RunSynchronously <| async {
        do! set beachy
        do! set groovy
        do! set punky
    }

    // And then query for one of them using the different get-functions
    Async.RunSynchronously <| async {
        // "Pet Sounds" is the record stored with the id "beachy"
        let! beachyFromDb1 = getChainedLinq "Pet Sounds"
        print beachyFromDb1

        let! beachyFromDb2 = getEmbeddedQuery "Pet Sounds"
        print beachyFromDb2

        let! beachyFromDb3 = getPipedSequence "Pet Sounds"
        print beachyFromDb3

        let! beachyFromDb4 = getQueryExpression "Pet Sounds"
        print beachyFromDb4
    }

    // Before deleting them all again using the different delete-functions
    Async.RunSynchronously <| async {
        do! deleteQueryExpression "beachy"
        do! deleteQueryExpression "groovy"
        do! deleteEmbeddedQuery "London Calling"
    }

    // An [<EntryPoint>] function must return an integer exit code
    0

We will only spend our time on the get-functions and the delete-functions. Let's start by looking at a few different approaches for querying DocumentDB for documents.

Querying DocumentDB

The first query I wrote for DocumentDB was a chained LINQ query. That works fine, but it can become a bit messy, and it doesn't seem very in tune with F#. When I started rewriting the queries I experimented with a couple of different approaches.

All of the approaches below are structured similarly. Only the actual query differs between examples.

  1. Get a collection
  2. Query and convert result to list
  3. Use pattern matching to return an optional result
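Since the third step is identical in every example below, it could be factored out into a small helper. The following is only a sketch: `tryFirst` is a hypothetical name that is not part of the LpLibrary code, and the lists stand in for real query results.

```fsharp
// Hypothetical helper (not part of the LpLibrary repository).
// Materializes the results and returns the first item, if there is one.
let tryFirst (results: seq<'T>) : 'T option =
    match Seq.toList results with
    | [] -> None
    | first :: _ -> Some first

// In-memory stand-ins for the result of a DocumentDB query
let found = tryFirst [ "Pet Sounds"; "London Calling" ]
let missing : string option = tryFirst []

printfn "found: %A, missing: %A" found missing
```

On F# 4.0 or later, `Seq.tryHead` gives the same result without building the intermediate list.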

Chained LINQ

let getChainedLinq name = async {
    printfn "Get chained LINQ %s" name
    let! collection = getCollection "LpLibrary" "Lps"
    let query = 
        Seq.toList
        <| client.CreateDocumentQuery<Lp>(collection.DocumentsLink).Where(fun lp -> lp.Name = name)

    match query with
    | [] -> return None
    | document :: _ -> return Some(document)
}

This approach works fine, but the LINQ-chain becomes unwieldy fast if the query gets complex.

Embedded queries

let getEmbeddedQuery name = async {
    printfn "Get embedded query %s" name
    let! collection = getCollection "LpLibrary" "Lps"
    let parameters = new SqlParameterCollection([| new SqlParameter("@Name", name) |])
    let spec = new SqlQuerySpec("SELECT * FROM Lps lps WHERE lps.Name = @Name", parameters)
    let query = Seq.toList <| client.CreateDocumentQuery<Lp>(collection.DocumentsLink, spec)

    match query with
    | [] -> return None
    | document :: _ -> return Some(document)
}

This seems to be a common approach for writing DocumentDB queries. I don't really like it. It's verbose and hard to read, and it relies on hardcoded SQL and weakly typed parameters. I try to avoid this approach if I can.

Piped sequence operations

let getPipedSequence name = async {
    printfn "Get piped sequence %s" name
    let! collection = getCollection "LpLibrary" "Lps"
    let query = 
        client.CreateDocumentQuery<Lp>(collection.DocumentsLink)
        |> Seq.filter (fun lp -> lp.Name = name)
        |> Seq.toList 

    match query with
    | [] -> return None
    | document :: _ -> return Some(document)
}

Of all the approaches, this is the one I find most appealing. Forward piping and sequence operations look good and are easy to read. The problem, however, is that Seq.filter treats the query as a plain sequence, so the predicate is never translated into a DocumentDB query. The documents are fetched from the database first and filtered afterwards, which makes this approach unusable for document querying.

Running Fiddler in the background using this approach showed the entire collection being returned with the filtering happening locally afterwards.
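The difference can be reproduced without DocumentDB. The sketch below uses an in-memory IQueryable as a stand-in for client.CreateDocumentQuery; the field types of the Lp record are assumed to be strings, and the names `source`, `piped` and `composed` are made up for the illustration. It shows that piping into Seq.filter leaves you with a plain sequence, while a query expression keeps the predicate as an expression tree that a provider could translate.

```fsharp
open System.Linq

// Field names are taken from the post, the string types are an assumption
type Lp = { id: string; Name: string; Artist: string; LentTo: string }

let records =
    [ { id = "beachy"; Name = "Pet Sounds"; Artist = "The Beach Boys"; LentTo = "Greg" }
      { id = "punky"; Name = "London Calling"; Artist = "The Clash"; LentTo = null } ]

// In-memory IQueryable standing in for client.CreateDocumentQuery<Lp>(...)
let source : IQueryable<Lp> = Queryable.AsQueryable<Lp>(records)

// Seq.filter takes an ordinary F# function, so the source is enumerated
// in full and the predicate runs locally.
let piped = source |> Seq.filter (fun lp -> lp.Name = "Pet Sounds")

// The query expression stays an IQueryable, with the predicate kept
// as part of its expression tree.
let composed =
    query {
        for lp in source do
        where (lp.Name = "Pet Sounds")
        select lp
    }

let pipedIsQueryable = (box piped) :? IQueryable<Lp>
let composedIsQueryable = (box composed) :? IQueryable<Lp>

printfn "piped is IQueryable: %b" pipedIsQueryable
printfn "composed is IQueryable: %b" composedIsQueryable
printfn "composed expression: %O" composed.Expression
```

Both versions return the same record here. The difference is only in where the filtering happens, which is exactly why the problem is easy to miss.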

Query expressions

let getQueryExpression name = async {
    printfn "Get query expression %s" name
    let! collection = getCollection "LpLibrary" "Lps"
    let query = Seq.toList <| query {
        for d in client.CreateDocumentQuery<Lp>(collection.DocumentsLink) do 
        where (d.Name = name)
        select d
    }

    match query with
    | [] -> return None
    | document :: _ -> return Some(document)
}

Using F# query expressions results in readable, strongly typed code. After experimenting for a bit, this is the approach I like the best. It's strange, because I don't really like this style of writing queries in C#, but with F# it looks OK.

Does it really make a difference?

As long as you don't use an approach that retrieves the documents before filtering them locally, you should be fine. All of the approaches above (apart from the piped sequence approach) produce a query that is sent to DocumentDB.

There are slight variations in the queries. The two LINQ-based approaches (chained LINQ and query expressions) produce the same query, while the embedded query approach (naturally) sends the query you wrote yourself.

Query request to DocumentDB when using LINQ or query expressions
{"query":"SELECT * FROM root WHERE (root.Name = \"Pet Sounds\") "}
Query request to DocumentDB when using embedded query
{"query":"SELECT * FROM Lps lps WHERE lps.Name = @Name","parameters":[{"name":"@Name","value":"Pet Sounds"}]}

There seem to be no notable differences between these approaches, so go ahead and pick the one you prefer.

Deleting documents from DocumentDB

Deleting a document from DocumentDB is a two-step process. There is currently no way to delete a document in one operation. Instead you need to do as follows.

  1. Query for a document
  2. Use the returned document "SelfLink" to delete the document
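The two steps can be sketched independently of the SDK. In the code below, `FoundDocument`, `deleteFirst`, `find` and `delete` are hypothetical names, and the link value is made up. In real code, `find` would be one of the Document queries shown further down, and `delete` would wrap a call to client.DeleteDocumentAsync with the SelfLink. An in-memory set stands in for the collection so the sketch can run on its own.

```fsharp
open System.Collections.Generic

// Hypothetical stand-in for the SDK's Document type: only the SelfLink matters here
type FoundDocument = { Id: string; SelfLink: string }

// Returns true if a document was found and deleted, false if nothing matched
let deleteFirst (find: unit -> seq<FoundDocument>) (delete: string -> Async<unit>) =
    async {
        // Step 1: query for the document
        match Seq.toList (find ()) with
        | [] -> return false
        | document :: _ ->
            // Step 2: delete the document through its SelfLink
            do! delete document.SelfLink
            return true
    }

// In-memory fake of a collection, keyed by a made-up link value
let store = HashSet<string>([ "link-beachy" ])

let find () =
    seq {
        if store.Contains "link-beachy" then
            yield { Id = "beachy"; SelfLink = "link-beachy" }
    }

let delete (link: string) = async { store.Remove(link) |> ignore }

let firstAttempt = deleteFirst find delete |> Async.RunSynchronously
let secondAttempt = deleteFirst find delete |> Async.RunSynchronously

printfn "first: %b, second: %b" firstAttempt secondAttempt
```

Returning a bool is just one possible design. It lets the caller tell "deleted" apart from "nothing to delete" without throwing.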

Since you need the SelfLink to delete a document, querying for the document becomes a bit harder when using F#. My preferred approach for querying above was query expressions. However, consider a strongly typed query like this one.

for d in client.CreateDocumentQuery<Lp>(collection.DocumentsLink) do 
where (d.Name = name)
select d

The result of the query is an actual Lp-object, not a Document-object. Since the SelfLink belongs to the Document-object, the result cannot be used for deleting the document.

In order to delete documents I've used the following approaches.

Querying by Id

If you don't need advanced querying and a simple lookup by Id is sufficient, a query expression will actually work. If your document has a (lower case) "id" property, you can query for the Document type instead of your own document type and filter on the value of Id.

With this record in DocumentDB
{ id = "beachy"; Name = "Pet Sounds"; Artist = "The Beach Boys"; LentTo = "Greg" } 
You can query on Id
for d in client.CreateDocumentQuery(collection.DocumentsLink) do 
where (d.Id = "beachy")
select d

This will return a Document, giving you access to the SelfLink required to delete the record.

Embedded queries

If you need to do advanced querying, you can always use an embedded query and simply ask for a Document-object rather than your own object, like so.

    let parameters = new SqlParameterCollection([| new SqlParameter("@Name", name) |])
    let spec = new SqlQuerySpec("SELECT * FROM Lps lps WHERE lps.Name = @Name", parameters)

    let delete =
        Seq.toList <| 
        client.CreateDocumentQuery<Document>(collection.DocumentsLink, spec).AsEnumerable()

Here we refer to the property "Name" in our query, but ask for a Document-object when we run the query. This will return a Document-object and give you access to the SelfLink required to delete the document.

Wrapping up

I have had some fun exploring the different options when using DocumentDB. And even though I've been doing some digging I still feel there are options I haven't explored and things that can be done better. Please don't hesitate to get in touch if you have suggestions for different approaches than the ones I have explored in this post.
