Archive for the ‘RavenDB’ Category

RavenDB – Searching across multiple properties

January 18th, 2012 1 comment

Ayende recently posted about Orders Search in RavenDB which got me a little bit excited, since I was pondering how I would do searching in RavenDB without having Full Text Searching from SQL Server.

Digging into it, I wanted to try it out for myself. Given the model:

    public class Post
    {
        public int Id { get; set; }
        public string Title { get; set; }
        public string Description { get; set; }
        public IEnumerable<string> Tags { get; set; }
        public DateTime DatePosted { get; set; }
    }

I’ve set up 10 posts (the insert script is in a linked pastie) with some really basic data.

Here is all the data that I’ve set up.

Tags

Tag Name        # of Posts Containing Tag
html            3
c#              6
ravendb         4
nhibernate      3
javascript      1
coffeescript    2
less            3
search          6
closures        1
jquery          2
css             1
queryover       2
mapreduce       4

Titles

Nothing interesting, just ‘Test Post X’ for each one to identify them.

Description

For this testing I’ve taken the names of a few blog posts from Google Reader that in some way relate to the tags above. Take a look at the script mentioned above to see the data.

Creating the Index

So the first thing I want to do is create a Map together with a ReduceResult class. We aren’t going to add a Reduce to the index, since we don’t need it to store or aggregate anything. We purely want a ReduceResult type that matches the shape of the map, so that we can query against it.

public class Post_Search : AbstractIndexCreationTask<Post, Post_Search.ReduceResult>
{
    public class ReduceResult
    {
        public object[] SearchQuery { get; set; }
        public DateTime DatePosted { get; set; }
    }

    public Post_Search()
    {
        Map = posts =>
            from post in posts
            select new
            {
                SearchQuery = post.Tags.Concat(new[]
                                                {
                                                    post.Description,
                                                    post.Title
                                                }),
                DatePosted = post.DatePosted
            };
    }
}

This index is a little bit funky, and differs from what Ayende showed in his example. I wanted to try something a little different.

In my scenario I have a collection of tags that I wanted to include in the search. Since the tags are already a collection, I concatenate an additional array holding the other items I want to add into the map.

The SearchQuery is the property that we will search against, while the DatePosted won’t be included in the search, but is there to provide additional filtering on my search.
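To make the shape of that field concrete, here is a small stand-alone sketch of the same Concat projection run in memory with plain LINQ. It does not touch RavenDB at all; the class name `SearchFieldSketch`, the helper `BuildSearchQuery`, and the sample post data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SearchFieldSketch
{
    // Same shape as the Map projection: the tags first, then description and title.
    public static string[] BuildSearchQuery(IEnumerable<string> tags, string description, string title)
    {
        return tags.Concat(new[] { description, title }).ToArray();
    }

    public static void Main()
    {
        // Hypothetical post data, not taken from the insert script.
        var values = BuildSearchQuery(
            new[] { "c#", "search" },
            "A made-up description",
            "Test Post 1");

        // Prints one line per value that would end up under SearchQuery.
        foreach (var value in values)
        {
            Console.WriteLine(value);
        }
    }
}
```

Every value in that array becomes something the index can match on, which is why a single SearchQuery field can stand in for tags, description and title at once.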

Querying

Querying threw me off at first, because in order to query against this index, we have to specify the ReduceResult class.

So we end up with the starting of our query looking like this:

var result = session.Query<Post_Search.ReduceResult, Post_Search>()

At first I thought “oh, that means we end up with a ReduceResult result type, this is pointless and useless”. But I commented on Ayende’s blog post and it turns out we can call ‘As<T>’ on the query.

Without filtering the results just yet, our query would look like the following:

var result = session.Query<Post_Search.ReduceResult, Post_Search>()
                    .As<Post>()
                    .ToList();

So if I run this up now, for a quick test, I should get 10 results back of type Post

image

Great!

So now I need to begin filtering the results. To begin with I’m going to use the .Where extension. Since we are looking at an object array, we can’t directly compare it to a string, but if we explicitly cast the string to an object we can look for:

coffeescript expecting 2 results:

var result = session.Query<Post_Search.ReduceResult, Post_Search>()
                    .Where(x => x.SearchQuery == (object)"coffeescript")
                    .As<Post>()
                    .ToList();

 

image

How about javascript expecting 2 (1 via Tag and 1 via the Description)

image

Oh, we didn’t get the desired result… This is because the search only matches a value exactly. The search value is an exact match for the tag, so that post is returned, but it is only one word inside the description of the other post, so that one is missed.

So to fix this we need to make the index analysed. Adding to the index:

Index(x => x.SearchQuery, FieldIndexing.Analyzed);

If we run the exact same query again:

image

Now we get 2 results.
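The difference between the two behaviours can be sketched in memory. This is only a simplified model and not how Lucene is implemented: the "analyzed" version just lower-cases each value and splits it on a few separator characters, which is an assumption made for illustration. The class name, method names and sample values are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnalyzedFieldSketch
{
    private static readonly char[] Separators = { ' ', ',', '.', ':', ';', '!', '?' };

    // Not analyzed: every value is a single term and has to match as a whole.
    public static bool MatchesWholeValue(IEnumerable<string> values, string term)
    {
        return values.Any(v => string.Equals(v, term, StringComparison.OrdinalIgnoreCase));
    }

    // Analyzed (simplified): values are lower-cased and split into words first.
    public static bool MatchesAnyWord(IEnumerable<string> values, string term)
    {
        return values
            .SelectMany(v => v.ToLowerInvariant().Split(Separators, StringSplitOptions.RemoveEmptyEntries))
            .Contains(term.ToLowerInvariant());
    }

    public static void Main()
    {
        // Hypothetical SearchQuery values for two posts.
        var taggedPost = new[] { "javascript", "Some description", "Test Post 1" };
        var describedPost = new[] { "c#", "Closures in javascript explained", "Test Post 2" };

        Console.WriteLine("whole value, tagged post:    " + MatchesWholeValue(taggedPost, "javascript"));
        Console.WriteLine("whole value, described post: " + MatchesWholeValue(describedPost, "javascript"));
        Console.WriteLine("any word, described post:    " + MatchesAnyWord(describedPost, "javascript"));
    }
}
```

In this model the post that only mentions javascript inside its description is found by the word-level match but not by the whole-value match, which mirrors the jump from 1 result to 2.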

Now to try something a little bit different. If we wanted to search for something like mvc, which happens to only be in the description, then rather than using ‘Where’ as shown above we can use ‘Search’ like so:

var result = session.Query<Post_Search.ReduceResult, Post_Search>()
                    .Search(x => x.SearchQuery, "mvc")
                    .As<Post>()
                    .ToList();

This will give us the same kind of result, except the query looks much cleaner:

image

Now there’s 1 catch I’ve found with this, which is that searching always matches whole words. I’m not sure (no research done into Lucene yet) if Lucene has the ability to do a wild-card type search similar to SQL’s LIKE ‘%mvc%’, but you can get suggestions from this.

For example, if I search for ‘coffee’ rather than ‘coffeescript’ I would expect all documents containing ‘coffee’ to be returned. This doesn’t happen. It does give you suggestions though.

Looking at the management studio for ‘coffee’ :

image
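For what it’s worth, Lucene’s query syntax does support a trailing wildcard such as coffee*. How to pass that through the RavenDB client is not covered here, so the following is only an in-memory sketch of the difference between a whole-term match and a prefix match over already-tokenised terms. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixMatchSketch
{
    // Whole-term match: the query has to equal one of the indexed terms.
    public static bool HasExactTerm(IEnumerable<string> terms, string query)
    {
        return terms.Contains(query, StringComparer.OrdinalIgnoreCase);
    }

    // Prefix match: the query only has to be the start of one of the terms.
    public static bool HasTermStartingWith(IEnumerable<string> terms, string prefix)
    {
        return terms.Any(t => t.StartsWith(prefix, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        // Hypothetical terms for one post.
        var terms = new[] { "coffeescript", "less", "search" };

        Console.WriteLine("exact 'coffee':  " + HasExactTerm(terms, "coffee"));
        Console.WriteLine("prefix 'coffee': " + HasTermStartingWith(terms, "coffee"));
    }
}
```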

Side Comment: I think it would be cool if RavenDB provided a way to include suggestions in the query itself, say like:

var result = session.Query<Post_Search.ReduceResult, Post_Search>()
                    .Search(x => x.SearchQuery, "coffee")
                    .IncludeAllSuggestions()
                    .As<Post>()
                    .ToList();

Or other variations such as:

.Suggestions.IncludeAll()

.Suggestions.IncludeTop(3)

.Suggestions.IncludeAll(WhenResults.AreEmpty)

.Suggestions.IncludeAll(WhenResults.AreLessThan, 10)

Hopefully you can work out where I’m going with this?

Ok continuing on. Why do we need to call ‘As<T>()’ on the query?

Well, my understanding of how RavenDB works is this: when we create an index, it creates a sub-set of data that points to the document in RavenDB.

For example I have all those documents inserted (link for the lazy), and these are all stored like so:

image

When we created the index with the following Map:

Map = posts =>
    from post in posts
    select new
    {
        SearchQuery = post.Tags.Concat(new[]
                                        {
                                            post.Description,
                                            post.Title
                                        }),
        DatePosted = post.DatePosted
    };

It basically created an index that looks like this, for the data above:

posts/2 SearchQuery: ["c#", "nhibernate", "search", "queryover", "Benjamin Day slides us into 'How to be a C# ninja in 10 easy steps'", "Test Post 2"]
DatePosted: "2012-01-02T00:00:00.0000000"

So the index actually points directly to a document in RavenDB. When we search against the index and a match is found, the index returns the Id ‘posts/2’, and RavenDB knows to go to the posts collection and grab the document with Id 2.
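As a mental model only, the lookup can be sketched with a list of index entries and a dictionary of documents. This is not RavenDB’s storage format; `IndexEntry`, `FindDocumentIds` and the sample data are hypothetical names made up for the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class IndexEntry
{
    public string DocumentId { get; set; }   // e.g. "posts/2"
    public string[] Terms { get; set; }      // the values stored under SearchQuery
}

public static class IndexLookupSketch
{
    // Step 1: the index only answers "which document ids match this term?".
    public static List<string> FindDocumentIds(IEnumerable<IndexEntry> index, string term)
    {
        return index.Where(e => e.Terms.Contains(term))
                    .Select(e => e.DocumentId)
                    .ToList();
    }

    public static void Main()
    {
        // Hypothetical documents, keyed by id and reduced to their titles.
        var documents = new Dictionary<string, string>
        {
            { "posts/1", "Test Post 1" },
            { "posts/2", "Test Post 2" }
        };

        var index = new[]
        {
            new IndexEntry { DocumentId = "posts/1", Terms = new[] { "html", "css" } },
            new IndexEntry { DocumentId = "posts/2", Terms = new[] { "c#", "nhibernate", "search" } }
        };

        // Step 2: each matching id is used to load the full document.
        foreach (var id in FindDocumentIds(index, "search"))
        {
            Console.WriteLine(id + " -> " + documents[id]);
        }
    }
}
```

The query is written against the shape of the index entry, but what comes back is the document the entry points to, which is the gap that ‘As<T>’ closes on the type level.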

The problem with the query is we need to specify an object to query against.

So we introduced the ReduceResult (not sure on this naming but I took it from Ayende’s blog), this allows us to specify the Properties we defined in our index, as search criteria, but now our query is expecting ReduceResult:

image

By specifying ‘As<Post>()’ we are telling the query that our result is going to be of type ‘Post’:

image

Conclusion

This functionality is really cool, it allows us to easily search against multiple different properties without having to create messy conjunctions in our LINQ. If we were to attempt to do this without an index, we would probably end up writing something like:

var result = session.Query<Post>()
                    .Where(x =>
                            x.Description.Contains("c#")
                            ||
                            x.Tags.Any(y => y == "c#")
                            ||
                            x.Title.Contains("c#")
                        )
                    .ToList();

And really, that’s just nasty… Especially considering we get the same results from writing more readable code:

image

Categories: RavenDB

RavenDB – Map Reduce

December 22nd, 2011 No comments

So, learning Map Reduce in RavenDB, I decided to take what I learnt from the index created in my previous post. I think I picked something rather difficult to begin with, but I’ve succeeded :)

Given a document type (Content in the code below) which has a collection of Tags, I want to get a count of each Tag across all documents.

public class Content
{
    public int Id { get; set; }
    public string Title { get; set; }

    public IEnumerable<Tag> Tags { get; set; }
}

public class Tag
{
    public string Name { get; set; }
}

Note: Tag is its own class because I added additional properties to it.

Now I insert some data:

using (var session = documentStore.OpenSession())
{
    session.Store(new Content
    {
        Title = "Test Title for a Video",
        Tags = new List<Tag>
        {
            new Tag() {Name = "c#"},
            new Tag() {Name = "autofac"},
            new Tag() {Name = "asp.net"},
        }
    });
    session.Store(new Content
    {
        Title = "Test Title for an Article",
        Tags = new List<Tag>
        {
            new Tag() {Name = "c#"},
            new Tag() {Name = "nhibernate"},
            new Tag() {Name = "fluent-nhibernate"},
            new Tag() {Name = "mvc"}
        }
    });
    session.Store(new Content
    {
        Title = "Test Title for an Article",
        Tags = new List<Tag>
        {
            new Tag() {Name = "ravendb"},
            new Tag() {Name = "asp.net"},
            new Tag() {Name = "autofac"},
            new Tag() {Name = "c#"}
        }
    });

    session.SaveChanges();
}

 

So I’m expecting a count of:

  • 3 x c#
  • 2 x autofac
  • 2 x asp.net
  • 1 x ravendb
  • 1 x mvc
  • 1 x nhibernate
  • 1 x fluent-nhibernate

I’m going to pull these out with a defined type rather than dynamic/object, so I’ve created a new class with Count and Name:

    public class TagResult
    {
        public int      Count   { get; set; }
        public string   Name    { get; set; }
    }

So creating a new Index:

public class All_Tags : AbstractMultiMapIndexCreationTask<TagResult>
{
    public All_Tags()
    {
    }
}

The first thing I need to do is map out ONLY the tags. When I select the tags, I’m also going to include another field called Count, with a default value of 1. This is so I can re-use it to sum the total number of times the tag is used.

AddMap<Content>(contents => from content in contents
                            from tag in content.Tags
                            select new
                            {
                                Name = tag.Name,
                                Count = 1
                            });

This would give me a result that contains duplicates for the tags. Along the lines of:

c# 1
c# 1
c# 1
autofac 1
autofac 1
asp.net 1
asp.net 1
ravendb 1
mvc 1
nhibernate 1
fluent-nhibernate 1

So what I need to do in the Reduce, is group the tags together by their Name.

Reduce = results => from result in results
                    group result by result.Name into tag
                    select new
                    {
                        Count = tag.Sum(x => x.Count),
                        Name = tag.Key,
                    };

So here, I group all the tags together by their name, but I also sum the ‘count’ value together to get the total number of times the tag is used.
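The same two steps can be tried out in memory with plain LINQ, which is a handy way to check the grouping logic before it goes into an index. This sketch does not use RavenDB; `TagCountSketch` and `CountTags` are hypothetical names, and the tag lists are the three documents inserted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagCountSketch
{
    public static Dictionary<string, int> CountTags(IEnumerable<IEnumerable<string>> tagsPerDocument)
    {
        // Map: one (Name, Count = 1) entry per tag occurrence, duplicates included.
        var mapped = from tags in tagsPerDocument
                     from tag in tags
                     select new { Name = tag, Count = 1 };

        // Reduce: group the entries by name and sum their counts.
        var reduced = from entry in mapped
                      group entry by entry.Name into g
                      select new { Name = g.Key, Count = g.Sum(x => x.Count) };

        return reduced.ToDictionary(x => x.Name, x => x.Count);
    }

    public static void Main()
    {
        var documents = new[]
        {
            new[] { "c#", "autofac", "asp.net" },
            new[] { "c#", "nhibernate", "fluent-nhibernate", "mvc" },
            new[] { "ravendb", "asp.net", "autofac", "c#" }
        };

        foreach (var pair in CountTags(documents))
        {
            Console.WriteLine(pair.Value + " x " + pair.Key);
        }
    }
}
```

One thing this makes visible is that the Reduce output has the same shape as the Map output (Name and Count), which is what lets the reduce step be applied to its own results.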

Now run up the app and view the index:

image

Now if I query the index:

image

Awesome. Now to query this, I have to use the TagResult class defined previously, and the All_Tags index just created.

using (var session = documentStore.OpenSession())
{
    var result = session.Query<TagResult, All_Tags>()
                        .ToList();

    foreach (var tag in result)
    {
        Console.WriteLine(tag.Count + " x " + tag.Name);
    }

    session.SaveChanges();
}

Running this I get the following result:

image

The results I expected previously.

So there you have it. Map Reduce.

Categories: RavenDB

RavenDB Inheritance–Revisited

December 14th, 2011 1 comment

After my initial post on RavenDB Inheritance and the issue I had with polymorphic queries, I went looking for help from the guys in JabbR and the RavenDB Google Group. Ayende ended up doing a screencast with me where he solved all my problems.

One of the things he asked me was what I was trying to achieve by having a polymorphic query, which was a very good question, something I hadn’t really thought about.

The problem I was trying to solve was actually displaying search results.

The Problem

So I’m working on a personal project, and I need to display a few things which are similar, but different. There are 3 different types but I’ll use two to keep it simple. I’ve also cut out most of the properties.

So I have an abstract class Content, with two derived classes, Article and Video.

public abstract class Content
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime DatePublished { get; set; }
}

public class Article : Content
{
    public string HtmlContent { get; set; }
}

public class Video : Content
{
    public string Description { get; set; }
    public string VideoUrl { get; set; }
}

Then I initialize the DocumentStore and store a couple of documents.

var documentStore =
    (new DocumentStore()
            {
                Url = "http://localhost:8080"
            }).Initialize();

using (var session = documentStore.OpenSession())
{
    session.Store(new Video
    {
        DatePublished = DateTime.Now,
        Description = "Test Description for a Video",
        Title = "Test Title for a Video",
        VideoUrl = "http://www.youtube.com/watch?v=PGz9GokDkkg"
    });

    session.Store(new Article
    {
        DatePublished = DateTime.Now,
        Title = "Test Title for an Article",
        HtmlContent = "Some content for the article…"
    });

    session.SaveChanges();
}

This time I’m not using the Convention to store the two documents as ‘Content’, rather I’m allowing it to store them as what they are. This gives me a result in Raven like:

image

Now if I query for Video:

using (var session = documentStore.OpenSession())
{
    var result = session.Query<Video>().ToList();
    
    foreach (var content in result)
    {
        Console.WriteLine(content.Id);
        Console.WriteLine(content.Title);
    }
}

I get the output of the first Document.

image

Likewise if I select ‘Article’ I get the Article document that I previously stored.

So how do I get a list of Content?

The Solution

So, the solution is really, really easy, it’s an index.

The first thing Ayende showed me was creating the index in RavenDB Management Studio, then he showed me doing it in code. I’m just going to show it done in code.

I created a class called ‘All_Content’ (with an underscore) like so:

public class All_Content : AbstractMultiMapIndexCreationTask
{
    public All_Content()
    {
        AddMap<Article>(articles => from article in articles
                                    select new
                                                {
                                                    article.Id,
                                                    article.Title,
                                                    article.DatePublished
                                                });
        AddMap<Video>(videos => from video in videos
                                select new
                                            {
                                                video.Id,
                                                video.Title,
                                                video.DatePublished
                                            });
    }
}

It reminds me of writing a Union View in SQL Server in some ways. It maps over both the Articles and the Videos, but only selects the things I need: the fields that would actually be displayed on screen, or that are common between the two document types.
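The union idea can be sketched in memory with LINQ’s Concat. This is only an analogy for what the two AddMap calls produce together, not how RavenDB builds the index; `ArticleDoc`, `VideoDoc`, `IndexRow` and `BuildRows` are hypothetical names used for the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two document types.
public sealed class ArticleDoc
{
    public string Title { get; set; }
    public DateTime DatePublished { get; set; }
    public string HtmlContent { get; set; }
}

public sealed class VideoDoc
{
    public string Title { get; set; }
    public DateTime DatePublished { get; set; }
    public string VideoUrl { get; set; }
}

// The common shape both maps project into.
public sealed class IndexRow
{
    public string Title { get; set; }
    public DateTime DatePublished { get; set; }
}

public static class MultiMapSketch
{
    public static List<IndexRow> BuildRows(IEnumerable<ArticleDoc> articles, IEnumerable<VideoDoc> videos)
    {
        // One projection per source type, like one AddMap call each.
        var fromArticles = articles.Select(a => new IndexRow { Title = a.Title, DatePublished = a.DatePublished });
        var fromVideos = videos.Select(v => new IndexRow { Title = v.Title, DatePublished = v.DatePublished });

        // The "union": rows from both sources end up in one sequence.
        return fromArticles.Concat(fromVideos).ToList();
    }

    public static void Main()
    {
        var rows = BuildRows(
            new[] { new ArticleDoc { Title = "Test Title for an Article", DatePublished = new DateTime(2011, 12, 14) } },
            new[] { new VideoDoc { Title = "Test Title for a Video", DatePublished = new DateTime(2011, 12, 14) } });

        foreach (var row in rows)
        {
            Console.WriteLine(row.Title);
        }
    }
}
```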

Then I create the index right after I initialize the DocumentStore:

IndexCreation.CreateIndexes(typeof(All_Content).Assembly, documentStore);

This creates the index in RavenDB for me.

image

As you can see, even though I named the index class with an underscore, it is converted to All/Content, which is a really nice way of presenting it. I think it will go well for being able to create descriptive indexes in the future.
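The naming rule observed here looks like a straight swap of underscores for slashes on the class name. The sketch below reproduces that observation with a hypothetical helper `ToIndexName`; it is an assumption based on the result shown, not a copy of the client’s code.

```csharp
using System;

// Empty stand-in with the same name as the index class above.
public class All_Content { }

public static class IndexNameSketch
{
    // Assumed rule: take the class name and replace "_" with "/".
    public static string ToIndexName(Type indexType)
    {
        return indexType.Name.Replace("_", "/");
    }

    public static void Main()
    {
        Console.WriteLine(ToIndexName(typeof(All_Content))); // prints "All/Content"
    }
}
```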

And the index itself:

image

Now I need to actually query against the index. That’s also really really easy. When I specify the type, I can specify the index with it:

using (var session = documentStore.OpenSession())
{
    var result = session.Query<Content, All_Content>().ToList();
    
    foreach (var content in result)
    {
        Console.WriteLine(content.Id);
        Console.WriteLine(content.Title);
    }
}

Now when I run this I get the output:

image

Awesome!

The really interesting thing I found is that if I look at what’s returned:

image

The results are the correct CLR types that I originally defined. So I haven’t lost the additional fields by leaving them out of the index. I’m still learning, but for now I assume the fields listed in the index are simply the ones that become searchable.

Extras

One of the additional things Ayende showed me was that you can include other documents that don’t inherit from the base type. You can include those in the index map, and then rather than returning a concrete type, you can specify object, or dynamic.

var result = session.Query<dynamic, All_Content>().ToList();

RavenDB is really powerful. It’s truly amazing, and so much nicer to work with in .NET than other document databases like MongoDB.

Categories: RavenDB

RavenDB Inheritance

December 10th, 2011 1 comment

Edit: Updated solution: http://www.philliphaydon.com/2011/12/ravendb-inheritance-revisited/

Continuing my learning of RavenDB, I wanted to see how it handled Inheritance.

I found: http://ravendb.net/faq/polymorphic-indexes

This shows how to select over all types of ‘Animal’ for the example given. So I wanted to see what happens before and after using this method.

So like the example shown I’ve created an Animal, with a Dog and Cat.

public abstract class Animal
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Dog : Animal { }
public class Cat : Animal { }

Now if I insert a Dog and Cat:

using (var session = documentStore.OpenSession())
{
    session.Store(new Dog() { Name = "Test Dog" });
    session.Store(new Cat() { Name = "Test Cat" });

    session.SaveChanges();
}

What’s stored in RavenDB is two separate documents, one for ‘dogs’ and one for ‘cats’.

image

If I include the convention:

var documentConvention =
    new DocumentConvention()
        {
            FindTypeTagName =
                type =>
                    {
                        if (typeof (Animal).IsAssignableFrom(type))
                            return "animals";
                        return DocumentConvention.DefaultTypeTagName(type);
                    }
        };

Note: You can define the convention inline when the DocumentStore is initialized. I broke the two up so that it would fit more easily into my blog. Otherwise it’s too nested and yucky.

var documentStore =
    (new DocumentStore()
            {
                Url = "http://localhost:8080",
                Conventions = documentConvention
            }).Initialize();
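To see what the type-to-name function returns without a running server, the same check can be exercised on its own. In this sketch `DefaultTagName` is a naive, hypothetical stand-in for DocumentConvention.DefaultTypeTagName (it just lower-cases the type name and appends "s"), and `Project` is an unrelated type added for contrast.

```csharp
using System;

public abstract class Animal
{
    public string Name { get; set; }
}

public class Dog : Animal { }
public class Cat : Animal { }

// Unrelated type, only here to show the fall-through case.
public class Project { }

public static class TypeTagSketch
{
    // Same decision as the convention above: anything assignable to Animal is "animals".
    public static string FindTypeTagName(Type type)
    {
        if (typeof(Animal).IsAssignableFrom(type))
            return "animals";
        return DefaultTagName(type);
    }

    // Hypothetical stand-in for the library default; the real pluralisation may differ.
    public static string DefaultTagName(Type type)
    {
        return type.Name.ToLowerInvariant() + "s";
    }

    public static void Main()
    {
        Console.WriteLine(FindTypeTagName(typeof(Dog)));
        Console.WriteLine(FindTypeTagName(typeof(Cat)));
        Console.WriteLine(FindTypeTagName(typeof(Project)));
    }
}
```

Because IsAssignableFrom is also true for Animal itself, the base type and every derived type share the one name.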

Now when I insert a Dog and Cat I get:

image

Awesome. If we look at the document however:

image

There is no information about it being a cat or dog. I thought it would add some sort of discriminator, similar to how NHibernate works.

However, if we look at the Metadata tab:

image

We can see the CLR type is stored in the metadata so RavenDB knows what type to create when we query it.

This means if we query for ‘Animal’ we get a list of Dogs and Cats.

using (var session = documentStore.OpenSession())
{
    var result = session.Query<Animal>();

    foreach (var animal in result)
    {
        Console.WriteLine(animal.Name);
    }
}

 

image

However, if you wanted to query for just Dogs, like so:

var result = session.Query<Dog>().ToList();

It doesn’t seem to work :(

image

I’m probably just doing something wrong. Either way, the more I play with RavenDB, the more I love it.

Categories: RavenDB

RavenDB – Changing the Lo on the HiLo Generator

October 24th, 2011 No comments

I’m currently learning RavenDB, and it’s awesome! But I noticed that each time I ran up my application to test, the Ids generated for the data I put in started from a much higher number:

1, 2, 3, 4, 5…

1024, 1025, 1026, 1027, 1028…

2048, 2049, 2050, 2051, 2052…

This would be fine after the app is deployed since I wouldn’t be restarting it over and over and over, but during development I personally find it annoying that the numbers jump so high.

Fortunately I figured out a way (which, about an hour later, I also found on Google Groups; granted, I had to use a different keyword to find it).

Basically you just need to create a new instance of the MultiTypeHiLoKeyGenerator class, passing in the arguments and assigning it to the document store:

var documentStore = new DocumentStore { Url = "http://localhost:12321/" };
documentStore.Initialize();

var generator = new MultiTypeHiLoKeyGenerator(documentStore, 10);
documentStore.Conventions.DocumentKeyGenerator =
    entity => generator.GenerateDocumentKey(documentStore.Conventions, entity);

using (var session = documentStore.OpenSession())
{
    session.Store(new Project() { Title = "Hello World" });
    session.SaveChanges();
}

So running up my app once:

image

And again:

image

Now the identity only jumps by a small amount every time the app restarts. And to show it generates more than 1 number…

image
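The reason a restart skips numbers at all is the HiLo idea itself: the client reserves a whole block of ids up front and hands them out locally, so any unused ids in the block are lost when the process ends. Below is a minimal in-memory sketch of that idea. `FakeServer` and `HiLoGenerator` are hypothetical classes written for illustration, and the exact range boundaries in RavenDB’s own generator may differ from the ones used here.

```csharp
using System;

// Stand-in for the server-side counter that remembers which block was handed out last.
public sealed class FakeServer
{
    private long _hi;

    public long ReserveNextRange()
    {
        return ++_hi;
    }
}

public sealed class HiLoGenerator
{
    private readonly FakeServer _server;
    private readonly long _capacity;
    private long _next = 1; // next id to hand out
    private long _max = 0;  // last id of the reserved block; 0 means nothing reserved yet

    public HiLoGenerator(FakeServer server, long capacity)
    {
        _server = server;
        _capacity = capacity;
    }

    public long NextId()
    {
        if (_next > _max)
        {
            // Block n covers ((n - 1) * capacity + 1) .. (n * capacity) in this sketch.
            long hi = _server.ReserveNextRange();
            _max = hi * _capacity;
            _next = _max - _capacity + 1;
        }
        return _next++;
    }
}

public static class HiLoDemo
{
    public static void Main()
    {
        var server = new FakeServer();

        // First "run" of the app uses a few ids from its block.
        var firstRun = new HiLoGenerator(server, 10);
        Console.WriteLine(firstRun.NextId());
        Console.WriteLine(firstRun.NextId());

        // A "restart" creates a new generator, which has to reserve a fresh block.
        var secondRun = new HiLoGenerator(server, 10);
        Console.WriteLine(secondRun.NextId());
    }
}
```

With a smaller capacity the blocks are smaller, so fewer ids are thrown away per restart; the trade-off is that a long-running app has to go back to the server for a new block more often.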

It took a while of hunting on the net, but it turns out that Googling and Binging, or searching StackOverflow and Google Groups, for the keyword ‘Lo’ doesn’t work. The argument is called ‘capacity’, and searching for that on Google Groups led me here:

http://groups.google.com/group/ravendb/…..q=capacity

Hopefully someone else finds this useful :)

Categories: RavenDB