My main goal is to set up MongoDB for small-scale applications that aren’t going to scale up to lots of users and multiple servers. I’ve installed MongoDB as a service and started to play around with the 10gen C# driver. There are a couple of C# drivers already out there (NoRM and mongodb-csharp being the most popular), and people report varying levels of success. 10gen has also released a driver which has caught up feature-wise to the others. I decided to use this one because it’s easy to use and I do get a warm and fuzzy feeling knowing that it is from the 10gen folks. You can find the repository at https://github.com/mongodb/mongo-csharp-driver
Into the code!
The best place to get started is the 10gen C# driver tutorial at http://www.mongodb.org/display/DOCS/CSharp+Driver+Tutorial. It covers what you need to get started and sometimes a bit more.
The app I’m writing is a simple one that stores usernames and passwords along with some other information, like the URL of the website the password is for and any notes that we might want to add. It’s just going to be on our internal network and won’t have any interface to the internet. That means I’m not focusing on things like encrypting the passwords, or the other security measures it would need if it were ever exposed publicly.
Connecting to your database is pretty straightforward. There are many other options available, but to get started, you just need the server name.
MongoServer server = MongoServer.Create("mongodb://myserver");
MongoDatabase db = server.GetDatabase("TheDatabase");
One of the things I like about MongoDB is that you don’t have to create the database up front. Just ask for it by name, and MongoDB creates it the first time you store something in it.
So now that we have a reference to our new database, what are we going to do with it? Most of the time, we already have a model ready to be stored. Here’s my class that holds a password set.
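As a rough sketch of where those other options go, the connection string can carry more than the server name. The port below is MongoDB's default and is only written out to show its position; the class name is made up for this example, and I'm assuming the MongoDB.Driver assembly is referenced.

```csharp
using System;
using MongoDB.Driver;

class ConnectionSketch
{
    static void Main()
    {
        // General shape of a MongoDB connection string:
        //   mongodb://[username:password@]host[:port][/database][?options]
        // "myserver" is the host used above; 27017 is MongoDB's default port,
        // spelled out here only for illustration.
        MongoServer server = MongoServer.Create("mongodb://myserver:27017");
        MongoDatabase db = server.GetDatabase("TheDatabase");

        // Print the name of the database we asked for.
        Console.WriteLine("Using database: " + db.Name);
    }
}
```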
public class CredentialSet
{
public ObjectId _id { get; set; }
public string Title { get; set; }
public string Username { get; set; }
public string Password { get; set; }
public string WebSite { get; set; }
public string Notes { get; set; }
public int Owner { get; set; }
public DateTime LastUpdate { get; set; }
}
It’s a pretty basic class. The only addition from standard C# is the ObjectId class. This class represents the default MongoDB identifier. You can choose to use your own unique identifier, but for now, I’m just going to use the default.
Our next step is to create an instance of our class and save it to the database. But first, we need a place to store it. In the relational world, we would use tables inside our database to store the data. In the MongoDB world, we use the Collection. Just like the database, you just ask for it by name, and it gets created on first use if it doesn’t exist.
MongoCollection<CredentialSet> passwords = db.GetCollection<CredentialSet>("passwords");
As you can see, we can specify our CredentialSet class when we get our collection. Even though MongoDB is a schema-less document store, it does make life easier to have a standard, static type to work with. When we specify a class like this, we are telling the driver to use our CredentialSet as the default when pulling our documents from the database. You can still insert any type of document you want, but this style saves us some keystrokes later on.
So now let’s save our document.
var password = new CredentialSet();
// set the property values.
passwords.Save(password);
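To make the "set the property values" step concrete, here is a minimal end-to-end sketch. The server, database, collection and class are the ones from this post, and the values are the ones that show up in the record below. The using directives assume the driver's MongoDB.Bson and MongoDB.Driver assemblies are referenced, and the SaveSketch class name is just for illustration.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

// Same model as above, repeated so the sketch compiles on its own.
public class CredentialSet
{
    public ObjectId _id { get; set; }
    public string Title { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
    public string WebSite { get; set; }
    public string Notes { get; set; }
    public int Owner { get; set; }
    public DateTime LastUpdate { get; set; }
}

class SaveSketch
{
    static void Main()
    {
        MongoServer server = MongoServer.Create("mongodb://myserver");
        MongoDatabase db = server.GetDatabase("TheDatabase");
        MongoCollection<CredentialSet> passwords =
            db.GetCollection<CredentialSet>("passwords");

        // Fill in the document with an object initializer.
        var password = new CredentialSet
        {
            Title = "A password",
            Username = "username",
            Password = "password",
            WebSite = "www.google.com",
            Notes = "This is a password!",
            Owner = 1,
            LastUpdate = DateTime.Now
        };

        // Save inserts the document, or replaces an existing one with the same _id.
        passwords.Save(password);
    }
}
```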
We can now use a tool like MongoVUE to see our record in MongoDB. When we take a look at it, we see something a little unexpected. Our _id is all zeros!
/* 0 */
{
"_id": "000000000000000000000000",
"Title": "A password",
"Username": "username",
"Password": "password",
"WebSite": "www.google.com",
"Notes": "This is a password!",
"Owner": "1",
"LastUpdate": "Tue, 1 Feb 2011 10:47:20 GMT -05:00"
}
Doing some research, I found some references to a known issue with the 10gen client, and a simple fix. We just need to add an attribute to our model’s _id property. Here’s the updated CredentialSet.
public class CredentialSet
{
[BsonId]
public ObjectId _id { get; set; }
public string Title { get; set; }
public string Username { get; set; }
public string Password { get; set; }
public string WebSite { get; set; }
public string Notes { get; set; }
public int Owner { get; set; }
public DateTime LastUpdate { get; set; }
}
This tells the driver that we want to use the _id property as the internal MongoDB identifier. After deleting our existing item using MongoVUE, we can run our sample again and examine the record.
/* 0 */
{
"_id": "4d38833880844214f0a8c60b",
"Title": "A password",
"Username": "username",
"Password": "password",
"WebSite": "www.google.com",
"Notes": "This is a password!",
"Owner": "1",
"LastUpdate": "Tue, 1 Feb 2011 10:47:20 GMT -05:00"
}
Much better. Now we can try to pull that document out. There are lots of queries you might want to do. Way more than I can go through. I’m just going to show two simple examples. The first is pulling out all documents, and the second is finding a specific record based on a single field.
Let’s start with all records.
var allPasswords = passwords.FindAll();
It doesn’t get much easier than this! Again, we can use this simple method because we’ve specified a default document class. From here, we have a collection of CredentialSet objects that we can work with using standard methods such as foreach or LINQ to Objects. So now let’s get a specific document.
To get a specific document, we need to build up a Query object to tell the driver how to create the JSON that MongoDB will use to find our document. From there, we use the FindOne method on the collection.
var query = Query.EQ("Title", "A password");
var oneDocument = passwords.FindOne(query);
There are lots of options when creating a Query. The one we used here, EQ, does a simple equality comparison. It matches documents where the Title field is exactly ‘A password’, and FindOne hands back the first match. Since this was the Title we put in above, that’s the one we get back. Just about all of the options for querying are available. The 10gen C# driver page does a good job covering them.
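Here is a slightly fuller sketch that pulls the retrieval pieces together: looping over FindAll, filtering in memory with LINQ to Objects, and combining two conditions with Query.And. The namespaces are the ones I'd expect in the 1.x releases of the driver (earlier previews may differ), the Owner and WebSite filter is just an example I made up from the sample record, and the QuerySketch class name is for illustration only.

```csharp
using System;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

// Same model as above, repeated so the sketch compiles on its own.
public class CredentialSet
{
    [BsonId]
    public ObjectId _id { get; set; }
    public string Title { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
    public string WebSite { get; set; }
    public string Notes { get; set; }
    public int Owner { get; set; }
    public DateTime LastUpdate { get; set; }
}

class QuerySketch
{
    static void Main()
    {
        MongoServer server = MongoServer.Create("mongodb://myserver");
        MongoDatabase db = server.GetDatabase("TheDatabase");
        MongoCollection<CredentialSet> passwords =
            db.GetCollection<CredentialSet>("passwords");

        // FindAll returns a cursor that can be enumerated like any IEnumerable.
        foreach (CredentialSet credential in passwords.FindAll())
        {
            Console.WriteLine("{0} ({1})", credential.Title, credential.WebSite);
        }

        // LINQ to Objects: this filter runs in memory, after the documents
        // have been pulled back from the server.
        var mine = passwords.FindAll().Where(c => c.Owner == 1).ToList();
        Console.WriteLine("Owner 1 has {0} entries", mine.Count);

        // Query.And combines conditions so the server does the filtering instead.
        var query = Query.And(
            Query.EQ("Owner", 1),
            Query.EQ("WebSite", "www.google.com"));
        foreach (CredentialSet credential in passwords.Find(query))
        {
            Console.WriteLine(credential.Title);
        }

        // FindOne returns null when nothing matches, so check before using it.
        // The query document sent for this is { "Title" : "A password" }.
        CredentialSet one = passwords.FindOne(Query.EQ("Title", "A password"));
        if (one != null)
        {
            Console.WriteLine("Found: " + one.Username);
        }
    }
}
```

For a small collection like this one, the in-memory filter is fine. Pushing the condition into the query is the habit worth building for anything larger.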
Wrapping up
This was my first use of MongoDB. With the basics of saving and retrieval down, I can move forward on getting an app up and running. I know that this is really simple and it doesn’t cover any of the features MongoDB is known for such as master/slave replication or sharding. I also don’t do any error checking.
Something as simple as this can be done with any relational database. But in order to do this, I’d need to hook in an ORM such as NHibernate or EF4. That means extra code. The MongoDB driver handles all of the class-to-JSON mapping automatically. That’s what I’m looking for with this.
Standard tutorial disclaimer: None of this code is what I consider Production Ready. It did give me a starting point to move forward from. Hopefully it helps someone else as well.
I’ve been reading up on MongoDB. I picked up MongoDB: The Definitive Guide. It’s not a bad reference to have. I’m actually surprised how thin it is, yet it answered just about every question I had. It covered everything I wanted to know in getting started with MongoDB when looking at it from a “SQL” perspective. It did a good job explaining about de-normalizing your schema because document databases just don’t work that way. It also covered the basics of accessing subdocuments, or arrays in arrays. I pretty much understand its benefits and limits as a database. Rightfully so, it didn't really cover single instance scenarios to the depth I have been thinking about. Sure, we all know how great all of the NoSQL databases scale when you need to handle 1.21 bajillion requests at once. But what if you want to use the ease and speed of storing and accessing data for a small app? I know it’s not sexy enough to cover in the mainstream, but what about the people that want a document database for use in a small environment? That’s what I want to know about.
What if I want to create the back end of a billing system or CRM that isn’t going to scale beyond a single office? What if I’m never going to go past a thousand users?
I think that there can be a great benefit to the small app developer by using a document or key/value database instead of only a traditional ACID compliant, relational database. I’m going to experiment with using MongoDB as a central database for a few small apps. Why shouldn't we use the ease of access provided by these systems for apps that will never hit millions of users? Sure, ORMs are great, but what if we never had to use them at all? My next little hobby project will be to convert a little password app for my wife to sync up to a MongoDB server. It currently uses a local SQLCE database which works just fine. I think by expanding it to also save on one of my servers, it will provide a backup of the data in case her laptop crashes. It will also let her search my passwords, and me, hers.
I want to explore the best settings for MongoDB when you only plan on using it on a single server. I feel that this is an underserved area. There are lots of companies, for better or for worse, that don’t have the ability to scale out across multiple servers for their critical systems. Should they stick with traditional relational databases or can they too enjoy the performance benefits of a document database?
The goal for my next post is to provide the answers to this question. What is the best configuration for a single server environment?
There is so much amazing stuff happening with technology today. I think I’m at a good spot in my career where I need to push my boundaries. I don’t have any development work as part of my job, so I think I need to pick up some hobby projects. My comfort zone is within the Microsoft stack: C#, IIS, and SQL Server really. Below is a list of things that I think I need to learn more about in 2011. It’s more like a list of personal goals, but also in case I lose focus as time goes on it’s here to remind me.
1) MongoDB
I have installed MongoDB twice before, but never really did anything with it other than walk through a couple of tutorials, think, “That’s pretty neat”, and move on. There’s something about the scalability and ease of use of a NoSQL database that I find interesting.
2) MonoDroid
I know this isn’t a large learning curve since I will still be using C#, but I have never done mobile development. I have an HTC Incredible and I love it. I’m sure I can come up with some app that I want and code it up. I just signed up for the Preview. Crossing my fingers that I get accepted.
3) Clojure
I walked through a few of the tutorials but I haven’t really dove into it. I really need to figure out a project that could use Clojure. Functional programming really interests me, but I just don’t have the need for it… yet.
4) Windows Azure
Yeah, I know that’s really broad and there are a bunch of different parts to it. I have an MSDN subscription so I can play around on a small scale for free. I think it’s going to be important to really understand how it works and what I might be able to use it for.
5) node.js
It’s not super Windows friendly, but it sounds like you can get it to run via Cygwin. It sounds like it’s really cool and worth checking out. At a minimum I can pretend to be one of the cool kids that knows its potential.
I’m sure this will change, but it’s a pretty good start I think.
The current popular topic among developers is distributed version control. The current standard is Subversion, which was a nice improvement over CVS. Like most technology, when there are pain points, someone is going to improve on it. That’s where distributed version control systems come in. The two front runners are Git and Mercurial. There are dozens of blog posts about using a DVCS and what’s good about them over Subversion, so I’m not going to talk about that. Instead, this is about the factors in how I chose between them as I migrate away from Subversion.
Here’s my environment and what I needed:
1) As a .NET developer, I use Windows.
2) The repository should be easily accessible over the internet from my servers.
3) The repository needs to be locked down. The code I write isn’t open source.
Both DVCSs do quite well with #2 and #3. But one of them does much better at #1 and makes #2 easier. Anyone who has looked at Git or Mercurial knows that Git wasn’t exactly designed with Windows support in mind. It can be used in Windows, and many people do. If it had a native Windows implementation, it probably would have been my choice. This StackOverflow question made it clear that Git on Windows is not ready for prime time.
Ok, Mercurial it is!
So now to get started. There are tons of good tutorials about getting started. The two places I used were www.hginit.com and the TekPub series Mastering Mercurial. Getting up and running locally is pretty simple. TortoiseHg and VisualHg are on par with their Subversion counterparts, although I still learned about the command line options. I did struggle a bit to have Mercurial import my existing Subversion repositories, but again, a quick search brought up a bunch of pages with the fix.
The next step is to get the repositories up online. The TekPub series had a great walkthrough for getting hgwebdir.cgi set up. Most of the difficulty is making sure that Python is enabled in IIS; after that, it’s the permissions that you need to get right. In the past, I played around with PHP and Python a little, so I had done most of those steps before. The overall setup in IIS does require basic authentication as the first level of security. That of course means you need to use SSL to keep the username and password from being sent in clear text. Luckily, I already have an SSL certificate installed and ready.
Tying up the loose ends
So now I’ve gotten everything set up. I can push and pull from my “master” copy. The one thing that is driving me crazy is that I keep getting prompted for my credentials! I was a little surprised that there weren’t more articles about this. I was led down the correct path by an older post about someone trying to do this on Linux. Luckily it translated over directly.
Mercurial on Windows uses INI files for most settings. There are global settings which are stored in the TortoiseHg install folder, and per-user settings in the root of the user’s profile. Here’s the section that needs to be added:
[auth]
group.prefix = server.company.com
group.username = domainuser
group.password = password
The ‘group’ that prefixes the entries is just a friendly name for the group of properties that make up one authentication entry. It’s not tied to anything like the server name or repository. These settings will apply to any repository hosted on the server entered in the group.prefix field. After this was saved, I no longer got prompted when I pushed or pulled from the main server.
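If storing the password in plain text in an INI file feels uncomfortable, a variation worth considering is sketched below. It uses the same placeholder server and username as above. The schemes key limits the entry to HTTPS, and leaving out the password means Mercurial still asks for it but fills in the username for you. I haven't switched to this myself, so treat it as an option rather than a recommendation.

```ini
[auth]
group.prefix = server.company.com
; only use this entry for https URLs
group.schemes = https
group.username = domainuser
; group.password left out on purpose: Mercurial prompts for it when it is not set
```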
Moving forward
Now I’m in day-to-day mode. It’s still a little bit of a change in workflow compared to Subversion which I used for the last few years. The branching is really nice and the graph display is a slick feature. I haven’t had to do a complicated merge yet, but I hear it’s a nice experience. I can’t wait.