What is a NoSQL Database, and What Are They Good For?

A NoSQL database is any type of database that departs from the traditional relational SQL model. NoSQL databases such as the document-based MongoDB have become more popular in recent years. What is all the hype about?

SQL limits: scalability

SQL has been around for a long time, about 45 years. It holds up surprisingly well, and modern SQL implementations are very fast. But as the web has grown, so has the need for powerful databases that can scale to meet demand.

The easiest way to scale an SQL database is to run it on a more powerful computer. SQL databases can be replicated to reduce regional load on an individual instance, but splitting a table across servers (often called sharding) is much more difficult with SQL.

Document-based NoSQL databases address this problem by design. Each document is independent of the other documents in the collection, so collections can be distributed across multiple servers much more easily. Many document databases include built-in tools to shard data across different servers.
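To make the idea concrete, here is a minimal Python sketch of hash-based sharding. The names shard_for and ShardedCollection are made up for illustration, and hashing the document key is only one possible strategy. This is not how any particular database implements it.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a document key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedCollection:
    """Toy collection that spreads independent documents over several 'servers'."""

    def __init__(self, num_shards: int) -> None:
        # Each dict stands in for one server holding part of the collection.
        self.shards = [{} for _ in range(num_shards)]

    def insert(self, key: str, document: dict) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = document

    def find(self, key: str):
        # Only the shard that owns the key needs to be consulted.
        return self.shards[shard_for(key, len(self.shards))].get(key)


users = ShardedCollection(num_shards=3)
users.insert("Jon1996", {"username": "Jon1996", "posts": [1, 2, 3]})
users.insert("Jane", {"username": "Jane", "posts": []})
```

Because no document depends on another, a lookup only has to touch the one shard that owns the key.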

But the scalability issue isn’t much of a problem until you have a lot of data. You can easily run an SQL database with hundreds of thousands of users without any issues, assuming your structure is solid and your queries are fast.

Either MySQL or MongoDB will likely do the job for your application, so the choice between the two depends on the structure and syntax you prefer. Ease of development is important, and you might find that the newer MongoDB’s document model and syntax are easier to use than SQL.

NoSQL and SQL structure

Traditional SQL databases are often referred to as relational databases because of the way they are structured. In an SQL database, you will have multiple tables, each containing multiple rows (called records), which themselves have several different columns, or attributes. Tables are linked to one another through keys: a column in one table refers to the primary key of another, which forms a relationship.

For example, imagine you have a table with each record representing a post made by a user. The key here is the username, which is the primary key of the users table and links each row in the posts table to its author. If you want to find the author’s email address, search for “Jon1996” in the users table and select the “Email” field.
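The lookup described above can be sketched with Python's built-in sqlite3 module. The table layout, column names, and the email address are placeholders chosen for this example. A real schema would probably differ.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        username TEXT PRIMARY KEY,
        email    TEXT NOT NULL
    );
    CREATE TABLE posts (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL REFERENCES users(username),
        title    TEXT NOT NULL
    );
    """
)
# Placeholder data; the email address is invented for illustration.
conn.execute("INSERT INTO users VALUES (?, ?)", ("Jon1996", "jon@example.com"))
conn.execute("INSERT INTO posts VALUES (?, ?, ?)", (1, "Jon1996", "First Post"))

# Follow the username from the post to the matching row in users.
row = conn.execute(
    """
    SELECT users.email
    FROM posts
    JOIN users ON users.username = posts.username
    WHERE posts.id = ?
    """,
    (1,),
).fetchone()
author_email = row[0]
```

The JOIN is what expresses the relationship: the username stored on the post row is matched against the primary key of the users table.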

But this data structure may not work for everyone. SQL databases have a rigidly defined schema, which can get in your way if you need to make changes or just prefer to have a different layout. With complex data sets, the relationships between everything can become more complicated than the data itself.

The main type of NoSQL database is a JSON document database, like MongoDB. Instead of storing rows and columns, all data is stored in individual documents. These documents are stored in collections (for example, a “user” document would be stored in an “all users” collection) and do not need to have the same structure as the other documents in the collection.

For example, a “user” document might look like this:

  {
    "username": "Jon1996",
    "email": "...",
    "posts": [1, 2, 3]
  }

The username and email fields are just key-value pairs, similar to columns in SQL, but the “posts” field contains an array, which you won’t typically find in a traditional SQL table. Now suppose we have a collection of posts with documents like:

  {
    "title": "First Post",
    "content": "Hello, World!"
  }

Now when someone visits Jon’s page, your app can retrieve the three posts with IDs 1, 2, and 3, which is usually a quick request. Compare this to SQL, where you might need to fetch all posts matching Jon’s user ID. That is still pretty fast, but the MongoDB request is more direct and arguably easier to reason about.
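As a rough illustration, the lookup can be mimicked in plain Python, with dictionaries standing in for the two collections. The collection layout, the posts_for helper, and every post except the first are assumptions made for this sketch. It does not use MongoDB's actual query API.

```python
# Stand-ins for two collections; in a real document database these would be
# stored on the server. All values here are illustrative.
users = {
    "Jon1996": {"username": "Jon1996", "posts": [1, 2, 3]},
}
posts = {
    1: {"title": "First Post", "content": "Hello, World!"},
    2: {"title": "Second Post", "content": "..."},
    3: {"title": "Third Post", "content": "..."},
    4: {"title": "Someone else's post", "content": "..."},
}


def posts_for(username: str) -> list:
    """Fetch a user's posts by following the IDs stored in the user document."""
    user = users[username]
    return [posts[post_id] for post_id in user["posts"]]


titles = [post["title"] for post in posts_for("Jon1996")]
```

The user document already lists the IDs it needs, so the app never has to scan the posts collection for matches.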

What are NoSQL databases used for?

NoSQL is a broad category and includes many types of databases built for different purposes. Every database is a tool, and your work may require a specific type of tool, or even several different tools.

SQL databases such as MySQL, Oracle, and PostgreSQL have been around for decades. They are very stable, have a lot of support, and can generally do the job for most people. If your data is valuable to you and you want an established and consistent solution, stick to SQL.

JSON document databases, such as MongoDB and Couchbase, are popular for web applications with changing data models and for storing complex documents. For example, a site like Amazon may often need to change the data model it uses to store products, so a document-based database may work well for it.

Document databases are meant to be a general-purpose replacement for SQL and are probably what you think of when you hear “NoSQL”. Many people also find them more intuitive to learn than SQL, because you don’t have to deal with relationships between tables or complex queries.

RethinkDB is a JSON document database designed for real-time applications. In a database like MongoDB, you would typically have to poll for updates every few seconds, or implement an API on top of the database to track real-time updates, which quickly gets heavy. RethinkDB addresses this issue by automatically pushing updates to feeds that clients can subscribe to.
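The difference between polling and pushing can be sketched in a few lines of Python. The ChangeFeed class below is a toy written for this example, not RethinkDB's API. It only shows the idea that every write is delivered to subscribers as it happens.

```python
from typing import Callable


class ChangeFeed:
    """Toy table that pushes every write to subscribed callbacks."""

    def __init__(self) -> None:
        self._rows = {}
        self._subscribers = []

    def subscribe(self, callback: Callable[[str, dict], None]) -> None:
        self._subscribers.append(callback)

    def write(self, key: str, row: dict) -> None:
        self._rows[key] = row
        # Push the change immediately instead of waiting to be polled.
        for callback in self._subscribers:
            callback(key, row)


received = []
scores = ChangeFeed()
scores.subscribe(lambda key, row: received.append((key, row)))
scores.write("game-1", {"home": 1, "away": 0})
scores.write("game-1", {"home": 2, "away": 0})
```

With polling, the client would have to ask repeatedly whether anything changed. Here the subscriber simply receives both updates in order.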

Redis is an extremely powerful key-value database that stores small keys and strings entirely in RAM, which is much faster to read and write than even the fastest SSDs. It is often used alongside other databases as an in-memory cache for small pieces of data that are written and read frequently. For example, an email app might want to use Redis to store users’ messages (and even send real-time updates with its Pub/Sub methods). Storing many small messages in this manner can cause performance problems with other types of databases.
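The caching role can be sketched as follows. The InMemoryCache class is a stand-in written for this example and is not the Redis client API. The load_profile_from_db function and the 60-second expiry are likewise assumptions.

```python
import time


class InMemoryCache:
    """Stand-in for a key-value store such as Redis (not the Redis client API)."""

    def __init__(self) -> None:
        self._data = {}

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key: str):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries are dropped on read
            return None
        return value


# Counts how often the (hypothetical) slow database is actually hit.
db_reads = {"count": 0}


def load_profile_from_db(username: str) -> str:
    """Hypothetical slow lookup in the primary database."""
    db_reads["count"] += 1
    return f"profile for {username}"


def get_profile(cache: InMemoryCache, username: str) -> str:
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"profile:{username}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_profile_from_db(username)
    cache.set(key, value, ttl_seconds=60)
    return value


cache = InMemoryCache()
first = get_profile(cache, "Jon1996")
second = get_profile(cache, "Jon1996")
```

The second call is served from memory, so the slower primary database is only read once.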

Graph databases are designed to store connections between data. A common use case is social media, where users are connected to each other and interact with other data, such as the posts they have made.

For example, suppose George is friends with two people, Jon and Jane. If any other type of database wanted to figure out George’s connection to Sarah, it would have to query all of Jon’s friends and all of Jane’s friends. But graph databases understand this connection intuitively; for a friends-of-friends query, the popular graph database Neo4j is 60% faster than MySQL. For friends of friends of friends (3 levels deep), Neo4j is 180 times faster.

Wide-column databases, such as Cassandra and HBase, are used to store massive amounts of data. They’re designed for data sets so large that you need multiple computers to store it all, and they can be faster than SQL and other NoSQL databases when spread across multiple nodes.
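To show what such a query involves, here is a small Python sketch that walks a friendship graph with a breadth-first search. The graph data is hypothetical, and in particular Sarah being Jon's friend is an assumption made for this example. A graph database stores these edges directly, so it can follow them without repeated table lookups.

```python
from collections import deque

# Hypothetical friendship graph; who Sarah is connected to is an assumption.
friends = {
    "George": {"Jon", "Jane"},
    "Jon": {"George", "Sarah"},
    "Jane": {"George"},
    "Sarah": {"Jon"},
}


def degrees_of_separation(graph: dict, start: str, target: str):
    """Breadth-first search; returns the number of hops, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == target:
            return hops
        for friend in graph.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


hops = degrees_of_separation(friends, "George", "Sarah")
```

Each extra level of depth multiplies the number of people to check, which is why deep friend-of-friend queries get expensive when every hop is a separate table lookup.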
