Interview Question: Compare relational (RDBMS) and noSQL databases

Shortlink: http://wp.me/p5Jvc-bf

With the advent of noSQL databases, IT systems have a new mechanism to persist data. So the obvious question arises. How does you compare RDBMS to noSQL databases? Well here are some parameters for comparison.

Dataset Size

RDBMS are designed to handle large datasets while noSQL databases handle huge datasets. Now there is no defined value to demarcate what constitues large and what can be considered as huge. However one can safely say that a large social media type web site like Google or Facebook will use a noSQL database given the quantum of data that it needs to handle. Also NoSQL databases offer better scalability vis-a-vis RDBMS.

Data Consistency Model

RDBMS use Atomicity, Consistency, Isolation and Durability (ACID) model, while noSQL databases use Basic Availability. Soft State and Eventual Consistency (BASE). The ACID model ensures that all database clients are provided a consistent view of the data. Meaning that all committed data is available to all clients. In case of BASE model, the data view is subject to the client consistency standard adopted by the noSQL database. Where consistency is paramout like stock trading, core banking applications we will opt for RDBMS. In case of a social media web site on the other hand the consistency model can be relaxed. Here we choose noSQL databases. The varied client consistency view is lucidly explained here.

CAP theorem

Brewer stated that a distributed computer system cannot simultaneously provide three attributes namely consistency, availability and partition tolerance. According to him, a system can provide at the most two of the three attributes. RDBMSs are CA systems while noSQL databases can be either CP or AP systems. Example of CP systems are Google BigTable, MongoDB, Hbase. Example of AP systems are Amazon Dynamo, SimpleDB, CouchDB. For more details visit this excellent reference.

Data Model

RDBMS have a standardized data model consisting of rows and columns. NoSQL databases have variety of data models like:

  • Key Value (DynamoDB, Terrastore, GenieDB)
  • Column-oriented,(Cassandra, HBase, BigTable)
  • Document-oriented (MongoDB, CouchDB)

Data Integrity

In RDBMS one can ensure data integrity by using foreign keys, constraints like unique, not null, check etc. Broadly constraints ensure that only logical data reside within the database. NoSQL databases does not have the equivalent of constraints due the nature of their design. It is primarily the responsiblility of the application encapsulating the noSQL database to ensure data integrity.

Querying

RDBMS supports the standardized SQL as its querying language. A key feature of RDBMS is support for JOINs. Implementing JOINs in noSQL is not possible/not favored as a solution. NoSQL supports the following broad categories of querying

  • Programming language specific API (Java, Python, .NET, PHP etc)
  • Proprietary Language (e.g. CQL in Cassandra)
  • MapReduce API
  • Object graph navigation API

Mapping of querying capabilities to some sample noSQL Database types

Key Value: Programming langauge specific APIs For example Amazon DynamoDB supports Java, .NET and PHP APIs.

Column Oriented: Cassandra has CQL. HBase has Java APIs.

Document Store: MongoDB has its own document-based Query langauge. MongoDB supports language specific APIs like Python, Java, Ruby, Perl. Terrastore supports HTTP and Java APIs.

NoSQL databases provide their own properiatary APIs to retrieve information. For example, HBase provides Java APIs namely Get, Put, Scans and Delete. Note in noSQL databases all operations are based on the noSQL database’s key column. This is unlike RDBMS where a developer is free to query any columns of any table. When designing a noSQL database, the key column is significant as it will serve as the focal point for all queries. There are methodologies to replicate the behavior of a JOIN. Refer the following blogpost for details on implementing the same in Casssandra. Some noSQL supports SQL like query langauge. For example Cassandra has CQL; Cassandra Query Language.

Schema definition

RDBMS have fixed schema. We need to define the database schema upfront before using it. noSQL databases have no such restrictions. This is an important feature. In social web sites, the need for a new column/table arrives at run time also the data structure can vary widely between two similar entities.

That’s all the points I could think of at the moment.

Advertisements

One thought on “Interview Question: Compare relational (RDBMS) and noSQL databases

  1. Howdy, i read your blog occasionally and i own a similar one and i
    was just curious if you get a lot of spam remarks? If so how do you protect against it,
    any plugin or anything you can suggest? I get so much lately it’s driving me insane so any support is very much appreciated.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s