While there have been plenty of indications over the last year that NoSQL is being preferred to traditional RDBMS for a variety of analytics and cloud applications, some analysts saw great significance in an acknowledgment by Oracle, historically a key RDBMS vendor, that RDBMS are not appropriate for all information management systems.
With its own Big Data Appliance, Oracle, like IBM, now seems to be arguing that NoSQL needs to be leveraged to analyze big -- and especially unstructured -- data sets. Also like IBM, Oracle is working with an open-source distribution of Hadoop, a framework for storing and processing large, often unstructured, data sets across clusters of inexpensive machines.
Before looking at what the rise of NoSQL might mean for analytics, let's pause and unpack the jargon. RDBMS -- relational database management systems -- grew out of the relational model proposed in 1970 by an IBM scientist, Edgar F. "Ted" Codd, and have long been a popular way of organizing data within an information system. The RDBMS model really is quite familiar: it relies on a tabular structure for data within the database, and can quickly and consistently reorganize the tables to show relationships between data.
These are the databases of "fields" and "tags" with which most of us grew up.
It turns out that fitting data into tables isn't so easy when it comes to unstructured data. Unstructured data, as I've previously defined it, presents as hybrid collections of text, image, video, and other kinds of files. Medical databases are an obvious example, containing written medical records, charts and graphs, x-rays, and other images.
NoSQL -- so named because SQL ("structured query language") is not necessarily used as the preferred query language -- refers to a family of information management systems that do not rely on the traditional tabular model. The name was first used in 1998 by Carlo Strozzi, for a lightweight open-source database that did not expose a SQL interface, but practical interest in non-relational systems has emerged more recently.
NoSQL-managed databases require no fixed tabular structure and no fixed relational ordering, and they scale horizontally. In other words, the power of a NoSQL database is typically increased by adding relatively inexpensive small servers rather than piling more memory into expensive centralized data storage.
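To make horizontal scaling concrete, here is a minimal Python sketch of one common approach: hashing each key to decide which small server holds it. The class name, the server names, and the simple modulo placement are all illustrative assumptions, not a description of any particular product.

```python
import hashlib


def shard_for(key, servers):
    """Pick a server for a key by hashing it (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


class ShardedStore:
    """Toy key-value store; each 'server' is just an in-memory dict (hypothetical)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.data = {name: {} for name in self.servers}

    def put(self, key, value):
        # The key alone decides where the value lives; no central table is consulted.
        self.data[shard_for(key, self.servers)][key] = value

    def get(self, key):
        return self.data[shard_for(key, self.servers)].get(key)


store = ShardedStore(["node-a", "node-b", "node-c"])
for user in ["alice", "bob", "carol", "dave"]:
    store.put(user, {"name": user})
```

Adding capacity here means adding another entry to the server list. With plain modulo placement that would move most existing keys, which is why production systems tend to use schemes such as consistent hashing instead.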
NoSQL systems appear in various forms, but essentially they deploy techniques for storing -- and searching -- only the data, and not table-based metastructures representing the data. NoSQL systems do not hesitate to replicate large units of data -- whole Web pages, for example -- because storage in a distributed data center is not a major cost. NoSQL, it almost goes without saying, is a neat fit for organizing cloud-based information storage.
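A small Python sketch may help show the difference between table-based storage and storing the data itself. It uses a medical record as the example; the field names and sample values are invented for illustration, and the "tables" are just lists of tuples.

```python
import json

# Relational style: two tables linked by patient_id (hypothetical sample rows).
patients = [(1, "A. Example")]
attachments = [
    (1, "note", "Follow-up in six weeks."),
    (1, "x-ray", "chest_front.png"),
]


def attachments_for(patient_id):
    """Emulate a join: collect the attachment rows that belong to one patient."""
    return [
        {"kind": kind, "content": content}
        for pid, kind, content in attachments
        if pid == patient_id
    ]


# Document style: the same information kept as one self-contained record.
patient_document = {
    "patient_id": 1,
    "name": "A. Example",
    "attachments": [
        {"kind": "note", "content": "Follow-up in six weeks."},
        {"kind": "x-ray", "content": "chest_front.png"},
    ],
}

# A document store would typically persist the record as one serialized value.
serialized = json.dumps(patient_document)
restored = json.loads(serialized)
```

The document version duplicates nothing in this tiny case, but it also needs no join and no predefined columns, so a record with an extra kind of attachment can simply carry it.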
There are perceived disadvantages. NoSQL systems, for the most part, do not guarantee consistency and integrity of results from the get-go, as RDBMS do, although consistency and integrity improve as updates work their way through the system. The overwhelming advantage, though, is speed. This is why Websites that store vast quantities of data, like Facebook and Digg, have incorporated NoSQL into their management systems. Speed of data retrieval trumps absolute reliability.
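The trade-off between speed and consistency can be sketched in a few lines of Python. This is a toy model under stated assumptions: the class names are hypothetical, a write is acknowledged after reaching a single replica, and an explicit `sync()` call stands in for the background replication a real system would run.

```python
class Replica:
    """One copy of the data, held in memory."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # writes not yet copied to the other replicas

    def write(self, key, value):
        # Fast path: update one replica and return without waiting for the rest.
        self.replicas[0].data[key] = value
        self.pending.append((key, value))

    def read(self, replica_index, key):
        # A read may land on any replica, including one that is behind.
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        # Stand-in for background replication: apply pending writes in order.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica.data[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("status", "online")
fresh = store.read(0, "status")         # the replica that took the write
stale = store.read(1, "status")         # a replica that has not caught up yet
store.sync()
caught_up = store.read(1, "status")     # same replica after replication
```

Between the write and the sync, two readers can get different answers. That window is the price paid for not making every write wait on every server.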
It's early days, of course, for this technology, and it remains to be seen which versions of NoSQL will prove sufficiently robust for enterprise analytics purposes. But with database giants now promoting its virtues, it may soon be time for NoSQL's close-up.
-- Kim Davis, Community Editor, Internet Evolution