Document data stores

These data stores are schemaless (or schema-free), meaning that records in the same logical container (a table, a collection, ...) can each have a different structure. In other words, two consecutive records can have a different number of columns, and those columns can be of different types. Moreover, a column can itself hold another record with its own set of columns, which creates nested records.
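
As a minimal sketch of that idea, the Python snippet below uses a plain list of dictionaries as a stand-in for a collection. The names `collection` and `field_names` and the sample records are hypothetical and chosen only for illustration; a real document store exposes its own API.

```python
# A plain list stands in for a document-store "collection" (assumption:
# no real database is involved, this only illustrates the data model).
collection = [
    # First record: two flat columns.
    {"name": "Alice", "age": 34},
    # Second record: a different set of columns, one of which holds
    # a nested record with its own columns.
    {
        "name": "Bob",
        "emails": ["bob@example.com"],
        "address": {"city": "Brussels", "zip": "1000"},
    },
]


def field_names(record):
    """Return the sorted column names of one record."""
    return sorted(record.keys())


# Each record carries its own structure; nothing forces them to match.
shapes = [field_names(record) for record in collection]

# Nested records are reached by walking down through the columns.
nested_city = collection[1]["address"]["city"]

print(shapes)
print(nested_city)
```

Both records live in the same container even though they share only the `name` column, which is exactly what a fixed relational schema would not allow.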

Big data definitions

Before digging into this world of huge amounts of data, streaming data flows and analytic applications, let's define some basic concepts.

Big Data

Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process data within a tolerable elapsed time. Big data "size" is a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data.

Evolution of data

The volume of digital information captured by today's enterprises is growing along a kind of hyperbolic curve.
Cheaper storage, cheaper compute resources, ubiquitous digital access (PCs, Internet, smartphones, tablets, …) and the general move toward a digital world all create more and more sources of digital information.


Storage is a key component of any system, and we hear a lot about it.
But what is what? Local storage versus remote storage.
Block storage, file storage, object storage, ... what are the differences?
Let's lift a corner of the curtain on the storage side of your infrastructure.

Knowledge sharing

In this category, we can find solutions like:
  • Document management systems: versioning of documents, workflows, check-out / check-in, integration with Office applications
    • KnowledgeTree
    • O3Spaces
  • Web forums: create conversation threads from a web browser, post questions or answers, search for pertinent information
    • phpBB
  • Wikis: share your knowledge by creating, modifying or deleting articles on a web site
    • TWiki

Inventory management

One of the core components in any enterprise is the CMDB (Configuration Management DataBase). The CMDB is not just an inventory tool listing all the elements in your infrastructure; it also shows the dependencies between them.
Even for a small infrastructure, it is valuable to have a good inventory tool in place, with dependency links between the elements.
In such CMDB tools, each component is called a "Configuration Item" or CI. A CI may be a server, a CPU in a server, a piece of software, ...
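
To make the idea of CIs and dependency links concrete, here is a minimal sketch in Python. The `ConfigurationItem` class, the `impacted_by` helper, the sample items and the direction of the links (a server depends on its CPU, software depends on its server) are all assumptions made for illustration; they do not describe the data model of any particular CMDB product.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    """One CI: a server, a CPU, a piece of software, ... (hypothetical model)."""
    name: str
    kind: str
    depends_on: list = field(default_factory=list)  # names of other CIs


def impacted_by(cmdb, failed):
    """Return the names of all CIs that depend, directly or not, on `failed`."""
    impacted = set()
    changed = True
    # Repeat until a full pass over the inventory adds nothing new.
    while changed:
        changed = False
        for ci in cmdb.values():
            if ci.name == failed or ci.name in impacted:
                continue
            if any(dep == failed or dep in impacted for dep in ci.depends_on):
                impacted.add(ci.name)
                changed = True
    return sorted(impacted)


# A tiny sample inventory, indexed by CI name.
cmdb = {
    ci.name: ci
    for ci in [
        ConfigurationItem("cpu0", "cpu"),
        ConfigurationItem("srv01", "server", ["cpu0"]),
        ConfigurationItem("postgres", "software", ["srv01"]),
        ConfigurationItem("webapp", "software", ["postgres"]),
        ConfigurationItem("srv02", "server"),
    ]
}

print(impacted_by(cmdb, "srv01"))  # ['postgres', 'webapp']
```

A flat inventory could only say that `srv01` exists; the dependency links are what let the tool answer the question "what else is affected if this element fails?".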