Making sense of unstructured data

CIOL Bureau

Sandeep Koul

BANGALORE, INDIA: Big Data, one of the latest buzzwords in the industry, refers to data sets that grow so large that they become unmanageable with conventional database management tools. That's because much of it is unstructured data: you can't predict how fast it will grow, how large it will become, or what types of data it will contain. This is the opposite of structured data, which sits in a proper database, has tools available to analyze it, and has long been used for forecasting and behavioral analysis.

Since 90% of data lying in the digital universe is unstructured, there's tremendous interest in developing tools and techniques to manage it. This can be challenging, but we expect lots of exciting things in the area of Big Data next year.

Apache Hadoop

A big contributor to the Big Data movement is the Apache Hadoop framework, which also lies at the core of most enterprise solutions for Big Data analytics. The Apache Hadoop software library is a framework that allows for distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from a single server to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, delivering a highly available service on top of a cluster of computers, each of which may be prone to failure.
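
The "simple programming model" most commonly associated with Hadoop is MapReduce. The sketch below illustrates that idea with a word count in plain Python, running in a single process. It is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample records are assumptions made for illustration only. On a real cluster the framework would spread the map and reduce steps across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence (hypothetical helper)."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Illustrative records, not taken from any real data set.
    sample = ["big data is big", "data grows fast"]
    print(word_count(sample))
```

Because each record is mapped independently and each key is reduced independently, both steps can be split across machines, which is what lets the model scale from one server to thousands.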

Unstructured data to be used as a strategic business tool

Could you ever imagine that your innocent comment on Facebook or a tweet on Twitter about a product or a service from a company could end up being used to form that company's new business strategy or marketing drive? Given the humongous amount of digital data that's being created every second on popular social networking platforms, organizations can't afford to ignore it. That's why there's work happening to make sense out of all the data being generated on these social networking platforms.

Unstructured data can be put to many other uses, depending on the type of data. Take the different kinds of data generated by mobile phones, for instance. The government can analyze mobile phone signals at different traffic intersections to develop a strategy for future transportation systems, or even divert traffic in real time to less clogged routes. As another example, the volume and density of cell phone calls at a particular location before a festival like Diwali could tell an electronic appliance company where to place its product advertisements.

40 per cent growth in Big Data next year

The digital world we live in churns out data at an enormous rate, and that rate is bound to grow. The questions are by what percentage this data will grow and what the implications of the increase will be. The growth of Big Data paved the way for the coining of the term 'Zettabyte', which is equal to 1 trillion Gigabytes! Big Data is predicted to grow by more than 40% from nearly 1.8 Zettabytes in 2011. About 75% of this data is generated by individuals, and enterprises carry some amount of liability for around 80% of it.
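
To put these figures in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes decimal (SI) units and treats the 40% figure as a single year's growth on the 1.8 Zettabyte base. Since the prediction is for "more than 40%", the result is a lower bound. The function names are illustrative.

```python
# Decimal (SI) units assumed: 1 ZB = 10**21 bytes and 1 GB = 10**9 bytes,
# so one Zettabyte is 10**12 (one trillion) Gigabytes.
GIGABYTES_PER_ZETTABYTE = 10**12


def zb_to_gb(zettabytes):
    """Convert Zettabytes to Gigabytes."""
    return zettabytes * GIGABYTES_PER_ZETTABYTE


def project_volume(base_zb, growth_rate):
    """Apply one period of growth to a base volume given in Zettabytes."""
    return base_zb * (1 + growth_rate)


if __name__ == "__main__":
    base_zb = 1.8      # "nearly 1.8 Zettabyte in 2011", from the article
    growth = 0.40      # "more than 40%", so this is a minimum
    projected = project_volume(base_zb, growth)
    print(f"At least {projected:.2f} ZB after 40% growth")
    print(f"That is roughly {zb_to_gb(projected):.3e} GB")
```

Under these assumptions, 40% growth on 1.8 Zettabytes comes to about 2.52 Zettabytes, an increase of roughly 0.72 Zettabytes in a single year.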

Big Data solution providers



Greenplum


The company, which became a division of EMC in 2010, focuses on Big Data analytics. It has several products that support both structured and unstructured data.

Greenplum Database: A massively parallel processing (MPP) database built to support the next generation of Big Data warehousing and analytics, capable of storing and analyzing petabytes of data.

