
Some big data myths as Gartner sees them

Pratima Harigunani

MUMBAI, INDIA: With so much hype around big data, it is hard for IT leaders to know how to exploit its potential. Gartner, Inc. has dispelled some myths to help IT leaders evolve their information infrastructure strategies.

"Big data offers big opportunities, but poses even bigger challenges. Its sheer volume doesn't solve the problems inherent in all data," said Alexander Linden, research director at Gartner. "IT leaders need to cut through the hype and confusion, and base their actions on known facts and business-driven outcomes."

One myth is that 'everyone is ahead of us in adopting big data'. On this, Gartner highlights that interest in big data technologies and services is at a record high, with 73 per cent of the organizations it surveyed in 2014 investing or planning to invest in them. But most organizations are still in the very early stages of adoption: only 13 per cent of those surveyed had actually deployed these solutions. The biggest challenges organizations face are determining how to obtain value from big data and deciding where to start. Many organizations get stuck at the pilot stage because they don't tie the technology to business processes or concrete use cases.

Gartner also pointed out that many IT leaders believe the huge volume of data that organizations now manage makes individual data quality flaws insignificant, thanks to the "law of large numbers". Their view is that individual data quality flaws don't influence the overall outcome when the data is analyzed, because each flaw is only a tiny part of the mass of data in their organization.

"In reality, although each individual flaw has a much smaller impact on the whole dataset than it did when there was less data, there are more flaws than before because there is more data," said Ted Friedman, vice president and distinguished analyst at Gartner. "Therefore, the overall impact of poor-quality data on the whole dataset remains the same.
"In addition, much of the data that organizations use in a big data context comes from outside, or is of unknown structure and origin. This means that the likelihood of data quality issues is even higher than before. So data quality is actually more important in the world of big data."

Another myth concerns 'schema on read'. The general view is that big data technology, specifically the potential to process information via a "schema on read" approach, will enable organizations to read the same sources using multiple data models. Many people believe this flexibility will let end users determine how to interpret any data asset on demand, and will provide data access tailored to individual users. In reality, most information users rely significantly on "schema on write" scenarios, in which data is described, content is prescribed, and there is agreement about the integrity of the data and how it relates to the scenarios.

On data lakes, Gartner emphasised that a data lake's foundational technologies lack the maturity and breadth of features found in established data warehouse technologies. "Data warehouses already have the capabilities to support a broad variety of users throughout an organization. IM leaders don't have to wait for data lakes to catch up," said Nick Heudecker, research director at Gartner.
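Friedman's point that quality problems scale with data volume can be sketched numerically. The simulation below is a hypothetical illustration (the 2 per cent flaw rate and the dataset sizes are assumptions for the sketch, not Gartner figures): with a fixed per-record flaw rate, a hundredfold larger dataset carries roughly a hundredfold more flaws, so the proportion of bad records, and hence the overall impact on analysis, stays about the same.

```python
import random

def count_flaws(n_records: int, flaw_rate: float, seed: int = 42) -> int:
    """Count flawed records in a dataset of n_records, where each record
    is independently flawed with probability flaw_rate."""
    rng = random.Random(seed)
    return sum(rng.random() < flaw_rate for _ in range(n_records))

flaw_rate = 0.02  # assumed 2% per-record error rate

small = count_flaws(10_000, flaw_rate)
large = count_flaws(1_000_000, flaw_rate)

# Each individual flaw is a far smaller fraction of the large dataset,
# but the *proportion* of flawed records stays roughly constant.
print(f"small dataset: {small} flaws ({small / 10_000:.2%})")
print(f"large dataset: {large} flaws ({large / 1_000_000:.2%})")
```

More data does not dilute the problem; it simply spreads the same rate of flaws over more records.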
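The schema-on-read versus schema-on-write contrast above can be illustrated in a few lines. This is a minimal, hypothetical sketch (the record format and field names are invented for illustration): schema on write describes and prescribes the data before it is stored, while schema on read stores the raw bytes and lets each consumer impose an interpretation when the data is read.

```python
import json

# Raw records as they might arrive from an external source: structure varies.
raw_lines = [
    '{"name": "Acme", "revenue": "1200"}',
    '{"name": "Globex", "revenue": 900, "region": "EU"}',
]

# --- Schema on write: structure is agreed and enforced before storage. ---
def write_with_schema(line: str) -> dict:
    rec = json.loads(line)
    # Content is prescribed: required fields, coerced types.
    return {"name": str(rec["name"]), "revenue": int(rec["revenue"])}

warehouse = [write_with_schema(line) for line in raw_lines]

# --- Schema on read: store raw data untouched, interpret on demand. ---
lake = list(raw_lines)

def read_as_revenue_report(line: str) -> int:
    # One consumer's interpretation; another consumer could read the
    # same bytes with a different model (e.g. grouped by region).
    return int(json.loads(line).get("revenue", 0))

total = sum(read_as_revenue_report(line) for line in lake)
print(total)  # 2100
```

The flexibility Gartner describes lives in functions like `read_as_revenue_report`: each reader chooses a model. Its cost is exactly the point in the article, since nothing guarantees the raw records agree on structure until someone tries to read them.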