“If it doesn’t fit Excel, it’s big data.”
That was Gilad Lotan, chief data scientist at Betaworks, giving a digestible definition of how big data is about volume and variety as much as it is about velocity and veracity, which conveniently rounds out the four essential Vs of big data.
Lotan was speaking at Tech in Motion’s first-ever Big Data meetup at the spacious office of Mediaocean, a leading software platform provider for the advertising world. He was joined by two other Big Data panelists: Bruce Weed, program director of Big Data and Watson at IBM, and Claudia Perlich, chief data scientist at Dstillery.
What is the big deal about big data? In terms of growth, it has gone from reported earnings of $7.6 billion four years ago to expected earnings of $85 billion in the coming years. To give you a clear picture of earnings to date, revenue for hardware, software and professional services has already reached $27.36 billion.
“How did we get there?” asked moderator Cornelia Bencheton. Big data gained widespread interest in 2004, and since then, you’re either immersed in it or overwhelmed by it. Few in the field are eager to understand it. Even its cultural and philosophical aspects are open to scrutiny.
For Weed, variety is the jewel of the four Vs.
The four Vs of big data are volume, meaning terabytes to petabytes of data; variety, meaning data in many forms, structured and unstructured, text and multimedia; velocity, meaning data in motion and the analysis of streaming data to enable decisions within fractions of a second; and veracity, meaning data certainty and managing the reliability and predictability of inherently imprecise data types.
Perlich was quick to point out, though, that there is no bad data, just data we don't understand or data that is wrongly interpreted.
Citing a use case, Lotan talked about his company’s investment in Poncho, which aggregates weather data over time, and how it needs an editorial voice, which involves determining the zip codes it can group together, among other things.
With all the data out there, Perlich said it’s not surprising that some people think it’s a rabbit hole. She stressed the importance of knowing which decisions you need to make.
Data science is complicated and aspires higher than computer science. Everyone has barely scratched the surface.
Lotan is the Chief Data Scientist at betaworks, a technology company that operates as a studio, building new products, growing companies and seed investing. Previously, he ran the data team at SocialFlow and built data products at Microsoft’s FUSE Labs.
He serves on the Poynter Institute’s National Advisory Board as well as Columbia University’s Tow Center for Digital Journalism. His work has been covered by the New York Times, the Guardian, Fast Company and the Atlantic Wire and published across a wide range of academic journals.
Bruce Weed is the city leader (New York and Chicago) for IBM’s Cloud business development with startups and developers. His focus and expertise are around Big Data and Watson.
Weed has extensive experience in business development, sales and marketing. His additional skills and experience lie in product and brand management, operational strategy, IT strategy, channels and software development.
Perlich currently acts as Chief Scientist at Dstillery (previously m6d) and in this role designs, develops, analyzes and optimizes the machine learning that drives digital advertising.
An active industry speaker and frequent contributor to academic and industry publications, Perlich enjoys serving as a guide in the world of data. She was recently named winner of the Advertising Research Foundation’s (ARF) Grand Innovation Award and was selected as a member of the Crain’s NY annual 40 Under 40 list, WIRED’s Smart List, and FastCompany’s 100 Most Creative People.