“This whole Hadoop thing is really catching on,” says Matt Pfeil, Co-Founder and VP of Customer Solutions for DataStax, the company behind the productization of Apache Cassandra. Cassandra, Pfeil explained in an interview in the SiliconAngle Cube at Hadoop Summit 2012, is the real-time NoSQL database system for high availability, multi-location, online processing. DataStax has integrated Hadoop, HBase and other portions of the Hadoop stack into Cassandra and just announced DataStax Enterprise 2.1, which adds Apache Mahout.
“Like many of the companies in Big Data, we are seeing hypergrowth,” he said. The company now has more than 50 employees and 200 customer accounts and just added Prosper Capital to its board. It sells into multiple verticals, with customers ranging from Netflix, which manages 55 clusters with a team of three people, to several Wall Street firms that use it to capture ticker data in real time.
“Unlike a lot of Hadoop distributions, our advantage is operational simplicity and no single point of failure,” he says. “Cassandra only has one kind of node, and its architecture supports workload isolation among nodes, so one application running against a data set does not impact the performance of another running on the same data set on another node.”
Cassandra is also designed to run on inexpensive generic infrastructure, which keeps costs lower. It has no geographic or distance limits, so a deployment can span nodes in different geographies worldwide. Its query language resembles SQL, which makes adoption easier.
“Two years ago I ran a meeting of users where the most common question was, ‘What is this Hadoop thing?’ Today the most common question is, ‘What makes you different from other Big Data companies?’” he said. “My gut feeling is that every company in the Fortune 500 has at least one Hadoop or NoSQL project going, and users expect these technologies to help their business. They have read too many success stories not to.”