Big data and therefore Hadoop are at the center of many important processes within 15-year-old PayPal, says the Internet bank’s Director of Engineering Anil Madan. But that does not mean that he sees any decrease in the importance of traditional RDBMS applications. Both are important, he said in an interview in the SiliconAngle Cube from Hadoop Summit 2012 (full video below).
“I am involved in several activities such as online advertising and audience segmentation, all of which have Hadoop and Big Data at their heart,” he said. PayPal was involved in Big Data before the term was coined or Hadoop was invented, using various technologies. When Hadoop emerged from Yahoo, PayPal was one of the earliest adopters.
“There is in general a lot of momentum at eBay to leverage these new technologies to get a leap into the future, and as an eBay company PayPal is definitely part of that.” PayPal has partnered with Cloudera and even Yahoo to gain access to talent that helps it move quickly on Hadoop adoption.
But equally, he said, RDBMS technology is as important to PayPal as it is in any bank. The two technologies have very different strengths and appropriate uses.
“Relational databases have unique strengths when you want accuracy, integrity, and security,” he said. PayPal uses those strengths to manage its basic money-management functions. “Hadoop is in the opposite space, where you want to work with unstructured data and data mining.”
As an early adopter, PayPal is very aware of the weaknesses of Hadoop, including a lack of basic security. To create as much security as possible, it has instituted strong security policies and governance, including anonymizing all data before it goes into Hadoop and regulating how users are brought on board and what access they have.
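The article does not describe how PayPal anonymizes its data, so the following is only a minimal sketch of one common approach: replacing sensitive fields with a keyed hash before a record is loaded into Hadoop. The field names, the key, and the function names are all hypothetical, and strictly speaking a keyed hash gives pseudonymization rather than full anonymization.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a managed key store.
SECRET_KEY = b"replace-with-a-managed-secret"

# Hypothetical set of fields treated as sensitive for this illustration.
SENSITIVE_FIELDS = {"email", "account_id"}


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest (hex) of the given value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_record(record: dict) -> dict:
    """Copy a record, replacing sensitive fields with their keyed digests."""
    return {
        field: pseudonymize(str(value)) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }
```

Because the same input always maps to the same digest, records can still be joined or segmented on the hashed field without exposing the original value.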
Madan is pleased to see RDBMS providers including Microsoft and Teradata participating in the Hadoop Summit. That, he says, shows that these vendors recognize the growing importance of Hadoop as a partner to, rather than a replacement for, RDBMS technology.
He sees three major thrusts developing in the Hadoop community today:
1. Maturing the platform and making it enterprise ready,
2. Developing business-friendly tools for Hadoop, and
3. Creating integration points and partnerships with relational databases.
“PayPal wants to become an anytime, anywhere, anyway service provider in a world where online and offline are blending,” he said. “To do that, we want to tie the two worlds of Hadoop and RDBMS together to be a better service provider to our customers.”