Turns out Hadoop and data analytics vendors aren’t the only ones benefitting from the Big Data revolution.
Data integration powerhouse Informatica yesterday reported revenue of $196 million for Q1 2012, a first-quarter record for the company. That represents an increase of 17 percent from $168.0 million in the first quarter of 2011.
In a call with analysts following the announcement, Informatica CEO Sohaib Abbasi gave some of the credit to the increasing need to move large volumes of data between cloud and on-premise systems:
With the growing adoption of cloud computing as a business-critical platform, Informatica is enabling customers to retain control over their most important IT asset: big data both on premise and in the cloud.
Among Informatica’s key customer wins in Q1 was Condé Nast, which Abbasi said “selected Informatica for managing mass data for both products and customers.” Informatica also filled out its Big Data partnerships, forging a new relationship with MapR to move data in and out of its M3 and M5 Hadoop distributions. Informatica HParser Community Edition is also now bundled into MapR’s distributions.
Looking ahead, Abbasi said the next major release of Informatica’s core platform, called Informatica 9.5 and slated for debut at Informatica World 2012 next month in Las Vegas, will focus heavily on Big Data integration:
With continuous data replication, Informatica 9.5 will enable timely analysis of big transaction data. With social MDM, Informatica 9.5 will help enrich mass data by authoritatively relating business and social relationships. And with native Hadoop transforms and visual ID, Informatica 9.5 will empower developers to productively leverage low-cost big data processing computing resources.
Yes, Big Data Requires Data Integration
While part of the value proposition of Hadoop and other parallel processing technologies is to process and analyze data where it resides, thus reducing the amount of data movement required, data integration is still a vital part of the Big Data stack.
Typical early Big Data use cases involve processing huge volumes of multi-structured data in Hadoop, then shipping the results to an analytic database for further analysis. Informatica is in a prime position to pick up a large portion of this business, and indeed it already has. This demand will only rise as Big Data deployments grow in number and scope, especially given the increasing interest in deploying Hadoop in the cloud.
Within Big Data environments, cleaning up messy multi-structured data is also a complex task, which Informatica addresses with its HParser product, released last year.
The company is not without competition, however. On the open source front, both Talend and Pentaho, with its Kettle data integration platform, are pushing hard for Big Data business. And stalwart data integration/data protection vendor Syncsort is seeking to overcome its mainframe legacy and make a new name for itself in the Big Data space.
Check out Informatica’s CTO talking about his company’s approach to Big Data integration live inside theCUBE at Hadoop World 2011 here, and watch Talend’s Vice President of Marketing discuss his company’s strategy at Strata 2012 here.

