Hortonworks announces Hortonworks Data Platform (HDP).
Hortonworks is including all the standard components of a Hadoop distribution, such as Pig, Hive, Oozie, etc. The key features are the new things it is providing as differentiators in its open source distribution.
The key features of the new Hortonworks Data Platform include bundling new software for provisioning, management, and monitoring directly into the core platform, all of it open source. These are not separate proprietary add-ons but components downloaded as part of the core platform.
At Hadoop Summit 2012, Hortonworks will showcase the Hortonworks Data Platform (HDP) and its four key differentiators:
1. Cluster provisioning and setup (Apache Ambari):
HDP includes easy-to-use cluster provisioning capabilities with a modern, intuitive UI that simplifies and accelerates the process of getting a Hadoop cluster up and running. The provisioning interface guides the user through a seven-step provisioning process, surveying the nodes for the target cluster and automatically recommending the optimal software configuration. Once the configuration is confirmed, one click will provision and start the entire cluster.
2. Management and monitoring services (Apache Ambari):
Comprehensive, easy-to-use dashboards provide a complete view of the health of the cluster as well as centralized access to HDP’s built-in management functionality. Since these capabilities are implemented as 100-percent open source, the added transparency facilitates easier integration with existing management and monitoring solutions.
This means a developer can provision the platform and run it in production with dashboards, all part of the core software distribution. Hortonworks provides wizard-based provisioning that lets an operator step through setup, with auto-discovery of the cluster, recommendations for an optimal cluster configuration, and then one-click provisioning.
3. Metadata services (Apache HCatalog):
HDP enables users to define the structure and location of data within Hadoop for easy and consistent access from any program. This capability makes it easier to create and maintain applications running on Hadoop. Simply put, sharing data within Hadoop via standard table formats expected by relational databases, enterprise data warehouses and other structured data systems makes it easier to integrate data management systems with Hadoop.
This feature really modernizes things via central storage of metadata about data structures, which allows third parties to read and write through HCatalog. For example, if a developer is writing a Pig script that needs to process some data and the structure isn’t stored anywhere, the developer has to hard-code that structure into the application. External systems can now treat Hadoop data as tables via HCatalog, and other systems can pull data out of Hadoop directly into their RDBMS. Native integration means seamless mixing from a federated query and federated data access perspective. This should ease uptake among non-Hadoop vendors, who can now “tool up” with Hadoop and keep their RDBMS queries. This will change the ecosystem big time. This feature will bring a new set of developers into the core Hadoop community. It will be interesting to see how this changes the nature of the Hadoop community.
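A rough sketch of the difference in Pig, assuming a hypothetical clickstream dataset (the path, table name, and fields here are illustrative, not from the announcement):

```pig
-- Without HCatalog: the schema is hard-coded into every script
-- that touches this data, and must be updated everywhere it changes.
raw = LOAD '/data/clickstream' USING PigStorage('\t')
      AS (user_id:chararray, url:chararray, ts:long);

-- With HCatalog: the schema lives in the shared metadata store,
-- so the script simply names the table and HCatLoader supplies
-- the structure and location.
raw = LOAD 'web.clickstream' USING org.apache.hcatalog.pig.HCatLoader();
```

The same shared table definition is what lets Hive, MapReduce jobs, and external tools see the data consistently without each maintaining its own copy of the schema.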
4. Data integration services (Talend Open Studio):
HDP provides users with graphical interfaces for connecting to data sources and building complex transformation logic, all without writing a line of code. In addition, core platform services enable scheduling of data integration, transformation and data refinement within the platform.
Visual programming brings a native capability to load and play with data. This should reduce friction in terms of ease of use of the core platform. Talend Open Studio should be a great solution for data scientists.
Hard Charging Hortonworks – Competing on 100% Open Source Technology Value
According to Hortonworks CEO Rob Bearden, “Unlike alternative Hadoop offerings, HDP is 100-percent open source with no proprietary code, eliminating vendor lock-in and expensive proprietary add-ons. We are excited to deliver a comprehensive, Apache Hadoop-based enterprise data platform that provides the easiest way for the enterprise ecosystem to optimize and integrate their service with Hadoop.”
An obvious slam on other vendors. I see Hortonworks thinking more about the other big players than about Cloudera, although Bearden’s comments are aimed at Cloudera as well.
Hortonworks is partnering with VMware in this announcement, with a soon-to-be-announced product with Red Hat. Hortonworks is taking a holistic approach that it hopes will appeal to the vast majority of enterprises, especially given the work with VMware and Red Hat.
Hortonworks’s position on high availability (HA) is to make Hadoop work with proven HA capabilities that already exist in the enterprise. For example, the operating system and virtualization layers have HA capabilities that many enterprises use today, and Hortonworks wants to enable Hadoop to take advantage of that existing HA in VMware and other environments.
Red Hat of Hadoop – Will It Work?
According to Shaun Connolly, Hortonworks VP of Corporate Strategy, the data management space is ripe for the Red Hat model applied to Hadoop. The reason, according to Connolly, is that Hadoop is the next-generation data architecture and the market is hot for solutions that scale.
Red Hat’s adoption accelerated when IBM and HP declared it enterprise-ready. Hortonworks’ strategy is to get key wins with strategic big players, integrate well into those big deals, make sure the code is 100% open source, and then rally the community around that. Hortonworks doesn’t want to invest money in “ringing doorbells” but instead is putting all its resources into the technology and letting the big integration deals create a “pull market.”
All of Hortonworks’ recent work depends on making the Hortonworks Data Platform enterprise-ready.