
Author:

Sanjay Radia

Company : Hortonworks

Title : Architect / Founder

Big Data Trends and HDFS Evolution

Hadoop’s usage patterns and the underlying hardware technology and platform are rapidly evolving. Further, cloud infrastructure (public and private) and the use of virtual machines are influencing Hadoop. This talk describes how HDFS is evolving to deal with this flux.

We start with HDFS architectural changes that take advantage of platform changes such as SSDs and virtual machines. We discuss the unique challenges of virtual machines and the need to move MapReduce temp storage into HDFS to avoid storage fragmentation.

HDFS - What is New and Future

Hadoop 2.0 offers significant HDFS improvements: a new append pipeline, federation, wire compatibility, NameNode HA, performance improvements, and more. We describe these features and their benefits. We also discuss development that is underway for the next HDFS release. This includes much-needed data management features such as Snapshots and Disaster Recovery. We add support for different classes of storage devices such as SSDs and for open interfaces such as NFS; together these extend HDFS into a more general storage system.

Hadoop: Embracing Future Hardware

This talk looks at the implications of future server hardware for Hadoop, and at how to start preparing for them. What would a pure SSD Hadoop filesystem look like, and how do we get there via a mixed SSD/HDD storage hierarchy? What impact would that have on ingress, analysis and HBase? What could we do better if network bandwidth and latency became less of a bottleneck, and how should interprocess communication change? Would it make the graph layer more viable? What would massive arrays of WIMPy cores mean, or a GPU in every server? Will we need to schedule work differently?

The Evolving Apache Hadoop Eco System - What It Means for Big Data Analytics and Storage Developers

This talk will cover the Hadoop ecosystem's impact on the storage industry and what Hadoop means for big data analytics.

Learning Objectives

Understand the motivations behind some of the design choices for ReFS
Understand when to use this filesystem in Windows 8, in terms of supported features and tested configurations

What the Evolving Apache Hadoop Ecosystem Will Mean for Storage Developers

During the last 12 months, the Apache Hadoop ecosystem has experienced tremendous growth and empowered enterprises to better handle large volumes of data. In many ways, this data explosion is outpacing current storage, management and processing approaches. As a result of the growing ecosystem, HDFS, the storage engine for Apache Hadoop, which allows data to be processed in parallel, has evolved, enabling better isolation, faster startups and upgrades, and better scalability.
