Data Storage Innovation Conference 2016 Abstracts


Break Out Sessions and Agenda Tracks Include:

Note: This agenda is a work in progress. Check back for updates on additional sessions as well as the agenda schedule.

Big Data

SNIA Tutorial:
Big Data Essentials For IT Professionals

Sujee Maniyam, Founder/Big Data Principal, Elephant Scale

Abstract

Big Data is a ‘buzz’ word. There are lots of systems and software that claim to be doing Big Data. How do you separate noise from facts?

In this talk we aim to give you comprehensive coverage of the modern Big Data landscape. We will talk about the various components and open source projects that make up the Big Data ‘barn’. The talk will cover a few Big Data use cases and recommended designs for those use cases.

Intended Audience
Developers / Managers / Architects / Directors
Level
Beginner / Intermediate


Scalable and Smart Storage Class Memory Layer for Big Data

Robert Geiger, Chief Architect and VP Engineering, Ampool Inc.

Abstract

Today, if events change the decision model, we wait until the next batch model build for new insights. By extending fast “time-to-decisions” into the world of Big Data analytics to get fast “time-to-insights”, apps will get what used to be batch insights in near real time. Enabling this are technologies such as smart in-memory data storage, new storage class memory, and products designed to do one or more parts of an analysis pipeline very well. In this talk we describe how Ampool is building on Apache Geode to allow Big Data analysis solutions to work together with a scalable, smart storage class memory layer, so that fast and complex end-to-end pipelines can be built, closing the loop and providing dramatically lower time to critical insights.

Learning Objectives

  • Explain what SCM means in Big Data, its importance and why it matters
  • Explain the Big Data analysis loop & system properties needed at each phase
  • Review Big Data requirements for storage class memory
  • Demonstrate an in-memory Big Data pipeline
  • Provide benchmark numbers

“Direct-to-Cloud” Technology Innovations that Open Up the Cloud for Big Data Apps

Jay Migliaccio, Director of Cloud Platforms & Services, Aspera, an IBM company

Abstract

Moving big data in and out of the cloud has presented a formidable challenge for organizations looking to leverage the cloud for big data applications. Typical file transfer acceleration "gateways" upload data to cloud object storage in two phases, which introduces significant delays, limits the size of the data that can be transferred, and increases local storage costs as well as machine compute time and cost. This session will describe direct-to-cloud capabilities that achieve maximum end-to-end transfer speeds and storage scale-out through direct integration with the underlying object storage interfaces, enabling transferred data to be written directly to object storage and to be available immediately when the transfer completes. It will explore how organizations across different industries are using direct-to-cloud technology for applications that require the movement of gigabytes, terabytes or petabytes of data in, out and across the cloud.

Learning Objectives

  • An understanding of the root causes of technical bottlenecks associated with using cloud-based services
  • Methods to overcome these technical bottlenecks and speed up cloud-based big data workflows
  • Insight into how organizations across different industries have successfully deployed cloud-based big data applications

Solving the Framework-Storage Gap in Big Data

Haoyuan Li, Founder and CEO, Alluxio

Abstract

As datasets continue to grow, storage has increasingly become the critical bottleneck for enterprises leveraging Big Data frameworks like Spark, MapReduce, Flink, etc. The frameworks themselves are driving much of the exciting innovation in Big Data, but the complexity of the underlying storage systems is slowing the pace at which data assets can be leveraged by these frameworks. Traditional storage architectures are inadequate for distributed computing and the size of today’s datasets.

In this talk, Haoyuan Li, co-creator of Tachyon (and a founding committer of Spark) and CEO of Tachyon Nexus will explain how the next wave of innovation in storage will be driven by separating the functional layer from the persistent storage layer, and how memory-centric architecture through Tachyon is making this possible. Li will describe the future of distributed file storage and highlight how Tachyon supports specific use cases.

Li will share the vision of the Tachyon project, highlight exciting new capabilities, and give a preview of upcoming features. The project is one of the fastest growing big data open source projects. It is deployed at many companies, including Alibaba, Baidu and Barclays. In some production deployments, Tachyon manages hundreds of machines and delivers orders-of-magnitude improvements. In addition, Tachyon has attracted more than 200 contributors from over 50 institutions, including Alibaba, Redhat, Baidu, Intel, and IBM.

Birds of a Feather

Privacy Versus (In)security: Who Should Win and Why Between Apple and the FBI?

Abstract

The FBI publicly demanded that Apple help the FBI unlock a dead terrorist’s cell phone by providing a special proprietary “back door”. Apple refused, noting that such a tool would invariably escape into the wild and jeopardize the security and privacy of the entire cell phone community: consumer and business. An anonymous 3rd party broke the impasse by providing such a backdoor. But, the issue remains: privacy/security controls versus selected data recovery demands. What is your opinion? Come to this BoF and help us find a win/win solution.


Open Source Storage and Datacenter

Michael Dexter, iXsystems Inc; Christopher R. Hertel, Samba Team; Jordan Hubbard, CTO, iXsystems Inc; Ronald Pagani, Jr, Open Technology Partners, LLC

Abstract

At this Birds of a Feather session, we’ll discuss how open source is enabling new storage and datacenter architectures. All are welcome who have an interest in open source, scale-up and scale-out storage, hyperconvergence and exploring open solutions for the datacenter.

  • How is open source solving your datacenter challenges?
  • What is working? What is not?
  • What would you like to see?
  • Which project(s) are you most excited about?

Compute, Bandwidth and Storage Implications for Next Generation Entertainment Applications

Yahya H. Mirza, CEO/CTO, Aclectic Systems

Abstract

Today there is great excitement about increasing the immersion of virtual reality productions using real-time computer-generated content, offline rendered content, captured content, and their combinations. Camera vendors are striving to rapidly enable filmmakers to capture more immersive reality. Headset vendors are working to allow viewers to experience a greater sense of presence and to actually experience virtual environments in a life-like manner. Consequently, these emerging forms of entertainment are expanding across multiple dimensions, integrating data from multiple cameras to create 360 degree and 360 degree stereoscopic video. The resulting separate frames are stitched together to form a 360 degree by 180 degree view from a single point in space. More cameras enable a synthesized 3D stereoscopic view from a single point in all directions. This situation is further complicated by the emerging promise of light field cameras, which dramatically increase the compute, storage and bandwidth requirements over conventional feature films and interactive 3D game applications.

This BOF will start with a short presentation that overviews the various emerging entertainment forms and their implications. Next, a single visual effects shot will be dissected to illustrate the technical issues involved. Finally, a group discussion will be facilitated to discuss how emerging storage technologies will impact these emerging entertainment forms.

Capacity Optimization

SNIA Tutorial:
Advanced Data Reduction Concepts

Ronald Pagani Jr., Principal Consultant, Open Technology Partners LLC; SNIA Data Protection and Capacity Optimization Committee

Abstract

Since its arrival over a decade ago, data deduplication has seen widespread adoption throughout the storage and data protection communities. This tutorial assumes a basic understanding of deduplication and covers topics that attendees will find helpful in understanding today’s expanded use of this technology. Topics will include trends in vendor deduplication design and practical use cases, e.g., primary storage, data protection, replication, etc., and will also cover other data reduction technologies, e.g., compression, etc.

Learning Objectives

  • Have a clear understanding of current data reduction design trends.
  • Have the ability to discern between various deduplication design approaches and strengths.
  • Recognize new potential use cases for data reduction technologies in various storage environments.

How to Reduce Data Capacity in Object Storage: Dedup and More

Dong In Shin, CTO, G-Cube Inc.

Abstract

There is growing interest in object storage as a backup and versioning tier for massive primary-storage data sets, due to its intuitive interfaces and relatively low cost of ownership. However, object storage is still at an early stage and does not yet address capacity optimization very well (especially in open source implementations such as OpenStack Swift and Ceph). This presentation introduces data reduction techniques from the viewpoint of object storage; we will cover deduplication, compression, and other interesting techniques for capacity optimization on object storage.
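As a generic illustration of the deduplication idea (a sketch for orientation only, not the specific design presented in this session), the following Python fragment splits object data into fixed-size chunks, indexes each chunk by its content hash, and stores every unique chunk only once; the chunk size and names are assumptions:

    import hashlib

    CHUNK_SIZE = 4 * 1024 * 1024   # assumed 4 MiB fixed-size chunks
    chunk_store = {}               # content hash -> chunk bytes (stands in for the backend)

    def put_object(data: bytes) -> list:
        """Store an object as a recipe (list of chunk hashes); duplicate chunks are stored once."""
        recipe = []
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            chunk_store.setdefault(digest, chunk)   # only previously unseen content consumes space
            recipe.append(digest)
        return recipe

    def get_object(recipe: list) -> bytes:
        """Rebuild the object from its recipe."""
        return b"".join(chunk_store[d] for d in recipe)

Compression would typically be applied to each unique chunk before it is written, stacking the two space savings.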

Learning Objectives

  • Object storage data layout
  • Data backup using object storage
  • Capacity optimization techniques for object storage

Cloud

Cloud Bursting

Jim Thompson, Senior Systems Engineer, Avere Systems

Abstract

When it comes to cloud computing, the ability to turn massive amounts of compute cores on and off on demand can be very attractive to IT departments that need to manage peaks and valleys in user activity. With cloud bursting, the majority of the data can stay on premises while tapping into compute from public cloud providers, reducing risk and minimizing the need to move large files.

Hear from Jim Thompson, Senior Systems Engineer at Avere Systems, on the IT and business benefits that cloud bursting provides, including increased compute capacity, lower IT investment, financial agility, and, ultimately, faster time to market.

Learning Objectives

  • Define cloud bursting and its use cases within the enterprise
  • Outline specific benefits offered by cloud bursting
  • Learn about case studies with actual companies using cloud bursting

Plotting Your Data Exit Before Entering the Private Cloud

Fredrik Forslund, Director of Cloud and Data Center Erasure Solutions, Blancco Technology Group

Abstract

According to the Ponemon Institute, 30 percent of business information is stored in the cloud. Like any relationship, both sides are wide-eyed about the limitless possibilities, attentive and full of promise. What promises, you might ask? Higher IT control, centralized management and delivery efficiencies are just a few.

But not all relationships last forever. Maybe a cloud storage provider isn’t living up to pre-defined expectations, or has decided to change the terms and conditions of the agreement. Perhaps there have been repeated outages. Or it might be as simple as coming to the end of the service agreement or contract. Whatever the reason might be, one of the biggest mistakes a company can make is not plotting out every single step of its exit plan well before entering the cloud. This could triple the likelihood of losing customer trust, loyalty and long-term business. And when you factor in the hefty legal fines and repercussions, it’s the kind of damage that’s nearly impossible to bounce back from.

In this session, leading IT, cloud infrastructure and tech experts from Blancco Technology Group and other firms will outline the step-by-step process of developing a written exit plan for data stored in the cloud, and show how to plot that exit plan against key regulatory criteria so that organizations can best minimize the likelihood of data being accessed or stolen by cyber thieves.

Learning Objectives

  • Learn how to create an exit plan for storing data in the private cloud.
  • Learn how to plot a data exit plan against regulatory standards.
  • Learn value/benefits of removing data safely.

SNIA Tutorial:
Storage in Combined Service/Product Data Infrastructures

Craig Dunwoody, CTO, GraphStream Incorporated

Abstract

It is increasingly common to combine as-a-service and as-a-product consumption models for elements of an organization's data infrastructure, including applications; development platforms; databases; and networking, processing, and storage resources. Some refer to this as "hybrid" architecture.

Using technical (not marketing) language, and without naming specific vendors or products, this presentation covers some improved storage capabilities becoming available in service and product offerings, and some scenarios for integrating these kinds of offerings with other data infrastructure services and products.

Learning Objectives

  • Considerations for choosing storage services and/or products for specific use cases
  • Evolving economics of service and product consumption models for storage resources
  • Reducing cost and operational complexity by consolidating storage capabilities into a smaller number of platforms

Dealing with the “Other” Latency: What the Cloud Portends for IT Infrastructure

Lazarus Vekiarides, CTO and Co-founder, ClearSky Data

Abstract

As flash storage becomes mainstream, storage pros are frequently bombarded by vendors and the press about the importance of latency when considering the performance of storage systems.

Simultaneously, the public cloud has emerged as a remote computing resource that is disrupting the way businesses use IT. In a world of geographically dispersed islands of compute, however, the latency problem takes on a different complexion: System designers need to consider the impact of physical distance and the speed of light more carefully than the latency of storage media.
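To put rough numbers on that point, the short sketch below compares speed-of-light propagation delay with typical media latency; the distances and the flash figure are illustrative assumptions, not measurements from this talk:

    # Light in optical fiber travels at roughly 2/3 c, i.e. about 200 km per millisecond.
    FIBER_KM_PER_MS = 200.0

    def round_trip_ms(distance_km):
        """Best-case propagation delay (ms) for one round trip, ignoring switching and queuing."""
        return 2 * distance_km / FIBER_KM_PER_MS

    for label, km in [("same metro (~100 km)", 100),
                      ("cross-country (~4,000 km)", 4000),
                      ("intercontinental (~10,000 km)", 10000)]:
        print(f"{label:28s} {round_trip_ms(km):6.1f} ms round trip")

    # A flash read is on the order of 0.1 ms, so a ~40 ms cross-country round trip
    # dwarfs the latency of the storage medium itself.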

In this discussion, we will cover some of the implications of latency on the performance of distributed systems and, in particular storage systems, in the context of the public cloud. We’ll detail:

  • The performance limitations that latency can create
  • Tools and infrastructure that can be used to mitigate these effects
  • A reference architecture that can minimize this problem holistically

Learning Objectives

  • The performance limitations that latency can create
  • Tools and infrastructure that can be used to mitigate these effects
  • A reference architecture that can minimize this problem holistically

SNIA Tutorial:
Why Analytics on Cloud Storage?

Padmavathy Madhusudhanan, Principal Consultant, Wipro Technologies
Radhakrishna Singuru, Principal Architect, Wipro Technologies

Abstract

Cloud platforms that provide a scalable, virtualized infrastructure are becoming ubiquitous. As the underlying storage can meet extreme scalability demands on these platforms, running storage analytics applications in the cloud is gaining momentum. Gartner estimates that 85% of Fortune 500 companies do not reap the full benefit of their data analytics, causing them to lose potential opportunities. Different cloud providers supply various metrics, but these are not uniform and are sometimes inadequate. This calls for a cloud storage analytics solution that follows a scientific process of transforming storage data metrics into insight for making better decisions.

Learning Objectives

  • Key areas where cloud storage analytics help (e.g., capacity planning)
  • Challenges in cloud-based storage analytics (no uniform metrics across providers, etc.)
  • Benefits of cloud-based storage analytics, such as data isolation

Introduction to OpenStack Cinder

Sean McGinnis, Sr. Principal Software Engineer, Dell

Abstract

Cinder is the block storage management service for OpenStack. Cinder allows provisioning of iSCSI, Fibre Channel, and remote storage services to attach to your cloud instances. LVM, Ceph, and other external storage devices can be managed and consumed through the use of configurable backend storage drivers.

Led by a core member of Cinder, this session will provide an introduction to the block storage services in OpenStack as well as give an overview of the Cinder project itself.

Whether you are looking for more information on how to use block storage in OpenStack, are looking to get involved in an open source project, or are just curious about how storage fits into the cloud, this session will provide a starting point to get going.
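For readers new to Cinder, a minimal sketch of the backend-driver configuration model is shown below, loosely following the upstream install guides of this era; the section name and volume group are assumptions, and option names may differ between OpenStack releases:

    [DEFAULT]
    # Each name listed here must match a backend section defined below.
    enabled_backends = lvm

    [lvm]
    # Reference LVM driver shipped with Cinder; volumes are exported over iSCSI.
    volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
    volume_group = cinder-volumes      # assumed LVM volume group on the storage node
    iscsi_protocol = iscsi
    iscsi_helper = tgtadm
    volume_backend_name = lvm

Swapping or adding a section (for example, a Ceph RBD backend) is how additional storage devices are brought under Cinder's management.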

Learning Objectives

  • Cloud storage management
  • Open source storage management

Deploying and Optimizing for Cloud Storage Systems using Swift Simulator

Gen Xu, Software Engineer, Intel

Abstract

With the rise of cloud systems, IT spending on storage systems is increasing. In order to minimize costs, architects must optimize system capacities and characteristics. Current capacity planning is mostly based on trial and error as well as rough resource estimation. With increasing hardware diversity and software stack complexity, this approach is no longer efficient enough. Meeting both storage capacity and SLA/SLO requirements involves trade-offs.

If you are planning to deploy a storage cluster, growth is what you should be concerned with and prepared for. So how exactly can you architect a storage system, without breaking the bank, while sustaining sufficient capacity and performance across the scaling spectrum?

This session presents a novel simulation approach, with demonstrated flexibility and high accuracy, that can be used for cluster capacity planning, performance evaluation and optimization before system provisioning. We will focus specifically on storage capacity planning and provide criteria for getting the best price-performance configuration by setting the memory, SSD and magnetic disk ratio. We will also highlight performance optimization via evaluation of different OS parameters (e.g. log flush and write barrier), software configurations (e.g. proxy and object worker numbers) and hardware setups (e.g. CPU, cluster size, the ratio of proxy servers to storage servers, and network topology selection such as Clos vs. fat tree).
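As a hedged back-of-envelope companion to the simulation approach described above, the sketch below illustrates the capacity side of the trade-off; every number is an assumption for illustration, not simulator output:

    # Rough Swift cluster sizing: usable target -> raw capacity -> drives -> nodes.
    usable_target_tb = 500      # assumed usable capacity requirement
    replicas = 3                # Swift replica count
    drive_tb = 6                # assumed drive size
    drives_per_node = 12
    fill_factor = 0.75          # headroom for rebalancing and growth

    raw_tb = usable_target_tb * replicas / fill_factor
    drives = -(-raw_tb // drive_tb)            # ceiling division
    nodes = -(-drives // drives_per_node)
    print(f"{raw_tb:.0f} TB raw -> {drives:.0f} drives -> {nodes:.0f} storage nodes")

A simulator refines this kind of estimate by also modeling the performance (SLA/SLO) effects of memory, SSD and disk ratios rather than capacity alone.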

Learning Objectives

  • Design challenges of a cloud storage deployment
  • Storage system modeling technology for OpenStack-Swift
  • Use Case study: plan and optimize a storage cluster to meet capacity and performance requirements.

Swift Use Cases with SwiftStack

John Dickinson, Director of Technology, SwiftStack

Abstract

Swift is a highly available, distributed, scalable, eventually consistent object/blob store available as open source. It is designed to handle non-relational (that is, not just simple row-column data) or unstructured data at large scale with high availability and durability. For example, it can be used to store files, videos, documents, analytics results, Web content, drawings, voice recordings, images, maps, musical scores, pictures, or multimedia. Organizations can use Swift to store large amounts of data efficiently, safely, and cheaply. It scales horizontally without any single point of failure. It offers a single multi-tenant storage system for all applications, the ability to use low-cost industry-standard servers and drives, and a rich ecosystem of tools and libraries. It can serve the needs of any service provider or enterprise working in a cloud environment, regardless of whether the installation is using other OpenStack components. Use cases illustrate the wide applicability of Swift.
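To make the object/blob interface concrete, here is a minimal hedged sketch using the python-swiftclient library; the endpoint, credentials and names are placeholders, not details from this session:

    from swiftclient import client as swift

    # Hypothetical v1-auth endpoint and credentials.
    conn = swift.Connection(
        authurl="http://swift.example.com/auth/v1.0",
        user="account:user",
        key="secretkey",
    )

    conn.put_container("reports")                               # create (or reuse) a container
    conn.put_object("reports", "2016/q1.csv",
                    contents=b"a,b\n1,2\n", content_type="text/csv")
    headers, body = conn.get_object("reports", "2016/q1.csv")   # read the object back
    print(headers.get("content-type"), len(body))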


Imminent Challenges for the Cloud Storage Industry and the Solution

Mark Carlson, Principal Engineer, Industry Standards, Toshiba
Udayan Singh, Head, SPE-Storage, Platform and Manageability, Tata Consultancy Services

Abstract

The storage industry is being transformed by the adoption of cloud storage. Challenges that were overlooked during the initial stages of cloud storage industry growth are now becoming core issues of today and the future. In this session we discuss the major challenges that corporations will face in availing themselves of the best services from multiple cloud providers, or in moving from one cloud provider to another in a seamless manner.

The SNIA CDMI standard addresses these challenges by offering interoperability between cloud storage offerings. SNIA and Tata Consultancy Services (TCS) have partnered to create a SNIA CDMI Conformance Test Program to help cloud storage companies achieve conformance to CDMI and ensure interoperability between clouds. The TCS CDMI Conformance Assurance Solution (CAS) provides cloud storage product testing and detailed reports for conformance to the CDMI specification.

As interoperability becomes critical, end user companies should include the CDMI standard in their RFPs and demand conformance to CDMI from vendors.

Learning Objectives

  • Understand the critical challenges that the cloud storage industry is facing
  • Solution to address the identified challenges
  • Benefits of CDMI conformance testing
  • Benefits for end user companies

Beyond the Cloud: Space-based Data Storage

Scott Sobhani, Co-founder & CEO, Cloud Constellation Corporation

Abstract

As cloud and storage projections continue to rise, the number of organizations moving to the Cloud is escalating and it is clear cloud storage is here to stay. However, is it secure? Data is the lifeblood for government entities, countries, cloud service providers and enterprises alike and losing or exposing that data can have disastrous results. There are new concepts for data storage on the horizon that will deliver secure solutions for storing and moving sensitive data around the world. In this session, attendees will learn about new best practices to bypass the Internet.

Learning Objectives

  • Understanding the next phase of cloud storage
  • New architectural designs for reliable, redundant off-world storage
  • By-passing cross-jurisdictional restrictions to avoid violating privacy regulations
  • Understanding how a mother/daughter satellite configuration allows users to circumvent leaky networks
  • Isolating data from the Internet and leased lines

Containers

Efficient and Agile Persistent Storage for Containers

Carlos Carrero, Sr. Principal Technical Product Manager, Veritas
Chad Ryan Thibodeau, Product Manager, Veritas

Abstract

Containers are called the next big wave in application delivery, offering better resource utilization, agility and performance than traditional virtualization techniques. Once enterprises start running databases and applications with persistent storage needs, a new challenge appears with this new paradigm. This session will discuss how Veritas uses Software Defined Storage solutions to provide efficient and agile persistent storage for containers, offering enterprise capabilities like resilience, snapshots, I/O acceleration and disaster recovery. A reference architecture using commodity servers and server-side storage will be presented. Finally, future challenges around quality of service, manageability and visibility will be covered in this session.

Learning Objectives

  • Understand how to start working with containers today in a hyper-converged infrastructure
  • Use Software Defined Storage to fulfil persistent storage needs
  • Get a vision of the future of persistent storage for containers

Data Management

5 Ways to Convince Your CEO It’s Time for a Storage Refresh

David Siles, CTO, DataGravity

Abstract

Storage has historically stayed in its “box.” Any developments in the industry were contained to limited dimensions such as costs, feeds or speeds. But in today’s digital world, it’s not enough to just deliver data – all parts of your infrastructure must be able to answer ever-present data security questions, extract valuable insights, identify sensitive information and help protect critical data from incoming threats.

As you go through your next technology refresh, what benefits should your storage deliver to your business, and by which dimensions should you measure them? In this session, DataGravity Chief Technology Officer David Siles will address the new standards for aligning your storage platform with your data’s needs, and share tips for highlighting the consequences of outdated storage to your CEO. Your company’s critical data wasn’t created to live and die in a system that can’t maximize its value or protect it from risks.

Learning Objectives

  • Contextualize storage and IT from a C-level business perspective
  • Identify the storage strategy that best fits your data and company
  • Incorporate data security considerations to discover and protect sensitive data
  • Empower employees with increased productivity and collaboration, and identify insights that improve the bottom line

The Role of Tape Technology in Managing the Exponential Growth of Cold Data

Osamu Shimizu, Research Engineer, FUJIFILM Corporation

Abstract

The increasing need for data storage capacity due to the enormous amounts of newly created data year after year is an endless story. However, budgets for data storage are not increasing at nearly the same rate as data capacity, while retaining the data remains very important. Consequently, an inexpensive but reliable storage solution is paramount in this situation. Fortunately, most data generated is "cold data", which is rarely accessed but still needs to be retained for quite a long period of time. Tape storage, which has a long proven history with applications in various industries, is suitable for retaining such cold data because of its low TCO (total cost of ownership), advanced performance, high reliability and promising future outlook compared to other candidate technologies for cold storage (e.g. HDD, optical discs).

In this presentation, we will go through the reasons why tape storage is suitable for retaining cold data and will present the latest tape technologies and future outlook.

Learning Objectives

  • What is cold data?
  • Tape's advantages in cold storage
  • Latest tape storage technologies
  • Future outlook of tape storage

Embrace Hyper-scale Storage Architecture to Drastically Increase Reliability and Lower Costs

Sudhakar Mungamoori, Vice President, Customer Success and Sales Engineering, Formation Data Systems

Abstract

The tightly coupled architecture used by the vast majority of enterprises today is archaic and a new approach is needed to manage the explosion of data in a world of shrinking IT budgets. Enter: hyper-scale IT architecture.

Sudhakar Mungamoori will explain why a modern, software-driven and loosely coupled architecture is required for hyper-scale IT. He will highlight how this innovative architecture approach mitigates complexity, improves agility and reliability through on demand IT resources and reduces costs by as much as 10X.

Mungamoori will highlight how enterprises can learn from companies like Google and Facebook who built their own loosely coupled IT architectures to capitalize on its advantages. He will discuss use cases and best practices for IT departments that cannot similarly build their own, but are strategically looking to adopt loosely coupled architectures in order to remain competitive without blowing their budget in the face of today’s data deluge.

Learning Objectives

  • Why tightly coupled storage array architectures don’t meet the scale, performance or economic requirements of today’s enterprise, not to mention the future.
  • How distributed, hyper-scale software architectures provide increased scale, availability and resilience across many different workload profiles.
  • Challenges IT departments deploying hyper-scale solutions face, and best practices for how they can overcome these issues
  • How to migrate application workloads from legacy storage arrays to software-defined architectures using tiering and quality of service for guaranteed performance delivery.
  • How de-duplication, journaling and space efficient snapshots provide superior data protection with granular recovery across distributed nodes to minimize data loss and storage consumption.

Data Protection

SNIA Tutorial:
Intro to Data Protection-Backup to Tape, Disk & Beyond

Tom Sas, Worldwide Product Marketing Manager, Hewlett Packard Enterprise

Abstract

Extending the enterprise backup paradigm with disk-based technologies allows users to significantly shrink or eliminate the backup time window. This tutorial focuses on various methodologies that can deliver efficient and cost-effective solutions. This includes approaches to storage pooling inside of modern backup applications, using disk and file systems within these pools, as well as how and when to utilize Continuous Data Protection, deduplication, virtual tape libraries (VTL), and the cloud.

Learning Objectives

  • Understand backup and restore technology including tape, disk, snapshots, deduplication, virtual tape, replication technologies and the cloud.
  • Compare and contrast backup and restore alternatives to achieve data protection and data recovery.
  • Identify and define backup and restore operations and terms.

SNIA Tutorial:
Trends in Backup and Restoration Technologies

Jason Iehl, Manager - Systems Engineering, Netapp

Abstract

Many disk technologies, both old and new, are being used to augment tried-and-true backup and data protection methodologies to deliver better information and application restoration performance; these technologies work in parallel with the existing backup paradigm. This session will discuss many of these technologies in detail. Important considerations of data protection include performance, scale, regulatory compliance, recovery objectives and cost. Technologies covered include contemporary backup, disk-based backups, snapshots, continuous data protection and capacity-optimized storage, as well as cloud services. This tutorial will cover how these technologies interoperate, as well as best-practice recommendations for deployment in today's heterogeneous data centers.

Learning Objectives

  • Understand backup and restore technology including tape, disk, snapshots, deduplication, virtual tape, replication technologies and the cloud.
  • Compare and contrast backup and restore alternatives to achieve data protection and data recovery.
  • Identify and define backup and restore operations and terms.

SNIA Tutorial:
Protecting Data in a Big Data World

David A. Chapa, CTO and Managing Partner, Elovate

Abstract

Data growth is in an explosive state, and these "Big Data" repositories need to be protected. In addition, new regulations are mandating longer data retention, and the job of protecting these ever-growing data repositories is becoming even more daunting. This presentation will outline the challenges, methodologies, and best practices to protect the massive scale "Big Data" repositories.

Learning Objectives

  • Understand the unique challenges of managing and protecting "Big Data" repositories.
  • Understand the various technologies available for protecting "Big Data" repositories.
  • Understand various data protection considerations and best practices for "Big Data" repositories in various environments, including disaster recovery/replication, capacity optimization, etc.

SNIA Tutorial:
Privacy vs Data Protection: The Impact of EU Data Protection Legislation

Thomas Rivera, Sr. Technical Associate, HDS

Abstract

After reviewing the diverging data protection legislation in the EU member states, the European Commission (EC) decided that this situation would impede the free flow of data within the EU zone. The EC response was to undertake an effort to "harmonize" the data protection regulations, and it started the process by proposing a new data protection framework. This proposal includes some significant changes, like defining a data breach to include data destruction, adding the right to be forgotten, adopting the U.S. practice of breach notifications, and many other new elements. Another major change is a shift from a directive to a regulation, which means the protections are the same for all member states and include significant financial penalties for infractions. This tutorial explores the new EU data protection legislation and highlights the elements that could have significant impacts on data handling practices.

Learning Objectives

  • Highlight the major changes to the previous data protection directive.
  • Review the differences between “Directives” versus “Regulations”, as it pertains to the EU legislation.
  • Learn the nature of the Reforms as well as the specific proposed changes – in both the directives and the regulations.

Distributed Storage

Maximizing Storage Density While Conserving Valuable Rack Real Estate

Chaz Stevens, Director of Marketing, Aberdeen LLC

Abstract

Aberdeen delivers extreme computing power with our ultra-dense, extreme-performance storage devices, packing over three-quarters of a petabyte into only 4U of rack space. While consuming just 4U, Aberdeen’s ultra-dense storage devices suit a wide range of capacity-hungry applications, including big data analytics and massive block or object storage.

Aberdeen custom-configures your server or storage products to your exact specifications. The ease of our online configurator lets you choose the storage device that fits precisely what you want. This session explores the features of Aberdeen’s NAS line, including our N49: a 4U, 78-bay, ultra-dense, high-performance 12 Gb/s SAS storage device.


Identifying Performance Bottlenecks with Real-World Applications and Flash-Based Storage

Dennis Martin, President, Demartek

Abstract

Where are today’s storage performance bottlenecks, how do you find them and how does adding flash storage affect them? Demartek will report the results (IOPS, throughput and latency) of vendor-neutral performance tests run on database and virtualization workloads typical of those found in today’s data centers. The tests cover both hybrid and all-flash solutions from several manufacturers and using a variety of form factors and interfaces. You will come away with reasonable estimates of what to expect in practice, observe how different workloads affect storage system performance and notice the difference in performance results depending on where the measurements were taken. Technologies discussed include server-side flash, hybrid storage arrays, all-flash arrays and various interfaces including NVMe.

Learning Objectives

  • Learn how real-world workloads perform on different types of flash storage
  • Learn how different application workloads affect latency
  • Learn about some of the interfaces used for flash storage

High Availability for Centralized NVMe

Zivan Ori, CEO and Co-founder, E8 Storage

Abstract

Using NVMe drives in a centralized manner introduces the need for high availability. Without it, a simple failure in the NVMe enclosure will result in loss of access to a big group of disks. Loss of a single NVMe disk will impact all hosts mapped to this disk. We will review the state of the industry in approaching these problems, the challenges in performing HA and RAID at the speeds and latency of NVMe, and introduce new products in this space.

Etc.

Panel: Shingled Magnetic Recording (SMR) – Data Management Techniques Examined

Moderator: Tom Coughlin, President, Coughlin Associates

Panelists: Jorge Campello, Global Director of Systems and Solutions, Western Digital; Mark Carlson, Principal Engineer, Industry Standards, Toshiba, Chair, SNIA Technical Council; Josh Bingaman, Firmware Engineering Manager, Seagate Technology

Abstract

The unyielding growth of digital data continues to drive demand for higher capacity, lower-cost storage. With the advent of Shingled Magnetic Recording (SMR), which overlaps HDD tracks to provide a 25 percent capacity increase versus conventional magnetic recording technology, storage vendors are able to offer extraordinary drive capacities within existing physical footprints. That said, IT decision makers and storage system architects need to be cognizant of the different data management techniques that come with SMR technology, namely Drive Managed, Host Managed and Host Aware. This panel session will offer an enterprise HDD market overview from prominent storage analyst Tom Coughlin as well as presentations on SMR data management methods from leading SMR HDD manufacturers (Seagate, Toshiba and Western Digital).
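As background for the data management discussion, the toy model below sketches the write-pointer rule that host-managed SMR zones impose (writes must be sequential within a zone, and a zone must be reset before it is rewritten); it is an illustration of the general zoned-device behaviour, not any vendor's implementation:

    class SMRZone:
        """Toy host-managed SMR zone: writes must land exactly at the write pointer."""
        def __init__(self, size_blocks):
            self.size = size_blocks
            self.write_pointer = 0
            self.blocks = [None] * size_blocks

        def write(self, lba, data):
            if lba != self.write_pointer:
                raise IOError("host-managed zone: writes must be sequential at the write pointer")
            self.blocks[lba] = data
            self.write_pointer += 1

        def reset(self):
            """Reset (rewind) the zone before rewriting it from the start."""
            self.write_pointer = 0
            self.blocks = [None] * self.size

Drive-managed disks hide this rule behind internal remapping, while host-aware drives accept random writes but perform best when the host respects the write pointer.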

Learning Objectives

  • Enterprise HDD market update
  • Brief introduction of Shingled Magnetic Recording (SMR)
  • Deep dive into SMR data management techniques (Drive Managed, Host Managed and Host Aware)

SNIA Tutorial:
Windows Interoperability Workshop

Christopher Hertel, Software Developer, Sr. Program Manager, Dell/Compellent

Abstract

Windows and POSIX are different, and bridging the gap between the two—particularly with Network File Systems—can be a daunting endeavor ...and annoying, too. This tutorial will provide an overview of the SMB3 network file protocol (the heart and soul of Windows Interoperability) and describe some of the unique and powerful features that SMB3 provides. We will also point out and discuss some of the other protocols and services that are integrated with SMB3 (such as PeerDist), and show how the different pieces are stapled together and made to fly. The tutorial will also cover the general structure of Microsoft's protocol documentation, the best available cartography for those lost in the Interoperability Jungle. Some simple code examples will be used sparingly as examples, wherever it may seem clever and useful to do so.

Learning Objectives

  • Become familiar with the Windows Interoperability Ecosystem
  • Better understand Microsoft's Specifications
  • Identify Windows-specific semantic details

SNIA Tutorial:
Object Drives: Simplifying the Storage Stack

Mark Carlson, Principal Engineer, Industry Standards, Toshiba

Abstract

A number of scale out storage solutions, as part of open source and other projects, are architected to scale out by incrementally adding and removing storage nodes. Example projects include:

  • Hadoop’s HDFS
  • Ceph
  • Swift (OpenStack object storage)

The typical storage node architecture includes inexpensive enclosures with IP networking, CPU, Memory and Direct Attached Storage (DAS). While inexpensive to deploy, these solutions become harder to manage over time. Power and space requirements of Data Centers are difficult to meet with this type of solution. Object Drives further partition these object systems allowing storage to scale up and down by single drive increments.

Learning Objectives

  • What are object drives?
  • What value do they provide?
  • Where are they best deployed?

File Systems

SNIA Tutorial:
Massively Scalable File Storage

Philippe Nicolas, Advisor, OpenIO

Abstract

The Internet changed the world and continues to revolutionize how people are connected, exchange data and do business. This radical change is one of the causes of the rapid explosion of data volume that requires a new data storage approach and design. One of the common elements is that unstructured data rules the IT world. How can the famous Internet services we all use every day support and scale with thousands of new users and hundreds of TB added daily, and continue to deliver an enterprise-class SLA? What are the various technologies behind a cloud storage service that supports hundreds of millions of users? This tutorial covers technologies introduced by famous papers about Google File System and BigTable, Amazon Dynamo and Apache Hadoop. In addition, parallel, scale-out, distributed and P2P approaches, both open source and proprietary, are illustrated as well. This tutorial also covers some key features essential at large scale, to help understand and differentiate industry vendors and open source offerings.

Learning Objectives

  • Understand technology directions for large scale storage deployments
  • Be able to compare technologies
  • Learn from big internet companies about their storage choices and approaches

The Tip of the Iceberg: The Coming Era of Open Source Enterprise Storage

Michael Dexter, Senior Analyst, iXsystems, Inc

Abstract

Joy's Law, named after Sun Microsystems' co-founder, is proving painfully true: “The best people work for someone else.” The explosive growth of open source now dictates that you are always outnumbered by skilled developers you could never hire or afford. This is not, however, a bad thing. Collaboration always beats competition when building community infrastructure, and the OpenZFS project is taking Sun's next-generation file system to unforeseen and unimaginable levels. OpenZFS-powered projects like FreeNAS have inverted the conversation from "I wish I could have this enterprise technology at home" to "Why aren't we using this at work?"

Learning Objectives

  • Learn about Open Source and the OpenZFS file system
  • Understand the advantages that OpenZFS provides to systems of all sizes
  • Survey the OpenZFS community, developers and vendors
  • Explore SDS and Platform-Defined OpenZFS solutions

Ceph For The Enterprise

David Byte, Senior Technology Strategist, SUSE

Abstract

Few, if any, enterprise organizations will be willing to consume an upstream version of Ceph. This session will cover general guidelines for implementing Ceph for the enterprise and cover available reference architectures from SUSE.

Learning Objectives

  • Understanding design choices for Ceph
  • General sizing guidelines
  • Using Ceph in a traditional environment
  • Managing Ceph in the enterprise

High Performance Storage for Science Workloads

Ulrich Fuchs, IT Service Manager, CERN

Abstract

The unique challenges in the field of nuclear and high energy physics are already pushing the limits of storage solutions today; however, the projects planned for the next ten years call for storage capacities, performance and access patterns that exceed the limits of many of today's solutions.

This talk will present the limitations in network and storage and explain the architecture chosen for tomorrow's storage implementations in this field. Tests of various file systems (Lustre, NFS, block/object storage, GPFS, ...) have been performed, and the results of performance measurements for different hardware solutions and access patterns will be presented.

Learning Objectives

  • Shared file system and storage performance requirements in science workloads
  • Setup and results of performance measurements of different file systems, including Lustre and GPFS
  • Technology differences between several file systems and storage solutions

MarFS: A Scalable Near-POSIX File System over Cloud Objects

Gary Grider, HPC Division Leader, Los Alamos National Laboratory

Abstract

Many computing sites need long-term retention of mostly cold data, often called “data lakes”. The main function of this storage tier is capacity, but non-trivial bandwidth/access requirements exist. For many years, tape was the best economic solution. However, data sets have grown larger more quickly than tape bandwidth has improved, and access demands have increased, so disk can now be the more economical choice for this storage tier. The cloud community has moved towards erasure-based object stores to gain scalability and durability using commodity hardware. The object interface works for new applications, but legacy applications rely on POSIX as their interface. MarFS is a near-POSIX file system that uses cloud-style storage for data and many POSIX file systems for metadata. MarFS will scale the POSIX namespace metadata to trillions of files, and billions of files in a single directory, while storing the data in efficient, massively parallel ways in industry-standard, erasure-protected, cloud-style object stores.

Learning Objectives

  • Storage tiering in future HPC and large scale computing environments
  • Economic drivers for implementing data lakes/tiered storage
  • HPC specific requirements for data lakes - multi way scaling
  • Overview of existing solution space
  • How the MarFS solution works and for what types of situations

High Performance NAS, New design for New IT Challenges

Pierre Evenou, CEO, Rozo Systems
Philippe Nicolas, Advisor, Rozo Systems

Abstract

Rozo Systems develops a new generation of scale-out NAS with a radically new design to deliver a new level of performance. RozoFS is a highly scalable, high-performance and highly resilient file storage product, fully hardware agnostic, that relies on a unique patented erasure coding technology developed at the University of Nantes in France. This new philosophy in file serving extends what is possible and available on the market today, with super-fast and seamless data protection techniques. Thus RozoFS is the perfect companion for highly demanding environments such as HPC, life sciences, media and entertainment, and oil and gas.
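For readers unfamiliar with erasure coding, the toy single-parity example below shows the basic idea of surviving a lost fragment without full replication; it is a generic illustration only and says nothing about RozoFS's patented coding technique:

    def xor_parity(chunks):
        """XOR parity across equally sized chunks (tolerates the loss of any single chunk)."""
        parity = bytearray(len(chunks[0]))
        for chunk in chunks:
            for i, b in enumerate(chunk):
                parity[i] ^= b
        return bytes(parity)

    data = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_parity(data)

    # Lose one data chunk; the survivors (data + parity) XOR back to the missing one.
    rebuilt = xor_parity([data[0], data[2], parity])
    assert rebuilt == data[1]

Production erasure codes use more fragments and tolerate multiple losses, trading extra CPU for far less raw capacity than three-way replication.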

Learning Objectives

  • Learn new Scale-Out NAS design
  • Learn new Erasure Coding technology
  • Understand use-case challenges solved by RozoFS

Green Storage

Storage Systems Can Now Get ENERGY STAR Labels and Why You Should Care

Dennis Martin, President, Demartek

Abstract

We all know about ENERGY STAR labels on refrigerators and other household appliances. In an effort to drive energy efficiency in data centers, storage systems can now get ENERGY STAR labels through the EPA's ENERGY STAR Data Center Storage program. This program uses the taxonomies and test methods described in the SNIA Emerald Power Efficiency Measurement specification, which is part of the SNIA Green Storage Initiative. In this session, Dennis Martin, President of Demartek, the first SNIA Emerald Recognized Tester company, will discuss the similarities and differences between power supplies used in computers you build yourself and in data center storage equipment, 80 PLUS ratings, and why it is more efficient to run your storage systems at 230 V or 240 V rather than 115 V or 120 V. Dennis will share his experiences running the EPA ENERGY STAR Data Center Storage tests for storage systems and why vendors want to get approved.

Learning Objectives

  • Learn about power supply efficiencies
  • Learn about running datacenter equipment at 230 volts vs. 115 volts
  • Learn about the SNIA Emerald Power Efficiency Measurement
  • Learn about the EPA ENERGY STAR Data Center Storage program

Power-Efficient Data Storage

Brian Zahnstecher, Principal, PowerRox

Abstract

Everyone wants to save energy in one form or another and energy efficiency is right at the top of lists of data center owner/architect pain points and key concerns. As worldwide data grows at an exponential rate, data storage solutions are creating an ever-increasing footprint for power in the data center. Understanding the key factors of power utilization for storage solutions is critical to optimizing that power footprint whether it be for purposes of system design or application in a data center or elsewhere. This talk will provide a high-level overview of storage technologies and compare/contrast them from a power perspective. More importantly, it will identify the best and simplest opportunities for reducing overall energy usage. Electrical engineering and/or power technical knowledge is not required as this is targeted for both technologists and facilities/business decision-makers.

Learning Objectives

  • Where are the "low-hanging fruit" opportunities for power savings in data storage?
  • What drives peak vs. average power draw in data storage?
  • What are the key characteristics (or figures of merit) for power that differentiate storage solution

Hot Topics

New Fresh Storage Approach for New IT Challenges

Laurent Denel, CEO, OpenIO
Philippe Nicolas, Advisor, OpenIO

Abstract

With a design started in 2006, OpenIO is a new flavor in the dynamic object storage market segment. Beyond Ceph and OpenStack Swift, OpenIO is the latest player to arrive in that space. The product relies on an open source core object storage engine with several object APIs, file sharing protocols and application extensions. The inventors of the solution took a radically new approach to address large-scale environment challenges. Among other things, the product avoids the rebalancing that consistent-hashing-based systems always trigger; the impact is immediate, as new machines contribute immediately without any extra tasks that affect the platform service. OpenIO also introduces the Conscience, an intelligent data placement service that optimizes the location of data based on various criteria such as node workload, storage space, etc. OpenIO is fully hardware agnostic, running on commodity x86 servers and promoting total independence.
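To illustrate the rebalancing behaviour referred to above, the sketch below measures how many keys change placement when one node joins a simple consistent-hash ring; it is a generic illustration, not OpenIO's (or any other product's) placement algorithm:

    import bisect, hashlib

    def build_ring(nodes, vnodes=100):
        """Sorted (hash, node) points with virtual nodes."""
        points = []
        for n in nodes:
            for v in range(vnodes):
                h = int(hashlib.md5(f"{n}-{v}".encode()).hexdigest(), 16)
                points.append((h, n))
        return sorted(points)

    def locate(points, key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        i = bisect.bisect(points, (h, "")) % len(points)
        return points[i][1]

    keys = [f"object-{i}" for i in range(100_000)]
    before = build_ring([f"node{i}" for i in range(10)])
    after = build_ring([f"node{i}" for i in range(11)])     # add one node

    moved = sum(locate(before, k) != locate(after, k) for k in keys)
    print(f"{moved / len(keys):.1%} of objects change placement")   # roughly 1/11 of the data

Even this "minimal" movement is a background data migration across the cluster, which is the kind of extra work the Conscience-based placement described above is designed to avoid.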

Learning Objectives

  • Understand object storage design
  • Learn the limitation of classic approaches
  • Investigate new methods to build and design very large scale storage systems

Are You Ready to Deploy In-memory Computing Applications?

Shaun Walsh, Managing Partner, G2M Communications

Abstract

Businesses are extracting value from more data, from more sources, and at increasingly real-time rates. Spark and HANA are just the beginning. This presentation details existing and emerging solutions for in-memory computing that address this market trend, and the disruptions that happen when combining big data (petabytes) with in-memory/real-time requirements. We will also provide use cases and survey results from users who have implemented in-memory computing applications. The presentation gives an overview and the trade-offs of key solutions (Hadoop/Spark, Tachyon, HANA, in-memory NoSQL, etc.) and related infrastructure (DRAM, NAND, 3D XPoint, NVDIMMs, high-speed networking), and the disruption to infrastructure design and operations when "tiered memory" replaces "tiered storage". It also includes real customer data on how users are addressing and planning for this transition with this architectural framework in mind. The audience will leave with a framework to evaluate and plan for their adoption of in-memory computing.

Learning Objectives

  • Learn what it takes to evaluate, plan and implement in-memory computing applications

Machine Learning Based Prescriptive Analytics for Data Center Networks

Hariharan Krishnaswamy, Principal Engineer, Dell

Abstract

In a modern data center with thousands of servers, thousands of switches and storage devices, and millions of cables, failures can arise anywhere in the compute, network or storage layer. The infrastructure provides multiple sources of huge volumes of data: time series data of events, alarms, statistics, IPC, system-wide data structures, traces and logs. Interestingly, data is gathered in different formats and at different rates by different subsystems. With this heterogeneous data representation, the ability to blend and ingest the data to discover hidden correlations and patterns is important. Robust data architecture and machine learning techniques are required to predict impending functional or performance issues and to propose desired actions that can mitigate an unwanted situation before it happens. This presentation will outline the challenges and address machine learning based solutions to und