SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA
Object stores are known for ease of use and massive scalability. Unlike other storage solutions such as file systems and block stores, object stores can handle data growth without an increase in complexity or developer intervention. Apache Hadoop Ozone is a highly scalable object store that extends the design principles of HDFS while scaling 10-100x beyond it: it can store billions of keys and hundreds of petabytes of data. At that scale, Ozone must deliver very high throughput while maintaining low latency. This talk discusses the Ozone architecture and the design decisions that were significant for achieving high throughput and low latency. It details how Ozone supports its metadata and data layers without compromising throughput or latency. The talk further discusses the challenges of resource management and sustaining good performance in Ozone, covering some major pain points and presenting the performance issues in broad categories.
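As a minimal sketch of how an application can interact with an Ozone cluster, the example below writes and reads a key through Ozone's S3-compatible gateway using boto3. The endpoint URL, bucket name, and credentials are illustrative assumptions, not details from the talk.

```python
# Minimal sketch: write and read a key through Ozone's S3-compatible gateway
# with boto3. Endpoint, bucket, and credentials are illustrative placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9878",    # assumed Ozone S3 gateway endpoint
    aws_access_key_id="EXAMPLE_ACCESS_KEY",
    aws_secret_access_key="EXAMPLE_SECRET_KEY",
)

s3.create_bucket(Bucket="demo-bucket")
s3.put_object(Bucket="demo-bucket", Key="logs/app-0001.log", Body=b"hello ozone")

obj = s3.get_object(Bucket="demo-bucket", Key="logs/app-0001.log")
print(obj["Body"].read())
```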
Terraform is an infrastructure-as-code tool from HashiCorp for building, changing, and managing infrastructure. It can manage multi-cloud environments with a configuration language called the HashiCorp Configuration Language (HCL), which codifies cloud APIs into declarative configuration files. We will learn how to write Terraform configuration files to run a single application or manage an entire data center, by defining a plan and then executing it to build the described infrastructure. As the configuration changes, Terraform determines what changed and creates incremental execution plans accordingly. Using Terraform, we can manage low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries and SaaS features.
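As a minimal sketch of this workflow, assuming the Terraform CLI is installed, the example below writes a small HCL configuration and then runs `terraform init`, `plan`, and `apply`. The provider, resource, region, and AMI ID are placeholders chosen for illustration.

```python
# Illustrative sketch of the Terraform workflow: write an HCL configuration
# file, then run `terraform init`, `plan`, and `apply` on it.
import subprocess
from pathlib import Path

HCL = '''
terraform {
  required_providers {
    aws = { source = "hashicorp/aws" }
  }
}

provider "aws" {
  region = "us-west-2"
}

resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0"   # placeholder AMI ID
  instance_type = "t3.micro"
}
'''

Path("main.tf").write_text(HCL)

subprocess.run(["terraform", "init"], check=True)
subprocess.run(["terraform", "plan", "-out=plan.tfplan"], check=True)
subprocess.run(["terraform", "apply", "plan.tfplan"], check=True)
```

Re-running the same sequence after editing main.tf yields an incremental plan that contains only the changes needed to reconcile the real infrastructure with the new configuration.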
Apache Ozone is an object store that scales to tens of billions of objects, hundreds of petabytes of data, and thousands of datanodes. Ozone supports not only high-throughput data ingestion but also high-throughput deletion, with performance similar to HDFS. Furthermore, at massive scale, data can become non-uniformly distributed due to the addition of new datanodes, deletion of data, and so on. Non-uniform distribution can lead to lower utilization of resources and can affect the overall throughput of the cluster. The talk discusses the balancer service in Ozone, which is responsible for the uniform distribution of data across the cluster. It covers the service's design and how it improves upon the HDFS balancer service. The talk further discusses how Ozone's deletion matches the HDFS deletion performance of one million keys per hour but can scale much further. Simple design and asynchronous operations enable Ozone to achieve this scale for deletion. The talk dives deeper into the design and the performance enhancements.
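As an illustration of the threshold-based balancing idea (not Ozone's actual ContainerBalancer implementation), the sketch below flags datanodes whose utilization deviates from the cluster average by more than a configurable threshold and plans data moves from over-utilized to under-utilized nodes. The node names and the 10% default threshold are assumptions.

```python
# Conceptual sketch of threshold-based balancing: datanodes whose utilization
# deviates from the cluster average by more than a threshold become sources
# (over-utilized) or targets (under-utilized) for data moves.
from dataclasses import dataclass

@dataclass
class DataNode:
    name: str
    used_bytes: int
    capacity_bytes: int

    @property
    def utilization(self) -> float:
        return self.used_bytes / self.capacity_bytes

def plan_moves(nodes: list[DataNode], threshold: float = 0.10):
    avg = sum(n.utilization for n in nodes) / len(nodes)
    over = [n for n in nodes if n.utilization > avg + threshold]
    under = [n for n in nodes if n.utilization < avg - threshold]
    moves = []
    for src in sorted(over, key=lambda n: n.utilization, reverse=True):
        for dst in sorted(under, key=lambda n: n.utilization):
            # Move enough data to bring both nodes toward the cluster average.
            bytes_to_move = min(
                int((src.utilization - avg) * src.capacity_bytes),
                int((avg - dst.utilization) * dst.capacity_bytes),
            )
            if bytes_to_move > 0:
                moves.append((src.name, dst.name, bytes_to_move))
    return moves

nodes = [DataNode("dn1", 90, 100), DataNode("dn2", 40, 100), DataNode("dn3", 50, 100)]
print(plan_moves(nodes))   # e.g. [('dn1', 'dn2', 20)]
```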
In file systems, large sequential writes are more beneficial than small random writes, which is why many storage systems implement a log-structured file system. In the same way, the cloud favors large objects over small objects. Cloud providers place throttling limits on PUTs and GETs, so it takes significantly longer to upload many small objects than one large object of the aggregate size. Moreover, there are per-PUT request costs associated with uploading smaller objects. At Netflix, a lot of media assets and their relevant metadata are generated and pushed to the cloud. Most of these files range from tens of bytes to tens of kilobytes and are saved as small objects in the cloud. In this talk, we propose a strategy to compact these small objects into larger blobs before uploading them to the cloud. We will discuss the policies for selecting relevant smaller objects and how to manage the indexing of these objects within the blob. We will also discuss how different cloud storage operations, such as reads and deletes, would be implemented for such objects. Finally, we will showcase the potential impact of such a strategy on Netflix assets in terms of cost and performance.
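As a rough sketch of the compaction strategy under discussion, the example below packs small objects into a single blob, keeps a per-object (offset, length) index, uploads the blob with one PUT, and serves reads of individual objects with ranged GETs. Bucket and key names are illustrative.

```python
# Sketch of small-object compaction: pack many small objects into one blob,
# keep an index of (offset, length) per logical object, upload the blob once,
# and serve reads with ranged GETs.
import boto3

def pack(objects: dict[str, bytes]) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Concatenate small objects into one blob and build a name -> (offset, length) index."""
    blob, index, offset = bytearray(), {}, 0
    for name, data in objects.items():
        index[name] = (offset, len(data))
        blob.extend(data)
        offset += len(data)
    return bytes(blob), index

s3 = boto3.client("s3")
small_objects = {"asset-1.json": b'{"id": 1}', "asset-2.json": b'{"id": 2}'}
blob, index = pack(small_objects)

# One PUT for the whole blob instead of one PUT per small object.
s3.put_object(Bucket="media-assets", Key="blobs/blob-0001", Body=blob)

# Read a single logical object back with a ranged GET.
offset, length = index["asset-2.json"]
resp = s3.get_object(
    Bucket="media-assets",
    Key="blobs/blob-0001",
    Range=f"bytes={offset}-{offset + length - 1}",
)
print(resp["Body"].read())
```

Deletes could similarly be handled by tombstoning index entries and reclaiming space when a blob is later rewritten, one of the operations the talk covers.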
Cloud storage footprints are in the exabytes and growing exponentially, and companies pay billions of dollars to store and retrieve data. In this talk, we will cover some of the space and time optimizations that have historically been applied to on-premises file storage, and how they can be applied to objects stored in the cloud. Deduplication and compression are techniques that have traditionally been used to reduce the amount of storage used by applications. Data encryption is table stakes for any remote storage offering, and today cloud providers support both client-side and server-side encryption. Combining compression, encryption, and deduplication for object stores in the cloud is challenging due to the nature of overwrites and versioning, but the right strategy can save millions for an organization. We will cover strategies for employing these techniques depending on whether an organization prefers client-side or server-side encryption, and discuss online and offline deduplication of objects. Companies such as Box and Netflix employ a subset of these techniques to reduce their cloud footprint and provide agility in their cloud operations.
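As a hedged sketch of how these techniques might be combined on the client side (assuming client-side encryption with a key the organization manages), the example below deduplicates on a hash of the plaintext, compresses before encrypting (since ciphertext does not compress), and uploads the result. The bucket name, key handling, and in-memory dedup index are illustrative simplifications.

```python
# Sketch: deduplicate on a plaintext hash, compress, then encrypt client-side
# before uploading. AESGCM comes from the 'cryptography' package.
import hashlib
import os
import zlib
import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

s3 = boto3.client("s3")
key = AESGCM.generate_key(bit_length=256)    # in practice, use a KMS-managed key
seen_hashes: set[str] = set()                # in practice, a persistent dedup index

def put_dedup_compress_encrypt(bucket: str, name: str, data: bytes) -> str:
    digest = hashlib.sha256(data).hexdigest()
    if digest in seen_hashes:
        return digest                        # duplicate content: store only a reference
    compressed = zlib.compress(data, level=6)
    nonce = os.urandom(12)                   # unique nonce per object, stored with the data
    ciphertext = AESGCM(key).encrypt(nonce, compressed, None)
    s3.put_object(Bucket=bucket, Key=f"chunks/{digest}", Body=nonce + ciphertext)
    seen_hashes.add(digest)
    return digest
```

With server-side encryption the order changes: deduplication and compression must happen before the object leaves the client, while the cloud provider handles encryption at rest, which is one of the trade-offs the talk explores.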
Amazon AWS S3 storage is widely deployed to store everything from customer data to server logs and software repositories. Poorly secured S3 buckets have resulted in many publicized data breaches. The cloud service provider's shared responsibility model places responsibility on customers for protecting the confidentiality, availability, and integrity of their data. Thales CipherTrust Transparent Encryption Cloud Object Storage (CTE COS) for S3 secures S3 objects by enabling advanced encryption along with dual-endpoint access controls. Access controls are enforced both at the client host running the AWS S3 application and at the AWS S3 server end. The encryption offered by CTE COS for S3 is independent of AWS's S3 server-side encryption (see Figure 1).

Encryption and access controls are completely transparent to applications, while AWS S3 administrative procedures remain unchanged after software agent deployment. Continuously enforced encryption policies protect against unauthorized access even in the case of AWS misconfigurations. Data access to 'protected' S3 buckets is tracked through detailed audit logs. CTE's granular, least-privileged user access policies protect sensitive data in S3 buckets from external attacks and misuse by other privileged users. CTE security administrators can frame client host policies to allow or deny actions involving ACLs, such as reading, writing, enumerating, and deleting S3 buckets or even individual objects in an S3 bucket. In addition, client policies can specify the permissible users and applications capable of accessing protected AWS S3 buckets.

AWS S3 server-side access controls can also be simultaneously and transparently enabled with custom AWS IAM policies and roles. S3 bucket data accesses are only allowed from hosts configured with CipherTrust Transparent Encryption. Cloud access controls and their management can therefore be offloaded to client hosts, with additional control points for permitting specific local identities and applications. CTE COS S3's dual-endpoint access controls and encryption therefore prevent S3 data breaches from unauthorized accesses even in the midst of misconfigured buckets and rogue insider threats. CTE COS S3 is FIPS 140-2 certified and is part of the CipherTrust Data Security Platform.
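The CTE agent and its host-side policies are proprietary, so the sketch below illustrates only the AWS-side half of the dual-endpoint model described above: a bucket policy that denies S3 access unless requests originate from approved client host addresses. The bucket name, CIDR range, and statement ID are placeholders, and this is not a representation of the CTE product itself.

```python
# Illustration of the AWS-side control: a bucket policy that denies S3 access
# unless requests come from approved client host IPs (e.g., hosts running the
# CTE agent). All values below are placeholders.
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAccessExceptFromApprovedHosts",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::protected-bucket",
                "arn:aws:s3:::protected-bucket/*",
            ],
            "Condition": {
                "NotIpAddress": {"aws:SourceIp": ["203.0.113.0/24"]}
            },
        }
    ],
}

s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket="protected-bucket", Policy=json.dumps(policy))
```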