Michael Hoard

Mar 9, 2023

A digital twin (DT) is a virtual representation of an object, system or process that spans its lifecycle, is updated from real-time data, and uses simulation, machine learning and reasoning to help decision-making. Digital twins can be used to help answer what-if AI-analytics questions, yield insights on business objectives and make recommendations on how to control or improve outcomes. It's a fascinating technology that the SNIA Cloud Storage Technologies Initiative (CSTI) discussed at our live webcast "Journey to the Center of Massive Data: Digital Twins." If you missed the presentation, you can watch it on-demand and access a PDF of the slides at the SNIA Educational Library. Our audience asked several interesting questions, which are answered here in this blog.

Q. Will a digital twin make the physical twin more or less secure?

A. It depends on the implementation. If DTs are developed with security in mind, a DT can augment the security of the physical twin. For example, if the physical and digital twins are connected via an encrypted tunnel that carries all of the control, management and configuration traffic, then a firmware update of a simple sensor or actuator can include multi-factor authentication of the admin or strong authentication of the control application via features running in the DT, which augments the constrained environment of the physical twin (a simplified sketch of this idea follows the list of evolutionary steps below). However, because DTs are usually hosted on systems that are connected to the internet, ill-protected servers could expose a physical twin to a remote intruder. Therefore, security must be designed in from the start.

Q. What are some of the challenges of deploying digital twins?

A. Without AI frameworks and real-time interconnected data pipelines in place, the value of digital twins is limited.

Q. How do you see digital twins evolving in the future?

A. Here are a series of evolutionary steps:
  • From discrete DTs (for both pre- and post-production), to composite DTs (e.g., assembly lines, transportation systems), to organization DTs (e.g., supply chains, political parties).
  • From pre-production simulation, to operational dashboards of current state with human decisions and control, to limited autonomous control functions that ultimately eliminate the need for individual device-manager software separate from the DT.
  • In parallel, 2D DT content displayed on smartphones, tablets and PCs will move to 3D-rendered content on the same devices, then selectively to wearables (AR/VR) as that market matures, leading to visualized live data that can be manipulated by voice and gesture.
  • Over the next 10 years, I believe DTs will become the de facto graphical user interface (GUI) for machines, buildings, etc., in addition to the GUI for consumer and commercial process management.
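To make the secured control-path idea above concrete, here is a minimal, hypothetical sketch (not from the webcast, and not a production design): the digital twin signs a firmware payload on behalf of an authenticated control application and attaches a short-lived admin code, so the constrained physical twin only has to verify two lightweight checks. The key handling and TOTP-style code are illustrative assumptions; a real deployment would carry this over an encrypted tunnel with keys held in an HSM/TPM.

```python
import hashlib
import hmac
import secrets
import time

# Shared secrets provisioned out-of-band (illustrative only); in a real deployment
# these would live in an HSM/TPM and all traffic would cross an encrypted tunnel.
CONTROL_APP_KEY = secrets.token_bytes(32)  # authenticates the control application
ADMIN_TOTP_KEY = secrets.token_bytes(32)   # second factor tied to the admin

def one_time_code(key: bytes, window_s: int = 30) -> str:
    """Tiny TOTP-like code derived from the current time window."""
    counter = int(time.time() // window_s).to_bytes(8, "big")
    return hmac.new(key, counter, hashlib.sha256).hexdigest()[:6]

def dt_sign_update(payload: bytes) -> bytes:
    """Digital-twin side: strong authentication of the control application."""
    return hmac.new(CONTROL_APP_KEY, payload, hashlib.sha256).digest()

def physical_twin_accepts(payload: bytes, signature: bytes, admin_code: str) -> bool:
    """Constrained physical twin only performs two cheap checks before flashing."""
    app_ok = hmac.compare_digest(signature, dt_sign_update(payload))
    admin_ok = hmac.compare_digest(admin_code, one_time_code(ADMIN_TOTP_KEY))
    return app_ok and admin_ok

firmware = b"sensor-fw-v2.bin"
print(physical_twin_accepts(firmware, dt_sign_update(firmware), one_time_code(ADMIN_TOTP_KEY)))
# -> True only when both the app signature and the admin's one-time code verify
```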
Q. Can you expand on your example of data ingestion at the edge, please? Are you referring to data capture for transfer to a data center, or actual edge data capture and processing for the digital twin? If the latter, what use cases might benefit?

A. Where DTs are hosted and where AI processes are computed, like inference or training on time-series data, don't have to be the same server or even the same location. Nevertheless, the expected time-to-action and time-to-insight, plus how much data needs to be processed and the cost of moving that data, will dictate where digital twins are placed and how they are integrated within the control path and data path. For example, a high-speed robotic arm that must stop if a human puts their hand in the wrong space will likely have an attached or integrated smart camera that is capable of identifying (inferring) a foreign object. It will stop itself, and an associated DT will receive notice of the event after the fact. A digital twin of the entire assembly line may learn of the event from the robotic arm's DT and inject control commands to the rest of the assembly line to gracefully slow down or stop. Both the DT of the discrete robotic arm and the composite DT of the entire assembly line are likely executing on compute infrastructure on the premises in order to react quickly, whereas the "what if" capabilities of both types of DTs may run in the cloud or a local data center, as the optional simulation capabilities of the DT are not subject to real-time or near-real-time round-trip time-to-action constraints and may require more compute and storage capacity than is locally available. The point is that the "Edge" is a key part of the calculus that determines where DTs operate. Time-actionable insights, the cost of data movement, governance restrictions on data movement, the availability and cost of compute and storage infrastructure, plus access to data scientists, IT professionals and AI frameworks are increasingly driving more and more automation processing to the "Edge," and it's natural for DTs to follow the data.

Q. Isn't Google Maps also an example of a digital twin (especially when we use it to drive based on the directions we input and its real-time guidance)?

A. Good question! It is a digital representation of a physical process (a route to a destination) that ingests data from sensors (other vehicles whose operators are using Google Maps driving instructions along some portion of the route). So, yes. DTs are digital representations of physical things, processes or organizations that share data. Google Maps is an interesting example of a self-organizing composite DT, meaning lots of users acting as both sensors (i.e., discrete DTs) and selective digital viewers of the behavior of many physical cars moving through a shared space.

Q. You brought up an interesting subject around regulations and compliance. Considering that some constructions would require approvals from regulatory authorities, how would a digital twin (especially when we have pics that re-construct / re-model soft copies of the blueprints based on modifications identified through the 14-1500 pics) comply with regulatory requirements?

A. Some safety regulations in various regions of the world apply to processes, e.g. worker safety in factories. Time to certify is very slow, as lots of documentation is compiled and analyzed by humans. DTs could use live data to accelerate documentation, simulation or replays of real data within digital twins, and could potentially enable self-certification of new or reconfigured processes, assuming that regulatory bodies evolve.

Q. A digital twin captures the state of its partner in real time. What happens to aging data? Do we need to store data indefinitely?

A. Data retention can shrink as DTs and AI frameworks evolve to perform ongoing distributed AI model refreshing. As AI models refresh more dynamically, the increasingly rare anomalous events become the gold used for the next model refresh. In short, DTs should help reduce how much data is retained. Part of what a DT can be built to do is filter out compliance data for long-term archival.

Q. Do we not run a high risk when model and reality do not align? What if we trust the twin too much?

A. Your question targets more general challenges of AI. There is a small but growing cottage industry evolving in parallel with DTs and AI. Analysts refer to it as Explainable AI, whose intent is to explain to mere mortals how and why an AI model arrives at the predictions and decisions it makes. Your concern is valid, and for this reason we should expect that humans will likely remain in the control loop, wherein the DT doesn't act autonomously for non-real-time control functions.
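Referring back to the robotic-arm example above, here is a minimal, hypothetical Python sketch (not any SNIA or vendor API) of how an event might propagate after the fact from a discrete DT to the composite DT of the assembly line, which then injects a graceful slow-down to the other devices it models.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Event:
    source: str
    kind: str        # e.g. "foreign_object_detected"
    timestamp: float

@dataclass
class DiscreteTwin:
    """Digital twin of a single device, e.g. one robotic arm."""
    name: str
    parent: "CompositeTwin | None" = None
    state: str = "running"

    def notify(self, event: Event) -> None:
        # The physical arm has already stopped itself; its twin mirrors the new
        # state after the fact and escalates to the composite twin.
        self.state = "stopped"
        if self.parent is not None:
            self.parent.on_child_event(self, event)

@dataclass
class CompositeTwin:
    """Digital twin of the whole assembly line."""
    name: str
    children: list = field(default_factory=list)

    def add(self, twin: DiscreteTwin) -> None:
        twin.parent = self
        self.children.append(twin)

    def on_child_event(self, source: DiscreteTwin, event: Event) -> None:
        # Inject control commands to the rest of the line: graceful slow-down.
        for twin in self.children:
            if twin is not source and twin.state == "running":
                twin.state = "slowing"

line = CompositeTwin("assembly-line")
arms = [DiscreteTwin(f"arm-{i}") for i in range(3)]
for arm in arms:
    line.add(arm)

arms[0].notify(Event("arm-0", "foreign_object_detected", time.time()))
print([(t.name, t.state) for t in line.children])
# [('arm-0', 'stopped'), ('arm-1', 'slowing'), ('arm-2', 'slowing')]
```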


You’ve Been Framed! An Overview of Programming Frameworks

Alex McDonald

Oct 13, 2022

With the emergence of GPUs, xPUs (DPU, IPU, FAC, NAPU, etc.) and computational storage devices for host offload and accelerated processing, a panoramic wild west of frameworks is emerging, all vying to be one of the preferred programming software stacks that best integrates the application layer with these underlying processing units. On October 26, 2022, the SNIA Networking Storage Forum will break down what's happening in the world of frameworks in our live webcast, "You've Been Framed! An Overview of xPU, GPU & Computational Storage Programming Frameworks." We've convened an impressive group of experts who will provide an overview of programming frameworks that support:
  1. GPUs (CUDA, SYCL, OpenCL, oneAPI)
  2. xPUs (DASH, DOCA, OPI, IPDK)
  3. Computational Storage (SNIA computational storage API, NVMe TP4091 and FPGA programming shells)
We will discuss strengths, challenges and market adoption across these programming frameworks:
  • AI/ML: OpenCL, CUDA, SYCL, oneAPI
  • xPU: DOCA, OPI, DASH, IPDK
  • Core data path frameworks: SPDK, DPDK
  • Computational Storage: SNIA Standard 0.8 (in public review), TP4091
Register today and join us as we untangle the alphabet soup of these programming frameworks.


5G Industrial Private Networks and Edge Data Pipelines

Alex McDonald

Jan 5, 2022

The convergence of 5G, Edge Compute and Artificial Intelligence (AI) promises to be a catalyst for continued digital transformation. For many industries, it will be a game-changer in terms of how business is conducted. On January 27, 2022, the SNIA Cloud Storage Technologies Initiative (CSTI) will take on this topic at our live webcast "5G Industrial Private Networks and Edge Data Pipelines." Advanced 5G is specifically designed to address the needs of verticals with capabilities like enhanced mobile broadband (eMBB), ultra-reliable low latency communications (URLLC), and massive machine type communications (mMTC) to enable near real-time distributed intelligence applications. Examples include automated guided vehicles and autonomous mobile robots (AGVs/AMRs), wireless cameras, augmented reality for connected workers, and smart sensors across many verticals ranging from healthcare and immersive media to factory automation. Using this data, manufacturers are looking to maximize operational efficiency and process optimization by leveraging AI and machine learning. To do that, they need to understand and effectively manage the sources and trustworthiness of timely data. In this presentation, our SNIA experts will take a deep dive into:
  • How the Edge can be defined and the current state of the industry
  • How the Industrial Edge is being transformed
  • The foundational role 5G and Time-Sensitive Networking (TSN) play in Industry 4.0
  • How the convergence of high-performance wireless connectivity and AI creates new data-intensive use cases
  • How the right data pipeline layer provides persistent, trustworthy storage from edge to cloud
I encourage you to register today. Our experts will be ready to answer your questions.


Tom Friend

Oct 20, 2021

What types of storage are needed for different aspects of AI? That was one of the many topics covered in our SNIA Networking Storage Forum (NSF) webcast "Storage for AI Applications." It was a fascinating discussion and I encourage you to check it out on-demand. Our panel of experts answered many questions during the live roundtable Q&A. Here are answers to those questions, as well as the ones we didn't have time to address.

Q. What are the different data set sizes and workloads in AI/ML in terms of data set size, sequential/random access, and write/read mix?

A. Data sets will vary incredibly from use case to use case. They may be GBs to possibly 100s of PB. In general, the workloads are very read-heavy, maybe 95%+ reads. While it would be better to have sequential reads, in general the patterns tend to be closer to random. In addition, different use cases will have very different data sizes. Some items may be GBs large, while others may be <1 KB. The different sizes have a direct impact on storage performance and may change how you decide to store the data.

Q. More details on the risks associated with the use of online databases?

A. The biggest risk with using an online database is that you will be adding an additional workload to an important central system. In particular, you may find that the load is not as predictable as you think, and it may impact the database performance of the transactional system. In some cases this is not a problem, but when the system is intended for actual transactions, you could be hurting your business.

Q. What is the difference between a DPU and a RAID/storage controller?

A. A Data Processing Unit (DPU) is intended to process the actual data passing through it. A RAID/storage controller is only intended to handle functions such as data resiliency around the data, but not the data itself. A RAID controller might take a CSV file and break it down for storage on different drives; however, it does not actually analyze the data. A DPU might take that same CSV and look at the different rows and columns to analyze the data. While the distinction may seem small, there is a big difference in the software. A RAID controller does not need to know anything about the data, whereas a DPU must be programmed to deal with it. Another important aspect is whether or not the data will be encrypted. If the data will be encrypted, a DPU will have to have additional security mechanisms to deal with decryption of the data, whereas a RAID-based system will not be affected.

Q. Is a CPU-bypass device the same as a SmartNIC?

A. Not entirely. They are often discussed together, but a DPU is intended to process data, whereas a SmartNIC may only process how the data is handled (such as encryption, TCP/IP functions, etc.). It is possible for a SmartNIC to also act as a DPU where the data itself is processed. There are new NVMe-oF™ technologies that are beginning to allow FPGAs, TPUs, DPUs, GPUs and other devices direct access to other servers' storage over a high-speed local area network without having to access the CPU of that system.

Q. What work is being done to accelerate S3 performance with regard to AI?

A. A number of companies are working to accelerate the S3 protocol. Presto and a number of Big Data technologies use it natively. For AI workloads, there are a number of caching technologies to handle the re-reads of training on a local system, minimizing the performance penalty.

Q. From a storage perspective, how do I take different types of data from different storage systems to develop a model?

A. Work with your project team to find the data you need and ensure it can be served to the ML/DL training (or inference) environment in a timely manner. You may need to copy (or clone) data onto a faster medium to achieve your goals. But look at the process as a whole, and do not underestimate the data cleansing/normalization steps in your storage analysis, as they can prove to be a bottleneck.

Q. Do I have to "normalize" that data to the same type, or can a model accommodate different data types?

A. In general, yes. Models can be very sensitive. A model trained on one set of data with one set of normalizations may not be accurate if data taken from a different set with different normalizations is used for inference. This does depend on the model, but you should be aware not only of the model, but also of the details of how the data was prepared prior to training.

Q. If I have to change the data type, do I then need to store it separately?

A. It depends on your data: do other systems need it in the old format?

Q. Are storage solutions that are right for one form of AI also the best for others?

A. No. While it may be possible to use a single solution for multiple AIs, in general there are differences in the data that can necessitate different storage. A relatively simple example is large data (MBs) vs. small data (~1 KB). Data in the multiple-MB case can easily be erasure coded and stored more cost effectively. However, for small data, erasure coding is not practical and you generally will have to go with replication.

Q. How do features like CPU bypass impact performance of storage?

A. CPU bypass is essential for those times when all you need to do is transfer data from one peripheral to another without processing. For example, if you are trying to take data from a NIC and transfer it to a GPU, but not process the data in any way, CPU bypass works very well. It prevents the CPU and system memory from becoming a bottleneck. Likewise, on a storage server, if you simply need to take data from an SSD and pass it to a NIC during a read, CPU bypass can really help boost system performance. One important note: if you are well under the limits of the CPU, the benefits of bypass are small. So, think carefully about your system design and whether or not the CPU is a bottleneck. In some cases, people will use system memory as a cache, and in those cases bypassing the CPU isn't possible.

Q. How important is it to use all-flash storage compared to HDD or hybrid?

A. Of course, it depends on your workloads. For any single model, you may be able to make do with HDDs. However, another consideration for many AI/ML systems is that their use can quite suddenly expand. Once there is some amount of success, you may find that more people want access to the data and the system experiences more load. So beware of the success of these early projects, as the need to create multiple models from the same data could overload your system.

Q. Will storage for AI/ML necessarily be different from standard enterprise storage today?

A. Not necessarily. It may be possible for enterprise solutions today to meet your requirements. However, a key consideration is that if your current solution is barely able to handle its current requirements, then adding an AI/ML training workload may push it over the edge. In addition, even if your current solution is adequate, the sizes of many ML/DL models are growing exponentially every year. So, what you provision today may not be adequate in a year or even several months. Understanding the direction of the work your data scientists are pursuing is important for capacity and performance planning.
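To illustrate why the normalization details matter as much as the model itself, here is a small, hypothetical numpy sketch (not tied to any storage product or framework): the statistics computed on the training data have to be stored and reused at inference time, because re-deriving them from a new data source can silently mask a shift in scale or units.

```python
import numpy as np

def fit_normalizer(data: np.ndarray) -> dict:
    """Statistics that must be stored alongside the trained model."""
    return {"mean": data.mean(axis=0), "std": data.std(axis=0)}

def normalize(x: np.ndarray, stats: dict) -> np.ndarray:
    return (x - stats["mean"]) / stats["std"]

rng = np.random.default_rng(0)
train = rng.normal(loc=50.0, scale=10.0, size=(10_000, 4))   # original training source
other = rng.normal(loc=120.0, scale=18.0, size=(10_000, 4))  # second source, different units/scale

train_stats = fit_normalizer(train)

# Correct: viewed through the training-time statistics, the second source is
# clearly far outside the distribution the model was trained on (~7 sigma away).
print(round(float(normalize(other, train_stats).mean()), 2))

# Risky: re-normalizing each source with its own statistics hides the shift,
# so the model may silently produce confident but wrong predictions.
print(round(float(normalize(other, fit_normalizer(other)).mean()), 2))
```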


Storage for Applications Webcast Series

John Kim

Sep 8, 2021

Everyone enjoys having storage that is fast, reliable, scalable, and affordable. But it turns out different applications have different storage needs in terms of I/O requirements, capacity, data sharing, and security. Some need local storage, some need a centralized storage array, and others need distributed storage, which itself could be local or networked. One application might excel with block storage while another with file or object storage. For example, an OLTP database might require small amounts of very fast flash storage; a media or streaming application might need vast quantities of inexpensive disk storage with extra security safeguards; while a third application might require a mix of different storage tiers with multiple servers sharing the same data. This SNIA Networking Storage Forum "Storage for Applications" webcast series will cover the storage requirements for specific uses such as artificial intelligence (AI), database, cloud, media & entertainment, automotive, edge, and more. With limited resources, it's important to understand the storage intent of the applications in order to choose the right storage and storage networking strategy, rather than discovering the hard way that you've chosen the wrong solution for your application. We kick off this series on October 5, 2021 with "Storage for AI Applications." AI is a technology which itself encompasses a broad range of use cases, largely divided into training and inference. In this webcast, we'll look at what types of storage are typically needed for different aspects of AI, including different types of access (local vs. networked, block vs. file vs. object) and different performance requirements. And we will discuss how different AI implementations balance the use of on-premises vs. cloud storage. Tune in to this SNIA Networking Storage Forum (NSF) webcast to boost your natural (not artificial) intelligence about application-specific storage. Register today. Our AI experts will be waiting to answer your questions.


Can Cloud Storage and Big Data Live Happily Ever After?

Chip Maurer

Aug 31, 2021

"Big Data" has pushed the storage envelope, creating a seemingly perfect relationship with Cloud Storage. But local storage is the third wheel in this relationship, and it won't go down easy. Can this marriage survive when Big Data is being pulled in two directions? Should Big Data pick one, or can the three of them live happily ever after? This will be the topic of discussion on October 21, 2021 at our live SNIA Cloud Storage Technologies webcast, "Cloud Storage and Big Data, A Marriage Made in the Clouds." Join us as our SNIA experts cover:
  • A short history of Big Data
  • The impact of edge computing
  • The erosion of the data center
  • Managing data-on-the-fly
  • Grid management
  • Next-gen Hadoop and related technologies
  • Supporting AI workloads
  • Data gravity and distributed data
Register today! Our speakers will be ready to take your questions and black-tie is not required for this wedding!


Q&A on the Ethics of AI

Jim Fister

Mar 25, 2021

Earlier this month, the SNIA Cloud Storage Technologies Initiative (CSTI) hosted an intriguing discussion on the Ethics of Artificial Intelligence (AI). Our experts, Rob Enderle, Founder of The Enderle Group, and Eric Hibbard, Chair of the SNIA Security Technical Work Group, shared their experiences and insights on what it takes to keep AI ethical. If you missed the live event, it is available on-demand along with the presentation slides at the SNIA Educational Library. As promised during the live event, our experts have provided written answers to the questions from this session, many of which we did not have time to get to.

Q. The webcast cited a few areas where AI as an attacker could make a potential cyber breach worse. Are there also some areas where AI as a defender could make cybersecurity or general welfare more dangerous for humans?

A. Indeed, we addressed several different scenarios where AI runs at a speed of thought and reaction much faster than human reaction. Some areas we didn't address are the impact of AI on general cybersecurity. Phishing attacks using AI are getting more sophisticated, and an AI that can compromise systems with cameras or microphones has the ability to pick up significant amounts of information from users. As we continue to automate responses to attacks, there could be situations where an attacker is misidentified and an innocent person is charged by mistake. AI operates at large scale, sometimes making decisions on data that is not apparent to humans looking at the same data. This might cause an issue where an AI believes a human is in the wrong in ways that we could not otherwise see. An AI might also overreact to an attack; for instance, noticing an attempt to hack into a company's infrastructure and shutting down that infrastructure in an abundance of caution could leave workers with no power, lights or air conditioning. Some water-cooling systems, if shut down suddenly, will burst, and that could cause both safety and severe damage issues.

Q. What are some of the technical and legal standards that are currently in place that are trying to regulate AI from an ethics standpoint? Are legal experts actually familiar enough with AI technology and bias training to make informed decisions?

A. The legal community is definitely aware of AI. As an example, the American Bar Association Science and Technology Law Section's (ABA SciTech) Artificial Intelligence & Robotics Committee has been active since at least 2008. ABA SciTech is currently planning its third National Institute on Artificial Intelligence (AI) and Robotics for October 2021, in which AI ethics will figure prominently. That said, case law on AI ethics/bias in the U.S. is still limited, but it is expected to grow as AI becomes more prevalent in business decisions and operations. It is also worth noting that international standards on AI ethics/bias either exist or are under development. For example, the IEEE 7000 Standards Working Groups are already developing standards for the future of ethical intelligent and autonomous technologies. In addition, ISO/IEC JTC 1/SC 42 is developing AI and Machine Learning standards that include ethics/bias as an element.

Q. The webcast talked a lot about automated vehicles and the work done by companies in terms of safety as well as in terms of liability protection. Is there a possibility that these two conflict?

A. In the webcast we discussed the fact that autonomous vehicle safety requires a multi-layered approach that could include connectivity in-vehicle, with other vehicles, with smart city infrastructure, and with individuals' schedules and personal information. This is obviously a complex environment, and the current liability process makes it difficult for companies and municipalities to work together without encountering legal risk. For instance, let's say an autonomous car sees a pedestrian in danger and could place itself between the pedestrian and that danger, but it doesn't because the resulting accident could result in the vehicle attracting liability. Or, hitting ice on a corner, it turns control over to the driver so the driver is clearly responsible for the accident, even though the autonomous system could be more effective at reducing the chance of a fatal outcome.

Q. You didn't discuss much on AI as a teacher. Is there a possibility that AI could be used to educate students, and what are some of the ethical implications of AI teaching humans?

A. An AI can scale to individually focused custom teaching plans far better than a human could. However, AIs aren't inherently unbiased, and if they're corrupted through their training they will perform consistently with that training. If the training promotes unethical behavior, that is what the AI will teach.

Q. Could an ethical issue involving AI become unsolvable by current human ethical standards? What is an example of that, and what are some steps to mitigate that circumstance?

A. Certainly. Ethics are grounded in rules, and those rules aren't consistent and are in flux. These two conditions make it virtually impossible to assure the AI is truly ethical, because the related standard is fluid. Machines like immutable rules; ethics rules aren't immutable.

Q. I can't believe that nobody's brought up HAL from Arthur C. Clarke's 2001 book. Wasn't this a prototype of AI ethics issues?

A. We spent some time at the end of the session, where Jim mentioned that our "Socratic forebears" were some of the early science fiction writers such as Clarke and Isaac Asimov. We discussed Asimov's Three Laws of Robotics and how Asimov and others later theorized how smart robots could get around the three laws. In truth, there have been decades of thought on the ethics of artificial intelligence, and we're fortunate to be able to build on that as we address what are now real-world problems.


Cloud Analytics Drives Airplanes-as-a-Service Business

Jim Fister

Feb 25, 2021

On-demand flying through an app sounds like something for only the rich and famous, yet the use of cloud analytics is making flexible flying a reality at start-up airline KinectAir. On April 7, 2021, the CTO of KinectAir, Ben Howard, will join the SNIA Cloud Storage Technologies Initiative (CSTI) for a fascinating discussion on first-hand experiences of leveraging cloud analytics methods to bring new business models to life that are competitive and profitable. And since start-up companies may not have legacy data and analytics to consider, we'll also explore what established businesses using traditional analytics methods can learn from this use case. Join us on April 7th for our live webcast "Adapting Cloud Analytics for Practical Business Use" for views from both start-up and established companies on how to revisit the analytics decision process with a discussion on:
  • How to build and take advantage of a data ecosystem
  • Overcoming challenges and roadblocks
  • How to use cloud resources in unique ways to accomplish business and engineering goals
  • Considerations for business requirements and developing technical metrics
  • Thoughts on when to start new vs. adapt existing analytics processes
  • Real-world examples of cloud analytics and AI
Register today. Our panelists will be on-hand to answer questions. We hope to see you there.


The Effort to Keep Artificial Intelligence Ethical

Jim Fister

Feb 11, 2021

Artificial Intelligence (AI) technologies are possibly the most substantive and meaningful change to modern business. The ability to process large amounts of data with varying degrees of structure and form enables giant leaps in insight to drive revenue and profit. Likewise, governments and society have a significant opportunity to improve the lives of the populace through AI. However, with the power that AI brings come the risks of any technology innovation. The SNIA Cloud Storage Technologies Initiative (CSTI) will explore some of the ethical issues that can arise from AI at our live webcast on March 16, 2021, "The Ethics of Artificial Intelligence." Our expert speakers, Rob Enderle, President and Principal Analyst at The Enderle Group, and Eric Hibbard, Chair of the SNIA Security Technical Work Group, will join me for an interactive discussion on:
  • How making decisions at the speed of AI could be ethically challenging
  • Examples of how companies have structures in place to approach AI policy
  • The pitfalls of managing the human side of AI development
  • Potential legal implications of using AI to make decisions
  • Advice for addressing potential ethics issues before they are unsolvable
It’s sure to be an enlightening discussion on an aspect of AI that is seldom explored. Register today. We look forward to seeing you on March 16th.


5G Streaming Questions Answered

Michael Hoard

Dec 2, 2020


The broad adoption of 5G, internet of things (IOT) and edge computing are reshaping the nature and role of enterprise and cloud storage. Preparing for this significant disruption is important. It’s a topic the SNIA Cloud Storage Technologies Initiative covered in our recent webcast “Storage Implications at the Velocity of 5G Streaming,” where my colleagues, Steve Adams and Chip Maurer, took a deep dive into the 5G journey, streaming data and real-time edge AI, 5G use cases and much more. If you missed the webcast, it’s available on-demand along with a copy of the webcast slides.

As you might expect, this discussion generated some intriguing questions. As promised during the live presentation, our experts have answered them all here.

Q. What kind of transport do you see that is going to be used for those (5G) use-cases?

A. At a high level, 5G consists of three primary slices: enhanced mobile broadband (eMBB), ultra-reliable low latency communications (URLLC) and massive machine type communications (mMTC). Each of these is better suited for different use cases; for example, normal smartphone usage relies on eMBB, factory robotics relies on URLLC, and intelligent device or sensor applications like farming, edge computing and IOT rely on mMTC.

The primary 5G standards-making bodies include:

  • The 3rd Generation Partnership Project (3GPP) – formulates 5G technical specifications which become 5G standards. Release 15 was the first release to define 5G implementations, and Release 16 is currently underway.
  • The Internet Engineering Task Force (IETF) partners with 3GPP on the development of 5G and new uses of the technology. Particularly, IETF develops key specifications for various functions enabling IP protocols to support network virtualization. For example, IETF is pioneering Service Function Chaining (SFC), which will link the virtualized components of the 5G architecture—such as the base station, serving gateway, and packet data gateway—into a single path. This will permit the dynamic creation and linkage of Virtual Network Functions (VNFs).
  • The International Telecommunication Union (ITU), based in Geneva, is the United Nations specialized agency focused on information and communication technologies. ITU World Radio communication conferences revise the international treaty governing the use of the radio-frequency spectrum and the geostationary and non-geostationary satellite orbits.

To learn more, see

Q. What if the data source at the Edge is not close to where the signal is good to connect to cloud? And, I wonder how these algorithm(s) / data streaming solutions should be considered?

A. When we look at a 5G applications like massive Machine Type Communications (mMTC), we expect many kinds of devices will connect only occasionally, e.g. battery-operated sensors attached to farming water sprinklers or water pumps.  Therefore, long distance, low bandwidth, sporadically connected 5G network applications will need to tolerate long stretches of no-contact without losing context or connectivity, as well as adapt to variations in signal strength and signal quality.   

Additionally, 5G supports three broad ranges of wireless frequency spectrum: Low, Mid and High. The lower frequency range provides lower bandwidth for broader or more wide area wireless coverage.  The higher frequency range provides higher bandwidth for limited area or more focused area wireless coverage. To learn more, check out The Wired Guide to 5G.

On the second part of the question regarding algorithms and data streaming solutions, we anticipate streaming IOT data from sporadically connected devices can still be treated as a streaming data source from a data ingestion standpoint. It is likely to consist of broad snapshots (pre-stipulated time windows) with potential intervals of null data sets when compared with other types of data sources. Streaming data, regardless of the interval of data arrival, has value because of its "last known state" versus previously known states. Calculating trending data is one of the most common and meaningful ways to extract value and make decisions.
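As a purely illustrative sketch (the device names and values are made up), the snippet below shows the "last known state" plus trend calculation described above applied to sporadically connected sensors, where long silent intervals are expected rather than exceptional.

```python
from collections import defaultdict

# (timestamp_s, device_id, value) -- gaps and long silences are normal for
# sporadically connected, mMTC-style sensors; device names are made up.
readings = [
    (0,   "pump-1", 3.1),
    (30,  "pump-1", 3.4),
    (0,   "pump-2", 7.0),
    (600, "pump-1", 4.9),   # pump-1 reconnects after a ten-minute gap
    (900, "pump-2", 6.1),   # pump-2 was silent for fifteen minutes
]

last_state = {}                 # device -> (timestamp, value)
history = defaultdict(list)     # device -> [(timestamp, value), ...]

for ts, device, value in sorted(readings):
    last_state[device] = (ts, value)
    history[device].append((ts, value))

def trend_per_minute(samples):
    """Simple rate of change between the first and last known samples."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return 0.0 if t1 == t0 else (v1 - v0) / ((t1 - t0) / 60)

for device, samples in history.items():
    ts, value = last_state[device]
    print(f"{device}: last known value {value} at t={ts}s, trend {trend_per_minute(samples):+.2f}/min")
```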

Q. Is there an improvement with the latency in 5G from cloud to data center?

A. By 2023, we should see the introduction of 5G ultra reliable low latency connection (URLLC) capabilities, which will increase the amount of time-sensitive data ingested into and delivered from wireless access networks. This will increase demand for fronthaul and backhaul bandwidth to move time-sensitive data from remote radio units to baseband stations and aggregation points like metro area central offices.

As an example, to reduce latency, some hyperscalers have multiple connections out to regional co-location sites, central offices and in some cases sites near cell towers. To save on backhaul transport costs and improve 5G latency, some cloud service providers (CSP) are motivated to locate their networks as close to users as possible.

Independent of CSPs, we expect that backhaul bandwidth will increase to support the growth in wireless access bandwidth of 5G over 4G LTE. But it isn’t the only reason backhaul bandwidth is growing. COVID-19 revealed that many cable and fiber access networks were built to support much more download than upload traffic. The explosion in work and study from home, as well as video conferencing has changed the ratio of upload to download. So many wireline operators (which are often also wireless operators) are upgrading their backhaul capacity in anticipation that not everyone will go back to the office any time soon and some may hardly ever return to the office.

Q. Are the 5G speeds ensured from end-to-end (i.e., from mobile device to tower and within the MSP's infrastructure)? We understand most of the MSPs have improved the low-latency speeds between device and tower.

A. We expect specialized services like 5G ultra reliable low latency connection (URLLC) will help improve low latency and narrow jitter communications. As far as "assured," this depends on the service provider SLA. More broadly, 5G mobile broadband and massive machine type communications are typically best-effort networks, so generally there is no overall guaranteed or assured latency or jitter profile.

5G supports the largest range of radio frequencies. The high frequency range uses millimeter (mm) wave signals to deliver a theoretical max of 10 Gbps, which by default means reduced latency along with higher throughput. For more information on deterministic over-the-air network connections using 5G URLLC and TSN (Time Sensitive Networking), see this ITU presentation "Integration of 5G and TSN."

To provide a bit more detail, mobile devices communicate via wireless with Remote Radio Head (RRH) units co-located at the antenna tower site, while baseband unit (BBU) processing is typically hosted in local central offices.  The local connection between RRHs and BBUs is called the fronthaul network (from antennas to central office). Fronthaul networks are usually fiber optic supporting eCPRI7.2 protocol, which provide time sensitive network delivery. Therefore, this portion of the wireless data path is deterministic even if the over-the-air or other backhaul portions of the network are not.

Q. Do we use a lot of matrix calculations in streaming data, and do we have a circuit model for matrix calculations for convenience?

A. We see this applying case by case, based on the type of data. What we often see is that many edge hardware systems include extensive GPU support to facilitate matrix calculations for real-time analytics.
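As a rough illustration of the kind of matrix calculation involved (a generic numpy sketch, not tied to any particular edge GPU framework or circuit model), each new sample in a streaming window can be projected through a fixed weight matrix; this repeated matrix multiply is exactly the operation edge GPUs are used to accelerate.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sensors, window, n_features = 16, 128, 8

# Fixed weight matrix, e.g. a linear feature extractor exported from a trained model.
weights = rng.normal(size=(n_sensors, n_features))

# Rolling window of the most recent samples from the stream.
buffer = np.zeros((window, n_sensors))

def ingest(sample: np.ndarray) -> np.ndarray:
    """Shift the window, append the new sample, and run the matrix calculation.
    On edge hardware, this matmul is the part a GPU would accelerate."""
    global buffer
    buffer = np.vstack([buffer[1:], sample])
    return buffer @ weights          # (window x n_sensors) @ (n_sensors x n_features)

for _ in range(5):
    features = ingest(rng.normal(size=(1, n_sensors)))

print(features.shape)  # (128, 8): one feature vector per sample in the window
```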

Q. How do you see the deployment and benefits of Hyperconverged Infrastructure (HCI) on the edge?

A. Great question. The software flexibility of HCI can provide many advantages on the edge over dedicated hardware solutions. Ease of deployment, scalability and service provider support make HCI an attractive option. See this very informative article from TechTarget, "Why hyper-converged edge computing is coming into vogue," for more details.

Q. Can you comment on edge-AI accelerator usage and future potentials? What are the places these will be used?

A. Edge processing capabilities include many resources to improve AI capabilities. Things like computational storage and increased use of GPUs will only serve to improve analytics performance. Here is a great article on this topic.

Q. How important is high availability (HA) for edge computing?

A. For most enterprises, edge computing reliability is mission critical. Therefore, almost every edge processing solution we have seen includes complete and comprehensive HA capabilities.

Q. How do you see Computational Storage fitting into these Edge use cases?  Any recommendations on initial deployment targets?

A. The definition and maturity of computational storage is rapidly evolving and is targeted to offer huge benefits for management and scale of 5G data usage on distributed edge devices. First and foremost, 5G data can be used to train deep neural networks at higher rates due to the parallel operation of "in-storage processing." Petabytes of data may be analyzed in storage devices or within storage enclosures (not moved over the network for analysis). Secondly, computational storage may also accelerate the process of conditioning data or filtering out unwanted data.
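The filtering and conditioning benefit can be pictured with a small, purely conceptual Python sketch; the function names are invented for illustration and are not the SNIA computational storage API or TP4091. The point is simply that running the predicate next to the media means only the matching subset crosses the network.

```python
import random

random.seed(0)

# Illustrative stand-in for records sitting on a computational storage device;
# none of this reflects the SNIA computational storage API or NVMe TP4091.
RAW_RECORDS = [{"sensor": random.randrange(1000), "value": random.random()}
               for _ in range(100_000)]

def host_side_filter(threshold: float) -> list:
    """Conventional path: every raw record crosses the network, then the host filters."""
    transferred = list(RAW_RECORDS)          # models moving the full data set
    return [r for r in transferred if r["value"] > threshold]

def near_data_filter(threshold: float) -> list:
    """Computational-storage path: the same predicate runs next to the media,
    so only the (much smaller) matching subset ever leaves the device."""
    return [r for r in RAW_RECORDS if r["value"] > threshold]

hits = near_data_filter(0.999)
print(f"{len(hits)} of {len(RAW_RECORDS)} records returned "
      f"({100 * len(hits) / len(RAW_RECORDS):.2f}% of the raw data moved)")
```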

Q. Do you think that the QUIC protocol will be a standard for the 5G communication?

A. So far, TCP is still the dominant transport layer protocol in the industry. QUIC was initially proposed by Google and is widely adopted in the Chrome/Android ecosystem. QUIC is getting increased interest and adoption due to its performance benefits and ease of implementation (it can be implemented in user space and does not need OS kernel changes).

For more information, here is an informative SNIA presentation on the QUIC protocol.

Please note this is an active area of innovation. There are other methods, including Apple iOS devices using MPTCP, and for inter/intra data center communications RoCE (RDMA over Converged Ethernet) is also gaining traction, as it allows for direct memory access without consuming host CPU cycles. We expect TCP/QUIC/RDMA will all co-exist, and other new L3/L4 protocols will continue to emerge for next-generation workloads. The choice will depend on workloads, service requirements and system availability.

 
