J Metz

Sep 14, 2020

Last month, the SNIA Cloud Storage Technologies Initiative was fortunate to have artificial intelligence (AI) expert Parviz Peiravi explore the topic of AI Operations (AIOps) at our live webcast, “IT Modernization with AIOps: The Journey.” Parviz explained why the journey to cloud native and microservices, and the complexity that comes along with it, requires a rethinking of enterprise architecture. If you missed the live presentation, it’s now available on demand together with the webcast slides. We had some interesting questions from our live audience. As promised, here are answers to them all:

Q. Can you please define the data lake and how it differs from other data storage models?

A. A data lake is another form of data repository with specific capabilities that allow data ingestion from different sources and of different data types (structured, unstructured and semi-structured), with the data stored as is and not transformed. The data transformation process follows an Extract, Load, Transform (ELT) pattern, i.e. schema on read, rather than the schema-on-write Extract, Transform and Load (ETL) pattern used in traditional database management systems (see the sketch at the end of this Q&A). See the definition of data lake in the SNIA Dictionary here. In 2005, Roger Mougalas coined the term Big Data; it refers to the large-volume, high-velocity data generated by the Internet and billions of connected intelligent devices that was impossible to store, manage, process and analyze with traditional database management and business intelligence systems. The need for high-performance data management systems and advanced analytics that can deal with a new generation of applications, such as Internet of Things (IoT), real-time applications and streaming apps, led to the development of data lake technologies. Initially, the term “data lake” referred to the Hadoop framework and its distributed computing and file system, which bring storage and compute together and allow faster data ingestion, processing and analysis. In today’s environment, “data lake” could refer to both physical and logical forms: a logical data lake could include Hadoop, a data warehouse (SQL/NoSQL) and object-based storage, for instance.

Q. One of the aspects of replacing and enhancing a brownfield environment is that there are different teams in the midst of different budget cycles. This makes greenfield very appealing. On the other hand, greenfield requires a massive capital outlay. How do you see the percentages of either scenario working out in the short term?

A. I do not have an exact percentage, but the majority of enterprises have long-standing brownfield implementations in place. In order to develop and deliver new capabilities with velocity, greenfield approaches are gaining significant traction. Most new application development based on microservices/cloud native is being implemented in greenfield to reduce risk and cost, using the cloud resources available today at a smaller scale at first and adding more resources later.

Q. There is a heavy reliance upon mainframes in banking environments. There’s quite a bit of error that has been eliminated through decades of best practices. How do we ensure that we don’t build in error because these models are so new?

A. The compelling reasons behind mainframe migration, beside the cost, are the ability to develop and deliver new application capabilities and business services and to make data available to all other applications. There are four methods for mainframe migration:
  • Data migration only
  • Re-platforming
  • Re-architecting
  • Re-factoring
Each approach provides enterprises with different degrees of risk and freedom. Applying best practices to both application design/development and operational management is the best way to ensure a smooth application migration from a monolith to a new distributed environment such as microservices/cloud native. Data architecture plays a pivotal role in the design process, in addition to applying a Continuous Integration and Continuous Delivery (CI/CD) process.

Q. With the changes into a monolithic data lake, will we be seeing different data lakes with different security parameters, which just means that each lake is simply another data repository?

A. If we follow a domain-driven design principle, you could have multiple data lakes with specific governance and security policies appropriate to each domain. Multiple data lakes could be accessed through data virtualization to mimic a monolithic data lake; this approach is based on a logical data lake architecture.

Q. What’s the difference between multiple data lakes and multiple data repositories? Isn’t it just a matter of quantity?

A. Looking from a Big Data perspective, a data lake not only stores data but also provides capabilities to process and analyze it (e.g., the Hadoop framework/HDFS). New trends are emerging that separate storage and compute (e.g., disaggregated storage architectures); hence some vendors use the term “data lake” loosely and offer only storage capability, while others provide both storage and data processing capabilities as an integrated solution. What is more important than the definition of a data lake is your usage and specific application requirements, which determine which solution is a good fit for your environment.
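To make the ELT/ETL distinction from the first answer concrete, here is a minimal, hypothetical Python sketch (the input file, column names and schema are illustrative, not from the webcast): schema-on-write validates and reshapes records before they are stored, while schema-on-read lands the raw data first and applies a schema only when it is queried.

```python
import csv
import json

# Schema-on-write (ETL): transform and validate BEFORE loading into the warehouse.
def etl_load(raw_rows, warehouse):
    for row in raw_rows:
        record = {
            "device_id": int(row["device_id"]),   # enforce types up front
            "temp_c": float(row["temperature"]),
            "ts": row["timestamp"],
        }
        warehouse.append(record)                  # only conforming records are stored

# Schema-on-read (ELT): land the raw data as is, interpret it at query time.
def elt_land(raw_rows, data_lake):
    data_lake.extend(raw_rows)                    # store untransformed records

def query_avg_temp(data_lake):
    temps = [float(r["temperature"]) for r in data_lake if "temperature" in r]
    return sum(temps) / len(temps) if temps else None  # schema applied only here

if __name__ == "__main__":
    rows = list(csv.DictReader(open("sensor_readings.csv")))  # hypothetical input file
    warehouse, lake = [], []
    etl_load(rows, warehouse)
    elt_land(rows, lake)
    print(json.dumps({"avg_temp_c": query_avg_temp(lake)}))
```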


The Role of AIOps in IT Modernization

Alex McDonald

Jul 23, 2020

The almost overnight shift of resources toward remote work has introduced the need for far more flexible, dynamic and seamless end-to-end applications, putting us on a path that requires autonomous capabilities using AIOps – Artificial Intelligence for IT Operations. It’s the topic that the SNIA Cloud Storage Technologies Initiative is going to cover on August 25, 2020 at our live webcast, “IT Modernization with AIOps: The Journey.” Our AI expert, Parviz Peiravi, will provide an overview of concepts and strategies to accelerate the digitalization of critical enterprise IT resources, and help architects rethink what applications and underlying infrastructure are needed to support an agile, seamless, data-centric environment. This session will specifically address migration from monolithic to microservices, the transition to Cloud Native services, and the platform requirements to help accelerate AIOps application delivery within our dynamic hybrid and multi-cloud world. Join this webcast to learn:
  • Use cases and design patterns: Data Fabrics, Cloud Native and the move from Request Driven to Event Driven
  • Foundational technologies supporting observability: how to build a more consistent, scalable framework for governance and orchestration
  • The nature of an AI data-centric enterprise: data sourcing, ingestion, processing, and distribution
This webcast will be live, so please bring your questions. We hope to see you on August 25th. Register today.
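As a side note on the request-driven versus event-driven distinction in the first bullet, here is a minimal, hypothetical Python sketch (the inventory example, event shape and function names are invented for illustration): a request-driven service answers only when it is called, while an event-driven consumer reacts to events as they are published.

```python
from queue import Queue

# Request driven: the caller asks and waits for an answer.
def handle_request(inventory, item):
    return inventory.get(item, 0)               # respond only when asked

# Event driven: producers publish events; consumers react whenever they arrive.
events = Queue()

def publish(event):
    events.put(event)

def consume(inventory):
    while not events.empty():
        event = events.get()
        if event["type"] == "item_received":
            inventory[event["item"]] = inventory.get(event["item"], 0) + event["qty"]

if __name__ == "__main__":
    inventory = {}
    publish({"type": "item_received", "item": "ssd", "qty": 5})
    consume(inventory)                          # state updated by reacting to the event
    print(handle_request(inventory, "ssd"))     # 5, returned on demand
```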


A Q&A on the Impact of AI

Alex McDonald

Jun 15, 2020


It was April Fools’ Day, but the Artificial Intelligence (AI) webcast the SNIA Cloud Storage Technologies Initiative (CSTI) hosted on April 1st was no joke! We were fortunate to have AI experts, Glyn Bowden and James Myers, join us for an interesting discussion on the impact AI is having on data strategies. If you missed the live event, you can watch it here on-demand. The audience asked several great questions. Here are our experts’ answers:

Q. How does the performance requirement of the data change from its capture at the edge through to its use?

A. That depends a lot on what purpose the data is being captured for. For example, consider a video analytics solution to capture real-time activities. The data transfer will need to be low latency to get the frames to the inference engine as quickly as possible. However, there is less of a need to protect that data, as if we lose a frame or two it’s not a major issue. Resolution and image fidelity are already likely to have been sacrificed through compression. Now think of financial trading transactions. It may be we want to do some real-time work against them to detect fraud, or feed back into a market prediction engine; however, we may just want to push them into an archive. In this case, as long as we can push the data through the acquisition function quickly, we don’t want to cause issues for processing new incoming data and have side effects like filling up of caches, etc., so we don’t need to be too concerned with performance. However, we MUST protect every transaction. This means that each piece of data and its use will dictate the performance, protection and any other requirements that apply as it passes through the pipeline.
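As a rough illustration of that point, here is a minimal, hypothetical Python sketch (the data classes and policy values are invented for illustration): each class of data carries its own performance and protection requirements, which the pipeline consults instead of applying one blanket policy.

```python
from dataclasses import dataclass

@dataclass
class PipelinePolicy:
    max_latency_ms: int     # how quickly the data must reach the next stage
    must_not_lose: bool     # whether every record has to be protected
    retention_days: int     # how long to keep it after processing

# Hypothetical policies for the two examples discussed above.
POLICIES = {
    "video_frame": PipelinePolicy(max_latency_ms=50, must_not_lose=False, retention_days=1),
    "trade_txn":   PipelinePolicy(max_latency_ms=500, must_not_lose=True, retention_days=2555),
}

def route(record_type, payload):
    policy = POLICIES[record_type]
    if policy.must_not_lose:
        persist_durably(payload)                # e.g. replicate before acknowledging
    forward_to_next_stage(payload, deadline_ms=policy.max_latency_ms)

def persist_durably(payload):
    pass  # placeholder: write-ahead log, replication, etc.

def forward_to_next_stage(payload, deadline_ms):
    pass  # placeholder: send to the inference engine or archive within the deadline
```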

Q. We need to think of security: who is seeing the data resource?

A. Security and governance are key to building a successful and flexible data pipeline. We can no longer assume that data will only have one use, or that we know in advance all the personas who will access it; hence we won’t know in advance how to protect the data. So, each step needs to consider how the data should be treated and protected. The security model is one where the security profile of the data is applied to the data itself and not to any individual storage appliance that it might pass through. This can be done with the use of metadata and signing to ensure you know exactly how a particular data set, or even object, can and should be treated. The upside to this is that you can also build very good data dictionaries using this metadata, and make discoverability and audit of use much simpler. And with that sort of metadata, the ability to couple data to locations through standards such as the SNIA Cloud Data Management Interface (CDMI) brings real opportunity.
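A minimal, hypothetical Python sketch of that idea (the field names and the use of an HMAC are assumptions for illustration, not part of CDMI): the handling policy travels with the object as signed metadata, so any system the object passes through can verify and honor it.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-only"  # in practice this would come from a key management system

def wrap_object(payload, classification, allowed_roles):
    """Attach a handling policy to the data itself and sign it."""
    metadata = {
        "classification": classification,        # e.g. "pii", "public"
        "allowed_roles": allowed_roles,           # who may read the object
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    canonical = json.dumps(metadata, sort_keys=True).encode()
    signature = hmac.new(SIGNING_KEY, canonical, hashlib.sha256).hexdigest()
    return {"payload": payload, "metadata": metadata, "signature": signature}

def verify_and_authorize(obj, role):
    """Any hop in the pipeline can check the policy without trusting the storage layer."""
    canonical = json.dumps(obj["metadata"], sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, canonical, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, obj["signature"]):
        return False                              # metadata was tampered with
    return role in obj["metadata"]["allowed_roles"]

record = wrap_object(b"account=42;balance=100", "pii", ["fraud_analyst"])
print(verify_and_authorize(record, "fraud_analyst"))  # True
print(verify_and_authorize(record, "marketing"))      # False
```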

Q. Great overview on the inner workings of AI. Would a company’s Blockchain have a role in the provisioning of AI?

A. Blockchain can play a role in AI. There are vendors with patents around Blockchain’s use in distributing training features so that others can leverage trained weights and parameters for refining their own models without the need to have access to the original data. Now, is blockchain a requirement for this to happen? No, not at all. However, it can provide a method to assess the provenance of those parameters and ensure you’re not being duped into using polluted weights.

Q. It looks like everybody is talking about AI, but thinking about pattern recognition/machine learning. The biggest differentiator for human intelligence is making a decision and acting on its own, without external influence. Little children are a good example. Can AI make decisions on its own right now?

A. Yes and no. Machine Learning (ML) today results in a prediction and a probability of its accuracy. So that’s only one stage of the cognitive pipeline that leads from observation, to assessment, to decision and ultimately action. Basically, ML on its own provides the assessment and decision capability. We then write additional components to translate that decision into actions. That doesn’t need to be a “Switch/Case” or “If this then that” situation. We can plug the outcomes directly into the decision engine so that the ML algorithm is selecting the desired outcome directly. Our extra code just tells it how to go about that. But today’s AI has a very narrow focus. It’s not general intelligence that can assess entirely new features without training and then infer from previous experience how it should interpret them. It is not yet capable of deriving context from past experiences and applying it to new and different experiences.
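A minimal, hypothetical Python sketch of that assessment-to-action chain (the model, labels and actions are invented for illustration): the ML step only yields a prediction and a confidence, and separate code maps that decision onto an action.

```python
import random

def ml_assess(transaction):
    """Stand-in for a trained model: returns a label and a confidence score."""
    fraud_probability = random.random()          # a real model would compute this
    if fraud_probability > 0.5:
        return "fraud", fraud_probability
    return "legitimate", 1.0 - fraud_probability

ACTIONS = {
    "fraud": lambda txn: f"block and escalate {txn['id']}",
    "legitimate": lambda txn: f"approve {txn['id']}",
}

def decide_and_act(transaction):
    label, confidence = ml_assess(transaction)   # assessment + decision from the model
    if confidence < 0.6:                         # low confidence: defer to a human
        return f"queue {transaction['id']} for manual review"
    return ACTIONS[label](transaction)           # our extra code turns the decision into an action

print(decide_and_act({"id": "txn-001", "amount": 250.0}))
```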

Q. Shouldn’t there be a path for the live data (or some cleaned-up version or output of the inference) to be fed back into the training data to evolve and improve the training model?

A. Yes, there should be. Ideally you will capture it in a couple of places. One would be your live pipeline. If you are using something like Kafka to do the pipelining, you can split the data to two different locations and persist one in a data lake or archive and process the other through your live inference pipeline. You might also then want your inference results pushed out to the archive as well, as this could be a good source of “training data”; it’s essentially labelled and ready to use. Of course, you would need to manually review this, because if there is inaccuracy in the model, a few false positives can reinforce that inaccuracy.
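A minimal, hypothetical sketch of that split using the kafka-python client (the broker address, topic names and message format are assumptions for illustration): the same record is published to an archive topic and to a live inference topic, and the inference result is written back to the archive as candidate training data.

```python
import json
from kafka import KafkaProducer  # assumes the kafka-python package and a reachable broker

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def ingest(record):
    """Fan the same record out to the archive and to the live inference pipeline."""
    producer.send("raw-archive", record)         # persisted in the data lake/archive
    producer.send("live-inference", record)      # consumed by the inference service

def publish_inference_result(record, label, confidence):
    """Push labelled results back to the archive as candidate training data."""
    producer.send("raw-archive", {
        "source": "inference",
        "record": record,
        "label": label,                          # still needs manual review before retraining
        "confidence": confidence,
    })

ingest({"frame_id": 42, "camera": "dock-3"})
publish_inference_result({"frame_id": 42, "camera": "dock-3"}, label="forklift", confidence=0.87)
producer.flush()
```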

Q. Can the next topic focus be on pipes and new options?

A. Great idea. In fact, given the popularity of this presentation, we are looking at a couple more webcasts on AI. There’s a lot to cover! Follow us on Twitter @sniacloud_com for dates of future webcasts.


How AI Impacts Storage and IT

Alex McDonald

Mar 13, 2020

Artificial intelligence (AI) and machine learning (ML) have had quite the impact on most industries in the last couple of years, but what about the effect on our own IT industry? On April 1, 2020, the SNIA Cloud Storage Technologies Initiative will host a live webcast, “The Impact of Artificial Intelligence on Storage and IT,” where our experts will explore how AI is changing the nature of applications, the shape of the data center, and its demands on storage. Learn how the rise of ML can develop new insights and capabilities for IT operations. In this webcast, we will explore:
  • What is meant by Artificial Intelligence, Machine Learning and Deep Learning?
  • The AI market opportunity
  • The anatomy of an AI Solution
  • Typical storage requirements of AI and the demands on the supporting infrastructure
  • The growing field of IT operations leveraging AI (aka AIOps)
Yes, we know this is on April 1st, but it’s no joke! So, don’t be fooled and find out why everyone is talking about AI now. Register today.



AI, Machine Learning and Natural Language Processing in Action

Alex McDonald

Mar 1, 2018

SNIA Cloud Storage recently hosted a fascinating webcast on the real-world use of IBM Watson, the computer that mesmerized viewers on “Jeopardy!” by answering questions accurately and faster than its human competitors. Our webcast, “Customer Support through Natural Language Processing and Machine Learning,” detailed how Watson is being used as a virtual support assistant, named Elio, at NetApp. We had many interesting questions during the live event, which is now available on-demand. Here are answers to them all from our expert presenters who have been driving the success of Elio: Ross Ackerman from NetApp and Robin Marcenac from IBM.

Q. Why did NetApp build Elio?

A. Elio is an example of NetApp’s data-driven innovation. NetApp experts train Elio on our case histories. Using Watson cognitive computing, Elio provides top answers and solves problems, on average, four times faster than traditional methods.

Q. What happens if Elio can’t answer my question?

A. Elio asks you whether you would like to chat with a technical support engineer or create a case. During technical support hours, Elio can route you to a technical support engineer for additional help. After hours, Elio opens a case, and an engineer contacts you during staffed hours.

Q. If I start a chat or open a case, will I always chat with Elio?

A. Not always. When you start a chat, you fill out a brief intake form to help NetApp understand what kind of question you have. If Elio has been trained on the topic, you will chat with Elio. If Elio hasn’t been trained on the topic, during technical support hours you will chat with a technical support engineer; outside technical support hours, Elio will open a case and an engineer will contact you.

Q. What does Elio do when a critical issue is reported, where usually, as a customer, you want to speak with a real live person?

A. At the beginning of the case creation workflow, users are informed (before being routed to Elio) that they should contact NetApp Support directly for any critical, or “P1”, issues. As an additional safeguard, Elio is trained to understand these “P1” issues, such as a down filer situation. In these scenarios Elio will immediately determine that the user needs support from a live engineer and will route them accordingly to the remaining case creation process or to a live chat session.

Q. Can you provide us with Elio’s effectiveness in correctly answering a technical question?

A. Elio is providing answers, guidance, and recommendations four times faster than methods available without Elio’s assistance. Elio has freed up thousands of hours for our support engineers to provide higher-value, human-to-human, decision-based and relationship-based interactions.

Q. What has been NetApp customers’ reaction to Elio?

A. Customers are using Elio with success and getting answers to problems and questions in real time from their channel of choice. Customers have a choice to interact with Elio or not. Customers are showing their reaction to Elio through their growing use of it; Elio has engaged and assisted in answering thousands of customer questions to date.

Q. NetApp Elio: does the customer have a choice not to use it?

A. Elio is embedded within both the case creation and live chat workflows on the NetApp Support Site because he is trained and capable of providing answers and solutions to commonly asked questions from our customers and partners. That said, if a user does not wish to chat with Elio, they can simply bypass the conversation by clicking a “Skip” button.

Q. How did Elio get his name?

A. Elio’s first job at NetApp was with SolidFire, NetApp’s flash-based cloud infrastructure product. After the acquisition of SolidFire by NetApp, Elio found a great opportunity within NetApp’s customer support team to be its Virtual Support Assistant and first responder for digital support assistance. Elio was promoted to be customer facing and to interact directly with customers in September 2017. We look forward to many more great outcomes from Elio in his career.
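As a rough illustration of the intake-and-routing flow described in the answers above, here is a minimal, hypothetical Python sketch (the topic list, business-hours logic and function names are invented, not NetApp’s implementation):

```python
from datetime import datetime

ELIO_TRAINED_TOPICS = {"snapshot restore", "ontap upgrade", "license renewal"}  # illustrative

def is_support_hours(now):
    return now.weekday() < 5 and 8 <= now.hour < 18   # hypothetical staffed hours

def route_chat(topic, severity, now):
    if severity == "P1":
        return "live engineer"                         # critical issues bypass the assistant
    if topic in ELIO_TRAINED_TOPICS:
        return "chat with Elio"
    if is_support_hours(now):
        return "chat with a technical support engineer"
    return "open a case for follow-up during staffed hours"

print(route_chat("snapshot restore", "P3", datetime(2018, 3, 1, 10)))  # chat with Elio
print(route_chat("down filer", "P1", datetime(2018, 3, 1, 2)))         # live engineer
```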


Watson: From Jeopardy! to Digital Support Assistant

Alex McDonald

Jan 29, 2018

When IBM Watson premiered on “Jeopardy!”, viewers were mesmerized by Watson’s ability to answer the quiz show’s questions and, most times, beat the human contestants! Fast-forward to today and the real-world applications extend well beyond playing trivia games. Watson is being deployed in a variety of medical and business scenarios. In fact, NetApp is now using Watson as part of Elio, a virtual support assistant that responds to queries in natural language. Elio is built using Watson’s cognitive computing capabilities, which enable Elio to analyze unstructured data by using natural language processing to understand grammar and context, interpret complex questions, and evaluate all possible meanings to determine what is being asked. Elio then reasons and identifies the best answers to questions with help from experts who monitor the quality of answers and continue to train Elio on more subjects. It’s a fascinating application of artificial intelligence (AI) that we will discuss in detail at our SNIA Cloud Storage webcast on February 22, 2018, “The Future of Digital Support: The Application of Unstructured Data and AI.” Elio and Watson represent an innovative and novel use of large quantities of unstructured data to help solve problems, on average, four times faster than traditional methods. Join us at this webcast, where those on the front lines of this innovative application will discuss:
  • The challenges of utilizing large quantities of valuable yet unstructured data
  • How Watson and Elio continuously learn as more data arrives and navigate an ever-growing volume of technical information
  • How Watson understands customer language and provides understandable responses
Learn how these new and exciting technologies are changing the way we look at and interact with large volumes of traditionally hard-to-analyze data. Register now! We look forward to seeing you on Feb. 22nd.

