Snowflake is holding its Summit 2023 event in Las Vegas this week where, not surprisingly, artificial intelligence is a hot topic.
In his keynote speech Tuesday, CEO Frank Slootman joked that there has been so much discussion about AI across the industry that “it’s almost like there’s only two letters left in the alphabet.”
“Generative AI, for the first time, is really going to democratise access to data. And that’s critical,” noted Benoit Dageville, Snowflake co-founder and president of product, during the keynote session.
“But Snowflake is all about data. We always say that in order to have an AI strategy, you have to have a data strategy,” Slootman said. “Our data strategy, no surprise, is the data cloud.”
Snowflake is unleashing a wave of new technologies and capabilities around the Snowflake platform this week, but the data cloud giant is emphasising three overarching “innovation themes” at the Summit 2023 event:
* Snowflake as a single platform for a broad range of use cases around accessing and understanding data, data security and protection, advanced analytics, and Snowflake platform performance and cost optimisation.
* Snowflake’s growing capabilities for distributing and monetising datasets and data-intensive applications.
* Enabling data programmability without having to trade off data security, governance or privacy.
Snowflake for data workloads
In his keynote speech Tuesday, Slootman noted the problem of data silos: islands of data created to support specific applications and workloads.
“This is an incredibly hard problem because silos are created just by running workloads. Silos are created by introducing new applications. And then you have to literally fight it. You have to have a very strong strategic posture towards a single data universe without boundaries,” the chief executive said.
“One of the things that we have invested most of our engineering resources in, in terms of enabling the data cloud, are workloads,” Slootman said.
“The whole point of the data cloud is the work needs to come to the data. We want to stop the data from going to the work because that endlessly silos and re-silos the world. If we can bring the workload capabilities full spectrum, from analytical to transactional to search and everything in between, then the data can stay put because the work can be executed effortlessly on that data,” he said.
“We really think applications should be built on the data cloud, not on a database.”
Slootman noted that the range of workloads the Snowflake Data Cloud runs today includes operational applications, data warehouses and data lakes, AI and machine learning, data engineering, data governance, collaboration and data security.
Snowflake extends programmability with new Snowpark Container Services
Snowpark is Snowflake’s developer environment that allows programmers to write code in their preferred language and run that code on the Snowflake platform.
At the Summit conference Snowflake unveiled a number of technology innovations that the company said extend data programmability for data scientists, data engineers and application developers so they can build software more quickly and more efficiently in the data cloud.
Topping the list is Snowpark Container Services, now in private preview, which the company said expands the Snowflake compute infrastructure to run a variety of workloads including full-stack applications, the hosting of large language models (LLMs), robust training and more.
“This is a huge expansion of Snowpark,” Slootman said in his keynote, noting that while customers are building new applications on Snowflake, many have older applications they don’t want to rewrite or recompile.
“We can take whole applications, whole sets of services, and we can run them inside the Snowflake governance perimeter.”
The new container services expand the scope of Snowpark to include what the company calls “broader infrastructure options” such as accelerated computing Nvidia GPUs and AI software to run more workloads within Snowflake’s Data Cloud, including a wider range of AI and machine learning models, APIs and internally developed applications.
The container services also provide customers with access to a catalog of third-party software and applications including LLMs, notebooks, machine learning (ML) operations tools and more within their accounts.
Related announcements included a public preview of a new set of Snowpark ML APIs for more efficient model development, a private preview of a Snowpark Model Registry for scalable MLOps, and an upcoming public preview of Streamlit in Snowflake for turning models into interactive applications.
New native application framework for developing, distributing and monetising apps
Snowflake said the Snowflake Native App Framework is now available as a public preview on Amazon Web Services for developers to build and test Snowflake-native applications.
The framework provides the necessary building blocks to more quickly develop, more easily deploy and effectively operate applications on the Snowflake data cloud.
Snowflake is also touting the ability to monetise applications running on the Snowflake Data Cloud through the Snowflake Marketplace.
Customers can run such applications within their Snowflake accounts and tap into data already maintained on the Snowflake platform.
“Every type of data application has historically required customers to move or copy their data and entrust it to third-party vendors, which is particularly problematic when customer data is highly sensitive,” said Christian Kleinerman, Snowflake senior vice president of product.
“The Snowflake Native App Framework reimagines the status quo, enabling developers to bring their apps directly to their customers’ data, without that data ever leaving the customer’s environment.”
The online marketplace now has more than 25 native applications from Bond Brand Loyalty, Capital One Software, Depository Trust & Clearing Corp. (DTCC), Goldman Sachs, LiveRamp, Matillion, My Data Outlet and others.
Snowflake and Nvidia team up
Snowflake and Nvidia have partnered to help joint customers build custom generative AI models and create customised generative AI applications using their own proprietary data within the Snowflake Data Cloud.
Snowflake and Nvidia are integrating the Nvidia NeMo platform with the Snowflake Data Cloud.
NeMo is Nvidia’s system for developing, customising and deploying LLMs with billions of parameters.
The integration, combined with Nvidia GPU-accelerated computing, will help businesses and organisations use data in their Snowflake accounts to build custom LLMs for advanced generative AI services including chatbots, search and summarisation, according to Snowflake.
The Nvidia alliance makes it possible to develop and customise LLMs that power business applications and services without moving proprietary governed data outside the Snowflake platform.
The technology will be able to work with hundreds of terabytes and even petabytes of raw and curated data, according to Snowflake.
Snowflake already offers industry-specific data clouds for health care and life sciences, manufacturing, financial services, retail and CPG, technology, government and education, telecommunications, and advertising, media and entertainment.
The integration with Nvidia NeMo will help customers develop customised generative AI applications for those industries, Snowflake said.
New LLM, Iceberg data tables support
Snowflake also unveiled a number of enhancements and new capabilities to the Snowflake Data Cloud to help businesses and organisations get more value from all of their data.
It’s estimated that 90 per cent of all data is unstructured, taking the form of documents, images, video, audio and more. In September 2022, Snowflake acquired Applica, which developed an AI platform for understanding unstructured data in documents.
Earlier this week, Snowflake launched Document AI, a new LLM based on Applica’s generative AI technology to help customers put their unstructured data to work.
Document AI is in private preview.
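To make the task concrete, consider the kind of extraction Document AI is meant to automate. The short sketch below is a hypothetical illustration, not Snowflake’s API: it pulls a couple of fields out of free-form invoice text using hand-written rules, which is exactly the brittle manual work an LLM-based service like Document AI aims to replace.

```python
import re

def extract_invoice_fields(text: str) -> dict:
    """Pull a few structured fields out of free-form invoice text.

    The field names and regexes here are illustrative assumptions,
    not part of any Snowflake product.
    """
    patterns = {
        "invoice_number": r"Invoice\s*#?\s*:?\s*([\w-]+)",
        "total": r"Total\s*:?\s*\$?([\d,]+\.\d{2})",
    }
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, re.IGNORECASE)
        fields[name] = match.group(1) if match else None
    return fields

sample = "Invoice #: INV-4417\nAcme Corp\nTotal: $1,249.50"
print(extract_invoice_fields(sample))
# → {'invoice_number': 'INV-4417', 'total': '1,249.50'}
```

An LLM-based approach replaces the per-field regexes with natural-language questions asked of the document, so new document layouts don’t require new rules.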
Data in Snowflake is structured using the company’s proprietary format, but many customers use the open-source Apache Iceberg table format.
At Summit, Snowflake unveiled Iceberg Tables, which help customers extend the value of the Snowflake Data Cloud to Iceberg data.
Organisations can work with their own data stored in the Iceberg format—whether the data is managed by Snowflake or externally—while still benefiting from the performance and unified governance capabilities of the Snowflake platform.
Snowflake also said it has seen a 15 per cent improvement in query duration over the last eight months for stable customer workloads on its platform, as measured by the company’s Snowflake Performance Index metric.