Each fall, Snowflake hosts its Snowday event to share new product announcements with data enthusiasts and tech aficionados. As November rolled around this year, Snowflake once again took center stage to unveil the latest advancements in the Data Cloud, including many great features for AI and machine learning.
From cutting-edge innovations in MLOps to powerful integrations with Large Language Models (LLMs), Snowflake’s event was chock full of exciting announcements for Data Scientists and ML Engineers.
In this post, we’ll recap some of the announcements that we’re most excited about in the AI/ML space.
Snowflake Cortex shows just how much Snowflake is investing in its AI capabilities and how quickly they’re making progress. Snowflake Cortex sits beneath the rich AI-enhanced experiences announced at Snowday, including Snowflake Copilot (see below), Universal Search, and Document AI.
Going one step deeper, Cortex powers LLM (and traditional ML) functionality that can be called using Snowflake SQL. The LLM functions cover some of the most common LLM use cases, including summarization, translation, answer extraction, and sentiment analysis.
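To make the idea of calling LLM functions from SQL more concrete, here is a minimal Python sketch that only builds the query strings and does not connect to Snowflake. The function names used (`SNOWFLAKE.CORTEX.SUMMARIZE`, `SENTIMENT`, `TRANSLATE`, `EXTRACT_ANSWER`) are taken from Snowflake's Cortex documentation rather than from the Snowday announcement, and the `reviews` table, `review_text` column, and `cortex_query` helper are hypothetical names chosen for illustration.

```python
# Templates for the task-specific Cortex LLM functions.
# Table and column names used below are hypothetical placeholders.
CORTEX_TEMPLATES = {
    "summarize": "SNOWFLAKE.CORTEX.SUMMARIZE({col})",
    "sentiment": "SNOWFLAKE.CORTEX.SENTIMENT({col})",
    "translate": "SNOWFLAKE.CORTEX.TRANSLATE({col}, '{src}', '{dst}')",
    "extract_answer": "SNOWFLAKE.CORTEX.EXTRACT_ANSWER({col}, '{question}')",
}


def cortex_query(task, table, col, **params):
    """Return a SELECT statement that applies one Cortex function to a column."""
    if task not in CORTEX_TEMPLATES:
        raise ValueError(f"unknown task: {task}")
    # Double any single quotes so string arguments stay valid SQL literals.
    safe = {k: str(v).replace("'", "''") for k, v in params.items()}
    expr = CORTEX_TEMPLATES[task].format(col=col, **safe)
    return f"SELECT {col}, {expr} AS {task}_result FROM {table}"


print(cortex_query("sentiment", "reviews", "review_text"))
print(cortex_query("translate", "reviews", "review_text", src="de", dst="en"))
```

The resulting strings could be pasted into a worksheet or passed to a Snowflake session; in real code, bind parameters would be preferable to string formatting.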
For more advanced users and prompt-engineering use cases, Snowflake Cortex provides a set of Generalized Functions that expose Llama-2, including COMPLETE, TEXT2SQL, EMBED_TEXT, and VECTOR_L2_DISTANCE (vector similarity). These Generalized Functions will allow users to implement entire LLM applications – including architectures like Retrieval Augmented Generation (RAG) – without ever needing to deploy or host their own LLM.
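For readers new to RAG, the retrieval step can be sketched in plain Python. This is an illustration of the pattern, not Snowflake code: the hand-written three-dimensional vectors stand in for what an embedding function such as EMBED_TEXT would produce, `l2_distance` plays the role of VECTOR_L2_DISTANCE, and the assembled prompt is what would be handed to a completion function such as COMPLETE. All names and data below are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_vec, docs, k=2):
    """Return the k documents whose embeddings are closest to query_vec."""
    ranked = sorted(docs, key=lambda d: l2_distance(query_vec, d["embedding"]))
    return ranked[:k]


def build_prompt(question, context_docs):
    """Assemble the prompt that would be sent to a completion function."""
    context = "\n".join(f"- {d['text']}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy embeddings, made up for illustration only.
DOCS = [
    {"text": "Invoices are due within 30 days.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "The office is closed on Sundays.", "embedding": [0.0, 0.8, 0.2]},
    {"text": "Late invoices incur a 2% fee.", "embedding": [0.8, 0.2, 0.1]},
]

query_embedding = [0.9, 0.1, 0.05]  # stand-in for the embedded question
top = retrieve(query_embedding, DOCS, k=2)
print(build_prompt("When are invoices due?", top))
```

A smaller distance means a closer match, so the two invoice-related documents are selected as context and the unrelated one is left out of the prompt.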
Building on top of Snowflake Cortex, Snowflake Copilot is the new coding assistant tailor-made for the Data Cloud. Snowflake Copilot provides a natural-language chat interface that allows users to ask questions about how to query their data.
Copilot outputs SQL queries and provides buttons to add that SQL to your Snowflake worksheet and run the query. The demo for Snowflake Copilot showcased its ability to go deep into metadata stored in Snowflake to construct complex queries. We’re excited about how Snowflake Copilot uses Cortex to democratize access to data using AI.
Snowpark Container Services
Snowpark Container Services gives developers the ultimate flexibility to deploy any application on Snowflake. We talked a lot about it after Summit 2023, but the demos from Snowday made us even more excited.
Snowflake showed how Snowpark Container Services could be used to host a containerized LLM with Ray Serve exposed to a native React application – all within Snowflake’s secure perimeter. The capability to bring such powerful applications to our most compelling datasets is what we see as a driving force in the next generation of AI.
Snowflake Feature Store
Feature Stores are an important component of an MLOps platform. A feature store allows Data Scientists and ML Engineers to develop features for training ML models, serve those features for inference, and share features with other team members working on different models or applications. We’ve written about feature store architectures on Snowflake in the past, but those architectures needed to rely on many third-party tools.
The Snowflake Feature Store greatly simplifies the technology stack while also providing great functionality for feature pipelines on Snowflake compute. Because it strictly uses pre-existing Snowflake objects, it can be used by installing a simple Python library. Features can be defined in SQL or in Python, using Snowpark DataFrames to define dynamic tables, and those tables can be orchestrated and materialized on a user-defined schedule.
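To give a feel for what a feature backed by a dynamic table might look like, the sketch below builds the kind of `CREATE DYNAMIC TABLE` statement that such a feature definition could correspond to. It is not the Feature Store library's API, which is not shown here; it only illustrates the underlying Snowflake object, where `TARGET_LAG` controls how fresh the materialized result is kept. The table, column, and warehouse names and the `dynamic_table_ddl` helper are hypothetical.

```python
def dynamic_table_ddl(name, query, warehouse, target_lag="1 hour"):
    """Build a CREATE DYNAMIC TABLE statement for a feature query."""
    return (
        f"CREATE OR REPLACE DYNAMIC TABLE {name}\n"
        f"  TARGET_LAG = '{target_lag}'\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"AS\n{query}"
    )


# Hypothetical feature: each customer's order count and total spend.
FEATURE_QUERY = (
    "SELECT customer_id,\n"
    "       COUNT(*) AS order_count,\n"
    "       SUM(amount) AS total_spend\n"
    "FROM orders\n"
    "GROUP BY customer_id"
)

ddl = dynamic_table_ddl("customer_order_features", FEATURE_QUERY, "feature_wh")
print(ddl)
```

Because the feature is an ordinary Snowflake object, training and inference code can read it like any other table while Snowflake handles the refresh.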
We’re very excited about the simplicity of using a Snowflake-native Feature Store to accelerate ML model development.
Snowpark Model Registry
The Snowpark Model Registry is an integrated solution to manage and deploy models and their metadata in Snowflake. In our opinion, a Model Registry is another essential component of any successful MLOps stack. It provides a scalable and secure mechanism for managing ML models for deployment on Snowflake compute, especially for batch inference workloads with Snowpark or real-time inference on Snowpark Container Services.
The Snowpark Model Registry will be available soon in Public Preview and can be used by installing a Python package into any Python environment.
Notebooks! How could we wrap this up without talking about Snowflake Notebooks? Snowflake now has an incredibly long list of user interfaces, including Snowsight, Python Worksheets, Snowflake Copilot, and Streamlit in Snowflake. But Notebooks are the interface that Data Scientists have been waiting for.
Snowflake Notebooks look very similar to the cell-based notebooks that Data Scientists know and love. Users will have Python, SQL, and Markdown cell types at their disposal to fuel exploratory analysis and unlock insights. Python cells will also natively support Streamlit for input and visualization.
In our opinion, a robust data foundation is the first step on the road to Enterprise AI, and Snowflake continues to make its case as the best-of-breed platform to serve as that foundation.
Free Generative AI Workshop
Looking for guidance on how to best leverage Snowflake’s newest AI & ML capabilities? Attend one of our free generative AI workshops for advice, best practices, and answers.