Patrick Cuba

Summary

Snowflake continues to innovate and expand its cloud data platform capabilities, offering a comprehensive suite of services that encompass eight distinct workloads, from data warehousing to cyber security, and emphasizing its evolution beyond traditional data warehouse solutions.

Abstract

Snowflake, known for its cloud data platform, has significantly broadened its offerings since its inception as a data warehouse solution. The platform now supports a diverse range of workloads, including data lakes, data engineering, data applications, data science, collaboration, cyber security, and the revolutionary UNISTORE, which combines OLTP and OLAP capabilities. Snowflake's architecture facilitates SQL-based simplicity for complex data operations, automated maintenance, and robust security features. It also provides a marketplace for data applications and supports real-time data sharing and collaboration. With Snowflake's commitment to innovation, users can expect a service that not only meets current data management needs but also adapts to future demands, all while maintaining ease of use and powerful performance.

Opinions

  • The author believes that Snowflake's evolution has redefined the data warehouse workload, integrating it with seven other workloads, thereby offering a more comprehensive solution.
  • Snowflake's ability to manage a data lake as part of its cloud data platform is seen as a significant advantage, eliminating the need for a separate platform and simplifying data management.
  • The platform's unique architecture is credited with influencing DevOps, DataOps, and data pipeline methodologies, providing both batch-oriented and real-time processing capabilities.
  • Snowflake's support for data science is highlighted as a key feature, with native support for secure Python libraries and the ability to run data science workloads directly within the platform.
  • The author expresses that Snowflake's data sharing and collaboration capabilities have created a new marketplace, enabling real-time sharing of both data and applications.
  • Cyber security is a critical aspect of Snowflake's platform, with the author emphasizing its robust support for analyzing SIEM data and enhancing threat detection capabilities.
  • The introduction of UNISTORE is viewed as a game-changer, combining OLTP and OLAP capabilities for a seamless HTAP experience and supporting traditional Snowflake features like Time-Travel.
  • The author encourages continued learning and professional development through Snowflake's certification programs, training services, and Quickstart guides to fully leverage the platform's capabilities.
  • A note of caution is included, advising readers to test implementation performance before fully committing to the Snowflake platform, with the author providing no guarantees regarding individual implementation outcomes.

Snowflake, the Cloud Data Platform (2024)

Another year and Snowflake continues to dominate the data cloud and the conversation. The more prospects realise the power of turning the cloud into a relational database (I’ll explain), the more the constraints (pun intended) of the past disappear!

How is the data cloud now a relational database, you ask? Many of the data operations your data specialists once had to configure on your cloud provider are now simple SQL statements on Snowflake. Almost all maintenance tasks are managed by Snowflake as a service, with no downtime.

Snowflake secures your data in the cloud; you are responsible for what you put into the cloud (of course!).

This article builds on my earlier articles from 2022 and 2023.

Let’s talk workloads.

We have covered many aspects of Snowflake up until now, but we haven’t looked at the eight workloads Snowflake targets. First up….

Data Warehouse

Redefining the Data Warehouse

“A data warehouse is a subject-oriented, integrated (by business key), time-variant and non-volatile collection of data in support of management’s decision-making process, and/or in support of auditability as a system-of-record.” — Bill Inmon, WWDVC 2019

In the early powder days, Snowflake was known only as a Data Warehouse and compared to the likes of Teradata, AWS Redshift, Azure Synapse and Google BigQuery. However, Snowflake has evolved into so much more! This is why Snowflake should be referred to as the Data Cloud. Not only has Snowflake delivered much more than a Data Warehouse can, but its innovations have also driven Snowflake’s competitors to rethink their own definition of what they deliver as a Data Warehouse. The competition is fierce but ultimately, we see the competition as where Snowflake was maybe 5–10 years ago (and that’s no exaggeration!).

Of course, this does not mean Snowflake has abandoned the data warehouse workload; it just means Snowflake has redefined it! We now have another SEVEN workloads we can relate the data warehouse to!

Data Lake

A Data Lake without a catalogue is a Data Swamp!

“If you think of a Data Mart as a store of bottled water, cleansed and packaged and structured for easy consumption, the Data Lake is a large body of water in a more natural state. The contents of the Data Lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.” — James Dixon, 2010, former CTO of Pentaho

Schema-on-read removed the need to spend valuable engineering and modelling time defining the data structure up front; you only define the required data types and structure when you read the data. Snowflake supports the data lake workload through external tables (Apache Iceberg included). There are many advantages to having your data lake on Snowflake: you can apply Snowflake semantics (such as semi-structured querying), use Snowflake’s powerful processing engine and ANSI-compliant SQL, and extend that functionality with your favourite programming language through Snowpark. This means you do not need a separate platform to manage a data lake as a data lakehouse; Snowflake manages all of this as a service.
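To make the schema-on-read idea a little more concrete, here is a minimal Python sketch that builds the Snowflake SQL for an external table over Parquet files and then casts fields out of its VALUE column at query time. The stage path, table name and field names are hypothetical, and the helper functions are my own illustration rather than part of any Snowflake library; you would submit the resulting statements through whichever Snowflake client you already use.

```python
def external_table_ddl(table: str, stage_path: str) -> str:
    """Build DDL for an external table over Parquet files in a stage.

    Both arguments are hypothetical names supplied by the caller.
    """
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE {table}\n"
        f"  LOCATION = @{stage_path}\n"
        f"  FILE_FORMAT = (TYPE = PARQUET)"
    )


def schema_on_read_query(table: str, fields: dict) -> str:
    """Cast fields out of the external table's VALUE column when reading.

    `fields` maps a field name to the SQL type it should be read as,
    so the structure is only decided at query time.
    """
    select_list = ",\n  ".join(
        f"value:{name}::{sql_type} AS {name}" for name, sql_type in fields.items()
    )
    return f"SELECT\n  {select_list}\nFROM {table}"


if __name__ == "__main__":
    print(external_table_ddl("lake_events", "lake_stage/events/"))
    print(schema_on_read_query("lake_events", {"user_id": "STRING", "amount": "NUMBER"}))
```

Note that no column list is declared when the table is created; the types only appear in the SELECT, which is the point of schema-on-read.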

Data Engineering

Data Engineering is a way of life!

Snowflake’s unique architecture has allowed it to scale far beyond our competitors, so much so that Snowflake has influenced how DevOps, DataOps and streaming data pipelines can be defined for your platform. The platform manages not only batch-oriented workloads but real-time ones too. Dynamic tables, streams and tasks, pipeline observability and zero-copy clones are just some of the Snowflake-only innovations that Snowflake’s competitors struggle to replicate. Even if they label their solution as competitive (similar in name only), underneath their respective marketing stories there’s a lot that their customers must do themselves that Snowflake simply provides as a service! And let’s not forget the general availability of running your Python, Java, and Scala code in Snowpark for those transformations that SQL alone will not solve!
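As a rough illustration of two of the features named above, the following Python sketch builds the SQL for a dynamic table (a declarative pipeline kept fresh to a target lag) and for a zero-copy clone. The table, warehouse and column names are hypothetical, and the helpers are my own; treat this as a sketch under those assumptions, not a reference implementation.

```python
def dynamic_table_ddl(name: str, target_lag: str, warehouse: str, query: str) -> str:
    """Build DDL for a dynamic table that Snowflake refreshes to meet target_lag."""
    return (
        f"CREATE OR REPLACE DYNAMIC TABLE {name}\n"
        f"  TARGET_LAG = '{target_lag}'\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"AS\n{query}"
    )


def zero_copy_clone_ddl(source: str, clone: str) -> str:
    """Build DDL that clones a table without physically copying its data."""
    return f"CREATE TABLE {clone} CLONE {source}"


if __name__ == "__main__":
    # Hypothetical pipeline: daily order totals, at most five minutes stale.
    print(dynamic_table_ddl(
        name="daily_order_totals",
        target_lag="5 minutes",
        warehouse="transform_wh",
        query="SELECT order_date, SUM(amount) AS total FROM orders GROUP BY order_date",
    ))
    # Hypothetical dev copy of production data for testing a change.
    print(zero_copy_clone_ddl("orders", "orders_dev"))
```

The dynamic table replaces a hand-scheduled batch job with a single statement, and the clone gives a DataOps workflow a full-size test copy without duplicating storage up front.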

Data Applications

Rapidly bring your Data Apps to market!

Build managed, connected or native applications on Snowflake. Enjoy semi-structured data support and the OLTP promise of UNISTORE (see below) to simplify your overall architecture and reduce time-to-value. With the announcement of Snowpark Container Services at last year’s Summit, you can also bring your app to the data cloud without having to completely rewrite the business logic! Expand the reach of your data apps with the globally accessible Snowgrid and monetize your data application in the Snowflake Marketplace.

What other platform is pushing what is possible in the cloud?

Data Science

Data Science, all in one Data Cloud

“All models are wrong, but some are useful.” — George E. P. Box

Not only does Snowflake support connectivity for data science tools, but Snowflake has also brought data science into the platform itself, securely! With Anaconda as the package manager for secure Python libraries that are accessible natively within Snowflake, Snowpark lets you design, build and deploy beyond-SQL logic inside Snowflake, and it runs faster than Apache Spark! Add support for warehouses with GPUs and Spark connectors that push predicates down to Snowflake (if you desire). Shouldn’t you consider Snowflake an integral part of your data science story?

Collaboration (formerly referred to as Data Sharing)

Share your data and applications in real time!

Data sharing was one of Snowflake’s early industry-changing features: you can share your data with other departments, partners, and customers in real time, with no need for insecure FTP file transfers. That capability has expanded to include sharing data applications as well! Snowflake has created a whole market for data sharing and collaboration, including the ability to run data clean rooms on this real-time data!

Cyber Security

Securing it all, analysing SIEM

Let’s talk cyber! Although all the workloads mentioned in this article are impressive and industry-defining, underneath it all is security. Snowflake’s native support for structured and semi-structured data provides a robust way to analyse SIEM data using plain SQL. How often have we heard of companies that cannot perform threat detection over historical data? With Snowflake, that is as easy as writing an SQL query. Together with our partners, we fill the gap in SIEM detection that is not easily covered with log file dumps.
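Here is a hedged sketch of what "plain SQL over historical SIEM data" could look like. It assumes a hypothetical log table with a VARIANT column named raw (holding the JSON event) and a timestamp column named event_time; the event type value and field names are also invented for illustration. The Python helper only builds the query text.

```python
def failed_login_query(table: str, days_back: int, min_attempts: int) -> str:
    """Build a query that finds source IPs with many failed logins.

    Assumes (hypothetically) that `table` has a VARIANT column `raw`
    and a timestamp column `event_time`.
    """
    return (
        "SELECT\n"
        "  raw:src_ip::STRING AS src_ip,\n"
        "  COUNT(*) AS failed_attempts\n"
        f"FROM {table}\n"
        "WHERE raw:event_type::STRING = 'login_failed'\n"
        f"  AND event_time >= DATEADD(day, -{int(days_back)}, CURRENT_TIMESTAMP())\n"
        "GROUP BY 1\n"
        f"HAVING COUNT(*) >= {int(min_attempts)}\n"
        "ORDER BY failed_attempts DESC"
    )


if __name__ == "__main__":
    # Look back a full year of history for IPs with 50 or more failures.
    print(failed_login_query("security_logs", days_back=365, min_attempts=50))
```

The semi-structured path syntax (raw:src_ip) means the JSON events do not have to be flattened into columns before they can be searched, and widening the look-back window is just a change to one parameter.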

UNISTORE

Game-changing!

One of the workloads with the most buzz! Snowflake has combined OLTP and OLAP capabilities into a single table type for an HTAP experience, now supporting transactional update and retrieval latencies to further power your data applications in the cloud. What’s more, this table type also supports the traditional Snowflake table features you’re already accustomed to: you can join this new table type with all your existing tables (including Apache Iceberg tables) and enjoy features like Time-Travel too!
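A minimal sketch of the idea, assuming hypothetical table and column names: the first helper builds the DDL for a hybrid table (the table type behind UNISTORE, which requires a primary key), and the second builds an analytical query that joins it with an existing standard table. The helpers are illustrative only and are not part of any Snowflake library.

```python
def hybrid_table_ddl(table: str, columns: dict, primary_key: str) -> str:
    """Build CREATE HYBRID TABLE DDL; a hybrid table needs a primary key."""
    if primary_key not in columns:
        raise ValueError(f"primary key {primary_key!r} is not a declared column")
    column_lines = [
        f"  {name} {sql_type}" + (" PRIMARY KEY" if name == primary_key else "")
        for name, sql_type in columns.items()
    ]
    return f"CREATE HYBRID TABLE {table} (\n" + ",\n".join(column_lines) + "\n)"


def join_with_standard_table(hybrid: str, standard: str, key: str) -> str:
    """Join the transactional (hybrid) table with an existing analytical table.

    The `segment` and `status` columns are hypothetical.
    """
    return (
        "SELECT s.segment, COUNT(*) AS open_orders\n"
        f"FROM {hybrid} h\n"
        f"JOIN {standard} s ON s.{key} = h.{key}\n"
        "WHERE h.status = 'OPEN'\n"
        "GROUP BY s.segment"
    )


if __name__ == "__main__":
    print(hybrid_table_ddl(
        "app_orders",
        {"order_id": "NUMBER", "customer_id": "NUMBER", "status": "STRING"},
        primary_key="order_id",
    ))
    print(join_with_standard_table("app_orders", "customer_dim", "customer_id"))
```

The application writes and reads single rows by primary key, while the same table takes part in an ordinary analytical join, so there is no separate copy step between the operational and analytical sides.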

Where to go from here?

Snowflake’s Industry-recognised certification badges

Over the last few years, Snowflake has grown into a robust platform that meets your enterprise cloud needs.

How do you get your data engineers, architects, administrators, analysts, and data scientists up to speed with Snowflake?

  • Get started by completing the self-paced Snow Academy and earn a completion badge.
  • Get training through Snowflake’s training services and achieve industry-recognised Snowflake certification badges. Some are paid, some are free hands-on labs to get your fingers warm!
  • Use a Quickstart guide to set up real-world examples with partner technology you’re familiar with (or not yet familiar with).
  • Have Snowflake’s Professional Services take you through a solution-architect-led Quickstart service that teaches your engineers and architects the fundamentals of Snowflake to match your business case needs (typically 5–14 days depending on your schedule).

There are many options to help you get set up. To see what the buzz is all about, come see us at our yearly marketing events, the top of the pile being Snowflake Summit!

The views expressed in this article are my own. You should test implementation performance before committing to this implementation. The author provides no guarantees in this regard.
