Is Snowflake a Database or Data Warehouse?

Snowflake is a cloud data warehouse that runs on the infrastructure of Amazon Web Services, Microsoft Azure, or Google Cloud, and lets storage and compute scale independently. At the core of the Snowflake Data Cloud is a SQL data warehouse built from scratch for the cloud, with a patented architecture designed to handle all aspects of data and analytics. Snowflake positions it as combining high performance, high concurrency, simplicity, and affordability beyond what traditional data warehouses offer. The platform is not based on any existing database technology or on a “big data” software platform such as Hadoop.

Instead, Snowflake combines an entirely new SQL query engine with an architecture designed natively for the cloud. For the user, Snowflake provides all the functionality of an enterprise analytics database, along with many additional features and capabilities. You can ingest data from a variety of sources, then store it, organize it, and manipulate it with SQL queries.

Snowflake physically separates but logically integrates storage, compute, and services (such as metadata and user management). As a cloud-native platform, it is meant to eliminate the need for separate data warehouses, data lakes, and data marts, and it allows secure data sharing across the organization. Databases and schemas (namespaces) organize data in Snowflake storage, which uses a columnar format internally to speed up analytical queries.
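To see why a columnar layout suits analytical queries, consider the toy Python sketch below. It is only an illustration under simplifying assumptions: the sample table, its column names, and the helper `to_columns` are hypothetical, and Snowflake's actual internal storage format is far more involved than a dictionary of lists.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# Everything here is illustrative; it is not Snowflake's storage format.

rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: an aggregate over one field still walks every whole row.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: the same aggregate reads only the "amount" column.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals are the same; the difference is how much data has to be touched. On a wide table with many columns, reading one column instead of every full row is where a columnar format tends to pay off.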

There's virtually no limit to the number of databases and warehouses you can create (of course, warehouses consume Snowflake credits while they run). The Snowflake documentation also uses the related term “data lake” to highlight that Snowflake is compatible with massive amounts of unstructured or semi-structured data, not just structured data with a fixed schema (as a traditional SQL database would require). The main advantage of using Snowflake as a data warehouse or data lake is its performance when processing large amounts of data, especially when executing complex analytical queries.

The database storage layer contains all the data loaded into Snowflake, including structured and semi-structured data. The compute layer consists of virtual warehouses that execute the data processing tasks required by queries.
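The point about semi-structured data can be made concrete with a small Python sketch. It is a rough illustration only: the sample events and the helper `get_path` are hypothetical, and the code uses plain Python rather than Snowflake's own SQL support for semi-structured data. What it shows is the general idea of querying nested records without declaring a fixed schema first.

```python
import json

# Hypothetical events with differing shapes, as might arrive from a SaaS export.
raw_events = [
    '{"user": {"id": 1, "plan": "pro"}, "action": "login"}',
    '{"user": {"id": 2}, "action": "purchase", "amount": 30}',
    '{"action": "logout"}',
]


def get_path(document, path, default=None):
    """Follow a dotted path through nested dicts, returning default if absent."""
    current = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


events = [json.loads(line) for line in raw_events]

# Query nested fields without declaring a schema up front;
# records that lack a field simply yield None.
plans = [get_path(event, "user.plan") for event in events]
user_ids = [get_path(event, "user.id") for event in events]

print(plans)
print(user_ids)
```

Records that do not contain a field are not rejected; they just produce a missing value, which is the practical difference from a table with a rigid schema.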

Each virtual warehouse (a compute cluster) can access all the data in the storage layer but works independently, so warehouses don't share or compete for compute resources.

Almost everyone who uses Snowflake copies data into it from some original source, such as a SaaS application, rather than using Snowflake as their only database technology. Snowflake was built specifically for the cloud and is designed to address many of the problems found in older hardware-based data warehouses, such as limited scalability, data transformation problems, and delays or failures under high query volumes.
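The separation of shared storage from independent compute described above can be modeled with a short Python sketch. This is a conceptual toy, not Snowflake code: the classes `SharedStorage` and `VirtualWarehouse`, the warehouse names, and the worker counts are all hypothetical, and thread pools merely stand in for compute clusters.

```python
from concurrent.futures import ThreadPoolExecutor


class SharedStorage:
    """Stand-in for the storage layer: one copy of the data, readable by all."""

    def __init__(self, tables):
        self._tables = tables

    def scan(self, table, column):
        return list(self._tables[table][column])


class VirtualWarehouse:
    """Stand-in for a compute cluster with its own private worker pool."""

    def __init__(self, name, storage, workers):
        self.name = name
        self.storage = storage
        self.workers = workers
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def run_sum(self, table, column):
        # Work is scheduled on this warehouse's own pool only.
        future = self._pool.submit(lambda: sum(self.storage.scan(table, column)))
        return future.result()

    def shutdown(self):
        self._pool.shutdown()


storage = SharedStorage({"orders": {"amount": [120.0, 80.0, 50.0]}})

# Two warehouses of different sizes share the data but not the compute.
reporting = VirtualWarehouse("reporting", storage, workers=2)
loading = VirtualWarehouse("loading", storage, workers=8)

reporting_total = reporting.run_sum("orders", "amount")
loading_total = loading.run_sum("orders", "amount")

reporting.shutdown()
loading.shutdown()

print(reporting_total, loading_total)
```

Both warehouses read the same single copy of the data, yet each has its own pool of workers, so a heavy job on one does not take capacity from the other. Resizing or adding a warehouse in this model never requires copying or moving the stored data.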