How To Connect To A Snowflake Database With Python

Learn how to interact with Snowflake using the Snowflake Connector or SQLAlchemy Python packages and take advantage of the templates presented in the article.

Suggested On-Demand Courses:

Are you trying to advance your career as a Data Engineer or Machine Learning Engineer? I would highly recommend for you to check the following courses:

Data Engineering Nanodegree → High Quality Course!
Data Streaming With Apache Kafka & Apache Spark Nanodegree
Python advanced Coding Problems (StrataScratch)→ Best platform I found to prepare Python & SQL coding interviews so far! Better and cheaper than LeetCode.

>>> Not a Medium member yet? Consider signing up with my referral link to gain access to everything Medium has to offer for as little as $5 a month!

Introduction

I was firstly introduced to Snowflake in 2017, when the company for which I worked at the time, decided to migrate from Redshift to a more scalable computing-based claud data warehouse, to meet their growing analytical requirements and to feed BI tools.

Back then, I was a data analyst and I mainly queried Snowflake through a DBMS. Fast forward a few years and now, as a data engineer, I invest time building data pipelines that automatically connect, query, insert and update data into Snowflake through Python. Are you curious to understand how it is done?

Fast forward a few years and now, as a data engineer, I invest time building data pipelines that automatically connect, query, insert and update data into Snowflake through Python. Are you curious to understand how it is done?

In this article, you will learn how to interact with a Snowflake database through Python, either using the snowflake.connector or the sqlalchemy packages.

Both packages supports a number of technologies when it takes to enable secure authentication to Snowflake via Python, but according to my experience two of the most popular options are the externalBrowser authentication and default authentication, therefore I will use these in my examples.

Now, let’s explore the two methods together by creating some useful code templates you could use in your own projects.

Method # 1: Connect Using Snowflake Connector

The first step to use a Snowflake Connector is downloading the package as suggested by the official documentation:

pip install snowflake-connector-python

or

pip install snowflake-connector-python==<version>

Then, you will need to import it in your code:

import snowflake.connector

External Browser Authentication

The snowflake.connector includes a connect() function that can be used as a constructor for creating a connection to the database.

The function allows you to specify a number of parameters, of which some are required to establish a connection and some are optional.

One of these parameters is the authenticator. Quoting the official documentation, if you set authenticator = 'externalbrowser' then the authentication will be attempted using:

“your web browser and Okta, ADFS, or any other SAML 2.0-compliant

identity provider (IdP) that has been defined for your account”.

For instance, if your team is using Okta as an identity provider, you will just need to specify the USER_EMAIL without the need of also passing a user password. This is a template you could use straightaway, by replacing your credentials:

PROS: the 'externalbrowser' authentication allows you to establish a connection to SF with your personal user. Its flexibility makes it the preferred connection option while you are testing a new script locally, as you won’t need credentials for any shared user(this increases your autonomy).

CONS: Every time you establish a new connection, a new window will be automatically opened in your browser and this is somewhat annoying, but nothing major…

When you run the code above, it returns a connection object con_eb that in turn has a method named cursor(). Before you can actually interact with SF, you also need to create a cursor object con_eb.cursor() that, for ease of use, has been saved under the db_cursor_eb variable.

Example #1: Create Table

In the first example, I have created a TRANSACTIONS table in my preferred schema, by passing the SQL command to the cursor object execute() method:

As soon as the command is executed, you can verify if it was successful or not by running db_cursor_eb.fetchone()[0] that in this case returns the message Table TRANSACTIONS successfully created.

INFO: snowflake.connector.cursor:query: [CREATE OR REPLACE TABLE SCHEMA_NAME.TRANSACTIONS ( TR...]

INFO: snowflake.connector.cursor:query execution done

INFO: root:Table TRANSACTIONS successfully created.

Usually fetchone() retrieves the next row of a query result set, but because in this case I was just creating an empty table, it simply returned the execution state. Pretty handy message to display in production!

Default Authentication

If you don’t specify the authenticator parameter, under the hood, Snowflake will use its internal (default) authenticator.

However, in this case, you are required to pass the role USER_NAME and PASSWORD. These should be your personal credentials or the credentials of a shared user. Apart from these two little changes, the template is very similar to the previous one:

PROS: This is usually the connection method used in production, as it does not trigger a noisy authentication through your browser and does not require any additional identity checks.

CONS: You will need to request to be assigned with a role and password before being able to establish a connection. But this is a little price to pay, really…

As before, I have created a cursor object named db_cursor_def.

Example # 2: Insert Into Table

I will now use the db_cursor_def.execute() method to insert data into the TRANSACTIONS table created in the first example.

The records to be inserted will be retrieved from an existent SOURCE_TABLE and queried with a SELECT command:

Note that I set the query LIMIT to 1500 records.

Then I called the cursor object rowcount attribute, that returns the number of rows produced by the last execution, to check that this was exactly the number of rows inserted in the TRANSACTIONS table:

INFO: snowflake.connector.cursor:query: [INSERT INTO SCHEMA_NAME.TRANSACTIONS (TRX_ID, SOURCE_...]

INFO: snowflake.connector.cursor:query execution done

INFO: root: Total rows inserted: 1500

Method # 2: Connect Using SQLAlchemy

An alternative method to connect and interact with Snowflake is by using SQLAlchemy. To begin, you will need to run this command:

pip install --upgrade snowflake-sqlalchemy

It will install both snowflake.sqlalchemy as well as sqlalchemy in case it wasn’t already available.

Then, you should import the create_engine method from sqlalchemy, and the URL method from snowflake.sqlalchemy:

from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine

External Browser Authentication

In effect, to establish a connection with sqlalchemy you need to call the create_engine method and pass a connection string to it.

However, “for convenience, you can use the snowflake.sqlalchemy.e>URL method to construct the connection string and connect to the database”:

As you can see, in order to connect with an externalbrowser authentication, the parameters you should specify are exactly the same as with the snowflake.connector.

Before executing a query, its best practice to first create a connection object: engine_1_con = engine1.connect(). However, it’s worth mentioning that, even calling the execute() method directly on engine_1 would work just fine.

Example # 3: Select From Table + FetchAll( )

The purpose of this example, is to show you how to retrieve the entire dataset inserted into the TRANSACTIONS table with a SELECT statement, fetch all the rows in the dataset with fetchall() into data and save the content of the first column to a list named trx_ids:

I wanted to show you a straightforward example of how to use fetchall(), because, while building data pipelines, you might end up calling this method again and again in your code.

As expected, the query retrieved 1500 rows and the first five trx_ids in the list I created, were displayed:

INFO: snowflake.connector.cursor:query: [SELECT * FROM SCHEMA_NAME.TRANSACTIONS;]

INFO: snowflake.connector.cursor:query execution done

INFO: root:Number of rows retrieved: 1500

Output:

[3440508402, 482408596, 476831653, 3224085714, 3871714048]

Default Authentication

Connecting to Snowflake with sqlalchemy via default authentication employs exactly the same syntax as before and exactly like the snowflake.connector, you will need to replace USER_EMAIL with USER_NAME as well as specify a PASSWORD:

Example # 4: Select From Table + Convert To Pandas DF

In this last example, I would like to show you how to create pandas DataFrames using a sqlalchemy engine.

This can be easily achieved by passing both engine_2 and a sql query to the pandas read_sql() function:

Therefore, If you plan to work with pandas a lot, I would recommend you to connect to Snowflake with sqlalchemy to take advantage of these built-n methods.

Conclusion

In this article, you have learnt that two main ways exist to connect and interact with a Snowflake DB via Python, using different types of authentication.

The popularity of Snowflake is growing in the data industry, hence I hope the code templates I shared above, will help speeding up your daily task while querying Snowflake as Data Analysts, creating training datasets as Data Scientists or building automated ETL pipelines as Data Engineers.

Sources

Snowflake Connector for Python - Snowflake Documentation

The Snowflake Connector for Python provides an interface for developing Python applications that can connect to…

docs.snowflake.com

Python Connector API - Snowflake Documentation

String constant stating the type of parameter marker formatting expected by the interface. The connector supports the…

docs.snowflake.com

Using the Snowflake SQLAlchemy Toolkit with the Python Connector - Snowflake Documentation

Snowflake SQLAlchemy runs on the top of the Snowflake Connector for Python as a dialect to bridge a Snowflake database…

docs.snowflake.com

A Note For My Readers

This post includes affiliate links for which I may make a small commission at no extra cost to you, should you make a purchase