avatarYeung WONG

Summarize

Unlocking the Power of PostgreSQL: A Guide to Python Database Interactions

Step by Step Walkthrough from Installation of PostgreSQL to Application with SQLAlchemy and psycopg2

Background

As a data scientist, you face data every day. In the past, I used to store data in CSV format and JSON if it was unstructured. However, I have recently started researching how to use database and have realized the power of using them. Nevertheless, I faced many difficulties and barriers when working with database, and the information I found on the Internet is sometimes fragmented or outdated. Therefore, to assist people like me who are new to use database, I prepared a comprehensive guideline to walk you through the process.

The article covers the following topics:

  1. Installation of PostgreSQL and tips for setup.
  2. Set up the Python environment.
  3. Python script to interact with PostgreSQL databases including
  • Writing to database
  • Reading a table
  • Executing SQL query
  • Deleting a table

Installation of PostgreSQL

First of all, go to https://www.postgresql.org/download/ and download the executable files. (I am a Windows user and therefore the upcoming tutorial will use Windows version as reference.)

  1. Click “Windows

2. Click “Download the Installer

3. Click the Download button

4. Open the Executable and Click “Next

5. Set the Installation Directory and Click “Next

6. Select Components and Click “Next

Remark: I recommend to select ALL. FYI, pgAdmin 4 is the GUI to visualize the database while Stack Builder helps you quickly install the modules and components to supplement PostgreSQL.

7. Set the Data Directory and Click “Next

8. Type the Password and Click “Next

Remark: You will need to use this password later on when working on on Python’s setup.

9. Set the port and Click “Next

10. Set the locale and Click “Next

11. Click “Next” to start the Installation

12. Done! The installation is completed.

Here, I would just skip the “Stack Builder” setup by unchecking the button and clicking “Finish” since I don’t have any modules needed to connect with Python.

Remark: Don’t worry! In case you want to install via Stack Builder in the future, you can find it in the Windows Start Menu.

Quick Walkthrough

Perfect! The database is installed. Let’s take a look at the GUI interface before connecting with Python.

1. Open “pgAdmin 4” from Windows Menu

2. This is how the interface looks like. Click “Servers

3. Since this is the first time we log-in, we have to type in the password. For me, I click the “Save Password” button and click “OK”.

4. So now you are able to see the structure of the PostgreSQL. As a starting point, let’s simplify the layout as below.

Databases> Schemas> Tables

Inside the Database postgres, we have one Schema called public. For now, you cannot find any Tables in it but later on, we will write the table using Python and then you can see them here.

Set Up Python Environment

Alright. Let’s move on to the Python environment setup.

1. Install Packages

Two packages are required — psyscopg2and SQLAlchemy. Type the below code in the Anaconda prompt.

pip install psycopg2 SQLAlchemy

2. Define the Config Parameters

ENDPOINT = 'localhost'
PORT = 5432
USER = 'postgres'
PASSWORD = '' # INPUT YOUR PASSWORD
DBNAME = 'postgres'
SCHEMA = 'public'
ENGINE_URL = 'postgresql://'+USER+':'+PASSWORD+'@'+ENDPOINT+':'+str(PORT)+'/'+DBNAME

3. Create the Engine

from sqlalchemy import create_engine
engine = create_engine(ENGINE_URL)

Done! That’s all we have to do for the preliminary set up.

Simple and easy, right?

Interacting PostgreSQL with Python

Finally comes to the main dish! In the following, I would go through most of the common actions one by one.

1. Write table to Database

First of all, let’s prepare a sample data for demo.

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
print(df.shape)
df.head()

To create a table in PostgreSQL database, instead of specify the table columns format and structure, we can just simply use to_sql.

TABLE_NAME = 'demo'
df.to_sql(name=TABLE_NAME, con=engine, schema=SCHEMA, if_exists='replace', index=False)

Please note that for if_exists, there are three options and this is for the scenario when the table already exists. 1. fail — Raise a ValueError. 2. replace — Drop the table before inserting new values. 3. append — Insert new values to the existing table.

When we go back to the pgAdmin 4, you should see the Table demo there. In case you are not, click “Object”, click “Refresh” and click “Tables”.

To double check the data structure, right click “demo”, click “View/Edit Data” and click “All Rows”. Now, you should see the data visualized in the right hand side just like in the below panel.

2. Read table from Database

The next common usage would be reading the table from the database. We can use pandas functionread_sql_table.

TABLE_NAME = 'demo'
df2 = pd.read_sql_table(TABLE_NAME, con=engine)
print(df2.shape)
df2.head()

3. Run SQL Query

For most of the scenarios, we would like to run SQL query on the database. For example, I would like to find all records with A ≤ 10. We can leverage anotherpandas functionread_sql.

4. Delete table in Database

Lastly, if you would like to remove the table, you can follow below code.

import psycopg2
TABLE_NAME = 'demo'
conn = psycopg2.connect(host=ENDPOINT, port=int(PORT), user=USER, password=PASSWORD, dbname=DBNAME)
cur = conn.cursor()
sql = "DROP TABLE IF EXISTS " + TABLE_NAME
cur.execute(sql)
conn.commit()
cur.close()
conn.close()

Remark: If your pgAdmin 4 application is running during the time that you run the above deletion code, you may still find the table demo there even though you click the “Refresh”. Don’t worry! Just close the application and re-start it. You will then see the table is gone!

Conclusion

Alright, that’s all I have tested and learnt about using PostgreSQL with Python. I hope you guys learn something!

Today, I have gone through the steps of using Python with a popular and powerful RDBMS database. This is particularly useful when we are dealing with tabular data and want to store and extract it.

However, there are more and more unstructured data outside the world. In my next article, I would like to share my experience of using Python to interact with MongoDB, one of the most well-known NoSQL database. If you would like to know about it, stay tuned and happy coding!!!

WRITER at MLearning.ai / 28K+ GPTs / GPT alternatives / GPTs Hacks

Postgres
Postgresql
Python
Sqlalchemy
Ml So Good
Recommended from ReadMedium