avatarChristianlauer

Summary

Google has launched Primary and Foreign Key features for BigQuery, enabling data integrity and the possibility to reproduce structures like Star and Snowflake schemes, but caution should be taken to avoid misuse.

Abstract

Google has recently introduced Primary and Foreign Key table constraints for BigQuery, a feature that has been standard in relational databases and other Data Warehouses. These keys are used to ensure data integrity, with primary keys being unique and not null for each row, and foreign keys present in the primary key column or null. After creation, tables can be modified using the ALTER TABLE ADD PRIMARY KEY or ALTER TABLE ADD CONSTRAINT statements. While this new feature allows for the creation of structures like Star and Snowflake schemes, caution should be taken to avoid misusing BigQuery as a relational database, as it is column-based and performs best with denormalized data.

Bullet points

  • Google has launched Primary and Foreign Key features for BigQuery.
  • Primary keys are unique and not null for each row, ensuring data integrity.
  • Foreign keys must be present in the primary key column or be null.
  • Tables can be modified after creation using ALTER TABLE statements.
  • New features allow for the creation of structures like Star and Snowflake schemes.
  • Caution should be taken to avoid misusing BigQuery as a relational database.
  • BigQuery is column-based and performs best with denormalized data.

Google launches Primary and Foreign Keys for BigQuery

How you can now use Primary and Foreign Key Table Constraints

Photo by Jeroen den Otter on Unsplash

In BigQuery, you now have the possibility to use primary and foreign keys when creating tables. In relational databases this feature is a standard, but also other Data Warehouses that combine this concept with column based systems and NoSQL offer primary and foreign keys. Now, Google is following suit with its Data Warehouse.

Both primary and foreign keys are utilized for providing data integrity. A primary key value is unique for each row and is not NULL. Each foreign key value in a row must be present in the primary key column of the primary key table, or be NULL[1].

Primary and Foreign Keys — Image Source: Scaler[2]

Also, after the creation with the CREATE TABLE statement you can provide existing tables with the ALTER TABLE ADD PRIMARY KEY statement, or the ALTER TABLE ADD CONSTRAINT statement if necessary with an appropriate key.

First of all good news, since you can now also reproduce structures like Star and Snowflake schemes in BigQuery from source systems. But one should also warn against misusing it, because using BigQuery as a relational database and its use cases makes no sense, and one should also consider adopting the schema of the source systems. BigQuery is column based and performs best with denormalized data. As I said, you can of course still do it but you should have costs and performance in mind, on the other hand you have data integrity as an argument, this must be assessed against each other.

Other recently added new features that have been added to Google BigQuery and might be interesting for you:

Sources and Further Readings

[1] Google, TABLE_CONSTRAINTS view (2023)

[2] Scaler, Foreign Key in SQL — Scaler Topics (2023)

Data Science
Technology
Google
Bigquery
Programming
Recommended from ReadMedium