avatarConstantin Lungu

Free AI web copilot to create summaries, insights and extended knowledge, download it at here

537

Abstract

lues (and they don’t have duplicates), go for UNION ALL instead of UNION DISTINCT (UNION for some other SQL dialects) to avoid redundant de-duplication.</p><p id="cdd5">In the example below, I’ve unioned two Google Trends tables — one that is only for US terms and another one for the rest of the world. Since one table only contains US and the other everything except the US, we know the union of the two tables to be distinct from the start, thus not needing the UNION DISTINCT.</p><figure id="f2d6"><img src="https://cdn-images-1.readm

Options

edium.com/v2/resize:fit:800/0*f_7wweJfDi_6rLDY"><figcaption></figcaption></figure><p id="0b1a">There’s no difference indeed for on-demand pricing (same amount of data scanned), but quite a difference for capacity pricing users ( 1/2 of slot usage).</p><p id="fb6f">So use UNION DISTINCT (and any other DISTINCT) sparingly and when you actually need it.</p><p id="3e5b"><i>Originally published at <a href="https://datawise.dev/why-you-should-use-union-distinct-sparingly">https://datawise.dev</a> on April 2, 2024.</i></p></article></body>

Why you should use UNION DISTINCT sparingly

Photo by Randy Fath on Unsplash

Let’s help BigQuery do less unneeded work!

If you’re UNIONING two sources known to have distinct values (and they don’t have duplicates), go for UNION ALL instead of UNION DISTINCT (UNION for some other SQL dialects) to avoid redundant de-duplication.

In the example below, I’ve unioned two Google Trends tables — one that is only for US terms and another one for the rest of the world. Since one table only contains US and the other everything except the US, we know the union of the two tables to be distinct from the start, thus not needing the UNION DISTINCT.

There’s no difference indeed for on-demand pricing (same amount of data scanned), but quite a difference for capacity pricing users ( 1/2 of slot usage).

So use UNION DISTINCT (and any other DISTINCT) sparingly and when you actually need it.

Originally published at https://datawise.dev on April 2, 2024.

Sql
Analytics
Data Engineer
Google Cloud
Bigquery
Recommended from ReadMedium