Google Launches New Data Service: Datastream
New Tool for Seamless Replication from Databases to BigQuery

Google already offers data integration options for BigQuery with Dataflow, Dataprep, and the Data Transfer Service, but Datastream should now make this even easier.
Dataflow requires a lot of programming, Dataprep is aimed more at data preparation, and the Data Transfer Service supports only a limited set of data sources. Datastream, by contrast, is meant to be a simple replication service for relational databases.
To do this, Datastream uses BigQuery’s Change Data Capture (CDC) functionality and the Storage Write API to efficiently replicate updates directly from source systems in near real time [1].
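
To make the CDC idea more concrete, here is a minimal sketch in plain Python of how a stream of change events can be applied to a replica keyed by primary key. The event shape (`op`, `key`, `row`) and the function name `apply_changes` are assumptions made up for this illustration; they are not Datastream's actual event format or BigQuery's internal merge logic.

```python
from typing import Any

# Hypothetical change-event shape, for illustration only.
# Datastream's real event format and BigQuery's merge logic differ.
ChangeEvent = dict[str, Any]
Replica = dict[int, dict[str, Any]]


def apply_changes(replica: Replica, events: list[ChangeEvent]) -> Replica:
    """Apply an ordered list of change events to a replica keyed by primary key."""
    for event in events:
        key = event["key"]
        if event["op"] in ("INSERT", "UPDATE"):
            # Upsert: merge the new column values into the existing row, if any.
            replica[key] = {**replica.get(key, {}), **event["row"]}
        elif event["op"] == "DELETE":
            # Remove the row; a delete for an unknown key is ignored.
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {event['op']}")
    return replica


events = [
    {"op": "INSERT", "key": 1, "row": {"name": "Ada", "city": "London"}},
    {"op": "INSERT", "key": 2, "row": {"name": "Alan", "city": "Wilmslow"}},
    {"op": "UPDATE", "key": 1, "row": {"city": "Paris"}},
    {"op": "DELETE", "key": 2},
]
replica = apply_changes({}, events)
print(replica)
```

The point of the sketch is that only the changes travel from source to target, rather than full table copies, which is what makes near real-time replication practical.
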
This brings some advantages, such as near real-time insights in BigQuery and serverless ELT/ETL data pipelines that scale automatically, with no resources to provision or manage. Datastream also copes with changes to source schemas: it seamlessly handles schema drift and automatically replicates new columns and tables added in the source to BigQuery. For now, it is available for [1]:
- MySQL
- PostgreSQL
- AlloyDB
- Oracle databases
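
The schema drift handling mentioned above can be pictured with a small sketch, again in plain Python. It assumes a toy target table held as a list of dictionaries and a hypothetical helper `replicate_row`; how Datastream and BigQuery actually evolve the target schema is not shown here.

```python
from typing import Any


def replicate_row(schema: list[str], table: list[dict[str, Any]], row: dict[str, Any]) -> None:
    """Append a row to the target, widening the target schema when new columns appear."""
    for column in row:
        if column not in schema:
            # New source column: add it to the target schema ...
            schema.append(column)
            # ... and backfill already replicated rows with None.
            for existing in table:
                existing[column] = None
    # Store the row with every known column, using None for missing values.
    table.append({column: row.get(column) for column in schema})


target_schema = ["id", "name"]
target_table: list[dict[str, Any]] = []

replicate_row(target_schema, target_table, {"id": 1, "name": "Ada"})
# The source table gained an "email" column before this row was written.
replicate_row(target_schema, target_table, {"id": 2, "name": "Alan", "email": "alan@example.com"})

print(target_schema)
print(target_table)
```

In this toy version, older rows simply receive an empty value for the new column, so nothing in the pipeline has to be redeployed when the source changes.
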
The service is still in preview, but it seems that anyone can already try it out, even without a company account: you just have to enable the API.

Afterwards, you can create your first stream into Google's data warehouse, BigQuery. The process is relatively straightforward, and you can find the documentation and a video in the Google source listed below.

Google has already released some interesting innovations this year in data analytics, and especially for BigQuery. Now comes a feature that many of us have been eagerly awaiting. Until now, if a source could not be accessed via the Data Transfer Service, or via Cloud Storage with the new BigLake service, a more complex replication via Dataflow or a third-party provider such as Talend was needed.
With Datastream, you can simply integrate relational databases via CDC as a service, so to speak. This should ensure that data engineers and data scientists spend significantly less time on data integration and can concentrate on more value-adding activities.
For other data sources such as Bigtable, Google has also recently simplified data integration with a zero-ETL approach.
Sources and Further Readings
[1] Google, Datastream for BigQuery (2022)
