Loading BigQuery into Snowflake with Matillion Data Loader

Image for post
Image for post

Last week I wrote about the newly released Matillion Data Loader (MDL). Matillion Data Loader is a Code-free, Wizard-based Pipeline builder. My first MDL Data Pipeline was loading an Excel sheet into Snowflake. Creating Data Pipelines in Matillion Data Loader is relatively easy and straightforward. Time to try something else. This time I will be loading BigQuery (GBQ) into Snowflake. Again I will show how to go from Source to Cloud Data Warehouse in a few clicks.

Add Pipeline

First add the Google Big Query Pipeline to select data from GBQ.

Add Google Big Query Data Pipeline
Add Google Big Query Data Pipeline

Destination

Add Snowflake Cloud DWH destinat
Add Snowflake Cloud DWH destinat

To be able to load data into Snowflake, you need to make sure that MDL can connect to Snowflake with the correct credentials and with enough privileges. Therefor you can re-use a Snowflake Destination Configuration if you created one earlier. I re-use the one I created for the previous blogpost.

Connection details

Google Big Query Data Set
Google Big Query Data Set

Google Big Query

If you want to load data from Google Big Query, then you should specify the following:

  • Google OAuth — As I wrote in my previous blogpost, you will need Google Cloud Platform Credentials to connect to Google BigQuery. Follow the steps in the; Getting Started with Authentication and download the JSON-file to your local environment. Manage the GCP Cloud Credentials in MDL and upload the JSON-file.
  • ProjectId — In this case; ‘daanalyticsproject’
  • DatasetId — In this case; ‘new_york_citibike’
Google Big Query Connection Details
Google Big Query Connection Details

Data Sources

By selecting DatasetId; ‘new_york_citibike’ I can select ‘citibike_stations’ and the columns in this Data Source.

Matillion Data Loader Data Sources
Matillion Data Loader Data Sources

Staging & Target Configuration

In this step I specify how the tables are named and created in Snowflake. Both a staging,- as well as a target table will be created. By specifying a prefix in combination with the Data Source Name, you will get the Snowflake Table name. On top of that we specify which Warehouse to use and in which Database Schema we are going to create the table and load the data.

Matillion Data Loader Staging & Target configuration for Snowflake
Matillion Data Loader Staging & Target configuration for Snowflake

Finish

Almost all steps are performed. In the last step I can check wether Matillion Data Loader is ready to retrieve rows from the citibike_stations table in Google Big Query. As you can see on the right clicking the ‘Test’-button resulted in a ‘Success’-message.

Finish the Matillion Data Loader Data Pipeline
Finish the Matillion Data Loader Data Pipeline

Verify the results in Snowflake

In MDL I received a message that 936 rows have been loaded into Snowflake. I can verify that in Snowflake. In Snowflake I can see that there are also 936 rows in the TGT_CITIBIKE_STATIONS-table.

Verify the results in Snowflake
Verify the results in Snowflake

Snowflake History

The Matillion Data Loader job performs a series of steps in Snowflake. These steps can be checked via the History-tab in Snowflake.

Checking the Snowflake History
Checking the Snowflake History

Summary

Just as in the previous blogpost creating a Data Pipeline in Matillion is relatively easy. Also for the Google Big Query Pipeline it needs some preparation at both ends. After that it is a matter of selecting and configuring. Most things speak for itself. One downside of MDL is that you cannot reconfigure a Data Pipeline. If something is not working like expected, you need to start all over again. Luckily enough creating Data Pipelines in Matillion Data Loader is not that hard.

Thanks for reading.

Get started with Matillion Data Loader: https://dataloader.matillion.com

Originally published at https://daanalytics.nl on March 8, 2020.

Written by

Data & Analytics 📊 Consultant @daanalytics_nl & Partner @pongprof & Board Member @nl_OUG | Modern ☁️ Analytics #Oracle #Snowflake #Looker #Matillion #Fivetran

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store