Launching Matillion Data Loader

  1. Using Scale & Speed capabilities in a Serverless environment,
  2. Monitoring jobs from a central dashboard,
  3. Upgrade to Matillion ETL for Advanced Data Transformation, Data Integration, and Job Orchestration.
  1. Start building Data Pipelines
  2. Start loading Data into Snowflake.

From Source to Cloud Data Warehouse in a few clicks

In this blog I will show you my first steps on the Matillion Data Loader path. I will use MDL to load data from an Excel-sheet into Snowflake (SF). The Excel-sheet is located in a Google Cloud Platform (GCP) bucket.

GCP Cloud Credentials

Matillion Data Loader must be able to select the Excel-sheet from the GCP-bucket. There for it is necessary to specify the Google Cloud Credentials. These can be downloaded from GCP. Follow the steps in the; Getting Started with Authentication and download the JSON-file to your local environment. In MDL you can manage the GCP Cloud Credentials and upload the JSON-file.

Snowflake Destination Configuration

To be able to load data into Snowflake, you need to make sure that MDL can connect to Snowflake with the correct credentials with enough privileges (e.g. ‘CREATE VIEW’-privileges in SF).

Matillion Data Loader Pipeline

We need to create a Data Pipeline to get started. For the purpose of this blog, we choose an Excel-sheet as the Source. Snowflake will be the Target Destination Warehouse.

Add the Pipeline

We select Snowflake as the Destination Warehouse. Therefore we can select the Destination Configuration we created earlier. The Pipeline gets a meaningful name.

Add the Data Pipeline to Snowflake
Add the Data Pipeline to Snowflake

Google Cloud Storage

In a next step we need to provide the URI of the GCP-Bucket. . When you copy the URI it seems to include a space at the beginning. If you do not remove this space when you enter the Connection URL in Matillion Data Loader, you will get an error; “That URL is not a valid GS connection URL” . Copy the URI to the next step in the MDL Data Pipeline creation.

Google Cloud Storage Configuration
Google Cloud Storage Configuration

Google Cloud Connection Details

To be able to load the Excel-file from the Google Cloud Storage, we need to specify the Connection details. This means we select the Google Storage Type. After that we can select the GCP Credentials we created earlier. Finally we fill in the copied URI from the previous step. Make sure it has the right format and no spaces at the beginning.

Google Gloud Storage Connection details
Google Gloud Storage Connection details

Data Sources

Without specifying Connection Options, we go straight to te selection of the Data Sources. First click the ‘+’-sign and then select the appropriate sheet. Make sure you give the sheet in Excel a meaningful name. This name will be used to name the tabel in Snowflake.

Matillion Data Loader Data Sources
Matillion Data Loader Data Sources

Data Selection

In the following step, you select the columns you want to add to the table in Snowflake. By default all columns are selected. Also the empty ones.

Matillion Data Loader Data Selection
Matillion Data Loader Data Selection

Snowflake Configuration

We arrived at the fifth step in the Data Pipeline setup. Here we are going to tell Snowflake which table we are going to create by specifying a prefix. Combined with the sheet name, this will be the Target Table name. On top of that we specify which Warehouse to use and in which Database Schema we are going to create the table and load the data.

Data Pipeline Snowflake Configuration
Data Pipeline Snowflake Configuration

Verify the results in Snowflake

If everything works out like planned the MDL Data Pipeline will run and load the data into Snowflake. We can head over there and verify the results. As we can see there is a STG_CHANNELS table in the DAANALYTICS-database in the STAGING-schema. The columns match the columns specified in the sheet. Also the data is exactly the same as in the sheet. Apart from to Matillion-specific metadata columns.

Verify results in Snowflake
Verify results in Snowflake

Data & Analytics 📊 Consultant @daanalytics_nl & Partner @pongprof & Board Member @nl_OUG | Modern ☁️ Analytics #Oracle #Snowflake #Looker #Matillion #Fivetran