Modern data Pipelines. How to load data into Snowflake with Fivetran
Partner Enabled Data Loading with Fivetran into Snowflake
A Data Warehouse without data is useless. This is also the case for Snowflake. Even though it is one of the best Cloud Data Warehouses today. Loading data into Snowflake is easy. There are several ways to achieve this. There were some announcements during the Snowflake Summit 2019, so the possibilities increased and got better. In this blog I will show how to load data into Snowflake with Fivetran. Therefore I use the Partner Connect function and load a Google Sheet into Snowflake.
“Partner Connect simplifies the onboarding process through pre-built integrations with Snowflake’s technology partners. You can automatically provision and configure partner applications and start loading data into Snowflake in minutes for immediate analysis.”
- Business Intelligence (BI) tools allow customers to visualize insights from underlying data sources/databases.
- Data Integration (ETL/ELT) tools allow customers extracting data from the input source, transforming it, and eventually loading it into the final destination data source (ETL). With Snowflake, more modern tools can support Extracting, Loading, and THEN transforming in Snowflake (ELT).
- Data Security/Privacy tools can detect and monitor sensitive data in your source system, anonymize it, and then confirm it is not exposed within Snowflake.
Select the Partner
Make sure you are an Account admin. Click on Partner Connect in the Navigation Bar (or Ribbon).
Setting up a connection is very easy and straight forward. There are a few steps (review disclaimers, successfully connecting, activating, etc. ) which speaks for itself.
In the end, you will have a specific database, a virtual warehouse, a user and a role specifically setup for use with the (in this case) Fivetran.
For this setup, and because we used a similar setup during the training session at the Snowflake Summit 2019, I will load a Google sheet. A few things are important. The first row of the sheet should contain the column names. Then the Google sheet needs a Named Range. This marks the data (including the column names) which will be loaded.
Create Fivetran Connector
To be able to setup the Fivetran Connector, you’ll both need information from the Google sheet as well as Fivetran.
Share the Google Sheet to Fivetran
While setting up the connection in Fivetran, you will get an email address which you should use in the Google Sheet to share the sheet with Fivetran. This will yield a sheet id, which you will use in Fivetran the complete the connection setup.
Setup Fivetran Connector
Fill in all the necessary fields. Get the Sheet Id from Google. If the Sheet Id is correct and the sheet has been shared ok, you will be able to select the Named Range;
Validate Fivetran Connector
Finally Save & Test the created connector to see whether everything has worked out ok.
Everything worked out correct and the Fivetran Connector has run. In the Snowflake Database there will be a ‘PIPELINE_EXTRACT_INGEST’-table in the GSHEETS-schema, loaded with the data from the Google Sheet.
Check Snowflake result
I am able to query the same data as in the Google Sheet from Snowflake.
At the beginning of this post I promised to show how to load data into Snowflake with Fivetran. The setup was relatively easy. Within a few minutes Fivetran helps you setup the necessary objects in Snowflake and loads the data for you from the Google Sheet. This blogpost shows how easy it is to setup a modern data pipeline.
Thanks for reading.
Originally published at https://daanalytics.nl on June 20, 2019.