Welcome back! This post is part four in a series that walks through how to set up a basic copy data pipeline in Azure Data Factory. So far, we’ve gone through our first two steps: creating a SQL Database Linked Service and creating an Azure Data Lake Gen2 Storage Account Linked Service. Previous posts established that a linked service is simply a connection to a data source, a destination, or both. Now we’re going to explore the concept of a dataset in Azure Data Factory and create one for each of our linked services.
What is a Dataset, and Where Does it Fit in?
A dataset is a virtual representation of a data item, such as a table or a file, that is reached through a Linked Service. Datasets are a crucial pillar of Azure Data Factory. Without them, we would not be able to read data from, or write data to, the linked services we specified. This post will explore how to configure a dataset to choose a specific table from a SQL Database, as well as how to read in CSV and Parquet files from an Azure Data Lake Gen2 Storage account.
How to Create a Dataset from a SQL Database Linked Service
Make sure your instance of Azure Data Factory Studio is open.
1) From the Main page, select the ‘Author’ tab, find the dataset dropdown, and expand it.
2) Select the subfolder you would like to create your dataset under, hover over the ellipsis, and select ‘Create New Dataset.’
3) The first dataset we will create is for the Cars SQL table in our training database. Search for and select Azure SQL Database. When prompted, name the dataset appropriately. The name should start with “ds_” followed by the name (ds_<<NAME>>).
4) Next, select the linked service we set up in section 2.3.1. Once it loads, a new dropdown will appear for selecting the table from the database.
5) For this example, we select dbo.Cars. Click ‘Create.’
Congratulations! You have now created a dataset for a table, using a SQL Database Linked Service.
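If you open the code view of a dataset in the studio, you will see that it is stored as a small JSON document. The Python sketch below builds a definition of roughly that shape for the steps above, so you can see how the dataset name, the linked service, and the table fit together. Treat it as an illustration rather than the exact output of the studio: the dataset name ds_Cars, the linked service name ls_sql_training, and the helper function build_sql_table_dataset are assumptions made for this example.

```python
import json


def build_sql_table_dataset(name: str, linked_service: str, schema: str, table: str) -> dict:
    """Return a dataset definition shaped like the JSON in the studio's code view.

    Hypothetical helper for illustration; it does not call Azure.
    """
    # Naming convention from step 3: dataset names start with "ds_".
    if not name.startswith("ds_"):
        raise ValueError("dataset names in this series start with 'ds_'")
    return {
        "name": name,
        "properties": {
            # "AzureSqlTable" is the dataset type for an Azure SQL Database table.
            "type": "AzureSqlTable",
            # The dataset does not hold connection details itself;
            # it points at the linked service by name.
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            # Step 5: the table picked from the dropdown, split into schema and table.
            "typeProperties": {"schema": schema, "table": table},
        },
    }


# "ds_Cars" and "ls_sql_training" are assumed names for this example.
dataset = build_sql_table_dataset("ds_Cars", "ls_sql_training", "dbo", "Cars")
print(json.dumps(dataset, indent=2))
```

Notice that the dataset only references the linked service by name. That is why the linked service had to exist before step 4 could offer a table dropdown.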
If you would like to explore options in partnering with Tallan to help build out your business’s cloud data analytics platform, please reach out to me at Conner.Wulf@tallan.com or connect on LinkedIn.