Azure Data Factory v2

Azure Data Factory visual tools were enabled in public preview a few days back. In this post, let us see how the new look and feel works and what's new in ADF v2.

From the Azure portal, while creating the Azure Data Factory we need to select version v2; once it is created, click on Author & Monitor.




In my previous post, I shared an example of copying data from Azure Blob Storage to Azure Cosmos DB using the Copy Data wizard.



Let us see an example of copying data from Azure Blob Storage to Azure SQL Database using the new UI.

Click on Create pipeline on the landing page as shown above:

If we click on + near Search Resources, we can create Pipelines and Datasets.
A Linked Service and a Dataset can be created before or while creating the pipeline.

A linked service can be created through the below flow:
Connections (see the bottom of the image below) -> Click New -> Select Source or Sink -> Test Connection -> Save
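
Under the hood, each linked service is just a JSON definition. As a rough sketch (the name, account, and key below are placeholders, not the actual values used in this example), an Azure Blob Storage linked service looks something like this:

    {
        "name": "AzureBlobStorageLinkedService",
        "properties": {
            "type": "AzureStorage",
            "typeProperties": {
                "connectionString": {
                    "type": "SecureString",
                    "value": "DefaultEndpointsProtocol=https;AccountName=<account name>;AccountKey=<account key>"
                }
            }
        }
    }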




Once we create a Pipeline, we can see the list of Activities that can be dragged & dropped onto the canvas.



A source dataset can be created through the below flow:
Pipeline -> Click Activity -> Source Dataset (at the bottom) -> New -> Connection -> Select the Source Linked service (or click + to create a new linked service) -> Pencil icon on the left (Edit) -> Publish

Note: We cannot close the dataset window without publishing the changes.

For this example, I have created an Azure Blob Storage account, created a container called myfolder, and uploaded some files.

As shown in the image below, we have options to Test connection, Preview data, and Detect format while creating the dataset.

Next, click on Schema to import the schema and publish the changes.
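
Behind the UI, the published source dataset is a JSON document. Below is a minimal sketch, assuming a comma-delimited text file with a header row; the dataset name and the two columns (Id, Name) are made up for illustration:

    {
        "name": "SourceBlobDataset",
        "properties": {
            "type": "AzureBlob",
            "linkedServiceName": {
                "referenceName": "AzureBlobStorageLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "folderPath": "myfolder",
                "format": {
                    "type": "TextFormat",
                    "columnDelimiter": ",",
                    "firstRowAsHeader": true
                }
            },
            "structure": [
                { "name": "Id", "type": "Int32" },
                { "name": "Name", "type": "String" }
            ]
        }
    }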




After creating the source dataset, click on the pipeline window again and then, at the bottom, click on Sink to create the target dataset.

A sink dataset can be created through the below flow:
Pipeline -> Click Activity -> Sink Dataset (at the bottom) -> New -> Connection -> Select the Sink Linked service (or click + to create a new linked service) -> Pencil icon on the left (Edit) -> Publish

For this example, I have created a table within an Azure SQL Database.





Next, click on Schema to import the schema and publish the changes.
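
Similarly, the sink dataset is a JSON document pointing at the SQL table; a rough sketch (the dataset, linked service, table, and column names are made up for illustration):

    {
        "name": "SinkSqlDataset",
        "properties": {
            "type": "AzureSqlTable",
            "linkedServiceName": {
                "referenceName": "AzureSqlDatabaseLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "tableName": "dbo.MyTable"
            },
            "structure": [
                { "name": "Id", "type": "Int32" },
                { "name": "Name", "type": "String" }
            ]
        }
    }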

We can also create parameters and write expressions if we have to handle dynamic scenarios within the pipeline, as sketched below.
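
For example, a parameter can be declared in the pipeline JSON (the parameter name folderName here is made up for illustration):

    "parameters": {
        "folderName": {
            "type": "String",
            "defaultValue": "myfolder"
        }
    }

Anywhere an expression is allowed, it can then be referenced as @pipeline().parameters.folderName.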


After publishing the sink dataset changes, click on the Pipeline window -> Mapping (like we do the mappings between source and destination in SSIS packages).
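
The mapping done in the UI is stored on the Copy activity as a translator; a sketch of that fragment, assuming the made-up Id/Name columns from the earlier dataset sketches:

    "translator": {
        "type": "TabularTranslator",
        "columnMappings": "Id: Id, Name: Name"
    }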



In the Settings, we have an option to use an intermediate staging store while copying the data, as shown below:
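
Under the hood, this option maps to staging settings on the Copy activity; a rough fragment sketch, where the staging linked service name and path are placeholders:

    "enableStaging": true,
    "stagingSettings": {
        "linkedServiceName": {
            "referenceName": "StagingBlobLinkedService",
            "type": "LinkedServiceReference"
        },
        "path": "stagingcontainer"
    }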



We can now validate the entire pipeline to see if there are any problems in the pipeline / datasets.



We can do a Test Run or Trigger the pipeline. In the Output window -> Actions, we can get the execution info.



We have successfully copied the data from the blob to the Azure SQL table using the new ADF UI.

As shown in the image below, we can create control flow between activities using On Success / Failure / Completion branching.
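
In the JSON, this branching shows up as a dependsOn block on the downstream activity (the activity name CopyBlobToSql is made up); the dependency conditions can be Succeeded, Failed, Completed, or Skipped:

    "dependsOn": [
        {
            "activity": "CopyBlobToSql",
            "dependencyConditions": [ "Succeeded" ]
        }
    ]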


On the Monitor page, we can also have a look at the Activity Runs info.


Another thing: if we click on the {} code icon in the UI, we can get the corresponding JSON document of the pipeline / dataset.
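
For instance, the copy pipeline built above would come out roughly like the sketch below (the pipeline, activity, and dataset names are the made-up ones from the earlier sketches):

    {
        "name": "CopyBlobToSqlPipeline",
        "properties": {
            "activities": [
                {
                    "name": "CopyBlobToSql",
                    "type": "Copy",
                    "inputs": [
                        { "referenceName": "SourceBlobDataset", "type": "DatasetReference" }
                    ],
                    "outputs": [
                        { "referenceName": "SinkSqlDataset", "type": "DatasetReference" }
                    ],
                    "typeProperties": {
                        "source": { "type": "BlobSource" },
                        "sink": { "type": "SqlSink" }
                    }
                }
            ]
        }
    }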

Initially we were able to provision the Azure-SSIS integration runtime only using PowerShell commands; now we have a UI to do that.
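
Whichever way it is provisioned, the result is an integration runtime definition; a rough JSON sketch of a managed Azure-SSIS IR, where the name and every value are placeholders:

    {
        "name": "MyAzureSsisIr",
        "properties": {
            "type": "Managed",
            "typeProperties": {
                "computeProperties": {
                    "location": "EastUS",
                    "nodeSize": "Standard_D2_v3",
                    "numberOfNodes": 1,
                    "maxParallelExecutionsPerNode": 1
                },
                "ssisProperties": {
                    "catalogInfo": {
                        "catalogServerEndpoint": "<server>.database.windows.net",
                        "catalogAdminUserName": "<admin user>",
                        "catalogAdminPassword": {
                            "type": "SecureString",
                            "value": "<password>"
                        },
                        "catalogPricingTier": "Basic"
                    }
                }
            }
        }
    }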
