Automated publish (CI/CD) for Azure Data Factory using DevOps

The ADF team recently announced support for automated publish, something I had wanted for a while in my day-to-day ADF work. Before that, anyone who wanted to create a release in DevOps had to go to the master branch and click the Publish button manually to generate the ARM template files used by the ARM template deployment in DevOps. With this announcement, whenever a pull request is merged into your collaboration branch (in most cases the master branch), a new build is created, so manual intervention is no longer required. You can read more about this in the Microsoft documentation below.
https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment-improvements/?WT.mc_id=DP-MVP-5004277

I wanted to write this post because I felt the documentation is not comprehensive enough for someone new to DevOps to read it and configure automated publish on their own. Therefore, I’m going to create a step-by-step guide for this.

In my case I have my master branch and a feature branch in my dev ADF instance.  The idea is whenever I make a pull request from my working branch (“AsankaP”) to master, it should trigger a new release and push the release to QA.

Branch set up in Dev ADF

Folder Structure of the Repository

To set up the build pipeline, you need to understand how I have set up the folder structure of my repository. Within my Moana repository, I have created a folder called “ADFIntegration” and used that to store all the ADF artifacts. In other words, my root folder is “ADFIntegration”.

GIT Repository configuration of Azure Data Factory

Within that root folder, I have created a folder called “DevOpsFiles” to store all the files required in this build pipeline setup.

Repository File Structure

STEP 1: Copy package.json file to your master branch repo.

Copy the JSON code below and save it as package.json inside your dev repository. Although you can save this file anywhere within the repo, create a folder within your ADF repository folder and save it there to avoid any confusion. In my case, I stored it within the “DevOpsFiles” folder. The npm package version was 0.1.5 when I last updated this post, but it may have changed since. Check for the latest version of the ADF npm package using the link below and change the version to the latest stable version.

Check for latest npm version: https://www.npmjs.com/package/@microsoft/azure-data-factory-utilities

{
    "scripts":{
        "build":"node node_modules/@microsoft/azure-data-factory-utilities/lib/index"
    },
    "dependencies":{
        "@microsoft/azure-data-factory-utilities":"^0.1.5"
    }
} 

Store Package.json file in repo
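
Before committing the file, it can help to confirm that it contains the two entries the build pipeline relies on. The following is only a minimal sketch in Python: the function name check_package_json is made up for illustration, and the sample text simply repeats the package.json shown above.

```python
import json

# Name of the npm package referenced in the package.json shown above.
ADF_UTILITIES = "@microsoft/azure-data-factory-utilities"


def check_package_json(text: str) -> list[str]:
    """Return a list of problems found in a package.json string (empty if none).

    Hypothetical helper for illustration; it only checks the two entries
    that the build pipeline in this post depends on.
    """
    problems = []
    package = json.loads(text)
    if "build" not in package.get("scripts", {}):
        problems.append('missing "build" entry under "scripts"')
    if ADF_UTILITIES not in package.get("dependencies", {}):
        problems.append(f'missing "{ADF_UTILITIES}" under "dependencies"')
    return problems


if __name__ == "__main__":
    # Same content as the package.json in this post.
    sample = """
    {
        "scripts": {
            "build": "node node_modules/@microsoft/azure-data-factory-utilities/lib/index"
        },
        "dependencies": {
            "@microsoft/azure-data-factory-utilities": "^0.1.5"
        }
    }
    """
    print(check_package_json(sample) or "package.json looks fine")
```

To check a real file, read it with open(...).read() and pass the text to the function.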

STEP 2: Creating a Build Pipeline

Go to DevOps, click on Pipelines, and then create a new pipeline. Select Azure Repo as your code location and select where you want to save the YAML file.

Select Azure Repo Git for YAML file location

Then you need to select the repo in which you want to save the YAML file. I selected the same repo that holds my ADF code.

Select the same repo as ADF repo

Then it will ask how you want to configure your pipeline. Select “Starter Pipeline”.

Select Starter Pipeline

Replace the default template YAML code with the code below and change the parts I have marked within square brackets. Read the comments in the YAML code to understand where you need to change information.

You will need to change:

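  • Azure Subscription Id [the value after /subscriptions/]
  • Resource group name of the Dev ADF instance [the value after /resourceGroups/]
  • Dev ADF Instance Name
  • Root folder name [folder location of ADF Artifacts]
  • Folder Location of the package.json file

Apart from that, the trigger section below uses “main”. If your collaboration branch has a different name (mine is master), change “main” to that branch name.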

# Sample YAML file to validate and export an ARM template into a build artifact
# Requires a package.json file located in the target repository

trigger:
- main #collaboration branch [if the branch is not main, change it to the respective branch name]

pool:
  vmImage: 'ubuntu-latest'
steps:

# Installs Node and the npm packages saved in your package.json file in the build

- task: NodeTool@0
  inputs:
    versionSpec: '10.x'
  displayName: 'Install Node.js'

- task: Npm@1
  inputs:
    command: 'install'
    workingDir: '$(Build.Repository.LocalPath)/ADFIntegration/DevOpsFiles' #replace with the package.json folder; if the package file is saved within the root folder, remove "DevOpsFiles". If there is no root folder, only keep Build.Repository.LocalPath
    verbose: true
  displayName: 'Install npm package'

# Validates all of the Data Factory resources in the repository. You'll get the same validation errors as when "Validate All" is selected.
# Enter the appropriate subscription and name for the source factory.

- task: Npm@1
  inputs:
    command: 'custom'
    workingDir: '$(Build.Repository.LocalPath)/ADFIntegration/DevOpsFiles' #replace with the package.json folder; if the package file is saved within the root folder, remove "DevOpsFiles". If there is no root folder, only keep Build.Repository.LocalPath
    customCommand: 'run build validate $(Build.Repository.LocalPath)/ADFIntegration /subscriptions/[subscription id of your ADF instance]/resourceGroups/[resource group name of your ADF instance]/providers/Microsoft.DataFactory/factories/[name of your ADF instance]' # Change "ADFIntegration" to the name of your root folder; if there is no root folder, remove that part and keep Build.Repository.LocalPath only.
  displayName: 'Validate'

# Validate and then generate the ARM template into the destination folder, which is the same as selecting "Publish" from the UX.
# The ARM template generated isn't published to the live version of the factory. Deployment should be done by using a CI/CD pipeline. 

- task: Npm@1
  inputs:
    command: 'custom'
    workingDir: '$(Build.Repository.LocalPath)/ADFIntegration/DevOpsFiles' #replace with the package.json folder; if the package file is saved within the root folder, remove "DevOpsFiles". If there is no root folder, only keep Build.Repository.LocalPath
    customCommand: 'run build export $(Build.Repository.LocalPath)/ADFIntegration /subscriptions/[subscription id of your ADF instance]/resourceGroups/[resource group name of your ADF instance]/providers/Microsoft.DataFactory/factories/[name of your ADF instance] "ArmTemplate"'  # Change "ADFIntegration" to the name of your root folder; if there is no root folder, remove that part.
  displayName: 'Validate and Generate ARM template'

# Publish the artifact to be used as a source for a release pipeline.

- task: PublishPipelineArtifact@1
  inputs:
    targetPath: '$(Build.Repository.LocalPath)/ADFIntegration/DevOpsFiles/ArmTemplate' #replace with the package.json folder,if package file is saved within root folder, remove "DevOpsFiles". if there is no root folder, only keep Build.Repository.LocalPath

    artifact: 'ArmTemplates'
    publishLocation: 'pipeline'
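
The long resource ID passed to the validate and export commands above is easy to get wrong, because the subscription ID, the resource group name and the factory name all have to land in the right segment. Here is a small Python sketch that assembles it. The function name factory_resource_id and the example values are made up for illustration; the path layout is the standard Azure resource ID format for a Data Factory.

```python
def factory_resource_id(subscription_id: str, resource_group: str, factory_name: str) -> str:
    """Build the resource ID string used by the validate/export commands.

    Hypothetical helper for illustration only. It rejects empty values and
    leftover [placeholder] text so a half-edited YAML value is caught early.
    """
    parts = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "factory_name": factory_name,
    }
    for label, value in parts.items():
        if not value or "[" in value or "]" in value:
            raise ValueError(f"{label} is empty or still a placeholder: {value!r}")
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
    )


if __name__ == "__main__":
    # Example values are invented; replace them with your own.
    print(
        factory_resource_id(
            "00000000-0000-0000-0000-000000000000",
            "my-dev-rg",
            "my-dev-adf",
        )
    )
```

Paste the printed string into both customCommand lines in place of the bracketed resource ID.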

Once all the modifications are done, click “Save and Run” to check whether the build pipeline works correctly. If there are no errors, it should generate the artifacts in the build folder, which in my case is the ArmTemplate folder.

Build pipeline execution

If you click on the highlighted section, “1 artifact produced”, you should be able to see the ARM template files generated by the build.

ARM template files generated from the build pipeline
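
If you want to confirm that the generated template actually contains your pipelines and other objects, you can download the artifact and count the entries in its resources section. The sketch below is in Python and uses a tiny made-up template in place of the real ARMTemplateForFactory.json. The function name count_resources_by_type is hypothetical.

```python
import json
from collections import Counter


def count_resources_by_type(template: dict) -> Counter:
    """Count the entries of an ARM template's "resources" list by their "type"."""
    return Counter(
        resource.get("type", "<no type>")
        for resource in template.get("resources", [])
    )


if __name__ == "__main__":
    # Invented, heavily trimmed stand-in for ARMTemplateForFactory.json.
    sample = json.loads("""
    {
        "resources": [
            {"type": "Microsoft.DataFactory/factories/pipelines",
             "name": "[concat(parameters('factoryName'), '/WaitPipeline')]"},
            {"type": "Microsoft.DataFactory/factories/linkedServices",
             "name": "[concat(parameters('factoryName'), '/MyStorage')]"}
        ]
    }
    """)
    counts = count_resources_by_type(sample)
    if not counts:
        print("No resources found: check the root folder path passed to the export command.")
    for resource_type, number in sorted(counts.items()):
        print(number, resource_type)
```

For the real file, load it with json.load(open("ARMTemplateForFactory.json", encoding="utf-8")) and pass the result to the function. An empty result usually means the export step did not look in the folder that holds your ADF artifacts, although I have not verified every cause of that.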

STEP 3: Create a Release Pipeline

The next step is to create a release pipeline. Go to Releases in DevOps and create a new release pipeline. Select Empty Job to start with.

Select empty job in release pipeline.

Click on Add an Artifact and select Build as the source type. Select the name of the build pipeline you created in the previous step. You can keep Latest as the Default Version or change it based on your release requirements.

Select Build as source type

Then click on “Continuous Deployment Trigger” and enable continuous deployment as below. This makes sure that when a new build is available, a release is triggered from the release pipeline. Make sure you select the correct build branch, which in my case is the master branch.

Enable continuous deployment trigger

Then go to the stage and add an agent job. In the Agent Job, click the + icon and search for ARM template. Select ARM template deployment and click Add.

Add ARM template deployment task to agent

In the ARM template deployment task, provide the Azure Resource Manager connection, Subscription, and Resource group. By this time, you should have created ADF instances for QA and Prod. In my case, the QA ADF instance is asankap-QA.

Click on Template and select “ARMTemplateForFactory.json” as the template file as below.

Select template file

Click on “Template parameters” and select “ARMTemplateParametersForFactory.json” as the parameter file.

Select Template parameter file

Then click on Override template parameters and set the parameters manually. Unfortunately, since the build artifact is only available at run time, the parameter list is not populated for you and you will have to add the parameters manually one by one. Check the image below to see how to add the parameters you want to override. You can get the parameter names from the parameter JSON file you specified previously. Make sure you change factoryName to your QA or Production ADF instance name.

Manually adding template parameter values.
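
If you prefer typing the overrides as text rather than adding them one by one in the grid, the override box takes a string of -name "value" pairs. The Python sketch below builds that string and checks each name against the parameter file first. The function name build_override_parameters and the sample parameter file are made up for illustration; only factoryName and asankap-QA come from this post.

```python
import json


def build_override_parameters(parameter_file: dict, overrides: dict[str, str]) -> str:
    """Build an override string such as: -factoryName "asankap-QA".

    Hypothetical helper for illustration. It raises KeyError when an override
    name does not exist in the parameter file, and it does not escape quotes
    inside values.
    """
    known = parameter_file.get("parameters", {})
    unknown = sorted(set(overrides) - set(known))
    if unknown:
        raise KeyError(f"not in the parameter file: {', '.join(unknown)}")
    return " ".join(f'-{name} "{value}"' for name, value in overrides.items())


if __name__ == "__main__":
    # Invented, trimmed stand-in for ARMTemplateParametersForFactory.json.
    sample = json.loads("""
    {
        "parameters": {
            "factoryName": {"value": "my-dev-adf"}
        }
    }
    """)
    print("Available parameters:", ", ".join(sample["parameters"]))
    print(build_override_parameters(sample, {"factoryName": "asankap-QA"}))
```

The names under "parameters" in your own ARMTemplateParametersForFactory.json are the ones you can override; which of them need a different value in QA depends on your linked services.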

Save the release pipeline, make a change in the Dev ADF instance, and create a pull request to update the master branch.

Create a pull request to trigger a release

If you go and check DevOps, you will see that the build pipeline is triggered and, once the build is ready, the release pipeline is triggered to move the changes to the QA ADF instance.

Build pipeline is triggered by the pull request.

Release pipeline is triggered and a release is created.

I hope this post helps you to create fully automated release pipelines for your ADF environment. Thanks for reading and stay safe! Cheers!

10 thoughts on “Automated publish (CI/CD) for Azure Data Factory using DevOps”

  1. porridge111 April 22, 2021 / 5:19 pm

    Thanks for the thorough tutorial! One question – do you publish changes to Dev through the release pipeline too? I tried to do that, but it caused git-mode to become disabled 😕

    • Asanka Padmakumara April 22, 2021 / 6:57 pm

      Not sure how you have set up the environment here. You don’t have to publish changes to Dev as you used Dev instance to create all the artifacts. Having ARM templates in Dev does not help as well.

  2. zemag14 April 27, 2021 / 1:59 am

    Thank you for taking the time to put a more comprehensive set of steps to implement CI/CD. Both pipeline build and release run successfully, however after I check my QA ADF environment, I don’t see any of my pipelines (I started with a blank repo and have been creating a simple pipeline with a wait activity for testing purposes). Not sure if it’s an access issue, or ARM Templates aren’t updating?

    • Anders Knudsen July 16, 2021 / 2:04 am

      I have the same issue. Any updates on this?

    • Aman Jain September 4, 2021 / 5:38 pm

      same issue any method you get???

  3. Sushanta Meher May 26, 2021 / 9:14 pm

    Great explanation Asanka! I have one question, if I need to use Gitlab CICD instead of Azure DevOps, how would I implement this?

  4. Anders Knudsen July 16, 2021 / 2:42 am

    Hey Asanka.
    I really enjoyed your tutorial on CI/CD for ADF, but I have one question. It seems like in the artifact produced, it does not contain any of the pipelines. The “resource” section of the ARM template for the ADF produced in the build pipeline is empty, eventhough a pipeline is found in the dev ADF.
    I can see when I do it manually by-hand, the “resource” section of the ARM template for the ADF in both my collaboration and adf_publish branch contains have my pipelines listed as resources under “resource” in the ARM template.
    Is this anything you can help me correct?
    Best regards Anders

  5. dasaradh reddy August 9, 2021 / 3:16 pm

    Hi Asanka, Nice article. Can you share any public git repo . helps to execute above steps. Thanks
