This is in reference to this Git repo https://dev.azure.com/mortimer-xyz/mortie23/_git/dbfs-cicd-pieline
This is based on a simplified version of the repo from Adam Paternostro. Read up on his repo for a much more feature-rich version.
I only required a small subset of what Adam was doing, so I created my own how-to in the process. The following is specifically about syncing Databricks init scripts from a Git repo to DBFS.
There are quite a few things you need to get this to work. Most of it is documented elsewhere, so instead of rewriting it, I’ll reference it.
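To make the idea of syncing concrete, here is a minimal sketch of what the copy step could look like against the DBFS REST API (`POST /api/2.0/dbfs/put`, which takes a `path`, base64 `contents` and an `overwrite` flag). It is not the script from the repo. The function names, the `*.sh` filter and the target folder are assumptions made for illustration, and inline `contents` is only suitable for small files such as init scripts.

```python
import base64
import json
import urllib.request
from pathlib import Path


def build_put_payloads(repo_dir, dbfs_dir):
    """Return one DBFS `put` payload per *.sh file found under repo_dir.

    Hypothetical helper: the *.sh filter and the folder layout are assumptions.
    """
    root = Path(repo_dir)
    payloads = []
    for script in sorted(root.rglob("*.sh")):
        # Keep the path relative to the repo folder so sub-folders are preserved.
        relative = script.relative_to(root).as_posix()
        payloads.append({
            "path": f"{dbfs_dir.rstrip('/')}/{relative}",
            "contents": base64.b64encode(script.read_bytes()).decode("ascii"),
            "overwrite": True,
        })
    return payloads


def upload(payload, host, token):
    """POST a single payload to the DBFS put endpoint and return the HTTP status."""
    request = urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/dbfs/put",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In a pipeline step you would loop over `build_put_payloads(...)` and call `upload(...)` for each payload, with the workspace URL and token read from pipeline variables rather than hard-coded.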
Add the content of this repo.
This will create its own service principal, which we will be using later.
You’ll need to generate a secret for the service principal.
Give the service principal associated with the Azure Resource Manager service connection (created previously) the Secret management permissions on the Key Vault.
Now, checking in our Databricks DBFS we see the folder and the scripts we copied.
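The same check can be done without opening the workspace by calling the DBFS list endpoint (`GET /api/2.0/dbfs/list`), which returns a `files` array whose entries carry `path` and `is_dir`. The sketch below is only an illustration: the helper names are made up and the folder you pass in is whatever target folder your pipeline copies to.

```python
import json
import urllib.parse
import urllib.request


def list_request(host, token, dbfs_dir):
    """Build (but do not send) a request that lists a DBFS folder."""
    query = urllib.parse.urlencode({"path": dbfs_dir})
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/dbfs/list?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def script_paths(response_body):
    """Return the sorted file paths (folders skipped) from a list response body."""
    listing = json.loads(response_body)
    return sorted(
        entry["path"]
        for entry in listing.get("files", [])
        if not entry["is_dir"]
    )
```

Send the request with `urllib.request.urlopen(list_request(...))`, pass the body to `script_paths(...)`, and compare the result with the scripts in the repo.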
If we modify the scripts and then
git add .
git commit -m 'message'
git push
etc. We can configure the pipeline to run on a pull request completion to