(Data/Power BI) Governance Sessions at SQLBits 2023

This week is SQLBits 2023 week. I will start my journey tomorrow and in preparation I had a look at the schedule to find some interesting sessions in the Data / Power BI Governance area.

There is a long list of sessions, and even training days, on data governance planned for this year’s conference.

Training Days

  • Building an end-to-end and open solution to monitor and govern your entire data estate (Dave Ruijter & Marc Lelijveld)
  • Govern your Power BI environment (Hylke Peek)

General Sessions

Thursday

  • Power BI governance, disaster recovery and auditing (Alex Whitles)
  • Power BI Governance quick start (Asgeir Gunnarsson)
  • Managing access to data sources in your data estate with Microsoft Purview (Erwin de Kreuk)
  • Integrate Data Quality into your processes (Tillmann Eitelberg, Oliver Engels)

Friday

  • Unity Catalogue and Purview: Data Governance Bedfellows (Barny Self)
  • Govern your self-service integration process (Tillmann Eitelberg, Oliver Engels)
  • Business Benefits of Good Governance (Victoria Holt)
  • Maximize the business value of data with Microsoft Purview Data Governance (Blesson John, Gaurav Malhotra)
  • Microsoft Purview Data Governance: Updates & Roadmap (Gaurav Malhotra)
  • Build your trust in your data. Why data governance is the key to success (Johan Ludvig Brattas, Marthe Moengen)
  • Meet the PG: Microsoft Purview (Gaurav Malhotra)

Saturday

  • Extended Governance of Power BI with Microsoft Purview (Oliver Engels, Gabi Münster)
  • The Power BI secret sauce: Security and governance (Kasper de Jonge)

I hope to attend as many of those sessions as possible. If you are attending SQLBits too, I am happy to connect and chat! Just stop me.

Treat your data better and happy conference,

Wolfgang

Posted in Azure Purview, Conference, Data Governance, InformationSharing, Microsoft Purview

One Way to Try Microsoft Purview (Data Governance) for Free – BUT…

One of the most common complaints about Microsoft Purview Data Governance is the lack of a development/trial option. The pricing of Microsoft Purview is based on different usage types – including Data Map Population, Data Map Enrichment and Data Map Consumption.

Quite a price for “just” a demo/test environment

Erwin (t | b) blogged about the different pieces that form the total pricing (blog post: Updated Microsoft Purview pricing). Erwin also includes a pricing example: at the time of his writing, the minimum requirement was 1 CU (Capacity Unit) for the base data map itself (not including any scans or report generation), costing 284.70 €/month. Once you add the costs for scanning, insights generation and advanced resource sets, this could easily add up, which made it unsuitable for a small demo environment.

But something changed, you can now try Purview (almost) for free

It was the blog post describing a “Low-cost solution for managing access to SQL health, ..”. That blog post mentioned something completely new to me – Microsoft Purview Data Map Consumption is free of charge for a metadata storage size below 1 MB. I checked the Purview pricing page and yes – the included quantity of 1 MB is mentioned there (“The first 1 MB of Data Map meta data storage is free for all customers”).

And my reaction was – Nice, very nice.. I can try and create Microsoft Purview instances for free and test new features..

BUT: I wanted to be sure and check how much metadata (sources, scan results, data assets, classifications) actually fits into 1 MB.

Let’s try it – create a new Purview account

I immediately went over to the Azure Portal and created a new Microsoft Purview account – let’s name it purviewSizeTest.

Next, I fed the data catalog with some demo data. I went for AdventureWorks as it is a small demo database with a handful of tables. If you do not know how to get the AdventureWorks sample data into an Azure SQL Database, Koen (t) wrote a nice how-to post (https://www.mssqltips.com/sqlservertip/7565/adventureworks-database-installation-azure-sql-database/).

I created two sub-collections in my data map and registered the Azure SQL Database (including a database with AdventureWorks) in my data map.

Next, I defined a scan referencing the system default scan rule set for Azure SQL (I configured the scan without lineage extraction) and ran the scan once.

The scan itself ran for ~7 minutes and discovered and ingested 26 data assets into the data map.

How to check the Microsoft Purview Data map size

Now the big question was – how much meta data was produced by these two sub-collections, the Azure SQL source, the scan plus the 26 data assets discovered?

You can check your Data Map storage size in the Azure properties (Overview) of your Purview account. In my case the storage size was not updated immediately – I think it took about 6 hours until the storage size was displayed there. And guess what, the storage size was < 1 MB! 😉

So it is free, right… !? hmm.. is it really free?

After the Storage size check, I needed to check the pricing. And guess what.. the test was (almost) free. Why only almost and not free as I mentioned above?

Well, the data map consumption is free for the small set of meta data, but scans still cost you some € / $. In my case the scan of my AdventureWorks database consumed 1.08 vCore hours and cost me ~0.68 € (pricing is 0.598 € per vCore hour).
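To make the arithmetic behind that scan charge easy to replay, here is a minimal sketch in Python. It only uses the two figures quoted above (1.08 vCore hours, 0.598 € per vCore hour) and the 1 MB free storage allowance; the function and constant names are made up for illustration. Note that the plain product of the two quoted figures comes out a little below the ~0.68 € on the bill; the post does not explain the difference, so treat the result as a rough estimate, not as a reproduction of the invoice.

```python
# Figures quoted in the post (Purview pricing as of 2023-02-09).
# All names below are hypothetical helpers, not part of any Azure SDK.
PRICE_PER_VCORE_HOUR_EUR = 0.598  # Data Map Population (scanning)
FREE_DATA_MAP_STORAGE_MB = 1.0    # first 1 MB of metadata storage is free


def scan_cost_eur(vcore_hours, price=PRICE_PER_VCORE_HOUR_EUR):
    """Estimated cost of one scan: consumed vCore hours times the hourly rate."""
    return vcore_hours * price


def data_map_is_free(storage_mb):
    """True while the Data Map metadata stays within the free allowance."""
    return storage_mb <= FREE_DATA_MAP_STORAGE_MB


estimated = scan_cost_eur(1.08)
print(f"Estimated scan cost: {estimated:.2f} EUR")
print(f"Data map still free at 0.5 MB: {data_map_is_free(0.5)}")
```

The same two helpers can be reused to estimate what a second or third scan would add before you kick it off.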

And even after a week of testing, no additional cost entry has shown up in my Azure costs for that specific Purview account.

Sounds nice, BUT… my findings are:

  • You do not get a completely free Microsoft Purview account – Only Data Map Consumption for a small (really small) set of meta data is free (the first 1 MB is free).
  • Data Map Population (Scans, ingestion, classification) and Data Map Enrichment (Insights Generation & Advanced Resource Sets) are NOT free.
  • Purview Applications could also cost you some money (most of them are still in preview right now and free).

=> Be aware, you do not get a completely free Microsoft Purview instance (there are initial costs for the scan), but you can create an environment containing an Azure SQL source + AdventureWorks database to demo Microsoft Purview’s features with no running costs.

Pricing example based on the current Microsoft Purview pricing as of 2023-02-09.

Posted in Azure Purview, InformationSharing, Microsoft Purview

Data Community Austria Day 2023 – Wir leben noch! (We Are Still Alive!)

January is traditionally the month in which we hold our #SQLSatVienna / #DataSatVienna or the #DataCommunityAustria Day. It’s been a long time (a year) since we got together for a big data event in Austria.

Although the covid situation in Austria and around the world has changed compared to 2021 and 2022, we decided NOT to have an in-person event in January 2023. But we did not want to leave the #DataCommunity in Austria alone, waiting for the next event to come (or not).

So we got together, asked some of our speaker friends around the world and organized another edition of the #DataCommunityAustriaDay! It will happen on Friday, 13th January 2023, starting at 9am CET.

  • 12 sessions
  • 2 tracks
  • 12 speakers
  • Fun, come together, learn and share

You can find more information at our UG webpage: https://sqlusergroupaustria.wordpress.com/2023/01/05/data-community-austria-day-2023/

Register here: https://www.eventbrite.at/e/data-community-austria-day-2023-tickets-505907903157

Posted in Azure, Conference, Data Community, PowerBI

Microsoft Purview in the Year 2022 – A Recap Tour

A lot happened in the Microsoft Data Governance area, especially in the area of Microsoft Purview Data Governance.

Let’s go back in history and wrap up the announcements that have been made in the Microsoft (Azure) Purview area. My main source for this summary is the Security, Compliance and Identity Blog.

TL;DR

There is a video summarizing the news

Remember: At the beginning of 2022, Microsoft Purview was still named Azure Purview.

January 2022

February 2022

March 2022

April 2022

May 2022

July 2022

October 2022

November 2022

December 2022

Quite an impressive list – What was your favorite feature of Microsoft Purview in 2022?

Happy data cataloguing & treat your data better!

Wolfgang

Posted in Azure Purview, Data Governance, Microsoft Purview

Ignite 2022 – Azure Data Platform Update

It’s the week of Ignite 2022 and, as in previous years, many new features and updates have been announced. It’s hard to keep up with all the tweets, blog posts and similar things, so I will fall back on the good old tradition of the Ignite Book of News.

https://aka.ms/ignite-book-of-news

This page lists all the announcements in one place, so head over there and browse through the list. If you do not want to read, you can have a look at the 8 minute summary video I recorded to talk about the Azure Data news.

Have fun and #treatYourDataBetter

Wolfgang

Posted in Azure, Azure Purview, Azure Synapse Analytics, Business Applications, Business Intelligence, Cloud, Conference, DataNews, Microsoft Purview, PowerBI

Microsoft Purview and Data Governance at PASS Data Community Summit

Yesterday the session catalog for the hybrid PASS Data Community Summit 2022 was released. I am now allowed to talk about my sessions that got selected for this event. It looks like there is a little bit of Data Governance and Microsoft Purview focus 😉

  • Ask us Anything – We (Victoria, Erwin, Richard and myself) will be on stage and answer your questions about Data Governance and Microsoft Purview. If you have any questions you want us to answer, please submit them here: https://forms.office.com/r/dTP38LnmsJ
  • When the Standard UI is not Sufficient – Microsoft Purview & Apache ATLAS API. Many use cases in Microsoft Purview are covered by the web UI, but sometimes you need to do more (e.g. add custom lineage/properties or query the content of your data catalog). In those cases, the Apache ATLAS API can be used.
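To give a flavour of what “custom lineage” through the Apache Atlas API looks like, here is a minimal Python sketch that only builds the JSON body for an Atlas v2 entity of type Process, whose inputs and outputs attributes link existing data assets. The GUIDs, names and the qualified name are placeholders invented for this example, and the snippet deliberately does not send anything: the exact endpoint URL and the authentication for your Purview account should be taken from the official documentation.

```python
import json


def build_process_entity(name, qualified_name, input_guids, output_guids):
    """Build an Atlas v2 entity payload for a custom lineage 'Process'.

    The process connects already catalogued assets, referenced by GUID.
    """
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "name": name,
                "qualifiedName": qualified_name,
                "inputs": [{"guid": guid} for guid in input_guids],
                "outputs": [{"guid": guid} for guid in output_guids],
            },
        }
    }


# Placeholder values, purely illustrative.
payload = build_process_entity(
    name="nightly_customer_export",
    qualified_name="custom://demo/nightly_customer_export",
    input_guids=["<guid-of-source-asset>"],
    output_guids=["<guid-of-target-asset>"],
)
print(json.dumps(payload, indent=2))
```

A payload of this shape would then be posted to the entity endpoint of the Atlas API with a valid access token.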

Registration for the event is open and the early bird price ends on July 13th -> https://passdatacommunitysummit.com/

Govern your data and let’s meet at PASS Data Community Summit!

Wolfgang

Posted in Azure Purview, Conference, Data Community, Data Governance, InformationSharing, Microsoft Purview

DataGrillen 2022 – Data Governance Session

Last week was DataGrillen 2022 time. An in-person event and it was fun. Big kudos again to the organizers – Ben and William for another edition of a relaxed and perfectly organized event!

I had the pleasure of talking about one of my favorite topics – Data Governance with Microsoft Purview. As the event was a little bit focused on BBQ (Grillen in German) I thought – Why not change the topic a little bit and focus on the really important things in life: How to find the best food (beef) in your organization.

Title slide of my presentation

All in all, the session went well. I enjoyed giving the presentation and, according to the session feedback, the attendees liked it too! Thanks again for attending my session (and providing feedback).

The slides are available here – please download them, and if you have any questions about Data Governance or Microsoft Purview, or you want to start your Data Governance journey and need help: do not hesitate to contact me!

Finally, after more than 2 years, I had the chance to meet my mentee Nikola Ilic for the first time “in real life”. We’ve spent hours and hours in virtual meetings, but finally in Lingen we met.

Some pictures from the event:

Happy data cataloguing,

Wolfgang

Posted in Azure, Azure Purview, Conference, Data Community, Data Governance

(Dynamic) Data Lineage in Microsoft Purview

One of the most important steps towards understanding a data estate better is to look into data lineage.

In a non-scientific definition, data lineage answers …

  • Where does my data come from?
  • Where does my data go to?
  • and What happens on the way?

In this blog post, you’ll see how data lineage from Azure Data Factory and Synapse pipelines is pushed into Microsoft Purview. There is also a new (at least I found out about it some days ago) functionality that brings metadata-driven, dynamic pipeline runs into the lineage information in your Purview data map.

TL;DR – There is a video for all of that

Data Lineage in Microsoft Purview

In Microsoft Purview, the lineage view allows you to get to know more about the lineage of a certain data asset. Within your data supply chain, there are different shapes that Purview puts together.

Example of Purview data lineage

First, there are data stores like Azure Data Lake Storage accounts or Azure SQL Databases that store information and within the Purview context, contain data assets. The Microsoft Purview data map pulls this information from the sources during scan processes.

Second, there are transformation steps that connect data assets. In the example above there are two transformations shown (both are Azure Data Factory (ADF) pipelines – to be more specific ADF Copy activities). In the context of Purview, lineage information is pushed from ADF into the Purview data map.
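The two kinds of shapes, data assets and the transformation steps connecting them, can be pictured as a small directed graph. The following Python sketch is only an illustration of that idea and not how Purview stores lineage internally; the class, the asset names and the two copy steps are invented for the example. Walking the graph backwards answers “where does my data come from?”, walking it forwards answers “where does my data go to?”.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage graph: processes connect input assets to output assets."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_process(self, process, inputs, outputs):
        """Register a transformation step with its source and sink assets."""
        for src in inputs:
            self._downstream[src].add(process)
            self._upstream[process].add(src)
        for dst in outputs:
            self._downstream[process].add(dst)
            self._upstream[dst].add(process)

    @staticmethod
    def _walk(start, edges):
        """Collect every node reachable from start via the given edges."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def upstream(self, node):
        return self._walk(node, self._upstream)

    def downstream(self, node):
        return self._walk(node, self._downstream)


graph = LineageGraph()
graph.add_process("Copy_1", ["adls:raw/sales.csv"], ["sql:staging.Sales"])
graph.add_process("Copy_2", ["sql:staging.Sales"], ["sql:dwh.FactSales"])
print(sorted(graph.upstream("sql:dwh.FactSales")))
```

The split between assets (pulled in by scans) and processes (pushed by ADF) shows up here as the two roles a node can play in `add_process`.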

How to get ADF/Synapse lineage into Microsoft Purview?

If you have not connected your Azure Data Factory (ADF) / Synapse workspace with your Purview account, you might start with the registration of a new source in Purview. BUT – there is no ADF source available in the Purview data governance portal.

No ADF source in Purview

In order to connect ADF with Purview, you need to start in ADF. Within the Manage menu, there is a Microsoft Purview section. In there, just connect your ADF instance with an existing Azure Purview account.

Already connected ADF with a Microsoft Purview account

This connection comes with two (main) options – search your Purview data catalog from ADF and the lineage push from ADF into Purview.

Search your Purview data catalog directly from ADF

Are there any additional requirements for lineage to be pushed from ADF into Purview?

No. Simply put, no. You only need an ADF pipeline containing a copy activity. And – you need to run that pipeline at least once.

Every execution of a pipeline (in an ADF instance connected to Purview) pushes lineage information into the Purview data map. You can check that by a drill-down into the activity run history. Every copy activity execution gets a new icon (lineage status) that indicates the lineage push into Purview. In some cases, the lineage push does not work because of some limitations / requirements for the copy activity (see more in the documentation – https://docs.microsoft.com/en-us/azure/data-factory/tutorial-push-lineage-to-purview)

But what about Dynamic / Meta-data driven pipelines in ADF and Synapse?

In many of our projects, we do not develop ADF pipelines that are “hard-coded” – i.e. that are configured to copy data from one fixed source into one fixed destination. What we do instead is use metadata driven pipelines.

Within these metadata driven pipelines, usually a lookup activity is used to get the list of objects to load and a ForEach activity loops over the items to load. I will not go into details of metadata driven pipelines (there are tons of blog posts out in the thing called internet).
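For readers who have not built such a pipeline yet, the pattern can be sketched in a few lines of Python. This is a plain simulation of the control flow (Lookup, then ForEach, then a parameterised Copy), not ADF pipeline JSON; the function names, the table list and the sink path layout are assumptions made up for this illustration.

```python
def lookup_objects_to_load():
    """Stand-in for the Lookup activity: returns the metadata rows to process."""
    return [
        {"schema": "SalesLT", "table": "Customer"},
        {"schema": "SalesLT", "table": "Product"},
        {"schema": "SalesLT", "table": "SalesOrderHeader"},
    ]


def copy_activity(item):
    """Stand-in for a parameterised Copy activity.

    Source and sink are derived from the current item, similar to what
    @item() does inside a ForEach loop. Returns the (source, sink) pair.
    """
    source = f"{item['schema']}.{item['table']}"
    sink = f"raw/{item['schema']}/{item['table']}.parquet"
    return source, sink


def run_pipeline():
    """Stand-in for the ForEach activity: one copy per metadata row."""
    return [copy_activity(item) for item in lookup_objects_to_load()]


for source, sink in run_pipeline():
    print(f"{source} -> {sink}")
```

Each (source, sink) pair is only known at run time, which is why lineage for this kind of pipeline cannot be read from the pipeline definition alone.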

a simple metadata driven ADF pipeline
ADF copy activity using a data set with parameters. The values are set with the current iteration-item (@item())

There was one problem with this kind of pipeline and its lineage push into Purview: it simply did not work. Dynamic copy activities were not supported and the lineage of these pipelines did not appear in the Purview data map.

But fortunately this changed a few days ago. I have not seen any public announcement of that feature, but for me (and my colleagues) this is a huge game changer in the integration of ADF/Synapse and Microsoft Purview.

Let’s run the example pipeline from the screen shot above and view the monitoring information of that run.

Monitoring log of an execution of the dynamic pipeline

What we can see is that there are three executions of the Copy activity and all of them push the lineage information (indicated by the icon in the column Lineage status) into the Purview data map.

Lineage view in Microsoft Purview

Let’s head over to Microsoft Purview, search for the pipeline and … hmm… there is no lineage information available.. That’s because the pipeline itself does not expose lineage information – it’s the copy activity that does the work!

No lineage information for an ADF pipeline in Microsoft Purview

If you open the data asset of the copy activity instead, you’ll be successful! Lineage is there. And with that – it’s the dynamic (metadata) lineage shown here.

Lineage in Microsoft Purview – dynamic, metadata lineage is here!

That’s it for the first look at the dynamic data lineage support in Microsoft Purview. I have not tested it in more detail, but as a first conclusion I am really happy that dynamic copy activities are finally supported!

Posted in Azure, Azure Purview, Azure Synapse Analytics, Microsoft Purview

Azure Purview is now part of Microsoft Purview

I was ready for a nice relaxing evening today, when an email appeared in my inbox “Azure Purview is now Microsoft Purview!”

Initially I thought… yeah.. “just another Microsoft product renaming” .. but when I read through it in more depth I found out that this is NOT just a renaming.

The email I found in my inbox

Microsoft Purview = Azure Purview + Microsoft 365 Compliance portfolio

Microsoft Purview is the (new) name for a comprehensive set of products to govern, protect and manage your entire data estate. From tracking your data sources and their dependencies to managing data compliance regulations, data loss and data risks.

According to https://www.microsoft.com/en-us/security/business/microsoft-purview, Microsoft Purview helps you to

  • Understand and govern data
  • Safeguard data, wherever it lives and
  • Improve risk and compliance posture
source: https://www.microsoft.com/en-us/security/business/microsoft-purview

You can also watch the intro-video of the Microsoft Purview team:

Can you tell me your name please?

Microsoft Purview is not a new service or product; it combines already existing services under one single umbrella. For reference, here is the new-name cheat sheet from https://www.microsoft.com/security/blog/2022/04/19/the-future-of-compliance-and-data-governance-is-here-introducing-microsoft-purview/

The new names of the pieces that make up Microsoft Purview.
Source: https://www.microsoft.com/security/blog/2022/04/19/the-future-of-compliance-and-data-governance-is-here-introducing-microsoft-purview/

Not TL;DR – if you need more information

What’s next?

Well, I need some time to get a better overview of and deeper insight into the compliance part of what is now called Microsoft Purview. Stay tuned for some more blog posts and maybe some videos about it.

Love your data, #DoDataBetter and join the #TeamDataGovernance,

Wolfgang

Posted in Azure Purview, Cloud, Data Governance, Microsoft Purview

Azure Purview Announcements March 2022

Hi members of #TeamDatagovernance!

It’s SQLBits week and the Azure Purview team has released some new features. I want to summarize them and give you some pointers to the announcement posts and documentation:

Dynamic SQL Lineage

For me, data lineage is one of those fascinating techniques for understanding your data estate better: how systems are connected and which data flows exist in your data landscape.

Lineage has been there in Azure Purview since the beginning (Azure Data Factory, SSIS lineage, Power BI) but this week another very important part of data lineage was put into public preview: Dynamic Lineage Extraction from Azure SQL Databases.

There are different ways to extract data lineage from systems – one of them is static code analysis. The static approach takes all the CREATE PROCEDURE / CREATE VIEW statements and summarizes them into a lineage graph. This approach is powerful, but more goes on in a database than the initially defined DDL statements reveal, for example the execution of dynamic SQL statements.

The approach that Azure Purview implements is the dynamic lineage extraction (announcement post) that incorporates all the actions that happen in the database.
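The limitation of the static approach is easy to demonstrate with a deliberately naive Python sketch. It scans SQL text for table names after FROM or JOIN with a regular expression; this is a toy stand-in for static code analysis and not how Azure Purview (or any real SQL parser) works, and the view, procedure and table names are made up. The view definition yields its two source tables, while the procedure that assembles its statement as dynamic SQL yields nothing, because the table name only exists at run time.

```python
import re

# Naive pattern: an identifier (optionally schema-qualified) after FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([\w.\[\]]+)", re.IGNORECASE)


def static_sources(sql):
    """Return the table names a purely textual scan can find in the SQL."""
    return sorted(set(TABLE_REF.findall(sql)))


VIEW_DDL = """
CREATE VIEW dbo.vCustomerOrders AS
SELECT c.CustomerID, o.SalesOrderID
FROM SalesLT.Customer AS c
JOIN SalesLT.SalesOrderHeader AS o ON o.CustomerID = c.CustomerID
"""

DYNAMIC_PROC_DDL = """
CREATE PROCEDURE dbo.LoadTable @source sysname, @target sysname AS
BEGIN
    DECLARE @sql nvarchar(max) =
        N'INSERT INTO ' + @target + N' SELECT * FROM ' + @source;
    EXEC sp_executesql @sql;
END
"""

print("view:     ", static_sources(VIEW_DDL))
print("procedure:", static_sources(DYNAMIC_PROC_DDL))
```

Runtime-based extraction sidesteps this gap because it observes the statements that are actually executed.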

static versus runtime/dynamic analysis (source: https://azure.microsoft.com/en-us/blog/introducing-dynamic-lineage-extraction-from-azure-sql-databases-in-azure-purview/)

I’ve recorded a short video (youtube link) for a first, quick look at that feature.

Turn on lineage extraction in scan options for Azure SQL databases
Data lineage including a Stored proc

If you want to read more about the feature and learn which prerequisites have to be met -> https://docs.microsoft.com/en-us/azure/purview/register-scan-azure-sql-database?tabs=sql-authentication#lineagepreview

Azure Purview now understands SAP Business Warehouse (BW)

There are new built-in sources added every now and then.. and today, SAP Business Warehouse (BW) was added to the list of supported sources:

https://techcommunity.microsoft.com/t5/azure-purview-blog/azure-purview-adds-support-for-sap-business-warehouse/ba-p/3253404

SAP BW and Purview documentation: https://docs.microsoft.com/en-us/azure/purview/register-scan-sap-bw

Workflow support for Data Access Request (Preview)

Knowing your data sources, their connections and so on is a beginning for your data journey, but securing access is a huge and very important topic.

The Purview team released a new block of functionality – Workflow support for managing data access in Azure Purview.

I have not had time to dive into that feature yet, but you will get an update (blog post, video) in the next weeks.

In the meantime, please have a look at the announcement blog posts from the team:

Happy data cataloguing and data scanning,

Wolfgang

Posted in Azure, Azure Purview, Data Governance, Microsoft Purview