A first look at Azure Purview – Data Governance for your data estate

Data governance, meaning the knowledge, management and insight you have over your data estate, is a very important topic. Over the past years, the number of data stores, systems and data interfaces has grown around the world and in every company. Often, companies are overwhelmed and unable to cope with these challenges. The question is not only where we (as a company) store the data, but also which pieces of information are stored and whether they need data protection, encryption and classification.

In addition, the many different kinds of data stores (databases, file storage, data lakes, SaaS applications …) do not make it any easier to gain a full overview of the data estate in an enterprise.

In the past, Microsoft released Azure Data Catalog as a tool to catalog your data assets. This solution, well, did not really work well for some use cases.

Introducing Azure Purview

Today, at the Azure Data and Analytics event, a new Azure data governance service called Azure Purview (https://aka.ms/AzurePurview) was presented and made available in a public preview.

I have not had a chance to try the actual service yet, but I found a very interesting video (a Microsoft Mechanics video), from which I took the following screenshots.

As already described, data is available in many different places, formats and systems within a company. It can be stored in databases, files or SaaS applications, whether on-premises, in the cloud or in hybrid environments.

The purpose of Azure Purview is to

  • scan your data estate / data sources in your company
  • classify information within your data sources
  • provide a consistent end-to-end view over your data estate (lineage view)
  • let you search and analyze your data estate using a data catalog
  • let your data stewards get data insights out of your overall data governance information base

Azure Purview connecting to different data sources

Search your data estate

One of the use cases for Azure Purview is the search catalog feature that allows you to scan and search for keywords within your metadata.

Azure Purview Studio

The screenshot below gives a first glimpse of the search results list. The list includes results from different data sources as well as glossary terms, and you can filter it based on different criteria.

Purview search results

Detailed information about one data source object is shown in the next screenshot – the overview page provides information about the type of data source, the last scan time and the hierarchy this element belongs to.

Purview metadata – overview page

In this example, the hierarchy shows where the table SalesOrderHeader lives: it’s an Azure SQL table, stored in an Azure SQL Database.

Purview – object hierarchy

Next, the schema view provides you with an overview of the underlying schema of the object, including classifications (either system-defined or user-defined). Satya mentioned over 100 AI-powered classification methods to analyze your data.

Purview – Schema view including classification information

And now, one of my favorites: Data Lineage. Wow, just wow! 😉

This view provides an end-to-end view of your data estate around the currently selected data object (our table). The lineage includes data preparation steps and also Power BI datasets and reports!

Purview – Data lineage
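
To make the idea of a lineage view a bit more tangible, here is a minimal sketch of how such a graph could be walked. This is a toy model and not how Azure Purview is implemented: the `LineageGraph` class and all asset names except SalesOrderHeader are made up for illustration.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Toy directed graph: an edge A -> B means 'B is derived from A'."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_edge(self, source, target):
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    def _walk(self, start, neighbours):
        # Breadth-first traversal that collects every reachable asset.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def downstream(self, asset):
        """Everything that (directly or indirectly) consumes this asset."""
        return self._walk(asset, self._downstream)

    def upstream(self, asset):
        """Everything this asset (directly or indirectly) is built from."""
        return self._walk(asset, self._upstream)


# Hypothetical chain: table -> preparation step -> curated data -> Power BI.
graph = LineageGraph()
graph.add_edge("SalesOrderHeader", "copy_sales_pipeline")
graph.add_edge("copy_sales_pipeline", "sales_curated")
graph.add_edge("sales_curated", "Sales dataset")
graph.add_edge("Sales dataset", "Sales report")

print(sorted(graph.downstream("SalesOrderHeader")))
print(sorted(graph.upstream("Sales report")))
```

Asking for the downstream assets of a table answers "which reports break if I change this table?", while the upstream direction answers "where does the data in this report come from?".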

In addition, all related objects can be displayed and further analyzed.

Purview – related objects

Fill the Data Governance Service – bring information into Azure Purview

As mentioned in the video, Azure Purview comes with many different connectors. The data sources you define are scanned and their metadata is integrated into the Azure Purview Data Map (a searchable network graph).

Purview – from data sources, connectors to Azure Purview Data map
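
The flow from sources to a searchable catalog can be sketched in a few lines. This is only an illustration under my own assumptions: the source descriptions, the `Asset` record, `scan_source` and `search` are hypothetical and have nothing to do with the real Purview connectors or APIs. The point is that a scan collects metadata (names, types, columns), not the data itself, and that search then runs over that metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One metadata record in the toy catalog."""
    qualified_name: str
    asset_type: str
    columns: list = field(default_factory=list)


def scan_source(source):
    """Turn one toy source description into metadata records."""
    return [
        Asset(
            qualified_name=f"{source['name']}/{table}",
            asset_type=source["type"],
            columns=list(columns),
        )
        for table, columns in source["tables"].items()
    ]


def search(catalog, keyword):
    """Case-insensitive keyword match on asset names and column names."""
    needle = keyword.lower()
    return [
        asset
        for asset in catalog
        if needle in asset.qualified_name.lower()
        or any(needle in col.lower() for col in asset.columns)
    ]


# Hypothetical sources; only SalesOrderHeader appears in the screenshots.
sources = [
    {
        "name": "sales-db",
        "type": "Azure SQL table",
        "tables": {
            "SalesOrderHeader": ["SalesOrderID", "CustomerID", "OrderDate"],
            "Customer": ["CustomerID", "EmailAddress"],
        },
    },
    {
        "name": "datalake",
        "type": "file",
        "tables": {"exports/orders.csv": ["order_id", "customer_email"]},
    },
]

catalog = [asset for source in sources for asset in scan_source(source)]
hits = search(catalog, "email")
print([asset.qualified_name for asset in hits])
```

A real data map would store these records as nodes in a graph and connect them with relationships, but the scan-then-search idea stays the same.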

In addition, Azure Purview is tightly integrated with Apache Atlas, which allows you to import already existing Apache Atlas metadata stores.

Purview – Apache Atlas interface
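
For readers who have never seen Apache Atlas metadata: entities in the Atlas v2 REST API are JSON documents with a `typeName` and a set of `attributes`. The sketch below only builds such a document locally. Whether Purview accepts exactly this payload is an assumption on my side (the video does not go into that level of detail), and the `example://` qualified name is invented.

```python
import json


def atlas_entity(type_name, qualified_name, name, description=""):
    """Build an Atlas-v2-style entity document (no network call is made)."""
    return {
        "entity": {
            "typeName": type_name,
            "attributes": {
                # qualifiedName is the unique key of an entity in Atlas.
                "qualifiedName": qualified_name,
                "name": name,
                "description": description,
            },
        }
    }


# "DataSet" is a generic built-in Atlas type; a real import would
# normally use a more specific type for tables, files and so on.
payload = atlas_entity(
    type_name="DataSet",
    qualified_name="example://sales-db/SalesOrderHeader",
    name="SalesOrderHeader",
    description="Order header table (illustrative entry)",
)

print(json.dumps(payload, indent=2))
```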

Azure Purview already comes with a large number of connectors, i.e. data sources that can be integrated into the Azure Purview Data Map.

Purview – add sources

Data sources can be grouped into collections, and you can hierarchically apply classification rules and properties within these collections.
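
One way to picture "hierarchically apply" is that a collection inherits the rules of its parent and adds its own. The sketch below shows that idea; the inheritance behaviour, the `Collection` class and the collection and rule names are my assumptions for illustration, not confirmed Purview semantics.

```python
class Collection:
    """Toy collection that inherits classification rules from its parent."""

    def __init__(self, name, parent=None, rules=()):
        self.name = name
        self.parent = parent
        self.rules = list(rules)

    def effective_rules(self):
        inherited = self.parent.effective_rules() if self.parent else []
        # Parent rules first, then own rules; duplicates are dropped
        # while the order is preserved.
        return list(dict.fromkeys(inherited + self.rules))


# Hypothetical hierarchy with made-up rule names.
root = Collection("Contoso", rules=["Email Address"])
finance = Collection("Finance", parent=root, rules=["Credit Card Number"])
emea = Collection("Finance EMEA", parent=finance, rules=["IBAN", "Email Address"])

print(emea.effective_rules())
```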

Azure Purview comes with system-defined classification rules, but you can also add your own custom classification rules:

Purview – custom classification rules
Purview – system defined classification rules
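
To give an idea of what a custom classification rule boils down to, here is a small sketch based on regular expressions: a column gets a label when enough of its sampled values match a pattern. The rule names, patterns, the 60 % threshold and the sample data are all assumptions of mine, and the built-in AI-powered classifiers certainly do more than this.

```python
import re


def classify_column(values, pattern, min_match_ratio=0.6):
    """Return True if enough non-empty values fully match the pattern."""
    non_empty = [value for value in values if value]
    if not non_empty:
        return False
    matches = sum(1 for value in non_empty if re.fullmatch(pattern, value))
    return matches / len(non_empty) >= min_match_ratio


# Hypothetical custom rules: label -> regular expression.
RULES = {
    "EMPLOYEE_ID": r"EMP-\d{6}",
    "EMAIL": r"[^@\s]+@[^@\s]+\.[^@\s]+",
}


def classify_table(sample):
    """Map each column name to the list of labels that apply to it."""
    return {
        column: [
            label
            for label, pattern in RULES.items()
            if classify_column(values, pattern)
        ]
        for column, values in sample.items()
    }


# A few sampled values per column, as a scan might collect them.
sample = {
    "staff_ref": ["EMP-000123", "EMP-004711", "n/a"],
    "contact": ["a@example.com", "b@example.org", ""],
    "notes": ["call back", "EMP-000123 left", "ok"],
}

result = classify_table(sample)
print(result)
```

The threshold keeps a single stray value (like the `n/a` above) from blocking a classification, and full matching keeps free-text columns such as `notes` from being labeled just because one value mentions an ID.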

Now, set the file types these classifiers should be applied to and define a scan trigger (schedule):

Purview – scan rules – define file types
.. and a schedule
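
Conceptually, these two screens define a filter (which file types get scanned with which classifiers) and a recurrence (when the scan runs). A minimal sketch of both follows; `ScanRuleSet`, `Schedule`, the file names and the dates are purely illustrative and not Purview objects.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ScanRuleSet:
    """Which file types a scan looks at and which classifiers it applies."""
    file_extensions: frozenset
    classifications: tuple

    def applies_to(self, path):
        return PurePosixPath(path).suffix.lower() in self.file_extensions


@dataclass(frozen=True)
class Schedule:
    """Fixed-interval trigger, e.g. 'every 7 days starting at ...'."""
    start: datetime
    every: timedelta

    def next_run(self, now):
        if now < self.start:
            return self.start
        # Number of whole intervals already passed, plus one.
        intervals = (now - self.start) // self.every + 1
        return self.start + intervals * self.every


rules = ScanRuleSet(
    file_extensions=frozenset({".csv", ".parquet"}),
    classifications=("EMAIL",),
)
files = ["raw/orders.csv", "raw/orders.PARQUET", "docs/readme.txt"]
to_scan = [path for path in files if rules.applies_to(path)]
print(to_scan)

weekly = Schedule(start=datetime(2020, 12, 7, 2, 0), every=timedelta(days=7))
upcoming = weekly.next_run(datetime(2020, 12, 10, 9, 0))
print(upcoming)
```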

To take it to the next step, the collected metadata can be used by different user groups in your company: to search the data catalog, analyze the lineage of data objects, find classified information and, very importantly, detect classification rule breaches.

What you can do with Azure Purview

To sum it up: I have not tried Azure Purview yet, but the first glimpse looks very interesting…

My highlight of the announcement is the end-to-end lineage view. I have been waiting for such an insight graph for a loooooong time, and now it’s there, integrated with many different systems! Happy Wolfgang 😉

Know your data, know your data estate!

Wolfgang

About wolfgang

Data Platform enthusiast