Understanding the relationships between datasets is crucial in data analytics, especially in the world of self-service BI. Sempy, a Python library unique to Microsoft Fabric, allows users to visualise these relationships seamlessly. This post explores using Sempy to visualise semantic model relationships and view them in a Power BI Report. Viewing them in Notebook is easy and has been documented on MS Docs.
The idea for this came to me when I saw Adam & Patrick presenting SemPy at Data Minds last year; it just took a year to turn into a post 😉 . The whole idea behind it is more for fun than a real use case. The only use case I could think of is converting Digraph figures into SVG and displaying them in Power BI, I mean any Diagrph. However, there are plenty of other great resources about SemPy and some significant use cases too. My favourites are the repository of Semantic Link Labs by M. Kovalsky and team, a Blog post by Kurt, and many posts by Sandeep.
I am using Microsoft Notebook to explore this. If you are more interested in viewing the Notebook, go ahead and view it. It covers pretty much the same thing below – View on GitHub
If you want to understand the Notebook, read on…
To get started, you’ll need to install the required libraries. This can be done easily with the following commands
%pip install semantic-link
# Once installed, import the necessary modules:
import sempy.fabric as fabric
from sempy.relationships import plot_relationship_metadata
from sempy.relationships import find_relationships
from sempy.fabric import list_relationship_violations
from sempy.dependencies import plot_dependency_metadata
import pandas as pd
import graphviz
import base64
import matplotlib.pyplot as plt
Fetching Semantic Models Next, let’s retrieve all semantic models in your current workspace. Using Semantic Link Labs, this functionality can also be extended to entire tenants or multiple workspaces.
# Get the list of datasets
datasets = fabric.list_datasets()
dataset_svg_list = []
# Convert the list to a DataFrame
df = pd.DataFrame(datasets, columns=['Dataset Name'])
Exploring Dataset Relationships
With the datasets in hand, we can iterate through each dataset to list its relationships. We will use fabric.list_relationships to get relationships.
for dataset_name in df['Dataset Name']:
relationships = fabric.list_relationships(dataset_name)
Visualizing Relationships
Visual representation of data relationships can significantly enhance comprehension. Using the plot_relationship_metadata() function, we can create a diagram illustrating these connections.
if not relationships.empty:
fig = plot_relationship_metadata(relationships)
Storing Visualizations
To integrate these visualizations into Power BI, we convert them into SVG format.
svg_text = fig.pipe(format='svg').decode('utf-8')
We then create ImageURL with right structure so Power BI can display the Image using Image URL categorization
df['ImageURL'] = 'data:image/svg+xml;utf8, '+ df['SVGText']
Full code for this block:
# Iterate over each dataset name
for dataset_name in df['Dataset Name']:
# Get the relationships of the current dataset
relationships = fabric.list_relationships(dataset_name)
if not relationships.empty:
fig = plot_relationship_metadata(relationships)
# print(fig)
# Get the SVG source as a string
svg_text = fig.pipe(format='svg').decode('utf-8')
# Extract content between <svg> and </svg>
start_index = svg_text.find('<svg')
end_index = svg_text.find('</svg>') + len('</svg>')
svg_content = svg_text[start_index:end_index]
# print(svg_content)
# Append the dataset name and SVG text to the list
dataset_svg_list.append((dataset_name, svg_content))
# Convert the list to a DataFrame for better visualization
df = pd.DataFrame(dataset_svg_list, columns=['DatasetName', 'SVGText'])
df['ImageURL'] = 'data:image/svg+xml;utf8, '+ df['SVGText']
drop_cols = ['SVGText']
df.drop(drop_cols, axis=1, inplace=True)
df = df[['DatasetName', 'ImageURL']]
If you are more interested in working with one Semantic Model than all Semantic Models in the Workspace. Then you can use below code block
relationships = fabric.list_relationships("SemanticModelName")
if not relationships.empty:
fig = plot_relationship_metadata(relationships)
print(fig)
svg_text = fig.pipe(format='svg').decode('utf-8')
# Extract content between <svg> and </svg>
start_index = svg_text.find('<svg')
end_index = svg_text.find('</svg>') + len('</svg>')
svg_content = svg_text[start_index:end_index]
# print(svg_content)
Data Limitations
Next comes Data Limitations. Power BI columns have character limits. Chris Webb has a nice blog post explaining the Power BI Character limit – Chris Webb’s BI Blog: What Is The Maximum Length Of A Text Value In Power BI? (crossjoin.co.uk). Handling data limitations for columns can pose challenges when an image is large. Especially when using base 64. To manage this, we split long strings into multiple columns dynamically.
max_length = df['ImageURL'].str.len().max()
chunk_size = 30000
num_columns = round((max_length / chunk_size) + 1)
for i in range(num_columns):
col_name = f'ImageURL{i+1}'
start, stop, step = i*chunk_size,(i+1)*chunk_size,1
df[col_name] = df['ImageURL'].str.slice(start,stop)
And complete code for handling data limitations and storing data in a Delta Table
max_length = df['ImageURL'].str.len().max()
# print(max_length)
chunk_size = 30000
num_columns = round((max_length / chunk_size) + 1)
# print(num_columns)
for i in range(num_columns):
# Create a new column name
# print(i)
col_name = f'ImageURL{i+1}'
# # print(col_name)
start, stop, step = i*chunk_size,(i+1)*chunk_size,1
df[col_name] = df['ImageURL'].str.slice(start,stop)
# # print(start)
# # print(stop)
# cnames = df.columns
# print(cnames)
spark_df=spark.createDataFrame(df)
spark_df.write.format("delta").option("mergeSchema", "true").mode("overwrite").saveAsTable("Dataset_list_relationships_svg")
Semantic Model and Power BI Report
Next comes the Semantic Model. To display the image in Power BI, we create a DAX measure by concatenating all Image Strings. We then categorize the Measure as an Image URL and display the image URL with any visual that supports it.
That’s it! Let me know what you think?
Until next time, Prathy 🙂