Exploring Relationships in Semantic Models with #SemPy

by Prathy Kamasani

Understanding the relationships between datasets is crucial in data analytics, especially in the world of self-service BI. Sempy, a Python library unique to Microsoft Fabric, allows users to visualise these relationships seamlessly. This post explores using Sempy to visualise semantic model relationships and view them in a Power BI Report. Viewing them in Notebook is easy and has been documented on MS Docs.

The idea for this came to me when I saw Adam & Patrick presenting SemPy at Data Minds last year; it just took a year to turn into a post 😉 . The whole idea behind it is more for fun than a real use case. The only use case I could think of is converting Digraph figures into SVG and displaying them in Power BI, I mean any Diagrph. However, there are plenty of other great resources about SemPy and some significant use cases too. My favourites are the repository of Semantic Link Labs by M. Kovalsky and team, a Blog post by Kurt, and many posts by Sandeep.

I am using Microsoft Notebook to explore this. If you are more interested in viewing the Notebook, go ahead and view it. It covers pretty much the same thing below – View on GitHub


If you want to understand the Notebook, read on…

To get started, you’ll need to install the required libraries. This can be done easily with the following commands

%pip install semantic-link

# Once installed, import the necessary modules:
import sempy.fabric as fabric
from sempy.relationships import plot_relationship_metadata
from sempy.relationships import find_relationships
from sempy.fabric import list_relationship_violations
from sempy.dependencies import plot_dependency_metadata

import pandas as pd
import graphviz
import base64
import matplotlib.pyplot as plt

Fetching Semantic Models Next, let’s retrieve all semantic models in your current workspace. Using Semantic Link Labs, this functionality can also be extended to entire tenants or multiple workspaces.

# Get the list of datasets
datasets = fabric.list_datasets()
dataset_svg_list = []

# Convert the list to a DataFrame
df = pd.DataFrame(datasets, columns=['Dataset Name'])

Exploring Dataset Relationships

With the datasets in hand, we can iterate through each dataset to list its relationships. We will use fabric.list_relationships to get relationships.

for dataset_name in df['Dataset Name']:
    relationships = fabric.list_relationships(dataset_name)

Visualizing Relationships

Visual representation of data relationships can significantly enhance comprehension. Using the plot_relationship_metadata() function, we can create a diagram illustrating these connections.

if not relationships.empty:
    fig = plot_relationship_metadata(relationships)

Storing Visualizations

To integrate these visualizations into Power BI, we convert them into SVG format.

svg_text = fig.pipe(format='svg').decode('utf-8')

We then create ImageURL with right structure so Power BI can display the Image using Image URL categorization

df['ImageURL'] = 'data:image/svg+xml;utf8, '+ df['SVGText']

Full code for this block:

# Iterate over each dataset name
for dataset_name in df['Dataset Name']:
    # Get the relationships of the current dataset
    relationships = fabric.list_relationships(dataset_name)
    if not relationships.empty:
        fig = plot_relationship_metadata(relationships)
        # print(fig)
    # Get the SVG source as a string
        svg_text = fig.pipe(format='svg').decode('utf-8')
        # Extract content between <svg> and </svg>
        start_index = svg_text.find('<svg')
        end_index = svg_text.find('</svg>') + len('</svg>')
        svg_content = svg_text[start_index:end_index]
        # print(svg_content)

        # Append the dataset name and SVG text to the list
        dataset_svg_list.append((dataset_name, svg_content))
        # Convert the list to a DataFrame for better visualization
    df = pd.DataFrame(dataset_svg_list, columns=['DatasetName', 'SVGText'])
df['ImageURL'] = 'data:image/svg+xml;utf8, '+ df['SVGText']
drop_cols = ['SVGText']
df.drop(drop_cols, axis=1, inplace=True)
df = df[['DatasetName', 'ImageURL']]

If you are more interested in working with one Semantic Model than all Semantic Models in the Workspace. Then you can use below code block

relationships = fabric.list_relationships("SemanticModelName")    

if not relationships.empty:
    fig = plot_relationship_metadata(relationships)
    print(fig)
    svg_text = fig.pipe(format='svg').decode('utf-8')
    # Extract content between <svg> and </svg>
    start_index = svg_text.find('<svg')
    end_index = svg_text.find('</svg>') + len('</svg>')
    svg_content = svg_text[start_index:end_index]
    # print(svg_content)

Data Limitations

Next comes Data Limitations. Power BI columns have character limits. Chris Webb has a nice blog post explaining the Power BI Character limit – Chris Webb’s BI Blog: What Is The Maximum Length Of A Text Value In Power BI? (crossjoin.co.uk). Handling data limitations for columns can pose challenges when an image is large. Especially when using base 64. To manage this, we split long strings into multiple columns dynamically.

max_length = df['ImageURL'].str.len().max()

chunk_size = 30000
num_columns = round((max_length / chunk_size) + 1)

for  i  in  range(num_columns):
    col_name = f'ImageURL{i+1}'
    start, stop, step = i*chunk_size,(i+1)*chunk_size,1
    df[col_name] = df['ImageURL'].str.slice(start,stop)

And complete code for handling data limitations and storing data in a Delta Table

max_length = df['ImageURL'].str.len().max()
# print(max_length)
chunk_size = 30000
num_columns = round((max_length / chunk_size) + 1)

# print(num_columns)

for i in range(num_columns): 
    # Create a new column name 
    # print(i)
    col_name = f'ImageURL{i+1}'
    # # print(col_name)
    start, stop, step = i*chunk_size,(i+1)*chunk_size,1
    df[col_name] = df['ImageURL'].str.slice(start,stop)
    # # print(start)
    # # print(stop) 

# cnames = df.columns
# print(cnames)

spark_df=spark.createDataFrame(df)
spark_df.write.format("delta").option("mergeSchema", "true").mode("overwrite").saveAsTable("Dataset_list_relationships_svg")

Semantic Model and Power BI Report

Next comes the Semantic Model. To display the image in Power BI, we create a DAX measure by concatenating all Image Strings. We then categorize the Measure as an Image URL and display the image URL with any visual that supports it.

That’s it! Let me know what you think?

Until next time, Prathy 🙂

You may also like

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. Accept Read More