Python in Power BI

Yannawut Kimnaruk
7 min readJun 16, 2022

--

❓ Why Python in Power BI?

Microsoft Power BI supports two different languages, M language and DAX (Data Analysis Expression).

However, it is sometimes more convenient to use Python for the data preparation process since Python has Python Libraries, a set of useful functions that eliminate the need for writing codes from scratch. Moreover, data analysts who are familiar with Python can easily pick up Python to applying in Power BI for the web scraping or EDA (Exploratory Data Analysis) process.

Contents in this article

  • Install Python
  • Python setting in Power BI
  • Use Python to get data
  • Use Python in transformation
  • Use Python to visualize

📥 Install Python

Before using Python in Power BI, you have to install Python. I recommend installing Anaconda since it is easy to manage Python libraries and environments.

  1. Install Anaconda

Go to https://www.anaconda.com/products/distribution. Download and install Anaconda on your computer.

2. Open Anaconda Prompt

Search for Anaconda Prompt and click to open.

You will see a black screen pop-up.

3. Create a Python 3.6 environment

Power BI may have issues when working with high-version Python (as of I write this article, the latest Python version is 3.9). To avoid technical issues, I create a new Python 3.6 environment which I try and don’t have any problem with.

You can downgrade your Python version directly but it is not recommended since it may affect other projects. You can think of the Anaconda environment as a working box containing a specific Python package collection. When you work in this environment, it will not affect other projects.

In the Anaconda Prompt, create a new environment by typing…
(Replace <env_name> with the name of your environment such as python36)

conda create --name <env_name> python=3.6

Then, when you see Proceed ([y]/n)?, type y and Enter.

Wait until the running is complete.

Check that a new environment was created successfully by typing

conda env list

You will see a list of Anaconda environments. If you see a new environment, you are ready to proceed to the next step.

4. Install useful Python libraries

Before installing the Python package, make sure that you are in the environment you just created by typing the following command in Anaconda prompt.

conda activate <env_name>

Then, you will observe the change of environment name in the parenthesis.

You can install Python libraries/packages that you are going to use in Power BI by typing the following command.
(Replace <package_name> with the name of the package you want to install such as pandas, numpy, matplotlib)

pip install <package_name>

⚙️ Python setting in Power BI

After completion of Python installation, it is time for Power BI!!

  1. Open Power BI.
  2. Click File in the upper right corner.

3. Click Option and setting. Click Option.

4. Click Python scripting

5. Change the Detected Python home directory to Other and browse for your Python environment created in step 1.

Tip: Finding the environment directory.

  • Open Anaconda Prompt
  • Type below code (Replace <env_name> with the name of your environment)
conda activate <env_name>
  • Type the below code to show the python location of this environment.
where python

copy the path before python.exe

Now, you are ready to use Python in Power BI

Python can be applied in at least 3 steps: Get data, Transformation, and Visualization

If you want to follow this article, you can download the data from https://www.kaggle.com/datasets/datatattle/covid-19-nlp-text-classification/download

🧺 Use Python to get Data

Python can be used to get data. This is helpful in web scraping (getting data from the website).

In this example, I will create a Python file to get data from 2 sources: a csv file and a created dataframe (a table with rows and columns).

1. Click Get data

2. Search for Python script and click to open a new coding window.

3. Type the below code and click ok.

import pandas as pddf1 = pd.read_csv("C:/Corona_NLP_train.csv", encoding = "ISO-8859-1")
df2 = pd.DataFrame({'A': [1, 3, 6, 8],'B': [10, 30, 50, 90]})

4. You will see a Navigator plain to show data (like in other get data methods.). Select data to load.

🔄 Use Python in transformation

Python can be used in the data transformation step as well. Mostly, I use it for the regular expression to extract sub-strings from another column that match defined patterns such as getting hashtags from Twitter texts.

In this example, I will find text length using len() function in Python.

  1. Click Transform data

2. Select the query you want to transform

3. In the Transform tab, click Run Python script

4. You will see a new Run Python script window. Write your code here and click ok.

Key concept: the data will be a ‘dataset’ DataFrame, so you can manipulate it with pandas functions.

6. The result will be a table. Click to expand the table. Make sure that ‘Use original column name as prefix’ is not checked.

7. You will see the result of data transformation and the steps are added in APPLIED STEPS plain.

📊 Use Python to visualize

Creating a plot using visualization from Power BI alone may have some limitations and some plots may not available in Power BI.

Python will come in handy since there are many libraries in Python that can generate any visualization as you wish.

Python Libraries for Data Visualization that are commonly used these days include Matplotlib, Plotly, Seaborn, and GGplot.

Although writing codes in Python to create a plot may be more difficult than the drag-and-drop concept of Power BI, there are much more plot customization and code examples (to copy).

  1. In the Visualization plain, click the Py icon (abbreviation of Python).

2. You will see an empty Python script editor area. Select the columns you want to visualize.

3. You will see an empty coding area. Write code (Don’t forget plt.show() to show graphs) and click the run icon and wait for the result. Done!!

import seaborn as snsimport matplotlib.pyplot as pltimport redef find_hash(text):line=re.findall(r'(?<=#)\w+',text)return " ".join(line)dataset['hash']=dataset['OriginalTweet'].apply(lambda x:find_hash(x))temp=dataset['hash'].value_counts()[:][1:11]temp= temp.to_frame().reset_index().rename(columns={'index':'Hashtag','hash':'count'})plt.figure(figsize=(20, 15))sns.barplot(x="Hashtag",y="count", data = temp)plt.show()

Note: If it shows an error after running the code, take a deep breath and read the error message.

Conclusion

This article shows how to apply Python in Power BI step by step, so you can get the advantage of both the Power BI interactive dashboard and Python flexibility. You can apply Python code in many ways including data acquisition, transformation, and visualization.

Hope you get some idea about using Python in Power BI.

Please follow me if you want to read more data analytics content.

--

--