Home Power BI Download And Unzip files using Power BI Desktop and R Script

Download And Unzip files using Power BI Desktop and R Script

by Prathy Kamasani April 4, 2016 2 comments

Most of the open data source providers supply data in a compressed format, especially when files are large. That is what I noticed with Company House data. I wanted to explore company house data using Power BI. But, I was not at all looking forward to downloading each file then unzip so that I can import into Power BI model. I have many files to work with, so I wanted to automate things. At least, I wanted to unzip all files using Power BI. Quite quickly, I understood Power BI currently does not support all compressed formats. Chris Webb explains here, how to work with compressed files and why currently all compressed formats are not supported.

However, Power BI supports R Script, which means I can easily automate my steps.Until Power BI Desktop supports unzip functionality, we can use new R Script in the Power BI Desktop. Power BI support documented very well regarding how to run R scripts in Power BI. Using R Script, I can download and unzip company house files.

R Script as a source in Power BI Desktop:

Microsoft recommends it is better to write and execute R Scripts in RStudio( or any other R tools) before you use it in Power BI. Power BI does not have a rich R interface, but I found it does give reasonable error information. Currently, R Script imports only data frames to Power BI.

Here is a simple R data frame in RStudio and Power BI Desktop

2016-04-01_21-33-14

RScript in R Studio

2016-04-01_21-33-31

RScript data frame in R Studio

2016-04-01_21-21-41

R Script In Power BI

Power BI Desktop uses R.Execute to execute R Script.

Download and Unzip from web using Power BI and R Script:

I wanted to download more than one file which has different URL with an incremental number. Hence, I used for loop to get the file number I wanted to download. With few lines of R script, I can easily download and unzip all these files.

R script I used in R Studio:

This script does not return any data frame. If I use the same query in PowerBI Desktop, it would still download and unzip files, but I would not be able to import the data to Power BI. Also, I would not be able to use it as a step in the Power BI query.

We can either return an empty R Data Frame and create a separate query in Power BI to read the unzipped files or use a blank query as a source instead of R Script as the source, and do all operations in the single query.

Same script as above in Power BI Desktop returning empty data frame:

Alternatively, same R Script without returning any data frame, this executes the R Script and does other Power Query operations:

2016-04-03_20-56-33

Downloaded R Files in Power BI

That is all; as usually now we can do additional operations in Power BI. However, there are few things to consider:

  • There is a 30 min timeout limit on all R Scripts, if you are downloading many large files, Power BI may raise a timeout error.
  • Every time you refresh PowerBI it would re-run the R Script; work around for this to run R Script using one query and do other Power BI operations in another query so you can disable the query which using R Script.

Among these are few other things I noticed:

  • I always had to import a data frame. Otherwise, it would still execute the R script, but I would have nothing in Power BI. In fact, I had to cancel the query, which did not save R Script either.
  • If I cancel Power query refresh, it will lock the files or folders which were in the process – need to close and reopen the Power BI to unlock!
  • If I want to change the R Script, before it finishes execution or Power BI refresh, again Power BI caches the script or values I given in the script, which mean I need to close and reopen Power BI. E.g.After I canceled the query and If I changed the for loop condition, it will still use the condition I given before I cancelled the query.

Overall it is exciting to see R Script in Power BI. It enables many features those Power BI does not support currently. Remember you never know what incredible Power BI team can deliver; an element which is missing today may appear in the next minute 🙂

Till next time,

Prathy 🙂

2 comments

You may also like

%d bloggers like this:
Skip to toolbar