Wednesday, November 15, 2023

Ignite News: Microsoft Fabric is Now GA (and more)!


For roughly the last 6 months, ever since it went into public preview at Microsoft Build, we have all been able to play with Microsoft Fabric.

The keynote of Microsoft Ignite by CEO Satya Nadella has now been delivered, as has the amazing in-depth Fabric session by Amir Netz and Arun Ulagaratchagan: Make your data AI ready with Microsoft Fabric and Azure Databricks. Back then at Build, Satya called it:

"...the biggest launch of a data product from Microsoft since the launch of SQL Server!"

Generally available

But now, Satya went one step further and announced GENERAL AVAILABILITY of Fabric!

Also, Copilot in Fabric is now in public preview.

More updates

There are a ton of updates to existing Fabric features, listed on the latest Fabric blog here.

And there are also some exciting new features:

  • Seamlessly connect your data sources to Fabric
    The ability to create shortcuts was already there, where you virtualize data in OneLake without having to move or duplicate that data. You can create shortcuts to another Lakehouse and Warehouse, but also to files on ADLS or even Amazon S3 or Google storage.
    The newest feature just announced is called Mirroring, where you can add and manage existing cloud data warehouses (and databases) in Fabric's Data Warehouse experience. This works much like replication in SQL Server: Fabric replicates a snapshot of that database to OneLake in Delta Parquet files and keeps it in sync in near real time, which relies on the Change Data Capture feature of the underlying source. Initially it's supported for Azure Cosmos DB, Azure SQL DB and Snowflake; more sources will follow next year.
  • Copilot in Power BI (public preview)
    Just be aware that Copilot will be rolling out in stages:
    • Smart Narrative is an existing visual in Power BI Desktop, now rebranded to Narrative with Copilot
    • The November Desktop update lets you generate synonyms for your fields, measures and tables using Copilot.
    • In the future, there's also going to be:
      • a report creation experience
      • a DAX writing experience
  • Direct Lake support on Data Warehouse
    There's also an update on the size limits of your Fabric capacity and when it will fall back to DirectQuery.
  • Stored credentials for Direct Lake semantic model
    You can now specify a fixed identity (like a service principal) for a Direct Lake mode semantic model.
  • Pricing on Fabric is updated!
    Reserved pricing is now available, with a discount of roughly 41% off Pay-As-You-Go pricing.
  • OneLake integration for Import-mode semantic models is coming!
    This allows for a seamless (at least that's what Microsoft claims 😄) integration for your import Power BI Desktop models into OneLake. I wonder if this also implies that you can convert your import report to a Direct Lake mode model afterwards. It's not totally clear to me at this point.
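To make the Mirroring idea from the list above a bit more tangible, here is a toy sketch in Python of "take an initial snapshot, then keep applying change events". It is purely illustrative: the `ChangeEvent` type, the `take_snapshot` and `apply_changes` functions and the in-memory dictionaries are my own assumptions, and they say nothing about how Fabric, Delta Parquet or Change Data Capture are actually implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical CDC-style event: one row-level change in the source."""
    op: str                     # "insert", "update" or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row values (None for deletes)


def take_snapshot(source: dict) -> dict:
    """Initial load: copy every row of the source into the mirror."""
    return {key: dict(row) for key, row in source.items()}


def apply_changes(mirror: dict, events: list) -> dict:
    """Replay change events in order so the mirror catches up with the source."""
    for event in events:
        if event.op in ("insert", "update"):
            mirror[event.key] = dict(event.row)
        elif event.op == "delete":
            mirror.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return mirror


# A tiny stand-in for a source table, keyed by primary key.
source = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
mirror = take_snapshot(source)

# Changes that happened in the source after the snapshot was taken.
events = [
    ChangeEvent("update", 1, {"name": "Alicia"}),
    ChangeEvent("delete", 2),
    ChangeEvent("insert", 3, {"name": "Carol"}),
]
mirror = apply_changes(mirror, events)
print(mirror)
```

The point of the sketch is only the two-phase shape: one full copy up front, followed by small incremental changes, which is why the source needs some kind of change feed.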
[update on November 16th]

Conclusion

Will I now go all-in on Fabric? "It depends", but probably not 😀
I think it's still a case-by-case decision. If it's a newish customer without many investments in other data platform services like Synapse/Data Factory, then it makes sense to evaluate the requirements and see if it's worthwhile to start with Fabric, keeping in mind that a lot of features are still in preview and others are not there yet.
On the other hand, if it's an existing customer with real estate in Synapse and Databricks for example, where they implemented a medallion structure and have infrastructure running there, I'd seriously question whether it's worth moving to Fabric. I'm leaning towards a no for now.


Keep a lookout on the official Fabric and Power BI blogs and Microsoft Learn for all new content.



I'm updating this post live while the updates are rolling out, so come back later for more updates! 😀

Friday, October 20, 2023

Data, Insights, and Community: My Reflections on Data Saturday Holland and dataMinds Connect

The Lamot conference center alongside the river Dijle in Mechelen, Belgium


The world of data and analytics is constantly evolving, with new tools, technologies, and best practices emerging almost daily. As a data enthusiast and professional, I'm always on the lookout for opportunities to expand my knowledge and stay up to date with the latest trends in the field. Recently, I had the pleasure of attending two fantastic events: Data Saturday Holland and dataMinds Connect, and you may already guess—it was amazing!

Data Saturday Holland - Where Passion Meets Expertise


Data Saturday Holland, formerly known as SQL Saturday Holland, is a renowned event that brings together data professionals, enthusiasts, and experts to share their knowledge and experiences.

One of the things that struck me the most at Data Saturday Holland was the passion of the speakers and participants. From Power BI to data engineering, there was a wide range of sessions to choose from. I attended sessions covering various topics like Direct Lake in Power BI and improving your Power BI report. These sessions provided me with valuable insights and practical tips that I could immediately apply to my work. It was enlightening to learn from experts who shared their real-world experiences and demonstrated the tools and techniques they use on a daily basis.

A highlight of the Saturday was being able to attend my favorite podcast, Knee-Deep in Tech, live in a movie theater!
Knee-Deep in Tech by Heini Ilmarinen, Alexander Arvidsson and Simon Binder


Another highlight of the event was the networking. I had the chance to meet old friends and new people, exchange ideas, and even discuss potential collaborations. It's incredible how the event fostered a sense of community and encouraged knowledge sharing. I left Data Saturday Holland inspired and motivated!

DataMinds Connect - A Deep Dive into Data and AI


Just when I thought my week couldn't get any better, I had the privilege of attending dataMinds Connect. This event is an annual conference organized by the dataMinds community, focusing on Microsoft Data Platform technologies. The event spans (for now 😏) two days and covers a wide range of topics related to data and AI.

One of the standout features of dataMinds Connect was the depth and breadth of the sessions. Experts from various domains within data and AI shared their knowledge. From advanced SQL Server features to leading AI applications, there was something for everyone. I especially enjoyed the in-depth sessions that allowed me to explore complex topics in detail, like Mathias Thierbach's Power BI Source Control precon.




I also volunteered during the two days this year, assisting speakers and visitors in making the most of their own experience. During Mathias' workshop, I helped answer questions and on the second day, I addressed general visitor questions and took care of the speakers, so they didn't have to worry about the technical aspects, drinks, or other logistics.

The community at dataMinds Connect was incredibly welcoming, and I had the opportunity to engage in conversations with participants and speakers. This sense of community and feeling of being welcome was a common theme throughout the event, and it's something that truly sets this conference apart. A beer and some chocolate certainly work wonders too! :-)


The Value of Conferences


Attending Data Saturday Holland and dataMinds Connect provided me with a comprehensive overview of the Power BI and Fabric landscape and the latest developments in this field. I left with new skills and insights and a better understanding of the latest trends and technologies. Moreover, the connections I made during these events were invaluable. Networking with professionals who share my passion and interests can lead to future collaborations, career opportunities, or simply the joy of being part of a vibrant and supportive community.

In summary, my week at Data Saturday Holland and dataMinds Connect was an incredible experience! These events not only expanded my knowledge but also allowed me to connect (and continue to connect) with fellow data enthusiasts and experts!

I have a few more things coming up in the following months:
Do I see you there? 😀

Friday, September 29, 2023

Pausing a Fabric Capacity - What Does It Actually Mean?

After an initial question by my friend and fellow MVP Koen Verbeeck, a bunch of people (myself included) started answering, among them Mohammad Ali, Group Program Manager for Power BI.

After a while it got me thinking:
  • What does it actually mean when I pause a Fabric capacity?
  • What will stop working?
  • What can I still do, and what keeps working?

Important considerations

Microsoft Fabric is a prerelease online service that is currently in public preview and may be substantially modified before it's released. Preview online service products and features aren't complete but are made available on a preview basis so that customers can get early access and provide feedback.
A note before you start, which you might already be aware of: Microsoft Fabric is still in preview, so keep in mind the available functionality, availability and supportability, which are described in detail here.

TL;DR

After playing around and testing various scenarios, I was quite surprised by a few of the answers I got, so keep reading if you want to find out!
In case you are not interested in the setup, you can also skip right to my tests or the conclusions.

Start setup

The steps I took to start exploring the capacity capabilities are the following:
  • I created a Fabric capacity in the Azure Portal for my tenant. You can even start an Azure (30 day) free trial and use that to create a Fabric capacity. Erwin did a great job explaining how to create a Fabric capacity, so I won't go into details here.
  • Then I set up a basic Lakehouse from the Lakehouse tutorial on Microsoft Learn. I followed the tutorial up until step 3 (Build a lakehouse), where I ended up with a dataflow Gen2, a lakehouse and a Power BI (Direct Lake) report on the default dataset.
  • I also created 2 workspaces:
    • Test Fabric Capacity holds all my Fabric items and has the Fabric capacity (nickyscapacity, see below) assigned. This is the workspace I used for my tutorial. Let's call this the Fabric workspace.
    • Test Fabric semantic model has no capacity assigned, so it's a regular (Pro) workspace. Let's call this the regular workspace.
  • After that, I created a few datasets/reports (or semantic models if you will 😀) (with Direct Lake, DirectQuery and Import) on top of the SQL Endpoint of my lakehouse.

The basic report I created, it's not really important how it looks for now:



My tests

The first thing is of course pausing my capacity, which is an easy push of a button in the Azure portal.
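If you'd rather script this than click, pausing should also be possible through the Azure Resource Manager REST API. The sketch below only builds the request URLs. The `suspend`/`resume` action names and the `2023-11-01` API version are what I believe the Microsoft.Fabric resource provider exposes, so treat them as assumptions and check the official reference first. The subscription ID, resource group and the `build_capacity_action_url` helper are made up for illustration.

```python
from urllib.parse import quote

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-11-01"  # assumption: verify against the current REST reference


def build_capacity_action_url(subscription_id: str, resource_group: str,
                              capacity_name: str, action: str) -> str:
    """Build the ARM URL for a POST that suspends (pauses) or resumes a capacity."""
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")
    return (
        f"{ARM_ENDPOINT}/subscriptions/{quote(subscription_id)}"
        f"/resourceGroups/{quote(resource_group)}"
        f"/providers/Microsoft.Fabric/capacities/{quote(capacity_name)}"
        f"/{action}?api-version={API_VERSION}"
    )


url = build_capacity_action_url(
    "00000000-0000-0000-0000-000000000000",  # placeholder subscription ID
    "my-resource-group",                      # hypothetical resource group
    "nickyscapacity",                         # the capacity name used in this post
    "suspend",
)
print(url)
```

You would send this as an authenticated POST (with a bearer token for Azure Resource Manager); I left authentication out to keep the sketch short.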



Next I want to see what happens when I access certain items or take certain actions in my workspaces.
Here's a list of things I tried to do:

Access my dataflow Gen2

Not a very helpful error message 😀

Access the Lakehouse


This is very helpful, it actually mentions my capacity (ID) is not running.

Access SQL Endpoint (of the Lakehouse)


Not very helpful, it doesn't say anything about my capacity.

Access Direct Lake model from Fabric workspace


This one IS helpful, it actually mentions my capacity is not active, although it doesn't give the ID like with accessing the lakehouse.

Access Direct Lake model from workspace

I created a copy of the report into the regular workspace and opened the report.
Strangely enough, some parts of the report still work, I suspect because of some caching that was already done before pausing the capacity. I assume that cache is then copied over (with the report) to the regular workspace.
Some interactions worked, but when I clicked a filter without any cache, I got the same error message as below with the DQ model.


Access DQ model from Fabric workspace

The visual itself gives me the above error, see the detailed error message below: not very helpful.



Access DQ dataset from workspace

This one is a bit inconsistent because I got different errors on this action.

I'm accessing the DQ report in the other, regular workspace. I'm getting a slightly different error in the visual, but the detailed error message is totally different from the one in the Fabric workspace. It's more of a SQL Server error message.
It at least tells me there's something wrong with the SQL endpoint.

But when I tried this same action later, I got the following error, which is much more helpful because it mentions CapacityNotActive.



Access Import dataset from Fabric workspace


The error itself is helpful, because it mentions the capacity.
However, this one surprised me a bit, because I'm accessing an imported model, so the data is no longer in OneLake. But as we'll see a bit further, nothing from a Fabric workspace can be accessed anymore when the capacity is paused.

Access Import dataset from workspace ✅

The difference with the action above is that this is the regular workspace. This one succeeds, because the data is in the imported model in the regular workspace, which is active and running. It has nothing to do with the Fabric capacity.

Download import dataset and re-publish to workspace ✅

Surprisingly (to me), I can still download the dataset from the Fabric workspace. So it seems the dataset itself is not stored in OneLake, since that is paused. It's still a bit strange, then, that the import model doesn't open from this Fabric workspace.


Republishing to the (regular) workspace succeeds and gives me the report below:


Refresh Import dataset from workspace

This action pertains to refreshing the Import dataset from the regular workspace, the dataset which I could open. However, the refresh action itself fails, because it needs the lakehouse data to refresh, which is not available.

Move the Fabric workspace to Pro

When trying to move the Fabric workspace to a regular (Pro) workspace, you might be thrown off by this message at the bottom of the workspace Premium settings:

However, moving a Fabric workspace to Pro is only possible when there are no Fabric items inside:

This is also mentioned as one of the current restrictions.
I would urge you to carefully read those restrictions, the known issue(s) and final way of working when the known issue is resolved, especially if you plan to move items between regions after a workspace has been created.

Access the Fabric Capacity Metrics App ✅

The Fabric Capacity Metrics app just keeps functioning. It doesn't need the capacity itself to operate on; it uses the analytics/telemetry that is logged from the capacity and reports on that.

Editing capacity settings

The capacity settings in the Fabric Admin portal are shown in italics and cannot be edited until you resume the capacity.


Conclusion

So to conclude: all items in a Fabric workspace become unavailable (for interactive opening) when a capacity is paused, including Power BI-only items.
You can still download an import dataset from the workspace. You can also export the .json file of a dataflow (gen 1 and gen 2). But that's about all you can do on a workspace with a paused Fabric capacity.
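For quick reference, the outcomes of the tests above can be captured in a small Python lookup. This is just my own summary of the results described in this post as they were at the time of testing; the `RESULTS` dictionary and the `still_usable` helper are made-up names, not part of any Fabric API.

```python
# Outcome of each action while the Fabric capacity was paused:
# "works", "partial" (only cached visuals respond) or "fails".
RESULTS = {
    "Open dataflow Gen2 (Fabric workspace)": "fails",
    "Open Lakehouse (Fabric workspace)": "fails",
    "Open SQL Endpoint (Fabric workspace)": "fails",
    "Open Direct Lake report (Fabric workspace)": "fails",
    "Open Direct Lake report (regular workspace)": "partial",
    "Open DirectQuery report (Fabric workspace)": "fails",
    "Open DirectQuery report (regular workspace)": "fails",
    "Open Import report (Fabric workspace)": "fails",
    "Open Import report (regular workspace)": "works",
    "Download Import dataset (Fabric workspace)": "works",
    "Export dataflow .json (Fabric workspace)": "works",
    "Refresh Import dataset (regular workspace)": "fails",
    "Open Capacity Metrics app": "works",
    "Edit capacity settings": "fails",
}


def still_usable(results: dict) -> list:
    """Return the actions that fully work while the capacity is paused."""
    return sorted(action for action, outcome in results.items() if outcome == "works")


for action in still_usable(RESULTS):
    print(action)
```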

1: Depending on the way you created the report, it might still have some cache, so it might work partially

Thanks to Štěpán Rešl for pointing out the Usage Metrics report.


So for now, it's best to separate the two, Fabric and non-Fabric items, in separate workspaces so you can always access the Power BI only items when the capacity is paused. In case you are not running your own capacity, but a free trial capacity, you don't have to take this into account for now.


I hope this overview was useful to you; I can at least use it as a reference and lookup post :-)
I am sure a couple of things will change in the near future, as Fabric updates keep coming out regularly.

If you are missing something from this overview let me know in the comments and I can see if I can add it here.

Wednesday, September 20, 2023

SQLBits 2024 Has Landed!

I've written about SQLBits before here and here, but in case you still don't know what SQLBits is, it's the greatest Data conference in Europe, spanning a whopping 5 days, including the free community event on Saturday.
From the SQLBits website:
SQLBits is the industry's leading Data Platform conference with over 300 sessions across 5 days covering data technologies including SQL Server, Azure, Big Data, Power BI, Machine Learning, and more.

And if you need a reason (for your boss) to attend SQLBits: there are 10 reasons listed on their website!

Next to all the great content, there are also lots of extras happening at SQLBits, like a board game night, a pub quiz, a SQLBits run and last but not least the famous Friday night party!


Previous Editions

I've also had the pleasure of presenting at 2 SQLBits events. In 2022 it was virtual, but last year I was excited to be attending and speaking at my first in-person SQLBits! I delivered 2 sessions, one (20-minute) lightning talk and a general session on Write-back with Power Apps in Power BI (the recording can be found here).


2024

Now on to the 2024 event: last week there was a live stream with Marco Russo, Alberto Ferrari (both from SQLBI) and Simon Sabin (founder of SQLBits) announcing the dates and location for this year's event. This year SQLBits will be from March 19th - 23rd, in Farnborough, Hampshire, UK.

The theme for this year will be Aviation, because Farnborough is the birthplace of aviation and the home of pioneering spirit.

The pricing and Call for Speakers for the event will be open very soon, so be sure to keep an eye on their website!

See you there? 😀



Thursday, June 8, 2023

Power BI Desktop - Unable To Connect

 Just a quick post on an issue I saw a few people run into lately.

Context

I opened a report with a Live connection to a dataset and I was presented with the error below:

We encountered an error while trying to connect.
Details: "Looks like we're unable to access the dataset. Please contact the owner of the dataset."

Clicking Edit gets you the following dialogue, which might already give you a clue about what's happening:



Solution

The answer here is actually really simple, if you know it! 😆
You have to log in to Power BI Desktop with the right account. I was logged in with my Powerdobs account, while this was a customer report.

So clicking Edit gave me the datasets in the Powerdobs tenant to connect to, which quickly led me to the conclusion I was logged in with the wrong account.

So to solve this, in the top right corner, click on your name/picture and click Sign in with a different account.


Open the report again and it should connect immediately!
Everyone happy :-)

Monday, May 29, 2023

What Is Microsoft Fabric and Why Should I Care?

As I mentioned in my Build Announcement Summary, Microsoft Fabric has been announced!

In short, Fabric covers the complete spectrum of services including data movement, data lake, data engineering, data integration and data science, real time analytics, and business intelligence.

Straight to Next up.

It promises to offer end to end analytics from the data lake to the business user, covering the following pillars:

  • Complete Analytics Platform
  • Lake-Centric and Open
  • Empower Every Business User
  • AI Powered


In my opinion this provides the following benefits:
  • A broad set of deeply integrated analytics 
  • Shared experiences that are familiar and easy to learn across all the products
  • All assets to be easily discovered and reused by all developers
  • OneLake, a unified data lake that Microsoft calls the "OneDrive for data", allowing customers to keep one copy of the data while using the analytics tools of choice
  • Centralized administration and governance across all workloads

As mentioned in my blog, Fabric is turned off by default, until July 1st, so you will have to enable it to start using it! 


The new Fabric / Power BI home page when you switch on Fabric looks like this:

Fabric home page

And with the button in the bottom left corner you can switch between the different personas/workloads:

Fabric workload switcher



Microsoft also states clearly that standalone PaaS products will stay untouched and remain active. So there's no need for existing customers to worry about solutions currently in production.

"Existing Microsoft products such as Azure Synapse Analytics, Azure Data Factory, and Azure Data Explorer will continue to provide a robust, enterprise-grade platform as a service (PaaS) solution for data analytics. Fabric represents an evolution of those offerings in the form of a simplified SaaS solution that can connect to existing PaaS offerings. Customers will be able to upgrade from their current products into Fabric at their own pace." by Arun Ulag
I'm also pretty sure that existing customers will be able to migrate to the new SaaS solution.
If you are familiar with the current Synapse offerings, you might find the following mapping table interesting to have as a reference.

Synapse                                | Fabric
Pipelines                              | Data Pipelines
Data Flows                             | Dataflows
SQL Pools                              | Data Warehouse
Spark                                  | Spark
Notebooks                              | Notebooks
Azure Data Explorer (ADX/Kusto)        | Real-time Analytics
SQL Serverless                         | Lakehouse
Synapse Workspace                      | Power BI Workspace
ADLS Gen2                              | OneLake
Linked Services                        | Connections
Datasets                               | Sources/Destinations
Self-Hosted Integration Runtime (SHIR) | Power BI Gateway
CI/CD, Git                             | ALM



Although Fabric is still in preview, I would encourage you to try out features and look at the use cases, because:
  • Fabric is based on the serverless paradigm. You don't have to start clusters or manage resources in Azure anymore. Instead, Fabric delivers capacities as a SaaS resource. You can spin up analytics solutions faster and more easily.
  • OneLake makes it easier to:
    • Store large amounts of data
    • Use one accurate, certified and real-time unified source of truth
    • Use shortcuts / mounts to leverage existing data from Azure, AWS or OneLake
  • Analysts can leverage their best skills, be it SQL, Spark or DAX
  • Performance benefits
    Microsoft is working on performance improvements; one example is Direct Lake, the new storage mode for Power BI. Everything in OneLake is now in the same open Delta Parquet format.
  • Simplified billing and management of runtime components
    Fabric now brings capacities with compute instead of activities per pipeline or TeraBytes/s. That means we don't have to include multiple factors into the equation anymore.
    Instead of managing every resource individually, putting it on pause when you don't need it, you can now provision Fabric capacities, which start at a much smaller price point than a Power BI Premium capacity. Exact pricing will be announced later.
  • AI will become a bigger part of our daily work, with the integration of Copilot inside Microsoft Fabric and Power BI
    • Generate code and queries
    • Turn words into dataflows and data pipelines
    • Create Power BI reports in seconds
    • Generate DAX calculations
    • Create narrative summaries
This post by Kim Manis has some more details from Microsoft's point of view: Introducing Microsoft Fabric.

Next up?

There are still quite some questions around Fabric that will be answered in the near future I assume, a few that I'm thinking of are:
  • Is the performance of Direct Lake really going to be that good?
  • What is V-order with regards to parquet files and how can we influence/handle that?
  • How will the Processing Units for the Fabric capacities hold up for specific workloads? It will be interesting to see what an F2 capacity can handle for example.
On Microsoft Learn, there are also 4 End-to-end tutorials available to get you started with learning Fabric:
  • Lakehouse
  • Data Science
  • Real-Time Analytics
  • Data Warehouse
There are also tutorials on more experience-specific topics, like Power BI, Data Factory and Price prediction with R for Data Science.
I see you thinking: "So now I need to learn all these new products/services with all the accompanying languages, like T-SQL, Python, R, KQL and what have you...?"
Can you do it? Of course! But I certainly don't think it's a necessity to get to know everything.

For example:
If today you are a Power BI developer, you might want to familiarize yourself with the Data Warehouse workload and maybe learn the basics of T-SQL. But with the default dataset that comes with the Warehouse (as is the case with a Datamart), you could just as well create a quick, basic report on top of it to explore the data with some simple visualizations, so T-SQL is also optional.

I'm still very excited about this next step forward by Microsoft and I'm eager to start learning more of Fabric, and also to learn the use cases and all the questions that our customers have!

Wednesday, May 24, 2023

Microsoft Build - Data Announcements Summary

During Build we heard a lot of announcements around data, analytics and AI. Let me give you my summary and take on the things I heard and saw!

In general, AI is going to exist in more and more places in our daily work. Earlier, Copilot was already announced in Power Apps and Power Automate, Outlook and Office products, but also GitHub. I wouldn't be surprised if it will be embedded in almost every part of our daily work in the future, at least to some extent.

These were my favorite announcements:
  • Microsoft Fabric delivers an integrated and simplified experience for all analytics workloads
  • Data Activator is a new detection system for alerting and taking actions (and part of Fabric)
  • Git Integration: delivered as part of the new Fabric workloads
  • Power BI Desktop Developer Mode will deliver a better experience for developers with a new Power BI Project file-type (PBIP)
Let's dive into a little bit more details about the above topics.

Microsoft Fabric

Fabric promises to offer end to end analytics from the data lake to the business user, covering the following pillars:
  • Complete Analytics Platform
  • Lake-Centric and Open
  • Empower Every Business User
  • AI Powered
Fabric covers the complete spectrum of services including data movement, data lake, data engineering, data integration and data science, real time analytics, and business intelligence.



Fabric makes life simpler for customers with its unified and comprehensive platform. Fabric architecture is based on Software as a Service (SaaS) foundation instead of the traditional Platform as a Service (PaaS), to take simplicity and integration to the next level.
This SaaS experience makes sure that all the data and services used within Fabric are pre-wired together and share the same user experience, much as with Office today. 

But of course, Microsoft Fabric was not the only announcement at Build.


Power BI Desktop Developer Mode

Power BI Desktop Developer Mode is here, or at least it will be very soon! In a nutshell, "Developer Mode" enables you to save a Power BI Desktop file into a Power BI Project (PBIP) and operate on the artifacts stored as a folder in your file system.
Power BI Desktop is expanding to serve a better experience for developers, with capabilities like:
  • Source Control for version history and diffs
  • CI/CD for e.g. Pull Requests
  • Text editor support

Developer Mode also ties into the next point: Git integration in the service!

Git Integration

The long-awaited source control integration!
Next to Developer Mode in Desktop and an easier and better way to merge changes into source control, Microsoft has also started working on source control integration on the workspace level.
Be aware that this is a Premium feature, so only workspaces with a Power BI Premium capacity license can connect to source control.

Data Activator

This is actually a new name we haven't heard that much about.
"It will help customers respond to changes in their data instantly by setting up a system of detection that automatically alerts the team with the right context to take action."
It looks like a low-code/no-code way to take actions on your data. It's only in private preview at the moment, so we'll have to wait a bit to get more info on this. In the meantime, you can read the announcement blog.


Conclusion

There was a lot of exciting news shared during Microsoft Build!

Also be mindful that until July 1st, Fabric is disabled by default. After that date, it will be enabled by default, so you (as an admin) have some time to prepare your users or only give a small group of people access to Fabric for example. Thank you Microsoft for listening to the community! You can also start a free 60 day trial: aka.ms/Try-Fabric.
A tenant admin can enable Fabric workloads manually by switching the tenant setting to on.
Taken from the Power BI blog


If you want a complete (textual) overview of all announcements during Build, have a look at the Build Book of News 2023.

If you want to know more details about Microsoft Fabric and the other announcements, or if you want to watch the recordings of other sessions, I suggest starting with the below sessions to get an overview.
A few important sessions to start with:
Blog posts:

After hearing all this exciting news, I'll dive into more details on separate blogs on the above topics.












Thursday, May 11, 2023

Microsoft Build Is Around The Corner

 I think I already shared it earlier, but in case you missed it:



In just under 2 weeks, Microsoft Build (in-person and online conference) is happening with a lot of exciting Power BI and data related updates. You should definitely watch it, either live or the recordings afterwards! It starts on Tuesday, May 23rd, 6PM CEST.

More info and registration: build.microsoft.com


A few important sessions to start with:


I will also share an update shortly after Build to summarize the news and give my feedback on it, so stay tuned!
