Monday, May 29, 2023

What Is Microsoft Fabric and Why Should I Care?

As I mentioned in my Build Announcement Summary, Microsoft Fabric has been announced!

In short, Fabric covers the complete spectrum of services: data movement, data lake, data engineering, data integration, data science, real-time analytics, and business intelligence.


It promises to offer end-to-end analytics from the data lake to the business user, covering the following pillars:

  • Complete Analytics Platform
  • Lake-Centric and Open
  • Empower Every Business User
  • AI Powered


In my opinion this provides the following benefits:
  • A broad set of deeply integrated analytics experiences
  • Shared experiences that are familiar and easy to learn across all the products
  • Assets that can be easily discovered and reused by all developers
  • OneLake, a unified data lake that Microsoft calls the "OneDrive for data", which allows customers to keep one copy of the data while using the analytics tools of their choice
  • Centralized administration and governance across all workloads

As mentioned in my blog, Fabric is turned off by default until July 1st, so you will have to enable it before you can start using it!


The new Fabric / Power BI home page when you switch on Fabric looks like this:

Fabric home page

And with the button in the bottom-left corner you can switch between the different personas/workloads:

Fabric workload switcher



Microsoft also states clearly that standalone PaaS products will stay untouched and remain active. So there's no need for existing customers to worry about solutions currently in production.

"Existing Microsoft products such as Azure Synapse Analytics, Azure Data Factory, and Azure Data Explorer will continue to provide a robust, enterprise-grade platform as a service (PaaS) solution for data analytics. Fabric represents an evolution of those offerings in the form of a simplified SaaS solution that can connect to existing PaaS offerings. Customers will be able to upgrade from their current products into Fabric at their own pace." by Arun Ulag
I'm also pretty sure that existing customers will be able to migrate to the new SaaS solution.
If you are familiar with the current Synapse offerings, you might find the following mapping table interesting to have as a reference.

Synapse                                 | Fabric
--------------------------------------- | --------------------
Pipelines                               | Data Pipelines
Data Flows                              | Dataflows
SQL Pools                               | Data Warehouse
Spark                                   | Spark
Notebooks                               | Notebooks
Azure Data Explorer (ADX/Kusto)         | Real-time Analytics
SQL Serverless                          | Lakehouse
Synapse Workspace                       | Power BI Workspace
ADLS Gen2                               | OneLake
Linked Services                         | Connections
Datasets                                | Sources/Destinations
Self-Hosted Integration Runtime (SHIR)  | Power BI Gateway
CI/CD, Git                              | ALM
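
If you want to keep this mapping at hand while reading Fabric documentation or reviewing existing Synapse solutions, a small lookup like the one below could help. This is just a sketch: the dictionary is copied from the mapping above, while the names SYNAPSE_TO_FABRIC and fabric_equivalent are my own and not part of any Microsoft tooling.

```python
# Synapse-to-Fabric terminology, copied from the mapping table in this post.
SYNAPSE_TO_FABRIC = {
    "Pipelines": "Data Pipelines",
    "Data Flows": "Dataflows",
    "SQL Pools": "Data Warehouse",
    "Spark": "Spark",
    "Notebooks": "Notebooks",
    "Azure Data Explorer (ADX/Kusto)": "Real-time Analytics",
    "SQL Serverless": "Lakehouse",
    "Synapse Workspace": "Power BI Workspace",
    "ADLS Gen2": "OneLake",
    "Linked Services": "Connections",
    "Datasets": "Sources/Destinations",
    "Self-Hosted Integration Runtime (SHIR)": "Power BI Gateway",
    "CI/CD, Git": "ALM",
}


def fabric_equivalent(synapse_term: str) -> str:
    """Return the Fabric term for a Synapse term (case-insensitive match)."""
    lookup = {name.lower(): fabric for name, fabric in SYNAPSE_TO_FABRIC.items()}
    key = synapse_term.strip().lower()
    if key not in lookup:
        raise KeyError(f"No Fabric mapping known for {synapse_term!r}")
    return lookup[key]


if __name__ == "__main__":
    print(fabric_equivalent("SQL Pools"))
    print(fabric_equivalent("adls gen2"))
```

The lookup is case-insensitive on purpose, since these terms tend to be written inconsistently in existing documentation.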



Although Fabric is still in preview, I would encourage you to try out features and look at the use cases, because:
  • Fabric is based on the serverless paradigm. You don't have to start clusters or manage resources in Azure anymore. Instead, Fabric delivers capacities as a SaaS resource. You can spin up analytics solutions faster and more easily.
  • OneLake makes it easier to:
    • Store large amounts of data
    • Use one accurate, certified and real-time unified source of truth
    • Use shortcuts / mounts to leverage existing data from Azure, AWS or OneLake
  • Analysts can leverage their best skills, be it SQL, Spark or DAX
  • Performance benefits
    Microsoft is working on performance improvements. One example is Direct Lake, the new storage mode for Power BI. Everything in OneLake is now in the same open Delta Parquet format.
  • Simplified billing and management of runtime components
    Fabric now bills for capacities with compute instead of per pipeline activity or per terabyte processed. That means we no longer have to factor multiple pricing models into the equation.
    Instead of managing every resource individually and pausing it when you don't need it, you can now provision Fabric capacities, which start at a much lower price point than a Power BI Premium capacity. Exact pricing will be announced later.
  • AI will become a bigger part of our daily work, with the integration of Copilot inside Microsoft Fabric and Power BI
    • Generate code and queries
    • Turn words into dataflows and data pipelines
    • Create Power BI reports in seconds
    • Generate DAX calculations
    • Create narrative summaries
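
To make the OneLake point above a bit more concrete: because everything lands in one lake, a table can be addressed with a single ADLS-style URI. The sketch below only builds such a URI as a string. It assumes the abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/Tables/<table> layout as I understand it from the preview documentation, and the workspace, lakehouse and table names are made up for illustration. The helper name onelake_table_uri is mine, not a Fabric API.

```python
# Assumed OneLake endpoint host (check the current Fabric docs before relying on it).
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """Build an ABFS-style URI for a Delta table stored in a Fabric Lakehouse.

    Assumes simple names without slashes or spaces; no escaping is applied.
    """
    for label, value in (("workspace", workspace),
                         ("lakehouse", lakehouse),
                         ("table", table)):
        if not value or "/" in value or " " in value:
            raise ValueError(f"Invalid {label} name: {value!r}")
    return (
        f"abfss://{workspace}@{ONELAKE_HOST}/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


if __name__ == "__main__":
    # Hypothetical names, purely for illustration.
    print(onelake_table_uri("SalesWorkspace", "Bronze", "orders"))
```

In a Fabric notebook, a URI like this could then be handed to Spark to read the Delta table, but that part depends on the runtime and is not shown here.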
This post by Kim Manis has some more details from Microsoft's point of view: Introducing Microsoft Fabric.

Next up?

There are still quite a few questions around Fabric that I assume will be answered in the near future. A few that I'm thinking of are:
  • Is the performance of Direct Lake really going to be that good?
  • What is V-Order with regard to Parquet files, and how can we influence/handle that?
  • How will the Processing Units for the Fabric capacities hold up for specific workloads? It will be interesting to see what an F2 capacity can handle for example.
On Microsoft Learn, there are also four end-to-end tutorials available to get you started with learning Fabric:
  • Lakehouse
  • Data Science
  • Real-Time Analytics
  • Data Warehouse
There are also tutorials on more experience-specific topics, like Power BI, Data Factory and price prediction with R for Data Science.
I see you thinking: "So now I need to learn all these new products/services with all the accompanying languages, like T-SQL, Python, R, KQL and what have you...?"
Can you do it? Of course! But I certainly don't think it's a necessity to get to know everything.

For example:
If you are a Power BI developer today, you might want to familiarize yourself with the Data Warehouse workload and maybe learn the basics of T-SQL. But with the default dataset that comes with the Warehouse (as is the case with a Datamart), you could just as well build a quick, basic report on top of it to get to know the data, so T-SQL is optional too.
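
To give an idea of what "the basics" could look like, here is a small sketch of a typical aggregate query. It runs against an in-memory SQLite database through Python's sqlite3 module, purely as a stand-in so the example is runnable anywhere; it is not a Fabric Warehouse connection. The query sticks to plain SELECT / GROUP BY / ORDER BY, which T-SQL accepts as well, and the sales table and its columns are invented for this example.

```python
import sqlite3


def sales_per_region(rows):
    """Load (region, amount) rows into a temp table and total them per region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO sales (region, amount) VALUES (?, ?)", rows
        )
        # The kind of query a report builder would write first:
        # one row per region, highest total on top.
        query = """
            SELECT region, SUM(amount) AS total_amount
            FROM sales
            GROUP BY region
            ORDER BY total_amount DESC
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    demo_rows = [("North", 120.0), ("South", 80.0), ("North", 30.0)]
    for region, total in sales_per_region(demo_rows):
        print(region, total)
```

If you can read and tweak a query like this, you already know enough SQL to explore most warehouse tables before building a report.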

I'm still very excited about this next step forward by Microsoft, and I'm eager to learn more about Fabric, as well as the use cases and questions that our customers come up with!


Cheers,
Nicky