Databricks or Microsoft Fabric?
One of the questions I’ve been getting lately is this: given the huge overlap between the features of Databricks and Microsoft Fabric, which one should customers standardize on when building out their data platforms? While every company has its own unique set of requirements that will drive the decision, in this article I propose three requirements that should be universal to all companies building a data platform, irrespective of industry and size. I call it the 3S model for choosing a data platform technology. Full disclosure: I work for Databricks, but the opinions and views expressed in this article are solely my own and do not represent the views or opinions of my employer.

Requirement 1: Simplicity
A simple yet powerful tool will always win over a more complex tool with the same capabilities. Having fewer moving parts not only makes a tool simpler, it also makes it more reliable. Let’s compare Databricks and Fabric against this backdrop.
Databricks uses a single storage pattern (Delta tables) and a single engine (Spark + Photon) for all its workloads. While these are available in different form factors and at different price points to suit the needs of different customers, they all leverage the same underlying technology. Store your data in Delta tables, access it with the Spark + Photon engine, and every workload benefits from the same performance-optimized stack.
Microsoft Fabric is a conglomeration of four different technologies: SQL Data Warehouse, Spark Lakehouse, ADX/KQL Database and Power BI Datamarts. There is even a guide to help you decide when to use each of these four flavors of Fabric storage. The result is data silos and data duplication, both of which hinder your ability to analyze data quickly and increase your overall costs.
Requirement 2: Security
While all tools have encryption and access control features, what differentiates tools is the ease and flexibility with which you can encrypt and control your data assets. Most customers I have spoken to want to get out of “IAM hell” and want to leverage a single pane of glass to show their complete data security posture.
Microsoft Fabric, by virtue of its siloed data storage model, requires you to control access to the data in each of the four silos separately. This adds to the complexity of access control, and a slip-up in any one of these stores can expose your data to entities who shouldn’t have access to it. While Microsoft Purview attempts to be that single pane of glass, it is yet to be completely integrated with Fabric.
Databricks stores all data in cloud storage as Delta tables and files. You can control access to both tables and files via Unity Catalog. As long as you encrypt and lock down the cloud storage assets (so that only admins can access raw data on cloud storage) and control access through Unity Catalog, you can rest assured that your data is visible only to entities that have been granted access via Unity Catalog.
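To make the "single place to control access" idea a bit more concrete, here is a minimal sketch of the grants a principal would typically need to read one table through Unity Catalog. The helper function and the names main, sales, orders and analysts are hypothetical and used only for illustration. The sketch only builds the SQL text, which you would then run yourself in a Databricks notebook or SQL editor.

```python
def grant_statements(catalog, schema, table, principal, privileges=("SELECT",)):
    """Build the Unity Catalog GRANT statements for one table.

    Hypothetical helper for illustration: it only returns SQL strings
    and does not connect to Databricks.
    """
    def quote(name):
        # Backtick-quote identifiers, doubling any embedded backticks.
        return "`" + name.replace("`", "``") + "`"

    who = quote(principal)
    full_schema = f"{quote(catalog)}.{quote(schema)}"
    full_table = f"{full_schema}.{quote(table)}"
    return [
        # The principal must be able to use the parent catalog and schema...
        f"GRANT USE CATALOG ON CATALOG {quote(catalog)} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {full_schema} TO {who}",
        # ...before a table-level privilege such as SELECT takes effect.
        f"GRANT {', '.join(privileges)} ON TABLE {full_table} TO {who}",
    ]


# Example names are assumptions, not taken from a real workspace.
for statement in grant_statements("main", "sales", "orders", "analysts"):
    print(statement)
```

In a notebook, each string could be passed to spark.sql(...). The point of the sketch is that all three grants live in one catalog rather than in a separate permission model per storage engine.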
It remains to be seen how Microsoft Purview and Databricks Unity Catalog mature over the coming years to become that central governance platform that they aspire to be.
Requirement 3: Shareability
In this day and age sharing data within an organization and across organizations has become increasingly important. Hoarding data leads to locking up the value of this important asset and loss of opportunity to monetize your data.
Microsoft Fabric, while great at allowing you to ingest data from different sources, doesn’t itself allow its data to be shared outside your organization. I believe this is an important feature that should be available in every data platform to ensure interoperability both between and within organizations.
Databricks provides Delta Sharing, which allows you to share data safely within and across organizations. Databricks Clean Rooms is an added feature that further enhances your ability to share data securely with external parties.
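For a feel of what the recipient side of Delta Sharing can look like, here is a small sketch that assumes the open source delta-sharing Python connector (installed with pip install delta-sharing) and a profile file supplied by the data provider. The file name config.share and the share, schema and table names are placeholders I made up for illustration.

```python
def shared_table_url(profile_path, share, schema, table):
    """Build the table URL the delta-sharing connector expects:
    <profile-file-path>#<share>.<schema>.<table>
    """
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path, share, schema, table):
    """Load a shared table into a pandas DataFrame.

    Requires the third-party delta-sharing package and a valid profile
    file from the provider, so the import is kept local to this function.
    """
    import delta_sharing

    return delta_sharing.load_as_pandas(
        shared_table_url(profile_path, share, schema, table)
    )


# Placeholder names; a real provider would give you the actual values.
print(shared_table_url("config.share", "sales_share", "retail", "orders"))
```

The recipient does not need a Databricks workspace for this. With a real profile file, calling load_shared_table with the provider's share, schema and table names would return the shared data as a DataFrame.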
Conclusion
While there might be other critical requirements to consider before making a decision, the three points discussed above are a great starting point. Do try both tools and let me know in the comments which one you like better for your organization.





