What are the components of Microsoft Fabric?
Data Engineering: The Data Engineering experience provides a Spark platform for large-scale data transformation and job scheduling; see the Spark sketch after this list.
Data Factory: Data Factory combines the simplicity of Power Query with the scale of Azure Data Factory, offering 200+ native connectors for on-premises and cloud data sources.
Data Science: The Data Science experience integrates with Azure Machine Learning for model training, deployment, and experiment tracking, letting analysts enrich data with predictions and shift insights from descriptive to predictive.
Data Warehouse: The Data Warehouse experience delivers leading SQL performance and scale. It fully decouples compute from storage so each can scale independently, and it stores data natively in the open Delta Lake format.
Real-Time Analytics: Observational data collected from diverse sources grows rapidly and is often semi-structured (JSON or text), high-volume, and schema-flexible, which makes it difficult for traditional data warehouses to handle. Real-Time Analytics excels at this kind of data.
Power BI: Power BI, the premier business intelligence platform, gives business owners seamless access to all data in Fabric, enabling informed decision-making.
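For example, a typical Data Engineering task is a small Spark transformation in a notebook. The sketch below assumes a Fabric notebook with a lakehouse attached (where the spark session is predefined); the table names "raw_sales" and "sales_clean" are hypothetical.

```python
# A minimal sketch of a transformation in a Fabric Data Engineering
# notebook. Assumes an attached lakehouse and a predefined `spark`
# session; table names are hypothetical.
from pyspark.sql import functions as F

raw = spark.read.table("raw_sales")

# Drop incomplete rows and derive a date column from a timestamp.
clean = (
    raw.dropna(subset=["order_id", "amount"])
       .withColumn("order_date", F.to_date("order_ts"))
)

# Persist the result as a managed Delta table in the lakehouse.
clean.write.format("delta").mode("overwrite").saveAsTable("sales_clean")
```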
What is OneLake in Microsoft Fabric?
Built on top of Azure Data Lake Storage Gen2, OneLake is a single logical data lake for the whole organization: it can span multiple storage accounts across different regions while virtualizing them into one seamless, logical lake.
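Because OneLake exposes an ADLS Gen2-compatible endpoint, existing tools can address data with a single URI regardless of which storage account or region actually holds it. A minimal sketch, assuming a Fabric Spark notebook; the workspace and lakehouse names are placeholders:

```python
# A sketch of addressing OneLake through its ADLS Gen2-compatible
# endpoint. "MyWorkspace" and "MyLakehouse" are placeholder names;
# adjust the path to your own items.
path = (
    "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/"
    "MyLakehouse.Lakehouse/Files/raw/events.json"
)
events = spark.read.json(path)
events.show(5)
```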
What is Direct Lake?
Direct Lake is based on loading parquet-formatted files directly from a data lake without having to query a Lakehouse endpoint, and without having to import or duplicate data into a Power BI dataset. Direct Lake is a fast-path to load the data from the lake straight into the Power BI engine, ready for analysis.
What is Delta Lake?
Delta Lake is an open-source storage layer that brings ACID (atomicity, consistency, isolation, and durability) transactions to Apache Spark and big data workloads. The current version of Delta Lake included with Azure Synapse has language support for Scala, PySpark, and .NET and is compatible with Linux Foundation Delta Lake. Delta Lake is designed to work with Apache Spark, a powerful processing engine capable of handling large amounts of data and complex analytics workloads.
A data lakehouse is a hybrid architecture that combines the best of data lake and data warehouse capabilities; Delta Lake, by contrast, is a data management layer that runs on Apache Spark and helps make that architecture possible.
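To make the ACID point concrete, the sketch below creates a Delta table, applies an atomic update, and reads an earlier version via time travel. It assumes a Spark session with Delta Lake support (as in Fabric or Azure Synapse); the "orders" table is hypothetical.

```python
# A sketch of Delta Lake's ACID guarantees in PySpark. Assumes a Spark
# session with Delta Lake support; table name is hypothetical.
from delta.tables import DeltaTable

data = spark.createDataFrame([(1, "open"), (2, "open")], ["id", "status"])
data.write.format("delta").mode("overwrite").saveAsTable("orders")

# An atomic, transactional update: it either fully commits or not at all.
orders = DeltaTable.forName(spark, "orders")
orders.update(condition="id = 2", set={"status": "'closed'"})

# Time travel: query the table as it was before the update.
previous = spark.sql("SELECT * FROM orders VERSION AS OF 0")
previous.show()
```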
What is a Parquet file?
Parquet stores data in a columnar format, which supports high-speed reads over huge volumes of data. In Fabric, the Delta format is layered on top of Parquet files to enable ACID transactions.
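The sketch below illustrates both points: plain Parquet gives columnar storage and column pruning, while Delta adds a transaction log on top of the same Parquet data files. Paths and column names are placeholders; it assumes a Spark session with Delta Lake support.

```python
# A sketch contrasting plain Parquet with Delta. Paths are placeholders;
# assumes a Spark session with Delta Lake support.
df = spark.createDataFrame(
    [(1, "widget", 9.99), (2, "gadget", 24.50)],
    ["id", "product", "price"],
)

# Plain Parquet: columnar and compressed, but no transaction log.
df.write.mode("overwrite").parquet("Files/demo/products_parquet")

# Column pruning: only the 'price' column is read from disk.
prices = spark.read.parquet("Files/demo/products_parquet").select("price")

# Delta = Parquet data files + a transaction log, which adds ACID.
df.write.format("delta").mode("overwrite").save("Files/demo/products_delta")
```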