Implementing Enterprise-Wide Lakehouse using Microsoft Azure Databricks and Delta Lake
© 2025 by IJCTT Journal
Volume-73 Issue-4
Year of Publication: 2025
Authors: Mehul K Bhuva
DOI: 10.14445/22312803/IJCTT-V73I4P119
How to Cite?
Mehul K Bhuva, "Implementing Enterprise-Wide Lakehouse using Microsoft Azure Databricks and Delta Lake," International Journal of Computer Trends and Technology, vol. 73, no. 4, pp. 135-139, 2025. Crossref, https://doi.org/10.14445/22312803/IJCTT-V73I4P119
Abstract
This article presents a practical and scalable approach to implementing an enterprise-wide Lakehouse using Azure Databricks and Delta Lake. As data grows in volume, variety, and velocity, organizations need a unified platform that combines the reliability of data warehouses with the scalability of data lakes. The Lakehouse paradigm meets this need by adding transactional guarantees to data lakes, so that a single platform can support both analytical and operational workloads. The article discusses the architecture, key components, implementation strategies, and real-world considerations for building such systems in Azure. The reported results show improved data governance, reduced data duplication, and faster insights. The architecture has broad implications for digital transformation and advanced analytics.
Keywords
Azure Databricks, Data lakehouse, Data pipeline, Delta Lake, Enterprise data architecture.