A Comprehensive Review of Cloud-Native Event Driven Architectures for Real-Time Data Streaming and Analytics in Large-Scale Enterprises

© 2024 by IJCTT Journal
Volume-72 Issue-12
Year of Publication: 2024
Authors: Murugan Lakshmanan
DOI: 10.14445/22312803/IJCTT-V72I12P116

How to Cite?

Murugan Lakshmanan, "A Comprehensive Review of Cloud-Native Event Driven Architectures for Real-Time Data Streaming and Analytics in Large-Scale Enterprises," International Journal of Computer Trends and Technology, vol. 72, no. 12, pp. 133-137, 2024. Crossref, https://doi.org/10.14445/22312803/IJCTT-V72I12P116

Abstract
Modern enterprises increasingly depend on up-to-the-moment insights to enhance decision-making and operational effectiveness. Event-Driven Architectures (EDAs), paired with cloud-native platforms, have become critical paradigms for delivering real-time analytics at a significant scale. This article consolidates foundational theories, industrial practices, and recent academic findings related to event streaming technologies, stream processing frameworks, architectural design principles, governance policies, compliance strategies, and best practices for operation. By reviewing leading open-source tools, established design patterns, real-world applications, and emerging developments, this study offers a unified reference for professionals, enterprise architects, and researchers. Special attention is devoted to scalability approaches, fault tolerance, data protection, regulatory mandates (such as GDPR and CCPA), performance metrics, and the integration of machine learning with advanced analytics. The discussion concludes with an exploration of new directions, including serverless implementations, interoperability standards, and AI-driven performance optimizations, thereby guiding continued progress in this evolving field.

Keywords
Cloud-Native Computing, Data Analytics, Enterprise Data Governance, Event-Driven Architecture, Messaging Platforms, Real-Time Data Streaming, Stream Processing
