In today's era of Big Data and data-driven decision-making, there is a need to shift focus from computation to data flows.
The stream processing paradigm, reincarnated in various forms, is the basis for a fundamentally different approach to developing systems.
This altered way of handling and processing data is a precursor to realizing efficient data pipelines for feeding machine learning and
In this talk, you will get acquainted with the core ideas behind stream processing and see how it scales smoothly from a single machine to a distributed setup. To make the talk pragmatic, you will see an example implementation of an application that finds the run of consecutive 9s in the digits of pi known as the Feynman point. The same problem will be implemented both as a pure Java stream processing program and with Apache Spark on AWS EMR.
One peculiarity of the solution is that you will not see explicit loops of any sort or branching instructions (barring those used to validate command line arguments), even though the digits of pi are produced iteratively. Streams are the foundation for systems like Apache Kafka, AWS Kinesis, and similar data streaming platforms.
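To give a flavor of the loop-free style described above, here is a minimal sketch in plain Java. It is not the implementation shown in the talk. The class and method names are hypothetical, Machin's formula is chosen here only for illustration, and the digits are computed up to a fixed length rather than produced as an unbounded stream.

```java
import java.math.BigInteger;
import java.util.OptionalInt;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical sketch: all names here are illustrative, not from the talk.
public class FeynmanPointSketch {

    // Fixed-point arctan(1/x) scaled by `scale`, summed as a stream of series terms.
    static BigInteger arctanInverse(int x, BigInteger scale) {
        BigInteger xSquared = BigInteger.valueOf((long) x * x);
        // Stream state: { scale / x^(2k+1), k }.
        // The three-argument Stream.iterate (Java 9+) stops once the power truncates to zero.
        return Stream.iterate(
                        new BigInteger[] { scale.divide(BigInteger.valueOf(x)), BigInteger.ZERO },
                        s -> s[0].signum() > 0,
                        s -> new BigInteger[] { s[0].divide(xSquared), s[1].add(BigInteger.ONE) })
                .map(s -> s[0]
                        // divide by 2k + 1
                        .divide(s[1].shiftLeft(1).add(BigInteger.ONE))
                        // apply the alternating sign (-1)^k
                        .multiply(BigInteger.valueOf(1 - 2 * (s[1].intValue() % 2))))
                .reduce(BigInteger.ZERO, BigInteger::add);
    }

    // Returns "3" followed by `decimals` decimal digits of pi,
    // using Machin's formula: pi = 16 * arctan(1/5) - 4 * arctan(1/239).
    static String piDigits(int decimals) {
        int guard = 10; // extra digits that absorb truncation error, dropped at the end
        BigInteger scale = BigInteger.TEN.pow(decimals + guard);
        BigInteger pi = arctanInverse(5, scale).multiply(BigInteger.valueOf(16))
                .subtract(arctanInverse(239, scale).multiply(BigInteger.valueOf(4)));
        return pi.divide(BigInteger.TEN.pow(guard)).toString();
    }

    // Index 0 holds the leading "3", so index i is the i-th decimal place.
    static OptionalInt firstOccurrence(String digits, String run) {
        return IntStream.range(1, digits.length())
                .filter(i -> digits.startsWith(run, i))
                .findFirst();
    }

    public static void main(String[] args) {
        // The Feynman point starts at the 762nd decimal place.
        System.out.println(firstOccurrence(piDigits(800), "999999")); // prints OptionalInt[762]
    }
}
```

The iteration, the stopping condition, and the search are all expressed as stream stages (`iterate`, `filter`, `findFirst`), so the program contains no `for`, `while`, or `if` statements.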
Intermediate / Beginner
Ervin has been a professional software engineer since 1994. He is an IEEE Software
Engineering Certified Instructor and holds the IEEE Professional Software
Engineering Master Certification.
Ervin is also a Senior Member of the IEEE and a Professional Member of the ACM. He is the author of several books and scientific journal and conference papers (a full list is available upon request). Ervin is an owner of Expro I.T. Consulting, a consulting company in Serbia (see exproit.rs), and an Associate Professor at the University of Novi Sad, Faculty of Technical Sciences, Serbia.
He holds citizenship of both Serbia and Hungary. Ervin's mother tongue is Hungarian, and he speaks Serbian and English fluently.