Apache Samza is anopen-source, near-realtime, asynchronous computational framework forstream processing developed by theApache Software Foundation inScala andJava. It has been developed in conjunction withApache Kafka. Both were originally developed byLinkedIn.[2]
Samza allows users to buildstateful applications that process data in real-time from multiple sources including Apache Kafka.
Samza provides fault tolerance, isolation and stateful processing. Unlike batch systems such asApache Hadoop orApache Spark, it provides continuous computation and output, which result in sub-second[3] response times.
There are many players in the field of real-time stream processing and Samza is one of the mature products.[4][5][6] It was added to Apache in 2013.[7]
Samza is used by multiple companies.[8] The biggest installation is in LinkedIn.