Chapter 8. Using Spark Streaming
Spark Streaming is an extension of the core Spark API that enables scalable,
high-throughput, fault-tolerant processing of real-time data streams. Data can be ingested from
sources such as Kafka and Flume, and can be processed using complex algorithms expressed with
high-level functions like map, reduce, join, and
window. Processed data can be sent to file systems, databases, and live
dashboards.
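
As an illustration of how these high-level functions combine, the following is a minimal PySpark sketch of a windowed word count, modeled loosely on the network word count in the Apache programming guide. Several details are assumptions made for the example rather than anything this chapter prescribes: the socket source on `localhost:9999` (standing in for Kafka or Flume), the 10-second batch interval, the 30-second window, and the helper names `split_words`, `add`, `windowed_word_counts`, and `main`. It also assumes that `pyspark` is available wherever `main()` is called.

```python
def split_words(line):
    """Split one line of text into whitespace-separated words."""
    return line.split()


def add(a, b):
    """Combine two partial counts for the same word."""
    return a + b


def windowed_word_counts(lines, window_seconds=30, slide_seconds=10):
    """Build (word, count) pairs over a sliding window from a DStream of lines."""
    pairs = lines.flatMap(split_words).map(lambda word: (word, 1))
    # Passing None as the inverse function recomputes each window in full
    # rather than incrementally. Both durations must be multiples of the
    # batch interval chosen in main().
    return pairs.reduceByKeyAndWindow(add, None, window_seconds, slide_seconds)


def main(host="localhost", port=9999, batch_seconds=10):
    # Imported here so the helper functions above can be used without Spark.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="WindowedWordCount")
    ssc = StreamingContext(sc, batch_seconds)  # one micro-batch per batch_seconds
    lines = ssc.socketTextStream(host, port)   # assumed test source, not Kafka or Flume
    windowed_word_counts(lines).pprint()       # print the first few results of each batch
    ssc.start()
    ssc.awaitTermination()
```

Calling `main()` from a script submitted with `spark-submit` would start the job. The transformation logic lives in `windowed_word_counts` so that the same function could be applied to a stream created from a different source.
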

| Important |
|---|
| Kafka Direct Receiver integration with Spark Streaming only works when the cluster is not Kerberos-enabled. Dynamic Resource Allocation does not work with Spark Streaming. |

The Apache Spark Streaming Programming Guide offers conceptual information; programming examples in Scala, Java, and Python; and performance tuning information.
For additional examples, see the Apache GitHub example repositories for Scala, Java, and Python.

