Chapter 5. Ingesting Data with the Apache Kafka Spout Connector
Apache Kafka is a high-throughput, distributed messaging system. Apache Storm provides a Kafka spout to facilitate ingesting data from Kafka 0.8.x brokers. Because the spout only emits tuples, Storm developers should include downstream bolts in their topologies to process the data it ingests.
The storm-kafka components include a core-storm spout, as well as a fully transactional Trident spout. The storm-kafka spouts provide the following key features:
- 'Exactly once' tuple processing with the Trident API 
- Dynamic discovery of Kafka brokers and partitions 
Hortonworks recommends that Storm developers use the Trident API. However, use the core-storm API if sub-second latency is critical for your Storm topology.
- The core-storm API represents a Kafka spout with the KafkaSpout class.
- The Trident API provides an OpaqueTridentKafkaSpout class to represent the spout.
To initialize KafkaSpout and OpaqueTridentKafkaSpout, Storm developers need an instance of a subclass of the KafkaConfig class, which represents the configuration information needed to ingest data from a Kafka cluster.
- The KafkaSpout constructor requires the SpoutConfig subclass.
- The OpaqueTridentKafkaSpout constructor requires the TridentKafkaConfig subclass.
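The pairing of spout and configuration class can be sketched as follows. This is a minimal illustration, not a complete topology. It assumes the storm-kafka release that targets Kafka 0.8.x, where the classes live in the storm.kafka package (newer Storm releases moved them to org.apache.storm.kafka). The ZooKeeper address, topic name, ZooKeeper root path, and consumer id are hypothetical placeholders.

```java
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;
import storm.kafka.trident.OpaqueTridentKafkaSpout;
import storm.kafka.trident.TridentKafkaConfig;

public class KafkaSpoutFactory {

    // Hypothetical values; replace with your own ZooKeeper quorum and topic.
    private static final String ZK_CONNECT = "zk1.example.com:2181";
    private static final String TOPIC = "events";

    /** Core-storm API: a KafkaSpout is built from a SpoutConfig. */
    public static KafkaSpout coreSpout() {
        BrokerHosts hosts = new ZkHosts(ZK_CONNECT);
        // The last two arguments are the ZooKeeper root path and the consumer id
        // under which the spout stores its offsets (both are placeholder values).
        SpoutConfig config = new SpoutConfig(hosts, TOPIC, "/kafka-spout", "events-reader");
        return new KafkaSpout(config);
    }

    /** Trident API: an OpaqueTridentKafkaSpout is built from a TridentKafkaConfig. */
    public static OpaqueTridentKafkaSpout tridentSpout() {
        BrokerHosts hosts = new ZkHosts(ZK_CONNECT);
        TridentKafkaConfig config = new TridentKafkaConfig(hosts, TOPIC);
        return new OpaqueTridentKafkaSpout(config);
    }
}
```

Compiling this sketch requires the storm-core and storm-kafka libraries on the classpath. Constructing the spouts does not contact Kafka or ZooKeeper; connections are opened only when the topology runs.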
In turn, the constructors for both SpoutConfig and TridentKafkaConfig require an implementation of the BrokerHosts interface, which is used to map Kafka brokers to topic partitions. The storm-kafka component provides two implementations of BrokerHosts: ZkHosts and StaticHosts.
- Use the ZkHosts implementation to dynamically track broker-to-partition mapping.
- Use the StaticHosts implementation for static broker-to-partition mapping.
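The two BrokerHosts implementations might be constructed as shown below. This is a hedged sketch that assumes the storm.kafka package layout of the storm-kafka release for Kafka 0.8.x, in which GlobalPartitionInformation has a no-argument constructor (later releases changed this signature, so check the version you build against). All host names, ports, and partition numbers are hypothetical.

```java
import storm.kafka.Broker;
import storm.kafka.BrokerHosts;
import storm.kafka.StaticHosts;
import storm.kafka.ZkHosts;
import storm.kafka.trident.GlobalPartitionInformation;

public class BrokerHostsExamples {

    /** Dynamic mapping: broker and partition metadata is read from ZooKeeper. */
    public static BrokerHosts dynamicHosts() {
        // Placeholder ZooKeeper connect string used by the Kafka cluster.
        return new ZkHosts("zk1.example.com:2181");
    }

    /** Static mapping: each partition id is pinned to a fixed broker. */
    public static BrokerHosts staticHosts() {
        GlobalPartitionInformation partitions = new GlobalPartitionInformation();
        // Placeholder brokers: partition 0 on kafka1, partition 1 on kafka2.
        partitions.addPartition(0, new Broker("kafka1.example.com", 9092));
        partitions.addPartition(1, new Broker("kafka2.example.com", 9092));
        return new StaticHosts(partitions);
    }
}
```

Either return value can be passed as the first argument to the SpoutConfig or TridentKafkaConfig constructor. A static mapping does not follow changes in partition leadership, so ZkHosts is usually the safer choice when brokers may be added, removed, or restarted.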

