Introduction to Kafka
Apache Kafka is a distributed streaming platform. What exactly does that mean?
A streaming platform has three key capabilities:
- Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
- Store streams of records in a fault-tolerant, durable way.
- Process streams of records as they occur.
Kafka is generally used for two broad classes of applications:
- Building real-time streaming data pipelines that reliably get data between systems or applications.
- Building real-time streaming applications that transform or react to the streams of data.
Kafka cluster and Kafka Connect JDBC sink setup (e.g. for a 3-node cluster)
Use-Case
In this setup we send data from a Kafka Avro producer to an AWS Aurora MySQL database. We use the Avro converter so that Kafka Connect can read each record together with its schema and map the fields to table columns.
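
As a rough illustration of the use case above, the sketch below builds the kind of JSON payload that is typically submitted to the Kafka Connect REST API (`POST /connectors`) to register a JDBC sink with the Avro converter. It assumes the Confluent JDBC sink connector and a Schema Registry are in use. The connector name, topic, hostnames, database name, and credentials are placeholders invented for this example, not values from this setup.

```python
import json


def build_jdbc_sink_config(name, topic, jdbc_url, user, password, schema_registry_url):
    """Return a Kafka Connect REST payload for a JDBC sink connector.

    All argument values are supplied by the caller; nothing here is
    specific to a real cluster.
    """
    return {
        "name": name,
        "config": {
            # Confluent JDBC sink connector class (assumed to be installed).
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "tasks.max": "1",
            "topics": topic,
            # Aurora MySQL is reached through a standard MySQL JDBC URL.
            "connection.url": jdbc_url,
            "connection.user": user,
            "connection.password": password,
            # The Avro converter reads record schemas from the Schema Registry.
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "value.converter.schema.registry.url": schema_registry_url,
            # Plain inserts; create the target table if it does not exist.
            "insert.mode": "insert",
            "auto.create": "true",
        },
    }


# Placeholder values for illustration only.
payload = build_jdbc_sink_config(
    name="aurora-mysql-sink",
    topic="example-topic",
    jdbc_url="jdbc:mysql://aurora-host.example.com:3306/exampledb",
    user="example_user",
    password="example_password",
    schema_registry_url="http://schema-registry.example.com:8081",
)

print(json.dumps(payload, indent=2))
```

The printed JSON can be saved to a file and posted to any Connect worker in the cluster. Settings such as `insert.mode` and `auto.create` are one reasonable starting point and may need to change depending on how the target table is managed.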