Introduction to Kafka

Apache Kafka is a distributed streaming platform. What exactly does that mean?

A streaming platform has three key capabilities:

  • Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
  • Store streams of records in a fault-tolerant, durable way.
  • Process streams of records as they occur.

Kafka is generally used for two broad classes of applications:

  • Building real-time streaming data pipelines that reliably move data between systems or applications.
  • Building real-time streaming applications that transform or react to streams of data.

Kafka cluster and Kafka Connect JDBC sink setup (e.g. for a 3-node cluster)

Use-Case

In this setup, we send data from a Kafka Avro producer to an AWS Aurora MySQL database. Kafka Connect uses the Avro converter to deserialize the records and their schema before the JDBC sink writes them to the database.
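
As a rough illustration of what such a sink might look like, the sketch below builds a connector definition of the kind that is typically submitted to the Kafka Connect REST API. It assumes the Confluent JDBC sink connector and the Confluent Avro converter with a Schema Registry, which this document has not confirmed yet. The helper name build_jdbc_sink_config, the connector name, the topic, the credentials, and every URL are hypothetical placeholders, not values from this setup.

```python
import json


def build_jdbc_sink_config(name, topic, jdbc_url, db_user, db_password,
                           schema_registry_url):
    """Hypothetical helper: return a Kafka Connect connector definition.

    Assumes the Confluent JDBC sink connector and Avro converter are
    installed on the Connect workers.
    """
    return {
        "name": name,
        "config": {
            # Connector class shipped with the Confluent JDBC connector.
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "tasks.max": "1",
            # Topic(s) the Avro producer writes to.
            "topics": topic,
            # Aurora MySQL is reached through a regular MySQL JDBC URL.
            "connection.url": jdbc_url,
            "connection.user": db_user,
            "connection.password": db_password,
            # Let the connector create the target table from the record schema.
            "auto.create": "true",
            "insert.mode": "insert",
            # Avro converter: reads the schema from the Schema Registry.
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "value.converter.schema.registry.url": schema_registry_url,
        },
    }


# All values below are placeholders for illustration only.
payload = build_jdbc_sink_config(
    name="aurora-jdbc-sink",
    topic="example-topic",
    jdbc_url="jdbc:mysql://<aurora-endpoint>:3306/<database>",
    db_user="<user>",
    db_password="<password>",
    schema_registry_url="http://localhost:8081",
)

print(json.dumps(payload, indent=2))
```

The printed JSON could then be posted to a Connect worker's /connectors endpoint. Settings such as insert.mode and auto.create are only one reasonable choice and would need to match the actual table design.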