Sunday, March 27, 2022

Kafka

Apache Kafka decouples your data streams from your systems.
Source systems send their data to Kafka.
Target systems read their data from Kafka.
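
A minimal sketch of this decoupling, using a toy in-memory log rather than a real Kafka client. The names ToyLog, source_system and target_system are made up for illustration and are not part of any Kafka API. The point is only that the source and the target never call each other; both talk to the log in the middle.

```python
class ToyLog:
    """In-memory stand-in for a Kafka topic: an append-only list of records."""

    def __init__(self):
        self.records = []

    def append(self, record):
        # Store the record and return its position (offset) in the log.
        self.records.append(record)
        return len(self.records) - 1

    def read_from(self, offset):
        # Return every record from the given offset onwards.
        return self.records[offset:]


def source_system(log):
    # The source only knows about the log, not about who reads it.
    for event in ["page_view", "add_to_cart", "checkout"]:
        log.append(event)


def target_system(log, offset=0):
    # The target only knows about the log, not about who wrote to it.
    return log.read_from(offset)


log = ToyLog()
source_system(log)
received = target_system(log)
print(received)
```

Because both sides depend only on the log, either one can be replaced or taken offline without changing the other.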

What can we have in Kafka?
Any data stream we can think of,
e.g. website events,
user interactions, pricing data, financial projections.
Once your data is in Kafka,
you can put it into any system you like:
database
analytics
email system
audit
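
The same idea with several target systems can be sketched as below. This is again a toy in-memory model, not Kafka itself: ToyTopic, publish and poll are hypothetical names chosen for the example. Each target keeps its own read position, so the database, analytics and audit systems all see the full stream without affecting one another.

```python
class ToyTopic:
    """Toy topic that remembers, per consumer, how far that consumer has read."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # consumer name -> next offset to read

    def publish(self, record):
        self.records.append(record)

    def poll(self, consumer):
        # Return the records this consumer has not seen yet,
        # then move its offset to the end of the log.
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:]
        self.offsets[consumer] = len(self.records)
        return batch


topic = ToyTopic()
topic.publish({"user": "u1", "event": "click"})
topic.publish({"user": "u2", "event": "purchase"})

database_batch = topic.poll("database")    # sees both events
analytics_batch = topic.poll("analytics")  # sees the same two events

topic.publish({"user": "u1", "event": "logout"})

audit_batch = topic.poll("audit")          # first poll, so it sees all three
database_next = topic.poll("database")     # only the new event
```

Reading does not remove anything from the log, which is why a new target system can be added later and still read the data from the start.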

Why Apache Kafka?
Created by LinkedIn, now open source
Distributed, resilient architecture, fault tolerant
Horizontal scalability
 can scale to 100s of brokers
 can scale to millions of messages per second
High performance (latency of less than 10 ms) - real time
Used by 2000+ firms, including 35% of the Fortune 500
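
One way to picture horizontal scalability: Kafka splits a stream into partitions that are spread across brokers, and messages with the same key go to the same partition. The sketch below is a simplified illustration of that idea only. The functions pick_partition and distribute are hypothetical, and zlib.crc32 is used here just as a stable hash; it is not the hash Kafka's own partitioner uses.

```python
import zlib


def pick_partition(key, num_partitions):
    # Stable hash of the key, mapped onto the available partitions.
    # (Illustrative only: Kafka clients use a different hash function.)
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def distribute(messages, num_partitions):
    # Place each (key, value) message into the partition chosen by its key.
    partitions = [[] for _ in range(num_partitions)]
    for key, value in messages:
        partitions[pick_partition(key, num_partitions)].append((key, value))
    return partitions


messages = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "click"),
    ("user-3", "login"),
    ("user-1", "logout"),
]
partitions = distribute(messages, 3)

for number, partition in enumerate(partitions):
    print(number, partition)
```

All messages for one key stay together and in order inside a single partition, while different keys can be handled in parallel by different brokers. Adding partitions and brokers is what lets throughput grow.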

Use cases
Messaging system
Activity tracking
Gathering metrics from many different locations
Application log gathering
Stream processing (with the Kafka Streams API or Spark, for example)
Decoupling of system dependencies
Integration with Spark, Flink, Storm, Hadoop, and many other big data technologies

Examples
Netflix uses Kafka to apply recommendations in real time while you're watching TV shows.

Uber uses Kafka to gather user, taxi and trip data in real time to compute and forecast demand,
and to compute surge pricing in real time.

LinkedIn uses Kafka to prevent spam and to collect user interactions to make better
connection recommendations in real time.

Note: in all of these examples, Kafka is only used as a transportation mechanism.

