TheDeveloperBlog.com


Kafka Topic Replication


Apache Kafka is a distributed system, so it needs to keep copies of the data it stores. Each broker holds part of the cluster's data, and if a broker or its machine fails, data held only on that broker becomes unavailable and may be lost. To guard against this, Kafka replicates data across brokers. Replication is configured per topic through the replication factor, which is the number of copies of each partition kept on different brokers. A replication factor of 1 means there is no extra copy, so the value should normally be greater than 1 (typically 2 or 3). With more than one replica, clients can still reach the data through another broker when one fails.
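The arithmetic behind the replication factor can be sketched in a few lines of Python. This is an illustrative model, not Kafka code: the function name `tolerated_broker_failures` is hypothetical, and it assumes every replica is fully up to date when the failures happen.

```python
def tolerated_broker_failures(replication_factor: int, broker_count: int) -> int:
    """Return how many brokers can fail while one copy of a partition survives.

    Hypothetical helper for illustration; assumes all replicas are in sync.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if replication_factor > broker_count:
        # Each replica of a partition lives on a different broker,
        # so there cannot be more replicas than brokers.
        raise ValueError("replication factor cannot exceed the number of brokers")
    # With N copies, N - 1 of them can be lost and one copy still remains.
    return replication_factor - 1


for rf in (1, 2, 3):
    print(f"replication factor {rf}: survives {tolerated_broker_failures(rf, 3)} broker failure(s)")
```

With a replication factor of 1 the result is 0, which is why a single copy offers no protection against a broker failure.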

For example, suppose we have a cluster of three brokers: Broker 1, Broker 2, and Broker 3. A topic named Topic-X is split into Partition 0 and Partition 1 with a replication factor of 2.

[Figure: Kafka Topic Replication]

Here, Partition 0 of Topic-X has replicas on Broker 1 and Broker 2, and Partition 1 of Topic-X has replicas on Broker 2 and Broker 3.
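The layout in this example can be reproduced with a simple round-robin placement. The sketch below is a simplified model: `assign_replicas` is a hypothetical function, and Kafka's real assignment logic is more involved (it does not always start at the first broker), so treat the output as an illustration of the idea only.

```python
def assign_replicas(brokers, num_partitions, replication_factor):
    """Place each partition's replicas on consecutive brokers, round-robin.

    Simplified, hypothetical model of replica placement; not Kafka's
    actual assignment algorithm.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    layout = {}
    for partition in range(num_partitions):
        # Replica r of partition p goes to the broker at index (p + r),
        # wrapping around the broker list.
        layout[partition] = [
            brokers[(partition + r) % len(brokers)]
            for r in range(replication_factor)
        ]
    return layout


layout = assign_replicas(brokers=[1, 2, 3], num_partitions=2, replication_factor=2)
for partition, replicas in layout.items():
    print(f"Topic-X Partition {partition}: replicas on brokers {replicas}")
# Topic-X Partition 0: replicas on brokers [1, 2]
# Topic-X Partition 1: replicas on brokers [2, 3]
```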

With several copies of the same partition in the cluster, it must be clear which broker serves a client request. Kafka resolves this as follows:

  • For each partition, one replica is chosen as the leader, and the remaining replicas become its followers.
  • The followers replicate the leader's data. By default, followers do not serve client requests while a leader is available. The replicas that are fully caught up with the leader (the leader included) are known as in-sync replicas (ISR), so a partition can have multiple ISRs.

Therefore, only the leader serves client requests by default, handling all read and write operations for its partition. The leader of each partition is elected by the cluster controller, which in ZooKeeper-based clusters relies on ZooKeeper (discussed later).

If the broker hosting a partition's leader fails, one of the partition's in-sync replicas takes over as leader. When the failed broker comes back, it rejoins as a follower and catches up; leadership can then be moved back to it through preferred leader election.

Let's look at an example of a leader and its followers.

Suppose a cluster has three brokers: 1, 2, and 3. Topic x has two partitions and a replication factor of 2.

[Figure: Kafka Topic Replication]

Partition 0 on Broker 1 is given the leadership, so it is the leader, and Partition 0 on Broker 2 is its follower, an in-sync replica (ISR). Similarly, Partition 1 on Broker 2 is the leader and Partition 1 on Broker 3 is its ISR. If Broker 1 fails, the Partition 0 replica on Broker 2 becomes the leader.
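The failover in this example can be traced with a small in-memory model. Everything here is hypothetical and simplified: the `Partition` class and the three functions are illustrative names, a recovered broker is assumed to catch up instantly, and moving leadership back is modeled as an explicit step rather than something the cluster schedules itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Partition:
    replicas: List[int]                 # broker ids; the first one is the preferred leader
    leader: Optional[int] = None
    isr: List[int] = field(default_factory=list)

    def __post_init__(self):
        self.leader = self.replicas[0]  # start with the preferred leader
        self.isr = list(self.replicas)  # assume all replicas start in sync


def broker_failed(partitions: Dict[int, Partition], broker_id: int) -> None:
    """Drop the broker from every ISR and elect a new leader where needed."""
    for p in partitions.values():
        if broker_id in p.isr:
            p.isr.remove(broker_id)
        if p.leader == broker_id:
            # A remaining in-sync replica takes over; None means no leader is available.
            p.leader = p.isr[0] if p.isr else None


def broker_recovered(partitions: Dict[int, Partition], broker_id: int) -> None:
    """The broker rejoins as a follower (assumed to catch up immediately)."""
    for p in partitions.values():
        if broker_id in p.replicas and broker_id not in p.isr:
            p.isr.append(broker_id)


def elect_preferred_leaders(partitions: Dict[int, Partition]) -> None:
    """Hand leadership back to the preferred replica if it is in sync."""
    for p in partitions.values():
        if p.replicas[0] in p.isr:
            p.leader = p.replicas[0]


topic_x = {0: Partition([1, 2]), 1: Partition([2, 3])}

broker_failed(topic_x, 1)
print(topic_x[0].leader, topic_x[0].isr)   # 2 [2]

broker_recovered(topic_x, 1)
print(topic_x[0].leader, topic_x[0].isr)   # 2 [2, 1]

elect_preferred_leaders(topic_x)
print(topic_x[0].leader, topic_x[0].isr)   # 1 [2, 1]
```

Partition 1 is unaffected throughout, because neither its leader (Broker 2) nor its follower (Broker 3) is on the failed broker.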

