Kafka & Kubernetes: Scaling Consumers

Kafka and Kubernetes (K8s) are a great match. Kafka has knobs to optimize throughput and Kubernetes scales to multiply that throughput.

On the consumer side, there are a few ways to improve scalability.

  1. Resource & Client Tuning
  2. Horizontal Pod Autoscaling (HPA)
  3. Horizontal Workload Scaling

Let’s jump right in.

Resource & Client Tuning

Kafka consumers usually have a very specific job to perform on each Kafka record. As a result, resource allocation is not typically the bottleneck. If anything, we want to allocate as little as possible so that the HPA (see the next section) can be as effective as possible. With monitoring in place, observe and tune your service so that the CPU and memory it requests are actually being used.

If your application metrics are not already exposed, this is your first step to scaling. You cannot tune your application if you can’t observe it.

Once resource requirements are understood, squeeze out additional throughput by optimizing the consumer client configurations to meet your goals.

For example:

  • Increased throughput: increase the amount of data returned per fetch with fetch.min.bytes so the consumer works on larger batches.
  • Decreased latency: cap batch sizes with fetch.max.bytes so smaller batches are returned and processed more frequently.

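As a concrete (if hypothetical) starting point, here is a minimal Java sketch of a tuned consumer configuration. The bootstrap address, group id, and the specific byte/record values are placeholders to be adjusted against your own metrics, not recommendations.

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class TunedConsumerFactory {

    // All values below are illustrative -- tune them against your own metrics.
    public static KafkaConsumer<String, String> create() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "record-processor");        // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // Throughput-oriented: wait for at least ~64 KB per fetch so the consumer works on larger batches.
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 64 * 1024);

        // Latency-oriented alternative: cap the fetch size so batches return (and are processed) sooner.
        // props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 1024 * 1024);

        // Bound how many records a single poll() hands to your processing loop.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);

        return new KafkaConsumer<>(props);
    }
}
```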
There are great recipes out there for optimizing consumers. Go research the topic and figure out what makes the most sense.

Horizontal Pod Autoscaling (HPA)

Out of the box, K8s scales pods based on pod-level metrics like CPU and Memory. This is great, but not ideal for Kafka consumers. As mentioned above, resources aren’t typically the issue with a consumer. Even as lag increases, the consumer processes records as quickly as it can, which means CPU and Memory stay fairly stable. However, with custom metrics support, applications can scale based on any metric, such as Kafka consumer lag. Lag is about as good a signal as you can get for knowing when to scale out.
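How the lag metric reaches the HPA varies: a dedicated lag exporter, an autoscaler such as KEDA, or a custom-metrics adapter over Prometheus are all common routes. As one hedged sketch, Micrometer can bind the consumer client’s built-in metrics (including record lag) to a Prometheus registry that such an adapter can then serve to the HPA. The wiring below is illustrative, not prescriptive.

```java
import io.micrometer.core.instrument.binder.kafka.KafkaClientMetrics;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerMetrics {

    // Binds the consumer's built-in client metrics (fetch rates, record lag, etc.)
    // to a Prometheus registry. A custom-metrics adapter can then expose the lag
    // metric to the HPA as a custom metric.
    public static PrometheusMeterRegistry bind(KafkaConsumer<?, ?> consumer) {
        PrometheusMeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
        new KafkaClientMetrics(consumer).bindTo(registry);
        // registry.scrape() returns the Prometheus exposition text; serve it on /metrics.
        return registry;
    }
}
```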

If you’re new to Kafka, it’s worth noting that Kafka’s unit of consumer parallelism is the topic partition. When consuming a topic with 10 partitions, the HPA can effectively scale out to at most 10 pods during peak loads; any additional consumers in the group would sit idle.
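If you want to check that ceiling for a given topic, the Kafka AdminClient can report its partition count. The bootstrap address and topic name below are placeholders.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class PartitionCount {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription description = admin
                .describeTopics(Collections.singleton("orders")) // placeholder topic
                .all().get()
                .get("orders");

            // The partition count is the effective upper bound for useful HPA replicas on this topic.
            System.out.println("partitions = " + description.partitions().size());
        }
    }
}
```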

Partition planning, pod tuning, and an effective HPA will cover most of your scaling needs.

Most…

Horizontal Workload Scaling

There are scenarios where simply scaling a single workload to 50 pods might not help.

Here is an example: You’ve built a SaaS-like Kafka consumer that is responsible for a large and changing number of topics. These topics have varying partition counts, record counts, record sizes, and SLA requirements. If these all get wrapped up into the same consumer, the more demanding topics (high record count, large record size, etc.) will claim the majority of processing time. Scaling out wider and wider won’t fix this.

Out-of-the-box HPA scaling lacks the granularity to address this.

Helm Subcharts make it easy to deploy multiple flavors of a single workload. A workload, in this case, is the tuned consumer with HPA configured.

In the Horizontal Workload deployment model, a workload can be dedicated to a large topic while another workload focuses on a set of smaller topics. The workloads do not compete with each other and independently scale to meet the needs of the topic(s) they are responsible for. This also allows for data to be isolated to specific consumers which may be beneficial in certain environments.
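The consumer code itself does not need to change between flavors. As a hedged sketch, reusing the TunedConsumerFactory from the tuning section and assuming a hypothetical TOPICS environment variable, each subchart deploys the same image with a different topic list and its own HPA settings:

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class RecordProcessor {
    public static void main(String[] args) {
        // Each subchart "flavor" deploys this same image with a different TOPICS value,
        // e.g. "big-topic" for the dedicated workload or "small-a,small-b" for a shared one.
        List<String> topics = Arrays.asList(System.getenv("TOPICS").split(","));

        try (KafkaConsumer<String, String> consumer = TunedConsumerFactory.create()) {
            consumer.subscribe(topics);
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // your business logic goes here
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("%s[%d] %s%n", record.topic(), record.partition(), record.value());
    }
}
```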

This per-topic flexibility lets each workload scale efficiently to meet its own needs.

Sample Project

Rather than clutter this blog with sample code, I created a sample repo to illustrate this deployment model. The repo is for demo purposes only.

https://github.com/schroedermatt/helm-subchart-example/tree/master/record-processor

Summary

Kafka boasts scalability. It’s been a cornerstone of the product since Day 1. However, it’s not always clear how we can capitalize on this.

There are a few layers to take into consideration when building Kafka consumers. Use none, one, or all of them.

  1. Resource & Client Tuning – Optimize an application.
  2. Horizontal Pod Autoscaling – Autoscale the optimized application.
  3. Horizontal Workload Scaling – Scale the autoscaled, optimized application.
