Install and configure Apache Kafka with cluster setup and monitoring

Intermediate · 45 min · Apr 01, 2026
Ubuntu 24.04 Ubuntu 22.04 Debian 12 AlmaLinux 9 Rocky Linux 9 Fedora 41

Set up a production-ready Apache Kafka cluster with SSL security, ZooKeeper ensemble, and comprehensive monitoring using JMX and Prometheus for high-throughput message streaming.

Prerequisites

  • At least 3 servers with 4GB RAM each
  • Root or sudo access
  • Network connectivity between cluster nodes
  • Basic understanding of distributed systems

What this solves

Apache Kafka is a distributed streaming platform that handles high-throughput, fault-tolerant message streaming for modern applications. This tutorial sets up a production-ready Kafka cluster with multiple brokers, ZooKeeper ensemble coordination, SSL/SASL security, and monitoring integration for enterprise workloads.

Step-by-step installation

Update system packages and install Java

Kafka requires Java 11 or higher to run. Start by updating your system and installing OpenJDK.

# Debian/Ubuntu
sudo apt update && sudo apt upgrade -y
sudo apt install -y openjdk-17-jdk wget curl net-tools

# RHEL-based (AlmaLinux, Rocky Linux, Fedora)
sudo dnf update -y
sudo dnf install -y java-17-openjdk-devel wget curl net-tools

Verify Java installation and set JAVA_HOME:

java -version
# Path shown is for Debian/Ubuntu; on RHEL-based systems use /usr/lib/jvm/java-17-openjdk
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64' >> ~/.bashrc

Create Kafka user and directory structure

Create a dedicated user for Kafka services and set up the directory structure with proper permissions.

sudo useradd -r -m -s /bin/bash kafka
sudo mkdir -p /opt/kafka /var/log/kafka /var/lib/kafka/data /var/lib/zookeeper
sudo chown -R kafka:kafka /opt/kafka /var/log/kafka /var/lib/kafka /var/lib/zookeeper
Note: never use chmod 777. We use dedicated ownership (kafka:kafka) and will set minimal permissions (755 for directories, 644 for files) to maintain security.

Download and install Kafka

Download the Kafka binary distribution (2.8.2 built for Scala 2.13 is used here; check the Kafka downloads page for the current release) and extract it to the installation directory.

cd /tmp
# Superseded releases move from downloads.apache.org to archive.apache.org;
# see https://kafka.apache.org/downloads for current links
wget https://downloads.apache.org/kafka/2.8.2/kafka_2.13-2.8.2.tgz
tar -xzf kafka_2.13-2.8.2.tgz
sudo mv kafka_2.13-2.8.2/* /opt/kafka/
sudo chown -R kafka:kafka /opt/kafka
sudo chmod -R 755 /opt/kafka
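Before extracting, it is good practice to verify the archive against its published SHA512 checksum (Apache publishes a .sha512 file next to each release). The mechanics, demonstrated on a throwaway local file so the sketch is self-contained:

```shell
# Mechanics of checksum verification, shown on a throwaway file; for the real
# archive, fetch the matching .sha512 file from the Apache release page
echo 'demo' > /tmp/demo.tgz
sha512sum /tmp/demo.tgz > /tmp/demo.tgz.sha512
sha512sum -c /tmp/demo.tgz.sha512   # reports OK when the file is intact
```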

Configure ZooKeeper ensemble

Set up ZooKeeper configuration for a three-node ensemble; a three-member quorum tolerates one node failure and prevents split-brain scenarios. Edit /opt/kafka/config/zookeeper.properties:

dataDir=/var/lib/zookeeper
clientPort=2181
maxClientCnxns=0
tickTime=2000
initLimit=10
syncLimit=5
server.1=kafka-node1.example.com:2888:3888
server.2=kafka-node2.example.com:2888:3888
server.3=kafka-node3.example.com:2888:3888
4lw.commands.whitelist=*
autopurge.snapRetainCount=3
autopurge.purgeInterval=24

Create myid file for each ZooKeeper node (use 1 for first node, 2 for second, 3 for third):

echo "1" | sudo tee /var/lib/zookeeper/myid
sudo chown kafka:kafka /var/lib/zookeeper/myid
sudo chmod 644 /var/lib/zookeeper/myid
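Since each node needs a different myid, the value can be derived from the hostname instead of being typed per node. A sketch, assuming the kafka-nodeN.example.com naming used in this tutorial:

```shell
# Sketch: derive each node's myid from its hostname, assuming the
# kafka-nodeN.example.com naming used above. On a real node, replace
# the hardcoded example with "$(hostname -f)".
host=kafka-node2.example.com
node_id=$(echo "${host%%.*}" | grep -o '[0-9]*$')
echo "$node_id"   # 2 for kafka-node2
# then write it out: echo "$node_id" | sudo tee /var/lib/zookeeper/myid
```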

Configure Kafka broker settings

Configure the primary Kafka server properties (/opt/kafka/config/server.properties) for clustering, performance, and reliability.

broker.id=1
listeners=PLAINTEXT://0.0.0.0:9092,SSL://0.0.0.0:9093
advertised.listeners=PLAINTEXT://kafka-node1.example.com:9092,SSL://kafka-node1.example.com:9093
log.dirs=/var/lib/kafka/data
num.network.threads=8
num.io.threads=16
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
num.partitions=3
num.recovery.threads.per.data.dir=2
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=2
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=kafka-node1.example.com:2181,kafka-node2.example.com:2181,kafka-node3.example.com:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
min.insync.replicas=2
default.replication.factor=3
auto.create.topics.enable=false
delete.topic.enable=true
Note: Change broker.id to 2 and 3 for the other nodes, and update the advertised.listeners hostname accordingly.
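Since the three broker configs differ only in broker.id and the advertised hostname, they can be stamped out from one template rather than edited by hand on each node. A sketch with hypothetical filenames:

```shell
# Sketch: generate per-node server.properties from a single template
# (template and output filenames are hypothetical)
cat > /tmp/server.properties.template <<'EOF'
broker.id=1
advertised.listeners=PLAINTEXT://kafka-node1.example.com:9092,SSL://kafka-node1.example.com:9093
EOF
for i in 1 2 3; do
  sed -e "s/^broker\.id=.*/broker.id=$i/" \
      -e "s/kafka-node1\.example\.com/kafka-node$i.example.com/g" \
      /tmp/server.properties.template > "/tmp/server.properties.node$i"
done
grep '^broker.id' /tmp/server.properties.node3   # broker.id=3
```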

Configure SSL security

Set up SSL certificates for encrypted client-broker and inter-broker communication.

sudo mkdir -p /opt/kafka/ssl
cd /opt/kafka/ssl
sudo keytool -keystore kafka.server.keystore.jks -alias kafka-server -validity 3650 -genkey -keyalg RSA -storepass changeme -keypass changeme -dname "CN=kafka-node1.example.com,OU=Engineering,O=Example,L=City,S=State,C=US"

Create a certificate authority, sign the broker certificate with it, and build the truststore. Without the signing and re-import steps, the keystore would hold only a self-signed certificate that clients trusting the CA would reject:

sudo openssl req -new -x509 -keyout ca-key -out ca-cert -days 3650 -passout pass:changeme -subj "/CN=Kafka-CA"
sudo keytool -keystore kafka.server.keystore.jks -alias kafka-server -certreq -file cert-req -storepass changeme -keypass changeme
sudo openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-req -out cert-signed -days 3650 -CAcreateserial -passin pass:changeme
sudo keytool -keystore kafka.server.keystore.jks -alias CARoot -import -file ca-cert -storepass changeme -noprompt
sudo keytool -keystore kafka.server.keystore.jks -alias kafka-server -import -file cert-signed -storepass changeme -keypass changeme -noprompt
sudo keytool -keystore kafka.server.truststore.jks -alias CARoot -import -file ca-cert -storepass changeme -noprompt

Add SSL configuration to server.properties:

ssl.keystore.location=/opt/kafka/ssl/kafka.server.keystore.jks
ssl.keystore.password=changeme
ssl.key.password=changeme
ssl.truststore.location=/opt/kafka/ssl/kafka.server.truststore.jks
ssl.truststore.password=changeme
ssl.client.auth=required
# An empty value disables hostname verification; acceptable only with an internal CA
ssl.endpoint.identification.algorithm=
security.inter.broker.protocol=SSL

Restrict keystore file permissions:

sudo chown -R kafka:kafka /opt/kafka/ssl
sudo chmod 640 /opt/kafka/ssl/*.jks

Create systemd service files

Set up systemd services for ZooKeeper and Kafka to enable automatic startup and proper service management.

Create /etc/systemd/system/zookeeper.service:

[Unit]
Description=Apache ZooKeeper
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=forking
User=kafka
Group=kafka
Environment=JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh -daemon /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=zookeeper

[Install]
WantedBy=multi-user.target

Create /etc/systemd/system/kafka.service:

[Unit]
Description=Apache Kafka
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=forking
User=kafka
Group=kafka
Environment=JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
Environment=JMX_PORT=9999
ExecStart=/opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=kafka

[Install]
WantedBy=multi-user.target

Configure firewall rules

Open the necessary ports for Kafka cluster communication and client access. In production, restrict the PLAINTEXT and JMX ports to your cluster subnet rather than opening them to all sources.

# Ubuntu/Debian (ufw)
sudo ufw allow 2181/tcp comment 'ZooKeeper client'
sudo ufw allow 2888/tcp comment 'ZooKeeper peer'
sudo ufw allow 3888/tcp comment 'ZooKeeper leader election'
sudo ufw allow 9092/tcp comment 'Kafka PLAINTEXT'
sudo ufw allow 9093/tcp comment 'Kafka SSL'
sudo ufw allow 9999/tcp comment 'Kafka JMX'

# RHEL-based (firewalld)
sudo firewall-cmd --permanent --add-port=2181/tcp --add-port=2888/tcp --add-port=3888/tcp
sudo firewall-cmd --permanent --add-port=9092/tcp --add-port=9093/tcp --add-port=9999/tcp
sudo firewall-cmd --reload

Start and enable services

Start ZooKeeper first, then Kafka, and enable both services for automatic startup.

sudo systemctl daemon-reload
sudo systemctl enable zookeeper kafka
sudo systemctl start zookeeper
sleep 10
sudo systemctl start kafka
sudo systemctl status zookeeper kafka

Configure JMX monitoring

Enable JMX metrics collection for monitoring Kafka performance and health metrics. Create /opt/kafka/config/jmx-exporter.yml with mapping rules for the Prometheus JMX exporter. The first matching pattern wins, so the more specific clientId rule must precede the generic kafka.server rule:

rules:
  - pattern: kafka.server<type=(.+), name=(.+), clientId=(.+)><>Value
    name: kafka_server_$1_$2
    labels:
      clientId: "$3"
  - pattern: kafka.server<type=(.+), name=(.+)><>Value
    name: kafka_server_$1_$2
  - pattern: kafka.network<type=(.+), name=(.+)><>Value
    name: kafka_network_$1_$2
  - pattern: kafka.log<type=(.+), name=(.+)><>Value
    name: kafka_log_$1_$2

Download JMX Prometheus exporter:

cd /opt/kafka
sudo wget https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.19.0/jmx_prometheus_javaagent-0.19.0.jar
sudo chown kafka:kafka jmx_prometheus_javaagent-0.19.0.jar

Enable Prometheus metrics export

Configure Kafka to export metrics in Prometheus format for monitoring integration.

# Export alongside the existing options in /opt/kafka/bin/kafka-server-start.sh
# (kafka-run-class.sh honors a preset KAFKA_JVM_PERFORMANCE_OPTS)
export KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Djava.awt.headless=true -javaagent:/opt/kafka/jmx_prometheus_javaagent-0.19.0.jar=8080:/opt/kafka/config/jmx-exporter.yml"

Restart Kafka to apply JMX configuration:

sudo systemctl restart kafka
sudo systemctl status kafka
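On the Prometheus side, a scrape job pointing at the exporter's port 8080 on each broker might look like the following (the job name and target list are illustrative; adjust to your Prometheus setup):

```yaml
scrape_configs:
  - job_name: kafka
    scrape_interval: 15s
    static_configs:
      - targets:
          - kafka-node1.example.com:8080
          - kafka-node2.example.com:8080
          - kafka-node3.example.com:8080
```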

Performance tuning configuration

Apply production-ready performance tuning for high-throughput scenarios by appending the following to /opt/kafka/config/server.properties on each broker, then restarting it. The thread counts below supersede the values set earlier; keep only one line per key.

# Network and I/O tuning
num.network.threads=16
num.io.threads=32
queued.max.requests=16000
fetch.purgatory.purge.interval.requests=1000
producer.purgatory.purge.interval.requests=1000

# Log and memory settings
log.flush.interval.messages=10000
log.flush.interval.ms=1000
replica.socket.receive.buffer.bytes=65536

# Compression and batching
compression.type=snappy
log.cleanup.policy=delete
log.cleaner.enable=true

Configure JVM heap settings:

# Modify KAFKA_HEAP_OPTS in /opt/kafka/bin/kafka-server-start.sh (around line 20)
export KAFKA_HEAP_OPTS="-Xmx4G -Xms4G"
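The 4G figure above is a starting point, not a rule. A sketch of a common sizing heuristic, which gives the broker roughly a quarter of RAM capped around 6 GB and leaves the rest for the OS page cache (which Kafka leans on heavily for log segment reads); the 25% and 6 GB numbers are conventional guidance, not hard limits:

```shell
# Sketch: heap = ~25% of RAM, capped at 6 GB (common guidance, not a hard rule)
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
heap_mb=$(( total_kb / 1024 / 4 ))
if [ "$heap_mb" -gt 6144 ]; then heap_mb=6144; fi
echo "export KAFKA_HEAP_OPTS=\"-Xmx${heap_mb}M -Xms${heap_mb}M\""
```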

Verify your setup

Test the Kafka cluster functionality and verify all components are working correctly.

# Check service status
sudo systemctl status zookeeper kafka

# Verify ZooKeeper ensemble state (uses the 4lw commands whitelisted earlier)
echo stat | nc localhost 2181

# Test topic creation
/opt/kafka/bin/kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 3

# List topics
/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092

# Check JMX metrics endpoint
curl -s http://localhost:8080/metrics | head -20

# Test SSL connectivity
/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9093 --command-config /opt/kafka/config/ssl-client.properties

Create /opt/kafka/config/ssl-client.properties, the client configuration referenced by --command-config above:

security.protocol=SSL
ssl.truststore.location=/opt/kafka/ssl/kafka.server.truststore.jks
ssl.truststore.password=changeme
ssl.keystore.location=/opt/kafka/ssl/kafka.server.keystore.jks
ssl.keystore.password=changeme
ssl.key.password=changeme

Common issues

  • ZooKeeper won't start. Cause: port 2181 is already in use. Fix: locate the conflicting process with sudo netstat -tulpn | grep 2181 and stop it.
  • Kafka broker fails to start. Cause: insufficient heap memory. Fix: increase KAFKA_HEAP_OPTS in the startup script.
  • SSL handshake failures. Cause: certificate hostname mismatch. Fix: reissue the certificate with a CN matching the server hostname.
  • Topic replication fails. Cause: not enough brokers available. Fix: ensure all 3 brokers are running and connected.
  • High consumer lag. Cause: insufficient partitions. Fix: increase the partition count, e.g. kafka-topics.sh --alter --topic <name> --partitions 6 --bootstrap-server localhost:9092.
  • JMX metrics not available. Cause: JMX port conflict. Fix: change JMX_PORT in the service file to an unused port.
