Install and configure Fluentd for centralized log collection with multi-format parsing

Intermediate 45 min Apr 01, 2026
Ubuntu 24.04 Ubuntu 22.04 Debian 12 AlmaLinux 9 Rocky Linux 9 Fedora 41

Set up Fluentd with td-agent for centralized log collection with custom parsers and multi-destination routing, then configure TLS, tune performance, and harden the deployment for production.

Prerequisites

  • Root or sudo access
  • At least 2GB RAM
  • Network access to log destinations
  • Basic understanding of system logs

What this solves

Fluentd is a unified logging layer that collects, transforms, and routes log data from multiple sources to various destinations. This tutorial shows you how to install td-agent (the stable Fluentd distribution), configure input plugins for system and application logs, set up custom parsers for different log formats, and optimize performance for production environments.

Step-by-step installation

Update system packages

Start by updating your package manager to ensure you get the latest versions of dependencies.

# Debian and Ubuntu
sudo apt update && sudo apt upgrade -y

# AlmaLinux, Rocky Linux, and Fedora
sudo dnf update -y
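
If you script this step across a mixed fleet, the package manager can be chosen from /etc/os-release. The sketch below is one way to do that; the function name pkg_family is hypothetical, and it only prints the command to run rather than executing it.

```shell
#!/bin/sh
# Print "apt" or "dnf" based on an os-release file (defaults to /etc/os-release).
# pkg_family is an illustrative helper, not part of td-agent.
pkg_family() {
    os_release="${1:-/etc/os-release}"
    # Source the file in a subshell so ID and ID_LIKE do not leak into the caller.
    (
        ID="" ID_LIKE=""
        . "$os_release"
        case " $ID $ID_LIKE " in
            *" debian "*|*" ubuntu "*) echo apt ;;
            *" rhel "*|*" fedora "*|*" centos "*) echo dnf ;;
            *) echo unknown ;;
        esac
    )
}

# Show the update command that matches this host.
case "$(pkg_family)" in
    apt) echo "sudo apt update && sudo apt upgrade -y" ;;
    dnf) echo "sudo dnf update -y" ;;
    *)   echo "unsupported distribution" >&2 ;;
esac
```

AlmaLinux and Rocky Linux report rhel in ID_LIKE, which is why the sketch checks both fields.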

Install required dependencies

Install curl and gnupg for repository setup and SSL certificate verification.

# Debian and Ubuntu
sudo apt install -y curl gnupg lsb-release ca-certificates

# AlmaLinux, Rocky Linux, and Fedora
sudo dnf install -y curl gnupg2 ca-certificates

Add Treasure Data repository

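Run the Treasure Data install script for your distribution. Each script adds the td-agent 4 package repository and installs td-agent. The first script targets Ubuntu 22.04 (jammy); the second targets RHEL-compatible distributions such as AlmaLinux and Rocky Linux. The td-agent 4 line has since been succeeded by fluent-package, so check the Fluentd installation documentation before using these scripts on newer releases such as Ubuntu 24.04 or Debian 12.

# Ubuntu 22.04 (jammy)
curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-jammy-td-agent4.sh | sh

# AlmaLinux and Rocky Linux (RHEL-compatible)
curl -fsSL https://toolbelt.treasuredata.com/sh/install-redhat-td-agent4.sh | sh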

Create td-agent user and directories

Set up proper ownership and permissions for log collection and processing. The td-agent user needs access to system logs and write access to its working directories.

sudo mkdir -p /var/log/td-agent /etc/td-agent/conf.d
sudo chown -R td-agent:td-agent /var/log/td-agent
sudo chmod 755 /var/log/td-agent
sudo usermod -a -G adm td-agent
Note: Adding td-agent to the adm group allows it to read system logs in /var/log without requiring overly broad permissions.
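
To confirm the group change, you can check the account database. The sketch below uses a hypothetical helper named in_group; it reports what the account is configured with, and a td-agent process that is already running keeps its old group list until the service is restarted.

```shell
#!/bin/sh
# Succeeds when user $1 is a member of group $2.
# in_group is an illustrative helper, not a td-agent command.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

if in_group td-agent adm; then
    echo "td-agent is in adm"
else
    echo "td-agent is not in adm (or the user does not exist yet)"
fi
```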

Configure main td-agent configuration

Create the primary configuration file, /etc/td-agent/td-agent.conf, which defines input sources, parsing rules, and output destinations.

# System configuration
<system>
  log_level info
  suppress_repeated_stacktrace true
  emit_error_log_interval 30s
  suppress_config_dump true
  # Leave without_source at its default (false); setting it to true disables every <source> block below.
</system>

# Include additional configurations
@include conf.d/*.conf

# Input: System logs
<source>
  @type tail
  @id syslog_input
  tag system.syslog
  path /var/log/syslog
  pos_file /var/log/td-agent/syslog.log.pos
  <parse>
    @type syslog
    message_format rfc3164
    # Lines written to /var/log/syslog carry no <PRI> prefix, so priority parsing stays off.
    with_priority false
  </parse>
</source>

# Input: Authentication logs
<source>
  @type tail
  @id auth_input
  tag system.auth
  path /var/log/auth.log
  pos_file /var/log/td-agent/auth.log.pos
  <parse>
    @type syslog
    message_format rfc3164
    with_priority false
  </parse>
</source>

# Input: Application logs with JSON format
<source>
  @type tail
  @id app_json_input
  tag app.json
  path /var/log/app/*.json
  pos_file /var/log/td-agent/app.json.pos
  <parse>
    @type json
    time_key timestamp
    time_format %Y-%m-%dT%H:%M:%S.%L%z
  </parse>
</source>

# Input: NGINX access logs
<source>
  @type tail
  @id nginx_access_input
  tag web.nginx.access
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx.access.pos
  <parse>
    # The nginx parser targets the default combined log format and takes no format option.
    @type nginx
  </parse>
</source>

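# Filter: Add hostname and environment tags
# The ** pattern (all events) is an assumption; narrow it if only some tags need these fields.
<filter **>
  @type record_transformer
  <record>
    hostname ${hostname}
    environment production
    log_source fluentd
  </record>
</filter>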

# Output: File for debugging
# The system.** pattern is an assumption; point it at the tags you want to inspect.
<match system.**>
  @type file
  path /var/log/td-agent/debug
  append true
  time_slice_format %Y%m%d
  time_slice_wait 10m
  time_format %Y%m%dT%H%M%S%z
  buffer_type file
  buffer_path /var/log/td-agent/buffer/debug
  flush_interval 30s
</match>

# Output: Forward to centralized log server
<match app.** web.**>
  @type forward
  @id main_forward
  <server>
    name log-server
    host 203.0.113.10
    port 24224
    weight 60
  </server>
  <buffer>
    @type file
    path /var/log/td-agent/buffer/forward
    flush_mode interval
    flush_interval 5s
    flush_thread_count 2
    retry_type exponential_backoff
    retry_wait 1s
    retry_max_interval 60s
    retry_timeout 60m
    overflow_action block
  </buffer>
</match>
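
The retry settings in the forward buffer are easier to reason about with the numbers written out. The sketch below prints the nominal wait before each retry for retry_wait 1s and retry_max_interval 60s, assuming the default backoff base of 2 and ignoring the random jitter Fluentd applies by default. The function name backoff_schedule is hypothetical.

```shell
#!/bin/sh
# Print the first N nominal retry waits (seconds) for exponential backoff.
# Arguments: retry_wait, retry_max_interval, number of retries.
# Assumes a backoff base of 2 and no randomization.
backoff_schedule() {
    wait=$1 max=$2 count=$3 i=0 out=""
    while [ "$i" -lt "$count" ]; do
        # Cap each wait at retry_max_interval.
        if [ "$wait" -gt "$max" ]; then wait=$max; fi
        out="${out:+$out }$wait"
        wait=$((wait * 2))
        i=$((i + 1))
    done
    echo "$out"
}

backoff_schedule 1 60 8   # prints: 1 2 4 8 16 32 60 60
```

Once the wait reaches the 60-second cap, retries continue at that interval until retry_timeout (60m here) expires.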

Create custom parser for application logs

Define a custom parser configuration for application-specific log formats that don't match standard patterns.

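# Custom parser for application logs with specific format
<source>
  @type tail
  @id custom_app_input
  tag app.custom
  path /var/log/myapp/*.log
  pos_file /var/log/td-agent/myapp.log.pos
  <parse>
    @type regexp
    # The expression is truncated here; supply the full pattern with named captures for your format.
    expression /^(?
  </parse>
</source>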
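
In Fluentd's regexp parser, each named capture group in the expression becomes a field in the parsed record. Before wiring a pattern into the configuration, it can help to try it against sample lines. The sketch below does this with sed for a hypothetical line format, "YYYY-MM-DD HH:MM:SS [LEVEL] message"; both the format and the parse_line helper are assumptions for illustration, not the format this tutorial's application uses.

```shell
#!/bin/sh
# Split a line of the assumed form "YYYY-MM-DD HH:MM:SS [LEVEL] message"
# into time, level, and message. Lines that do not match print nothing.
parse_line() {
    printf '%s\n' "$1" | sed -nE \
        's/^([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}) \[([A-Z]+)\] (.*)$/time=\1|level=\2|message=\3/p'
}

parse_line '2026-04-01 12:00:00 [ERROR] disk full'
# prints: time=2026-04-01 12:00:00|level=ERROR|message=disk full
```

sed has no named groups, so the numbered groups stand in for the names a Fluentd expression would carry.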


# Multi-line parser for Java stack traces
<source>
  @type tail
  @id java_app_input
  tag app.java
  path /var/log/java-app/*.log
  pos_file /var/log/td-agent/java-app.log.pos
  multiline_flush_interval 5s
  <parse>
    @type multiline
    format_firstline /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/
    # format1 is truncated here; supply the full pattern with named captures for your format.
    format1 /^(?
  </parse>
</source>
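
The multiline parser starts a new event whenever a line matches format_firstline and appends every other line to the current event. The sketch below mimics that grouping with awk so you can see how a stack trace collapses into one event; group_events is a hypothetical helper, and it joins continuation lines with a literal \n marker purely for display.

```shell
#!/bin/sh
# Group stdin into events: a line starting with "YYYY-MM-DD HH:MM:SS" opens
# a new event, any other line is appended to the current one.
group_events() {
    awk '
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
            if (event != "") print event
            event = $0
            next
        }
        { event = (event == "" ? $0 : event "\\n" $0) }
        END { if (event != "") print event }
    '
}

# Four input lines become two events.
printf '%s\n' \
    '2026-04-01 12:00:00 ERROR request failed' \
    'java.lang.IllegalStateException: boom' \
    '    at com.example.Main.main(Main.java:5)' \
    '2026-04-01 12:00:01 INFO recovered' | group_events
```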

# CSV log parser
<source>
  @type tail
  @id csv_app_input
  tag app.csv
  path /var/log/csv-app/*.csv
  pos_file /var/log/td-agent/csv-app.log.pos
  <parse>
    @type csv
    keys time,level,user_id,action,result,duration
    time_key time
    time_format %Y-%m-%d %H:%M:%S
  </parse>
</source>
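
The keys list maps CSV columns to field names by position. The sketch below shows the same mapping in plain shell for one sample line; the sample values and the csv_to_record helper are made up for illustration, and unlike a real CSV parser it does not handle quoted fields that contain commas.

```shell
#!/bin/sh
# Map one CSV line onto the key names used in the parser configuration.
csv_to_record() {
    printf '%s\n' "$1" | {
        IFS=, read -r time level user_id action result duration
        printf 'time=%s level=%s user_id=%s action=%s result=%s duration=%s\n' \
            "$time" "$level" "$user_id" "$action" "$result" "$duration"
    }
}

csv_to_record '2026-04-01 12:00:00,info,42,login,success,0.153'
```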

Configure multiple output destinations

Set up routing to send different types of logs to appropriate destinations with error handling and buffering.

# Output to Elasticsearch for search and analysis
<match web.**>
  @type elasticsearch
  @id elasticsearch_output
  host 203.0.113.20
  port 9200
  scheme https
  ssl_verify true
  user elastic
  password your-secure-password
  index_name fluentd-logs
  type_name _doc
  logstash_format true
  logstash_prefix fluentd
  logstash_dateformat %Y.%m.%d
  include_tag_key true
  tag_key @log_name
  <buffer>
    @type file
    path /var/log/td-agent/buffer/elasticsearch
    flush_mode interval
    flush_interval 10s
    flush_thread_count 4
    retry_type exponential_backoff
    retry_wait 2s
    retry_max_interval 120s
    retry_timeout 120m
    overflow_action block
  </buffer>
</match>


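# Copy authentication logs to Elasticsearch and write suspicious events to an alert file
# The system.auth pattern is inferred from the auth_input source; adjust it as needed.
<match system.auth>
  @type copy
  <store>
    @type elasticsearch
    @id auth_elasticsearch
    host 203.0.113.20
    port 9200
    scheme https
    ssl_verify true
    user elastic
    password your-secure-password
    index_name security-logs
    type_name _doc
    logstash_format true
    logstash_prefix security
  </store>
  <store>
    # grep is a filter plugin, so the second copy is relabeled and filtered in its own pipeline.
    @type relabel
    @label @SECURITY_ALERTS
  </store>
</match>

<label @SECURITY_ALERTS>
  <filter **>
    @type grep
    <regexp>
      key message
      pattern /Failed password|authentication failure|sudo/
    </regexp>
  </filter>
  <match **>
    @type file
    path /var/log/td-agent/security-alerts
    append true
    time_slice_format %Y%m%d
    time_format %Y%m%dT%H%M%S%z
  </match>
</label>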

# Output application logs to S3 for long-term storage
<match app.**>
  @type s3
  @id s3_output
  aws_key_id YOUR_AWS_ACCESS_KEY
  aws_sec_key YOUR_AWS_SECRET_KEY
  s3_bucket my-log-bucket
  s3_region us-east-1
  path logs/
  s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
  time_slice_format %Y/%m/%d/%H
  store_as gzip_command
  <buffer time>
    @type file
    path /var/log/td-agent/buffer/s3
    timekey 3600
    timekey_wait 10m
    chunk_limit_size 256m
    flush_mode interval
    flush_interval 60s
  </buffer>
</match>
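
To see where objects will land in the bucket, it helps to expand s3_object_key_format by hand. The sketch below renders the key for a given chunk time using the path and time_slice_format from the configuration. The s3_object_key helper is hypothetical, it formats the time in UTC for repeatable output (the plugin may use local time depending on its settings), and it assumes the gz extension for gzip_command.

```shell
#!/bin/sh
# Render %{path}%{time_slice}_%{index}.%{file_extension}
# Arguments: path prefix, chunk time as a Unix epoch, index, file extension.
# Requires a date command that accepts -d @EPOCH (GNU coreutils or BusyBox).
s3_object_key() {
    path=$1 epoch=$2 index=$3 ext=$4
    time_slice=$(date -u -d "@$epoch" +%Y/%m/%d/%H)
    printf '%s%s_%s.%s\n' "$path" "$time_slice" "$index" "$ext"
}

# Key for the first chunk of the hour that contains the given timestamp.
s3_object_key logs/ 1775044800 0 gz
```

Because time_slice_format contains slashes, each hour ends up under its own year/month/day prefix, which keeps listings and lifecycle rules simple.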

Configure SSL and security settings

Set up SSL certificates and secure communication between Fluentd instances and output destinations.

# SSL input for receiving logs from other Fluentd instances
<source>
  @type forward
  @id secure_forward_input
  port 24224
  bind 0.0.0.0
  <transport tls>
    cert_path /etc/td-agent/ssl/fluentd.crt
    private_key_path /etc/td-agent/ssl/fluentd.key
    ca_path /etc/td-agent/ssl/ca.crt
    client_cert_auth true
  </transport>
  <security>
    self_hostname fluentd-server
    shared_key secure-shared-key-change-this
  </security>
</source>


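# Secure forward output with authentication
# secure_forward comes from the separate fluent-plugin-secure-forward gem.
# The secure.** pattern is an assumption; set it to the tags you want sent over this link.
<match secure.**>
  @type secure_forward
  @id secure_forward_output
  self_hostname fluentd-client
  shared_key secure-shared-key-change-this
  secure yes
  enable_strict_verification yes
  ca_cert_path /etc/td-agent/ssl/ca.crt
  <server>
    host 203.0.113.30
    port 24284
  </server>
  <buffer>
    @type file
    path /var/log/td-agent/buffer/secure
    flush_mode interval
    flush_interval 10s
  </buffer>
</match>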

Create SSL certificates directory

Set up the SSL directory with proper permissions for certificate storage. Only the td-agent user should have read access to private keys.

sudo mkdir -p /etc/td-agent/ssl
sudo chown td-agent:td-agent /etc/td-agent/ssl
sudo chmod 700 /etc/td-agent/ssl
Never use chmod 777. Private keys must only be readable by the service user. Use chmod 600 for private keys and 644 for certificates.
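
For a lab setup you may want throwaway certificates to exercise the TLS input before real ones are issued. The sketch below creates a test CA and a server certificate with openssl and applies the 600 and 644 permissions described above. It is an assumption-laden example: make_test_certs is a hypothetical helper, the subject names are placeholders, it writes to a temporary directory instead of /etc/td-agent/ssl, and the certificates carry no subjectAltName, so clients that verify hostnames will reject them.

```shell
#!/bin/sh
# Create a throwaway CA plus a server certificate signed by it in directory $1.
make_test_certs() {
    dir=$1
    command -v openssl >/dev/null 2>&1 || { echo "openssl not found" >&2; return 1; }

    # 1. CA key and self-signed CA certificate.
    openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
        -subj "/CN=example-test-ca" \
        -keyout "$dir/ca.key" -out "$dir/ca.crt" 2>/dev/null || return 1

    # 2. Server key and certificate signing request.
    openssl req -new -newkey rsa:2048 -nodes \
        -subj "/CN=fluentd-server" \
        -keyout "$dir/fluentd.key" -out "$dir/fluentd.csr" 2>/dev/null || return 1

    # 3. Sign the server certificate with the CA.
    openssl x509 -req -in "$dir/fluentd.csr" -days 30 \
        -CA "$dir/ca.crt" -CAkey "$dir/ca.key" -CAcreateserial \
        -out "$dir/fluentd.crt" 2>/dev/null || return 1

    # Private keys readable by the owner only; certificates world-readable.
    chmod 600 "$dir"/*.key && chmod 644 "$dir"/*.crt
}

ssl_dir=$(mktemp -d)   # stand-in for /etc/td-agent/ssl
make_test_certs "$ssl_dir" && openssl verify -CAfile "$ssl_dir/ca.crt" "$ssl_dir/fluentd.crt"
```

When copying real certificates into /etc/td-agent/ssl, remember to chown them to td-agent as well, since the directory is mode 700.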

Configure performance optimization

Optimize buffer settings, memory usage, and worker processes for high-throughput log processing.

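# Performance tuning for high-volume logs
<system>
  # in_tail does not support multiple workers; wrap tail sources in <worker 0> ... </worker> when workers is above 1.
  workers 4
  root_dir /var/log/td-agent
  suppress_repeated_stacktrace true
  emit_error_log_interval 30s
  suppress_config_dump true
  log_event_verbose false
</system>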


# Buffer optimization for high throughput
# The highvolume.** pattern is an assumption; set it to the tags that carry your heaviest traffic.
<match highvolume.**>
  @type forward
  @id high_volume_output
  <server>
    name primary-server
    host 203.0.113.40
    port 24224
  </server>
  <buffer>
    @type file
    path /var/log/td-agent/buffer/high-volume
    flush_mode immediate
    flush_thread_count 8
    chunk_limit_size 64m
    chunk_limit_records 100000
    total_limit_size 2g
    overflow_action drop_oldest_chunk
    retry_type exponential_backoff
    retry_wait 1s
    retry_max_interval 60s
    retry_timeout 60m
    compress gzip
  </buffer>
</match>
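
The relationship between chunk_limit_size and total_limit_size decides how many full chunks can queue on disk before overflow_action takes effect. The sketch below does the arithmetic for the values above; to_bytes and max_chunks are hypothetical helpers, and the k, m, and g suffixes are treated as powers of 1024.

```shell
#!/bin/sh
# Convert a size such as 64m or 2g to bytes (suffixes are powers of 1024).
to_bytes() {
    num=${1%[kmgKMG]}
    case "$1" in
        *[kK]) echo $((num * 1024)) ;;
        *[mM]) echo $((num * 1024 * 1024)) ;;
        *[gG]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;
    esac
}

# Number of full chunks that fit: total_limit_size / chunk_limit_size.
max_chunks() {
    echo $(( $(to_bytes "$1") / $(to_bytes "$2") ))
}

max_chunks 2g 64m   # prints: 32
```

With only 32 full chunks of headroom and drop_oldest_chunk as the overflow action, a long outage at the destination means data loss, so size total_limit_size against the longest outage you want to ride out.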

Set up log rotation and maintenance

Configure logrotate to manage Fluentd's own log file and prevent disk space issues. Use a single stanza for td-agent.log: logrotate reports a duplicate log entry when the same file is matched by two stanzas, so an additional /var/log/td-agent/*.log block would collide with this one.

/var/log/td-agent/td-agent.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    copytruncate
    su td-agent td-agent
}

Enable and start td-agent service

Start the td-agent service and enable it to start automatically on system boot.

sudo systemctl enable td-agent
sudo systemctl start td-agent
sudo systemctl status td-agent

Configure firewall rules

Open necessary ports for log collection while maintaining security. Only allow connections from trusted sources.

# Debian and Ubuntu (ufw)
sudo ufw allow from 203.0.113.0/24 to any port 24224
sudo ufw allow from 203.0.113.0/24 to any port 24284
sudo ufw reload

# AlmaLinux, Rocky Linux, and Fedora (firewalld)
sudo firewall-cmd --add-rich-rule='rule family="ipv4" source address="203.0.113.0/24" port protocol="tcp" port="24224" accept' --permanent
sudo firewall-cmd --add-rich-rule='rule family="ipv4" source address="203.0.113.0/24" port protocol="tcp" port="24284" accept' --permanent
sudo firewall-cmd --reload

Verify your setup

Test your Fluentd installation and configuration with these verification commands.

# Check td-agent service status
sudo systemctl status td-agent

# Verify configuration syntax
sudo td-agent --dry-run

# Check td-agent version and plugins
td-agent --version
td-agent-gem list

# Test log processing: --dry-run only validates the configuration, so append a JSON line to a file tailed by app_json_input instead
echo "{\"timestamp\":\"$(date +%Y-%m-%dT%H:%M:%S.%3N%z)\",\"message\":\"test log entry\",\"level\":\"info\"}" | sudo tee -a /var/log/app/test.json

# Monitor real-time logs
sudo tail -f /var/log/td-agent/td-agent.log

# Check buffer status
sudo ls -la /var/log/td-agent/buffer/

# Test network connectivity to outputs
sudo netstat -tlnp | grep :24224
telnet 203.0.113.10 24224
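
On hosts without telnet installed, a TCP check can be done with bash's built-in /dev/tcp redirection. The sketch below wraps it in a hypothetical port_open helper with a timeout; it assumes bash and the coreutils timeout command are available, and it only proves that the port accepts a TCP connection, not that the TLS handshake or shared-key authentication will succeed.

```shell
#!/bin/sh
# Succeeds when a TCP connection to host $1, port $2 opens within $3 seconds (default 3).
port_open() {
    host=$1 port=$2 secs=${3:-3}
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

if port_open 203.0.113.10 24224; then
    echo "log server reachable on 24224"
else
    echo "log server not reachable on 24224"
fi
```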

Performance tuning and monitoring

Monitor Fluentd performance and adjust settings based on your log volume and system resources.

Install monitoring plugins

Add the Prometheus plugin to export Fluentd's internal metrics. The monitor_agent and debug_agent inputs used below ship with Fluentd core and need no separate gem.

sudo td-agent-gem install fluent-plugin-prometheus

Configure monitoring endpoints

Set up HTTP endpoints to monitor Fluentd's internal status and metrics.

# Monitor agent for internal metrics
<source>
  @type monitor_agent
  @id monitor_agent_input
  bind 127.0.0.1
  port 24220
  tag fluentd.monitor.metrics
</source>

# Debug agent for configuration inspection
<source>
  @type debug_agent
  @id debug_agent_input
  bind 127.0.0.1
  port 24230
</source>

# Prometheus metrics endpoint
<source>
  @type prometheus
  @id prometheus_metrics
  bind 127.0.0.1
  port 24231
  metrics_path /metrics
</source>

# Internal metrics collection
<source>
  @type prometheus_monitor
  @id prometheus_monitor
  interval 30s
</source>
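
Once monitor_agent is listening, its JSON endpoint can feed a quick health check. The sketch below sums buffer_queue_length across all plugins from the JSON on standard input. The total_queue_length helper is hypothetical, the sample document is a trimmed, made-up stand-in for a real response, and the grep-based parsing is deliberately crude; a JSON-aware tool is a better fit for anything long-lived.

```shell
#!/bin/sh
# Sum every buffer_queue_length value found in the JSON read from stdin.
total_queue_length() {
    grep -oE '"buffer_queue_length"[[:space:]]*:[[:space:]]*[0-9]+' \
        | grep -oE '[0-9]+$' \
        | awk '{ sum += $1 } END { print sum + 0 }'
}

# Live use (needs a running td-agent with the monitor_agent source above):
#   curl -s http://127.0.0.1:24220/api/plugins.json | total_queue_length

# Offline demonstration with a made-up sample response.
printf '%s' '{"plugins":[{"plugin_id":"main_forward","buffer_queue_length":3},{"plugin_id":"elasticsearch_output","buffer_queue_length":4}]}' \
    | total_queue_length   # prints: 7
```

A total that keeps growing usually means an output destination is slow or unreachable.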

Common issues

Symptom | Cause | Fix
Permission denied reading logs | td-agent user lacks read access | sudo usermod -a -G adm td-agent
Buffer files filling disk | Output destination unreachable | Check network connectivity and increase buffer limits
High memory usage | Large buffer chunk sizes | Reduce chunk_limit_size and increase flush frequency
SSL handshake failures | Certificate mismatch or expired | Verify certificates with openssl x509 -text -noout -in cert.crt
Config syntax errors | Invalid directive syntax or plugin parameters | Run sudo td-agent --dry-run to validate
Logs not being forwarded | Firewall blocking connections | Check firewall rules and test with telnet host port
Service fails to start | Port already in use | Check with sudo netstat -tlnp | grep :24224
