# Centralized Log Aggregation: ELK, Graylog, Vector, and Syslog-ng
When something breaks in your homelab, the answer is almost always in the logs. But the logs are scattered across a dozen machines and fifty containers — Proxmox logs on one server, Docker container logs on another, firewall logs on your router, application logs in various directories. By the time you find the relevant log file, SSH into the right machine, and grep through thousands of lines, the problem has either resolved itself or you've lost the context of what you were looking for.
Centralized log aggregation solves this by shipping all logs to one place where you can search, filter, and correlate them. A query like "show me everything that happened across all services between 2:15 AM and 2:20 AM" becomes trivial instead of impossible.

This guide covers the major options beyond Loki (which we've covered separately) — each with different trade-offs in resource usage, features, and complexity.
## The Options at a Glance
| Solution | Architecture | Storage | RAM Minimum | Best For |
|---|---|---|---|---|
| Loki + Grafana | Log labels, not indexes | Object store / filesystem | ~256 MB | Already using Grafana, cost-conscious |
| ELK Stack | Full-text indexing | Elasticsearch | ~4 GB | Full-text search, complex queries |
| Graylog | Full-text indexing | OpenSearch/Elasticsearch | ~3 GB | GELF input, stream routing, alerts |
| Vector + ClickHouse | Pipeline + columnar DB | ClickHouse | ~1 GB | High-volume, structured data |
| syslog-ng + flat files | Traditional syslog | Filesystem | ~50 MB | Lightweight, compliance, archival |
## ELK Stack (Elasticsearch, Logstash, Kibana)
The ELK stack is the industry standard for log aggregation. Elasticsearch indexes your logs for fast full-text search, Logstash processes and transforms log data, and Kibana provides the web UI for searching and visualizing.
### When to Choose ELK
- You need powerful full-text search across millions of log entries
- You want the richest visualization and dashboard capabilities
- You have 8+ GB of RAM to spare
- You're learning for career purposes (ELK is used widely in production)
### Deployment
This deployment uses Filebeat on each client for lightweight log collection, with a single Logstash instance on the log server to receive syslog and Beats traffic and parse it before it reaches Elasticsearch:
```yaml
# docker-compose.yml
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
    restart: unless-stopped
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=true
      - ELASTIC_PASSWORD=change-this-password
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    volumes:
      - es-data:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    ulimits:
      memlock:
        soft: -1
        hard: -1

  kibana:
    image: docker.elastic.co/kibana/kibana:8.12.0
    restart: unless-stopped
    ports:
      - "5601:5601"
    environment:
      ELASTICSEARCH_HOSTS: '["http://elasticsearch:9200"]'
      ELASTICSEARCH_USERNAME: kibana_system
      ELASTICSEARCH_PASSWORD: kibana-password
    depends_on:
      - elasticsearch

  logstash:
    image: docker.elastic.co/logstash/logstash:8.12.0
    restart: unless-stopped
    ports:
      - "5044:5044"       # Beats input
      - "5514:5514"       # Syslog input (TCP)
      - "5514:5514/udp"   # Syslog input (UDP)
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline
    environment:
      LS_JAVA_OPTS: "-Xms512m -Xmx512m"
    depends_on:
      - elasticsearch

volumes:
  es-data:
```
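Before pointing any shippers at the stack, confirm Elasticsearch answers (the password here is the `ELASTIC_PASSWORD` placeholder from the compose file above):

```bash
docker compose up -d

# "green" or "yellow" status means the single-node cluster is ready
curl -u elastic:change-this-password "http://localhost:9200/_cluster/health?pretty"
```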
### Logstash Pipeline
```conf
# logstash/pipeline/homelab.conf
input {
  # Accept syslog from network devices and servers
  syslog {
    port => 5514
    type => "syslog"
  }
  # Accept Filebeat input
  beats {
    port => 5044
  }
}

filter {
  # Parse syslog messages
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:source_host} %{DATA:program}(?:\[%{POSINT:pid}\])?: %{GREEDYDATA:log_message}" }
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
  # Parse Docker JSON logs
  if [fields][source] == "docker" {
    json {
      source => "message"
    }
  }
  # Add geographic data for firewall logs (optional)
  if [source_ip] {
    geoip {
      source => "source_ip"
    }
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    user => "elastic"
    password => "change-this-password"
    index => "homelab-%{+YYYY.MM.dd}"
  }
}
```
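You can exercise the syslog input from any Linux client with util-linux `logger` before enrolling real hosts (a quick test; `logserver.homelab.lan` stands in for your log server's address):

```bash
# Send a test message over UDP to the Logstash syslog input
logger --server logserver.homelab.lan --port 5514 --udp "test message from $(hostname)"
```

If the grok filter is working, the message should appear in Kibana with `program` and `log_message` fields populated.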
### Filebeat on Client Machines
Install Filebeat on each server to ship logs to Logstash:
```yaml
# /etc/filebeat/filebeat.yml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/syslog
      - /var/log/auth.log
    fields:
      source: system
  - type: container
    paths:
      - /var/lib/docker/containers/*/*.log
    fields:
      source: docker

output.logstash:
  hosts: ["logserver.homelab.lan:5044"]
```
```bash
# Install Filebeat
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.12.0-amd64.deb
sudo dpkg -i filebeat-8.12.0-amd64.deb
sudo systemctl enable --now filebeat
```
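Filebeat can check its own configuration and connectivity before you rely on it:

```bash
# Verify the config file parses cleanly
sudo filebeat test config

# Confirm Filebeat can reach the Logstash endpoint in output.logstash
sudo filebeat test output
```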
### Index Lifecycle Management
Elasticsearch indexes grow without bound unless you configure retention:
```bash
# Create an ILM policy via Kibana Dev Tools or the API
curl -X PUT "localhost:9200/_ilm/policy/homelab-policy" \
  -H 'Content-Type: application/json' \
  -u elastic:change-this-password \
  -d '{
    "policy": {
      "phases": {
        "hot": {
          "actions": {
            "rollover": {
              "max_size": "5gb",
              "max_age": "7d"
            }
          }
        },
        "delete": {
          "min_age": "30d",
          "actions": {
            "delete": {}
          }
        }
      }
    }
  }'
```
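A policy does nothing on its own until indices reference it. One way to wire it up (a sketch, assuming the `homelab-*` naming from the Logstash output above) is an index template that applies the policy to each new daily index. Note that the `rollover` action additionally requires a write alias or data stream, so with plain daily indices you may prefer a delete-only policy:

```bash
# Apply the ILM policy to all new homelab-* indices
curl -X PUT "localhost:9200/_index_template/homelab-template" \
  -H 'Content-Type: application/json' \
  -u elastic:change-this-password \
  -d '{
    "index_patterns": ["homelab-*"],
    "template": {
      "settings": {
        "index.lifecycle.name": "homelab-policy"
      }
    }
  }'
```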
## Graylog
Graylog is a purpose-built log management platform. Unlike ELK (which is a general-purpose search engine repurposed for logs), Graylog was designed from the ground up for log collection, processing, and alerting.
### Why Graylog Over ELK
- Native GELF (Graylog Extended Log Format) input — many applications and Docker support GELF natively
- Stream-based routing: send firewall logs to one stream, Docker logs to another, with different retention policies
- Built-in alerting without additional components
- Simpler to operate for pure log management (ELK has more general-purpose complexity)
### Deployment
```yaml
# docker-compose.yml
services:
  mongodb:
    image: mongo:6
    restart: unless-stopped
    volumes:
      - mongo-data:/data/db

  opensearch:
    image: opensearchproject/opensearch:2
    restart: unless-stopped
    environment:
      - discovery.type=single-node
      - "OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g"
      - DISABLE_SECURITY_PLUGIN=true
    volumes:
      - os-data:/usr/share/opensearch/data
    ulimits:
      memlock:
        soft: -1
        hard: -1

  graylog:
    image: graylog/graylog:5.2
    restart: unless-stopped
    depends_on:
      - mongodb
      - opensearch
    ports:
      - "9000:9000"        # Web UI
      - "1514:1514"        # Syslog TCP
      - "1514:1514/udp"    # Syslog UDP
      - "12201:12201"      # GELF TCP
      - "12201:12201/udp"  # GELF UDP
    environment:
      GRAYLOG_PASSWORD_SECRET: "a-long-random-string-at-least-16-chars"
      GRAYLOG_ROOT_PASSWORD_SHA2: "your-sha256-hashed-password"
      GRAYLOG_HTTP_EXTERNAL_URI: "http://10.0.0.50:9000/"
      GRAYLOG_ELASTICSEARCH_HOSTS: "http://opensearch:9200"
      GRAYLOG_MONGODB_URI: "mongodb://mongodb:27017/graylog"
    volumes:
      - graylog-data:/usr/share/graylog/data

volumes:
  mongo-data:
  os-data:
  graylog-data:
```
Generate the password hash:
```bash
echo -n "your-admin-password" | sha256sum | cut -d' ' -f1
```
### Docker GELF Logging Driver
Docker can send container logs directly to Graylog via GELF:
```bash
# Per container
docker run --log-driver=gelf \
  --log-opt gelf-address=udp://10.0.0.50:12201 \
  --log-opt tag="{{.Name}}" \
  nginx
```

Or set it as the default for all containers in `/etc/docker/daemon.json`:

```json
{
  "log-driver": "gelf",
  "log-opts": {
    "gelf-address": "udp://10.0.0.50:12201",
    "tag": "{{.Name}}"
  }
}
```
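The `daemon.json` change only affects containers created after a Docker restart. To confirm the input works end to end, you can also hand-craft a GELF message (a quick sketch; Graylog's GELF UDP input accepts uncompressed JSON, and `10.0.0.50` is the Graylog host assumed above):

```bash
# Apply the new default logging driver (existing containers keep their old one)
sudo systemctl restart docker

# Fire a minimal GELF message at the input; it should appear in Graylog's search
echo -n '{"version":"1.1","host":"testhost","short_message":"hello from netcat"}' \
  | nc -u -w1 10.0.0.50 12201
```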
### Stream Configuration
In the Graylog web UI, create streams to organize logs:
- Infrastructure: Match on `source` containing your server hostnames
- Docker: Match on `facility=docker` or tag containing container names
- Firewall: Match on `application_name=filterlog` (pfSense) or `source` = your firewall
- Security: Match on `facility=auth` or `message` containing "Failed password"
Each stream can have its own retention period, alert rules, and access permissions.
## Vector: The Modern Log Pipeline
Vector (by Datadog, open source) is a high-performance log and metrics pipeline written in Rust. It replaces Logstash, Filebeat, Fluentd, and similar tools with a single binary that can collect, transform, and route observability data.
### Why Vector
- Extremely resource-efficient (10x less memory than Logstash for equivalent workloads)
- Single binary, no JVM
- Can replace both the collector (Filebeat) and the processor (Logstash)
- Supports dozens of sources and sinks
- VRL (Vector Remap Language) for powerful log transformation
### Deployment as an Agent
```yaml
# docker-compose.yml
services:
  vector:
    image: timberio/vector:latest-alpine
    restart: unless-stopped
    ports:
      - "8686:8686"       # API
      - "5514:5514/udp"   # Syslog (the source below listens in UDP mode)
    volumes:
      - ./vector.yaml:/etc/vector/vector.yaml:ro
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro  # required by the docker_logs source
```
### Vector Configuration
```yaml
# vector.yaml
sources:
  # Collect syslog from the host
  host_syslog:
    type: file
    include:
      - /var/log/syslog
      - /var/log/auth.log

  # Collect Docker container logs (all running containers by default)
  docker_logs:
    type: docker_logs

  # Accept syslog from network devices
  network_syslog:
    type: syslog
    address: "0.0.0.0:5514"
    mode: udp

transforms:
  # Parse and enrich logs
  parsed_logs:
    type: remap
    inputs:
      - host_syslog
      - docker_logs
      - network_syslog
    source: |
      # Add a homelab source tag
      .homelab_source = "homelab"

      # Parse severity from syslog
      if exists(.severity) {
        .level = to_string!(.severity)
      }

      # Extract container name from Docker logs
      if exists(.container_name) {
        .service = replace!(.container_name, "/", "")
      }

      # Detect and tag error messages
      if match(to_string!(.message), r'(?i)(error|fail|critical|panic)') {
        .is_error = true
      }

  # Filter out noisy health check logs
  filtered:
    type: filter
    inputs:
      - parsed_logs
    condition:
      type: vrl
      source: '!match(to_string!(.message), r"GET /health")'

sinks:
  # Send to Elasticsearch/OpenSearch
  elasticsearch:
    type: elasticsearch
    inputs:
      - filtered
    endpoints:
      - "http://elasticsearch:9200"
    bulk:
      index: "homelab-%Y-%m-%d"
    auth:
      strategy: basic
      user: elastic
      password: change-this-password

  # Also write to local files as backup
  file_backup:
    type: file
    inputs:
      - filtered
    path: "/var/log/vector/homelab-%Y-%m-%d.log"
    encoding:
      codec: json
```
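Before restarting anything, Vector can check the configuration itself (`vector validate` is part of the standard CLI):

```bash
# Validate the config from inside the running container
docker compose exec vector vector validate /etc/vector/vector.yaml
```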
### Vector as a Replacement for the Full ELK Stack
For simpler setups, Vector can write directly to ClickHouse (columnar database, efficient for log queries) instead of Elasticsearch, significantly reducing resource usage:
```yaml
sinks:
  clickhouse:
    type: clickhouse
    inputs:
      - filtered
    endpoint: "http://clickhouse:8123"
    database: logs
    table: homelab_logs
    auth:
      strategy: basic
      user: default
      password: ""
```
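Unlike Elasticsearch, ClickHouse does not create the table for you; the sink inserts into an existing one. Here is a minimal sketch of a matching schema, created over ClickHouse's HTTP interface (the column set is hypothetical; adjust it to the fields your transforms actually emit):

```bash
# Create the database and a simple MergeTree table the sink can insert into
curl "http://clickhouse:8123" --data-binary "CREATE DATABASE IF NOT EXISTS logs"
curl "http://clickhouse:8123" --data-binary "
  CREATE TABLE IF NOT EXISTS logs.homelab_logs (
    timestamp DateTime,
    host      String,
    service   String,
    message   String,
    is_error  UInt8
  ) ENGINE = MergeTree
  ORDER BY timestamp
"
```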
## syslog-ng: Traditional and Lightweight
For pure syslog collection — network devices, servers, firewalls — syslog-ng is battle-tested and uses minimal resources.
### Configuration
```conf
# /etc/syslog-ng/syslog-ng.conf
@version: 4.4

# network() accepts traditional BSD-format syslog, which is what most
# routers, firewalls, and appliances send on port 514 (the syslog()
# driver expects the newer RFC 5424 format instead)
source s_network {
    network(
        ip("0.0.0.0")
        port(514)
        transport("udp")
    );
    network(
        ip("0.0.0.0")
        port(514)
        transport("tcp")
    );
};

# Separate logs by source host
destination d_hosts {
    file("/var/log/remote/${HOST}/${YEAR}-${MONTH}-${DAY}.log"
        create-dirs(yes)
    );
};

# Firewall logs to a separate directory
filter f_firewall {
    host("10.0.0.1") or program("filterlog");
};

destination d_firewall {
    file("/var/log/remote/firewall/${YEAR}-${MONTH}-${DAY}.log"
        create-dirs(yes)
    );
};

log {
    source(s_network);
    filter(f_firewall);
    destination(d_firewall);
    flags(final);
};

log {
    source(s_network);
    destination(d_hosts);
};
```
syslog-ng writes structured log files that you can search with grep, awk, or feed into a log viewer like lnav. It's not as fancy as Elasticsearch, but it uses ~50 MB of RAM and never loses logs because a Java process ran out of heap space.
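Verifying and searching need nothing beyond standard tools (`syslog-ng -s` is the built-in syntax check; `lnav` is a separate install):

```bash
# Syntax-check the configuration before restarting the service
sudo syslog-ng -s

# Search every collected host's logs with plain grep
grep -r "Failed password" /var/log/remote/

# Or browse interactively; lnav understands syslog timestamps and formats
lnav /var/log/remote/
```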
## Choosing Your Stack
**You have 4+ GB RAM to spare and want the best search experience:** ELK or Graylog. ELK has a larger ecosystem; Graylog is more focused on log management.

**You want modern and efficient:** Vector as the pipeline, with ClickHouse or Elasticsearch as the backend. Best performance per resource dollar.

**You just need syslog from network devices:** syslog-ng or rsyslog. Add lnav for a nice terminal-based log viewer.

**You're already running Grafana and Prometheus:** Stick with Loki (covered in our separate guide). It integrates seamlessly and uses the least resources for label-based log queries.
Whatever you choose, the goal is the same: when something goes wrong at 2 AM, you open one interface, type a query, and find the answer instead of SSH-ing into six machines and grepping through log files. That capability alone is worth the setup effort.