EFK 日志栈
EFK(Elasticsearch + Fluentd + Kibana)是 Kubernetes 上经典的日志收集与分析方案:Fluentd 以 DaemonSet 方式在每个节点采集容器日志,写入 Elasticsearch 进行存储和索引,再通过 Kibana 查询和可视化。
架构
┌─────────────────────────────────────┐
│ Application Pods │
│ (stdout/stderr logs) │
└────────────┬────────────────────────┘
│
┌────────────▼────────────────────────┐
│ Fluentd DaemonSet │
│ (每个节点运行,收集容器日志) │
└────────────┬────────────────────────┘
│
┌────────────▼────────────────────────┐
│ Elasticsearch Cluster │
│ (存储、索引、搜索日志) │
└────────────┬────────────────────────┘
│
┌────────────▼────────────────────────┐
│ Kibana │
│ (可视化查询和分析) │
└─────────────────────────────────────┘
部署 Elasticsearch
创建 Namespace
kubectl create namespace logging
Elasticsearch StatefulSet
apiVersion: v1
kind: Service
metadata:
name: elasticsearch
namespace: logging
labels:
app: elasticsearch
spec:
clusterIP: None # Headless Service
ports:
- port: 9200
name: http
- port: 9300
name: transport
selector:
app: elasticsearch
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: elasticsearch
namespace: logging
spec:
serviceName: elasticsearch
replicas: 3
selector:
matchLabels:
app: elasticsearch
template:
metadata:
labels:
app: elasticsearch
spec:
# 初始化容器:调整系统参数
initContainers:
- name: increase-vm-max-map
image: busybox
command: ["sysctl", "-w", "vm.max_map_count=262144"]
securityContext:
privileged: true
- name: increase-fd-ulimit
image: busybox
command: ["sh", "-c", "ulimit -n 65536"]
securityContext:
privileged: true
containers:
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: transport
env:
# 集群配置
- name: cluster.name
value: "k8s-logs"
- name: node.name
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: discovery.seed_hosts
value: "elasticsearch-0.elasticsearch,elasticsearch-1.elasticsearch,elasticsearch-2.elasticsearch"
- name: cluster.initial_master_nodes
value: "elasticsearch-0,elasticsearch-1,elasticsearch-2"
# JVM 设置
- name: ES_JAVA_OPTS
value: "-Xms2g -Xmx2g"
# 禁用安全功能(生产环境需要启用)
- name: xpack.security.enabled
value: "false"
- name: xpack.security.enrollment.enabled
value: "false"
volumeMounts:
- name: data
mountPath: /usr/share/elasticsearch/data
resources:
requests:
memory: 2Gi
cpu: 1000m
limits:
memory: 4Gi
cpu: 2000m
# 就绪探针
readinessProbe:
httpGet:
path: /_cluster/health
port: 9200
initialDelaySeconds: 30
periodSeconds: 10
# 存活探针
livenessProbe:
tcpSocket:
port: 9200
initialDelaySeconds: 60
periodSeconds: 10
# 持久化存储
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: "fast-ssd"
resources:
requests:
storage: 100Gi
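将上面的 Service 和 StatefulSet 清单保存后应用(下面的文件名只是示例):
# 假设清单保存为 elasticsearch.yaml
kubectl apply -f elasticsearch.yaml
# StatefulSet 按序创建,等待三个 Pod 依次就绪
kubectl rollout status statefulset/elasticsearch -n logging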
验证 Elasticsearch
# 查看 Pod 状态
kubectl get pods -n logging -l app=elasticsearch
# 端口转发
kubectl port-forward -n logging svc/elasticsearch 9200:9200
# 测试连接(URL 带 ? 参数,加引号避免被 shell 展开)
curl "http://localhost:9200/_cluster/health?pretty"
# 查看节点
curl "http://localhost:9200/_cat/nodes?v"
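三个节点就绪后,集群健康状态应为 green,节点数为 3,可以沿用上面的端口转发快速确认:
curl -s "http://localhost:9200/_cluster/health?pretty" | grep -E '"status"|"number_of_nodes"'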
部署 Fluentd
Fluentd ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: fluentd-config
namespace: logging
data:
fluent.conf: |
# 输入:收集容器日志
<source>
@type tail
@id in_tail_container_logs
path /var/log/containers/*.log
pos_file /var/log/fluentd-containers.log.pos
tag kubernetes.*
read_from_head true
<parse>
@type json
time_format %Y-%m-%dT%H:%M:%S.%NZ
time_key time
keep_time_key true
</parse>
</source>
# 过滤:添加 Kubernetes 元数据
<filter kubernetes.**>
@type kubernetes_metadata
@id filter_kube_metadata
kubernetes_url "#{ENV['FLUENT_FILTER_KUBERNETES_URL'] || 'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
verify_ssl "#{ENV['KUBERNETES_VERIFY_SSL'] || true}"
ca_file "#{ENV['KUBERNETES_CA_FILE']}"
skip_labels "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_LABELS'] || 'false'}"
skip_container_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_CONTAINER_METADATA'] || 'false'}"
skip_master_url "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_MASTER_URL'] || 'false'}"
skip_namespace_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_NAMESPACE_METADATA'] || 'false'}"
</filter>
# 过滤:排除系统日志
<filter kubernetes.**>
@type grep
<exclude>
key $.kubernetes.namespace_name
pattern ^(kube-system|kube-public|kube-node-lease)$
</exclude>
</filter>
# 输出:发送到 Elasticsearch
<match kubernetes.**>
@type elasticsearch
@id out_es
@log_level info
include_tag_key true
host elasticsearch.logging.svc.cluster.local
port 9200
logstash_format true
logstash_prefix k8s
logstash_dateformat %Y.%m.%d
include_timestamp false
# ES 8.x 已移除文档 type,这里关闭 type 写入
suppress_type_name true
<buffer>
@type file
path /var/log/fluentd-buffers/kubernetes.system.buffer
flush_mode interval
retry_type exponential_backoff
flush_thread_count 2
flush_interval 5s
retry_forever false
retry_max_interval 30
chunk_limit_size 2M
queue_limit_length 8
overflow_action block
</buffer>
</match>
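注意:上面的 <parse> 假定容器运行时是 Docker(json-file 日志驱动)。如果集群使用 containerd 或 CRI-O,容器日志是 CRI 文本格式,需要把 <source> 里的 <parse> 换成类似下面的写法(仅为示意,正则和时间格式按实际日志调整):
<parse>
  @type regexp
  expression /^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.*)$/
  time_format %Y-%m-%dT%H:%M:%S.%NZ
</parse>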
Fluentd DaemonSet
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluentd
namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: fluentd
rules:
- apiGroups: [""]
resources:
- pods
- namespaces
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: fluentd
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: fluentd
subjects:
- kind: ServiceAccount
name: fluentd
namespace: logging
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
namespace: logging
labels:
app: fluentd
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
spec:
serviceAccountName: fluentd
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch.logging.svc.cluster.local"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
- name: FLUENT_ELASTICSEARCH_SCHEME
value: "http"
- name: FLUENTD_SYSTEMD_CONF
value: "disable"
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: config
mountPath: /fluentd/etc/fluent.conf
subPath: fluent.conf
resources:
requests:
cpu: 100m
memory: 200Mi
limits:
cpu: 500m
memory: 500Mi
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: config
configMap:
name: fluentd-config
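两点说明:/var/lib/docker/containers 挂载只在 Docker 运行时下需要,containerd 集群的日志位于 /var/log/pods,已被 /var/log 挂载覆盖;此外,较新版本集群(1.24+)控制平面节点的污点是 node-role.kubernetes.io/control-plane,若要在这些节点上采集日志,需再加一条对应的 toleration。应用并验证 DaemonSet(文件名只是示例):
# 假设清单保存为 fluentd.yaml
kubectl apply -f fluentd.yaml
# DaemonSet 会在每个节点各运行一个 Pod
kubectl get pods -n logging -l app=fluentd -o wide
# 查看采集端日志,确认能连上 Elasticsearch
kubectl logs -n logging daemonset/fluentd --tail=50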
部署 Kibana
apiVersion: apps/v1
kind: Deployment
metadata:
name: kibana
namespace: logging
spec:
replicas: 1
selector:
matchLabels:
app: kibana
template:
metadata:
labels:
app: kibana
spec:
containers:
- name: kibana
image: docker.elastic.co/kibana/kibana:8.11.0
ports:
- containerPort: 5601
name: http
env:
- name: ELASTICSEARCH_HOSTS
value: "http://elasticsearch:9200"
- name: SERVER_NAME
value: "kibana"
- name: SERVER_HOST
value: "0.0.0.0"
resources:
requests:
cpu: 500m
memory: 1Gi
limits:
cpu: 1000m
memory: 2Gi
readinessProbe:
httpGet:
path: /api/status
port: 5601
initialDelaySeconds: 60
periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
name: kibana
namespace: logging
spec:
type: NodePort
ports:
- port: 5601
targetPort: 5601
nodePort: 30561
selector:
app: kibana
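应用清单后即可访问 Kibana(文件名只是示例):
# 假设清单保存为 kibana.yaml
kubectl apply -f kibana.yaml
kubectl rollout status deployment/kibana -n logging
# 通过 NodePort 访问 http://<node-ip>:30561,或本地端口转发
kubectl port-forward -n logging svc/kibana 5601:5601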
Kibana 配置
创建数据视图(Data View)
# 访问 Kibana: http://<node-ip>:30561
# 1. 进入 Management → Stack Management → Data Views(Kibana 8.x 中 Index Patterns 已改名为 Data Views)
# 2. 点击 Create data view
# 3. 输入索引模式: k8s-*
# 4. 选择时间字段: @timestamp
# 5. 保存
日志查询
在 Kibana Discover 中可以用 KQL 语法过滤日志,例如:
# 查询特定 namespace
kubernetes.namespace_name: "production"
# 查询特定 Pod(KQL 中通配符不要放在引号内,否则会按字面匹配)
kubernetes.pod_name: myapp-*
# 查询错误日志(要求应用输出的 JSON 日志中包含 level 字段)
level: "error"
# 组合查询
kubernetes.namespace_name: "production" AND level: "error"
# 排除查询
NOT kubernetes.namespace_name: "kube-system"
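如果 Discover 中看不到数据,也可以直接用 Elasticsearch API 确认日志是否写入(沿用前面 9200 的端口转发,索引前缀 k8s 来自 Fluentd 的 logstash_prefix 配置):
# 查看 Fluentd 创建的按天索引
curl "http://localhost:9200/_cat/indices/k8s-*?v"
# 按 namespace 检索几条日志
curl -s -H 'Content-Type: application/json' \
  "http://localhost:9200/k8s-*/_search?size=5&pretty" \
  -d '{"query": {"match": {"kubernetes.namespace_name": "production"}}}'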
索引生命周期管理
# 创建 ILM 策略
PUT _ilm/policy/k8s-logs-policy
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_age": "1d",
"max_size": "50gb"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "7d",
"actions": {
"set_priority": {
"priority": 50
}
}
},
"delete": {
"min_age": "30d",
"actions": {
"delete": {}
}
}
}
}
}
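策略创建后不会自动生效,还需要通过索引模板把它绑定到 k8s-* 索引。另外,由于 Fluentd 的 logstash_format 已经按天建索引,hot 阶段的 rollover 并非必需(使用 rollover 还需额外配置 rollover_alias)。下面是一个最简的绑定模板示意(模板名仅为示例):
PUT _index_template/k8s-logs-template
{
  "index_patterns": ["k8s-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "k8s-logs-policy"
    }
  }
}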
小结
本节介绍了 EFK 日志栈:
✅ Elasticsearch:3节点集群,持久化存储
✅ Fluentd:DaemonSet 收集所有节点日志
✅ Kibana:可视化查询和分析
✅ ILM:索引生命周期管理
下一节:Loki 和分布式追踪。