Deploying kube-prometheus-stack in Practice

Environment

  • Kubernetes cluster
    A deployed, working Kubernetes cluster, version 1.16 or later.
  • Helm
    Helm v3 or later.
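The version requirements above can be checked from a shell before starting. The following is a minimal sketch, not part of the original procedure: the helper name `version_ge` is hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils or a recent BSD `sort`).

```shell
#!/usr/bin/env bash
# Sketch only: compare tool versions against the minimums listed above.

# version_ge A B: succeeds when version A >= version B (a leading "v" is ignored).
version_ge() {
  local a="${1#v}" b="${2#v}"
  # sort -V orders version strings; if B comes first (or both are equal), A >= B.
  [ "$(printf '%s\n%s\n' "$b" "$a" | sort -V | head -n1)" = "$b" ]
}

# Query Helm only when it is installed, so the snippet is safe to paste anywhere.
if command -v helm >/dev/null 2>&1; then
  helm_ver="$(helm version --template '{{.Version}}')"
  if version_ge "$helm_ver" "3.0.0"; then
    echo "helm $helm_ver: OK"
  else
    echo "helm $helm_ver: Helm v3 or later is required"
  fi
fi
```

The same helper can be applied to the server version reported by `kubectl version`, for example `version_ge 1.28.3 1.16.0`.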

Steps

  • Add the prometheus-community chart repo to Helm

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update
  • Check the available chart version

    $ helm search repo prometheus-community/kube-prometheus-stack
    NAME                                         CHART VERSION   APP VERSION   DESCRIPTION
    prometheus-community/kube-prometheus-stack   57.0.2          v0.72.0       kube-prometheus-stack collects Kubernetes manif...
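  • Pull the chart and unpack it locally (without --untar, helm pull only downloads a .tgz archive, and values.yaml cannot be edited in place)

    helm pull prometheus-community/kube-prometheus-stack --untar
    cd kube-prometheus-stack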
  • Edit values.yaml

    • Configure simple NFS-backed persistent storage. The nfs-client StorageClass referenced below must already exist in the cluster.

      $ vim ./values.yaml
      ---
      ## Storage is the definition of how storage will be used by the Alertmanager instances.
      ## (this key is alertmanager.alertmanagerSpec.storage)
      storage:
        volumeClaimTemplate:
          spec:
            storageClassName: nfs-client
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi
      ...
      grafana:
        enabled: true
        namespaceOverride: ""

        defaultDashboardsTimezone: Asia/Shanghai
        adminPassword: prom-operator
        persistence:
          enabled: true
          type: pvc
          storageClassName: nfs-client
          accessModes:
            - ReadWriteOnce
          size: 20Gi
          finalizers:
            - kubernetes.io/pvc-protection
      ...
      # Configure persistent NFS storage for Prometheus
      prometheus:
        prometheusSpec:
          podMonitorSelectorNilUsesHelmValues: false
          serviceMonitorSelectorNilUsesHelmValues: false
          ## Prometheus StorageSpec for persistent data
          storageSpec:
            ## Using PersistentVolumeClaim
            volumeClaimTemplate:
              spec:
                storageClassName: nfs-client
                accessModes: ["ReadWriteOnce"]
                resources:
                  requests:
                    storage: 20Gi
      ...
      ## Storage is the definition of how storage will be used by the ThanosRuler instances.
      ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md
      ## (this key is thanosRuler.thanosRulerSpec.storage)
      ##
      storage:
        volumeClaimTemplate:
          spec:
            storageClassName: nfs-client
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 5Gi
    • Configure the Grafana ingress (the ingress block under the grafana key)

      ingress:
        ## If true, Grafana Ingress will be created
        ##
        enabled: true

        ## IngressClassName for Grafana Ingress.
        ## Should be provided if Ingress is enabled.
        ##
        ingressClassName: nginx

        ## Annotations for Grafana Ingress
        ##
        annotations:
          kubernetes.io/ingress.class: nginx
          # kubernetes.io/tls-acme: "true"

        ## Labels to be added to the Ingress
        ##
        labels: {}

        ## Hostnames.
        ## Must be provided if Ingress is enabled.
        ##
        # hosts:
        #   - grafana.domain.com
        hosts:
          - grafana.dev.XXX.cn
        ## Path for grafana ingress
        path: /
        ## TLS configuration for grafana Ingress
        ## Secret must be manually created in the namespace
        ##
        tls: []
        # - secretName: grafana-general-tls
        #   hosts:
        #   - grafana.example.com
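  • Deploy (or redeploy) the release with helm upgrade, run from the chart directory. The --install flag lets the same command perform the first installation; --create-namespace only takes effect together with it.

    helm upgrade --install prometheus . --namespace monitoring --create-namespace -f values.yaml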
  • Check the deployed components

    $ kubectl get pod -n monitoring
    NAME                                                     READY   STATUS    RESTARTS        AGE
    alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   0               18m
    prometheus-grafana-0                                     3/3     Running   0               10m
    prometheus-kube-prometheus-operator-546f866469-rvssk     1/1     Running   1 (15h ago)     18h
    prometheus-kube-state-metrics-868cc5957b-9lgt5           1/1     Running   1 (15h ago)     18h
    prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   0               18m
    prometheus-prometheus-node-exporter-577kd                1/1     Running   1 (15h ago)     18h
    prometheus-prometheus-node-exporter-f5g8r                1/1     Running   1 (15h ago)     18h
    prometheus-prometheus-node-exporter-gkhmw                1/1     Running   0               18h
    prometheus-prometheus-node-exporter-lql7g                1/1     Running   0               18h
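The edit-and-deploy steps above can be bracketed by two checks: one that renders the chart locally before deploying, and one that confirms the volume claims and pods afterwards. This is a rough sketch rather than part of the original procedure. The function names `preflight` and `postcheck` are hypothetical, the release name and namespace are taken from the commands above, and the script assumes it is run from the unpacked chart directory.

```shell
#!/usr/bin/env bash
# Sketch only: helper functions around the deploy step.
# Assumptions: run from the unpacked chart directory; release "prometheus";
# namespace "monitoring" (both taken from the commands above).

# preflight VALUES_FILE
# Lint and render the chart locally so YAML mistakes surface before deploying.
preflight() {
  local values="$1"
  if [ ! -f "$values" ]; then
    echo "values file not found: $values" >&2
    return 1
  fi
  helm lint . -f "$values" || return 1
  # Rendering happens client-side; nothing is applied to the cluster.
  helm template prometheus . --namespace monitoring -f "$values" >/dev/null
}

# postcheck [NAMESPACE]
# List the PVCs requested in values.yaml, then wait for all pods to be Ready.
postcheck() {
  local ns="${1:-monitoring}"
  kubectl get pvc -n "$ns" || return 1
  kubectl wait pod --all -n "$ns" --for=condition=Ready --timeout=300s
}
```

A typical sequence would be `preflight values.yaml`, then the helm upgrade command, then `postcheck monitoring`. A PVC stuck in Pending usually points to a missing or misnamed StorageClass.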

Usage

  • Open the Grafana ingress address in a browser to view the Grafana dashboards.
  • Prometheus information can be viewed through the Prometheus web UI on port 9090.
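The text above does not say how port 9090 is exposed. One common option, shown here as a sketch under assumptions, is a temporary port-forward. The function name `prom_port_forward` is hypothetical, and the Service name `prometheus-kube-prometheus-prometheus` is inferred from the pod names in the earlier output; confirm it with `kubectl get svc -n monitoring` before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: reach the Prometheus web UI without an ingress.
# Assumption: the Service is named prometheus-kube-prometheus-prometheus,
# inferred from the pod names; verify with `kubectl get svc -n monitoring`.

# prom_port_forward [NAMESPACE] [SERVICE] [LOCAL_PORT]
prom_port_forward() {
  local ns="${1:-monitoring}"
  local svc="${2:-prometheus-kube-prometheus-prometheus}"
  local port="${3:-9090}"
  # Blocks until interrupted; the UI is then served at http://localhost:<port>.
  kubectl port-forward -n "$ns" "svc/$svc" "${port}:9090"
}
```

Running `prom_port_forward` with no arguments forwards local port 9090; stop it with Ctrl-C when finished.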