Prometheus on NKS Guide
📌 QA tested in region: KR (Korea)
https://github.com/sysnet4admin
Installing Helm v3.10.3
1. Check that the helm binary is installed (if Helm is not installed yet, install it first)
root@k8s-console:~# helm version
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
version.BuildInfo{Version:"v3.10.3", GitCommit:"835b7334cfe2e5e27870ab3ed4135f136eecc704", GitTreeState:"clean", GoVersion:"go1.18.9"}
❗If you do not want to see the insecure warnings...
root@k8s-console:~# chmod 700 ~/.kube/config
root@k8s-console:~# helm version --short
v3.10.3+g835b733
Preparation before deploying Prometheus with Helm
1. Add the Helm repo needed to install Prometheus
root@k8s-console:~# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
2. Update the repo to fetch the latest charts
root@k8s-console:~# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈Happy Helming!⎈
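(Optional) Before installing, you can check which chart versions the repo now serves; helm search can list them:
$ helm search repo prometheus-community/prometheus --versions | head -5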
3. Check the preconfigured storage classes
root@k8s-console:~# kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
nks-block-storage (default) blk.csi.ncloud.com Delete WaitForFirstConsumer true 17d
nks-nas-csi nas.csi.ncloud.com Delete WaitForFirstConsumer true 17d
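(Optional) To inspect what the default class provisions (provisioner parameters, binding mode, and so on) before anything claims a volume from it:
$ kubectl describe storageclass nks-block-storage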
Deploying Prometheus
1. Deploy Prometheus to NKS with Helm
root@k8s-console:~# helm install prometheus \
prometheus-community/prometheus \
--set server.service.type="LoadBalancer" \
--namespace=monitoring \
--create-namespace
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
NAME: prometheus
LAST DEPLOYED: Sat Dec 17 17:03:41 2022
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
NOTES:
The Prometheus server can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-server.monitoring.svc.cluster.local
Get the Prometheus server URL by running these commands in the same shell:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get svc --namespace monitoring -w prometheus-server'
export SERVICE_IP=$(kubectl get svc --namespace monitoring prometheus-server -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$SERVICE_IP:80
The Prometheus alertmanager can be accessed via port on the following DNS name from within your cluster:
prometheus-%!s(<nil>).monitoring.svc.cluster.local
Get the Alertmanager URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 9093
#################################################################################
######   WARNING: Pod Security Policy has been disabled by default since    #####
######            it deprecated after k8s 1.25+. use                        #####
######            (index .Values "prometheus-node-exporter" "rbac"          #####
######            . "pspEnabled") with (index .Values                       #####
######            "prometheus-node-exporter" "rbac" "pspAnnotations")       #####
######            in case you still need it.                                #####
#################################################################################
The Prometheus PushGateway can be accessed via port 9091 on the following DNS name from within your cluster:
prometheus-prometheus-pushgateway.monitoring.svc.cluster.local
Get the PushGateway URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus-pushgateway,component=pushgateway" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 9091
For more information on running Prometheus, visit:
https://prometheus.io/
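❗Note: the chart NOTES assume the LoadBalancer publishes an IP, but as the service output in step 2 below shows, an NKS LoadBalancer publishes a DNS name. If the .ip jsonpath above comes back empty, a variant that reads the hostname instead:
$ export SERVICE_HOST=$(kubectl get svc --namespace monitoring prometheus-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
$ echo http://$SERVICE_HOST:80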
❗If you want to use a storage class other than nks-block-storage, refer to the following (replace nks-block-storage in the example with the class you want):
helm install prometheus prometheus-community/prometheus \
--set alertmanager.persistentVolume.storageClass="nks-block-storage" \
--set server.persistentVolume.storageClass="nks-block-storage" \
--set server.service.type="LoadBalancer" \
--namespace=monitoring \
--create-namespace
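For repeatable installs, the same overrides can be kept in a small values file instead of repeated --set flags. A sketch mirroring the flags above (the file name prom-values.yaml is illustrative):
# prom-values.yaml
alertmanager:
  persistentVolume:
    storageClass: nks-block-storage
server:
  persistentVolume:
    storageClass: nks-block-storage
  service:
    type: LoadBalancer
$ helm install prometheus prometheus-community/prometheus -f prom-values.yaml --namespace=monitoring --create-namespace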
2. Check the deployed pods and services
root@k8s-console:~# kubectl get po,svc -n monitoring
NAME READY STATUS RESTARTS AGE
pod/prometheus-alertmanager-0 1/1 Running 0 3m37s
pod/prometheus-kube-state-metrics-7cdcf7cc98-rsgcr 1/1 Running 0 3m37s
pod/prometheus-prometheus-node-exporter-5qpn4 1/1 Running 0 3m37s
pod/prometheus-prometheus-pushgateway-959d84d7f-8ztlm 1/1 Running 0 3m37s
pod/prometheus-server-54956c9cfb-wlvms 2/2 Running 0 3m37s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/prometheus-alertmanager ClusterIP 198.19.133.139 <none> 9093/TCP 3m38s
service/prometheus-alertmanager-headless ClusterIP None <none> 9093/TCP 3m38s
service/prometheus-kube-state-metrics ClusterIP 198.19.185.119 <none> 8080/TCP 3m37s
service/prometheus-prometheus-node-exporter ClusterIP 198.19.252.64 <none> 9100/TCP 3m37s
service/prometheus-prometheus-pushgateway ClusterIP 198.19.193.200 <none> 9091/TCP 3m37s
service/prometheus-server LoadBalancer 198.19.178.17 monitoring-prometheus-se-18ca9-15174488-e4dd7137207d.kr.lb.naverncp.com 80:32534/TCP 3m38s
3. Check the deployed Prometheus
4. Check the queried metric data
5. List and delete the deployed Prometheus release
root@k8s-console:~# helm list -n monitoring
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
prometheus monitoring 1 2022-12-17 17:03:41.29034263 +0900 KST deployed prometheus-19.0.2 v2.40.5
root@k8s-console:~# helm uninstall prometheus -n monitoring
release "prometheus" uninstalled
6. Confirm that the Prometheus resources have been deleted
root@k8s-console:~# helm list -n monitoring
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
root@k8s-console:~#
root@k8s-console:~# kubectl get po,svc -n monitoring
No resources found in monitoring namespace.
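❗One thing worth double-checking here: the command above covers pods and services only, and helm uninstall does not delete PVCs that were created from a StatefulSet volumeClaimTemplate (the alertmanager PVC, storage-prometheus-alertmanager-0, is one; see the PVC listing in the next section), so the block storage may linger and keep billing. A cleanup sketch:
$ kubectl get pvc -n monitoring
$ kubectl delete pvc storage-prometheus-alertmanager-0 -n monitoring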
Deploying the Kube Prometheus Stack (hereafter "Prometheus Stack")
1. Deploy the Prometheus Stack to NKS with Helm
root@k8s-console:~# helm install kube-prometheus-stack \
prometheus-community/kube-prometheus-stack \
--set prometheus.service.type=LoadBalancer \
--set grafana.service.type=LoadBalancer \
--namespace=monitoring \
--create-namespace
NAME: kube-prometheus-stack
LAST DEPLOYED: Sat Dec 17 17:14:15 2022
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace monitoring get pods -l "release=kube-prometheus-stack"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
2. Check the deployed pods and services
root@k8s-console:~# kubectl get po,svc -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-kube-prometheus-stack-alertmanager-0 2/2 Running 1 (104s ago) 105s
pod/kube-prometheus-stack-grafana-77fd7cc8ff-57tp5 3/3 Running 0 114s
pod/kube-prometheus-stack-kube-state-metrics-579bf68b5-rj5ff 1/1 Running 0 114s
pod/kube-prometheus-stack-operator-64bc8bd9fd-2ggrs 1/1 Running 0 114s
pod/kube-prometheus-stack-prometheus-node-exporter-rv8b5 1/1 Running 0 115s
pod/prometheus-kube-prometheus-stack-prometheus-0 2/2 Running 0 105s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 105s
service/kube-prometheus-stack-alertmanager ClusterIP 198.19.250.205 <none> 9093/TCP 115s
service/kube-prometheus-stack-grafana LoadBalancer 198.19.171.157 monitoring-kube-promethe-4b1de-15174529-f0806941ff3d.kr.lb.naverncp.com 80:31512/TCP 115s
service/kube-prometheus-stack-kube-state-metrics ClusterIP 198.19.173.244 <none> 8080/TCP 115s
service/kube-prometheus-stack-operator ClusterIP 198.19.134.58 <none> 443/TCP 115s
service/kube-prometheus-stack-prometheus LoadBalancer 198.19.233.72 monitoring-kube-promethe-5d777-15174528-c0eedcb927a3.kr.lb.naverncp.com 9090:32176/TCP 115s
service/kube-prometheus-stack-prometheus-node-exporter ClusterIP 198.19.202.67 <none> 9100/TCP 115s
service/prometheus-operated ClusterIP None <none> 9090/TCP 105s
❗A major caveat with the current Prometheus Stack
The plain Prometheus deployment above creates PVs and PVCs through the default storage class (nks-block-storage), as shown below:
root@k8s-console:~# kubectl get pv -n monitoring
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-0d5a8305acee499e8a0d57245a 10Gi RWO Delete Bound monitoring/storage-prometheus-alertmanager-0 nks-block-storage 9m42s
pvc-6ae9e2442da2475295da9b1050 10Gi RWO Delete Bound monitoring/prometheus-server nks-block-storage 9m44s
root@k8s-console:~# kubectl get pvc -n monitoring
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
prometheus-server Bound pvc-6ae9e2442da2475295da9b1050 10Gi RWO nks-block-storage 10m
storage-prometheus-alertmanager-0 Bound pvc-0d5a8305acee499e8a0d57245a 10Gi RWO nks-block-storage 10m
However, if you do not specify a storage class for the Prometheus Stack, it is deployed without PVs/PVCs; instead it uses emptyDir volumes, i.e. only temporary storage:
root@k8s-console:~# kubectl get pv,pvc -n monitoring | grep prometheus-server
root@k8s-console:~#
root@k8s-console:~# kubectl get po -n monitoring prometheus-kube-prometheus-stack-prometheus-0 -o yaml | grep volumes -A30
  volumes:
  - name: config
    secret:
      defaultMode: 420
      secretName: prometheus-kube-prometheus-stack-prometheus
  - name: tls-assets
    projected:
      defaultMode: 420
      sources:
      - secret:
          name: prometheus-kube-prometheus-stack-prometheus-tls-assets-0
  - emptyDir: {}
    name: config-out
  - configMap:
      defaultMode: 420
      name: prometheus-kube-prometheus-stack-prometheus-rulefiles-0
    name: prometheus-kube-prometheus-stack-prometheus-rulefiles-0
  - name: web-config
    secret:
      defaultMode: 420
      secretName: prometheus-kube-prometheus-stack-prometheus-web-config
  - emptyDir: {}
    name: prometheus-kube-prometheus-stack-prometheus-db
  - name: kube-api-access-g8rvd
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
<snipped>
Therefore, from an operational point of view, the stack must be configured to use a storage class, and that extra configuration has to be supplied through values.yaml at install time (or by forking the chart and patching it). See the following links:
Prometheus: https://github.com/prometheus-community/helm-charts/issues/186
Grafana: https://github.com/prometheus-community/helm-charts/issues/436
Helm values: https://helm.sh/docs/intro/using_helm/#customizing-the-chart-before-installing
If you really want to do this… see Appendix 1.
3. Check the deployed Prometheus
❗If you want to change the scrapeInterval after deployment
$ kubectl get prometheus -n monitoring -o yaml | nl | grep scrap
57 scrapeInterval: 30s
$ kubectl edit prometheus -n monitoring
prometheus.monitoring.coreos.com/kube-prometheus-stack-prometheus edited
$ kubectl get prometheus -n monitoring -o yaml | nl | grep scrap
57 scrapeInterval: 2m
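The same change can be made non-interactively; a sketch using kubectl patch (the resource name kube-prometheus-stack-prometheus is taken from the kubectl edit output above):
$ kubectl patch prometheus kube-prometheus-stack-prometheus -n monitoring --type merge -p '{"spec":{"scrapeInterval":"2m"}}'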
4. Check the deployed Grafana and log in
ID: admin
Password: prom-operator
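If the admin password has been overridden (for example via adminPassword in Appendix 1), the current one can be read back from the Grafana secret; a sketch assuming the release name kube-prometheus-stack:
$ kubectl get secret -n monitoring kube-prometheus-stack-grafana -o jsonpath='{.data.admin-password}' | base64 -d; echo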
5. Confirm that the preconfigured data source is Prometheus
6. To load a prebuilt dashboard, enter 13770 in the Import menu
7. Select Prometheus as the Data Source and click Import
8. Review the imported dashboard 13770 and fix any N/A and No data panels
9. (If needed) List and delete the deployed Prometheus Stack
root@k8s-console:~# helm list -n monitoring
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
kube-prometheus-stack monitoring 1 2022-12-17 17:14:15.264607955 +0900 KST deployed kube-prometheus-stack-43.1.1 0.61.1
root@k8s-console:~# helm uninstall -n monitoring kube-prometheus-stack
release "kube-prometheus-stack" uninstalled
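❗helm uninstall leaves behind the CRDs that kube-prometheus-stack installed. If you want those removed as well, delete them manually; a sketch (list the actual names first, since they can differ by chart version):
$ kubectl get crd | grep monitoring.coreos.com
$ kubectl delete crd alertmanagerconfigs.monitoring.coreos.com \
    alertmanagers.monitoring.coreos.com podmonitors.monitoring.coreos.com \
    probes.monitoring.coreos.com prometheuses.monitoring.coreos.com \
    prometheusrules.monitoring.coreos.com servicemonitors.monitoring.coreos.com \
    thanosrulers.monitoring.coreos.com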
Appendix 1
1. Generate a values file with helm inspect
$ helm inspect values prometheus-community/kube-prometheus-stack \
    --version 43.1.1 > kube-prometheus-stack-43.1.1.values
2. Add and modify the required content in the generated values file.
Line numbers may differ slightly depending on when you run this and on the order of your edits.
For reference, line numbers can be displayed in vi with :set nu.
Modify:
542 ## Storage is the definition of how storage will be used by the Alertmanager instances.
543 ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md
544 ##
545 storage:
546   volumeClaimTemplate:
547     spec:
548       storageClassName: nks-block-storage
549       accessModes: ["ReadWriteOnce"]
550       resources:
551         requests:
552           storage: 50Gi
553 #    selector: {}
Add:
697 ## Using default values from https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml
698 ##
699 grafana:
700   enabled: true
701   namespaceOverride: ""
702
703   # override configuration by hoon
704   persistence:
705     enabled: true
706     type: pvc
707     storageClassName: nks-block-storage
708     accessModes:
709     - ReadWriteOnce
710     size: 100Gi
711     finalizers:
712     - kubernetes.io/pvc-protection
Modify:
726   ## Timezone for the default dashboards
727   ## Other options are: browser or a specific timezone, i.e. Europe/Luxembourg
728   ##
729   defaultDashboardsTimezone: utc
730
731   adminPassword: admin
732
Modify:
2580 ## Prometheus StorageSpec for persistent data
2581 ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md
2582 ##
2583 storageSpec:
2584 ## Using PersistentVolumeClaim
2585 ##
2586   volumeClaimTemplate:
2587     spec:
2588       storageClassName: nks-block-storage
2589       accessModes: ["ReadWriteOnce"]
2590       resources:
2591         requests:
2592           storage: 50Gi
2593 #    selector: {}
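Before installing, it is worth confirming that these edits actually reach the rendered manifests; a quick sanity check with helm template (grep context sizes are arbitrary):
$ helm template prometheus-community/kube-prometheus-stack --version 43.1.1 \
    --generate-name --namespace monitoring \
    --values kube-prometheus-stack-43.1.1.values | grep -B2 -A4 'storageClassName: nks-block-storage'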
3. Run helm install
root@k8s-console:~# helm install \
prometheus-community/kube-prometheus-stack \
--set prometheus.service.type=LoadBalancer \
--set grafana.service.type=LoadBalancer \
--create-namespace \
--namespace monitoring \
--generate-name \
--values kube-prometheus-stack-43.1.1.values
NAME: kube-prometheus-stack-1671267408
LAST DEPLOYED: Sat Dec 17 17:56:49 2022
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace monitoring get pods -l "release=kube-prometheus-stack-1671267408"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
4. Verify the Prometheus Stack created from the modified values file
root@k8s-console:~# kubectl get po,svc,pv,pvc -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-kube-prometheus-stack-1671-alertmanager-0 2/2 Running 1 (24s ago) 36s
pod/kube-prometheus-stack-1671-operator-696ddf996d-2tbft 1/1 Running 0 37s
pod/kube-prometheus-stack-1671267408-grafana-75cf5cff79-hrs59 3/3 Running 0 37s
pod/kube-prometheus-stack-1671267408-kube-state-metrics-7b44cdrf8q9 1/1 Running 0 37s
pod/kube-prometheus-stack-1671267408-prometheus-node-exporter-npmpk 1/1 Running 0 37s
pod/prometheus-kube-prometheus-stack-1671-prometheus-0 2/2 Running 0 35s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 36s
service/kube-prometheus-stack-1671-alertmanager ClusterIP 198.19.141.183 <none> 9093/TCP 37s
service/kube-prometheus-stack-1671-operator ClusterIP 198.19.249.190 <none> 443/TCP 37s
service/kube-prometheus-stack-1671-prometheus LoadBalancer 198.19.189.46 monitoring-kube-promethe-94513-15174705-1fbb6ff1467d.kr.lb.naverncp.com 9090:30008/TCP 37s
service/kube-prometheus-stack-1671267408-grafana LoadBalancer 198.19.206.4 <pending> 80:31398/TCP 37s
service/kube-prometheus-stack-1671267408-kube-state-metrics ClusterIP 198.19.225.152 <none> 8080/TCP 37s
service/kube-prometheus-stack-1671267408-prometheus-node-exporter ClusterIP 198.19.191.119 <none> 9100/TCP 37s
service/prometheus-operated ClusterIP None <none> 9090/TCP 35s
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-7c195a1da23d4755b21b6ed2db 50Gi RWO Delete Bound monitoring/prometheus-kube-prometheus-stack-1671-prometheus-db-prometheus-kube-prometheus-stack-1671-prometheus-0 nks-block-storage 33s
persistentvolume/pvc-8c1c8c896efb40b6af8fe82a42 50Gi RWO Delete Bound monitoring/alertmanager-kube-prometheus-stack-1671-alertmanager-db-alertmanager-kube-prometheus-stack-1671-alertmanager-0 nks-block-storage 34s
persistentvolume/pvc-c4ba41508e4d4914a1f255f0ae 100Gi RWO Delete Bound monitoring/kube-prometheus-stack-1671267408-grafana nks-block-storage 36s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/alertmanager-kube-prometheus-stack-1671-alertmanager-db-alertmanager-kube-prometheus-stack-1671-alertmanager-0 Bound pvc-8c1c8c896efb40b6af8fe82a42 50Gi RWO nks-block-storage 36s
persistentvolumeclaim/kube-prometheus-stack-1671267408-grafana Bound pvc-c4ba41508e4d4914a1f255f0ae 100Gi RWO nks-block-storage 38s
persistentvolumeclaim/prometheus-kube-prometheus-stack-1671-prometheus-db-prometheus-kube-prometheus-stack-1671-prometheus-0 Bound pvc-7c195a1da23d4755b21b6ed2db 50Gi RWO nks-block-storage 35s
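Since nks-block-storage has ALLOWVOLUMEEXPANSION true (see the storageclass output at the beginning), these PVCs can be grown later without redeploying; a sketch expanding the Grafana PVC (name taken from the output above, target size illustrative; depending on the CSI driver, the filesystem resize may only complete after the pod restarts):
$ kubectl patch pvc kube-prometheus-stack-1671267408-grafana -n monitoring --type merge -p '{"spec":{"resources":{"requests":{"storage":"150Gi"}}}}'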
References:
https://1week.tistory.com/43
https://passwd.tistory.com/entry/Helm-kube-prometheus-stack-Grafana-Persistence-%ED%99%9C%EC%84%B1%ED%99%94
https://github.com/prometheus-community/helm-charts/issues/113