Kubernetes Database Operator - Week 3

hanship 2025. 12. 5. 05:28

PostgreSQL

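PostgreSQL database structure

Cluster: a collection of databases.
Schema: a logical collection of objects such as tables, views, functions, indexes, data types, and operators.
- When a database is created, the default schema, public, is created automatically.
- In PostgreSQL a schema is a collection of tables, and a database is a collection of schemas.
- In MySQL, by contrast, a database is itself a collection of tables.
Table: the most basic structure; think of it as a grid of data.
- A table consists of rows and columns.
- A table is sometimes called a relation. Tables and relations let you manage data through the relationships between data items.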

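Key features of PostgreSQL:

SQL standard compliance: PostgreSQL follows the SQL standard closely and supports a wide range of SQL features.
ACID transactions: guarantees Atomicity, Consistency, Isolation, and Durability.
Extensible architecture: can be extended with user-defined types, functions, and operators, and gains features through plugins and external modules.
Complex query execution: offers powerful indexing and can run complex SQL queries.
MVCC: Multi-Version Concurrency Control provides a high level of concurrency and performance.
Rich data type support: supports many built-in data types as well as user-defined types.
Full-text search: built-in text search supports complex text-based queries.
Security: strong authentication and authorization plus SSL support to protect data.
Large-scale data handling: can handle terabytes of data and more.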

Using PostgreSQL

Run PostgreSQL

docker run -p 5432:5432 --name postgres -e POSTGRES_PASSWORD=postgres -d postgres

Connect with a client

docker exec -it postgres psql -U postgres

You can also connect with tools such as pgAdmin, DBeaver, and DataGrip.

CloudNativePG

CloudNativePG is an open-source operator for running and managing PostgreSQL databases on Kubernetes.

Installing CloudNativePG

curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.25.0/install.sh | bash -s v0.25.0
kubectl get all -n olm
NAME                                    READY   STATUS    RESTARTS   AGE
pod/catalog-operator-569cd6998d-l6jbv   1/1     Running   0          60s
pod/olm-operator-6fbbcd8c8b-p6qbt       1/1     Running   0          60s
pod/operatorhubio-catalog-fgwlx         1/1     Running   0          50s
pod/packageserver-78dc57bf98-frbvr      1/1     Running   0          50s
pod/packageserver-78dc57bf98-jxjb9      1/1     Running   0          50s

NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/operatorhubio-catalog   ClusterIP   10.100.29.65    <none>        50051/TCP   50s
service/packageserver-service   ClusterIP   10.100.65.160   <none>        5443/TCP    51s

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/catalog-operator   1/1     1            1           60s
deployment.apps/olm-operator       1/1     1            1           60s
deployment.apps/packageserver      2/2     2            2           51s

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/catalog-operator-569cd6998d   1         1         1       60s
replicaset.apps/olm-operator-6fbbcd8c8b       1         1         1       60s
replicaset.apps/packageserver-78dc57bf98      2         2         2       51s

After installation you can see the olm namespace. OLM (Operator Lifecycle Manager) is the system that manages operators, and its system pods run in this namespace.

curl -s -O https://operatorhub.io/install/cloudnative-pg.yaml
kubectl create -f cloudnative-pg.yaml
# check
kubectl get all -n operators
NAME                                           READY   STATUS    RESTARTS   AGE
pod/cnpg-controller-manager-7c74c96b65-gpvm2   1/1     Running   0          26s

NAME                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/cnpg-controller-manager-service   ClusterIP   10.100.110.12   <none>        443/TCP   26s
service/cnpg-webhook-service              ClusterIP   10.100.55.12    <none>        443/TCP   28s

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cnpg-controller-manager   1/1     1            1           26s

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/cnpg-controller-manager-7c74c96b65   1         1         1       26s

kubectl get crd | grep cnpg # list the cnpg CRDs
backups.postgresql.cnpg.io                    2023-11-04T06:46:52Z
clusters.postgresql.cnpg.io                   2023-11-04T06:46:52Z
poolers.postgresql.cnpg.io                    2023-11-04T06:46:51Z
scheduledbackups.postgresql.cnpg.io           2023-11-04T06:46:51Z # CRD used for backups

The commands above install the operator (cnpg = CloudNativePG).

cat <<EOT > mycluster1.yaml
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mycluster
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:15.3 # PostgreSQL version
  instances: 3
  storage:
    size: 3Gi
  postgresql:
    parameters:
      max_worker_processes: "40"
      timezone: "Asia/Seoul"
    pg_hba: # authentication rules
      - host all postgres all trust
  primaryUpdateStrategy: unsupervised
  enableSuperuserAccess: true
  bootstrap: # used when the database is first created
    initdb:
      database: app
      encoding: UTF8
      localeCType: C
      localeCollate: C
      owner: app

  monitoring:
    enablePodMonitor: true # allows Prometheus to scrape metrics
EOT

kubectl apply -f mycluster1.yaml

The command above deploys the actual cluster.

Watching the creation process with kubectl get pod -w shows:

mycluster-1-initdb-9dwcc   0/1     Pending   0          1s
mycluster-1-initdb-9dwcc   0/1     Pending   0          4s
mycluster-1-initdb-9dwcc   0/1     Init:0/1   0          5s
mycluster-1-initdb-9dwcc   0/1     PodInitializing   0          16s
mycluster-1-initdb-9dwcc   1/1     Running           0          28s
mycluster-1-initdb-9dwcc   0/1     Completed         0          35s
...
mycluster-2-join-7lqkc     0/1     Pending           0          0s
mycluster-2-join-7lqkc     0/1     Pending           0          4s
mycluster-2-join-7lqkc     0/1     Init:0/1          0          4s
mycluster-2-join-7lqkc     0/1     Init:0/1          0          13s
mycluster-2-join-7lqkc     0/1     PodInitializing   0          14s
mycluster-2-join-7lqkc     1/1     Running           0          27s
mycluster-2-join-7lqkc     0/1     Completed         0          33s
...
mycluster-2-join-7lqkc     0/1     Terminating       0          2m40s
mycluster-1-initdb-9dwcc   0/1     Terminating       0          3m36s
mycluster-2-join-7lqkc     0/1     Terminating       0          2m40s
mycluster-1-initdb-9dwcc   0/1     Terminating       0          3m36s
mycluster-3-join-jxgcx     0/1     Terminating       0          112s
mycluster-3-join-jxgcx     0/1     Terminating       0          112s

First the initdb job runs. Next comes the join job in the mycluster-2-join-7lqkc pod: once the primary, mycluster-1, is running, the second instance joins the primary. When instance 2 is done, instance 3 joins the primary in the same way.

At the end, the job pods are terminated.

Checking the ro, rw, and r Services and endpoints

kubectl get svc
NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
kubernetes     ClusterIP   10.100.0.1       <none>        443/TCP    90m
mycluster-r    ClusterIP   10.100.218.178   <none>        5432/TCP   4m33s
mycluster-ro   ClusterIP   10.100.154.145   <none>        5432/TCP   4m33s
mycluster-rw   ClusterIP   10.100.171.7     <none>        5432/TCP   4m33s

kubectl get endpointslices
NAME                 ADDRESSTYPE   PORTS   ENDPOINTS                                  AGE
kubernetes           IPv4          443     192.168.1.245,192.168.3.45                 105m
mycluster-r-wpqct    IPv4          5432    192.168.3.212,192.168.2.150,192.168.1.82   18m
mycluster-ro-766zp   IPv4          5432    192.168.2.150,192.168.1.82                 18m
mycluster-rw-sdgm2   IPv4          5432    192.168.3.212                              18m

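mycluster-ro: for read-only workloads; forwards requests to the standbys in round-robin fashion

mycluster-r: for read-only workloads; forwards requests to any instance (both the primary and the standbys are reachable)

mycluster-rw: forwards read-write requests to the primary; it points at the primary pod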
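
As a minimal sketch of the mapping above, the helper below returns the Service name for a given kind of workload. The function name cnpg_service_for and the workload labels (write, read-standby, read-any) are made up for illustration and are not part of CloudNativePG; only the -rw, -ro, and -r suffixes come from the output above.

```shell
# Hypothetical helper: map a workload kind to the Service created for a cluster.
# Usage: cnpg_service_for <cluster-name> <write|read-standby|read-any>
cnpg_service_for() {
  case "$2" in
    write)        printf '%s-rw\n' "$1" ;;  # primary only
    read-standby) printf '%s-ro\n' "$1" ;;  # standbys only
    read-any)     printf '%s-r\n'  "$1" ;;  # primary or standbys
    *)            printf 'unknown workload: %s\n' "$2" >&2; return 1 ;;
  esac
}

cnpg_service_for mycluster write         # prints mycluster-rw
cnpg_service_for mycluster read-standby  # prints mycluster-ro
```

A client script could then pass the result to psql with -h instead of hard-coding the Service name.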

kubectl describe pod -l cnpg.io/cluster=mycluster # metrics are served on TCP 9187
kubectl get pod -l cnpg.io/cluster=mycluster -owide
curl -s <POD_IP>:9187/metrics
curl -s 192.168.1.84:9187/metrics
# Grafana dashboard setup: CloudNativePG dashboard
kubectl apply -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/main/docs/src/samples/monitoring/grafana-configmap.yaml

With the commands above you can check the metrics and the Grafana dashboard.

kubectl krew install cnpg
# Once the plugin is installed, cluster information is easy to inspect.
kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7297496592879362067
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:15.3
Primary instance:    mycluster-1
Primary start time:  2023-11-04 06:54:08 +0000 UTC (uptime 14m0s)
Status:              Cluster in healthy state
Instances:           3
Ready instances:     3
Current Write LSN:   0/7000060 (Timeline: 1 - WAL File: 000000010000000000000007)

Certificates Status
Certificate Name       Expiration Date                Days Left Until Expiration
----------------       ---------------                --------------------------
mycluster-ca           2024-02-02 06:48:20 +0000 UTC  89.99
mycluster-replication  2024-02-02 06:48:20 +0000 UTC  89.99
mycluster-server       2024-02-02 06:48:20 +0000 UTC  89.99

Continuous Backup status
Not configured

Streaming Replication status
Replication Slots Enabled
Name         Sent LSN   Write LSN  Flush LSN  Replay LSN  Write Lag  Flush Lag  Replay Lag  State      Sync State  Sync Priority  Replication Slot
----         --------   ---------  ---------  ----------  ---------  ---------  ----------  -----      ----------  -------------  ----------------
mycluster-2  0/7000060  0/7000060  0/7000060  0/7000060   00:00:00   00:00:00   00:00:00    streaming  async       0              active
mycluster-3  0/7000060  0/7000060  0/7000060  0/7000060   00:00:00   00:00:00   00:00:00    streaming  async       0              active

Unmanaged Replication Slot Status
No unmanaged replication slots found

Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-1  29 MB          0/7000060    Primary           OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal
mycluster-2  29 MB          0/7000060    Standby (async)   OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-3  29 MB          0/7000060    Standby (async)   OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal

Install krew beforehand, then install the cnpg plugin. Use the -v option for more detail.

kubectl get sts
No resources found in default namespace.

Querying StatefulSets returns nothing because cnpg does not use them. StatefulSets have several drawbacks, such as being hard to change after creation, so cnpg manages its resources directly with a custom controller instead.

kubectl describe pv | grep 'Node Affinity:' -A2
Node Affinity:
  Required Terms:
    Term 0:        topology.ebs.csi.aws.com/zone in [ap-northeast-2c]
--
Node Affinity:
  Required Terms:
    Term 0:        topology.ebs.csi.aws.com/zone in [ap-northeast-2b]
--
Node Affinity:
  Required Terms:
    Term 0:        topology.ebs.csi.aws.com/zone in [ap-northeast-2a]

The command above shows the availability zone (of the AWS subnet) where each PV was provisioned. For a database's EBS volume to grow, resources must be available in that same zone.
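
To pull just the zone names out of that output, a small filter along these lines could be used. This is only a sketch: pv_zones is a hypothetical helper name, and the sample lines are copied from the output above so the snippet runs without a cluster.

```shell
# Hypothetical helper: print the zone from each "zone in [...]" line on stdin,
# de-duplicated and sorted.
pv_zones() {
  sed -n 's/.*zone in \[\([^]]*\)\].*/\1/p' | sort -u
}

# Sample input copied from the output above; in a live cluster, pipe in
# `kubectl describe pv` instead.
printf '%s\n' \
  'Node Affinity:' \
  '  Required Terms:' \
  '    Term 0:        topology.ebs.csi.aws.com/zone in [ap-northeast-2c]' \
  '--' \
  'Node Affinity:' \
  '  Required Terms:' \
  '    Term 0:        topology.ebs.csi.aws.com/zone in [ap-northeast-2a]' | pv_zones
```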

Using CloudNativePG

Preparing to connect to postgres

# Check the superuser account name
kubectl get secrets mycluster-superuser -o jsonpath={.data.username} | base64 -d ;echo
postgres
# Check the superuser account password
kubectl get secrets mycluster-superuser -o jsonpath={.data.password} | base64 -d ;echo
iwAdrHxRiBkVkHyHNIsMDL9b37DD5N3zWxYKsHlnDLONkDLcQCxKVPZoi2u757q2

# app account name
kubectl get secrets mycluster-app -o jsonpath={.data.username} | base64 -d ;echo
app

# app account password
kubectl get secrets mycluster-app -o jsonpath={.data.password} | base64 -d ;echo
F29o4utoZUt7RIhsacu6obVmugEsiyVJxwEV1E8V8QgHsLn5R8lxVQhHaTuObySO

# Store the app account password in a variable
AUSERPW=$(kubectl get secrets mycluster-app -o jsonpath={.data.password} | base64 -d)

# Deploy three myclient pods using envsubst
curl -s https://raw.githubusercontent.com/gasida/DOIK/main/5/myclient-new.yaml -o myclient.yaml
for ((i=1; i<=3; i++)); do PODNAME=myclient$i VERSION=15.3.0 envsubst < myclient.yaml | kubectl apply -f - ; done
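
The base64 -d step in the commands above is needed because values under a Secret's .data field are stored base64-encoded. The sketch below shows the round trip on the literal account name app, so it runs without a cluster; secret_decode is a hypothetical helper name.

```shell
# Hypothetical helper: decode one base64-encoded Secret value read from stdin.
secret_decode() {
  base64 -d
}

# Encode the account name the way it is stored under .data, then decode it.
encoded=$(printf '%s' 'app' | base64)
printf '%s\n' "$encoded"                        # prints YXBw
printf '%s' "$encoded" | secret_decode; echo    # prints app
```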

Connecting as the postgres user

# [myclient1] Connect to the mycluster-rw Service with the superuser account
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432

Check the connection information

postgres=# \conninfo
You are connected to database "postgres" as user "postgres" on host "mycluster-rw" (address "10.100.171.7") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)

This means the connection went to 10.100.171.7, the cluster IP of mycluster-rw.

List the databases (the app database was created by bootstrap)

postgres=# \l
List of databases
   Name    |  Owner   | Encoding | Collate | Ctype | ICU Locale | Locale Provider |   Access privileges
-----------+----------+----------+---------+-------+------------+-----------------+-----------------------
 app       | app      | UTF8     | C       | C     |            | libc            |
 postgres  | postgres | UTF8     | C       | C     |            | libc            |
 template0 | postgres | UTF8     | C       | C     |            | libc            | =c/postgres          +
           |          |          |         |       |            |                 | postgres=CTc/postgres
 template1 | postgres | UTF8     | C       | C     |            | libc            | =c/postgres          +
           |          |          |         |       |            |                 | postgres=CTc/postgres
(4 rows)

Connect with the app account (enter the app account password printed earlier when prompted)

kubectl exec -it myclient1 -- psql -U app -h mycluster-rw -p 5432
Password for user app: F29o4utoZUt7RIhsacu6obVmugEsiyVJxwEV1E8V8QgHsLn5R8lxVQhHaTuObySO
app=> \conninfo # connection info
app=> \l # list databases
app=> \dt
app=> \q # quit

External access: expose the Service through an NLB so it is reachable from outside. ExternalDNS and the NLB controller must be installed beforehand.

# Install the psql client tool
yum install postgresql -y

# Expose the Service externally as type LoadBalancer: provisioning takes time, so wait about 3 to 5 minutes before connecting
kubectl patch svc mycluster-rw -p '{"spec":{"type":"LoadBalancer"}}'
kubectl annotate service mycluster-rw "external-dns.alpha.kubernetes.io/hostname=psql.$MyDomain"

# Connect
psql -U postgres -h psql.$MyDomain

CloudNativePG load distribution test

A test of how connections are distributed when accessing through rw, ro, and r.

mycluster-1 ⇒ 192.168.3.212 primary

mycluster-2 ⇒ 192.168.2.150 standby

mycluster-3 ⇒ 192.168.1.82 standby

kubectl get po -owide
mycluster-1   1/1     Running   0          67m   192.168.3.212   ip-192-168-3-23.ap-northeast-2.compute.internal    <none>           <none>
mycluster-2   1/1     Running   0          66m   192.168.2.150   ip-192-168-2-98.ap-northeast-2.compute.internal    <none>           <none>
mycluster-3   1/1     Running   0          64m   192.168.1.82    ip-192-168-1-204.ap-northeast-2.compute.internal   <none>           <none>

# rw
for i in {1..12}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 192
12  192.168.3.212

# ro
for i in {1..12}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 192
7  192.168.2.150
5  192.168.1.82

# r
for i in {1..12}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 192
6  192.168.1.82
3  192.168.3.212
3  192.168.2.150

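rw reaches only the primary, so all 12 connections went to mycluster-1.

ro reaches only the standbys, so the connections were split 7 and 5 between mycluster-2 and mycluster-3.

r is the any endpoint that reaches every instance, so the connections were split 6, 3, and 3.

Load balancing in cnpg is probabilistic, so with a small number of connections the traffic tends to skew toward one instance.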
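
The counting step at the end of each loop above can be wrapped in a small helper. This is a sketch rather than anything cnpg provides: tally_addrs is a hypothetical name, and the sample rows below stand in for real psql output so the snippet runs without a cluster.

```shell
# Hypothetical helper: count how often each IPv4 address appears on stdin,
# most frequent first. Output format: "<count> <address>".
tally_addrs() {
  grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort | uniq -c | sort -nr | awk '{ print $1, $2 }'
}

# Sample: three psql-style result rows, two of them from the same standby.
printf '%s\n' ' 192.168.2.150' ' 192.168.1.82' ' 192.168.2.150' | tally_addrs
# prints:
# 2 192.168.2.150
# 1 192.168.1.82
```

In a live test, the tail of each loop (sort, uniq -c, sort -nr, grep 192) could be replaced with a pipe into tally_addrs. Raising the iteration count well above 12 should make the skew between instances less pronounced.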

CloudNativePG failure tests

Preparation: load data

# Download
curl -LO https://www.postgresqltutorial.com/wp-content/uploads/2019/05/dvdrental.zip
unzip dvdrental.zip

# Copy dvdrental.tar to the myclient1 pod
kubectl cp dvdrental.tar myclient1:/tmp

# [myclient1] Create the database through the mycluster-rw Service with the superuser account
kubectl exec -it myclient1 -- createdb -U postgres -h mycluster-rw -p 5432 dvdrental

# Load the DVD Rental sample database
kubectl exec -it myclient1 -- pg_restore -U postgres -d dvdrental /tmp/dvdrental.tar -h mycluster-rw -p 5432

# Query the actor table in the DVD Rental sample database
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d dvdrental -c "SELECT * FROM actor"
actor_id | first_name  |  last_name   |      last_update
----------+-------------+--------------+------------------------
        1 | Penelope    | Guiness      | 2013-05-26 14:47:57.62
        2 | Nick        | Wahlberg     | 2013-05-26 14:47:57.62
        3 | Ed          | Chase        | 2013-05-26 14:47:57.62
        4 | Jennifer    | Davis        | 2013-05-26 14:47:57.62
        5 | Johnny      | Lollobrigida | 2013-05-26 14:47:57.62
        6 | Bette       | Nicholson    | 2013-05-26 14:47:57.62

Preparation: environment setup

# Store the pod IPs in variables
POD1=$(kubectl get pod mycluster-1 -o jsonpath={.status.podIP})
POD2=$(kubectl get pod mycluster-2 -o jsonpath={.status.podIP})
POD3=$(kubectl get pod mycluster-3 -o jsonpath={.status.podIP})

# query.sql
curl -s -O https://raw.githubusercontent.com/gasida/DOIK/main/5/query.sql

# Run the queries in the SQL file
kubectl cp query.sql myclient1:/tmp
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -f /tmp/query.sql

[Failure 1] Force-delete the primary pod (instance) and observe the behavior; four terminals are needed

Check whether INSERTs are interrupted when the primary pod is deleted while data is being inserted.

# [Terminal 1] Monitoring
watch kubectl get pod -l cnpg.io/cluster=mycluster

# [Terminal 2] Monitoring
while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done

# [Terminal 3] Insert a large amount of data into the test database
for ((i=10001; i<=20000; i++)); do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done

Force-delete the primary pod (the current primary is mycluster-1)

kubectl delete pvc/mycluster-1 pod/mycluster-1
kubectl cnpg status mycluster

Check in [Terminal 2] whether the INSERTs keep going

# Before the deletion, count keeps increasing
count
-------
    12
(1 row)
Sat Nov  4 17:14:20 KST 2023
 count
-------
    13
(1 row)
Sat Nov  4 17:14:23 KST 2023

# After the deletion
[Terminal 1]
NAME                     READY   STATUS      RESTARTS   AGE
mycluster-2              1/1     Running     0          80m
mycluster-3              1/1     Running     0          78m
mycluster-4              0/1     Running     0          11s
mycluster-4-join-rhjvx   0/1     Completed   0          27s

[Terminal 2]
Sat Nov  4 17:15:21 KST 2023
 count
-------
    52
(1 row)
Sat Nov  4 17:15:23 KST 2023
 count
-------
    53
(1 row)
Sat Nov  4 17:15:26 KST 2023
 count
-------
    55
(1 row)
Sat Nov  4 17:15:28 KST 2023

[Terminal 3]
INSERT 0 1

psql: error: connection to server at "mycluster-rw" (10.100.171.7), port 5432 failed: FATAL:  the database system is shutting down
command terminated with exit code 2

INSERT 0 1

Terminal 3 reports an error that the connection was dropped, but terminal 2 shows that data keeps being inserted normally, and terminal 1 shows the join job starting right away.
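
The INSERT loop in terminal 3 issues each statement once, so a statement that lands in the failover window simply fails. One way a client could ride through that window is a small retry wrapper such as the sketch below. The function name retry and its arguments are assumptions for illustration, not something cnpg or psql provides. Retrying is also not always safe: if a statement had actually committed before the connection dropped, running it again could insert a duplicate row or fail on a unique key.

```shell
# Hypothetical helper: run a command up to <max> times, sleeping <delay>
# seconds between attempts. Returns 0 on the first success, 1 if all fail.
# Usage: retry <max> <delay> <command> [args...]
retry() {
  retry_max=$1
  retry_delay=$2
  shift 2
  retry_n=1
  until "$@"; do
    if [ "$retry_n" -ge "$retry_max" ]; then
      return 1
    fi
    sleep "$retry_delay"
    retry_n=$((retry_n + 1))
  done
  return 0
}

# Example with a command that succeeds immediately.
retry 3 0 true && echo "retry: command succeeded"

# In the loop above, the psql call could be wrapped like this (not run here):
#   retry 5 2 kubectl exec myclient2 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT ..."
```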

[Failure 2] Drain the node hosting the primary pod (instance) and observe the behavior

# [Terminal 1] Monitoring
watch kubectl get pod -l cnpg.io/cluster=mycluster

# [Terminal 2] Monitoring
while true; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done

# [Terminal 3] Insert a large amount of data into the test database
for ((i=301; i<=10000; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done

Drain a worker node. The previous test left mycluster-2 as the primary, so drain the node hosting mycluster-2.

kubectl cnpg status mycluster
Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-2  51 MB          0/D00A9B8    Primary           OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-3  51 MB          0/D00A9B8    Standby (async)   OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal
mycluster-4  51 MB          0/D00A9B8    Standby (async)   OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal

# Find the node hosting mycluster-2 and set its name here
kubectl get node
NODE=ip-192-168-2-98.ap-northeast-2.compute.internal
kubectl drain $NODE --delete-emptydir-data --force --ignore-daemonsets && kubectl get node -w

# After the drain
[Terminal 1]
NAME          READY   STATUS        RESTARTS     AGE
mycluster-2   0/1     Terminating   1 (7s ago)   88m
mycluster-3   1/1     Running       0            86m
mycluster-4   1/1     Running       0            7m45s

[Terminal 3]
INSERT 0 1
INSERT 0 1
INSERT 0 1

[Terminal 2]
count
-------
   125
(1 row)
Sat Nov  4 17:23:34 KST 2023
 count
-------
   125
(1 row)

kubectl cnpg status mycluster
Instances status
Name         Database Size  Current LSN  Replication role  Status             QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------             ---         ---------------  ----
mycluster-3  51 MB          0/F00D450    Primary           OK                 BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal
mycluster-4  51 MB          0/F00D450    Standby (async)   OK                 BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal
mycluster-2  -              -            -                 pod not available  BestEffort  -

# Restore the drained node
kubectl uncordon $NODE
kubectl cnpg status mycluster
Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-3  51 MB          0/F012AC0    Primary           OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal
mycluster-2  51 MB          0/F012AC0    Standby (async)   OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-4  51 MB          0/F012AC0    Standby (async)   OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal

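The node was drained, but mycluster-3 was promoted to primary and the database kept working normally.

CloudNativePG operations tests

Growing the pod volume

When operating a database you sometimes have to scale up because a pod's volume is running out of space. With cnpg this can be done with a single command that works together with AWS. Note that a volume can be grown but not shrunk. Resizing volumes by hand is quite tedious, so being able to scale up with one command is convenient.

# Monitoring
watch kubectl get pod,pvc

# Check the current PV capacity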
kubectl df-pv
PV NAME                                   PVC NAME     NAMESPACE  NODE NAME                                         POD NAME     VOLUME MOUNT NAME  SIZE  USED   AVAILABLE  %USED  IUSED  IFREE   %IUSED
pvc-24ae349b-4f2d-4fa5-897b-ac8f34af3e63  mycluster-2  default    ip-192-168-2-98.ap-northeast-2.compute.internal   mycluster-2  pgdata             2Gi   291Mi  2Gi        9.88   2034   194574  1.03
pvc-d43bd5fa-14dd-4eea-810c-a517384e6516  mycluster-4  default    ip-192-168-3-23.ap-northeast-2.compute.internal   mycluster-4  pgdata             2Gi   131Mi  2Gi        4.46   2012   194596  1.02
pvc-a8d1b7da-ae9b-44d4-b570-9f4c5a33a366  mycluster-3  default    ip-192-168-1-204.ap-northeast-2.compute.internal  mycluster-3  pgdata             2Gi   259Mi  2Gi        8.80   2029   194579  1.03

The output shows 2Gi allocated to each pod. Increase the size to 5Gi.

kubectl patch cluster mycluster --type=merge -p '{"spec":{"storage":{"size":"5Gi"}}}'
kubectl df-pv
PV NAME                                   PVC NAME     NAMESPACE  NODE NAME                                         POD NAME     VOLUME MOUNT NAME  SIZE  USED   AVAILABLE  %USED  IUSED  IFREE   %IUSED
pvc-24ae349b-4f2d-4fa5-897b-ac8f34af3e63  mycluster-2  default    ip-192-168-2-98.ap-northeast-2.compute.internal   mycluster-2  pgdata             4Gi   307Mi  4Gi        6.20   2036   325644  0.62

In the AWS console you can also see that the EBS volume has changed to 5 GiB.
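
Because volumes can only grow, a script that patches the size might guard against an accidental shrink first. The sketch below is one way to do that; can_expand is a hypothetical helper, and it only understands whole-number Gi values such as 3Gi.

```shell
# Hypothetical helper: succeed only if the requested size is larger than the
# current one. Handles whole-number Gi values such as "3Gi" only.
# Usage: can_expand <current-size> <requested-size>
can_expand() {
  expand_cur=${1%Gi}
  expand_new=${2%Gi}
  [ "$expand_new" -gt "$expand_cur" ]
}

if can_expand 3Gi 5Gi; then echo "3Gi -> 5Gi: ok to patch"; fi
if ! can_expand 5Gi 3Gi; then echo "5Gi -> 3Gi: refused, volumes cannot shrink"; fi
```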

Changing the primary pod

The primary can be switched to another pod easily during operation. In the example below, the primary changes from mycluster-3 to mycluster-4.

# Check the current primary -> mycluster-3 is the primary
kubectl cnpg status mycluster
Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-3  51 MB          0/10000110   Primary           OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal
mycluster-2  51 MB          0/10000110   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-4  51 MB          0/10000110   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal

# Switch the primary to mycluster-4
kubectl cnpg promote mycluster mycluster-4

kubectl cnpg status mycluster
Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-4  51 MB          0/11001410   Primary           OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal
mycluster-2  51 MB          0/11001410   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-3  50 MB          0/110000A0   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal
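
In the status output above, mycluster-3 reports an older LSN (0/110000A0) than the new primary (0/11001410) right after the promotion. An LSN is printed as two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit position in the WAL, so the gap can be expressed in bytes. The helper names lsn_to_bytes and lsn_diff below are made up for this sketch.

```shell
# Convert an LSN such as 0/11001410 (two hexadecimal halves) to a byte position.
lsn_to_bytes() {
  lsn_hi=$(printf '%d' "0x${1%%/*}")
  lsn_lo=$(printf '%d' "0x${1##*/}")
  echo $(( lsn_hi * 4294967296 + lsn_lo ))
}

# Byte distance between two LSNs.
# Usage: lsn_diff <ahead> <behind>
lsn_diff() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

lsn_diff 0/11001410 0/110000A0   # prints 4976
```

Here the former primary trails by 4976 bytes of WAL, a gap that should close once it has caught up as a standby.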

Scale in/out test

When database traffic grows during operation, you may need more pods. The steps below scale out easily.

# Check the current state: three instances are running
kubectl cnpg status mycluster
Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-4  51 MB          0/110033D0   Primary           OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal
mycluster-2  51 MB          0/110033D0   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-3  51 MB          0/110033D0   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal

# Increase to five instances -> scale out
kubectl patch cluster mycluster --type=merge -p '{"spec":{"instances":5}}' && kubectl get pod -l postgresql=mycluster -w

# Check the state
kubectl cnpg status mycluster
Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-4  51 MB          0/15000060   Primary           OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal
mycluster-2  51 MB          0/15000060   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-3  51 MB          0/15000060   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal
mycluster-5  51 MB          0/15000060   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-6  51 MB          0/15000060   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal

# Check that every pod is reachable through the any (r) Service
for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 192
7  192.168.3.147
7  192.168.2.121
6  192.168.1.199
5  192.168.2.56
5  192.168.1.82

# Decrease back to three instances -> scale in
kubectl patch cluster mycluster --type=merge -p '{"spec":{"instances":3}}' && kubectl get pod -l postgresql=mycluster -w

Rolling update

While the service is running you may need to upgrade the PostgreSQL version, including on the primary pod. The steps below upgrade version 15.3 to 15.4.

There are two update settings.

primaryUpdateMethod

restart (default)
switchover

primaryUpdateStrategy

unsupervised (default): handled automatically
supervised: the user has to stop/restart manually
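
Since both settings accept only a fixed set of values, a script could validate them before building the patch body. The sketch below does that; update_patch is a hypothetical helper, and a misspelled value such as swithover is rejected before it reaches the API server.

```shell
# Hypothetical helper: print a merge-patch body for a rolling update.
# Usage: update_patch <image> <unsupervised|supervised> <restart|switchover>
update_patch() {
  case "$2" in
    unsupervised|supervised) ;;
    *) printf 'bad primaryUpdateStrategy: %s\n' "$2" >&2; return 1 ;;
  esac
  case "$3" in
    restart|switchover) ;;
    *) printf 'bad primaryUpdateMethod: %s\n' "$3" >&2; return 1 ;;
  esac
  printf '{"spec":{"imageName":"%s","primaryUpdateStrategy":"%s","primaryUpdateMethod":"%s"}}\n' "$1" "$2" "$3"
}

update_patch ghcr.io/cloudnative-pg/postgresql:15.4 unsupervised switchover
```

The printed JSON could then be passed to kubectl patch cluster mycluster --type=merge -p.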

# [Terminal 1] Monitoring
watch kubectl get pod -l cnpg.io/cluster=mycluster

# [Terminal 2] Monitoring
while true; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done

# Check the image version of the current primary
kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7297496592879362067
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:15.3
Primary instance:    mycluster-4

# [Terminal 3] Insert a large amount of data into the test database

# [Terminal 4] Update postgresql:15.3 -> postgresql:15.4 and watch the order and procedure
kubectl edit cluster mycluster
# Alternatively, since the default method is restart, pass the switchover option explicitly
kubectl patch cluster mycluster --type=merge -p '{"spec":{"imageName":"ghcr.io/cloudnative-pg/postgresql:15.4","primaryUpdateStrategy":"unsupervised","primaryUpdateMethod":"switchover"}}' && kubectl get pod -l postgresql=mycluster -w

# [Terminal 1]
NAME          READY   STATUS            RESTARTS   AGE
mycluster-2   1/1     Running           0          31m
mycluster-3   0/1     PodInitializing   0          18s
mycluster-4   1/1     Running           0          39m

kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7297496592879362067
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:15.4
Primary instance:    mycluster-2
Primary start time:  2023-11-04 08:55:54 +0000 UTC (uptime 7s)
Status:              Waiting for the instances to become active Some instances are not yet active. Please wait.
Instances:           3
Ready instances:     2
Current Write LSN:   0/170037F8 (Timeline: 5 - WAL File: 000000050000000000000017)

Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-2  51 MB          0/170037F8   Primary           OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-3  51 MB          0/170037F8   Standby (async)   OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal
mycluster-4  50 MB          0/170037F8   Unknown           OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.interna

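Because of the switchover, the primary changed from mycluster-4 to mycluster-2, and the image was updated to 15.4 as expected.

CloudNativePG miscellaneous

PgBouncer

PgBouncer is a lightweight connection pooler that manages connections to a PostgreSQL server. By managing large numbers of concurrent database connections efficiently, it helps improve performance and reduce system resource overhead.

It works much like a proxy in front of the database.

Remove the previous lab

kubectl delete cluster mycluster && kubectl delete pod --all

Installation (using version 16.0, the latest at the time)

# Install a new cluster: synchronous replication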
cat <<EOT > mycluster2.yaml
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mycluster
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:16.0
  instances: 3
  storage:
    size: 3Gi
  postgresql:
    pg_hba:
      - host all postgres all trust
  enableSuperuserAccess: true
  minSyncReplicas: 1
  maxSyncReplicas: 2
  monitoring:
    enablePodMonitor: true
EOT
kubectl apply -f mycluster2.yaml && kubectl get pod -w

# Check the replication info: the standbys are shown as sync
kubectl cnpg status mycluster
Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-1  29 MB          0/7000000    Primary           OK      BestEffort  1.21.1           ip-192-168-2-98.ap-northeast-2.compute.internal
mycluster-2  29 MB          0/7000000    Standby (sync)    OK      BestEffort  1.21.1           ip-192-168-1-204.ap-northeast-2.compute.internal
mycluster-3  29 MB          0/7000000    Standby (sync)    OK      BestEffort  1.21.1           ip-192-168-3-23.ap-northeast-2.compute.internal

# Install the PgBouncer pods; they must be in the same namespace as the cluster
cat <<EOT > pooler.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: pooler-rw
spec:
  cluster:
    name: mycluster
  instances: 3
  type: rw
  pgbouncer:
    poolMode: session
    parameters:
      max_client_conn: "1000"
      default_pool_size: "10"
EOT
kubectl apply -f pooler.yaml

# Verify
kubectl get pooler
NAME        AGE   CLUSTER     TYPE
pooler-rw   32s   mycluster   rw

kubectl get svc,ep pooler-rw
NAME                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/pooler-rw   ClusterIP   10.100.7.179   <none>        5432/TCP   68s
NAME                  ENDPOINTS                                                 AGE
endpoints/pooler-rw   192.168.1.82:5432,192.168.2.121:5432,192.168.3.213:5432   68s

# superuser account password
kubectl get secrets mycluster-superuser -o jsonpath={.data.password} | base64 -d ; echo
P7spo92qE1wn1lMVOdzwOAm46XjqoQCvq7srOS1pExYAGPqROYMYTYacXSVn6qmI

# Create the client pods
for ((i=1; i<=3; i++)); do PODNAME=myclient$i VERSION=15.3.0 envsubst < myclient.yaml | kubectl apply -f - ; done
# Check the connection: the pooler's authentication settings apply. Because the type is rw, every connection reaches the same pod
kubectl exec -it myclient1 -- psql -U postgres -h pooler-rw -p 5432 -c "select inet_server_addr();"
P7spo92qE1wn1lMVOdzwOAm46XjqoQCvq7srOS1pExYAGPqROYMYTYacXSVn6qmI
inet_server_addr
------------------
 192.168.2.150
(1 row)

# (Optional) monitoring metrics
kubectl get pod -l cnpg.io/poolerName=pooler-rw -owide
curl <POD_IP>:9127/metrics

cat <<EOT > podmonitor-pooler-rw.yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: pooler-rw
spec:
  selector:
    matchLabels:
      cnpg.io/poolerName: pooler-rw
  podMetricsEndpoints:
  - port: metrics
EOT
kubectl apply -f podmonitor-pooler-rw.yaml
