Chaos Engineering in Action for Kubernetes :: Channy Yun (윤석찬), AWS Tech Evangelist
Failures are a given and
everything will eventually
fail over time.
https://www.allthingsdistributed.com/2016/03/10-lessons-from-10-years-of-aws.html
https://www.youtube.com/watch?v=zoz0ZjfrQ9s
Amazon 2006
GameDay: Creating
Resiliency Through
Destruction
Jesse Robbins
Netflix 2011
Chaos Monkeys:
Test the resilience
of its Infrastructure
Simian Army – Open Source
https://github.com/Netflix/SimianArmy
(c) Dave Hahn, A Day in the Life of a Netflix Engineer Using 37% of the Internet, re:Invent 2015
ELB
Zuul
API
(c) Josh Evans, Mastering Chaos A Netflix Guide to Microservices, QCon SF 2016
(c) Ruslan Meshenberg, From Asgard to Zuul, re:Invent 2014
Chaos Monkey
https://github.com/netflix/chaosmonkey
Instance Fail?
Chaos Gorilla
Zone Fail?
Chaos Kong
Region Fail?
Chaos doesn’t cause problems.
It reveals them.
https://www.oreilly.com/webops-perf/free/chaos-engineering.csp
http://principlesofchaos.org/
http://channy.creation.net/blog/netflix-principles-of-chaos-engineering
Users: 99% of users / 1% of users
Start with...
Microservices (Applications)
DevOps (Culture)
Chaos Engineering
Cloud (Scale)
Chaos Toolkit
PowerfulSeal
Gremlin
mycluster.eks.amazonaws.com
Availability Zone 1 | Availability Zone 2 | Availability Zone 3
kubectl
Health check? Dead node?
https://github.com/asobti/kube-monkey
apiVersion: apps/v1
kind: Deployment
. . .
  template:
    metadata:
      labels:
        app: greeting
        kube-monkey/enabled: enabled
        kube-monkey/identifier: redis-master
        kube-monkey/mtbf: '2'
        kube-monkey/kill-mode: random-max-percent
        kube-monkey/kill-value: '40'
    spec:
      containers:
      - name: greeting
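The `kube-monkey/kill-mode: random-max-percent` and `kube-monkey/kill-value: 40` labels cap how many pods of a deployment may be terminated in one run. The following is only a rough Python sketch of that idea, not kube-monkey's actual code: `pick_victims` is a hypothetical helper, and both the uniform draw of the percentage and the round-down of the pod count are assumptions made for illustration.

```python
import math
import random


def pick_victims(pods, max_percent, rng=None):
    """Choose pods to terminate under a 'random-max-percent' style policy.

    Assumption: a kill percentage is drawn uniformly from 0..max_percent and
    the resulting pod count is rounded down. The real tool may differ.
    """
    if not 0 <= max_percent <= 100:
        raise ValueError("max_percent must be between 0 and 100")
    rng = rng or random.Random()
    percent = rng.randint(0, max_percent)          # inclusive on both ends
    count = math.floor(len(pods) * percent / 100)  # never exceeds the cap
    return rng.sample(pods, count)                 # distinct pods, no repeats


if __name__ == "__main__":
    pods = [f"greeting-{i}" for i in range(10)]
    print(pick_victims(pods, max_percent=40))
```

With ten replicas and a cap of 40, this sketch never selects more than four pods, which is the property worth checking before enabling such a label on a real deployment.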
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-monkey-config-map
  namespace: kube-system
data:
  config.toml: |
    [kubemonkey]
    run_hour = 8
    start_hour = 10
    end_hour = 16
    blacklisted_namespaces = ["kube-system"]
    whitelisted_namespaces = ["default"]
    time_zone = "UTC"
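As a reading aid for the `config.toml` values above, here is a small Python sketch of the checks they imply: terminations only between `start_hour` and `end_hour`, and only in permitted namespaces. The function names are hypothetical, and the weekday-only rule and the way the whitelist and blacklist combine are assumptions, not a statement of how kube-monkey implements them.

```python
from datetime import datetime, timezone


def in_kill_window(now, start_hour=10, end_hour=16):
    """Return True when `now` is a weekday hour inside [start_hour, end_hour)."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4 (assumed weekday-only)
    return is_weekday and start_hour <= now.hour < end_hour


def namespace_eligible(namespace, whitelisted=("default",), blacklisted=("kube-system",)):
    """Assumed rule: a namespace must be whitelisted and must not be blacklisted."""
    return namespace in whitelisted and namespace not in blacklisted


if __name__ == "__main__":
    now = datetime.now(timezone.utc)  # time_zone = "UTC" in the config above
    print(in_kill_window(now), namespace_eligible("default"))
```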
Amazon EKS
Chaos Toolkit
PowerfulSeal
Gremlin
Service Mesh Plane
Svc A + Sidecar
Svc B + Sidecar
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: ratings-destination-rule
spec:
  host: ratings
  subsets:
  - name: ratings
    labels:
      ratings: hello
    trafficPolicy:
      connectionPool:
        tcp:
          maxConnections: 100
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 75
    - destination:
        host: reviews
        subset: v2
      weight: 25
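To get a feel for what the 75/25 weights mean for live traffic, the split can be simulated offline. This is a minimal Python sketch of weighted random selection, not Istio's routing code; `route` and `simulate` are hypothetical names, and the request count and seed are arbitrary.

```python
import random
from collections import Counter


def route(rng, weights):
    """Pick a subset name with probability proportional to its weight."""
    subsets = list(weights)
    return rng.choices(subsets, weights=[weights[s] for s in subsets], k=1)[0]


def simulate(requests=10_000, seed=0):
    """Count how many simulated requests land on each subset."""
    rng = random.Random(seed)
    weights = {"v1": 75, "v2": 25}  # mirrors the weights in the rule above
    return Counter(route(rng, weights) for _ in range(requests))


if __name__ == "__main__":
    counts = simulate()
    total = sum(counts.values())
    for subset, n in sorted(counts.items()):
        print(f"{subset}: {n / total:.1%}")
```

Over many requests the observed shares settle close to the configured weights, while any single request can still go to either subset.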
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - fault:
      delay:
        fixedDelay: 10s
        percent: 100
    route:
    - destination:
        host: ratings
        subset: v1
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - fault:
      abort:
        httpStatus: 500
        percent: 100
    route:
    - destination:
        host: ratings
        subset: v1
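The delay and abort rules above are only useful if the caller of `ratings` is expected to cope with them. The Python sketch below models that expectation without any network or real sleeping: `call_ratings`, `reviews_page`, `UpstreamError`, the 2.5 second timeout, and the returned payload are all hypothetical and stand in for whatever the real service does.

```python
import random


class UpstreamError(Exception):
    """Raised when the simulated ratings call fails or times out."""


def call_ratings(fault=None, percent=100, timeout_s=2.5, rng=None):
    """Simulate one call to 'ratings' through a fault-injecting proxy.

    fault is None, ("delay", seconds) or ("abort", http_status).
    No real sleeping happens; the delay is only compared with the timeout.
    """
    rng = rng or random.Random()
    injected = fault is not None and rng.random() * 100 < percent
    if injected and fault[0] == "abort":
        raise UpstreamError(f"HTTP {fault[1]}")
    if injected and fault[0] == "delay" and fault[1] > timeout_s:
        raise UpstreamError(f"timed out after {timeout_s}s")
    return {"stars": 5}


def reviews_page(**kwargs):
    """Caller with a fallback: degrade gracefully instead of failing the page."""
    try:
        return {"ratings": call_ratings(**kwargs), "degraded": False}
    except UpstreamError:
        return {"ratings": None, "degraded": True}


if __name__ == "__main__":
    print(reviews_page(fault=("delay", 10)))   # mirrors fixedDelay: 10s
    print(reviews_page(fault=("abort", 500)))  # mirrors httpStatus: 500
    print(reviews_page())                      # no fault injected
```

The hypothesis being tested in such an experiment is that the page degrades (ratings missing) rather than failing outright when the dependency is slow or returns errors.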
https://github.com/chaostoolkit/chaostoolkit-kubernetes
https://help.gremlin.com/installation/#how-to-install-gremlin-with-kubernetes
https://github.com/bloomberg/powerfulseal
https://www.youtube.com/watch?v=00BMn0UjsG4
https://bit.ly/2uKOJMQ
https://github.com/chaoseng/wg-chaoseng
References
https://www.facebook.com/groups/chaosengkorea/
https://www.meetup.com/Korea-Chaos-Engineering-Community/
