MS SRE Workshop Notes Taken
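Define what level of each flow counts as healthy (e.g. a storage account responding within 5 ms is healthy), then analyse the failure and implement mitigations. Exercise: put pods into a crash loop for 10 minutes and ask what will happen; the expected answer is that response time will increase.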
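A minimal sketch of the "define healthy, then compare" idea, assuming the 5 ms figure is applied to the 95th percentile latency (the notes do not say which statistic it applies to). The function names and the sample latencies are hypothetical, made up for illustration only, not measurements from the workshop.

```python
from statistics import quantiles

# Threshold taken from the notes; applying it to p95 is an assumption.
HEALTHY_LATENCY_MS = 5.0


def p95(samples_ms):
    """Return the 95th percentile of latency samples (needs at least 2 samples)."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(samples_ms, n=100, method="inclusive")[94]


def is_healthy(samples_ms, threshold_ms=HEALTHY_LATENCY_MS):
    """A flow is treated as healthy when its p95 latency is within the threshold."""
    return p95(samples_ms) <= threshold_ms


# Hypothetical latency samples in milliseconds, for illustration only.
baseline = [2.1, 2.4, 3.0, 2.8, 3.3, 2.9, 4.1, 3.7]
during_crashloop = [2.5, 6.8, 9.4, 12.0, 7.7, 15.2, 11.1, 8.3]

if __name__ == "__main__":
    print("baseline healthy:", is_healthy(baseline))
    print("during crash loop healthy:", is_healthy(during_crashloop))
```

Running the same check before, during, and after the 10-minute crash loop gives a simple way to confirm that response time rose and later recovered.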
Getting to the public website is the critical flow; start by understanding what the first step in that flow is, e.g. the website itself.
Large customers run web and backend traffic in different clusters; small footprints can start from one cluster. Also covered: APIM for AI, better observability, an AKS egress node pool, and NSGs.
login for different apps cpus,
John runs some chaos experiments.
Chaos Mesh Overview | Chaos Mesh
Install the Chaos Studio prerequisites, then create the Chaos Studio target and experiments.