KrKn for Chaos Engineering in Managed Services

Gursimran Singh | 10 October 2024

What is Chaos Engineering?

Chaos engineering is a discipline for modern software systems that strengthens resilience by running controlled experiments against a system under realistic, live conditions. The practice surfaces weak points and guides the hardening of systems, which matters especially for the health of cloud-native applications.

The main objective of chaos engineering is to expose risks in the system under controlled conditions, before they surface in a production environment with active end users. By deliberately experimenting with failure scenarios that would otherwise be avoided at all costs, teams improve the stability, reliability, and robustness of their applications.

The Necessity of Chaos Engineering

As technology advances, software applications grow more complex. This is especially true of distributed systems, which are typically split into many components running in virtualized environments, so it becomes even more important to ensure that such applications can withstand all sorts of failures.

Several reasons make chaos engineering a must:

Resilience Improvement:

No system is perfect, whatever the organization or domain, be it cybersecurity, privacy, or health care. Where downtime must be prevented, timely chaos experiments let organizations find and fix system defects before they cause outages.

Enhanced Handling of Incidents:

Knowing how an application behaves during failure reduces ambiguity and speeds up recovery.

Reduced Release Anxiety:

When experiments have already shown that the system survives failures, teams are less worried about pushing releases live once they are ready.

Culture of Reliability:

Effective adoption of chaos engineering builds trust within teams and motivates them to seek better solutions for their systems.

Overview of KrKn

KrKn is an open-source tool dedicated to chaos engineering in Kubernetes environments. The shift from traditional static architectures to microservices, together with the rise of Kubernetes as the mainstream container orchestrator, created the need for such a chaos engineering tool.

KrKn helps teams simulate many failure situations so that their products are thoroughly stress-tested.

Key Features of KrKn

  1. Kubernetes Native: Because KrKn is built for Kubernetes, it can be deployed in managed Kubernetes environments such as Amazon EKS, Google GKE, and Azure AKS.

  2. Customizable Chaos Experiments: Users can define their own chaos scenarios in YAML and tailor them to specific applications.

  3. Observability and Monitoring: KrKn provides insight into how a system or application performs during chaos, so that metrics, logs, and traces can be captured.

  4. Automation and Rollbacks: KrKn can run chaos experiments automatically and undo the changes made during testing to restore normal operation.

  5. Rich Documentation and Community Support: As an open-source project, KrKn benefits from community contributions, and the available documentation helps users get started quickly.

Getting Started with KrKn

To implement KrKn in your Kubernetes environment, follow these steps: 

Step 1: Prerequisites 

Kubernetes Cluster: Ensure you have a running Kubernetes cluster. This can be a managed service like EKS, GKE, or AKS or a self-hosted cluster. 

kubectl: Install kubectl, the command-line tool for interacting with your Kubernetes cluster. 

Helm: Install Helm, the package manager for Kubernetes, to simplify the installation of KrKn. 

Step 2: Installing KrKn 

Add the KrKn Helm repository: 

        helm repo add krkn https://charts.krkn.dev
        helm repo update

Install KrKn: 

        helm install krkn krkn/krkn

Verify Installation: 

        kubectl get pods -n krkn
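The verification step can also be scripted. The following is a minimal Python sketch, not part of KrKn itself, that reads the JSON form of the same command (kubectl get pods -n krkn -o json) and lists pods that are not yet ready. The function names and the sample pod names are hypothetical and chosen only for illustration.

```python
import json
import subprocess


def unready_pods(pod_list: dict) -> list[str]:
    """Return the names of pods that are not Running with every container ready."""
    unready = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        all_ready = bool(containers) and all(c.get("ready", False) for c in containers)
        if status.get("phase") != "Running" or not all_ready:
            unready.append(name)
    return unready


def fetch_pods(namespace: str) -> dict:
    """Run kubectl and parse its JSON output (needs kubectl and cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Offline demonstration with a hand-written sample; the pod names are made up.
sample = {
    "items": [
        {
            "metadata": {"name": "krkn-controller"},
            "status": {"phase": "Running", "containerStatuses": [{"ready": True}]},
        },
        {
            "metadata": {"name": "krkn-worker"},
            "status": {"phase": "Pending", "containerStatuses": []},
        },
    ]
}
print(unready_pods(sample))
```

Against a real cluster, unready_pods(fetch_pods("krkn")) would return an empty list once every pod in the namespace is up.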

Step 3: Defining Chaos Experiments 

KrKn allows users to create chaos experiments through YAML files. Here’s an example of a basic chaos experiment that simulates a pod failure: 

apiVersion: krkn.io/v1
kind: ChaosExperiment
metadata:
  name: pod-failure-experiment
spec:
  selector:
    matchLabels:
      app: my-application
  action: terminate
  duration: 30s
  interval: 10s

In this example:

The experiment targets pods that carry the label app: my-application.

Pods are terminated over a 30-second window (duration), with a 10-second pause (interval) between successive steps.
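To make the timing concrete, here is a small Python sketch of how the duration and interval fields could translate into a schedule. It assumes that duration is the total experiment window and interval is the gap between terminations; this reading is an assumption, and KrKn's actual semantics may differ. The helper names are hypothetical.

```python
import re

# Seconds per supported unit suffix.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a string such as '30s' or '2m' into whole seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNITS[unit]


def termination_offsets(duration: str, interval: str) -> list[int]:
    """List the offsets (in seconds from the start) at which a termination fires."""
    total = parse_duration(duration)
    step = parse_duration(interval)
    if step <= 0:
        raise ValueError("interval must be positive")
    return list(range(0, total, step))


# Values taken from the example experiment above.
experiment = {"action": "terminate", "duration": "30s", "interval": "10s"}
offsets = termination_offsets(experiment["duration"], experiment["interval"])
print(offsets)
```

Under this assumption, the example experiment would trigger three terminations, at 0, 10, and 20 seconds.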

 

Step 4: Conducting Chaos Experiments

 

Save the experiment definition as pod-failure-experiment.yaml, then apply it with kubectl:

        kubectl apply -f pod-failure-experiment.yaml

Once the experiment is running, you can observe its effect on your application and services.

Step 5: Monitoring and Analyzing Results  

Once the chaos experiment concludes, its effects must be analyzed. With Prometheus and Grafana, teams can track the metrics that KrKn exposes. Such metrics include:

  • Response time
  • Error rate
  • System logs
  • System performance parameters
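As a rough illustration of how the first two metrics might be computed from raw request samples, here is a minimal Python sketch. The RequestSample type and the sample values are hypothetical and are not part of KrKn, Prometheus, or Grafana; in practice these figures would usually come from your monitoring stack.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestSample:
    """One observed request: its latency and the HTTP status it returned."""
    latency_ms: float
    status_code: int


def error_rate(samples: list[RequestSample]) -> float:
    """Fraction of requests that ended in a server error (HTTP 5xx)."""
    if not samples:
        return 0.0
    errors = sum(1 for s in samples if s.status_code >= 500)
    return errors / len(samples)


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up samples standing in for traffic recorded during an experiment.
samples = [
    RequestSample(120.0, 200),
    RequestSample(135.0, 200),
    RequestSample(150.0, 200),
    RequestSample(900.0, 503),
    RequestSample(140.0, 200),
]
print(f"error rate: {error_rate(samples):.1%}")
print(f"p95 latency: {percentile([s.latency_ms for s in samples], 95)} ms")
```

Comparing these numbers between a baseline run and a chaos run shows how much the injected failure actually degraded the service.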

 

Practical Tips for Chaos Engineering with KrKn

 

1. Start Small: Start with low-risk experiments that simulate minor failures. As you gain confidence in your systems, raise the bar gradually and run more complex tests.

2. Automate Experiments: Integrate chaos experiments into the regular CI/CD pipeline. By making chaos testing a standard practice early in development, you can catch problems that would otherwise surface late in the development cycle.

3. Document Findings: Keep a record of every experiment. Document the setup, the outcome, and the changes made to the system as a result. This documentation eases follow-up experiments and helps improve the system.

4. Involve the Team: Practice chaos engineering as a team wherever possible.

5. Learn from Failures: Chaos engineering is above all about learning, so treat each failure as a chance to improve your systems.
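For the automation tip, a pipeline stage needs some way to decide whether a chaos run passed. The sketch below shows one possible gate in Python. The Thresholds type, its default limits, and the function names are assumptions made for illustration; they are not a KrKn feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Limits a chaos run must stay within (illustrative defaults)."""
    max_error_rate: float = 0.05
    max_p95_latency_ms: float = 500.0


def evaluate_run(error_rate: float, p95_latency_ms: float, limits: Thresholds) -> list[str]:
    """Return a human-readable message for every limit the run exceeded."""
    violations = []
    if error_rate > limits.max_error_rate:
        violations.append(
            f"error rate {error_rate:.2%} exceeds {limits.max_error_rate:.2%}"
        )
    if p95_latency_ms > limits.max_p95_latency_ms:
        violations.append(
            f"p95 latency {p95_latency_ms} ms exceeds {limits.max_p95_latency_ms} ms"
        )
    return violations


def exit_code(violations: list[str]) -> int:
    """Map violations to a process exit code that a CI job can act on."""
    return 1 if violations else 0


# Example: a run measured during a chaos experiment (made-up numbers).
problems = evaluate_run(error_rate=0.12, p95_latency_ms=430.0, limits=Thresholds())
for problem in problems:
    print(problem)
print("exit code:", exit_code(problems))
```

A CI job could pass the result of exit_code to sys.exit so that a run breaching its limits fails the build.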

Common Chaos Scenarios to Test with KrKn 

These are some common chaos scenarios that teams can experiment with using KrKn:
  1. Pod Failures: Not all the pods will be active, and some will be surgically…
  2. Network Latency: Introduce artificial delays in the network traffic to test the application's reaction to delayed responses.
  3. Resource Exhaustion: CPU or memory available to specific pods is restricted, and the application's performance is evaluated under this limitation.
  4. Node Failures: The failure of an entire node is simulated in order to gauge the application's response.
  5. Service Interruption: Turn off a dependent service for a while to see how the application behaves when an important service is missing.
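For the pod failure scenario, it helps to bound how many pods are taken down at once. The following Python sketch picks a random subset of pods while always leaving at least one running. It only illustrates the idea and does not reflect how KrKn selects targets; the function and pod names are hypothetical.

```python
import math
import random


def pick_victims(pod_names: list[str], fraction: float, rng: random.Random) -> list[str]:
    """Choose a random share of pods to terminate, always sparing at least one."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(pod_names) < 2:
        return []  # never take down the only replica
    wanted = max(1, math.floor(len(pod_names) * fraction))
    count = min(wanted, len(pod_names) - 1)  # keep one pod alive
    return rng.sample(sorted(pod_names), count)


# Made-up replica names for an application labelled app: my-application.
pods = [
    "my-application-0",
    "my-application-1",
    "my-application-2",
    "my-application-3",
]
victims = pick_victims(pods, fraction=0.5, rng=random.Random(42))
print(victims)
```

Passing a seeded random.Random keeps the selection reproducible, which makes a failed experiment easier to replay.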

Benefits of Using KrKn for Chaos Engineering

  1. Kubernetes Specific: As a Kubernetes-centric tool, KrKn is built for seamless use with container technologies, making it well suited to organizations whose core infrastructure is Kubernetes.
  2. Open Source: Because KrKn is open source, developers in the wider community can contribute to it, which improves the tool itself.
  3. Cost-Effective: Organizations stress their systems in a controlled environment so that costly downtime is avoided in production.
  4. Scalable: KrKn can run chaos experiments in both pre-production and production environments, so it grows with your applications.
  5. Platform Interoperability: KrKn can be integrated into an organization's existing CI/CD practices, reinforcing its reliability work.

Challenges and Considerations

Here is a concise list of considerations when applying chaos engineering with tools such as KrKn:

  1. Preparedness for Failure: Ensure teams are well-prepared for potential failures when introducing faults.

  2. Supportive Culture: Foster a culture that encourages trial-and-error learning to support experimentation.

  3. User Impact: Prioritize minimizing negative effects on end users during chaos tests in production environments.

  4. Understanding Microservices: Recognize the complexity of microservices architecture and how various services interact.

  5. Monitoring and Control: Continuously monitor chaos experiments to prevent them from spiraling out of control.

  6. Timing of Experiments: Conduct chaos experiments during off-peak hours to reduce disruption during high-traffic times.

  7. Legal Considerations: Be aware of legal boundaries and obligations to avoid potential legal issues arising from chaos experiments.

  8. Identifying Barriers: Focus on identifying factors that hinder fault detection in challenging environments.
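Two of these considerations, monitoring and control and the timing of experiments, lend themselves to simple automated guards. The Python sketch below shows one way such guards might look. The window boundaries, the abort threshold, and the function names are illustrative assumptions, not KrKn settings.

```python
from datetime import time


def in_off_peak_window(now: time, start: time, end: time) -> bool:
    """True when `now` is inside [start, end), including windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_abort(error_rate: float, abort_threshold: float) -> bool:
    """True when the observed error rate has reached the abort threshold."""
    return error_rate >= abort_threshold


# Assumed off-peak window from 22:00 to 06:00 and a 25% abort threshold.
window_start, window_end = time(22, 0), time(6, 0)
print(in_off_peak_window(time(2, 30), window_start, window_end))
print(in_off_peak_window(time(12, 0), window_start, window_end))
print(should_abort(error_rate=0.30, abort_threshold=0.25))
```

A scheduler could call in_off_peak_window before starting an experiment, and a monitoring loop could call should_abort on each measurement to stop a run before it spirals out of control.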

Final Thoughts on Using KrKn for Chaos Engineering

KrKn helps teams apply chaos engineering effectively to their Kubernetes deployments. Although chaos testing has limitations, investigating specific threats builds confidence in the system and improves the team's ability to respond to incidents.

As companies grow in their operational and business activities, their systems become ever more complex and distributed. That is why many organizations welcome the adoption of chaos engineering principles. Organizations that combine KrKn with chaos engineering techniques can prepare their applications against degradation over time and unexpected disruptions.

It is also worth noting that chaos engineering can be applied at any stage of the development workflow and at every level of the team structure to build reliable, safer applications with less stress.
