
KEDA (Kubernetes Event-driven Autoscaling)

Written by Gursimran Singh | 07 October 2024

Introduction to KEDA (Kubernetes Event-driven Autoscaling)

Applications in today's digital world must handle varying amounts of work. Sometimes there is a burst of activity, such as during a sale or when a video goes viral; at other times, traffic is quiet. To deal with these variations, Kubernetes Event-driven Autoscaling (KEDA) scales the resources an application needs based on such events. This blog discusses what KEDA is and how it makes event-based scaling easier for software applications.

What is KEDA?

KEDA (Kubernetes-based Event-Driven Autoscaling) is an essential tool for optimizing resource management within Kubernetes clusters. It enables applications running in Kubernetes to automatically adjust their resource allocation based on real-time workload demands. By integrating seamlessly with Kubernetes' native features, it ensures that pods have the right amount of resources, scaling up during peak activity and scaling down during quieter periods. This dynamic resource allocation minimizes waste and enhances performance, making it easier for teams to manage fluctuating workloads efficiently within their Kubernetes environments.

Key Features of Kubernetes Event-Driven Autoscaler 

  • Scalability: KEDA scales the number of application instances up or down depending on the events it observes.

  • Flexibility: It can connect to many different event sources, such as message queues and databases.

  • Simplicity: It integrates natively with Kubernetes, which makes it accessible to most developers.

The Architecture of KEDA (Kubernetes Event-Driven Autoscaler)

Understanding how KEDA fits into your application helps you appreciate how powerful it is. It sits as a layer on top of Kubernetes, watching the events relevant to your application and deciding when it should scale up or down.

  Workflow of KEDA in Kubernetes 

Components: 

  • Scaler: The part of KEDA that connects to an event source and watches for events and metrics.

  • Controller: The controller carries out the scaling operations, making the necessary changes within the Kubernetes environment.

  • Metrics Adapter: Converts external metrics into a format that Kubernetes understands so they can be used in scaling decisions.

  • ScaledObject: A custom resource definition (CRD) that describes an application and its scaling requirements; it tells KEDA how the application should scale based on the configured triggers (see the sketch after this list).
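A minimal ScaledObject sketch is shown below. The Deployment name (order-processor), queue name, and threshold are illustrative assumptions, not details from this post:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    name: order-processor        # hypothetical Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: rabbitmq             # scale on RabbitMQ queue length
      metadata:
        queueName: orders        # hypothetical queue
        mode: QueueLength
        value: "20"              # target messages per replica
        hostFromEnv: RABBITMQ_HOST   # env var on the target holding the connection string
```

Applying this resource is all that is needed; KEDA creates and manages the underlying HPA itself.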

 Understanding the Workflow 

  • KEDA is deployed, and a ScaledObject is defined with Apache Pulsar as the event source in this setup.

  • When messages arrive from Pulsar, the Scaler, which is watching for events, reports them to the Metrics Adapter, which translates them into a form the Kubernetes controller can consume.

  • The controller makes the scaling decision based on the scaler and the metrics adapter, and drives the native Kubernetes HPA (Horizontal Pod Autoscaler).

  • The HPA spins up additional pods for the workload referenced in the ScaledObject; if a pod remains in the Pending state, the cluster's node autoscaler provisions a new node so the pod can be scheduled. An example of this Pulsar setup is sketched below.
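A hedged sketch of the Pulsar setup described above, using KEDA's Apache Pulsar scaler; the Deployment, topic, subscription, and admin endpoint names are illustrative assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: pulsar-consumer-scaler
spec:
  scaleTargetRef:
    name: pulsar-consumer        # hypothetical consumer Deployment
  triggers:
    - type: pulsar
      metadata:
        adminURL: http://pulsar-admin.pulsar:8080     # hypothetical Pulsar admin endpoint
        topic: persistent://public/default/orders     # hypothetical topic
        subscription: order-workers                   # hypothetical subscription
        msgBacklogThreshold: "10"                     # scale out as the subscription backlog grows
```

When messages back up on the subscription, the scaler reports the backlog, the metrics adapter exposes it, and the HPA that KEDA manages adds consumer pods.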

Supported Scalers

KEDA can connect to many sources: 

  • Message Queues: Pulsar, RabbitMQ, Kafka, etc.

  • Databases: Redis and SQL databases such as PostgreSQL and MySQL

  • Cloud Services: Azure, Google Cloud Platform (GCP), and Amazon Web Services (AWS). 


Event-Based Scaling Explained

Event-based scaling means that application resources are scaled in response to events. A restaurant is a useful analogy: it may hire extra staff during busy periods to keep response times quick, and let extra staff go during slow periods to avoid running at a loss.

 

Similarly, KEDA (Kubernetes-based Event-Driven Autoscaling) allows applications to scale up and down in real time, so that resources are not wasted and the application remains fully responsive to its users.

This approach keeps performance high while keeping costs under control, since resources are allocated according to actual usage.

How Event-Based Scaling works 

When an event occurs, KEDA assesses whether the available resources are sufficient for the demand that event creates. If incoming events exceed what the application can handle, KEDA scales up by creating more instances of the application. Conversely, if event traffic drops, it reduces the number of instances to avoid wasting resources.


Consider, for example, an online store during a holiday sale. When customers flood the site to make purchases, KEDA detects the rise in order requests and creates more application instances to absorb the increased traffic. After the sale ends and customer activity decreases, it scales the instances back down, cutting costs while maintaining optimal performance.

Benefits of Event-Based Scaling 

  • Cost Efficiency: One of the primary benefits is that event-based scaling lets you use resources only when they are actually required. This helps organizations optimize their cloud infrastructure spend, because they do not pay for capacity they never use.

  • Improved Performance: Applications perform at their best because they adapt to actual demand. This matters most during high-traffic periods: users do not wait long, and the application does not crash.

  • Flexibility and Responsiveness: Event-based scaling helps applications adapt to changing needs quickly. Whether usage shoots up overnight or traffic dips on a regular schedule, KEDA reacts in real time, keeping the application efficient and dependable.

  • Simple Management: KEDA takes care of scaling, lifting the responsibility from developers' shoulders and leaving them more time for new features and innovation rather than worrying about resources. This automation reduces the operational load and lets teams work more efficiently.

Scaling Policies in KEDA

Scaling policies in KEDA define how an application adjusts its resource allocation in response to different metrics and events. Well-designed policies allow an application to deliver consistent performance as its resource demands change.

Horizontal Pod Autoscaling 

KEDA's foundational scaling mechanism is the Horizontal Pod Autoscaler (HPA). This feature automatically scales pods, the running instances of the application, according to usage.
When you use HPA with KEDA, you configure the metrics that should trigger a scaling operation. For instance, a common rule is to add more pods when CPU utilization stays above 70% for a sustained period, spreading the additional load across them. If CPU demand falls back below the threshold, the number of pods is reduced so resources are not wasted; a hedged example follows.
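A minimal sketch of that CPU rule expressed through KEDA's built-in cpu scaler; the Deployment name and replica bounds are illustrative assumptions (note that CPU-based scaling relies on the pods declaring CPU resource requests):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-frontend-cpu-scaler
spec:
  scaleTargetRef:
    name: web-frontend           # hypothetical Deployment
  minReplicaCount: 2             # cpu-based scaling cannot go to zero
  maxReplicaCount: 20
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"              # add pods when average CPU utilization exceeds 70%
```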

Custom Metrics and Scalers 

Another excellent feature of KEDA is its support for custom metrics and scalers. Whereas traditional scaling is mostly driven by values such as CPU or memory usage, KEDA can scale on whatever metric matters most for a given application. Suppose, for example, you are in charge of a messaging application backed by Pulsar.

 

You might want to scale the consumer based on how many messages are waiting in the queue, which tracks true demand much more closely. KEDA can be configured to watch the queue and scale on its length, ensuring your application keeps up with incoming requests without placing too much load on any single instance.

Defining Scaled Objects 

One of the basic concepts when working with KEDA is the ScaledObject. This is a custom resource, created alongside an application, that defines how that application will scale.

 

  • Triggers: The specific conditions or events that initiate scaling actions. For instance, you can write a trigger that scales the application based on the number of messages waiting to be consumed in a Kafka topic.


  • Polling Interval: This indicates how frequently the system evaluates the defined triggers to determine whether a scaling action is required. A short interval addresses changes in demand sooner but incurs higher overhead; a long interval minimizes resource consumption but responds more slowly.


  • Min and Max Replicas: These settings define the minimum and maximum number of pods KEDA is allowed to run for the workload. They help avoid over-provisioning during eventless periods and under-provisioning during periods of high demand; the sketch after this list shows all three fields together.
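A hedged sketch combining the three settings above with a Kafka trigger; the Deployment, broker address, consumer group, and topic are illustrative assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: checkout-worker-scaler
spec:
  scaleTargetRef:
    name: checkout-worker              # hypothetical Deployment
  pollingInterval: 15                  # evaluate the trigger every 15 seconds
  minReplicaCount: 0                   # scale to zero when the topic is idle
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.svc:9092   # hypothetical broker address
        consumerGroup: checkout            # hypothetical consumer group
        topic: checkout-events             # hypothetical topic
        lagThreshold: "100"                # target consumer lag per replica
```

A shorter pollingInterval makes the lag check more responsive, at the cost of more frequent queries against the broker.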

Challenges and Considerations 

Challenges 

  • Limitations of KEDA: Some applications do not need event-driven scaling, and some event sources may lack support or require custom scalers.

  • Performance Issues: Poorly tuned triggers can cause "flapping," where resources scale up and down so frequently that the environment becomes unstable and resources are contested.

  • Complex Configuration: Metrics, triggers, and scaling policies are more intricate than vanilla Deployments, and they may not be easy for anyone with limited Kubernetes experience.

  • Security Risks: External event sources may expose a large amount of data, so access to them must be secured with properly managed credentials; a common pattern is sketched after this list.
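One common way to handle credentials is KEDA's TriggerAuthentication resource, which lets a trigger pull them from a Kubernetes Secret instead of embedding them in the ScaledObject. A hedged sketch; the Secret name and key are illustrative assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth
spec:
  secretTargetRef:
    - parameter: host               # trigger parameter to populate
      name: rabbitmq-credentials    # hypothetical Kubernetes Secret
      key: connection-string        # key inside that Secret
```

A trigger then references it with authenticationRef: {name: rabbitmq-auth}, so the connection string never appears in the ScaledObject spec.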

 Considerations 

  • Monitoring Needs: Monitoring is also important to track KEDA's performance and the efficacy of the scaling policies implemented; one way of doing this is with Prometheus.

  • Fallback Strategies: Measures for addressing any concerns that may arise, such as potential downtime or poor performance of external services, can help minimize the risks of dependencies. 

Conclusion of KEDA

KEDA is an effective tool for managing application resources based on real-time events. It enhances performance and cost efficiency, particularly in dynamic environments. As technology evolves, tools like KEDA will be essential for keeping applications running smoothly, regardless of demand.