Autoscaling Kubernetes Workloads on AWS EKS with KEDA
The client operated a microservices platform on AWS EKS, scaled by the standard Kubernetes Horizontal Pod Autoscaler (HPA) on CPU and memory. While this model worked well for synchronous services, one queue-driven document processing service presented a challenge: its load arrived in bursts that CPU and memory metrics registered too late.
Client Overview
Merciv, Inc. is an AI and data analytics company that transforms strategic business intelligence workflows for the retail industry and consumer packaged goods (CPG) brands. At the core of their offering is the Evolving Virtual Analyst (EVA), an AI-driven platform that breaks down data silos by ingesting and unifying information from multiple sources. EVA enables users to simply ask business questions in plain language and receive data-driven answers with intelligent visualisations, empowering them to optimise workflows and discover new growth opportunities.
Understanding their clients' need for the highest levels of reliability, Merciv maintains a comprehensive security program with detailed policies that protect both company and customer data. Merciv's robust security framework includes least-privilege access management with MFA, stringent change management and vulnerability remediation, and secure development practices integrated throughout the product lifecycle. Through Merciv's solutions, customer organisations can extract maximum value from their data and scale their operations like never before.
Challenge
Standard autoscaling methods based on CPU and memory usage were not sufficient for the client’s asynchronous workloads. To maintain performance under unpredictable load, a more event-driven approach was required.
Asynchronous Workload Pattern
Incoming tasks were queued before processing, leading to a delay between workload arrival and resource consumption.
Delayed Autoscaling Reactions
Kubernetes HPA reacted only after CPU or memory usage had increased, by which time service bottlenecks had already occurred.
Burst Load Sensitivity
Large, sudden spikes in document processing requests could overwhelm the system before autoscaling caught up.
Operational Constraints
Any solution needed to integrate seamlessly without introducing architectural complexity or requiring a full-scale redesign.
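The lag described above follows from the rule HPA uses for resource metrics, which the Kubernetes documentation gives as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below is a simplified illustration of that rule, not the controller's actual code: it ignores the tolerance band, stabilisation windows and replica limits, and the replica counts and percentages are hypothetical. The point is that queue backlog is not an input at all.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_utilization: float,
                         target_utilization: float) -> int:
    """Simplified HPA rule: ceil(currentReplicas * currentMetric / targetMetric).

    Ignores the controller's tolerance band, stabilisation windows and
    min/max replica limits.
    """
    return math.ceil(current_replicas * current_utilization / target_utilization)


# Hypothetical numbers: 2 replicas with a 60% CPU target.
# Messages may already be waiting in the queue, but CPU still sits at the
# target, so the recommendation does not move.
print(hpa_desired_replicas(2, 60, 60))

# Only once the workers are visibly busy does the recommendation grow.
print(hpa_desired_replicas(2, 95, 60))
```

However many documents are queued, the first call keeps the replica count unchanged; the scale-out starts only when utilisation itself climbs.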
Solution
To bridge the gap between workload demand and scaling responsiveness, Bion Consulting implemented Kubernetes Event-Driven Autoscaling (KEDA) for the queue-driven service, while retaining HPA for the rest of the system.
1. Event-Driven Scaling with KEDA
KEDA enabled autoscaling based on queue length rather than waiting for CPU or memory usage to rise. This allowed the service to scale when messages arrived, not after they piled up.
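As a rough illustration of queue-length scaling: KEDA exposes the backlog to HPA as an external metric, and when that metric is targeted as an average value per replica, the HPA rule reduces to the backlog divided by the per-replica target, rounded up. The function name and the figures below are hypothetical and not taken from the client's configuration.

```python
import math


def replicas_for_backlog(queue_length: int, target_per_replica: int) -> int:
    """Replicas needed so that each handles at most target_per_replica messages."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    return math.ceil(queue_length / target_per_replica)


# Hypothetical burst: the backlog jumps from 10 to 400 messages,
# with an assumed target of 20 messages per replica.
for backlog in (10, 400):
    print(backlog, "->", replicas_for_backlog(backlog, 20))
```

Because the input is the backlog itself, the recommendation rises as soon as messages arrive, before any worker shows elevated CPU.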
2. Hybrid Autoscaling Architecture
We designed a hybrid scaling strategy:
- Queue length as the primary trigger for scaling
- CPU/memory metrics as guardrails
- Min/max replica counts to ensure platform stability
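One way such a hybrid policy could be expressed is a single KEDA ScaledObject with a queue trigger, CPU and memory triggers, and replica bounds. The sketch below builds that manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The source does not name the queue technology, so the Amazon SQS trigger is an assumption, as are all names, URLs and thresholds; authentication (for example a TriggerAuthentication) is omitted.

```python
import json


def build_scaled_object(name, deployment, queue_url, region,
                        queue_length=20, cpu_pct=75, memory_pct=80,
                        min_replicas=1, max_replicas=20):
    """Return a KEDA ScaledObject manifest as a plain dict (illustrative values)."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {"name": deployment},
            # Replica bounds that keep the platform stable.
            "minReplicaCount": min_replicas,
            "maxReplicaCount": max_replicas,
            "triggers": [
                # Primary trigger: queue backlog (assumed here to be SQS).
                {"type": "aws-sqs-queue",
                 "metadata": {"queueURL": queue_url,
                              "queueLength": str(queue_length),
                              "awsRegion": region}},
                # Guardrails: resource utilisation, relative to pod requests.
                {"type": "cpu", "metricType": "Utilization",
                 "metadata": {"value": str(cpu_pct)}},
                {"type": "memory", "metricType": "Utilization",
                 "metadata": {"value": str(memory_pct)}},
            ],
        },
    }


# Hypothetical names and queue URL, for illustration only.
manifest = build_scaled_object(
    name="document-processor-scaler",
    deployment="document-processor",
    queue_url="https://sqs.eu-west-1.amazonaws.com/123456789012/document-jobs",
    region="eu-west-1",
)
print(json.dumps(manifest, indent=2))
```

With several triggers on one ScaledObject, the underlying HPA follows whichever trigger asks for the most replicas, so the CPU and memory entries only take over when the queue signal alone would leave pods saturated. The CPU and memory triggers require resource requests to be set on the target pods.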
3. Minimal Operational Overhead
By isolating KEDA to a single service and leaving the other services on HPA, we avoided unnecessary operational burden and architectural complexity. This targeted integration allowed the document processing service to scale in line with the work actually arriving on its queue.
Results
Faster Autoscaling Response
The system scaled as soon as queue volume grew, rather than waiting for resource load to rise.
Improved Queue Processing Times
Backlogs were cleared more efficiently during peak demand, enhancing throughput.
More Predictable Resource Usage
Scaling matched real workload patterns, reducing both overprovisioning during quiet periods and resource shortfalls during bursts.
Reduced Complexity
The hybrid model allowed tailored scaling per service without changing the overall architecture.
Technology Stack
To implement event-driven autoscaling for asynchronous services, the following technologies were used:
- Cloud Platform: AWS EKS, EC2
- Container Management: Kubernetes, Helm
- Autoscaling: Kubernetes HPA, KEDA
- Event Source: Message Queue-based Trigger
- Monitoring & Metrics: Kubernetes-native metrics and queue backlog indicators
By combining KEDA with traditional HPA, Bion enabled a smarter, more adaptive scaling strategy aligned with real-world demand.