AI Workload Scheduling in Heterogeneous Cloud Environments

TITLE	AI Workload Scheduling in Heterogeneous Cloud Environments
ABSTRACT	Modern cloud environments are increasingly heterogeneous, combining hardware accelerators such as GPUs, TPUs, and FPGAs with traditional CPUs. Scheduling AI workloads across this heterogeneous infrastructure is essential for maximizing performance, energy efficiency, and cost-effectiveness. This paper investigates AI workload scheduling techniques tailored for heterogeneous cloud environments. We propose a hybrid scheduling framework that integrates profiling-based workload characterization, multi-objective optimization, and an adaptive runtime scheduler. Workloads are profiled offline to capture compute intensity, memory requirements, and accelerator affinity. The scheduler then employs a Pareto-front-based algorithm to map tasks onto the most suitable resources, balancing throughput, execution time, energy consumption, and monetary cost. We implemented the framework atop Kubernetes, extending its scheduler with custom resource types and decision modules. Our evaluation, conducted on a multi-node cluster with CPU, GPU, and FPGA nodes, spans a range of AI workloads, including CNN training, transformer inference, and reinforcement learning tasks. Results demonstrate that our framework reduces average job completion time by up to 35%, lowers energy use by 22%, and cuts cost by 18% compared to baseline round-robin or least-loaded strategies. Furthermore, the system adapts dynamically to changes in workload and resource availability with minimal overhead. We discuss trade-offs between scheduling latency, resource fragmentation, and optimization quality. The contributions of this work are: (1) a novel hybrid scheduling framework for heterogeneous AI workloads; (2) a dynamic mapping strategy with multi-objective optimization; and (3) an empirical evaluation demonstrating substantial improvements in performance, energy efficiency, and cost. The insights derived can guide cloud providers and practitioners in deploying scalable, efficient AI services across heterogeneous infrastructures.
AUTHOR	Amrit Lal Nagar, Amity School of Languages, Lucknow, India
VOLUME	12
DOI	10.15680/IJARETY.2025.1204088
PDF	88_AI Workload Scheduling in Heterogeneous Cloud Environments.pdf
KEYWORDS
