
International Journal of Advanced Research in Education and Technology (IJARETY)
International, Double Blind-Peer Reviewed & Refereed Journal, Open Access Journal
|Approved by NSL & NISCAIR |Impact Factor: 8.152 | ESTD: 2014|


Article

TITLE AI Workload Scheduling in Heterogeneous Cloud Environments
ABSTRACT Modern cloud environments are increasingly heterogeneous, comprising hardware accelerators such as GPUs, TPUs, and FPGAs alongside traditional CPUs. Scheduling AI workloads across this heterogeneous infrastructure is essential for maximizing performance, energy efficiency, and cost-effectiveness. This paper investigates AI workload scheduling techniques tailored for heterogeneous cloud environments. We propose a hybrid scheduling framework that integrates profiling-based workload characterization, multi-objective optimization, and an adaptive runtime scheduler. Workloads are profiled offline to capture compute intensity, memory requirements, and accelerator affinity. The scheduler then employs a Pareto-front-based algorithm to map tasks onto the most suitable resources, balancing throughput, execution time, energy consumption, and monetary cost. We implemented the framework atop Kubernetes, extending its scheduler with custom resource types and decision modules. Our evaluation, conducted on a multi-node cluster with CPU, GPU, and FPGA nodes, spans a range of AI workloads, including CNN training, transformer inference, and reinforcement learning tasks. Results demonstrate that our framework reduces average job completion time by up to 35%, lowers energy use by 22%, and cuts cost by 18% compared to baseline round-robin and least-loaded strategies. Furthermore, the system adapts dynamically to changes in workload and resource availability with minimal overhead. We discuss trade-offs between scheduling latency, resource fragmentation, and optimization quality. The contributions of this work are: (1) a novel hybrid scheduling framework for heterogeneous AI workloads; (2) a dynamic mapping strategy with multi-objective optimization; and (3) an empirical evaluation demonstrating substantial improvements in performance, energy efficiency, and cost. The insights derived can guide cloud providers and practitioners in deploying scalable, efficient AI services across heterogeneous infrastructures.
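The Pareto-front mapping step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Placement` type, the node names, the objective values, and the equal-weight tie-breaking rule in `pick` are all hypothetical and chosen only to show how dominated placements might be discarded before a final choice is made. All three objectives (execution time, energy, cost) are assumed to be minimized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """One candidate (task -> node) mapping with illustrative, made-up estimates."""
    node: str
    time_s: float     # estimated execution time, seconds
    energy_j: float   # estimated energy, joules
    cost_usd: float   # estimated monetary cost

    def objectives(self):
        return (self.time_s, self.energy_j, self.cost_usd)


def dominates(a, b):
    """a dominates b if a is no worse on every objective and strictly better on one."""
    ao, bo = a.objectives(), b.objectives()
    return all(x <= y for x, y in zip(ao, bo)) and any(x < y for x, y in zip(ao, bo))


def pareto_front(candidates):
    """Keep only placements that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


def pick(front, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Assumed selection rule: min-max normalize each objective over the front,
    then choose the placement with the lowest weighted sum."""
    cols = list(zip(*(c.objectives() for c in front)))
    lo = [min(col) for col in cols]
    span = [(max(col) - min(col)) or 1.0 for col in cols]  # avoid division by zero

    def score(c):
        return sum(w * (v - l) / s
                   for w, v, l, s in zip(weights, c.objectives(), lo, span))

    return min(front, key=score)


# Hypothetical estimates for a single task on four nodes.
candidates = [
    Placement("cpu-node", 120.0, 900.0, 0.10),
    Placement("gpu-node", 30.0, 600.0, 0.25),
    Placement("fpga-node", 45.0, 300.0, 0.30),
    Placement("gpu-node-busy", 40.0, 700.0, 0.30),  # dominated by gpu-node
]

front = pareto_front(candidates)
best = pick(front)
print([c.node for c in front], "->", best.node)
```

In this toy input, `gpu-node-busy` is worse than `gpu-node` on every objective and is filtered out; the remaining three placements each win on one objective, so the final choice depends entirely on the weights passed to `pick`.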
AUTHOR Amrit Lal Nagar, Amity School of Languages, Lucknow, India
PUBLICATION DATE 2025-09-17
VOLUME 12
DOI 10.15680/IJARETY.2025.1204088
PDF 88_AI Workload Scheduling in Heterogeneous Cloud Environments.pdf
KEYWORDS