Optimizing Generative AI on Arm Processors: from Edge to Cloud
edX
Course
Intermediate
Free to Audit
Certificate

Arm Education

Efficient AI is more than algorithms. This hands-on course shows you how to optimize GenAI workloads on real-world systems using techniques such as SIMD (SVE, Neon), low-bit quantization, and the optimized KleidiAI library. You will develop essential strategies and skills to build scalable, high-performance AI on edge and cloud-based platforms based on the most widespread processor architecture.

4 hrs/week, 4 weeks, English, 455 enrolled

About this Course

AI models are becoming increasingly powerful, but also increasingly demanding. As generative AI moves from cloud data centers to mobile phones, autonomous systems, and embedded IoT devices, the need to optimize performance across diverse hardware environments has never been more critical. Arm-based processors power more than 300 billion devices globally, from smartphones to hyperscale cloud servers, making them a key foundation for efficient AI deployment across the compute landscape. To meet this growing demand, learners need the skills to translate machine learning models into real-time, hardware-aware implementations across Arm-based platforms.

Optimizing Generative AI on Arm Processors: from Edge to Cloud is designed for intermediate machine learning practitioners who want to bridge the gap between model design and deployment efficiency. Rather than revisiting ML fundamentals, this course dives straight into performance engineering for generative AI on Arm-based platforms, including edge and cloud environments. You’ll explore real-world constraints, Arm architecture features, and software techniques used to accelerate AI inference, including SIMD (SVE, Neon), low-bit quantization, and the KleidiAI library. Each concept is taught using concise, interactive notebooks and narrated examples, enabling you to measure, tweak, and iterate on actual hardware like the Raspberry Pi 5 or AWS Graviton3 cloud instances.

This course consists of four modules and hands-on lab exercises:

  • Module 1: Challenges Facing Cloud and Edge GenAI Inference. Understanding the limitations and constraints of AI inference in different environments.
  • Module 2: Generative AI Models. Exploring model architectures, training methodologies, and deployment considerations.
  • Module 3: ML Frameworks and Optimized Libraries. A deep dive into AI software stacks, including PyTorch, llama.cpp, and Arm-specific optimizations.
  • Module 4: Optimization for CPU Inference. Techniques such as quantization, pruning, and leveraging SIMD instructions for faster AI performance.

What You'll Learn

  • You will learn how to optimize AI inference using Arm-specific techniques such as SIMD (SVE, Neon) and low-bit quantization. The course covers practical strategies for running generative AI efficiently on edge and cloud-based Arm platforms. You will also explore the trade-offs between cloud and edge deployment, gaining both theoretical knowledge and hands-on skills. By the end of this course, you will have a strong foundation in deploying high-performance AI models on Arm hardware.

Prerequisites

  • This course assumes a foundational understanding of machine learning, including completion of a basic introductory course, such as one at the undergraduate level. To run the laboratory exercises, we assume you have access to a Raspberry Pi 5 and an Arm-based cloud instance. We have validated the labs on AWS Graviton; other cloud platforms may require modification to run. If you’re new to the field, we recommend starting with Arm’s AI and ML learning pathway, a structured journey that builds essential knowledge and skills from the ground up. The pathway includes courses that introduce concepts such as the fundamentals of AI, hands-on projects using ML models with open-source frameworks on Arm-based development boards, and deploying LLMs on mobile devices.

Instructors

Oliver Grainge

AI Researcher

Kieran Hejmadi

Software and Academic Ecosystem Development Manager

Topics

Data Centers
Autonomous System
Mobile Phones
Pruning
Performance Engineering
Algorithms
Amazon Web Services
Quantization
PyTorch (Machine Learning Library)
Artificial Intelligence
Machine Learning
Scalability

Course Info

Platform: edX
Level: Intermediate
Pacing: Unknown
Certificate: Available
Price: Free to Audit

Skills

Data Centers
Autonomous Systems
Mobile Phones
Pruning
Performance Engineering
Algorithms
Amazon Web Services
Quantization
PyTorch (Machine Learning Library)
Artificial Intelligence
