Build Multimodal Generative AI Applications

IBM

Explore multimodal AI combining text, images, and speech to build advanced and interactive generative AI applications.

3 weeks | English

About this Course

Ready to level up your GenAI skills? Step into the exciting world of multimodal AI, where language, images, and speech come together to build smarter, more interactive applications. In this hands-on course, you’ll learn how to build systems that work across multiple modalities, from creating AI-powered storytellers and meeting assistants to developing image captioning tools and video generation apps. You’ll gain experience with real-world tools like IBM’s Granite, OpenAI’s Whisper, Sora and D

What You'll Learn

  • Build skills to develop multimodal generative AI applications
  • Understand fundamental concepts and challenges of multimodal AI
  • Build AI applications using models like Granite, Whisper, and DALL·E
  • Develop chatbots and image/video generation models using advanced frameworks

Instructors

Hailey Quach

Ricky Shi

Topics

Web Development
OpenAI API
Application Development
Multimodal Prompts
Flask (Web Framework)
Generative Model Architectures
Software Development
Prompt Engineering
Web Applications
LLM Application

Course Info

Platform: Coursera
Level: Unknown
Pacing: Unknown
Price: Free

Skills

Web Development
OpenAI API
Application Development
Multimodal Prompts
Flask (Web Framework)
Generative Model Architectures
Software Development
Prompt Engineering
Web Applications
LLM Application
