
In this course, you will:
Gain the skills to expose large language models through REST API endpoints.
Learn how to configure the llama.cpp server to customize model behavior.
Understand how to efficiently handle requests and integrate language model capabilities into applications.
Reinforce concepts through hands-on exercises and code examples using tools like curl and Python.
Be equipped to deploy robust language model APIs for various NLP tasks.

The course empowers you to harness state-of-the-art NLP models in your projects through a convenient and performant API interface, focusing on the practical aspects of serving large language models in production environments using the efficient and flexible llama.cpp framework.
Alfredo Deza
Adjunct Assistant Professor in the Pratt School of Engineering
Noah Gift
Executive in Residence and Founder of Pragmatic AI Labs