Generative AI with LangChain

You're reading from Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs

Product type: Paperback
Published in: Dec 2023
Publisher: Packt
ISBN-13: 9781835083468
Length: 368 pages
Edition: 1st Edition
Author (1):
Ben Auffarth

Table of Contents (13 chapters)

Preface
1. What Is Generative AI?
2. LangChain for LLM Apps
3. Getting Started with LangChain
4. Building Capable Assistants
5. Building a Chatbot Like ChatGPT
6. Developing Software with Generative AI
7. LLMs for Data Science
8. Customizing LLMs and Their Output
9. Generative AI in Production
10. The Future of Generative Models
11. Other Books You May Enjoy
12. Index

How to evaluate LLM apps

The goal of evaluating LLMs is to understand their strengths and weaknesses, so as to enhance accuracy and efficiency while reducing errors, thereby maximizing their usefulness in solving real-world problems. This evaluation process typically occurs offline during the development phase. Offline evaluations provide initial insights into model performance under controlled test conditions and include aspects such as hyperparameter tuning and benchmarking against peer models or established standards. They offer a necessary first step toward refining a model before deployment.
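
As a rough illustration of offline benchmarking against a peer model, the following minimal sketch scores two models on the same small, fixed test set using exact-match accuracy. Everything here is an assumption made for illustration: `model_a` and `model_b` are hypothetical stand-ins for real LLM calls, `TEST_SET` is a toy dataset, and exact match is only one of many possible metrics.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for real LLM calls; each maps a prompt to an answer.
def model_a(prompt: str) -> str:
    canned = {"2 + 2": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "unknown")

def model_b(prompt: str) -> str:
    canned = {"2 + 2": "4"}
    return canned.get(prompt, "unknown")

# Toy test set of (prompt, expected answer) pairs, fixed in advance so that
# every model is evaluated under the same controlled conditions.
TEST_SET: List[Tuple[str, str]] = [
    ("2 + 2", "4"),
    ("Capital of France?", "Paris"),
    ("Opposite of hot?", "cold"),
]

def exact_match_accuracy(
    model: Callable[[str], str], test_set: List[Tuple[str, str]]
) -> float:
    """Fraction of prompts whose answer matches the expected string."""
    correct = sum(
        model(prompt).strip().lower() == expected.strip().lower()
        for prompt, expected in test_set
    )
    return correct / len(test_set)

def benchmark(
    models: Dict[str, Callable[[str], str]], test_set: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Score every model on the same test set for a side-by-side comparison."""
    return {name: exact_match_accuracy(fn, test_set) for name, fn in models.items()}

scores = benchmark({"model_a": model_a, "model_b": model_b}, TEST_SET)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In a real setup, the stand-in functions would wrap actual model calls, and the test set would be large enough for the score differences to be meaningful.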

While human assessments are sometimes seen as the gold standard, they are hard to scale and require careful design to avoid bias from subjective preferences or authoritative tones. Many standardized benchmarks exist: MBPP tests basic programming skills, while GSM8K targets multi-step mathematical reasoning. API-Bank evaluates models’ aptitudes for making decisions...
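
To make benchmark scoring more concrete, here is a minimal sketch of how an answer to a GSM8K-style problem might be checked automatically. GSM8K reference solutions end with a line of the form `#### <answer>`. Taking the last number in the model's output as its final answer is a common heuristic assumed here, not an official scoring rule, and the helper names and the sample reference text are hypothetical.

```python
import re
from typing import Optional

def extract_final_number(text: str) -> Optional[str]:
    """Return the last number in the text, with thousands separators removed."""
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    if not matches:
        return None
    return matches[-1].replace(",", "")

def is_correct(model_output: str, reference: str) -> bool:
    """Compare the model's final number with the number after '####'."""
    reference_answer = reference.split("####")[-1]
    predicted = extract_final_number(model_output)
    expected = extract_final_number(reference_answer)
    if predicted is None or expected is None:
        return False
    # Compare numerically so that "72" and "72.0" count as the same answer.
    return float(predicted) == float(expected)

# Illustrative reference written in the GSM8K answer format (not real data).
REFERENCE = "48 / 2 = 24\n48 + 24 = 72\n#### 72"

print(is_correct("Half of 48 is 24, so the total is 48 + 24 = 72.", REFERENCE))
print(is_correct("I think the answer is 70.", REFERENCE))
```

Scoring only the final number keeps the check cheap and scalable, but it says nothing about whether the intermediate reasoning was sound.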
