Learning how to use LangSmith to evaluate your project
Remember that different types of AI agents require different evaluation approaches. There is no one-size-fits-all method, and it’s up to you to choose a methodology based on your use case. A Q&A agent that only needs to support knowledge of a specific domain (for example, company-specific information from a RAG system) is simpler to evaluate, while an agent that supports transactional conversations needs a more complex evaluation implementation, because you need to be absolutely sure it will consistently complete its task.
With an intent-based system, you can accurately control each step of the conversation. With an LLM-powered conversational agent, you control the agent’s actions and capabilities through prompts, which in my opinion is a more nuanced and volatile approach.
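To make the difference between the two evaluation styles concrete, here is a minimal sketch in plain Python. It does not use LangSmith. The two agents (`qa_agent`, `booking_agent`), the dataset, and the scoring functions are all hypothetical stand-ins I made up for illustration. The Q&A check only looks for an expected fact in the answer, while the transactional check runs the agent several times and verifies that every required step happened, in order, on each run.

```python
from typing import Callable, Dict, List


# Hypothetical stand-in for a RAG-backed Q&A agent (assumption, not a real API).
def qa_agent(question: str) -> str:
    answers = {"What is the refund window?": "Refunds are accepted within 30 days."}
    return answers.get(question, "I don't know.")


# Hypothetical stand-in for a transactional agent; it returns the ordered
# list of actions it took while handling the request.
def booking_agent(request: str) -> List[str]:
    return ["collect_dates", "check_availability", "confirm_booking"]


def evaluate_qa(agent: Callable[[str], str], dataset: List[Dict[str, str]]) -> float:
    """Fraction of answers that contain the expected key fact."""
    hits = sum(
        1
        for example in dataset
        if example["expected"].lower() in agent(example["question"]).lower()
    )
    return hits / len(dataset)


def evaluate_task(
    agent: Callable[[str], List[str]],
    request: str,
    required_steps: List[str],
    runs: int = 5,
) -> float:
    """Fraction of repeated runs in which every required step occurred in order."""

    def completed(actions: List[str]) -> bool:
        # Consuming one shared iterator makes this an ordered subsequence check.
        remaining = iter(actions)
        return all(step in remaining for step in required_steps)

    return sum(completed(agent(request)) for _ in range(runs)) / runs


qa_score = evaluate_qa(
    qa_agent, [{"question": "What is the refund window?", "expected": "30 days"}]
)
task_score = evaluate_task(
    booking_agent, "Book a room", ["collect_dates", "confirm_booking"]
)
print(f"Q&A accuracy: {qa_score:.2f}, task completion rate: {task_score:.2f}")
```

Repeating the transactional run is the important design choice here: a single successful conversation says little about consistency when the agent’s behavior is steered by prompts rather than fixed intents.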
LangSmith...