References
- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment, Y. Liu et al., 2023
- Post Turing: Mapping the landscape of LLM Evaluation, A. Tikhonov, I. Yamshchikov, 2023
- LLM Evaluators Recognize and Favor Their Own Generations, A. Panickssery et al., 2024
- Synthetic data generation on LangChain:
https://python.langchain.com/v0.2/docs/tutorials/data_generation/
- LangSmith documentation:
- Vertex GenAI Evaluation service:
https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview