
Adewale Akinfaderin
Articles
-
1 week ago |
aws.amazon.com | Shreyas Vathul Subramanian | Adewale Akinfaderin | Ishan Singh | Jesse Manders
With Amazon Bedrock Evaluations, you can evaluate foundation models (FMs) and Retrieval Augmented Generation (RAG) systems, whether they are hosted on Amazon Bedrock (including Amazon Bedrock Knowledge Bases) or elsewhere, such as multi-cloud or on-premises deployments. We recently announced the general availability of the large language model (LLM)-as-a-judge technique in model evaluation, along with a new RAG evaluation tool, also powered by an LLM-as-a-judge behind the scenes.
-
1 month ago |
aws.amazon.com | Adewale Akinfaderin | Ishan Singh | Jesse Manders | Shreyas Vathul Subramanian
AWS Machine Learning Blog: Organizations deploying generative AI applications need robust ways to evaluate their performance and reliability. When we launched LLM-as-a-judge (LLMaJ) and Retrieval Augmented Generation (RAG) evaluation capabilities in public preview at AWS re:Invent 2024, customers used them to assess their foundation models (FMs) and generative AI applications, but asked for more flexibility beyond Amazon Bedrock models and knowledge bases.
-
2 months ago |
aws.amazon.com | Ishan Singh | Adewale Akinfaderin | Ayan Ray | Jesse Manders
Organizations building and deploying AI applications, particularly those using large language models (LLMs) with Retrieval Augmented Generation (RAG) systems, face a significant challenge: how to evaluate AI outputs effectively throughout the application lifecycle. As these AI technologies become more sophisticated and widely adopted, maintaining consistent quality and performance grows increasingly complex, and traditional AI evaluation approaches have significant limitations.