
Jesse Manders
Articles
- 2 weeks ago | aws.amazon.com | Adewale Akinfaderin | Ishan Singh | Jesse Manders | Shreyas Vathul Subramanian
AWS Machine Learning Blog
Organizations deploying generative AI applications need robust ways to evaluate their performance and reliability. When we launched LLM-as-a-judge (LLMaJ) and Retrieval Augmented Generation (RAG) evaluation capabilities in public preview at AWS re:Invent 2024, customers used them to assess their foundation models (FMs) and generative AI applications, but asked for more flexibility beyond Amazon Bedrock models and knowledge bases.
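The capability the article refers to is exposed through the Amazon Bedrock Evaluations API. As a minimal sketch (not taken from the article), an LLM-as-a-judge evaluation job can be started with boto3's create_evaluation_job call; the job name, IAM role, S3 URIs, model identifiers, and metric names below are illustrative assumptions, and the exact request shape should be checked against the current Bedrock documentation.

```python
import boto3

# Sketch: start an LLM-as-a-judge evaluation job in Amazon Bedrock.
# All names, ARNs, S3 URIs, and model IDs are placeholder assumptions.
bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_evaluation_job(
    jobName="llmaj-quality-check",  # assumed job name
    roleArn="arn:aws:iam::111122223333:role/BedrockEvalRole",  # assumed role
    evaluationConfig={
        "automated": {
            # Prompt dataset and the built-in metrics the judge model scores.
            "datasetMetricConfigs": [
                {
                    "taskType": "General",
                    "dataset": {
                        "name": "CustomPrompts",
                        "datasetLocation": {
                            "s3Uri": "s3://my-bucket/eval/prompts.jsonl"  # assumed
                        },
                    },
                    "metricNames": ["Builtin.Correctness", "Builtin.Helpfulness"],
                }
            ],
            # Judge model that grades the generated responses (field name assumed).
            "evaluatorModelConfig": {
                "bedrockEvaluatorModels": [
                    {"modelIdentifier": "anthropic.claude-3-5-sonnet-20240620-v1:0"}
                ]
            },
        }
    },
    inferenceConfig={
        # Model under evaluation.
        "models": [
            {"bedrockModel": {"modelIdentifier": "amazon.titan-text-premier-v1:0"}}
        ]
    },
    outputDataConfig={"s3Uri": "s3://my-bucket/eval/results/"},  # assumed
)
print(response["jobArn"])
```

The job runs asynchronously; results land in the configured S3 prefix and can also be reviewed in the Bedrock console.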
- 1 month ago | aws.amazon.com | Ishan Singh | Adewale Akinfaderin | Ayan Ray | Jesse Manders
Organizations building and deploying AI applications, particularly those using large language models (LLMs) with Retrieval Augmented Generation (RAG) systems, face a significant challenge: how to evaluate AI outputs effectively throughout the application lifecycle. As these AI technologies become more sophisticated and widely adopted, maintaining consistent quality and performance becomes increasingly complex. Traditional AI evaluation approaches have significant limitations.