Hamel Husain

Featured in: medium.com, oreilly.com, flipboard.com, towardsdatascience.com, kdnuggets.com

Articles

AI Evals: Everything You Need to Know to Get Started

3 weeks ago | news.aakashg.com | Aakash Gupta | Hamel Husain

Just like you can't be a PM without using analytics, you can't be a PM on AI products without evals. Unlike traditional software, LLM pipelines do not produce deterministic outputs. A response may be factually accurate but inappropriate (i.e., the “vibes are off”). They may sound persuasive while conveying incorrect information. The core challenge is: How do we assess whether an LLM pipeline is performing adequately? And how do we diagnose where it is failing?
Mastering AI Evals: A Complete Guide for PMs

2 months ago | productcompass.pm | Paweł Huryn | Hamel Husain

Hey, Paweł here. Welcome to the free edition of The Product Compass Newsletter. With 107,800+ PMs from companies like Meta, Amazon, Google, and Apple, this newsletter is the #1 source for learning and growth as an AI PM. Consider subscribing and upgrading your account for the full experience. Recently, subscribers kept asking me about AI Evals. It’s arguably the most critical element of any AI initiative. But it’s hard to find reliable information.
Creating a LLM-as-a-Judge That Drives Business Results

Oct 30, 2024 | hamel.dev | Hamel Husain

Earlier this year, I wrote Your AI product needs evals. Many of you asked, “How do I get started with LLM-as-a-judge?” This guide shares what I’ve learned after helping over 30 companies set up their evaluation systems. The Problem: AI Teams Are Drowning in Data. Ever spend weeks building an AI system, only to realize you have no idea if it’s actually working? You’re not alone.
Dokku: my favorite personal serverless platform – Hamel’s Blog

Aug 26, 2024 | hamel.dev | Hamel Husain

What is Dokku? Dokku is an open-source Platform as a Service (PaaS) that runs on a single server of your choice. It’s like Heroku, but you own it. It is a great way to get the benefits of Heroku without the costs (Heroku can get quite expensive!). I need to deploy many applications for my LLM consulting work. Having a cost-effective, easy-to-use serverless platform is essential for me. I run a Dokku server on a $7/month VPS on OVHcloud for non-GPU workloads.
Hamel’s Blog - An Open Course on LLMs, Led by Practitioners

Jul 29, 2024 | hamel.dev | Hamel Husain

Today, we are releasing Mastering LLMs, a set of workshops and talks from practitioners on topics like evals, retrieval-augmented generation (RAG), fine-tuning and more. This course is unique because it is: Taught by 25+ industry veterans who are experts in information retrieval, machine learning, recommendation systems, MLOps and data science. We discuss how this prior art can be applied to LLMs to give you a meaningful advantage.
Mastering LLMs: A Conference For Developers & Data Scientists

May 11, 2024 | maven.com | Dan Becker | Hamel Husain

New cohort-based course: an online conference for everything LLMs. This course is popular: 130 people enrolled last week. Note: Registration is paused and will reopen on June 2. The original course registration included compute credits on several platforms. Registrations after reopening will not include compute credits, but they will include access to all videos, materials and the Discord community.
Debugging AI With Adversarial Validation

Apr 12, 2024 | hamel.dev | Hamel Husain

Minimal Example: ft_drift. I work with lots of folks who are fine-tuning models using the OpenAI API. I’ve created a small CLI tool, ft_drift, that detects drift between two multi-turn chat formatted jsonl files. Currently, ft_drift only detects drift in prompt templates, schemas and other token-based drift (as opposed to semantic drift). However, this is a good starting point to understand the general concept of adversarial validation.
Is Fine-Tuning Still Valuable?

Mar 27, 2024 | hamel.dev | Hamel Husain

Here is my personal opinion about the questions I posed in this tweet: "There are a growing number of voices expressing disillusionment with fine-tuning. I'm curious about the sentiment more generally. (I am withholding sharing my opinion rn). Tweets below are from @mlpowered @abacaj @emollick pic.twitter.com/cU0hCdubBU" (Hamel Husain, @HamelHusain, March 26, 2024). I think that fine-tuning is still very valuable in many situations.
On commercializing nbdev

May 31, 2023 | hamel.dev | Hamel Husain

A few friends have asked me why I decided not to commercialize nbdev, especially after putting lots of work into the project, including leaving my full-time job to work on it. So I thought I would write a short post to explain my reasoning. Background: nbdev is an innovative software development framework for Python that embraces literate and exploratory programming. I worked on nbdev from 2020-2023 with Jeremy Howard and, later, Wasim Lorgat.