Articles
-
1 week ago |
magazine.sebastianraschka.com | Sebastian Raschka
KV caches are one of the most important techniques for compute-efficient LLM inference in production. This article explains how they work, both conceptually and in code, with a from-scratch, human-readable implementation. It's been a while since I shared a technical tutorial explaining fundamental LLM concepts.
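The article's full implementation isn't reproduced here, but the core idea can be sketched in a few lines: cache each decoding step's keys and values so only the newest token's projections are computed per step. This is a minimal sketch assuming a single attention head; the names `KVCache` and `decode_step` are illustrative, not from the article.

```python
import torch

# Minimal single-head KV cache sketch (illustrative names, not the article's code).
class KVCache:
    def __init__(self):
        self.keys = None    # (batch, tokens_so_far, head_dim)
        self.values = None  # (batch, tokens_so_far, head_dim)

    def update(self, k_new, v_new):
        # Append this step's key/value instead of recomputing all past ones.
        if self.keys is None:
            self.keys, self.values = k_new, v_new
        else:
            self.keys = torch.cat([self.keys, k_new], dim=1)
            self.values = torch.cat([self.values, v_new], dim=1)
        return self.keys, self.values

def decode_step(x_new, W_q, W_k, W_v, cache):
    # x_new: (batch, 1, d_model) -- only the newest token's hidden state.
    q = x_new @ W_q                                  # query for the new token only
    k, v = cache.update(x_new @ W_k, x_new @ W_v)    # full cached K/V history
    scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5
    attn = torch.softmax(scores, dim=-1)
    return attn @ v                                  # (batch, 1, head_dim)

# Usage: one forward pass per generated token, O(1) new K/V work per step.
d = 16
W_q, W_k, W_v = (torch.randn(d, d) for _ in range(3))
cache = KVCache()
for _ in range(5):                                   # pretend we generate 5 tokens
    x = torch.randn(1, 1, d)
    out = decode_step(x, W_q, W_k, W_v, cache)
```

The payoff is that each step attends over the growing cache without re-encoding the prompt, which is exactly why the technique matters for production inference cost.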
-
2 months ago |
magazine.sebastianraschka.com | Sebastian Raschka
A lot has happened this month, especially with the releases of new flagship models like GPT-4.5 and Llama 4. But you might have noticed that reactions to these releases were relatively muted. Why? One reason could be that GPT-4.5 and Llama 4 remain conventional models, meaning they were trained without explicit reinforcement learning for reasoning. Meanwhile, competitors such as xAI and Anthropic have built more reasoning capabilities and features into their models.
-
Mar 8, 2025 |
magazine.sebastianraschka.com | Sebastian Raschka
Improving the reasoning abilities of large language models (LLMs) has become one of the hottest topics in 2025, and for good reason. Stronger reasoning skills allow LLMs to tackle more complex problems, making them more capable across a wide range of tasks users care about. In the last few weeks, researchers have shared a large number of new strategies to improve reasoning, including scaling inference-time compute, reinforcement learning, supervised fine-tuning, and distillation.
-
Feb 6, 2025 |
magazine.sebastianraschka.com | Sebastian Raschka
This article describes the four main approaches to building reasoning models, or how we can enhance LLMs with reasoning capabilities. I hope this provides valuable insights and helps you navigate the rapidly evolving literature and hype surrounding this topic. In 2024, the LLM field saw increasing specialization. Beyond pre-training and fine-tuning, we witnessed the rise of specialized applications, from retrieval-augmented generation (RAG) to code assistants.
-
Sep 21, 2024 |
magazine.sebastianraschka.com | Sebastian Raschka
In this article, I want to show you how to transform pretrained large language models (LLMs) into strong text classifiers. But why focus on classification? First, finetuning a pretrained model for classification offers a gentle yet effective introduction to model finetuning. Second, many real-world and business challenges revolve around text classification: spam detection, sentiment analysis, customer feedback categorization, topic labeling, and more.
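As a rough sketch of that recipe (not the article's code; `LLMClassifier` and the toy backbone below are hypothetical stand-ins), the common approach is to swap the language-modeling head for a small classification head over the final token's hidden state and finetune from there:

```python
import torch
import torch.nn as nn

# Minimal sketch of turning a pretrained LLM into a text classifier.
# `backbone` stands in for any pretrained transformer that returns
# hidden states of shape (batch, seq_len, d_model).
class LLMClassifier(nn.Module):
    def __init__(self, backbone, d_model, num_classes=2):
        super().__init__()
        self.backbone = backbone
        # Replace the language-modeling head with a classification head.
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, input_ids):
        hidden = self.backbone(input_ids)   # (batch, seq, d_model)
        last_token = hidden[:, -1, :]       # use the final token's state
        return self.classifier(last_token)  # (batch, num_classes)

# Usage with a toy backbone; in practice you would load pretrained weights
# and often finetune only the head (or the last few layers) first.
backbone = nn.Embedding(1000, 64)           # stand-in, not a real LLM
model = LLMClassifier(backbone, d_model=64)
logits = model(torch.randint(0, 1000, (4, 10)))  # 4 sequences, length 10
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 1, 0]))
```

Because only a small head is trained initially, this setup is a gentle introduction to finetuning while still covering the classification tasks listed above.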