<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>RAG on Yuan's Blog</title><link>https://liyuan.org/tags/rag/</link><description>Recent content in RAG on Yuan's Blog</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sun, 10 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://liyuan.org/tags/rag/index.xml" rel="self" type="application/rss+xml"/><item><title>Financial RAG Agent Optimization: Methods, Cases, and Data</title><link>https://liyuan.org/posts/ai/rag-improvement-via-18-failures/</link><pubDate>Sun, 10 May 2026 00:00:00 +0000</pubDate><guid>https://liyuan.org/posts/ai/rag-improvement-via-18-failures/</guid><description>This project details the refinement of an agentic RAG system for financial Q&amp;amp;A, boosting test accuracy from &lt;strong>0.871 to ~0.919&lt;/strong> by systematically diagnosing 18 failure cases. Rather than blind model tuning, the author prioritized a &amp;quot;diagnose-first&amp;quot; approach: resolving &amp;quot;judge-side&amp;quot; discrepancies with deterministic numeric prefiltering, then implementing structural improvements like query translation, anti-refusal checks, and a five-layer fix for superlative ambiguities. The results highlight that while prompt-based reflection is helpful, structural, schema-enforced changes offer superior reliability.
Ultimately, the author deliberately leaves eight failures unfixed because of dataset noise or unfavorable ROI, drawing a line between &amp;quot;fixing everything&amp;quot; and strategic, production-oriented optimization.</description></item><item><title>Picking Evaluation Metrics for a RAG Agent — Notes from the Trenches</title><link>https://liyuan.org/posts/ai/rag-eval-metrics-selection/</link><pubDate>Tue, 05 May 2026 00:00:00 +0000</pubDate><guid>https://liyuan.org/posts/ai/rag-eval-metrics-selection/</guid><description>This article outlines a pragmatic, tiered approach to evaluating Retrieval-Augmented Generation (RAG) agents, specifically within the context of complex financial document analysis (FinanceBench). The author argues that effective evaluation is not about maximizing the number of metrics, but about selecting signals that provide clear, actionable insights at different stages of the development lifecycle.</description></item></channel></rss>