<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Metrics on Yuan's Blog</title><link>https://liyuan.org/zh/tags/metrics/</link><description>Recent content in Metrics on Yuan's Blog</description><generator>Hugo</generator><language>zh-cn</language><lastBuildDate>Tue, 05 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://liyuan.org/zh/tags/metrics/index.xml" rel="self" type="application/rss+xml"/><item><title>Choosing Evaluation Metrics for a RAG Agent: Notes from the Field</title><link>https://liyuan.org/zh/posts/ai/rag-eval-metrics-selection/</link><pubDate>Tue, 05 May 2026 00:00:00 +0000</pubDate><guid>https://liyuan.org/zh/posts/ai/rag-eval-metrics-selection/</guid><description>This article presents a pragmatic, layered approach to evaluating RAG (retrieval-augmented generation) agents, developed while evaluating on a complex financial document analysis task (FinanceBench). The author's core argument is that effective evaluation is not about piling up metrics, but about choosing, at each stage of the development cycle, the metrics that give a clear, actionable signal.</description></item></channel></rss>