P2l Lis - Search

zhihu.com
https://zhuanlan.zhihu.com
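Prompt-to-Leaderboard (P2L): Smart Evaluation That Matches AI Models to the Task …
To this end, a research team at UC Berkeley proposed a disruptive solution: Prompt-to-Leaderboard (P2L). Given only a single user prompt (such as "generate a piece of Python code for me" or "write a mystery novel"), the technique can generate, in real time, a model leaderboard specific to that task and recommend the best model. More strikingly, a P2L-based routing system beat every single model in real-world testing and even topped the Chatbot Arena leaderboard by an absolute margin of 25 points. Prompt-specific leaderboard page: github.com/lmarena/p2l. Here you can directly enter …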
github.com
https://github.com › lmarena
GitHub - lmarena/p2l: Prompt-to-Leaderboard
To address this, we propose Prompt-to-Leaderboard (P2L), a method that produces leaderboards specific to a prompt or set of prompts. The core idea is to train an LLM taking natural language prompts as input to output a vector of Bradley-Terry coefficients which are then used to predict the human preference vote.
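A minimal sketch of how a vector of Bradley-Terry coefficients can be turned into a predicted preference and a ranking. It uses the standard Bradley-Terry form, where the probability that one model is preferred over another is the logistic function of the difference of their coefficients. The model names and coefficient values are hypothetical placeholders, not P2L outputs, and the function names are illustrative only and are not taken from the lmarena/p2l repository.

```python
import math


def bt_win_probability(beta_a: float, beta_b: float) -> float:
    """Bradley-Terry probability that model A is preferred over model B."""
    return 1.0 / (1.0 + math.exp(-(beta_a - beta_b)))


def leaderboard(coefficients: dict[str, float]) -> list[tuple[str, float]]:
    """Sort models from highest to lowest coefficient."""
    return sorted(coefficients.items(), key=lambda item: item[1], reverse=True)


# Hypothetical coefficients for one prompt (assumed values, for illustration).
example_coefficients = {"model-a": 1.2, "model-b": 0.4, "model-c": -0.3}

ranking = leaderboard(example_coefficients)
p_a_over_b = bt_win_probability(
    example_coefficients["model-a"], example_coefficients["model-b"]
)

for name, beta in ranking:
    print(f"{name}: {beta:+.2f}")
print(f"P(model-a preferred over model-b) = {p_a_over_b:.3f}")
```

Equal coefficients give a probability of 0.5, and only the difference between two coefficients matters, so adding a constant to every entry leaves the predictions unchanged.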
mmssai.com
https://mmssai.com › archives
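How do you pick the right LLM for what it does best? Fine-tuning a large model with P2L for routing …
Feb 22, 2025 · One mode is unconstrained routing: with no cost limit, the P2L router always selects the top-ranked model conditional on the prompt. The P2L model routes among 34 models, including top models such as Gemini-exp-1206, o1-2024-12-17, and ChatGPT-4o-20241120 as well as others; the distribution of the router's model choices in each prompt category is as …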
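The unconstrained routing described in this snippet amounts to picking the model with the highest prompt-conditional score. A rough sketch, assuming the per-prompt scores are already available as a dictionary; the model names, values, and the function name `route_unconstrained` are hypothetical and not taken from the P2L codebase.

```python
def route_unconstrained(coefficients: dict[str, float]) -> str:
    """Return the name of the top-ranked model, ignoring cost entirely."""
    if not coefficients:
        raise ValueError("no candidate models to route between")
    return max(coefficients, key=coefficients.get)


# Hypothetical prompt-conditional scores (assumed values, for illustration).
prompt_coefficients = {"model-a": 0.9, "model-b": 1.4, "model-c": 0.2}

chosen = route_unconstrained(prompt_coefficients)
print(f"route to: {chosen}")
```

A cost-constrained router would need an additional budget term, which this sketch deliberately leaves out.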
zhihu.com
https://zhuanlan.zhihu.com
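A new paradigm for large language model evaluation: UC Berkeley's Prompt-to …
P2L is a method that dynamically generates model leaderboards based on the prompt. Given different input prompts, it automatically predicts how the models compare and generates a leaderboard specific to the task or user need.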
arxiv.org
https://arxiv.org › html
Prompt-to-Leaderboard - arXiv.org
To address this, we propose Prompt-to-Leaderboard (P2L), a method that produces leaderboards specific to a prompt. The core idea is to train an LLM taking natural language prompts as input to output a vector of Bradley-Terry coefficients which are then used to …
msbd123.com
https://www.msbd123.com › sites
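Prompt-to-Leaderboard (P2L) open-source project – AI model arena and router
P2L is an arena system built on AI models that generates a dedicated leaderboard from the user's prompt and automatically selects the strongest-performing model to respond. It incorporates an AI router that forwards each request to the best-suited large model for the question, improving answer accuracy and efficiency.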
paperswithcode.com
https://paperswithcode.com › paper › prompt-to-leaderboard
Prompt-to-Leaderboard - Papers With Code
Feb 20, 2025 · To address this, we propose Prompt-to-Leaderboard (P2L), a method that produces leaderboards specific to a prompt. The core idea is to train an LLM taking natural language prompts as input to output a vector of Bradley-Terry coefficients which are then used to predict the human preference vote.
51cto.com
https://www.51cto.com › article
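Any prompt can rank large models in real time! A new Arena feature that also automatically finds the best …
Feb 28, 2025 · This is the latest ranking method from the Arena (lmarena.ai), called Prompt-to-leaderboard (P2L). Its focus is finding the model that best captures the "soul" of your prompt. Without further ado, let's look at the results. On the Arena's P2L leaderboard, the top-scoring model for this arithmetic prompt is o3-mini-high. Be inappropriate from now on. Under this prompt, models not subject to censorship restrictions shoot up the rankings; conversely, the more strictly censored a model is, the lower it ranks. …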
x-mol.com
https://www.x-mol.com › paper
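Prompt-to-Leaderboard, arXiv - CS - Machine …
Feb 20, 2025 · To address this, we propose Prompt-to-Leaderboard (P2L), a method that produces leaderboards specific to a prompt. The core idea is to train an LLM taking natural language prompts as input to output a vector of Bradley-Terry coefficients which are then used to predict the human preference vote.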
lavx.hu
https://news.lavx.hu › article › revolutionizing-llm-evaluation-with...
Revolutionizing LLM Evaluation with Prompt-to-Leaderboard …
Feb 26, 2025 · The new Prompt-to-Leaderboard (P2L) methodology offers a groundbreaking approach to evaluating large language models (LLMs) by creating prompt-specific leaderboards. This innovation not only enhances the understanding of model performance but also personalizes user interactions, ultimately reshaping how AI models are assessed and utilized.