
[2410.09997] Collu-Bench: A Benchmark for Predicting Language …
October 13, 2024 · Collu-Bench includes 13,234 code hallucination instances collected from five datasets and 11 diverse LLMs, ranging from open-source models to commercial ones. To better understand and predict code hallucinations, Collu-Bench provides detailed features such as the per-step log probabilities of LLMs' output, token types, and the execution ...
Collu-Bench includes 13,234 code hallucination instances. To facilitate the understanding of where the LLM makes mistakes, Collu-Bench includes detailed signals such as per-step log
Collu-Bench: A Benchmark for Predicting Language Model …
October 13, 2024 · We build Collu-Bench, a benchmark with 13,234 code hallucination instances produced by 11 LLMs on five datasets. Collu-Bench includes detailed information such as per-step log probabilities, token types, and execution feedback, which are useful signals for developing techniques that localize and predict code hallucinations.
sets and 11 diverse LLMs, ranging from open-source models to commercial ones. To better understand and predict code hallucinations, Collu-Bench provides detailed features such as the per-step log probabilities of LLMs’ output, token type.
(PDF) Collu-Bench: A Benchmark for Predicting Language Model ...
October 13, 2024 · To pave the way for research in LLMs' hallucinations in code, we introduce Collu-Bench, a benchmark for predicting code hallucinations of LLMs across code generation (CG) and automated program...
Cool Papers - Immersive Paper Discovery
Explore the latest in academic research with Cool Papers. Our platform, powered by Kimi Chat AI, streamlines the discovery of arXiv and top conference papers, offering an interactive FAQ for quick insights.
Maurizio COLLU | Professor (Full) | PhD CEng MRINA MEI FHEA ...
This paper investigates the limiting wave conditions at which a wind turbine technician can complete maintenance activities safely and effectively on a 15MW floating offshore wind turbine.
Stability requirements for floating offshore wind
July 1, 2014 · This paper reviews intact stability requirements in order to close a gap in the rules for floating wind turbine units, especially in the critical phases: when the wind turbine is placed on board, when the unit sails toward the installation field, and when it is finally assembled and moored on site.
Artificial Intelligence & Collusion: When Computers Inhibit ... - SSRN
April 9, 2015 · After discussing the way in which computerised technology is changing the competitive landscape, we explore four scenarios where AI can foster anticompetitive collusion and the legal and ethical challenges each scenario raises. Keywords: Competition law, Antitrust, Computers, Artificial Intelligence, Collusion, Cartels. Suggested Citation: