
QuIP: 2-Bit Quantization of Large Language Models With Guarantees
July 25, 2023 · We introduce quantization with incoherence processing (QuIP), a new method based on the insight that quantization benefits from $\textit{incoherent}$ weight and Hessian matrices …
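As a concrete reading of that insight, here is a minimal sketch of the paper's two incoherence measures, assuming NumPy; the helper names are ours, not the official codebase's. A Hessian is incoherent when its eigenvectors have no large entries, and a weight matrix is incoherent when no single entry dominates its Frobenius norm.

```python
# Minimal sketch of QuIP's mu-incoherence measures (Definition 1 in the
# paper); helper names are illustrative, not from the official repo.
import numpy as np

def hessian_mu(H):
    """Smallest mu such that max_ij |Q_ij| <= mu / sqrt(n) for H = Q diag(lam) Q^T."""
    n = H.shape[0]
    _, Q = np.linalg.eigh(H)              # orthonormal eigenvectors of symmetric H
    return np.abs(Q).max() * np.sqrt(n)

def weight_mu(W):
    """Smallest mu such that max_ij |W_ij| <= mu * ||W||_F / sqrt(m * n)."""
    m, n = W.shape
    return np.abs(W).max() * np.sqrt(m * n) / np.linalg.norm(W)
```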
[2402.04396] QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
February 6, 2024 · Post-training quantization (PTQ) reduces the memory footprint of LLMs by quantizing their weights to low precision. In this work, we introduce QuIP#, a weight-only PTQ method that achieves state-of-the-art results in extreme compression regimes ($\le 4$ bits per weight) using three novel techniques …
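The "Hadamard" in the title refers to incoherence processing built from the randomized Hadamard transform. Below is a plain-NumPy sketch of that transform, assuming dimensions that are powers of two; the actual repo ships fused kernels, so treat this as the idea only.

```python
# Randomized Hadamard transform sketch (n must be a power of two); an
# illustrative re-implementation, not QuIP#'s fused kernel.
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform along the last axis."""
    y = np.array(x, dtype=float).reshape(-1, np.shape(x)[-1])
    n = y.shape[-1]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = y[:, i:i + h].copy()
            y[:, i:i + h] += y[:, i + h:i + 2 * h]
            y[:, i + h:i + 2 * h] = a - y[:, i + h:i + 2 * h]
        h *= 2
    return y.reshape(np.shape(x))

def rht(x, signs):
    """x -> H S x / sqrt(n): orthogonal, inverted by y -> signs * fwht(y) / sqrt(n)."""
    return fwht(x * signs) / np.sqrt(x.shape[-1])

rng = np.random.default_rng(0)
n = 1024
signs = rng.choice([-1.0, 1.0], size=n)
x = np.zeros(n); x[3] = 1.0            # a maximally coordinate-aligned vector
print(np.abs(rht(x, signs)).max())     # all entries flatten to 1/sqrt(n)
```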
QuIP: 2-Bit Quantization of Large Language Models - 知乎专栏
The authors achieve respectable 2-bit quantization of OPT models. While there is some quality loss relative to fp16, it is much smaller than that of most (or all?) other 2-bit quantization schemes. The key to their method is that, by …
GitHub - Cornell-RelaxML/QuIP: Code for paper: "QuIP: 2-Bit Quantization of Large Language Models with Guarantees"
This repository contains code for the paper QuIP: 2-Bit Quantization of Large Language Models with Guarantees. TLDR: Our proposed incoherence processing enables quantization of large language models to 2 bits per weight …
QuIP: 2-Bit Quantization of Large Language Models With Guarantees
We introduce quantization with incoherence processing (QuIP), a new method based on the insight that quantization benefits from incoherent weight and Hessian matrices, i.e., from the weights being even in magnitude and the directions in which it is important to round them accurately being unaligned with the coordinate axes.
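To make "unaligned with the coordinate axes" concrete, here is a toy demonstration, ours rather than the paper's: conjugating by random orthogonal matrices spreads one outlier weight across all coordinates, so a uniform low-bit grid covers the matrix far more accurately.

```python
# Toy demo (not from the paper's code): a single outlier dominates W, but a
# random two-sided rotation spreads its magnitude across all entries.
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(n, rng):
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))        # sign fix so q is Haar-distributed

W = 0.01 * rng.standard_normal((256, 256))
W[0, 0] = 5.0                             # one outlier weight
U, V = random_orthogonal(256, rng), random_orthogonal(256, rng)
Wt = U @ W @ V.T                          # incoherence pre-processing
print(np.abs(W).max(), np.abs(Wt).max())  # ~5.0 before vs. roughly 0.1 after
```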
QuIP: 2-bit quantization of large language models with guarantees
December 10, 2023 · QuIP consists of two steps: (1) an adaptive rounding procedure minimizing a quadratic proxy objective; (2) efficient pre- and post-processing that ensures weight and Hessian incoherence via multiplication by random orthogonal matrices.
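A compressed sketch of those two steps, with plain round-to-nearest standing in for the paper's LDLQ adaptive rounding (an assumption made to keep the example short; LDLQ additionally feeds rounding error back through the Hessian proxy):

```python
# End-to-end sketch of QuIP's pipeline; nearest rounding replaces LDLQ here,
# and U, V are random orthogonal matrices (see random_orthogonal above).
import numpy as np

def quantize_nearest(W, bits=2):
    """Round-to-nearest on a uniform grid spanning the range of W."""
    levels = 2 ** bits - 1
    lo, hi = float(W.min()), float(W.max())
    scale = (hi - lo) / levels
    return np.round((W - lo) / scale) * scale + lo

def quip_roundtrip(W, U, V, bits=2):
    Wt = U @ W @ V.T                     # pre-processing: incoherent weights
    Wt_hat = quantize_nearest(Wt, bits)  # rounding step (LDLQ in the paper)
    return U.T @ Wt_hat @ V              # post-processing undoes the rotation
```

In the full method the proxy Hessian transforms alongside the weights (H → V H V^T), so the rounding step also sees an incoherent Hessian.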
QuIP: 2-Bit Quantization of Large Language Models With Guarantees
QuIP is the first PTQ procedure to achieve good quantization at two bits per weight, across a variety of LLM sizes and evaluation tasks. In Figure 5 we compare QuIP and OPTQ when …
Cornell-RelaxML/quip-sharp - GitHub
QuIP# is a weight-only post-training quantization method that achieves state-of-the-art performance in extreme compression ($\le 4$ bits per weight) regimes. QuIP# introduces (1) faster and better incoherence processing using the randomized Hadamard transform, (2) vector quantization with codebooks based on the E8 lattice, and (3) fine-tuning to further improve fidelity to the original model …
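The lattice codebooks are built on the E8 lattice. As a geometric illustration, here is the classic Conway-Sloane nearest-point decoder for E8; QuIP#'s actual E8P codebook is a finite, scaled subset of the lattice, so this shows the underlying idea rather than the shipped code.

```python
# Nearest-point rounding on the E8 lattice, E8 = D8 ∪ (D8 + 1/2), following
# Conway & Sloane; illustrative only, not QuIP#'s E8P codebook lookup.
import numpy as np

def nearest_d8(x):
    """Nearest point of D8 = {z in Z^8 : sum(z) is even}."""
    r = np.round(x)
    if int(r.sum()) % 2 != 0:
        i = np.argmax(np.abs(x - r))           # cheapest coordinate to re-round
        r[i] += 1.0 if x[i] >= r[i] else -1.0  # round it the other way
    return r

def nearest_e8(x):
    """Nearest point of E8: the closer of its two D8 cosets."""
    a = nearest_d8(x)
    b = nearest_d8(x - 0.5) + 0.5
    return a if np.sum((x - a) ** 2) <= np.sum((x - b) ** 2) else b
```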
QuIP: A New High for LLM Quantization Based on Hadamard Incoherence and Lattice Codebooks - CSDN
September 27, 2024 · To address this problem, researchers proposed a new method, QuIP (Quantization with Incoherence Processing), which achieves 2-bit quantization of LLMs and sharply reduces their storage requirements. This article gives a detailed introduction to …
QuIP: 2-Bit Quantization of Large Language Models with Guarantees | BriefGPT - AI Paper Digest
By introducing quantization with incoherence processing (QuIP), the researchers find that it performs well at reducing quantization error in the weight and Hessian matrices; the optimized rounding procedure, together with pre- and post-processing via random orthogonal matrices, can further …
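For reference, the layer-wise quadratic proxy objective that the rounding procedure minimizes can be written as follows (standard in this line of work; $x$ denotes a calibration input to the layer):

$$\min_{\hat{W}} \; \operatorname{tr}\!\left((\hat{W}-W)\,H\,(\hat{W}-W)^{T}\right), \qquad H \propto \mathbb{E}\!\left[x x^{T}\right].$$

Substituting $W \mapsto UWV^{T}$ and $H \mapsto VHV^{T}$ for orthogonal $U, V$ leaves the trace unchanged, which is why the random-orthogonal pre- and post-processing is compatible with the rounding step: quantizing in the rotated, incoherent basis optimizes exactly the same objective.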