AW Q - Search

zhihu.com
https://zhuanlan.zhihu.com
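AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration - Zhihu
In this paper, we propose Activation-aware Weight Quantization (AWQ), a hardware-friendly approach for low-bit, weight-only quantization of LLMs. Our method builds on the observation that weights, for LLM …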
github.com
https://github.com › mit-han-lab › llm-awq
GitHub - mit-han-lab/llm-awq: [MLSys 2024 Best Paper Award] AWQ …
Efficient and accurate low-bit weight quantization (INT3/4) for LLMs, supporting instruction-tuned models and multi-modal LMs. The current release supports: AWQ search for accurate …
zhihu.com
https://zhuanlan.zhihu.com
AWQ, the W4A16 Model Quantization Method - Zhihu - Zhihu Column
AWQ outperforms existing methods on various language modeling and domain-specific benchmarks, including instruction-tuned LMs and multi-modal LMs. The authors also …
arxiv.org
https://arxiv.org › abs
[2306.00978] AWQ: Activation-aware Weight Quantization for LLM ...
June 1, 2023 · We propose Activation-aware Weight Quantization (AWQ), a hardware-friendly approach for LLM low-bit weight-only quantization. AWQ finds that not all weights in an LLM …
zhihu.com
https://zhuanlan.zhihu.com
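Principles of Large Model Quantization: AWQ and AutoAWQ - Zhihu - Zhihu Column
AWQ (AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration) is a weight-only quantization method for large models. By leaving the more "important" weights unquantized, it thereby, without any training, …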

csdn.net
https://blog.csdn.net › article › details
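AWQ Model Quantization in Practice - CSDN Blog
May 28, 2024 · AWQ achieves slightly higher quantization accuracy than GPTQ, is easier to implement, and has higher compute performance. Whereas AWQ uses a heuristic search to find the best scale and clip coefficients, the newer OmniQuant uses training …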
csdn.net
https://blog.csdn.net › TFATS › article › details
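Large Model Quantization: AWQ Principles and Applications - CSDN Blog
February 8, 2025 · AWQ (Activation-aware Weight Quantization) is a quantization method that selects salient weights based on the activation distribution; it does not rely on any backpropagation …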
armcvai.cn
https://www.armcvai.cn › llm-quant-awq.html
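AWQ Quantization Explained - Zhang
November 1, 2024 · This paper proposes Activation-aware Weight Quantization (AWQ), a hardware-friendly method for low-bit (e.g., W4) weight quantization of LLMs. AWQ finds that not all LLM weights are equally important, and that protecting only 1% of the salient weights …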
aijishu.com
https://aijishu.com
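GPTQ & SmoothQuant & AWQ Code Walkthrough - Jishu Community (极术社区) - Connecting Developers …
June 12, 2024 · This article walks through the code implementations of several classic LLM post-training quantization (PTQ) algorithms (GPTQ, SmoothQuant, AWQ), partly to deepen understanding of the algorithms and partly to see what is worth …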
csdn.net
https://blog.csdn.net › article › details
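Hands-On Tutorial: AWQ Quantization of Qwen Models and Inference - CSDN Blog
November 9, 2024 · AWQ (Activation-aware Weight Quantization) is a low-bit weight quantization method designed specifically for large language models. It considers not only the distribution of the weights themselves but also the influence of the activations, …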