
[2307.12533] PUMA: Secure Inference of LLaMA-7B in Five …
July 24, 2023 · PUMA is about 2× faster than the state-of-the-art framework MPCFORMER (ICLR 2023) and has similar accuracy as plaintext models without fine-tuning …
Worried about prompts leaking private data? This framework lets LLaMA-7B run secure inference - Zhihu
July 30, 2023 · The researchers evaluated the large language model LLaMA-7B with PUMA on 3 Alibaba Cloud ecs.r7.32xlarge servers, each with 128 threads and 1 TB RAM, 20 GB bandwidth, and 0.06 ms round-trip time …
Publications - Ye Dong
PUMA is about 2x faster than the state-of-the-art MPC framework MPCFORMER(ICLR 2023) and has similar accuracy as plaintext models without fine-tuning (which the previous works failed to …
PUMA: Secure Inference of LLaMA-7B in Five Minutes
In PUMA, the goal is secure computation over Transformer-based models. To achieve this, the system defines three entities: the model owner, the client, and the computing parties. The model owner provides the trained Transformer model, …
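The three-entity setup above rests on secret sharing: the model owner and client split their private values into shares so that no single computing party learns anything on its own. A minimal plaintext sketch of 3-party additive secret sharing (PUMA's actual protocol uses replicated secret sharing with fixed-point encodings; the modulus and party count here are illustrative assumptions):

```python
import random

PRIME = 2**61 - 1  # illustrative modulus for the additive sharing

def share(secret, n_parties=3):
    """Split a secret into n additive shares that sum to the secret mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine by summing all shares mod PRIME."""
    return sum(shares) % PRIME

# The model owner shares its weights, the client shares its input;
# each computing party only ever holds one share of each value.
w_shares = share(1234)
x_shares = share(5678)

# Additions are "free": each party adds its local shares, no communication needed.
sum_shares = [(w + x) % PRIME for w, x in zip(w_shares, x_shares)]
assert reconstruct(sum_shares) == (1234 + 5678) % PRIME
```

Multiplications, by contrast, require interaction between the parties, which is why secure inference frameworks spend most of their effort on matrix products and non-linear layers.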
Puma : Secure Inference of LLaMA-7B in Five Minutes - ar5iv
We evaluated the large language model LLaMA-7B using Puma under 3 Alibaba Cloud ecs.r7.32xlarge servers, each with 128 threads and 1 TB RAM, 20 GB bandwidth, 0.1 ms …
PUMA: Secure Inference of LLaMA-7B in Five Minutes
July 24, 2023 · In this work, we present LLAMA – an end-to-end, FSS-based, secure inference library supporting precise low-bitwidth computations (required by converters) as well as …
PUMA: Secure Inference of LLaMA-7B in Five Minutes
July 24, 2023 · To address these limitations, we propose framework PUMA to enable fast and secure Transformer model inference. Our framework designs high quality approximations for … Transformer architecture. PUMA is about 2× faster than the state-of-the-art framework MPCFORMER (ICLR 2023) and has similar accuracy as plaintext models without fine-tuning …
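The "high quality approximations" mentioned in the snippets target non-linearities such as GeLU, whose exact form (built on `erf`) is costly to evaluate under MPC. A plaintext illustration of why approximation is viable, comparing exact GeLU against the common tanh-based approximation (this is the generic approximation, not PUMA's specific piecewise polynomials):

```python
import math

def gelu_exact(x):
    # Exact GeLU via the Gaussian CDF
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh_approx(x):
    # Widely used tanh-based approximation of GeLU; MPC frameworks replace
    # such non-linearities with cheaper polynomial/piecewise forms because
    # exact erf is expensive to compute on secret-shared values.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))

# Maximum absolute error over [-5, 5] stays small, so model accuracy is
# largely preserved without fine-tuning.
max_err = max(abs(gelu_exact(t / 100) - gelu_tanh_approx(t / 100))
              for t in range(-500, 501))
print(f"max abs error on [-5, 5]: {max_err:.2e}")
```

Keeping the approximation error this small across the activation's typical input range is what allows secure inference to match plaintext accuracy without retraining the model.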