Do VL - Search

zhihu.com
https://zhuanlan.zhihu.com
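DeepSeek-VL Local Deployment and Cloud Platform Testing - Zhihu - Zhihu Column
DeepSeek-VL has general multimodal understanding capabilities and can handle logical diagrams, web pages, formula recognition, scientific literature, natural images, and embodied intelligence in complex scenarios. DeepSeek-VL: Towards Real-World Vision-Language Understanding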
zhihu.com
https://zhuanlan.zhihu.com
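DeepSeek-VL: Towards Real-World Vision-Language Understanding - Zhihu
February 3, 2025 · The DeepSeek-VL family (including the 1.3B and 7B models) delivers an excellent user experience as a vision-language chatbot in real-world applications. At the same model size, it achieves state-of-the-art or competitive performance across a wide range of vision-language benchmarks while maintaining strong performance on language-centric benchmarks.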
zhihu.com
https://zhuanlan.zhihu.com
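The Wait Is Over: DeepSeek-VL2 - Zhihu - Zhihu Column
DeepSeek-VL2 combines image understanding with code generation, so it can help you reverse-engineer plots. Prompt: Draw a plot similar to the image in Python. Larger-scale training data gives DeepSeek-VL2 the ability to interpret all kinds of memes, and sometimes it even understands more than you do. A large model's capabilities go well beyond recognizing objects from a closed set of categories. Zero-shot grounding: you can describe something in arbitrary natural language and have DeepSeek-VL2 find the matching part of the image (note: the model itself only outputs the bounding boxes of the corresponding objects and does not draw the bounding boxes directly on the original image …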
github.com
https://github.com › deepseek-ai
GitHub - deepseek-ai/DeepSeek-VL2: DeepSeek-VL2: Mixture-of …
December 13, 2024 · Introducing DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL. DeepSeek-VL2 demonstrates superior capabilities across various tasks, including but not limited to visual question answering, optical character recognition, document/table/chart ...
qwenlm.github.io
https://qwenlm.github.io › blog
Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL! | Qwen - qwenlm.github.io
January 26, 2025 · We release Qwen2.5-VL, the new flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL. To try the latest model, feel free to visit Qwen Chat and choose Qwen2.5-VL-72B-Instruct. Also, we open both base and instruct models in 3 sizes, including 3B, 7B, and 72B, in both Hugging Face and ModelScope.

csdn.net
https://blog.csdn.net › WhiffeYF › article › details
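Qwen2-VL Vision Large Model: Quick Qwen2-VL-7B-Instruct Deployment - CSDN Blog
January 16, 2025 · Qwen2-VL is an advanced visual multimodal AI model developed by Alibaba DAMO Academy. Qwen2-VL can process multiple modalities, including images and video, which means it can not only understand static images but also parse dynamic video content, opening up a wider range of application scenarios.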
csdn.net
https://blog.csdn.net › sherlockMa › article › details
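Deploying Qwen2-VL with vLLM (Including Single-GPU and Multi-GPU Deployment and Crawler Requests) - CSDN Blog
November 1, 2024 · Qwen2-VL is a large language model recently open-sourced by the Tongyi Qianwen (Qwen) team and developed by Alibaba Cloud's Tongyi Lab. Using Qwen2-VL as the base multimodal large model and implementing OCR for specific scenarios by means of […] is an introductory learning task.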
csdn.net
https://blog.csdn.net › article › details
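Vision-Language Model Principles and Code Explained in Detail, Using DeepSeek-VL as an Example - CSDN Blog
August 27, 2024 · DeepSeek-VL2 is a deep-learning-based vision-language model with the following main features. Mixture-of-Experts architecture: DeepSeek-VL2 adopts a Mixture-of-Experts (MoE) architecture, which lets the model scale up its parameter count while keeping compute cost under control.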
github.com
https://github.com › xwjim
GitHub - xwjim/Qwen2-VL: Qwen2-VL is the multimodal large …
We have open-sourced Qwen2-VL models, including Qwen2-VL-2B and Qwen2-VL-7B under the Apache 2.0 license, as well as Qwen2-VL-72B under the Qwen license. These models are now integrated with Hugging Face Transformers, vLLM, and other third-party frameworks.
aliyun.com
https://developer.aliyun.com › article
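[Hands-on Project] Fine-tuning a Multimodal Medical … with LLaMaFactory + Qwen2-VL-2B
December 2, 2024 · As a multimodal large model, Qwen2-VL-2B has very strong multimodal processing capabilities: besides recognizing image content, it can also reason about it. We can fine-tune the model with LLaMaFactory to give it medical-domain capabilities.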