Wwdok - Search

github.com
https://github.com › wwdok
wwdok (weida wang) - GitHub
wwdok has 41 repositories available. Follow their code on GitHub.
zhihu.com
https://www.zhihu.com › people › posts
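wwdok - Zhihu
There are many ways to accelerate diffusion models, including progressive distillation, adversarial training, and LCM, but PeRFlow, released in May this year, appears to do well on both quality and speed. To explain PeRFlow, we first have to start with Rectified Flow. Last year, the author of Rectified Flow published an article on Zhihu introducing it, "[ICLR2023] A New Method for Diffusion Generative Models: Extremely Simplified, One-Step Generation" (recommended reading… Hallo project page: fudan-generative-vision.github.io GitHub repository: github.com/fudan-genera Paper: "Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image …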
github.com
https://github.com › wwdok › faster-whisper-webui
wwdok/faster-whisper-webui-cn - GitHub
To detect different speakers in the audio, you can use the whisper-diarization application. Download the JSON file after running Whisper on an audio file, and then run app.py in the whisper-diarization repository with the audio file and the JSON file as arguments. You can choose between using whisper or faster-whisper.
github.com
https://github.com › wwdok › sadtalker_modelscope
GitHub - wwdok/SadTalker_ModelScope: Use one line code to …
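Repository hosted on ModelScope: https://modelscope.cn/models/wwd123/sadtalker (a slimmed-down version of this GitHub repository; the code in the two places differs slightly. This repository supports two main modes of use. In the first, the code that runs is the code in this repository's root directory, and the entry file is gradio_app.py. In the second, the code is invoked through modelscope and what runs is the code in the modelscope cache directory; the entry files are gradio_app_ms.py, demo.ipynb, and ms_wrapper.py. Taking Linux as an example, considering the issue that some PyPI packages may overwrite existing installs during installation …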
zhihu.com
https://zhuanlan.zhihu.com
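Deriving v-prediction in Diffusion Models - Zhihu - Zhihu Column
v stands for velocity, meaning that the diffusion model outputs a predicted velocity. v-prediction comes from the paper "PROGRESSIVE DISTILLATION FOR FAST SAMPLING OF DIFFUSION MODELS". The paper covers a lot of ground; the material relevant to v-prediction is mainly this conclusion in Section 4 and the Appendix D it refers to: diffusers has now integrated v-prediction. The first equation above is used in def get_velocity of the Scheduler class (relevant code), and the second equation is used in def step (relevant code). Deriving the two equations above requires a figure from Appendix D: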
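The two equations the snippet refers to are not included in this excerpt. As a rough illustration only, the sketch below assumes they are the standard v-prediction relations, v = alpha_t * eps - sigma_t * x0 and x0 = alpha_t * x_t - sigma_t * v, with alpha_t = sqrt(alpha_bar_t) and sigma_t = sqrt(1 - alpha_bar_t). The function names are hypothetical and are not the diffusers API; scalars stand in for tensors.

```python
import math


def coefficients(alpha_bar_t):
    """Return (alpha_t, sigma_t) for a variance-preserving schedule (assumed)."""
    return math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)


def add_noise(x0, eps, alpha_bar_t):
    """Forward process: x_t = alpha_t * x0 + sigma_t * eps."""
    a, s = coefficients(alpha_bar_t)
    return a * x0 + s * eps


def velocity_target(x0, eps, alpha_bar_t):
    """Training target under v-prediction: v = alpha_t * eps - sigma_t * x0."""
    a, s = coefficients(alpha_bar_t)
    return a * eps - s * x0


def x0_from_velocity(x_t, v, alpha_bar_t):
    """Recover the clean sample: x0 = alpha_t * x_t - sigma_t * v."""
    a, s = coefficients(alpha_bar_t)
    return a * x_t - s * v


def eps_from_velocity(x_t, v, alpha_bar_t):
    """Recover the noise: eps = alpha_t * v + sigma_t * x_t."""
    a, s = coefficients(alpha_bar_t)
    return a * v + s * x_t


if __name__ == "__main__":
    # Illustrative values only; alpha_bar_t = 0.64 gives alpha_t = 0.8, sigma_t = 0.6.
    x0, eps, alpha_bar_t = 0.8, -0.3, 0.64
    x_t = add_noise(x0, eps, alpha_bar_t)
    v = velocity_target(x0, eps, alpha_bar_t)
    print(x_t, v, x0_from_velocity(x_t, v, alpha_bar_t), eps_from_velocity(x_t, v, alpha_bar_t))
```

Both recoveries rely on alpha_t^2 + sigma_t^2 = 1, which is why the cross terms cancel when x_t and v are substituted back in.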
zhihu.com
https://zhuanlan.zhihu.com
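An Accessible Explanation of Gumbel Softmax - Zhihu - Zhihu Column
Building on the accumulated work of earlier researchers, the authors of the paper "Categorical Reparameterization with Gumbel-Softmax" did find a solution. The first problem is solved with the Gumbel Max Trick, and the second is solved by replacing the argmax in the Gumbel Max Trick with softmax; put together, this is Gumbel Softmax. Before introducing Gumbel, let us first look at how sampling from a discrete probability distribution is implemented in code. The sampling method can be written as: z = ( { i | p_1 + p_2 + ... + p_{i-1} \leq u} ), where i = 1, 2, ..., n is the class index and the random variable u follows a uniform …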
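A minimal sketch of the three sampling schemes the snippet names, using only the Python standard library. It assumes all class probabilities are strictly positive and uses 0-based indices (the snippet uses 1-based); the function names and the temperature parameter `tau` are illustrative choices, not taken from the article.

```python
import math
import random


def gumbel_noise(n, rng):
    """Draw n Gumbel(0, 1) samples as -log(-log(u)), with u kept away from 0 and 1."""
    noise = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noise.append(-math.log(-math.log(u)))
    return noise


def inverse_cdf_sample(probs, u):
    """Classic discrete sampling: first index whose cumulative mass exceeds u."""
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def gumbel_max_sample(probs, noise):
    """Gumbel Max Trick: argmax_i (log p_i + g_i) is distributed according to probs."""
    scores = [math.log(p) + g for p, g in zip(probs, noise)]
    return max(range(len(probs)), key=scores.__getitem__)


def gumbel_softmax_sample(probs, noise, tau=1.0):
    """Gumbel Softmax: replace the argmax with a temperature-scaled softmax."""
    logits = [(math.log(p) + g) / tau for p, g in zip(probs, noise)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.2, 0.5, 0.3]
    noise = gumbel_noise(len(probs), rng)
    print(gumbel_max_sample(probs, noise))
    print(gumbel_softmax_sample(probs, noise, tau=0.5))
```

Lower values of `tau` push the relaxed sample toward a one-hot vector, while higher values flatten it toward uniform; with the same noise, the largest entry of the relaxed sample sits at the index the Gumbel Max Trick would pick.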
github.com
https://gist.github.com › wwdok
wwdok’s gists · GitHub
Instantly share code, notes, and snippets. Split images and labels together to train/val/test dataset. After running this script, there will be train/val/test three folders inside imageDir and annotationDir. GitHub Gist: star and fork wwdok's gists by creating an account on GitHub.
bilibili.com
https://www.bilibili.com › opus
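Multimodal Paper Walkthrough, Part 1 [Paper Close Reading 46] - Bi ... - Bilibili
May 30, 2023 · This figure from the ViLT paper shows how multimodal models have evolved. In the earliest models the Visual Encoder was the largest component and the Textual Encoder the second largest, while Modality Interaction was just a dot product between text features and image features with the least computation, so VE > TE > MI. Later, CLIP replaced the Textual Encoder with a larger attention structure, and ViLBERT and UNITER replaced Modality Interaction with a larger attention structure. ViLT keeps a relatively large Modality Interaction and instead shrinks TE and VE: TE only does tokenization, and VE only does patch embedding.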
bilibili.com
https://space.bilibili.com
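wwdok's Personal Space - wwdok's Homepage - Bilibili Video
wwdok's personal space on Bilibili, with the videos, audio, articles, updates, and favorites shared by wwdok. Follow the wwdok account to be the first to see the uploader's updates. Zhihu: https://www.zhihu.com/people/wang-wei-78-16-16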
zhihu.com
https://www.zhihu.com › people
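wwdok - Zhihu
August 18, 2024 · 1. Introduction: Stable Fast 3D, released by Stability AI, is a disruptive 3D modeling technology that uses AI algorithms to quickly convert a single image into a high-quality 3D model, greatly shortening the time traditional 3D modeling takes… Hallo project page: fudan-generative-vision.github.io GitHub repository: github.com/fudan-genera Paper: "Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation" This algo…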