Cvf VG - Search

About 228,000 results

thecvf.com
https://openaccess.thecvf.com › content › html › Song_Advancing...
CVPR 2023 Open Access Repository
Visual grounding (VG) aims to establish fine-grained alignment between vision and language. Ideally, it can be a testbed for vision-and-language models to evaluate their understanding of …
thecvf.com
https://openaccess.thecvf.com › content › papers › Deng_TransVG...
[PDF]
TransVG: End-to-End Visual Grounding With Transformers
In this paper, we present a neat yet effective transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to the corresponding …
github.com
https://github.com › zhjohnchan › SK-VG
GitHub - zhjohnchan/SK-VG: [CVPR-2023] The official dataset of ...
We introduce a challenging task that requires VG models to reason over (image, scene knowledge, query) triples and build a new dataset named SK-VG on top of real images …
huggingface.co
https://huggingface.co
michelecafagna26/vinvl_vg_x152c4 - Hugging Face
More info about how to use this model can be found here: michelecafagna26/vinvl-visualbackbone. You can obtain the full VinVL's visual features by concatenating the "features" …
thecvf.com
https://cvpr.thecvf.com › virtual › poster
CVPR Poster Advancing Visual Grounding With Scene Knowledge: …
Visual grounding (VG) aims to establish fine-grained alignment between vision and language. Ideally, it can be a testbed for vision-and-language models to evaluate their understanding of …
github.com
https://github.com › linhuixiao › CLIP-VG
CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual …
Dec 28, 2024 · In order to utilize vision and language pre-trained models to address the grounding problem, and reasonably take advantage of pseudo-labels, we propose CLIP-VG, a …
arxiv.org
https://arxiv.org › abs
Language Adaptive Weight Generation for Multi-task Visual …
Jun 6, 2023 · Inspired by this, we propose an active perception Visual Grounding framework based on Language Adaptive Weights, called VG-LAW. The visual backbone serves as an …
github.com
https://github.com › ATVGnet
GitHub - lelechen63/ATVGnet: CVPR 2019
This repository contains the original models (AT-net, VG-net) described in the paper Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss. The demo video is …
arxiv.org
https://arxiv.org › abs
Title: TransVG: End-to-End Visual Grounding with Transformers
Apr 17, 2021 · In this paper, we present a neat yet effective transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to the …
thecvf.com
https://cvpr.thecvf.com › virtual › poster
VectorFloorSeg: Two-Stream Graph Attention Network for …
Vector graphics (VG) are ubiquitous in industrial designs. In this paper, we address semantic segmentation of a typical VG, i.e., roughcast floorplans with bare wall structures, whose output …