
Conghui He

Featured in: arxiv.org

Articles

  • Dec 30, 2023 | linyq17.github.io | Conghui He | Alex Wang | Bin Wang | Weijia Li

    TL;DR: Captions in LAION-2B are heavily biased toward describing visual text embedded in the images. Released CLIP models show a strong text spotting bias across almost every style of web image, so datasets filtered with CLIP scores are inherently skewed toward data dominated by visual text. CLIP models readily learn text spotting ability from these parrot captions while failing to connect vision and language semantics, behaving like a text spotting parrot.

