
Conghui He
Dec 30, 2023 | linyq17.github.io | Conghui He | Alex Wang | Bin Wang | Weijia Li
TL;DR Captions in LAION-2B are significantly biased towards describing the visual text embedded in images. Released CLIP models show a strong text spotting bias in almost every style of web image, so CLIP-filtered datasets are inherently biased towards visual-text-dominant data. CLIP models easily learn text spotting ability from parrot captions while failing to connect vision-language semantics, just like a text-spotting parrot.
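To make the bias concrete, here is a minimal sketch (not the authors' code) of how one might probe a released CLIP model: score one image against a "parrot caption" that simply repeats the text rendered in the image and against a caption describing the visual scene. The image path, the example captions, and the `openai/clip-vit-base-patch32` checkpoint are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: compare CLIP scores for a parrot caption vs. a semantic caption.
# Assumptions (hypothetical): a local image "sign_photo.jpg" containing the
# visible text "OPEN 24 HOURS", and the openai/clip-vit-base-patch32 checkpoint.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("sign_photo.jpg")  # hypothetical image with embedded text
captions = [
    "OPEN 24 HOURS",                                  # parrot caption: repeats the rendered text
    "a neon sign glowing outside a diner at night",   # caption describing the visual scene
]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the temperature-scaled image-text similarities.
scores = outputs.logits_per_image.squeeze(0)
for caption, score in zip(captions, scores.tolist()):
    print(f"{score:6.2f}  {caption}")
# If the parrot caption scores higher, a CLIP-score filter would keep this pair
# even though the caption ignores the actual visual content.
```

A dataset curated by keeping only pairs above a CLIP-score threshold would, under this behavior, preferentially retain exactly such visual-text-dominant pairs, which is the filtering bias the TL;DR describes.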