
Zeyuan Johnson Chen
Articles
-
Jan 8, 2025 |
salesforce.com | Jieyu Zhang | Le Xue | Zeyuan Johnson Chen | Ran Xu
The development of multimodal language models (MLMs) such as GPT-4V and BLIPs [1,2] has enabled many multimodal applications, including answering complex image-based queries such as “How many students are raising their hands in this image?” These models rely heavily on instruction data: datasets that pair visual content with corresponding questions and answers. However, generating such data is challenging due to the limitations of existing approaches.
-
Sep 27, 2024 |
mdpi.com | Zeyuan Johnson Chen | Bo Xu | Xuan Wang | Linsong Sun
All articles published by MDPI are made immediately available worldwide under an open access license. No special permission is required to reuse all or part of an article published by MDPI, including figures and tables. For articles published under an open access Creative Commons CC BY license, any part of the article may be reused without permission provided that the original article is clearly cited. For more information, please refer to https://www.mdpi.com/openaccess.