Arnab Dhua

Featured in: amazon.science

Articles

  • Jun 13, 2024 | amazon.science | Shasha Li | Ming Du | Arnab Dhua | Shuai Tang

    Vision-language transformer models play a pivotal role in e-commerce product search. When such models are trained on pairs of product descriptions (e.g., product titles) and product images, the descriptions often contain text attributes that do not describe anything visual, which makes visual-textual alignment challenging. We introduce MultiModal Learning with online Token Pruning (MML-TP).

  • Apr 16, 2024 | amazon.science | Michael Huang | Xinliang Zhu | Arnab Dhua | Oriol Barbany Mayor

    Multimodal search has become increasingly important in providing users with a natural and effective way to express their search intentions. Images offer fine-grained details of the desired products, while text allows for easily incorporating search modifications. However, some existing multimodal search systems are unreliable and fail to address simple queries.
