
Boran Han

Articles

  • Oct 21, 2024 | amazon.science | Boran Han | Shuai Zhang | Jie Ding | Bingqing Song

    While the Transformer architecture has achieved remarkable success across various domains, a thorough theoretical foundation explaining its optimization dynamics is yet to be fully developed. In this study, we aim to bridge this understanding gap by answering the following two core questions: (1) Which types of Transformer architectures allow Gradient Descent (GD) to achieve guaranteed convergence?

  • Aug 12, 2024 | amazon.science | Yixin Chen | Shuai Zhang | Boran Han | Hanno Becker

    In this work, we introduce Context-Aware MultiModal Learner (CaMML), for tuning large multimodal models (LMMs). CaMML, a lightweight module, is crafted to seamlessly integrate multimodal contextual samples into large models, thereby empowering the model to derive knowledge from analogous, domain-specific, up-to-date information and make grounded inferences. Importantly, CaMML is highly scalable and can efficiently handle lengthy multimodal context examples owing to its hierarchical design.

  • Apr 3, 2024 | amazon.science | Pei Chen | Boran Han | Shuai Zhang | Larry Hardesty

    Large Language Models (LLMs) have shown great ability in solving traditional natural language tasks and elementary reasoning tasks with appropriate prompting techniques. However, their ability is still limited in solving complicated science problems. In this work, we aim to push the upper bound of the reasoning capability of LLMs by proposing a collaborative multi-agent, multi-reasoning-path (CoMM) prompting framework.
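The core idea of a collaborative multi-agent, multi-reasoning-path framework can be sketched as follows. This is a minimal illustrative outline, not the paper's implementation: the agent roles, the `ask_model` stub, and the aggregation step are assumptions made for the example, and a real system would route each call to an actual LLM.

```python
def ask_model(role: str, prompt: str) -> str:
    """Placeholder for an LLM call; returns a canned, role-tagged answer.

    In a real CoMM-style system this would send a role-conditioned
    prompt to a language model and return its response.
    """
    return f"[{role}] reasoning about: {prompt}"


def comm_prompt(question: str, roles: list[str]) -> str:
    """Collect one reasoning path per expert agent, then aggregate them."""
    # Each agent answers the question from its own expert perspective,
    # producing an independent reasoning path.
    paths = [ask_model(role, question) for role in roles]
    # A final moderator step merges the separate paths into one answer.
    combined = " | ".join(paths)
    return ask_model("moderator", combined)


answer = comm_prompt("Why does ice float on water?", ["physicist", "chemist"])
print(answer)
```

The key design point this sketch illustrates is that each role contributes a separate reasoning path before any aggregation, rather than a single model answering in one pass.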
