
Shaohan Huang
Articles
-
May 8, 2024 |
arxiv.org | Li Dong | Yi Zhu | Shaohan Huang | Wenhui Wang
-
Oct 17, 2023 |
arxiv.org | Shuming Ma | Li Dong | Shaohan Huang | Huaijie Wang
-
Jul 17, 2023 |
arxiv.org | Li Dong | Shaohan Huang | Shuming Ma | Yuqing Xia
[Submitted on 17 Jul 2023 (v1), last revised 9 Aug 2023 (this version, v4)] Title: Retentive Network: A Successor to Transformer for Large Language Models, by Yutao Sun and 7 other authors. Abstract: In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference,...
-
Jul 5, 2023 |
arxiv.org | Shuming Ma | Li Dong | Xingxing Zhang | Shaohan Huang
arXiv:2307.02486 (cs) Title: LongNet: Scaling Transformers to 1,000,000,000 Tokens, by Jiayu Ding and 6 other authors. Submitted by Shuming Ma [v1] Wed, 5 Jul 2023 17:59:38 UTC (219 KB).
-
Jun 26, 2023 |
arxiv.org | Wenhui Wang | Li Dong | Shaohan Huang | Yaru Hao
[Submitted on 26 Jun 2023 (v1), last revised 27 Jun 2023 (this version, v2)] Title: Kosmos-2: Grounding Multimodal Large Language Models to the World, by Zhiliang Peng and 6 other authors. Abstract: We introduce Kosmos-2, a Multimodal Large Language Model (MLLM), enabling new capabilities of perceiving object descriptions (e.g., bounding boxes) and grounding text to the visual...