
Transformer Next: Pushing the Limits of Language Models


Speaker

Ivan Kobyzev

Time

2024-10-31 13:30:00 ~ 2024-10-31 15:00:00

Location

Conference Room 432, Old Administration Building (standalone building), Minhang Campus, Shanghai Jiao Tong University

Host

Zhouhan Lin (林洲汉)

Abstract

I will start the talk with a brief discussion of some limitations of existing transformer-based Large Language Models. I will then present two recent works from my group aimed at overcoming these limitations. The first work modifies positional encoding to achieve better length generalization. The second work proposes new techniques to boost the performance of linear attention.
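As background for the second topic, the sketch below illustrates the general idea of (non-causal) linear attention: replacing the softmax similarity with a kernel feature map so the key-value product can be computed once and reused, giving cost linear rather than quadratic in sequence length. This is a generic illustration only, not the speaker's proposed techniques; the feature map `phi` here is a placeholder assumption.

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Generic linear attention: softmax(QK^T)V is approximated by
    phi(Q) (phi(K)^T V) with a row-wise normalizer, avoiding the n x n matrix."""
    Qf, Kf = phi(Q), phi(K)                      # (n, d) feature-mapped queries/keys
    KV = Kf.T @ V                                # (d, d_v), computed once in O(n * d * d_v)
    Z = Qf @ Kf.sum(axis=0, keepdims=True).T     # (n, 1) normalizer per query
    return (Qf @ KV) / Z                         # (n, d_v) attention output

# Example: 1024 tokens, 64-dim heads; memory/compute stay linear in sequence length.
n, d = 1024, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
out = linear_attention(Q, K, V)
```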

Bio

Ivan Kobyzev earned his Ph.D. in Mathematics at Western University in Canada, specializing in algebraic geometry. Following his doctorate, he completed a postdoctoral fellowship at the University of Waterloo in Canada, focusing on cognitive computing and emotional conversational agents. He then joined Borealis AI, the research institute of the Royal Bank of Canada, where he explored theoretical aspects of generative models (Normalizing Flows and VAEs) and their applications to finance and NLP. He has published several papers in top journals and conferences such as ICML, JMLR, and TPAMI. He currently leads the Efficient Language Models project on the NLP team at Huawei Noah's Ark Lab in Montreal.
