
Transformer Next: Pushing the Limits of Language Models


Speaker

Ivan Kobyzev

Time

2024-10-31 13:30:00 ~ 2024-10-31 15:00:00

Location

Conference Room 432, Old Administration Building (standalone building), Minhang Campus, Shanghai Jiao Tong University

Host

Zhouhan Lin (林洲汉)

Abstract

I will start the talk with a brief discussion of some limitations of existing transformer-based Large Language Models. I will then present two recent works from my group aimed at overcoming these limitations. The first work modifies positional encoding to achieve better length generalization. The second work proposes new techniques to boost the performance of linear attention.
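As background for the second topic, the sketch below illustrates the general idea of (non-causal) linear attention: replacing the softmax similarity with a kernel feature map so the key-value product can be computed once and reused, giving cost linear rather than quadratic in sequence length. This is a generic illustration only, not the speaker's proposed techniques; the feature map `phi` here is a placeholder assumption.

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Generic linear attention: softmax(QK^T)V is approximated by
    phi(Q) (phi(K)^T V) with a row-wise normalizer, avoiding the n x n matrix."""
    Qf, Kf = phi(Q), phi(K)                      # (n, d) feature-mapped queries/keys
    KV = Kf.T @ V                                # (d, d_v), computed once in O(n * d * d_v)
    Z = Qf @ Kf.sum(axis=0, keepdims=True).T     # (n, 1) normalizer per query
    return (Qf @ KV) / Z                         # (n, d_v) attention output

# Example: 1024 tokens, 64-dim heads; memory/compute stay linear in sequence length.
n, d = 1024, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
out = linear_attention(Q, K, V)
```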

Bio

Ivan Kobyzev earned his Ph.D. in Mathematics at Western University in Canada, specializing in algebraic geometry. Following his doctorate, he completed a postdoctoral fellowship at the University of Waterloo in Canada, focusing on cognitive computing and emotional conversational agents. He then joined Borealis AI, the research institute of the Royal Bank of Canada, where he explored theoretical aspects of generative models (Normalizing Flows and VAEs) and their applications to finance and NLP. He has published several papers in top journals and conferences such as ICML, JMLR, and TPAMI. He currently leads the Efficient Language Models project on the NLP team at Huawei Noah's Ark Lab in Montreal.
