Transformer Next: Pushing the Limits of Language Models
Speaker
Ivan Kobyzev
Time
2024-10-31 13:30:00 ~ 2024-10-31 15:00:00
Location
Conference Room 432, Old Administration Building (standalone building), Minhang Campus, Shanghai Jiao Tong University
Host
Zhouhan Lin (林洲汉)
Abstract
I will start the talk with a brief discussion of some limitations of existing transformer-based Large Language Models. Then I will present two of my group's recent works aimed at overcoming these limitations. The first work modifies positional encoding to achieve better length generalization. The second work proposes new techniques to boost the performance of linear attention.
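For background, the sketch below shows the standard (non-causal) linear attention formulation that work on boosting linear attention typically starts from: the softmax kernel is replaced by a feature map applied to queries and keys, so attention can be computed in time linear in sequence length. The feature-map choice (elu + 1) and the PyTorch implementation are illustrative assumptions, not the speaker's method.

```python
import torch
import torch.nn.functional as F

def feature_map(x):
    # A common positive feature map used in linear attention (elu(x) + 1); an assumption here.
    return F.elu(x) + 1.0

def linear_attention(q, k, v, eps=1e-6):
    # q, k: (batch, seq_len, dim); v: (batch, seq_len, v_dim).
    # Cost is O(n * d * d_v) per sequence instead of the O(n^2 * d) of softmax attention.
    q, k = feature_map(q), feature_map(k)
    kv = torch.einsum("bnd,bne->bde", k, v)                     # sum over positions: phi(k_n) v_n^T
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(1)) + eps)   # per-query normalizer
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)

# Example: 2 sequences of length 128, head dimension 64.
out = linear_attention(torch.randn(2, 128, 64),
                       torch.randn(2, 128, 64),
                       torch.randn(2, 128, 64))
```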
Bio
Ivan Kobyzev earned his Ph.D. in Mathematics at Western University in Canada, specializing in algebraic geometry. Following his doctorate, he completed a postdoctoral fellowship at the University of Waterloo in Canada, focusing on cognitive computing and emotional conversational agents. He then joined Borealis AI, the research institute of the Royal Bank of Canada, where he explored theoretical aspects of generative models (Normalizing Flows and VAEs) and their applications to finance and NLP. He has published several papers in top journals and conferences such as ICML, JMLR, and TPAMI. He currently leads the Efficient Language Models project on the NLP team at Huawei Noah's Ark in Montreal.