Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond


Xutong Liu


2024-04-25 10:00:00 ~ 2024-04-25 11:00:00






In this talk, we introduce a novel framework of combinatorial multi-armed bandits (CMAB) with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a d-dimensional multivariant random variable and the feedback follows a general arm triggering process. Compared with existing CMAB works, CMAB-MT not only enhances the modeling power but also yields improved results by leveraging distinct statistical properties of multivariant random variables. For CMAB-MT, we propose a general 1-norm multivariant and triggering probability-modulated smoothness condition, and an optimistic CUCB-MT algorithm built upon this condition. Our framework covers many important problems as applications, such as episodic reinforcement learning (RL) and probabilistic maximum coverage for goods distribution, all of which satisfy the above smoothness condition and achieve regret bounds that match or improve on those of existing works. Through this framework, we build the first connection between the episodic RL and CMAB literatures, offering a new angle for solving episodic RL through the lens of CMAB, which may encourage more interaction between these two important directions.

Dr. Xutong Liu is a postdoctoral fellow at the Chinese University of Hong Kong (CUHK) and a visiting scholar at the University of Massachusetts Amherst. He received his PhD degree from the Chinese University of Hong Kong and his bachelor’s degree from the University of Science and Technology of China (USTC). He was awarded the Hong Kong PhD Fellowship and the RGC Postdoctoral Fellowship. His research focuses on the theoretical foundations of combinatorial decision-making under uncertainty, distributed/federated online learning, and their applications in recommendation systems and networked systems. He has published more than 15 papers in top machine learning and networking conferences and journals, including ICML, NeurIPS, ICLR, AAAI, INFOCOM, and IEEE TMC.
© John Hopcroft Center for Computer Science, Shanghai Jiao Tong University

Email: jhc@sjtu.edu.cn Tel: 021-54740299