Exploring Representations Beyond Euclidean Geometry
Speaker
Cheng XIN, Rutgers University
Time
2024-06-04 10:00:00 ~ 2024-06-04 11:30:00
Location
Room 1-418, SEIEE Building, Shanghai Jiao Tong University
Host
丁家昕
Abstract
In this lecture, we will discuss two methods that go beyond Euclidean geometry for data representation problems.
Representation learning is a crucial topic in modern machine learning and deep learning. Classical representation learning aims to find a proper embedding for input data in high-dimensional Euclidean spaces, which can be used for downstream tasks such as classification, regression, or generation. While Euclidean spaces provide convenience due to their well-understood properties, not all data naturally lie in Euclidean spaces, and it is not always optimal to assume they have hidden Euclidean embeddings in higher dimensions. This talk explores techniques that transcend Euclidean representations.
In the first part of the talk, we will discuss a simple linear embedding into non-Euclidean spaces using a more general bilinear form. Specifically, we focus on the classical Multidimensional Scaling (cMDS) method and propose an extension called Non-Euclidean MDS (Neuc-MDS) that accommodates non-Euclidean and non-metric outputs. The main idea is to generalize the inner product to other symmetric bilinear forms, utilizing both positive and negative eigenvalues of dissimilarity Gram matrices. Neuc-MDS efficiently optimizes the choice of eigenvalues to reduce STRESS, the sum of squared pairwise errors. We provide an in-depth error analysis and demonstrate Neuc-MDS's ability to address limitations of classical MDS identified by prior research.
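To make the role of negative eigenvalues concrete, here is a minimal sketch, not the speaker's Neuc-MDS algorithm. It assumes the input is a symmetric matrix of squared dissimilarities with zero diagonal, and it replaces the talk's optimized eigenvalue selection with a plain heuristic (keep the k eigenvalues of largest magnitude). The function names and the choice to measure STRESS on squared dissimilarities are illustrative assumptions.

```python
import numpy as np


def gram_from_sq_dissimilarities(D2):
    """Double-center squared dissimilarities: B = -1/2 * J @ D2 @ J."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ D2 @ J


def bilinear_embed(D2, k, use_negative=True):
    """Embed n items into k coordinates with a signature vector.

    use_negative=False mimics classical MDS (largest positive eigenvalues).
    use_negative=True keeps the k eigenvalues of largest magnitude, which is
    a simple stand-in (assumption) for the optimized selection in the talk.
    """
    B = gram_from_sq_dissimilarities(D2)
    evals, evecs = np.linalg.eigh(B)
    if use_negative:
        idx = np.argsort(-np.abs(evals))[:k]
    else:
        idx = np.argsort(-evals)[:k]
        idx = idx[evals[idx] > 0]  # classical MDS discards non-positive ones
    lam = evals[idx]
    X = evecs[:, idx] * np.sqrt(np.abs(lam))  # coordinates
    signs = np.sign(lam)  # +1 / -1 entries define the bilinear form
    return X, signs


def sq_dissimilarities(X, signs):
    """Squared 'distance' under the form: sum_k signs[k] * (x_ik - x_jk)^2."""
    diff = X[:, None, :] - X[None, :, :]
    return np.einsum("ijk,k->ij", diff ** 2, signs)


def stress(D2, D2_hat):
    """Sum of squared errors over unordered pairs (on squared dissimilarities)."""
    return float(np.sum((D2 - D2_hat) ** 2)) / 2.0
```

With all eigenvalues kept, the signed form reproduces the input dissimilarities up to rounding, even when the input is not Euclidean. Classical MDS cannot do this once negative eigenvalues are present.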
The second part of the talk will introduce a topological method for data analysis and machine learning, known as Topological Data Analysis (TDA). TDA uses techniques from algebraic topology to study the inherent shape and structure of data. Its applications span computer science, biology, neuroscience, and physics, which highlights TDA’s potential as an approach to contemporary data analysis. This talk will introduce the foundational concepts of algebraic topology and a central TDA tool: persistent homology. Persistent homology tracks how topological features, such as connected components, loops, and voids, appear and disappear across a filtration of topological spaces, offering insight into the data’s underlying structure.
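As a small illustration of persistent homology, the sketch below computes only the 0-dimensional part (connected components) of a Vietoris-Rips filtration on a point cloud, using union-find. It is a toy example under stated assumptions, not the method presented in the talk: the function name is hypothetical, every point is born at scale 0, and an edge enters the filtration at a scale equal to the distance between its endpoints (some conventions use half that distance).

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return (birth, death) pairs for connected components.

    Edges are added in order of increasing length; each time an edge joins
    two different components, one of them dies at that length.
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            pairs.append((0.0, length))  # a component merges and dies here
    pairs.append((0.0, math.inf))  # one component survives at every scale
    return pairs
```

Long-lived pairs correspond to well-separated clusters, while short-lived pairs reflect small-scale noise. Higher-dimensional features such as loops need a full boundary-matrix reduction, which this sketch omits.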
Bio
Cheng Xin is a postdoctoral researcher at the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) at Rutgers University. His research focuses on the intersection of computational topology, geometry, machine learning, and artificial intelligence.
Dr. Xin earned his Ph.D. in Computer Science from Purdue University, where his doctoral research centered on topological data analysis and graph representations. His work explores innovative approaches to data representation and analysis, seeking to uncover hidden patterns and structures in complex datasets.
Through his research, Dr. Xin aims to develop novel algorithms and techniques that leverage the power of topology and geometry to enhance machine learning and AI models. By bridging the gap between these disciplines, he strives to create more efficient, interpretable, and robust models for a wide range of applications.