Research on the foundations of data science and artificial intelligence is critical to advances in the field. This research includes computing, high-dimensional statistics, machine learning, network analysis, signal processing, information theory and optimization. Together, these areas make up the core data science fields of computer science, engineering, mathematics and statistics.
This research provides the theoretical and methodological backbone for all other areas of data science and artificial intelligence research.
Natural Artificial Intelligence at the Onset of Chaos
Principal Investigator: James Crutchfield, distinguished professor, Physics
In natural systems, complexity arises at the onset of chaos. Crutchfield and his team are exploring how this same complexity transition occurs in learning algorithms applied to complex data. They are working to bridge the mathematical theory of structural complexity, on the one hand, and modern machine learning (ML) and data exploration applied to complex data, on the other.
Success in this exploration will lead to novel mathematical foundations for data science, addressing the failure of machine learning to account for the structural information and long-range statistical dependencies spontaneously generated by complex systems. These novel foundations will enable future generations of machine learning and artificial intelligence to automatically discover structure and organization in natural complex systems.
Nonsmooth Riemannian Optimization for Online Learning
Principal Investigator: Shiqian Ma, associate professor, Mathematics
Ma and his team are investigating a class of efficient nonsmooth Riemannian optimization algorithms for solving two fundamental problems in data science: robust subspace recovery and orthogonal dictionary learning, both with online data.
This team is developing new Riemannian optimization algorithms, studying their theoretical guarantees, and addressing their scalability in the online learning setting. These new tools have the potential to explore structured or unstructured online data such as text, images, video, and audio data.
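As a rough illustration of what a nonsmooth Riemannian method looks like in an online setting, the sketch below runs a Riemannian subgradient iteration on the unit sphere for the objective f(q) = mean |y_i^T q|. Minimizing an l1 objective of this kind over the sphere is one standard way to pose both robust subspace recovery and orthogonal dictionary learning. This is a generic textbook-style scheme, not the team's algorithm. The function names, the mini-batch stream, and the 1/sqrt(t) step-size rule are all assumptions made for illustration.

```python
import numpy as np

def riemannian_subgradient_step(q, y_batch, step):
    """One subgradient step on the unit sphere for f(q) = mean |y_i^T q|.

    q       : current unit-norm iterate, shape (n,)
    y_batch : mini-batch of samples as rows, shape (m, n)
    step    : positive step size
    """
    # Euclidean subgradient of the l1 objective (sign(0) = 0 is a valid choice).
    g = y_batch.T @ np.sign(y_batch @ q) / y_batch.shape[0]
    # Riemannian subgradient: project g onto the tangent space at q.
    rg = g - (q @ g) * q
    # Move along the negative subgradient, then retract to the sphere
    # by normalization.
    q_new = q - step * rg
    return q_new / np.linalg.norm(q_new)

def online_sphere_l1(stream, q0, step0=0.1):
    """Process mini-batches one at a time with a diminishing step size
    (hypothetical schedule: step0 / sqrt(t))."""
    q = q0 / np.linalg.norm(q0)
    for t, y_batch in enumerate(stream, start=1):
        q = riemannian_subgradient_step(q, y_batch, step0 / np.sqrt(t))
    return q

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((200, 5))          # synthetic stand-in data
    batches = (data[i:i + 10] for i in range(0, 200, 10))
    q = online_sphere_l1(batches, rng.standard_normal(5))
    print("norm of final iterate:", np.linalg.norm(q))
```

The projection and normalization steps keep every iterate on the sphere, which is what distinguishes this from an ordinary Euclidean subgradient method. Because only one mini-batch is touched per step, the cost per update does not grow with the total amount of data seen.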