Invited Speakers
Zeynep Akata
Zeynep Akata is a professor of Computer Science (W3) within the Cluster of Excellence Machine Learning at the University of Tübingen. After completing her PhD at INRIA Rhône-Alpes with Prof Cordelia Schmid (2014), she worked as a post-doctoral researcher at the Max Planck Institute for Informatics with Prof Bernt Schiele (2014-17) and at the University of California, Berkeley with Prof Trevor Darrell (2016-17). Before moving to Tübingen in October 2019, she was an assistant professor at the University of Amsterdam with Prof Max Welling (2017-19). She received a Lise-Meitner Award for Excellent Women in Computer Science from the Max Planck Society in 2014, a young scientist honour from the Werner-von-Siemens-Ring foundation in 2019, and an ERC-2019 Starting Grant from the European Commission. Her research interests include multimodal learning and explainable AI.
Kristen Grauman
Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Director at Facebook AI Research (FAIR). Her research in computer vision and machine learning focuses on video, visual recognition, and action for perception or embodied AI. Before joining UT-Austin in 2007, she received her Ph.D. at MIT. She is an IEEE Fellow, AAAI Fellow, Sloan Fellow, a Microsoft Research New Faculty Fellow, and a recipient of NSF CAREER and ONR Young Investigator awards, the PAMI Young Researcher Award in 2013, the 2013 Computers and Thought Award from the International Joint Conference on Artificial Intelligence (IJCAI), and the Presidential Early Career Award for Scientists and Engineers (PECASE) in 2013. She was inducted into the UT Academy of Distinguished Teachers in 2017. She and her collaborators have been recognized with several Best Paper awards in computer vision, including a 2011 Marr Prize and a 2017 Helmholtz Prize (test of time award). She served for six years as an Associate Editor-in-Chief for the Transactions on Pattern Analysis and Machine Intelligence (PAMI) and for ten years as an Editorial Board member for the International Journal of Computer Vision (IJCV). She also served as a Program Chair of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015 and a Program Chair of Neural Information Processing Systems (NeurIPS) 2018, and will serve as a Program Chair of the IEEE International Conference on Computer Vision (ICCV) 2023.
Paul Liang
Paul Liang is a Ph.D. student in Machine Learning at CMU, advised by Louis-Philippe Morency and Ruslan Salakhutdinov. His research lies in the foundations of multimodal machine learning with applications in socially intelligent AI, understanding human and machine intelligence, natural language processing, healthcare, and education. He is a recipient of the Facebook PhD Fellowship, the Center for Machine Learning and Health Fellowship, and the Alan J. Perlis Graduate Student Teaching Award, and his research has been recognized by three best-paper awards at NeurIPS workshops and ICMI. He regularly organizes courses, workshops, and tutorials on multimodal machine learning.
Arsha Nagrani
Arsha Nagrani is a senior research scientist at Google AI Research, where she works on machine learning for video understanding. She received her PhD in the VGG group with Andrew Zisserman at the University of Oxford, supported by an EPSRC grant and a Google PhD Fellowship Award. Her thesis “Video Understanding using Multimodal Deep Learning” won the 2021 ELLIS PhD award. Her research focuses on self-supervised and multimodal machine learning techniques for video recognition, including the use of sound and text to learn better visual representations.
Siddharth Narayanaswamy
Siddharth N. is a Reader in Explainable AI in the School of Informatics at the University of Edinburgh, a part-time Senior Research Fellow at the Alan Turing Institute, a Visiting Fellow at the Department of Engineering Science at the University of Oxford, and an ELLIS Scholar. He was previously a Senior Researcher in Engineering at the University of Oxford and a Postdoctoral Scholar in Psychology at Stanford. He obtained his PhD from Purdue University in Electrical and Computer Engineering. His research interests are broadly cross-disciplinary and motivated by problems found at the intersection of machine learning, computer vision, natural-language processing, cognitive science, robotics, and neuroscience. In particular, he is interested in unsupervised learning of structured representations from perceptual data, and establishing common ground between machines and humans through interaction, with implications for building robust, generalisable, and interpretable AI and ML systems.