Keynote Talks // ACM Multimedia Asia 2024

Prof Wenwu Zhu

Editor-in-Chief of IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

Fellow of ACM, IEEE, AAAS, SPIE

Tsinghua University

China

Talk Title:
Multimodal Generative AI in Dynamic and Open Environments

Talk Abstract:
Multimodal Generative AI leverages artificial intelligence to generate diverse content, including text, images, and videos. It has found broad applications across fields such as animation creation, game design, and film production. Recent advancements in pre-trained large language models and diffusion models have significantly accelerated the development of this technology. However, these pre-trained models often struggle to adapt to dynamically changing environments and evolving user needs, posing substantial challenges for multimodal generative AI in dynamic and open settings. This talk explores strategies for generating novel concepts in such environments, particularly focusing on scenarios involving multiple subjects, behaviors, and contexts. We begin by introducing a generative AI framework based on disentangled representation learning. Following this, we present a series of models and delve into the key technologies underpinning them. Finally, we outline future research directions for advancing multimodal generative AI.

BIO:
Wenwu Zhu is currently a Professor and Vice Dean of the National Research Center on Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008. He worked at Bell Labs, New Jersey, as a Member of Technical Staff from 1996 to 1999. He received his Ph.D. from New York University in 1996.

His current research interests are in the areas of multimedia analysis and graph representation learning. He has published over 400 papers in refereed journals and received ten Best Paper Awards, including IEEE TCSVT in 2001 and 2019, and ACM Multimedia 2012. He received the 2023 ACM SIGMM Technical Achievement Award and the 2024 IEEE Circuits and Systems Charles A. Desoer Technical Achievement Award. He is an ACM Fellow, IEEE Fellow, AAAS Fellow, SPIE Fellow, and a member of Academia Europaea.

He has served as the Editor-in-Chief of IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) since January 1, 2024. He served as the Editor-in-Chief of IEEE Transactions on Multimedia (T-MM) from January 1, 2017, to December 31, 2019. He served as the chair of the steering committee for IEEE T-MM from January 1, 2020, to December 31, 2022.

Prof Yonggang Wen

Editor-in-Chief of IEEE Transactions on Multimedia (T-MM)

Nanyang Technological University (NTU)

Singapore

Talk Title:
EasyFL: Optimising Federated Learning for Computer Vision Applications

Talk Abstract:
Deep learning has transformed industries through powerful computer vision applications. However, the traditional centralized training approach is facing serious challenges due to ever-increasing data privacy regulations. To mitigate this problem, Federated Learning (FL) has emerged as a distributed training paradigm that trains deep learning models on user devices, protecting data privacy by eliminating the need for data transfer to a central server. Despite FL's significant potential for training computer vision applications, it is still in its early stages and requires further optimization, both in system performance and in specialization for fast-growing computer vision applications.
In this talk, we focus on how to optimize FL platforms for computer vision applications through system and algorithmic optimizations. We begin by introducing our low-code FL platform, EasyFL, which improves researchers' productivity and efficiency in implementing new federated computer vision applications. It allows users to write less code while achieving a 1.5 times training speedup. Building on EasyFL, we then present multiple algorithmic optimizations to improve accuracy for various computer vision applications, including person re-identification, face recognition, and self-supervised learning. Finally, we present algorithmic and system optimizations for training multiple simultaneous FL activities under resource constraints.

BIO:
Yonggang Wen is a Professor of Computer Science and Engineering at Nanyang Technological University (NTU), Singapore. He has been serving as the Associate Dean (Research) at the College of Engineering at NTU Singapore since 2018. He served as the Acting Director of the Nanyang Technopreneurship Centre (NTC) at NTU from 2017 to 2019, and as the Assistant Chair (Innovation) of the School of Computer Science and Engineering (SCSE) at NTU from 2016 to 2018.

He has worked extensively in learning-based system prototyping and performance optimization for large-scale networked computer systems. He has won the 2020 IEEE TCCPS Industrial Technical Excellence Award, the 2016 ASEAN ICT Awards, the 2015 Datacentre Dynamics Awards – APAC, and the 2016 Nanyang Awards in Innovation and Entrepreneurship. He is a co-recipient of multiple journal and conference best paper awards, including IEEE Transactions on Circuits and Systems for Video Technology (2019), IEEE Multimedia (2015), and IEEE VCIP (2020).

He is the Editor-in-Chief of IEEE Transactions on Multimedia. He has served or is serving on the editorial boards of multiple transactions and journals, including IEEE Transactions on Circuits and Systems for Video Technology, IEEE Wireless Communications Magazine, IEEE Communications Surveys & Tutorials, and IEEE Transactions on Signal and Information Processing over Networks. He was elected Chair of the IEEE ComSoc Multimedia Communications Technical Committee (2014-2016).

Prof Klara Nahrstedt

Editor-in-Chief of the ACM/Springer Multimedia Systems journal

Fellow of ACM, IEEE, AAAS

University of Illinois Urbana-Champaign

USA

Talk Title:
End-to-End System and Networking Challenges of Multi-View Video Systems

Talk Abstract:
With the decreasing cost and increasing scale of 2D cameras, the emergence of 360° cameras, neural video, and volumetric video content, and the extensive deployment of VR/AR display devices, multi-view video content is becoming an integral part of our environments, and with it comes the demand to address the underlying system and networking challenges. In this talk, we will discuss the end-to-end issues of multi-view video systems, ranging from multi-view video generation, navigation, and streaming to distribution and viewing, and their large bandwidth and low latency demands. Numerous potential system and network solutions are becoming available to satisfy the end-to-end demands on bandwidth and latency, including utilizing semantics in multi-camera resource management, designing view-based protocols, experimenting with viewport navigation and prediction techniques, and developing view/viewer management services to enable interactive viewing. Current results of multi-view video systems show promising new concepts and algorithms to address the bandwidth and latency challenges, and to enable users' interactive viewing with high quality of experience, but further challenges remain.

BIO:
Klara Nahrstedt is the Swanlund Distinguished Chair Professor in the Siebel School of Computing and Data Science at the University of Illinois at Urbana-Champaign and the Director of the Coordinated Science Laboratory, which is an Interdisciplinary Research Organization within the College of Engineering.

She was the elected chair of ACM SIGMM between 2007 and 2013. She is an ACM Fellow, IEEE Fellow, AAAS Fellow, a member of the Leopoldina German National Academy of Sciences, and a member of the National Academy of Engineering. She received her Ph.D. from the Department of Computer and Information Science at the University of Pennsylvania in 1995.

She is the recipient of the IEEE Communications Society Leonard Abraham Award for Research Achievements, the 2008 University Scholar Award, the 2009 Humboldt Research Award, the 2012 IEEE Computer Society Technical Achievement Award, the 2014 ACM Special Interest Group on Multimedia (SIGMM) Technical Achievement Award, the 2018 Robert Piloty Prize, the 2019 Tau Beta Pi Daniel C. Drucker Eminent Award in the College of Engineering, the 2020 Grainger Distinguished Chair in Engineering, the 2024 Harrold and Notkin Research and Graduate Mentoring Award, and the 2024 Swanlund Distinguished Chair.

She has been the editor-in-chief of the ACM/Springer Multimedia Systems journal; associate editor of the ACM Transactions on Multimedia Computing, Communications and Applications; associate editor of the IEEE Transactions on Multimedia; associate editor of the IEEE Transactions on Information Forensics & Security; associate editor of IEEE Multimedia Magazine; general co-chair of ACM Multimedia 2006, IEEE PerCom 2009, ACM/IEEE IOTDI 2019, IEEE SmartGridComm 2020, and IEEE SECON 2022; and honorary general co-chair of ACM Multimedia 2024.