Workshops

1. Collaboration and Evolution of Foundation and Specialized Models Workshop
Website: https://sites.google.com/view/cefswworkshop2024

This workshop aims to bring together researchers and practitioners who share technical experience and interest in the collaboration and evolution of multimedia foundation and specialized models. The scope of this workshop includes, but is not limited to: 1) analysis and understanding of heterogeneous model collaboration strategies; 2) collaboration paradigms of foundation (e.g., MLLMs) and specialized models; 3) evolution paradigms of foundation (e.g., MLLMs) and specialized models; 4) device-cloud collaborative learning paradigms (cloud-centric collaboration, device-centric collaboration, bidirectional device-cloud collaboration); 5) deployed research on heterogeneous model collaboration and evolution; 6) datasets or benchmarks for heterogeneous model collaboration and evolution.

Contact person:
Shengyu Zhang (sy_zhang@zju.edu.cn)
Organizers:
Shengyu Zhang, Zhejiang University, China
Chaoyue Niu, Shanghai Jiao Tong University, China
Fan Wu, Shanghai Jiao Tong University, China
Hongxia Yang, Hong Kong Polytechnic University, China



2. Workshop on Multimodal Foundation Models for Remote Sensing and Agriculture (MFM-RsAg)
Website: https://mfm-rsag.github.io/

This workshop focuses on exploring the potential of multimodal foundation models in analyzing remote sensing data to provide innovative solutions for agricultural tasks, such as weed management, yield prediction, and crop mapping. Remote sensing data includes a wide range of types, such as hyperspectral, multispectral, LiDAR, SAR, thermal imaging, satellite and UAV imagery, time series, geospatial data, and radar. By leveraging multimedia modeling technologies, the workshop aims to enhance the analysis of this diverse data and advance applications in sustainable agriculture. It will bring together researchers from fields like computer science and remote sensing to collaborate on using foundation models for remote sensing and agricultural applications.

Contact person:
Kun Hu (kun.hu@sydney.edu.au)
Organizers:
Kun Hu, The University of Sydney, Australia
Mingyang Ma, Northwestern Polytechnical University, China
Patrick Filippi, The University of Sydney, Australia
Shaohui Mei, Northwestern Polytechnical University, China
Fan Li, Xi'an Jiaotong University, China
Zhiyong Wang, The University of Sydney, Australia
Thomas Bishop, The University of Sydney, Australia
Mingyi He, Northwestern Polytechnical University, China



3. Workshop of Multimodal, Multilingual and Multitask Modeling Technologies for Oriental Languages
Website: https://sites.google.com/view/m3oriental

The M3Oriental workshop addresses the challenges of low-resource languages in speech and language processing. The workshop focuses on integrating multimodal, multilingual, and multitask modeling technologies using large-scale pretrained models. The goal is to explore their potential in multimodal tasks and cross-lingual communication, which are key features of next-generation artificial intelligence. The workshop covers multiple tasks, such as machine translation (MT), speech translation (ST), speech recognition (ASR), speech synthesis (TTS), voice conversion (VC), and speech emotion recognition (SER).

Contact person:
Sheng Li (sheng.li@nict.go.jp)
Organizers:
Ruili Wang, Massey University
Sheng Li, NICT, Kyoto, Japan
Chenhui Chu, Kyoto University
Jiyi Li, University of Yamanashi
Raj Dabre, NICT, Kyoto, Japan
Xianchao Wu, NVIDIA, Tokyo, Japan
Zuchao Li, Wuhan University
Yang Cao, Tokyo Institute of Technology


4. Workshop on Multi-Biological Sensing Data for Language Deterioration Prediction
Website: https://sites.google.com/view/spandldeteriorate

Digital health applications are an active research topic in Computer Vision, NLP, Audio Analysis, and Biomedical Informatics, with significant potential impact across various domains, including the prediction of language deterioration. However, researchers still lack a comprehensive understanding of multimodal biological analysis for language deterioration. The relationships and dynamics between different sensor data and speech data are particularly understudied, especially in longitudinal studies and across heterogeneous populations. With the advent of large pre-trained models, particularly multimodal LLMs, it has become increasingly feasible to enhance language deterioration detection using diverse data modalities. At the same time, further research into assistive technologies is essential to improve speech communication for patients with speech and language disorders, thereby enhancing communication efficiency and supporting mental health monitoring. SPandLDeteriorate is thus a significant addition to ACM Multimedia Asia 2024, offering a unique platform to discuss the convergence of these critical areas.

Contact person:
Yilin Pan (yilin.pan@dlmu.edu.cn)
Organizers:
Yilin Pan, Dalian Maritime University
Yuanchao Li, University of Edinburgh
Haiyang Zhang, Xi'an Jiaotong-Liverpool University
Zhaojie Luo, Southeast University
Zhao Ren, University of Bremen
Ting Dang, University of Melbourne
Siyang Song, University of Leicester
Nicholas Cummins, King’s College London
Yijia Zhang, Dalian Maritime University