The Visual Modeling (VISMOD) group researches innovative machine learning methodologies and technologies to solve real-world computer vision and AI+ problems. Its work centers on human-centric vision scenarios, including image and video representation, understanding, and generation, and on applying AI methods in other science and technology domains.
The current main research topics include:
● Models and algorithms in machine learning, especially
1) Learning under data distribution shift: long-tailed learning, few-shot learning, meta-learning, etc.
2) Learning under weak annotations: unsupervised/semi-supervised/weakly supervised learning, etc.
3) Anomaly and generalization dilemma: anomaly detection, out-of-distribution (OOD) detection/generalization, etc.
● Human-centric visual modeling
1) Person re-identification
2) Human pose, motion understanding and generalization
3) Human-centric multimodal large language models
● AI4Science
1) Multimodal large language models with scientific data
2) Deep learning models in specific domains