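Doctoral Supervisor
Shuhui Wang (王树徽), Professor
Email: wangshuhui@ict.ac.cn and shuhui.wang@vipl.ict.ac.cn
Address: No. 6 Kexueyuan South Road, Haidian District, Beijing (Tel: 010-62600573)
Research interests: cross-media understanding and interaction, cross-media knowledge engineering, heterogeneous data mining, machine learning
Biography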

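Shuhui Wang received a B.Eng. degree from Tsinghua University in 2006 and a Ph.D. in engineering from the Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS) in July 2012. After completing a postdoctoral appointment at ICT in October 2014, Wang joined the institute's staff and has served successively as Assistant Professor, Associate Professor (2015), and Professor (2020). Wang's research covers cross-media understanding and knowledge reasoning, big data theory and methods, and machine learning. Wang has published or had accepted more than 60 papers in top journals and conferences in multimedia, vision, data science, and artificial intelligence, including the IEEE/ACM transactions TPAMI, TIP, TKDE, TMM, TOMM, TCSVT, TKDD, and TIST, and the conferences NeurIPS, ICCV, CVPR, ACM Multimedia, ECCV, SIGMOD, VLDB, AAAI, and IJCAI, and holds 4 granted national patents. Wang has served multiple times as Area Chair for the top international conferences ACM Multimedia, IJCAI, and AAAI, has taken part in organizing international conferences such as ICME, PCM, and ICIMCS, and reviews for dozens of high-level international journals and top conferences. Wang has participated in major research programs including the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, 973 Program projects, and 863 Program projects, and has received the Excellent Young Scientists Fund of the National Natural Science Foundation of China. Wang maintains active research collaborations with several internet companies.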


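Students with a strong interest and relevant research background in frontier topics such as image and video understanding, image-text retrieval and content translation and generation, cross-media analysis and reasoning, and cross-media knowledge engineering are welcome to apply for Ph.D. and master's programs!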


Experience

Education

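2006.9 ~ 2012.7 Institute of Computing Technology, Chinese Academy of Sciences, Computer Application Technology, Ph.D. in Engineering (admitted without entrance examination)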

2002.9 ~ 2006.7 Tsinghua University, Electronic Information Engineering, B.Eng.

Academic Experience

2016~present: Cross-media analysis, reasoning, and interaction in open environments

2011~present: Vision-language / cross-media correlation learning

2009~2013: Image classification based on multi-feature fusion learning

2006~2009: Large-scale image and video understanding and retrieval

2006: Face classification based on dimensionality-reduction learning


Academic Service

Journal Service

[1]   Reviewer of Information Sciences (Elsevier) and Pattern Recognition (Elsevier).

[2]   Reviewer of ACM-TKDD and ACM-TOMCCAP

[3]   Reviewer of IEEE-TIP, IEEE-TKDE, IEEE-TMM, IEEE-TCSVT, IEEE-TCYB and IEEE-TBD.

Conference Service

[1]   Outstanding reviewer of NeurIPS 2021

[2]   Area Chair of ACM Multimedia 2019-2021.

[3]   TPC member of ECCV'20, CVPR'18-20, ICCV'19, AAAI'19-20, IJCAI'18-20, ACCV'18, PCM'18, PRCV'18, ChinaMM'18.

[4]   Publication Chair of PCM 2017.

[5]   PC Co-chair of the MASS workshop, with APWEB-WAIM 2017, Jul. 7, 2017.

[6]   PC Co-Chair, 1st International Workshop on Mobility Analytics for Spatio-temporal and Social Data (MATES), VLDB'17, Sept. 1, 2017.

[7]   Publication Chair, ACM International Conference on Internet Multimedia Computing and Service (ICIMCS'15), Aug. 17-21, 2015, Zhangjiajie, Hunan.

Research Topics

1.   Analysis of multi-source heterogeneous media big data

Including: multimedia content recommendation, machine learning methods for noisy web data, and online community analysis.

2.   Efficient multimodal perception and representation

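Including: perception enhancement, image translation (style transfer), and human-like multimodal active perception driven by knowledge, memory, and language.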

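3.   Visual and cross-media understanding

Including: domain adaptation and transfer learning; object, scene, and event classification in images and videos; multimodal collaborative semantic analysis of media topics; and cross-media knowledge graph construction and learning.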

4.   Multimodal reasoning and interaction

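Including: vision-language representation learning and retrieval, knowledge reasoning and knowledge generalization, visual content summarization and content generation, and multimodal question answering and dialogue.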


Publications

Books

1.   Siyuan Liu, Shuhui Wang, Qiang Qu. Trajectory Mining. Book chapter of Encyclopedia of GIS, Springer, ISBN: 978-3-319-23519-6 (Online), 2017.


Papers

Major publications in reverse chronological order (see DBLP for full list):


1. Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian. Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation. arXiv:2107.06154v3

2. Xinzhe Han, Shuhui Wang, Chi Su, Qi Tian, Qingming Huang. Greedy Gradient Ensemble for Robust Visual Question Answering. ICCV, 2021. (Accepted as Oral paper)

3. Jingru Gan, Jinchang Luo, Haiwei Wang, Shuhui Wang, Wei He, Qingming Huang. Multimodal Entity Linking: A New Dataset and A Baseline. ACM Multimedia, 2021. (Accepted as Oral paper)

4. Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian. Semi-autoregressive Image Captioning. ACM Multimedia, 2021. (Accepted as Oral paper)

5. Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian. Self-Regulated Learning for Egocentric Video Activity Anticipation. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), accepted. Code

6. Shuhao Cui, Xuan Jin, Shuhui Wang, Yuan He, Qingming Huang. Heuristic Domain Adaptation. NeurIPS, 2020.

7. Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian. Towards More Explainability: Concept Knowledge Mining Network for Event Recognition. ACM Multimedia, 2020.

8. Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Weigang Zhang, Qingming Huang. Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis. ACM Multimedia, 2020.

9. Xiaodan Li, Yining Lang, Yuefeng Chen, Xiaofeng Mao, Yuan He, Shuhui Wang, Hui Xue, Quan Lu. Sharp Multiple Instance Learning for DeepFake Video Detection. ACM Multimedia, 2020.

10. Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian. Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision. ECCV, 2020.

11. Shuhui Wang, Ling Hu, Liang Li, Weigang Zhang, Qingming Huang. Two-Stream Deep Sparse Network for Accurate and Efficient Image Restoration. Computer Vision and Image Understanding (CVIU), 200: 103029, 2020.

12. Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian. Learning Feature Representation and Partial Correlation for Multimodal Multi-Labeled Data. IEEE Transactions on Multimedia (TMM), 23:1882-1894, 2021.

13. Dan Guo, Hui Wang, Shuhui Wang, Meng Wang. Textual-Visual Reference-aware Attention Network for Visual Dialog. IEEE Transactions on Image Processing (TIP), vol. 29, pp. 6655-6666, 2020.

14. Yiling Wu, Shuhui Wang, Guoli Song, Qingming Huang. Augmented Adversarial Training for Cross-modal Retrieval. IEEE Transactions on Multimedia (TMM), 23:559-571, 2020. Code

15. Shijie Yang, Liang Li, Shuhui Wang, Weigang Zhang, Qingming Huang, Qi Tian. A Structured Latent Variable Recurrent Network with Stochastic Attention for Generating Weibo Comments. IJCAI, 2020.

16. Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian. Gradually Vanishing Bridge for Adversarial Domain Adaptation. CVPR, 2020.  Code

17. Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian. Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations. CVPR, 2020. (Oral) Code

18. Beichen Zhang, Liang Li, Shijie Yang, Shuhui Wang, Zheng-Jun Zha, Qingming Huang. State-Relabeling Adversarial Active Learning. CVPR, 2020. (Oral)

19. Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian. Label Decoupling Framework for Salient Object Detection. CVPR, 2020. 

20. Dechao Meng, Liang Li, Xuejing Liu, Yadong Li, Shijie Yang, Zheng-Jun Zha, Xingyu Gao, Shuhui Wang, Qingming Huang. Parsing-based View-aware Embedding Network for Vehicle Re-Identification. CVPR, 2020.

21. Jun Wei, Shuhui Wang, Qingming Huang. F3Net: Fusion, Feedback and Focus for Salient Object Detection. AAAI, 2020. (Oral) Code

22. Yiling Wu, Shuhui Wang, Qingming Huang. Online Fast Adaptive Low-rank Similarity Learning for Cross-Modal Retrieval. IEEE Transactions on Multimedia (TMM), 22(5): 1310-1322, 2020.

23. Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian. Harmonized Multimodal Learning with Gaussian Process Latent Variable Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 43(3): 858-872, 2021. Paper

24. Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang. Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding. ICCV, 2019. Code

25. Yiling Wu, Shuhui Wang, Guoli Song, Qingming Huang. Learning Fragment Self-Attention Embeddings for Image-Text Matching. ACM Multimedia, pp. 2088-2096, 2019. (oral)  Code

26. Xuejing Liu, Liang Li, Shuhui Wang, Zhengjun Zha, Li Su, Qingming Huang. Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding. ACM Multimedia, pp. 539-547, 2019. (oral)

27. Shijie Yang, Liang Li, Shuhui Wang, Dechao Meng, Qingming Huang and Qi Tian. Structured Stochastic Recurrent Network for Linguistic Video Prediction. ACM Multimedia, pp. 21-29, 2019. (oral)

28. Shuhui Wang, Liang Li, Chenxue Yang, Qingming Huang. Regularized Topic-aware Latent Influence Propagation in Dynamic Relational Networks. GeoInformatica, 23(3): 329-352, 2019. Paper

29. Liang Li, Xinge Zhu, Yiming Hao, Shuhui Wang, Xingyu Gao, Qingming Huang. A Hierarchical CNN-RNN Approach for Visual Emotion Classification. ACM Trans. Multimedia Comput. Commun. Appl. (TOMM), 2019, 15(3s): 1-17. 

30. Shijie Yang, Liang Li, Shuhui Wang, Weigang Zhang, Qingming Huang, Qi Tian. SkeletonNet: A Hybrid Network with a Skeleton-Embedding Process for Multi-view Image Representation Learning. IEEE Transactions on Multimedia, 21(11): pp. 2916-2929, 2019.

31. Yiling Wu, Shuhui Wang, Guoli Song, Qingming Huang. Online Asymmetric Metric Learning with Multi-Layer Similarity Aggregation for Cross-Modal Retrieval. IEEE Transactions on Image Processing, vol. 28, no. 9, pp. 4299-4312, 2019. Code

32. Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang. Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization. In CVPR, 2019. Paper, Code

33. Zhe Xue, Guorong Li, Shuhui Wang, Weigang Zhang, Qingming Huang. Bilevel Multiview Latent Space Learning. IEEE Trans. Circuits Syst. Video Techn. 28(2): 327-341, 2018.

34. Yangyu Chen, Shuhui Wang, Weigang Zhang, Qingming Huang. Less is More: Picking Informative Frames for Video Captioning. ECCV, 2018. Code

35. Shuhui Wang, Yangyu Chen, Junbao Zhuo, Qingming Huang, Qi Tian. Joint Global and Co-Attentive Representation Learning for Image-Sentence Retrieval. ACM Multimedia, 2018. (Oral)

36. Yiling Wu, Shuhui Wang, Qingming Huang. Learning Semantic Structure-preserved Embeddings for Cross-modal Retrieval. ACM Multimedia, 2018. 

37. Liang Li, Shuhui Wang, Shuqiang Jiang, Qingming Huang. Attentive Recurrent Neural Network for Weak-supervised Multi-label Image Classification. ACM Multimedia, 2018. 

38. Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian: Multimodal Similarity Gaussian Process Latent Variable Model. IEEE Trans. Image Processing 26(9): 4168-4181 (2017). Code

39. Jiaming Zhang, Shuhui Wang, Qingming Huang: Location-Based Parallel Tag Completion for Geo-Tagged Social Image Retrieval. ACM TIST 8(3): 38:1-38:21 (2017).

40. Siyuan Liu, Shuhui Wang: Trajectory Community Discovery and Recommendation by Multi-Source Diffusion Modeling. IEEE Trans. Knowl. Data Eng. 29(4): 898-911 (2017). 

41. Yiling Wu, Shuhui Wang, Qingming Huang: Online Asymmetric Similarity Learning for Cross-Modal Retrieval. CVPR 2017: 3984-3993. 

42. Shijie Yang, Liang Li, Shuhui Wang, Weigang Zhang, Qingming Huang: A Graph Regularized Deep Neural Network for Unsupervised Image Representation Learning. CVPR 2017: 7053-7061. 

43. Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian: Multimodal Gaussian Process Latent Variable Models with Harmonization. ICCV 2017: 5039-5047. Code

44. Junbao Zhuo, Shuhui Wang, Weigang Zhang, Qingming Huang: Deep Unsupervised Convolutional Domain Adaptation. ACM Multimedia 2017: 261-269. 

45. Weiqing Min, Shuqiang Jiang, Shuhui Wang, Jitao Sang, Shuhuan Mei: A Delicious Recipe Analysis Framework for Exploring Multi-Modal Recipes with Various Attributes. ACM Multimedia 2017: 402-410. 

46. Lingyang Chu, Yanyan Zhang, Guorong Li, Shuhui Wang, Weigang Zhang, Qingming Huang: Effective Multimodality Fusion Framework for Cross-Media Topic Detection. IEEE Trans. Circuits Syst. Video Techn. 26(3): 556-569 (2016).

47. Yan Hua, Shuhui Wang, Siyuan Liu, Anni Cai, Qingming Huang: Cross-Modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation. IEEE Trans. Multimedia 18(6): 1201-1216 (2016).

48. Lingyang Chu, Shuhui Wang, Siyuan Liu, Qingming Huang, Jian Pei: ALID: Scalable Dominant Cluster Detection. PVLDB 8(8): 826-837 (2015). 

49. Li Shen, Gang Sun, Qingming Huang, Shuhui Wang, Zhouchen Lin, Enhua Wu: Multi-Level Discriminative Dictionary Learning With Application to Large Scale Image Classification. IEEE Trans. Image Processing 24(10): 3109-3123 (2015). 

50. Siyuan Liu, Qiang Qu, Shuhui Wang: Rationality Analytics from Trajectories. TKDD 10(1): 10:1-10:22 (2015).

51. Siyuan Liu, Shuhui Wang, Feida Zhu: Structured Learning from Heterogeneous Behavior for Social Identity Linkage. IEEE Trans. Knowl. Data Eng. 27(7): 2005-2019 (2015). 

52. Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian: Similarity Gaussian Process Latent Variable Model for Multi-modal Data Analysis. ICCV 2015: 4050-4058. 

53. Yan Hua, Shuhui Wang, Siyuan Liu, Qingming Huang, Anni Cai: TINA: Cross-Modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation. ICDM 2014: 190-199.

54. Siyuan Liu, Shuhui Wang, Feida Zhu, Jinbo Zhang, Ramayya Krishnan: HYDRA: large-scale social identity linkage via heterogeneous behavior modeling. SIGMOD Conference 2014: 51-62. 

55. Lingyang Chu, Shuqiang Jiang, Shuhui Wang, Yanyan Zhang, Qingming Huang: Robust Spatial Consistency Graph Model for Partial Duplicate Image Retrieval. IEEE Trans. Multimedia 15(8): 1982-1996 (2013).

56. Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang: Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization. CVPR 2013: 383-390. 

57. Shuhui Wang, Qingming Huang, Shuqiang Jiang, Qi Tian: S3MKL: Scalable Semi-Supervised Multiple Kernel Learning for Real-World Image Applications. IEEE Trans. Multimedia 14(4): 1259-1274 (2012).

58. Shuhui Wang, Shuqiang Jiang, Qingming Huang, Qi Tian: Multi-feature metric learning with knowledge transfer among semantics and social tagging. CVPR 2012: 2240-2247. 

59. Shuhui Wang, Shuqiang Jiang, Qingming Huang, Qi Tian: S3MKL: scalable semi-supervised multiple kernel learning for image data mining. ACM Multimedia 2010: 163-172.