Yonghui Wang (王勇惠)

I am currently a 3rd-year Ph.D. student at the University of Science and Technology of China (USTC), supervised by Prof. Houqiang Li and Prof. Wengang Zhou. Before that, I received my Bachelor's degree from Ocean University of China (OUC) in 2021.

My research interests lie in computer vision and deep learning, especially multimodal large language models.

Email  /  Google Scholar  /  GitHub  /  CV


Publications (* indicates equal contribution)


ROOT: VLM based System for Indoor Scene Understanding and Beyond
Yonghui Wang, Shi-Yong Chen, Zhenxing Zhou, Siyi Li, Haoran Li, Wengang Zhou, Houqiang Li
arXiv, 2024
[paper] [code]

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding
Yonghui Wang, Wengang Zhou, Hao Feng, Houqiang Li
arXiv, 2024
[paper] [code]

Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs
Yonghui Wang, Wengang Zhou, Hao Feng, Keyi Zhou, Houqiang Li
arXiv, 2023
[paper] [code]

SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection
Yonghui Wang, Shaokai Liu, Li Li, Wengang Zhou, Houqiang Li
ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM), 2024
[paper] [code]

Detect Any Shadow: Segment Anything for Video Shadow Detection
Yonghui Wang, Wengang Zhou, Yunyao Mao, Houqiang Li
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2023
[paper] [code]

Progressive Recurrent Network for Shadow Removal
Yonghui Wang, Wengang Zhou, Hao Feng, Li Li, Houqiang Li
Computer Vision and Image Understanding (CVIU), 2023
[paper] [code]

UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior
Yonghui Wang, Wengang Zhou, Zhenbo Lu, Houqiang Li
30th ACM International Conference on Multimedia (ACM MM), 2022
[paper] [code]

Papers I Co-authored


TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding
Bozhi Luan, Hao Feng, Hong Chen, Yonghui Wang, Wengang Zhou, Houqiang Li
arXiv, 2024
[paper] [code]

LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation
Keyi Zhou, Li Li, Wengang Zhou, Yonghui Wang, Hao Feng, Houqiang Li
arXiv, 2024
[paper] [code]

Awards and Honors


  • National Scholarship for Graduate Students.
    2023
  • Longfor Scholarship.
    2022
  • Outstanding Graduate of OUC.
    2021
  • Second Prize in the National Undergraduate Mathematics Competition.
    2018
  • First Prize in the Shandong Undergraduate Mathematics Competition.
    2018

Research Experience


  • [2024/05 ~ now]     Research Intern, Game AI Center, Tencent IEG (Mentor: Shi-Yong Chen).

Academic Services


  • Conference Reviewer: ACM MM.
  • Journal Reviewer: TOMM, TMM, TCSVT.