Towards Mitigating Modality Bias in Vision-Language Models for Temporal Action Localization
Published at the Association for Computational Linguistics (ACL), 2026. Top 1% overall assessment.
We propose ActionVLM, a vision-language framework for temporal action localization. It uses a Language Advantage signal to adaptively weight the language modality, mitigating language shortcuts and grounding localization in visual evidence.
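One plausible reading of such adaptive weighting can be sketched as follows. This is purely illustrative and not the paper's exact formulation: the function names, the sigmoid gate, and the scalar fusion are all assumptions. The idea is that when a language-only branch alone scores a segment much higher than the visual branch (a likely shortcut), the language modality is down-weighted so localization leans on visual evidence.

```python
import math

def language_advantage_weight(lang_score: float, visual_score: float,
                              alpha: float = 2.0) -> float:
    """Gate the language modality by its 'advantage' over vision.

    Hypothetical gating rule (not from the paper): a large positive
    advantage (language alone already confident) yields a small weight,
    pushing the fused prediction toward the visual branch.
    """
    advantage = lang_score - visual_score
    # sigmoid(-alpha * advantage): weight shrinks as advantage grows
    return 1.0 / (1.0 + math.exp(alpha * advantage))

def fuse(lang_logit: float, visual_logit: float, weight: float) -> float:
    """Convex combination of the two modality logits."""
    return weight * lang_logit + (1.0 - weight) * visual_logit
```

For example, a segment where the language branch is confident but the visual branch is not receives a small language weight, while the reverse case keeps language contribution high.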
