PhD Candidate - University of Warwick
Jiaqi Li
My recent work spans multimodal video understanding, temporal grounding, human pose estimation, and embodied AI.
About
I am a PhD candidate in Computer Science at the University of Warwick, supervised by Prof. Guan Yu. Previously, I completed an MSc in Computer Science at The University of Hong Kong and a BSc in Information and Computing Science, with a minor in Computer Science and Technology.
Research Interests
- Long video understanding
- Multimodal models
- Video temporal grounding
- Temporal action localization
- Embodied AI
- Human pose estimation
Highlights
News
Our collaborative paper Doppler Prompting for Stable mmWave-based Human Pose Estimation was accepted to ICML 2026.
Towards Mitigating Modality Bias in Vision-Language Models for Temporal Action Localization and a collaborative paper were accepted to ACL 2026.
Our collaborative paper Person Parametric Physics-informed Representation for mmWave-based Human Pose Estimation was accepted to IMWUT.
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning was accepted to ACL Findings 2025.
Selected Publications
Towards Mitigating Modality Bias in Vision-Language Models for Temporal Action Localization
We propose ActionVLM, a vision-language framework for temporal action localization that uses Language Advantage to adaptively weight language, mitigating language shortcuts and grounding localization in visual evidence.
Keeping the Evidence Chain: Semantic Evidence Allocation for Training-Free Token Pruning in Video Temporal Grounding
SemVID is a training-free VTG token pruning framework that preserves both boundary-critical evidence and cross-frame reasoning.
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning
A novel fine-tuning framework that automatically synthesizes training data tailored to rejecting questions that exceed the model's knowledge, without compromising performance on other tasks.
Doppler Prompting for Stable mmWave-based Human Pose Estimation
We improve mmWave human pose stability by treating Doppler as a confidence-gated motion prompt that selectively conditions spatial magnitude, reducing spurious motion artifacts and velocity error across single- and multi-person benchmarks.
Person Parametric Physics-informed Representation for mmWave-based Human Pose Estimation
This paper proposes a new input paradigm for mmWave-based human pose estimation, which models the human body as a Gaussian ensemble enriched with electromagnetic and kinematic parameters.
