Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
portfolio
publications
Hot Topic Clustering based on Gaussian Mixture Model built-in DTW
Published in IEEE International Conference on Pattern Recognition and Machine Learning (PRML), 2023
A Gaussian mixture model with built-in DTW for clustering variable-length time-series without dimensional explosion.
RMFDNet: Redundant and Missing Feature Decoupling Network for Salient Object Detection
Published in Engineering Applications of Artificial Intelligence, 2025
Decouples redundant and missing features through auxiliary decoders for more effective salient object detection.
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning
Published in Findings of the Association for Computational Linguistics: ACL, 2025
A novel fine-tuning framework that automatically synthesizes training data tailored to rejecting questions that exceed the model's knowledge, without compromising performance on other tasks.
Generating Multi-Modal Knowledge Clues as an Image: Towards Improving Image-Sequence Reasoning with Assisted Visual Input
Published in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2026
Generates multimodal knowledge clues as an image to strengthen image-sequence reasoning with assisted visual input.
Why Learn What Physics Already Knows? Realizing Agile mmWave-based Human Pose Estimation via Physics-Guided Preprocessing
Published in IEEE International Conference on Multimedia and Expo (ICME), 2026
We propose a lightweight mmWave pose estimation framework that exploits physical priors to reduce parameters by 55.7%–88.9% while maintaining competitive accuracy and enabling real-time Raspberry Pi deployment.
Automatic and Reliable Faithfulness Evaluation for Scientific Text-to-Image Generation with LMMs
Under review, 2026
The first automatic faithfulness evaluation metric specifically designed for scientific image tasks.
Keeping the Evidence Chain: Semantic Evidence Allocation for Training-Free Token Pruning in Video Temporal Grounding
Under review, 2026
SemVID is a training-free VTG token pruning framework that preserves both boundary-critical evidence and cross-frame reasoning.
Person Parametric Physics-informed Representation for mmWave-based Human Pose Estimation
Under review, 2026
This paper proposes a new input paradigm for mmWave-based human pose estimation, which models the human as a Gaussian ensemble enriched with electromagnetic and kinematic parameters.
Towards Mitigating Modality Bias in Vision-Language Models for Temporal Action Localization
Published in Association for Computational Linguistics (ACL), 2026 (Top 1% Overall Assessment)
We propose ActionVLM, a vision-language framework for temporal action localization that uses Language Advantage to adaptively weight language, mitigating language shortcuts and grounding localization in visual evidence.
talks
teaching
Guest Lecture: Video Forensics and Video Compression
Guest lecture, University of Warwick, 2026
Invited guest lecture for CS355 Digital Forensics at the University of Warwick, covering video forensics and video compression.
