About Me

I am Peijun Bao (包培钧), currently a PhD student at RoseLab@NTU, supervised by Prof. Alex Kot (SAEng/IEEE Life Fellow) and Prof. Er Meng Hwa (SAEng/IEEE Life Fellow). I also collaborate closely with Prof. Yong Xia (NPU), Prof. Zheng Qian (ZJU), Prof. Yadong Mu (PKU), and Dr. Yang Wenhan (Pengcheng Lab).

My research interests lie in computer vision and machine learning, with a focus on the multimodal understanding of video and language. My goal is to enable machines to understand video content accurately while keeping the manual annotation effort low.

News

  • [2024.08] Our ECCV paper was selected for oral presentation!
  • [2024.07] 1 paper was accepted to ECCV 2024!
  • [2023.12] 2 papers were accepted to AAAI 2024!
  • [2022.12] 1 paper was accepted to AAAI 2023!
  • [2022.01] 1 paper was accepted to ICMR 2022!
  • [2021.01] 1 paper was accepted to AAAI 2021!

Publications

Peijun Bao, Chenqi Kong, Siyuan Yang, Zihao Shao, Xinghao Jiang, Boon Poh Ng, Meng Hwa Er, Alex Kot,
Vid-Group: Temporal Video Grounding Pretraining from Unlabeled Videos in the Wild,
arXiv preprint arXiv:2412.00811 [pdf], [bib], [code]

Peijun Bao, Zihao Shao, Wenhan Yang, Boon Poh Ng, Alex Kot,
E3M: Zero-Shot Spatio-Temporal Video Grounding with Expectation-Maximization Multimodal Modulation,
European Conference on Computer Vision (ECCV), 2024 (oral, top 2.4%) [pdf], [bib], [code]

Peijun Bao, Zihao Shao, Wenhan Yang, Boon Poh Ng, Meng Hwa Er, Alex Kot,
Omnipotent Distillation with LLMs for Weakly-Supervised Natural Language Video Localization: When Divergence Meets Consistency,
Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI), 2024 [pdf], [bib]

Peijun Bao, Yong Xia, Wenhan Yang, Boon Poh Ng, Meng Hwa Er, Alex Kot,
Local-Global Multi-Modal Distillation for Weakly-Supervised Temporal Video Grounding,
Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI), 2024 [pdf], [bib]

Peijun Bao, Wenhan Yang, Boon Poh Ng, Meng Hwa Er, Alex Kot,
Cross-Modal Label Contrastive Learning for Unsupervised Audio-Visual Event Localization,
Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023 (oral) [pdf], [bib]

Peijun Bao, Qian Zheng, Yadong Mu,
Dense Events Grounding in Video,
Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI), 2021 (oral) [pdf], [bib], [code]
Note: this paper proposes a new task, Video Paragraph Grounding, which has since been widely adopted.
Follow-up works on this task include [CVPR24], [CVPR23], [CVPR22], [AAAI24], [ACM MM24], [CVIU24], and [EMNLP22].

Peijun Bao, Yadong Mu,
Learning Sample Importance for Cross-Scenario Video Temporal Grounding,
The 12th International Conference on Multimedia Retrieval (ICMR), 2022 (oral) [pdf], [bib]

Chenchen Liu, Yongzhi Li, Kangqi Ma, Duo Zhang, Peijun Bao, Yadong Mu,
Learning 3-D Human Pose Estimation from Catadioptric Videos,
The 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021 [pdf]