Publications

Beta-CLIP: Text-Conditioned Contrastive Learning for Multi-Granular Vision-Language Alignment

BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding

Shuming Liu, Chen Zhao^†, Tianqi Xu, Bernard Ghanem

CVPR 2025

SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning

Fida Mohammad Thoker, Letian Jiang, Chen Zhao^†, Bernard Ghanem

CVPR 2025

OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction

Gehui Li^*, Bin Chen^*, Chen Zhao^†, Lei Zhang, Jian Zhang

CVPR 2025

Invertible Diffusion Models for Compressed Sensing

Bin Chen, Zhenyu Zhang, Weiqi Li, Chen Zhao^†, Jiwen Yu, Shijie Zhao, Jie Chen, Jian Zhang

TPAMI 2025

PDF Code

SEVERE++: Evaluating Benchmark Sensitivity in Generalization of Video Representation Learning

Fida Mohammad Thoker, Letian Jiang, Chen Zhao, Piyush Bagad, Hazel Doughty, Bernard Ghanem, Cees GM Snoek

arXiv 2025

OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection

Shuming Liu^*, Chen Zhao^*, Fatimah Zohra, Mattia Soldan, Alejandro Pardo, Mengmeng Xu, Lama Alssum, Merey Ramazanova, Juan León Alcázar, Anthony Cioppa, Silvio Giancola, Carlos Hinojosa, Bernard Ghanem

CVPRW 2025

Effectiveness of Max-Pooling for Fine-Tuning CLIP on Videos

Fatimah Zohra, Chen Zhao, Shuming Liu, Bernard Ghanem

CVPRW 2025

Ego4D: Around the World in 3,000 Hours of Egocentric Video

Chen Zhao, with 84 other authors

TPAMI 2025. [Best Paper Nominee, Oral]

Towards Automated Movie Trailer Generation

Dawit Mureja Argaw, Mattia Soldan, Alejandro Pardo, Chen Zhao^†, Fabian Caba Heilbron, Joon Son Chung, Bernard Ghanem

CVPR 2024

Dr²Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Chen Zhao, Shuming Liu, Karttikeya Mangalam, Guocheng Qian, Fatimah Zohra, Abdulmohsen Alghannam, Jitendra Malik, Bernard Ghanem

CVPR 2024

Code PDF Video

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Chen Zhao, with 100 other authors

CVPR 2024

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames

Shuming Liu, Chen-Lin Zhang, Chen Zhao^†, Bernard Ghanem

CVPR 2024

Re²TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

Chen Zhao, Shuming Liu, Karttikeya Mangalam, Bernard Ghanem

CVPR 2023

Code PDF Video

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

Jiwen Yu, Yinhuai Wang, Chen Zhao^†, Bernard Ghanem, Jian Zhang^†

ICCV 2023

EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries

Jinjie Mai, Abdullah Hamdi, Silvio Giancola, Chen Zhao, Bernard Ghanem

ICCV 2023. [First place in the Ego4D VQ3D Challenge 2023, Oral]

A Unified Continual Learning Framework with General Parameter-Efficient Tuning

Qiankun Gao, Chen Zhao^†, Yifan Sun, Teng Xi, Gang Zhang, Bernard Ghanem, Jian Zhang^†

ICCV 2023

Large-capacity and Flexible Video Steganography via Invertible Neural Network

Chong Mou, Youmin Xu, Jiechong Song, Chen Zhao, Bernard Ghanem, Jian Zhang

CVPR 2023

ETAD: Training Action Detection End to End on a Laptop

Shuming Liu, Mengmeng Xu, Chen Zhao, Xu Zhao, Bernard Ghanem

CVPRW 2023. [Oral]

Owl (observe, watch, listen): Localizing actions in egocentric video via audiovisual temporal context

Merey Ramazanova, Victor Escorcia, Fabian Caba Heilbron, Chen Zhao, Bernard Ghanem

CVPRW 2023

Just a Glimpse: Rethinking Temporal Information for Video Continual Learning

Lama Alssum, Juan León Alcázar, Merey Ramazanova, Chen Zhao, Bernard Ghanem

CVPRW 2023. [Best Paper Award, Oral]

R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning

Qiankun Gao, Chen Zhao, Bernard Ghanem, Jian Zhang

ECCV 2022

End-to-End Active Speaker Detection

Juan Leon Alcazar, Moritz Cordes, Chen Zhao, Bernard Ghanem

ECCV 2022

Evaluation of Diverse Convolutional Neural Networks and Training Strategies for Wheat Leaf Disease Identification with Field-Acquired Photographs

Jiale Jiang, Haiyan Liu, Chen Zhao, Can He, Jifeng Ma, Tao Cheng, Yan Zhu, Weixing Cao, Xia Yao

Remote Sensing 2022

When NAS Meets Trees: An Efficient Algorithm for Neural Architecture Search

Guocheng Qian, Xuanyang Zhang, Guohao Li, Chen Zhao, Yukang Chen, Xiangyu Zhang, Bernard Ghanem, Jian Sun

CVPRW 2022

MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions

Mattia Soldan, Alejandro Pardo, Juan León Alcázar, Fabian Caba Heilbron, Chen Zhao, Silvio Giancola, Bernard Ghanem

CVPR 2022

Ego4D: Around the World in 3,000 Hours of Egocentric Video

Chen Zhao^*, with 84 other authors

CVPR 2022. [Best Paper Nomination, Oral]

SegTAD: Precise Temporal Action Detection via Semantic Segmentation

Chen Zhao, Merey Ramazanova, Mengmeng Xu, Bernard Ghanem

ECCVW 2022

PDF Video

Video Self‑Stitching Graph Network for Temporal Action Localization

Chen Zhao, Ali Thabet, Bernard Ghanem

ICCV 2021

Code PDF Video

ThumbNet: One Thumbnail Image Contains All You Need for Recognition

Chen Zhao, Bernard Ghanem

ACM MM 2020

PDF Video

Improve Baseline for Temporal Action Detection: HACS Challenge 2020 Solution of IVUL‑KAUST team

Mengmeng Xu, Chen Zhao, Merey Ramazanova, David S. Rojas, Ali Thabet, Bernard Ghanem

CVPRW 2020

Code PDF Project Slides Video

G‑TAD: Sub‑Graph Localization for Temporal Action Detection

Mengmeng Xu, Chen Zhao, David Rojas Blanco, Ali Thabet, Bernard Ghanem

CVPR 2020

Optimization‑Inspired Compact Deep Compressive Sensing

Jian Zhang, Chen Zhao, Wen Gao

JSTSP 2020

Code PDF Project

Logistic Regression is Still Alive and Effective: The 3rd YouTube 8M Challenge Solution of the IVUL‑KAUST team

Merey Ramazanova, Chen Zhao, Mengmeng Xu, Humam Alwassel, Sara Rojas Martinez, Fabian Caba, Bernard Ghanem

ICCVW 2019

CREAM: CNN-REgularized ADMM framework for compressive-sensed image reconstruction

Chen Zhao, Jian Zhang, Ronggang Wang, Wen Gao

IEEE Access 2018

BoostNet: A Structured Deep Recursive Network to Boost Image Deblocking

Chen Zhao, Jian Zhang, Ronggang Wang, Wen Gao

VCIP 2018. [Oral]

Better and Faster, when ADMM Meets CNN: Compressive-sensed Image Reconstruction

Chen Zhao, Ronggang Wang, and Wen Gao

PCM 2017. [Oral]

Reducing Image Compression Artifacts by Structural Sparse Representation and Quantization Constraint Prior

Chen Zhao, Jian Zhang, Siwei Ma, Xiaopeng Fan, Yongbing Zhang, Wen Gao

TCSVT 2017

Video Compressive Sensing Reconstruction via Reweighted Residual Sparsity

Chen Zhao, Siwei Ma, Jian Zhang, Ruiqin Xiong and Wen Gao

TCSVT 2017

CONCOLOR: COnstrained Non-Convex Low-Rank Model for Image Deblocking

Jian Zhang, Ruiqin Xiong, Chen Zhao, Yongbing Zhang, Siwei Ma, Wen Gao

TIP 2016

Nonconvex Lp Nuclear Norm based ADMM Framework for Compressive Sensing

Chen Zhao, Jian Zhang, Siwei Ma and Wen Gao

DCC 2016. [Oral (acceptance rate < 10%)]

Compressive-Sensed Image Coding via Stripe-based DPCM

Chen Zhao, Jian Zhang, Siwei Ma and Wen Gao

DCC 2016. [Oral (acceptance rate < 10%)]

A Dual Structured-Sparsity Model for Compressive-Sensed Video Reconstruction

Chen Zhao, Jian Zhang, Siwei Ma, Ruiqin Xiong, Wen Gao

VCIP 2015. [Oral]

An Efficient Image Coding Method Based on Cloud Data

Chen Zhao, Siwei Ma, Xinfeng Zhang, Jian Zhang, Wen Gao

Chinese Journal of Computers 2016. [Recommended from NCMT, Best Paper Award in NCMT]

Thousand to one: An image compression system via cloud search

Chen Zhao, Siwei Ma, Wen Gao

MMSP 2015

Adaptive intra-refresh for low-delay error-resilient video coding

Haoming Chen, Chen Zhao, Ming-Ting Sun and Aaron Drake

JVCIR 2015

Video Compressive Sensing via Structured Laplacian Modelling

Chen Zhao, Siwei Ma, Wen Gao

VCIP 2014. [Oral]

Image Compressive-Sensing Recovery Using Structured Laplacian Sparsity in DCT Domain and Multi-Hypothesis Prediction

Chen Zhao, Siwei Ma, Wen Gao

ICME 2014

Image Compressive Sensing Recovery Using Adaptively Learned Sparsifying Basis via L0 Minimization

Jian Zhang, Chen Zhao, Debin Zhao, Wen Gao

SP 2014

Weakly Supervised Photo Cropping

Luming Zhang, Mingli Song, Yi Yang, Qi Zhao, Chen Zhao, Nicu Sebe

TMM 2014

Wavelet Inpainting Driven Image Compression via Collaborative Sparsity at Low Bit Rates

Chen Zhao, Jian Zhang, Siwei Ma and Wen Gao

ICIP 2013. [Oral]

A Highly Effective Error Concealment Method for Whole Frame Loss

Chen Zhao, Siwei Ma, Jian Zhang, Wen Gao

ISCAS 2013. [Oral]