Chuanxia Zheng

Chuanxia Zheng is a Marie Skłodowska-Curie Actions (MSCA) Fellow and a postdoctoral researcher in VGG at the University of Oxford, working with Prof. Andrea Vedaldi on 3D reconstruction and generation and with Prof. Andrew Zisserman on generative models for understanding.

Before that, he spent one year at Monash University, where he worked as a Research Fellow with Prof. Jianfei Cai and Prof. Dinh Phung on codebook learning for generation. He received his PhD degree from the SCSE at Nanyang Technological University, supervised by Prof. Tat-Jen Cham and Prof. Jianfei Cai on 2D generation, translation and completion. His thesis, "Synthesizing Photorealistic Images", was awarded the NTU Outstanding PhD Thesis Award 2022.

Email  /  CV  /  Google Scholar  /  Github  /  Twitter

News
Research

His research interests focus on computer vision and machine learning, especially generative AI. He has done a wide range of work on 2D and 3D scene synthesis, with the goal of synthesizing a photorealistic physical world via generative AI. In particular, he works on the following topics:

  • 3D geometry and appearance from limited views or videos.
  • 3D editing via object-centric perception.
  • Generative models for the physical world.
  • Multi-modality (1D, 2D, 3D, and 4D) generation.

    DragAPart: Learning a Part-Level Motion Prior for Articulated Objects
    Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
    arXiv, 2024
    project page / arXiv / code / demo

    Models physical interaction with objects in images through part-level dragging.

    MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
    Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai
    arXiv, 2024
    project page / arXiv / code

    A cost volume representation for efficiently predicting 3D Gaussians from sparse multi-view images in a single forward pass.

    ClusteringSDF: Self-Organized Neural Implicit Surfaces for 3D Decomposition
    Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham, Qianyi Wu
    arXiv, 2024
    project page / arXiv / video / code

    A self-organized 3D segmentation model via neural implicit surface representation.

    Explicit Correspondence Matching for Generalizable Neural Radiance Fields
    Yuedong Chen, Haofei Xu, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    arXiv, 2023
    project page / arXiv / code

    Employing explicit correspondence matching as a geometry prior enables NeRF to generalize across scenes.

    Free3D: Consistent Novel View Synthesis without 3D Representation
    Chuanxia Zheng, Andrea Vedaldi
    CVPR, 2024
    project page / PDF / arXiv / video / code

    Free3D synthesizes consistent novel views on open-set categories without the need for explicit 3D representations.

    Amodal Ground Truth and Completion in the Wild
    Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
    CVPR, 2024
    project page / PDF / arXiv / code

    A Stable Diffusion-based network that solves the amodal completion problem for any category, without requiring an occluder mask.

    One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls
    Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao, Tat-Jen Cham
    CVPR, 2024
    project page / PDF / arXiv / code / Hugging Face

    A versatile plug-and-play module that fixes schedule flaws in diffusion models.

    PanoDiffusion: 360-degree Panorama Outpainting via Diffusion
    Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham
    ICLR, 2024
    project page / PDF / arXiv / code

    An indoor panorama outpainting model that uses latent diffusion models and maintains view consistency.

    Cocktail🍸: Mixing Multi-Modality Controls for Text-Conditional Image Generation
    Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao, Tat-Jen Cham
    NeurIPS, 2023
    project page / PDF / arXiv / video / code

    We develop a generalized framework for multi-modality control based on text-to-image generation.

    Online Clustered Codebook
    Chuanxia Zheng, Andrea Vedaldi
    ICCV, 2023
    project page / PDF / arXiv / video / code / poster

    A simple approach to avoid codebook collapse and achieve 100% codebook utilisation.

    Vector Quantized Wasserstein Auto-Encoder
    Long Tung Vuong, Trung Le, He Zhao, Chuanxia Zheng, Mehrtash Harandi, Jianfei Cai, Dinh Phung
    ICML, 2023
    arXiv / poster / code (coming soon)

    Minimizes the codebook-data distortion, measured as a Wasserstein distance.

    UniD3: Unified Discrete Diffusion for Simultaneous Vision-Language Generation
    Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, Dacheng Tao, P. N. Suganthan
    ICLR, 2023
    project page / arXiv / code

    A unified discrete diffusion model for simultaneous vision-language generation.

    MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
    Chuanxia Zheng, Long Tung Vuong, Jianfei Cai, Dinh Phung
    NeurIPS (Spotlight), 2022
    project page / PDF / arXiv / video / code (Kandinsky2) / poster

    A spatially conditional normalization is introduced to address the repeated artifacts in vector quantized methods.

    Object-Compositional Neural Implicit Surfaces
    Qianyi Wu, Xian Liu, Yuedong Chen, Kejie Li, Chuanxia Zheng, Jianfei Cai, Jianmin Zheng
    ECCV, 2022
    project page / arXiv / video / code

    Automatically decomposes a scene into 3D instances, trained using only 2D semantic labels and images.

    Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields
    Yuedong Chen, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    ECCV, 2022
    project page / arXiv / video / code

    We train a 3D inversion model that converts a 2D semantic map into a 3D NeRF, letting users edit the 3D model through 2D semantic input.

    Bridging Global Context Interactions for High-Fidelity Image Completion
    Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai, Dinh Phung
    CVPR, 2022
    project page / PDF / arXiv / video / code / poster

    TFill fills in plausible content for both foreground object removal and content completion.

    Visiting the Invisible: Layer-by-Layer Completed Scene Decomposition
    Chuanxia Zheng, Duy-Son Dao, Guoxian Song, Tat-Jen Cham, Jianfei Cai
    IJCV, 2021
    project page / PDF / arXiv / video / code

    We build a high-level scene understanding system that simultaneously models the completed shape and appearance for all instances.

    AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning
    Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chuanxia Zheng, Tat-Jen Cham
    SIGGRAPH, 2021
    project page / PDF / video / code / Online Demo

    A GAN inversion model trained for stylizing portraits.

    The Spatially-Correlative Loss for Various Image Translation Tasks
    Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    CVPR, 2021
    project page / PDF / arXiv / video / code / poster

    We propose a novel spatially-correlative loss that is simple, efficient and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired I2I translation.

    Pluralistic (Free-Form) Image Completion
    Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    IJCV, 2021
    CVPR, 2019
    project page / PDF / arXiv / video / code / poster

    Given a single masked image, the proposed model is able to generate multiple and diverse plausible results.

    T2Net: Synthetic-to-Realistic Translation for Depth Estimation Tasks
    Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    ECCV, 2018
    project page / PDF / arXiv / video / code / poster

    Without any real depth maps, the proposed model estimates depth maps for real scenes using only synthetic datasets.

    Academic Services

    Conference Reviewer

    CVPR    2020, 2021, 2022, 2023 (Outstanding Reviewer), 2024
    ICCV    2019, 2021, 2023
    ECCV    2020, 202, 2024
    NeurIPS    2022, 2023
    ICLR    2021, 2022, 2023, 2024
    ICML    2023
    SIGGRAPH&Asia    2022
    ICRA    2022
    IROS    2022
    IJCAI    2022
    ACM MM    2021, 2022

    Journal Reviewer

    TPAMI, IJCV, TIP, JAS, TMM (Outstanding Reviewer Award, 2021), TCSVT, CVIU, TVCJ, NCAA

    Teaching
    • Teaching Assistant, B16: Software Engineering, Undergraduate, Oxford, 2023
    • Teaching, Generative AI, Graduate, Oxford Summer School, 2023
    • Teaching Assistant, Advanced Digital Image Processing, Graduate, NTU, 2018-2020
    • Teaching Assistant, Human-Computer Interaction, Undergraduate, NTU, 2018-2020
    • Teaching Assistant, Engineering Mathematics, Undergraduate, NTU, 2018-2020

    Last updated Mar. 2024.