Chuanxia Zheng

Chuanxia Zheng is a postdoctoral researcher in VGG at the University of Oxford, working with Prof. Andrea Vedaldi on 3D reconstruction and generation and with Prof. Andrew Zisserman on generative models for understanding.

Before that, he spent one year at Monash University, where he worked as a Research Fellow with Prof. Jianfei Cai and Prof. Dinh Phung on codebook learning for generation. He received his PhD degree from the SCSE at Nanyang Technological University, supervised by Prof. Tat-Jen Cham and Prof. Jianfei Cai on 2D generation, translation and completion. His thesis Synthesizing Photorealistic Images was awarded the NTU Outstanding PhD Thesis Award 2022.

Email  /  CV  /  Google Scholar  /  Github  /  Twitter

News
Research

His research interests are broadly in artificial intelligence, with emphasis on computer vision and machine learning. Much of his research is about 2D image generation, completion and translation, and 3D scene reconstruction, generation and completion, with the goal of building intelligent machines capable of rebuilding a photorealistic virtual world via generative AI. In particular, he is working on the following topics:

  • 3D reconstruction and generation from limited views or videos.
  • 3D editing (decomposition, recomposition, and completion) via object-centric perception.
  • Generative models for physical world understanding.
  • Multi-modalities (1D, 2D, 3D, and 4D) generation and understanding.

    Free3D: Consistent Novel View Synthesis without 3D Representation
    Chuanxia Zheng, Andrea Vedaldi
    arXiv, 2023
    project page / PDF / arXiv / video / code

    Free3D synthesizes consistent novel views on open-set categories without the need for explicit 3D representations.

    One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls
    Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao, Tat-Jen Cham
    arXiv, 2023
    project page / PDF / arXiv / code / Hugging Face

    A versatile plug-and-play module to fix the scheduler flaws for diffusion models.

    IPO-LDM: Depth-aided 360-degree Indoor RGB Panorama Outpainting via Latent Diffusion Model
    Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham
    arXiv, 2023
    project page / PDF / arXiv / code

    A view-consistent indoor panorama outpainting model using latent diffusion models.

    Explicit Correspondence Matching for Generalizable Neural Radiance Fields
    Yuedong Chen, Haofei Xu, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    arXiv, 2023
    project page / arXiv / code

    Employing explicit correspondence matching as a geometry prior enables NeRF to generalize across scenes.

    Cocktail🍸: Mixing Multi-Modality Controls for Text-Conditional Image Generation
    Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao, Tat-Jen Cham
    NeurIPS, 2023
    project page / PDF / arXiv / video / code

    We develop a generalized framework for multi-modality control based on text-to-image generation.

    Online Clustered Codebook
    Chuanxia Zheng, Andrea Vedaldi
    ICCV, 2023
    project page / PDF / arXiv / video / code / poster

    A simple approach to avoid codebook collapse and achieve 100% codebook utilisation.

    Vector Quantized Wasserstein Auto-Encoder
    Long Tung Vuong, Trung Le, He Zhao, Chuanxia Zheng, Mehrtash Harandi, Jianfei Cai, Dinh Phung
    ICML, 2023
    arXiv / poster / code (coming soon)

    Formulates the codebook-data distortion as a Wasserstein distance and minimizes it.

    UniD3: Unified Discrete Diffusion for Simultaneous Vision-Language Generation
    Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, Dacheng Tao, P. N. Suganthan
    ICLR, 2023
    project page / arXiv / code

    A unified discrete diffusion model for simultaneous vision-language generation.

    MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
    Chuanxia Zheng, Long Tung Vuong, Jianfei Cai, Dinh Phung
    NeurIPS (Spotlight), 2022
    project page / PDF / arXiv / video / code (Kandinsky2) / poster

    A spatially conditional normalization is introduced to address the repeated artifacts in vector quantized methods.

    Object-Compositional Neural Implicit Surfaces
    Qianyi Wu, Xian Liu, Yuedong Chen, Kejie Li, Chuanxia Zheng, Jianfei Cai, Jianmin Zheng
    ECCV, 2022
    project page / arXiv / video / code

    Automatically decomposes a scene into 3D instances, trained using only 2D semantic labels and images.

    Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields
    Yuedong Chen, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    ECCV, 2022
    project page / arXiv / video / code

    We train a 3D inversion model that converts a 2D semantic map into a 3D NeRF and lets users edit the 3D model through 2D semantic input.

    Bridging Global Context Interactions for High-Fidelity Image Completion
    Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai, Dinh Phung
    CVPR, 2022
    project page / PDF / arXiv / video / code / poster

    TFill fills in reasonable contents for both foreground object removal and content completion.

    Visiting the Invisible: Layer-by-Layer Completed Scene Decomposition
    Chuanxia Zheng, Duy-Son Dao, Guoxian Song, Tat-Jen Cham, Jianfei Cai
    IJCV, 2021
    project page / PDF / arXiv / video / code

    We build a high-level scene understanding system that simultaneously models the completed shape and appearance for all instances.

    AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning
    Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chuanxia Zheng, Tat-Jen Cham
    SIGGRAPH, 2021
    project page / PDF / video / code / Online Demo

    A GAN inversion model is trained for stylizing portraits.

    The Spatially-Correlative Loss for Various Image Translation Tasks
    Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    CVPR, 2021
    project page / PDF / arXiv / video / code / poster

    We propose a novel spatially-correlative loss that is simple, efficient and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired I2I translation.

    Pluralistic (Free-Form) Image Completion
    Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    IJCV, 2021
    CVPR, 2019
    project page / PDF / arXiv / video / code / poster

    Given a single masked image, the proposed model is able to generate multiple and diverse plausible results.

    T2Net: Synthetic-to-Realistic Translation for Depth Estimation Tasks
    Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
    ECCV, 2018
    project page / PDF / arXiv / video / code / poster

    Without any real depth maps, the proposed model estimates depth on real scenes after training only on synthetic datasets.

    Academic Services

    Conference Reviewer

    CVPR    2020, 2021, 2022, 2023 (Outstanding Reviewer)
    ICCV    2019, 2021, 2023
    ECCV    2020, 2022
    NeurIPS    2022, 2023
    ICLR    2021, 2022, 2023, 2024
    ICML    2023
    SIGGRAPH / SIGGRAPH Asia    2022
    ICRA    2022
    IROS    2022
    IJCAI    2022
    ACM MM    2021, 2022

    Journal Reviewer

    TPAMI, IJCV, TIP, JAS, TMM (Outstanding Reviewer Award, 2021), TCSVT, CVIU, TVCJ, NCAA

    Teaching
    • Teaching Assistant, B16: Software Engineering, Undergraduate, Oxford, 2023
    • Teaching, Generative AI, Graduate, Oxford Summer School, 2023
    • Teaching Assistant, Advanced Digital Image Processing, Graduate, NTU, 2018-2020
    • Teaching Assistant, Human-Computer Interaction, Undergraduate, NTU, 2018-2020
    • Teaching Assistant, Engineering Mathematics, Undergraduate, NTU, 2018-2020

    Last updated Dec. 2023.