About me

I am a Researcher at INSAIT, working with Prof. Luc Van Gool and Dr. Danda Paudel. I received my MSc from ETH Zürich, where I conducted 3D Vision and Graphics research at Disney Research | Studios Zürich (with Prof. Markus Gross) and at VLG (with Prof. Siyu Tang). I obtained my Bachelor’s degree from City University of Hong Kong.

My research lies at the intersection of vision-language modeling, spatial AI, and controllable visual representations. I aim to build models that jointly reason about language and 3D environments, enabling fine-grained, controllable generation and editing of both 2D and 3D scene representations.

Research Interests

Vision-Language Models & Multimodal Reasoning
Spatial AI and 3D Scene Understanding
Controllable 2D / 3D Generation & Editing
Neural Rendering and Inverse Rendering

Outside research, I enjoy rendering, photography, video games, fingerstyle guitar, table tennis, skiing, and hiking.

News

2026.01 🎉 My first-author paper EgoNight: Towards Egocentric Vision Understanding at Night with a Challenging Benchmark has been accepted to ICLR 2026!
2025.09 🎉 Our paper StateSpaceDiffuser: Bringing Long Context to Diffusion World Models has been accepted to NeurIPS 2025!
2025.04 I joined INSAIT as a Researcher, supervised by Prof. Luc Van Gool and Dr. Danda Paudel!
2024.10 🎉 My first-author paper RISE-SDF: a Relightable Information-Shared Signed Distance Field for Glossy Object Inverse Rendering has been accepted to 3DV 2025!
2023.10 🎉 My first-author paper CoARF: Controllable 3D Artistic Style Transfer for Radiance Fields has been accepted to 3DV 2024!

Publications

EgoNight: Towards Egocentric Vision Understanding at Night with a Challenging Benchmark (ICLR 2026)

Deheng Zhang*, Yuqian Fu*, Runyi Yang, Yang Miao, Tianwen Qian, Xu Zheng, Guolei Sun, Ajad Chhatkuli, Xuanjing Huang, Yu-Gang Jiang, Luc Van Gool, Danda Pani Paudel

Paper Project Code Dataset

The first comprehensive benchmark for egocentric vision understanding in low-light and nighttime conditions, comprising synthetic scenes (EgoNight-Synthetic), aligned day–night pairs (EgoNight-Sofia), and unaligned nighttime footage (EgoNight-Oxford).

StateSpaceDiffuser: Bringing Long Context to Diffusion World Models (NeurIPS 2025)

Nedko Savov, Naser Kazemi, Deheng Zhang, Danda Paudel, Xi Wang, Luc Van Gool

Paper Project Code

A diffusion world model that overcomes the memory bottleneck by integrating features from a state-space model representing the entire interaction history, enabling long-context world modeling.

RISE-SDF: a Relightable Information-Shared Signed Distance Field for Glossy Object Inverse Rendering (3DV 2025)

Deheng Zhang*, Jingyu Wang*, Shaofei Wang, Marko Mihajlovic, Sergey Prokudin, Hendrik P.A. Lensch, Siyu Tang

Paper Project Code Dataset

An end-to-end relightable neural inverse-rendering system enabling high-quality reconstruction of geometry and material properties. The core idea is a two-stage approach for better factorization of scene parameters, supporting high-quality relighting of glossy objects.

CoARF: Controllable 3D Artistic Style Transfer for Radiance Fields (3DV 2024)

Deheng Zhang, Clara Fernández Labrador, Christopher Schroers

Paper Project

A novel algorithm for controllable 3D scene stylization that enables style transfer for specified objects, compositional 3D style transfer, and semantic-aware style transfer via segmentation masks and label-dependent losses.

EgoSpot: Accessible Robot Control via Egocentric Multimodal Signals (ICRA Workshop 2026)

Ganlin Zhang*, Deheng Zhang*, Longteng Duan*, Guo Han*, Yuqian Fu, Danda Pani Paudel, Luc Van Gool, Eric Vollenweider (* equal contribution)

Paper Project Code

A mixed-reality system on HoloLens 2 that enables users to control the Boston Dynamics Spot robot through egocentric multimodal signals — gaze, gesture, and voice — making robot teleoperation more accessible and intuitive.

SeasonScapes: Learning Large-scale Re-lightable 3D Landscapes with Seasonal Variation from Sparse Webcams (CVPR Workshop 2026)

Timo Kleger, Qi Ma, Deheng Zhang, Luc Van Gool, Danda Pani Paudel

Paper

A framework and dataset for large-scale relightable 3D landscapes — over 85,000 webcam images from 32 Swiss mountain locations across a full year, projected onto a 3D mesh with conditional diffusion inpainting to model seasonal appearance changes and enable physically-based relighting.

Selected Projects

Point-Based Radiance Fields for Controllable Human Motion Synthesis (Course Project)

Deheng Zhang*, Haitao Yu*, Peiyuan Xie*, Tianyi Zhang*

An animatable human avatar built on point-based primitives — static scene from Point-NeRF + deformation MLP + rotation-only ray-bending.

Paper Project Code

NICE-SLAM with Adaptive Feature Grids (Course Project)

Deheng Zhang*, Ganlin Zhang*, Feichi Lu*, Anqi Li

A sparse version of NICE-SLAM bringing Voxel Hashing into the NICE-SLAM framework — surface-adaptive feature grids instead of dense initialization.

Paper Code

Kombu: Physically-based Renderer in C++11 (Course Project)

Deheng Zhang*, Ganlin Zhang*

Heterogeneous volumetric rendering, bilateral-filter denoising, directional lights and instancing — culminating in our rendering-competition piece Christmas on the Moon.

Code Project

SAVA: Style-Attention-Void-Aware Style Transfer (Bachelor Thesis)

Deheng Zhang

A self-attention mechanism with explicit mathematical meaning and a style-transfer scheme that captures the blank-leaving structure of the style image.

Paper Code

OPUS: Particle Swarm Using Surrogates via Bunch-Kaufman Pivoting (Course Project)

Deheng Zhang*, Ganlin Zhang*, Junpeng Gao*, Yu Hong*

Speeding up OPUS black-box optimization with a fast C++ implementation of Bunch-Kaufman pivoting.

Paper Code

Experience

Doctoral Researcher
INSAIT · Sofia, Bulgaria
Apr 2025 – Present
Working with Prof. Luc Van Gool and Dr. Danda Paudel.
- Research areas: spatial reasoning, egocentric video understanding, controllable and consistent video generation, Gaussian splatting and relighting.
- Recent works: EgoNight (ICLR 2026), StateSpaceDiffuser (NeurIPS 2025), EgoSpot (ICRA Workshop 2026).

Researcher (IMPRS-IS)
International Max Planck Research School for Intelligent Systems · Tübingen, Germany
Sept 2024 – Mar 2025
- Research areas: 3D scene understanding, Gaussian splatting.
- Teaching service: teaching assistant for Introduction to Computer Graphics; thesis supervisor; cluster server management.

Master Thesis Researcher · VLG
Computer Vision and Learning Group, ETH Zürich · Switzerland
Sept 2023 – Apr 2024
Master thesis on inverse rendering and relighting of glossy objects, building a relightable signed-distance-field representation.
- Supervisors: Prof. Siyu Tang, Dr. Sergey Prokudin.
- Outcome: RISE-SDF, accepted to 3DV 2025; thesis grade 6.00 / 6.00.

Semester Project Researcher
Disney Research | Studios · Zürich, Switzerland
Dec 2022 – May 2023
Developed CoARF, a controllable 3D artistic style-transfer method for radiance fields using segmentation masks and a semantic-aware nearest-neighbor matching algorithm.
- Supervisors: Dr. Clara Fernández Labrador, Dr. Christopher Schroers; overseen by Prof. Markus Gross.
- Outcome: CoARF, accepted to 3DV 2024 (also filed as a patent).

Academic Services

Conference Reviewer · ICML 2026 Gold Reviewer · NeurIPS 2026 · CVPR 2026
Journal Reviewer · IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
