NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction
Yiming Wang*, Qin Han*, Marc Habermann, Kostas Daniilidis, Christian Theobalt, Lingjie Liu
ICCV 2023
NeuS2 is a method for fast neural surface reconstruction, which achieves two orders of magnitude improvement in terms of acceleration without compromising reconstruction quality, compared to NeuS. To accelerate the training process, we integrate multi-resolution hash encodings into a neural surface representation and implement our whole algorithm in CUDA. In addition, we extend our method for reconstructing dynamic scenes with an incremental training strategy.
This project is an extension of Instant-NGP enabling it to model neural surface representation and dynmaic scenes. We extended:
- dependencies/neus2_TCNN
- add second-order derivative backpropagation computation for MLP;
- add progressive training for Grid Encoding.
- neural-graphics-primitives
- extend NeRF mode for NeuS;
- add support for dynamic scenes.
Please first see Instant-NGP for original requirements and compilation instructions. NeuS2 follows the installing steps of Instant-NGP.
Clone this repository and all its submodules using the following command:
git clone --recursive https://github.com/19reborn/NeuS2
cd NeuS2
Then use CMake to build the project:
cmake . -B build
cmake --build build --config RelWithDebInfo -j
For python useage, first install dependencies with conda and pip:
conda create -n neus2 python=3.9
conda activate neus2
pip install -r requirements.txt
Then check https://pytorch.org/ for pytorch installation, and https://github.com/facebookresearch/pytorch3d for pytorch3d installation.
If you meet problems of compiling, you may find solutions in https://github.com/NVlabs/instant-ngp#troubleshooting-compile-errors.
You can specify a static scene by setting --scene
to a .json
file containing data descriptions.
The DTU Scan24 scene can be downloaded from Google Drive:
./build/testbed --scene ${data_path}/transform.json
Or, you can run the experiment in an automated fashion through python bindings:
python scripts/run.py --scene ${data_path}/transform.json --name ${your_experiment_name} --network ${config_name} --n_steps ${training_steps}
For training on DTU dataset, config_name
can be dtu.json
and training_steps
can be 15000
.
Also, you can use scripts/run_dynamic.py
as:
python scripts/run_dynamic.py --scene ${data_path}/transform.json --name ${your_experiment_name} --network ${config_name}
, where the number of training iterations is specified in the config.
The outputs and logs of the experiment can be found at output/${your_experiment_name}/
.
To specify a dynamic scene, you should set --scene
to a directory containing .json
files that describe training frames.
./build/testbed --scene ${data_dirname}
Or, run scripts/run_dynamic.py
using python:
python scripts/run_dynamic.py --scene ${data_dirname} --name ${your_experiment_name} --network ${config_name}
There are some hyperparameters of the network configuration, such as configs/nerf/base.json
, to control the dynamic training process:
first_frame_max_training_step
: determine the number of training iterations for the first frame, default2000
.next_frame_max_training_step
: determine the number of training iterations for subsequent frames, default1300
, including global transformation prediction.predict_global_movement
: settrue
if use global transformation prediction.predict_global_movement_training_step
: determine the number of training iterations for global transformation prediction, default300
. Only valid whenpredict_global_movement
istrue
.
Also, we provide scripts to reconstruct dynamic scenes by reconstructing static scene frame by frame.
python scripts/run_per_frame.py --base_dir ${data_dirname} --output_dir ${output_path} --config ${config_name}
Dynamic scene examples can be downloaded from Google Drive.
Our NeuS2 implementation expects initial camera parameters to be provided in a transforms.json
file, organized as follows:
{
"from_na": true, # specify NeuS2 data format
"w": 512, # image_width
"h": 512, # image_height
"aabb_scale": 1.0,
"scale": 0.5,
"offset": [
0.5,
0.5,
0.5
],
"frames": [ # list of reference images & corresponding camera parameters
{
"file_path": "images/000000.png", # specify the image path (should be relative path)
"transform_matrix": [ # specify extrinsic parameters of camera, a camera to world transform (shape: [4, 4])
[
0.9702627062797546,
-0.01474287360906601,
-0.2416049838066101,
0.9490470290184021
],
[
0.0074799139983952045,
0.9994929432868958,
-0.0309509988874197,
0.052045613527297974
],
[
0.2419387847185135,
0.028223415836691856,
0.9698809385299683,
-2.6711924076080322
],
[
0.0,
0.0,
0.0,
1.0
]
],
"intrinsic_matrix": [ # specify intrinsic parameters of camera (shape: [4, 4])
[
2892.330810546875,
-0.00025863019982352853,
823.2052612304688,
0.0
],
[
0.0,
2883.175537109375,
619.0709228515625,
0.0
],
[
0.0,
0.0,
1.0,
0.0
],
[
0.0,
0.0,
0.0,
1.0
]
]
},
...
]
}
Each transforms.json
file contains data about a single frame, including camera parameters and image paths. You can specify specific transform files, such as transforms_test.json
and transforms_train.json
, to use for training and testing with data splitting.
For example, you can organize your dynamic scene data as:
<case_name>
|-- images
|-- 000280 # target frame of the scene
|-- image_c_000_f_000280.png
|-- image_c_001_f_000280.png
...
|-- 000281
|-- image_c_000_f_000281.png
|-- image_c_001_f_000281.png
...
...
|-- train
|-- transform_000280.json
|-- transform_000281.json
...
|-- test
|-- transform_000280.json
|-- transform_000281.json
...
Images are four-dimensional, with three channels for RGB and one channel for the mask.
We also provide a data conversion from NeuS to our data convention, which can be found in tools/data_format_from_neus.py
.
@inproceedings{neus2,
title={NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction},
author={Wang, Yiming and Han, Qin and Habermann, Marc and Daniilidis, Kostas and Theobalt, Christian and Liu, Lingjie},
year={2023},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}
}