Codebook-NeRF

Improving NeRF resolution based on codebook

Kyung Hee University
KSC 2024

Abstract

In this paper, we propose a new NeRF[1] method that restores high-resolution detail from low-resolution images without reference images. To this end, we keep the super-resolution process of NeRF-SR[2] and introduce the codebook structure of VQ-VAE[3] to learn the patterns of high-resolution images and improve detail refinement. We increase the number of embedding vectors in the codebook so that more high-resolution information can be learned, and the model is trained to imitate high-resolution latent characteristics at inference time without reference images. In our experiments, the proposed model maintained the PSNR performance of NeRF-SR[2] while generating clearer, more detail-rich images.
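
To make the codebook component concrete, here is a minimal PyTorch sketch of a VQ-VAE[3]-style vector quantizer with a straight-through gradient. The class name `Codebook` and the default hyperparameters (`num_embeddings`, `embedding_dim`, `beta`) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Codebook(nn.Module):
    """VQ-VAE-style codebook: snaps encoder features to learned embeddings."""
    def __init__(self, num_embeddings=1024, embedding_dim=64, beta=0.25):
        super().__init__()
        # Enlarging num_embeddings is the paper's lever for storing more
        # high-resolution patterns (hypothetical default shown here).
        self.embedding = nn.Embedding(num_embeddings, embedding_dim)
        self.embedding.weight.data.uniform_(-1.0 / num_embeddings, 1.0 / num_embeddings)
        self.beta = beta  # weight of the commitment loss

    def forward(self, z):
        # z: (B, C, H, W) encoder features; C == embedding_dim.
        B, C, H, W = z.shape
        z_flat = z.permute(0, 2, 3, 1).reshape(-1, C)  # (B*H*W, C)
        # Squared L2 distance from each feature to every codebook vector.
        dist = (z_flat.pow(2).sum(1, keepdim=True)
                - 2.0 * z_flat @ self.embedding.weight.t()
                + self.embedding.weight.pow(2).sum(1))
        idx = dist.argmin(dim=1)                        # nearest code per feature
        z_q = self.embedding(idx).view(B, H, W, C).permute(0, 3, 1, 2)
        # Codebook + commitment losses (standard VQ-VAE objective).
        vq_loss = F.mse_loss(z_q, z.detach()) + self.beta * F.mse_loss(z, z_q.detach())
        z_q = z + (z_q - z).detach()                    # straight-through estimator
        return z_q, vq_loss, idx
```

Everything above is the standard VQ-VAE[3] quantizer; the only Codebook-NeRF-specific choice it reflects is the enlarged embedding table.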


Codebook-NeRF Training Pipeline

Training pipeline of Codebook-NeRF: SR (super-resolved, still low-detail) patches are trained to mimic the characteristics of HR (high-resolution) patches. (a) Both HR and SR patches are fed into the codebook. (b) The codebook learns high-resolution latent features from the HR patches, and the SR patches are trained to imitate these features. (c) A decoder reconstructs the HR patches' latent features, combining the codebook output at each deconvolution layer to produce a high-resolution image. (d) A UNet structure enhances the reconstruction by re-injecting the high-resolution details obtained at each stage as additional input.
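
The following is a minimal sketch of one training iteration as we read panels (a)-(d), assuming a shared `encoder`, the `Codebook` module sketched above, and a UNet-style `decoder` whose skip connections mix codebook features into each stage. All module names and loss weights (`lambda_rec`, `lambda_mimic`) are hypothetical stand-ins, not the released implementation.

```python
import torch.nn.functional as F

def train_step(encoder, codebook, decoder, sr_patch, hr_patch, optimizer,
               lambda_rec=1.0, lambda_mimic=1.0):
    optimizer.zero_grad()
    # (a) Encode both the HR patch and the NeRF-SR output patch.
    z_hr = encoder(hr_patch)
    z_sr = encoder(sr_patch)
    # (b) Quantize HR features with the codebook; SR features are trained
    #     to imitate the resulting high-resolution latents.
    zq_hr, vq_loss, _ = codebook(z_hr)
    mimic_loss = F.mse_loss(z_sr, zq_hr.detach())
    # (c)+(d) Decode the quantized latents back to an image; the UNet-style
    #     decoder re-injects codebook output at each deconvolution stage.
    recon = decoder(zq_hr)
    rec_loss = F.mse_loss(recon, hr_patch)
    loss = lambda_rec * rec_loss + vq_loss + lambda_mimic * mimic_loss
    loss.backward()
    optimizer.step()
    return {"rec": rec_loss.item(), "vq": vq_loss.item(), "mimic": mimic_loss.item()}
```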


Codebook-NeRF Test Pipeline

Test pipeline of Codebook-NeRF: only SR patches are used to restore high-resolution images. (a) SR patches are fed into the codebook to generate latent representations with high-resolution features. (b) These representations are passed to the decoder to produce the final high-resolution image. (c) In this way, the high-resolution details learned by the codebook are applied to the SR patches, enabling high-resolution restoration without reference images.
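
The test-time path then needs only the SR patch: its features are snapped to the nearest learned high-resolution codes and decoded. A short sketch reusing the hypothetical modules from the training snippet:

```python
import torch

@torch.no_grad()
def super_resolve(encoder, codebook, decoder, sr_patch):
    z_sr = encoder(sr_patch)   # (a) encode the SR patch
    zq, _, _ = codebook(z_sr)  #     snap to the learned HR latents
    return decoder(zq)         # (b) decode into the final high-resolution image
```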

Visualization: Here are examples of HuGS on different scenes (datasets). More results can be found in the paper and the data.

[Image grid. Columns: (1) Input, (2) Seg. w/ SfM, (3) HSfM, (4) Color Residual, (5) HCR, (6) Static Map. Rows (scenes): yoda, crab, statue, andbot, pillow, cars, brandenburg, sacre, taj, trevi.]

Rendering Results

Comparisons on the Distractor Dataset: Our method can better preserve static details while ignoring transient distractors.


[Image comparisons on BabyYoda, Crab, Statue, and Android: Mip-NeRF 360 vs. w/ RobustNeRF vs. w/ HuGS (ours).]

Comparisons on the Kubric Dataset:


[Image comparisons on Pillow, Chairs, and Cars: Mip-NeRF 360 vs. w/ RobustNeRF vs. w/ HuGS (ours).]

Comparisons on the Phototourism Dataset:


[Image comparisons on Brandenburg Gate, Sacre Coeur, Taj Mahal, and Trevi Fountain: Mip-NeRF 360 vs. w/ RobustNeRF vs. w/ HuGS (ours).]

BibTeX

@inproceedings{chen2024nerfhugs,
  author    = {Chen, Jiahao and Qin, Yipeng and Liu, Lingjie and Lu, Jiangbo and Li, Guanbin},
  title     = {Codebook-NeRF: Improving NeRF resolution based on codebook},
  booktitle = {CVPR},
  year      = {2024},
}