PAPR: Proximity Attention Point Rendering

🌟 NeurIPS 2023 Spotlight 🌟

APEX Lab, Simon Fraser University

Overview

Given a set of images from different views and their corresponding camera poses, PAPR learns a point-based surface representation of the scene and a rendering pipeline from scratch. Additionally, PAPR enables practical applications such as geometry editing, object manipulation, texture transfer, and exposure control.

Qualitative Results

We show the RGB rendering of the scene in the first row and the corresponding learnt point cloud in the second row.

Learning Point Positions from Scratch

Our method deforms the initial point cloud to correctly represent the target geometry. In contrast, the baselines either fail to recover the geometry, produce noisy results, or lack structural details in the learnt geometry.

Zero-shot Geometry Editing

We can edit the scene geometry by simply manipulating its point cloud, without any additional supervision. Here we demonstrate rigid bending motions applied to the ficus branch and the Lego bulldozer's arm in the first column, rotation of the statue's head and the ship in the second column, and non-volume-preserving stretching transformations applied to the tip of the microphone and the back of the chair in the third column.
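As a minimal sketch of such an edit (the array layout and helper name are our own for illustration, not the project's API), a rigid rotation amounts to transforming the positions of a selected subset of points and then re-rendering:

```python
import numpy as np

def rotate_points(points, mask, angle_rad, pivot=None):
    """Rigidly rotate the selected points about the z-axis around a pivot.

    points: (N, 3) point positions; mask: boolean selection of points to edit.
    Hypothetical helper for illustration only.
    """
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    if pivot is None:
        pivot = points[mask].mean(axis=0)  # rotate about the selection's centroid
    edited = points.copy()
    edited[mask] = (points[mask] - pivot) @ R.T + pivot
    return edited
```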

Comparison with Gaussian Splatting

Gaussian Splatting produces significant noise after the non-volume-preserving stretching transformation. In contrast, our method avoids creating holes and preserves the texture details after the transformation.

Object Manipulation

We can edit the scene by adding, removing, or duplicating points in the point cloud. Here we demonstrate adding an extra hotdog to the plate (left) and removing some material balls while duplicating others (right).
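For illustration, if each point carries a position and a feature vector stored in parallel arrays (a hypothetical layout, not necessarily the repository's), these edits reduce to ordinary array operations; the index selections below are placeholders:

```python
import numpy as np

# Hypothetical per-point state: positions (N, 3) and feature vectors (N, C).
points = np.random.randn(100, 3).astype(np.float32)
features = np.random.randn(100, 32).astype(np.float32)

# Removal: drop the points belonging to an unwanted object.
keep = np.ones(len(points), dtype=bool)
keep[10:20] = False  # placeholder selection for the object to delete
points, features = points[keep], features[keep]

# Duplication: copy a selection and offset the copies to a new location.
dup = np.arange(0, 10)  # placeholder selection for the object to duplicate
offset = np.array([0.5, 0.0, 0.0], dtype=np.float32)
points = np.concatenate([points, points[dup] + offset])
features = np.concatenate([features, features[dup]])
```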

Texture Transfer

We can transfer the texture from one part of the scene to another by transferring the associated feature vectors of the corresponding points. Here we transfer the texture of the mustard to the ketchup by transferring the features of the points that correspond to the mustard (highlighted in yellow) to a subset of points that correspond to the ketchup (highlighted in red).
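A sketch of this operation (the array layout and the matching strategy between regions of different sizes are assumptions for illustration):

```python
import numpy as np

def transfer_texture(features, src_idx, dst_idx, rng=None):
    """Copy per-point feature vectors from a source region to a destination region.

    features: (N, C) per-point feature vectors; src_idx/dst_idx: integer indices.
    If the regions differ in size, each destination point takes the features of a
    randomly sampled source point (one simple matching strategy).
    """
    rng = np.random.default_rng() if rng is None else rng
    edited = features.copy()
    sampled = rng.choice(src_idx, size=len(dst_idx), replace=True)
    edited[dst_idx] = features[sampled]
    return edited
```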

Exposure Control

We introduce an additional latent code input to our model and train it with conditional Implicit Maximum Likelihood Estimation (cIMLE). At test time, we can control the exposure of the rendered image by varying the latent code.
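Schematically, one cIMLE training step draws several latent codes per target image, keeps the code whose rendering is nearest to that image, and optimizes the model on those best samples only. The sketch below assumes a model that renders a batch of views given per-image latent codes; the interface is hypothetical:

```python
import torch

def cimle_step(model, batch, images, optimizer, num_samples=8, z_dim=32):
    """One schematic cIMLE step. images: (B, 3, H, W) ground-truth views;
    model(batch, z) is assumed to render a (B, 3, H, W) batch from latents z."""
    B = images.shape[0]
    with torch.no_grad():
        z = torch.randn(num_samples, B, z_dim)
        # Per-image reconstruction error for every latent sample: (num_samples, B).
        losses = torch.stack([
            ((model(batch, z_k) - images) ** 2).flatten(1).mean(1) for z_k in z
        ])
        best = losses.argmin(dim=0)          # nearest latent code per image
    z_best = z[best, torch.arange(B)]        # (B, z_dim)
    optimizer.zero_grad()
    loss = ((model(batch, z_best) - images) ** 2).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```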

BibTeX

@inproceedings{zhang2023papr,
  title={PAPR: Proximity Attention Point Rendering},
  author={Yanshu Zhang and Shichong Peng and Seyed Alireza Moazenipourasil and Ke Li},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}