MorphNeRF: Text-Guided 3D-aware Editing via Morphing Generative Neural Radiance Fields

Abstract

Generative neural radiance fields (NeRFs) bring image generation into the 3D era, delivering impressive generation quality and 3D consistency, especially for face generation. Building on pre-trained generative NeRFs, 3D-aware image editing has been explored and has achieved promising performance by manipulating semantic maps or attributes. However, a more flexible editing interface, text, remains under-explored in the context of 3D-aware image editing. In this work, we leverage the Contrastive Language-Image Pre-training (CLIP) model to achieve 3D-aware image editing in pre-trained generative NeRF models given a target text prompt. To achieve accurate and controllable geometry editing, we propose MorphNeRF, a learnable morphing network that morphs the 3D geometry of images toward the target description via a generative NeRF. Unlike prior studies that edit images by manipulating latent codes or directly fine-tuning pre-trained models, morphing the geometry better preserves the texture of the source image and enables explicit control of editing strength by adjusting the weight of the morphing maps. Extensive experiments and comparisons show that the proposed MorphNeRF achieves superior image editing performance.

Publication
In IEEE Transactions on Multimedia (TMM), 2024.