Date: 2023-08-13

There has been growing interest in learning 3D generative models from 2D images. With the advent of Neural Radiance Fields (NeRF), the quality of images rendered from a 3D model has improved significantly, rivaling the photorealism of 2D models. Some approaches rely solely on 3D representations to ensure 3D consistency, but this often comes at the cost of photorealism. More recent studies have shown that a hybrid approach can overcome this limitation and improve photorealism. A notable drawback of these models, however, is that scene elements such as geometry, appearance, and lighting remain entangled, which hinders user-defined control.

Various approaches have been proposed to disentangle these factors. However, they require collections of multiview images of the subject scene, a requirement that is difficult to meet for images captured under real-world conditions.

The proposed framework, FaceLit, learns a disentangled 3D representation of a face exclusively from images. Following prior work, the approach builds a rendering pipeline that enforces established physical lighting models, adapted here to 3D generative modeling. The framework relies on off-the-shelf lighting and pose estimation tools.

The physics-based illumination model, expressed with spherical harmonics, is integrated into EG3D, a recent neural volume rendering pipeline that uses a tri-plane representation to produce deep features for volume rendering. Training then focuses on realism alone: because the pipeline adheres to physics by construction, generating lifelike images naturally leads to a disentangled 3D generative model.
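To make the role of spherical harmonics concrete, here is a minimal sketch of diffuse shading under second-order spherical-harmonic lighting. It is not FaceLit's implementation: the function names (`sh_basis`, `diffuse_shade`), the monochrome nine-coefficient light, and the purely Lambertian surface are assumptions chosen for illustration, and the sketch shades a single surface point rather than samples along a ray.

```python
import math

# Constants of the real spherical-harmonic basis for bands l = 0, 1, 2.
SH_C0 = 0.282095
SH_C1 = 0.488603
SH_C2 = (1.092548, 1.092548, 0.315392, 1.092548, 0.546274)

# Per-band weights that turn incoming radiance into Lambertian irradiance
# (convolution with the clamped cosine lobe): pi, 2*pi/3, pi/4.
BAND_WEIGHTS = [math.pi] + [2.0 * math.pi / 3.0] * 3 + [math.pi / 4.0] * 5


def sh_basis(normal):
    """Evaluate the 9 real SH basis functions at a unit normal (x, y, z)."""
    x, y, z = normal
    return [
        SH_C0,
        SH_C1 * y,
        SH_C1 * z,
        SH_C1 * x,
        SH_C2[0] * x * y,
        SH_C2[1] * y * z,
        SH_C2[2] * (3.0 * z * z - 1.0),
        SH_C2[3] * x * z,
        SH_C2[4] * (x * x - y * y),
    ]


def diffuse_shade(albedo, normal, light_coeffs):
    """Shade one point of a Lambertian surface under SH lighting.

    albedo:       (r, g, b) reflectance, each in [0, 1]
    normal:       surface normal (need not be normalized, must be non-zero)
    light_coeffs: 9 SH coefficients of a monochrome light (illustrative assumption)
    """
    length = math.sqrt(sum(c * c for c in normal))
    unit = [c / length for c in normal]
    basis = sh_basis(unit)
    irradiance = sum(
        w * l * b for w, l, b in zip(BAND_WEIGHTS, light_coeffs, basis)
    )
    # A truncated SH expansion can dip below zero; clamp to keep colors valid.
    irradiance = max(irradiance, 0.0)
    return tuple(a * irradiance / math.pi for a in albedo)


# Example: uniform white light of unit radiance plus a component along +z.
light = [1.0 / SH_C0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
color = diffuse_shade((0.8, 0.6, 0.5), (0.0, 0.0, 1.0), light)
```

Because the lighting enters only through the nine coefficients, changing them relights the same geometry and albedo, which is the kind of control a disentangled model is meant to expose.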

The approach is evaluated on three datasets: FFHQ, CelebA-HQ, and MetFaces. According to the authors, it achieves state-of-the-art FID scores among 3D-aware generative models.
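For reference, the FID (Fréchet Inception Distance) mentioned above is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception network features of real and generated images. The symbols below are the standard ones and are not taken from the FaceLit paper: $\mu_r, \Sigma_r$ are the mean and covariance of features from real images, and $\mu_g, \Sigma_g$ those from generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower values indicate that the generated image statistics are closer to those of the real data.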