gotardop <at> gmail <dot> com
UPDATE: I have recently left Disney Research to join Google AR in Zurich. Another update will appear here soon.
I’m a Senior Research Scientist with the Digital Humans group at DisneyResearch|Studios (DRS) in Zurich, Switzerland, where I supervise a mixed team of research scientists, engineers, and students while additionally helping implement our Lab's mission within The Walt Disney Company.
At DRS, I lead research work on computer vision, graphics and machine learning, with a focus on inverse/neural rendering for capturing and modeling highly realistic digital humans. Prior to joining the Zurich team, I was also with Disney Research in Pittsburgh, at the Carnegie Mellon University campus. I received my BSc (2000) and MSc (2002) degrees in Informatics from Federal University of Parana (UFPR), Brazil, and my PhD degree (2010) in Electrical and Computer Engineering from The Ohio State University (OSU), USA. While at OSU, I was also a postdoc at the Computational Biology and Cognitive Science Lab (CBCSL) and a graduate research associate with the Advanced Computing Center for the Arts and Design (ACCAD).
Here's a list of conference and journal papers describing some of my research work. If you are looking for source code related to my work on nonrigid structure-from-motion, you can jump to my old OSU page here.
We propose a face model based on data-driven, implicit neural physics that can be driven by both expression and style separately. At the core, we present a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on a few arbitrary performance capture sequences from a small set of identities. [Project Page]
We present a novel technique for high-quality capture of a human face for 3D view synthesis and relighting using a sparse, compact capture rig consisting of 15 cameras and 15 lights. Our method combines a volumetric representation of the face reflectance with traditional multi-view stereo based geometry reconstruction. The proxy geometry allows us to anchor the 3D density field to prevent artifacts and to guide the disentanglement of intrinsic radiance components of the face appearance, such as diffuse and specular reflectance and direct and indirect light transport fields. Our hybrid representation significantly improves the state-of-the-art quality for arbitrarily dense renders of a face from a desired camera viewpoint as well as under environmental, directional, and near-field lighting. [Project Page]
We propose ReNeRF, a relightable radiance field model based on the intuitive and powerful approach of image-based relighting, which implicitly captures global light transport for arbitrary objects without complex, error-prone simulations. ReNeRF is simple and provides full control over viewpoint and lighting, without simplistic assumptions about how light interacts with the scene. ReNeRF generalizes to novel, continuous lighting directions, including near-field lighting effects. [Project Page]
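For readers unfamiliar with image-based relighting, the classical form of the idea is that light transport is linear, so an image under new lighting can be formed as a weighted sum of basis images, each captured with a single light on. The following is a minimal sketch of that classical idea only, not of the ReNeRF model; the function name `relight` and the tiny 2x2 grayscale images are hypothetical.

```python
def relight(basis_images, light_weights):
    """Linearly combine one-light-at-a-time basis images.

    basis_images: list of images, each a list of rows of floats (same shape).
    light_weights: one intensity per basis light.
    """
    if len(basis_images) != len(light_weights):
        raise ValueError("need one weight per basis image")
    rows = len(basis_images[0])
    cols = len(basis_images[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for image, weight in zip(basis_images, light_weights):
        for r in range(rows):
            for c in range(cols):
                out[r][c] += weight * image[r][c]
    return out


# Two hypothetical 2x2 images, each lit by a single light.
image_light_a = [[1.0, 0.5], [0.0, 0.0]]
image_light_b = [[0.0, 0.0], [0.5, 1.0]]

# New lighting: light A at intensity 2.0, light B at intensity 0.5.
relit = relight([image_light_a, image_light_b], [2.0, 0.5])
```

The weights play the role of the new lighting environment sampled at the basis light positions; generalizing to continuous, unseen light directions is the part that requires a learned model.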
In this work, we propose a new loss function for monocular face capture, inspired by how humans would perceive the quality of a 3D face reconstruction given a particular image. It is widely known that shading provides a strong indicator for 3D shape in the human visual system. As such, our new perceptual shape loss aims to judge the quality of a 3D face estimate using only shading cues. [Project Page]
We propose the first facial landmark detection network that can predict continuous, unlimited landmarks. Our method allows the user to specify the number and location of the desired landmarks at inference time, as continuous 3D query points relative to a 3D template model. [Project Page]
Cleft lip and palate is the most frequent craniofacial malformation in newborns, without effective preventive measures. The use of intra-oral orthopedic plates reduces the cleft size, facilitating surgical treatment. This project, Burden-Reduced Cleft Lip and Palate Care and Healing (BRCCH), aims at an automatic, image-based design (e.g., using smartphone videos) of personalized oral plates that are fabricated using 3D printers. Ultimately, the goal is to facilitate the use of plate therapy in low-income countries. This project is funded by the Botnar Research Center for Child Health (BRCCH) and implemented in collaboration with the Computer Graphics Lab (CGL) at ETH Zürich and the team of Dr. Andreas Müller from the University Hospital in Basel. [Project page]
Gaspard Zoss, Prashanth Chandran, Eftychios Sifakis, Markus Gross, Paulo Gotardo, Derek Bradley
This paper presents the first practical, fully-automatic and production-ready method for re-aging faces in video images. We show how a longitudinal re-aging dataset can be constructed using a state-of-the-art facial re-aging method that, although failing on real images, does provide photoreal re-aging on synthetic faces. We leverage such synthetic data and formulate facial re-aging as a practical image-to-image translation task with a simple U-Net. [Project page]
C. Otto, J. Naruniec, L. Helminger, T. Etterlin, G. Mignone, P. Chandran, G. Zoss, C. Schroers, M. Gross, P. Gotardo, D. Bradley, R. Weber
We approach face swapping as learning simultaneous facial autoencoders for the source and target identities, using a shared encoder network with identity-specific decoders. Our decoders first lift the latent code into a 3D representation, before using a differentiable renderer, thus allowing for artistic control over the result. Training does not require 3D supervision, leading to better results than when using off-the-shelf monocular 3D face reconstruction. [Project page]
Prashanth Chandran, Gaspard Zoss, Markus Gross, Paulo Gotardo, Derek Bradley
We present a 3D+time morphable model that learns a motion manifold using a transformer autoencoder. This new model can synthesize temporal sequences of 3D meshes with arbitrary length and identity. [Project page]
Daoye Wang, Prashanth Chandran, Gaspard Zoss, Derek Bradley, Paulo Gotardo
We present MoRF, morphable radiance fields that extend NeRFs into generative models for synthesizing photorealistic human heads with controllable and fully disentangled identity and 3D pose. MoRF allows for applications such as synthesizing new photorealistic subjects or quickly fitting a NeRF to one or more full-head portrait images. [Project page]
Sebastian Winberg, Gaspard Zoss, Prashanth Chandran, Paulo Gotardo, Derek Bradley
We reconstruct and track individual facial hairs over complex performance sequences in a traditional multiview setup. We additionally create a realistic approximation of the dynamic clean-shaven facial surface, as if the actor had been captured without facial hair, thus removing the need to actually shave. [Project page]
Yingyan Xu, Jérémy Riviere, Gaspard Zoss, Prashanth Chandran, Derek Bradley, Paulo Gotardo
We compare the results obtained with a state-of-the-art appearance capture method [RGB*20], with and without our proposed improvements to the lighting model. [Project page]
Prashanth Chandran, Sebastian Winberg, Gaspard Zoss, Jérémy Riviere, Markus Gross, Paulo Gotardo, Derek Bradley
We propose to combine incomplete, high-quality renderings showing only facial skin with recent methods for neural rendering of faces, in order to automatically and seamlessly create photo-realistic full-head portrait renders from captured data without the need for artist intervention. [Project page]
Prashanth Chandran, Gaspard Zoss, Markus Gross, Paulo Gotardo, Derek Bradley
We propose adaptive convolutions, a generic extension of AdaIN that allows for the simultaneous transfer of both statistical and structural styles in real time. [Project page]
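As background, AdaIN (adaptive instance normalization) transfers only per-channel statistics: it normalizes each content channel and rescales it to the mean and standard deviation of the matching style channel. The sketch below shows that baseline, not our extension; it assumes feature maps stored as lists of flattened channels, and the helper names `channel_stats` and `adain` are hypothetical.

```python
import math


def channel_stats(values, eps=1e-5):
    """Return the mean and (eps-stabilized) standard deviation of one channel."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + eps)


def adain(content, style, eps=1e-5):
    """AdaIN(x, y) = std(y) * (x - mean(x)) / std(x) + mean(y), per channel.

    content, style: lists of channels, each channel a flat list of floats.
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mean, c_std = channel_stats(c_chan, eps)
        s_mean, s_std = channel_stats(s_chan, eps)
        out.append([s_std * (v - c_mean) / c_std + s_mean for v in c_chan])
    return out


# One hypothetical channel of content and style features.
content = [[0.0, 1.0, 2.0, 3.0]]
style = [[10.0, 10.0, 14.0, 14.0]]
stylized = adain(content, style)

# The output keeps the content's ordering but takes on the style statistics.
out_mean, out_std = channel_stats(stylized[0], eps=0.0)
```

Because only two scalars per channel are transferred, spatial structure in the style features is discarded, which is the limitation that motivates a structure-aware extension.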
Jérémy Riviere, Paulo Gotardo, Derek Bradley, Abhijeet Ghosh, Thabo Beeler
We propose a new lightweight face capture system capable of reconstructing both high-quality geometry and detailed appearance maps from a single exposure. [Project page]
Paulo Gotardo, Jérémy Riviere, Derek Bradley, Abhijeet Ghosh, Thabo Beeler
We present a method to acquire dynamic properties of facial skin appearance, including dynamic diffuse albedo encoding blood flow, dynamic specular intensity, and per-frame high-resolution normal maps for a facial performance sequence. [Project page]
Zdravko Velinov, Marios Papas, Derek Bradley, Paulo Gotardo, Parsa Mirdehghan, Steve Marschner, Jan Novak, Thabo Beeler
We present a system specifically designed for capturing the optical properties of live human teeth such that they can be realistically re-rendered in computer graphics. [Project page]
Dan Calian, Tomas Simon, Paulo Gotardo, Jean-François Lalonde, Iain Matthews, Kenny Mitchell
This paper presents an approach to directly estimate an HDR light probe from a single LDR photograph, shot outdoors with a consumer camera, without specialized calibration targets or equipment. [Project page]
Paulo Gotardo, Tomas Simon, Yaser Sheikh, Iain Matthews
This paper proposes photogeometric scene flow (PGSF) for high-quality dynamic 3D reconstruction. Results are obtained as the coupled solution of multiview stereo, photometric stereo, and optical flow (with relighting). [Project page]
Yannick Hold-Geoffroy, Jinsong Zhang, Paulo Gotardo, Jean-François Lalonde
This work investigates the numerical conditioning and solutions for outdoor photometric stereo under uncontrolled, natural illumination in which the main light source, the Sun, shines from nearly co-planar directions throughout the day. We show the events that contribute to making the problem solvable over variable weather and short time intervals.
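To illustrate why nearly co-planar sun directions are a problem, consider the simplest setting, assumed here for illustration only: a Lambertian surface and three directional lights, where recovering a surface normal requires inverting the 3x3 matrix of light directions. If the directions are co-planar, that matrix is singular. The sketch below checks this with hypothetical, unnormalized directions (normalization does not change the rank); it is not the analysis from the paper.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c (scalar triple product)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


# Hypothetical light directions that all lie in the plane y = 0.
coplanar = [(1.0, 0.0, 1.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0)]

# Same set, with the last direction tilted out of that plane.
tilted = [(1.0, 0.0, 1.0), (0.0, 0.0, 1.0), (-1.0, 0.5, 1.0)]

det_coplanar = det3(*coplanar)  # zero: the normal cannot be recovered
det_tilted = det3(*tilted)      # non-zero: the system becomes solvable
```

In practice the directions are only nearly co-planar, so the determinant is small rather than exactly zero, and the question becomes one of numerical conditioning.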
While I was a PhD student and then postdoc at The Ohio State University, I developed state-of-the-art models and algorithms for matrix factorization and non-rigid structure from motion (NR-SfM), which were published in major computer vision venues and subsequently achieved 2nd, 3rd, and 4th places in the first NR-SfM challenge at CVPR 2017. More info and source code can be found on my old OSU home page: [Project page]
Paulo Gotardo, Alan Price
In a time before Microsoft's Kinect, this project explored the use of real-time stereo vision and skeletonization to provide 3D human body awareness in an inexpensive, immersive environment system. The goal was to enhance the user experience of immersion in a virtual scene projected in 3D, allowing for both the user and the virtual scene to become aware of each other's presence as part of a single, integrated 3D space. We focused on enabling authoring applications with direct manipulation of virtual objects, with users interacting from a first-person perspective (demo video). This emphasis contrasts with the avatar-based, reactive focus of game interfaces. For more info, please see my old OSU page: [Project page]
Paulo Gotardo, Kim L. Boyer, Joel Saltz, Subha V. Raman
Intra-ventricular dyssynchrony (IVD) in the left ventricle (LV) is caused by the asynchronous activation of the LV walls. Guidelines for resynchronization therapy rely on measures that do not reliably predict successful patient response to treatment, in part due to poor characterization of IVD. We present a two-class statistical pattern recognition approach for the detection of IVD in the LV from routinely acquired MRI sequences depicting complete cardiac cycles.
Paulo Gotardo, Olga R.P. Bellon, Luciano Silva, Kim L. Boyer
We present a novel robust estimator to iteratively detect and extract distinct planar and quadric surface patches in depth images. Our robust estimator extends M-estimator Sample Consensus/Random Sample Consensus (MSAC/RANSAC) to use local surface orientation, enhancing inlier/outlier classification when processing noisy range data describing multiple structures. An efficient approximation to the true geometric distance between a point and a quadric surface is also proposed. A genetic algorithm was specifically designed to accelerate the optimization process.
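For context, the baseline that our estimator extends can be sketched as plain MSAC plane fitting: sample three points, fit a plane, and score it with a truncated squared-distance cost. The sketch below is that baseline only. It does not use local surface orientation, quadric surfaces, or the genetic algorithm, and all names (`plane_from_points`, `msac_plane`) and data are hypothetical.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal n, offset d) with n . x + d = 0, or None if degenerate."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:  # collinear sample, no unique plane
        return None
    n = [c / norm for c in n]
    d = -sum(n[i] * p[i] for i in range(3))
    return n, d


def msac_plane(points, threshold, iterations=200, seed=0):
    """Fit a plane by MSAC: minimize the sum of min(residual^2, threshold^2)."""
    rng = random.Random(seed)
    t2 = threshold * threshold
    best_cost, best_model = float("inf"), None
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        n, d = model
        cost = 0.0
        for x in points:
            r2 = (n[0] * x[0] + n[1] * x[1] + n[2] * x[2] + d) ** 2
            cost += min(r2, t2)  # outliers pay a constant penalty
        if cost < best_cost:
            best_cost, best_model = cost, model
    return best_model


# Hypothetical data: a 5x5 grid on the plane z = 0 plus five outliers.
points = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]
points += [(0.0, 0.0, 3.0), (1.0, 2.0, 5.0), (3.0, 1.0, 7.0),
           (2.0, 4.0, 4.0), (4.0, 3.0, 9.0)]

normal, offset = msac_plane(points, threshold=0.1)
```

Unlike RANSAC, which only counts inliers, the MSAC cost also rewards planes whose inliers fit more tightly, which helps when several candidate planes have similar inlier counts.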