I am an R&D engineer in the ASTRA team at Inria Paris. My current research topics are path and trajectory planning for autonomous vehicles and deep reinforcement learning for autonomous applications. In 2022, I received my PhD degree in Network, information and communication from Université Paris-Saclay, CentraleSupélec, under the supervision of Dr. Salah Eddine Elayoubi and Dr. Vineeth S. Varma. My PhD thesis is entitled "Robust control of platooning systems over imperfect wireless channels". In 2018, I received an MSc degree in Electrical Engineering with an emphasis on automation from Universidade Estadual de Campinas (UNICAMP), Brazil, under the supervision of Dr. Jose C. Geromel and Dr. Gabriela W. Gabriel. In 2016, I received my B.S. degree in Electrical Engineering from the Federal University of Rio Grande do Norte (UFRN), Brazil, with Laureate Mention (rank 1/60+).
@inproceedings{cao2022monoscene,
title={MonoScene: Monocular 3D Semantic Scene Completion},
author={Anh-Quan Cao and Raoul de Charette},
booktitle={CVPR},
year={2022}
}
MonoScene proposes a 3D Semantic Scene Completion (SSC) framework, where the dense geometry and semantics of a scene are inferred from a single monocular RGB image. Unlike the SSC literature, which relies on 2.5D or 3D input, we solve the complex problem of 2D to 3D scene reconstruction while jointly inferring its semantics. Our framework relies on successive 2D and 3D UNets bridged by a novel 2D-3D feature projection inspired by optics, and introduces a 3D context relation prior to enforce spatio-semantic consistency. Along with architectural contributions, we introduce novel global scene and local frustum losses. Experiments show that we outperform the literature on all metrics and datasets while hallucinating plausible scenery even beyond the camera field of view. Our code and trained models are available at https://github.com/cv-rits/MonoScene
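To give an intuition for what a 2D-3D feature projection can look like, here is a minimal sketch in plain Python. It is not the MonoScene implementation: it assumes a simple pinhole camera, nearest-pixel sampling, and zero features for voxels that fall outside the image, and the names `project_voxel` and `lift_features` are hypothetical. The released code at the URL above is the reference.

```python
def project_voxel(center, fx, fy, cx, cy):
    """Pinhole projection of a 3D point (camera frame, z forward) to pixel coordinates."""
    x, y, z = center
    if z <= 0:
        return None  # the point lies behind the camera
    return (fx * x / z + cx, fy * y / z + cy)


def lift_features(feature_map, voxel_centers, fx, fy, cx, cy):
    """Assign to each voxel the 2D feature found at its projected pixel.

    feature_map is indexed as feature_map[row][col] -> list of channels.
    Voxels projecting outside the image keep a zero feature (an assumption
    made for this sketch).
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    lifted = []
    for center in voxel_centers:
        feature = [0.0] * channels
        pixel = project_voxel(center, fx, fy, cx, cy)
        if pixel is not None:
            col, row = int(round(pixel[0])), int(round(pixel[1]))
            if 0 <= row < height and 0 <= col < width:
                feature = list(feature_map[row][col])
        lifted.append(feature)
    return lifted


# Toy 2x2 feature map with one channel and an idealised camera.
toy_map = [[[1.0], [2.0]], [[3.0], [4.0]]]
toy_voxels = [(0, 0, 1), (2, 2, 2), (1, 0, 1), (0, 0, -1), (5, 0, 1)]
toy_lifted = lift_features(toy_map, toy_voxels, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

In this toy setup, the last two voxels (one behind the camera, one outside the image) receive zero features. A 3D network would then have to infer their content from context.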
@inproceedings{cao21pcam,
title={{PCAM}: {P}roduct of {C}ross-{A}ttention {M}atrices for {R}igid {R}egistration of {P}oint {C}louds},
author={Cao, Anh-Quan and Puy, Gilles and Boulch, Alexandre and Marlet, Renaud},
booktitle={International Conference on Computer Vision (ICCV)},
year={2021},
}
Rigid registration of point clouds with partial overlaps is a longstanding problem usually solved in two steps: (a) finding correspondences between the point clouds; (b) filtering these correspondences to keep only the most reliable ones to estimate the transformation. Recently, several deep nets have been proposed to solve these steps jointly. We build upon these works and propose PCAM: a neural network whose key element is a pointwise product of cross-attention matrices that mixes low-level geometric and high-level contextual information to find point correspondences. A second key element is the exchange of information between the point clouds at each layer, allowing the network to exploit context information from both point clouds to find the best matching point within the overlapping regions. The experiments show that PCAM achieves state-of-the-art results among methods that, like ours, solve steps (a) and (b) jointly via deep nets.
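The idea of a pointwise product of cross-attention matrices can be illustrated with a small sketch in plain Python. It is only an illustration under stated assumptions, not the PCAM network: the scaled dot-product similarity, the row-wise softmax, the argmax matching, and the function names are all choices made for this example, and the per-layer features are given as inputs rather than learned.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(feats_p, feats_q):
    """Row-normalised similarity between each point of P and each point of Q."""
    scale = 1.0 / math.sqrt(len(feats_p[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(fp, fq)) for fq in feats_q])
        for fp in feats_p
    ]


def product_of_attention(layers):
    """Multiply the per-layer attention matrices entry by entry.

    layers is a list of (feats_p, feats_q) pairs, one pair per layer,
    ordered from low-level to high-level features.
    """
    product = None
    for feats_p, feats_q in layers:
        attention = cross_attention(feats_p, feats_q)
        if product is None:
            product = attention
        else:
            product = [
                [x * y for x, y in zip(row_a, row_b)]
                for row_a, row_b in zip(product, attention)
            ]
    return product


def best_matches(product):
    """For each point of P, index of the highest-scoring point of Q."""
    return [max(range(len(row)), key=row.__getitem__) for row in product]


# Two toy layers: 2 points in P, 3 points in Q, 2-dimensional features.
toy_layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]),
    ([[2.0, 0.0], [0.0, 2.0]], [[0.0, 2.0], [2.0, 0.0], [1.0, 1.0]]),
]
toy_matches = best_matches(product_of_attention(toy_layers))
```

Because the matrices are multiplied entry by entry, a candidate pair keeps a high score only if every layer agrees on it. This is one way to read how low-level and high-level cues are combined.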