Facial recognition in production is difficult because neural networks are slow and expensive to train, and must be re-trained to recognize new faces added to the set. Older approaches that address these issues, such as eigenfaces, exist but don’t scale because they require a matrix decomposition. Apache Mahout offers a distributed singular value decomposition method that runs on Apache Spark and scales to matrices of arbitrary size, making it possible to use the older yet still powerful eigenfaces approach to recognize and add new faces in near real time (with the help of Apache Solr).
In this talk we present a full-stack, lambda-style facial recognition system. The offline component uses Apache Mahout on Apache Spark to compute the eigenfaces. The online component uses Apache Flink to identify faces in an image, decompose each face into a linear combination of the eigenfaces, search for a matching face using Apache Solr, and, if no match is found, add the face as a “new face.”
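
The online matching step can be illustrated with a minimal Python sketch. It is not the system presented in the talk: the eigenfaces and mean face are assumed to be precomputed (here they are tiny hand-picked vectors rather than output from Mahout), the hypothetical `FaceIndex` class is a plain in-memory list standing in for Apache Solr, the distance threshold is an arbitrary assumption, and face detection and Flink streaming are left out entirely.

```python
import math


def project(face, mean_face, eigenfaces):
    """Express a face as coefficients of a linear combination of eigenfaces.

    Assumes the eigenfaces are orthonormal, so each coefficient is the dot
    product of the mean-centered face with one eigenface.
    """
    centered = [p - m for p, m in zip(face, mean_face)]
    return [sum(c * e for c, e in zip(centered, ef)) for ef in eigenfaces]


def distance(a, b):
    """Euclidean distance between two coefficient vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class FaceIndex:
    """Hypothetical in-memory stand-in for the search index."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed cutoff for calling two faces a match
        self.entries = []           # list of (face_id, coefficients)

    def match_or_add(self, coeffs):
        """Return (face_id, is_new): nearest stored face, or a newly added one."""
        best_id, best_dist = None, float("inf")
        for face_id, stored in self.entries:
            d = distance(coeffs, stored)
            if d < best_dist:
                best_id, best_dist = face_id, d
        if best_id is not None and best_dist <= self.threshold:
            return best_id, False
        new_id = len(self.entries)
        self.entries.append((new_id, coeffs))
        return new_id, True


# Toy data: 4-pixel "images" and two orthonormal eigenfaces (illustrative only).
mean_face = [1.0, 1.0, 1.0, 1.0]
eigenfaces = [
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
]

index = FaceIndex(threshold=0.5)

face_a = [3.0, 1.0, 1.0, -1.0]
face_a_noisy = [3.2, 1.0, 1.0, -1.0]  # same person, slightly different image
face_b = [-1.0, 1.0, 1.0, 3.0]        # a different person

first = index.match_or_add(project(face_a, mean_face, eigenfaces))
second = index.match_or_add(project(face_a_noisy, mean_face, eigenfaces))
third = index.match_or_add(project(face_b, mean_face, eigenfaces))
```

In this sketch the first face is added as new, the slightly perturbed copy matches it, and the third face is added as a second identity. Adding a face only appends its coefficient vector to the index, which is why no re-training is involved.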