OrionStar Leads MegaFace Million-Scale Face Recognition Challenge

2019-12-03 12:37:09

These days, it’s relatively simple for a computer algorithm to correctly identify images when the dataset is small. That is the case with most facial recognition challenges, like Labeled Faces in the Wild, which has a dataset consisting of just 13,000 images. But what about datasets that are many times larger? That is exactly what researchers at the University of Washington wanted to find out when they launched their MegaFace Challenge, an open competition aimed at evaluating the performance of facial recognition algorithms at the million-person scale. Many teams achieved impressive results in the challenge, but one team stood out above the rest. On March 21st, OrionStar posted the highest identification rate ever on the MegaFace challenge. OrionStar’s score of 98.355% beat out more than forty other competitors, including Google and Tencent.

Despite advances in AI and big data in recent years, computers still have trouble recognizing faces in large datasets. To understand this problem, researchers at the University of Washington created a massive database of faces called MegaFace to test how different facial recognition algorithms hold up as the pool of candidate faces grows. The researchers started with existing labeled image sets of celebrities and of individuals across a wide range of ages. They then added noise to the dataset in the form of “distractors” (one million Flickr images from around the world that are publicly available under a Creative Commons license). Teams were then invited to download the database and see how their algorithms performed when they had to distinguish among a million possible matches.

The MegaFace challenge tested the algorithms on two tasks. Verification measures how well an algorithm can decide whether two photos show the same person. Identification measures how accurately it can match a photo of a single individual to a different photo of the same person mixed in with a million distractors.
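The two tasks can be illustrated with a minimal Python sketch. It assumes each face has already been turned into a feature vector and that cosine similarity is the matching score; the function names, the 0.5 threshold, and the toy 3-D vectors are hypothetical and are not taken from the MegaFace evaluation code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(feat_a, feat_b, threshold=0.5):
    """Verification: decide whether two photos show the same person.
    The threshold value is an assumption chosen for illustration."""
    return cosine(feat_a, feat_b) >= threshold

def identify_rank1(probe, mate, distractors):
    """Identification: True if the probe's true mate scores higher
    than every distractor in the gallery."""
    mate_score = cosine(probe, mate)
    return all(cosine(probe, d) < mate_score for d in distractors)

def rank1_rate(trials):
    """Fraction of (probe, mate, distractors) trials with a rank-1 hit."""
    hits = sum(identify_rank1(p, m, ds) for p, m, ds in trials)
    return hits / len(trials)

# Toy 3-D features standing in for real face embeddings (made-up values).
probe = [1.0, 0.1, 0.0]
mate = [0.9, 0.2, 0.1]
distractors = [[0.0, 1.0, 0.0], [0.1, 0.0, 1.0], [0.5, 0.5, 0.5]]

print(verify(probe, mate))
print(rank1_rate([(probe, mate, distractors)]))
```

In the real challenge the distractor list holds up to a million features, so a single look-alike among them is enough to turn a hit into a miss. That is why identification rates tend to fall as the gallery grows.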

For its part, OrionStar competed in the identification track of Challenge 1 on the FaceScrub dataset, which tests identification with a varying number of distractors. The OrionStar team trained three deep networks (ResNet-101, ResNet-152, ResNet-200) with a joint softmax and triplet loss on MS-Celeb-1M (95K identities, 5.1M images); the triplet part was trained using batch online hard negative mining with subspace learning. The features of all three networks were concatenated to produce the final feature, whose dimension was set to 256x3. For data processing, the team used the original large images and ran its own detection and alignment system. For evaluation, the team cleaned FaceScrub (a dataset that includes 100K photos of 530 celebrities) and MegaFace with the code released by iBUG_DeepInsight.
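As a rough illustration of two of the ideas above, the Python sketch below shows a triplet loss that picks the hardest negative inside each batch, and the concatenation of three 256-dimensional features into one 768-dimensional feature. It is not OrionStar's implementation: the softmax term and the subspace learning step are left out, and the function names, the margin of 0.2, the squared Euclidean distance, and the L2 normalization are all assumptions made for the example.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (an assumption, not stated in the article)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def sq_dist(a, b):
    """Squared Euclidean distance between two features."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hard_negative_triplet_loss(features, labels, margin=0.2):
    """Mean triplet loss over a batch. Each anchor-positive pair is matched
    with the hardest negative: the closest sample of another identity."""
    losses = []
    n = len(features)
    for a in range(n):
        negatives = [sq_dist(features[a], features[k])
                     for k in range(n) if labels[k] != labels[a]]
        if not negatives:
            continue  # no other identity in the batch for this anchor
        hardest_neg = min(negatives)
        for p in range(n):
            if p != a and labels[p] == labels[a]:
                pos = sq_dist(features[a], features[p])
                losses.append(max(0.0, pos - hardest_neg + margin))
    return sum(losses) / len(losses) if losses else 0.0

def fuse(per_network_features):
    """Concatenate one feature per network into a single descriptor."""
    fused = []
    for feature in per_network_features:
        fused.extend(l2_normalize(feature))
    return l2_normalize(fused)

# Three made-up 256-D features, one per network, become one 768-D feature.
toy = [[float(i % 7 + 1)] * 256 for i in range(3)]
print(len(fuse(toy)))

# Tiny batch: two samples of identity 0 with one identity-1 sample between them.
batch = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.0]]
print(hard_negative_triplet_loss(batch, [0, 0, 1]))
```

Mining the hardest negative within each batch focuses the loss on the identities the network currently confuses, which is the general motivation for online hard negative mining.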

OrionStar’s success in this challenge echoes its first-place ranking in the "MS-Celeb-1M" challenge, a competition in 2017 to recognize and identify images of one million celebrities in a pre-set database. The challenge was organized by Microsoft Research (MSR) for a workshop at ICCV 2017, the world's premier international computer vision event.