1 Introduction

Human trafficking is one of the most adverse social issues currently faced by countries worldwide. According to the United Nations Children’s Fund (UNICEF) and the Inter-Agency Coordination Group against Trafficking (ICAT), 28% of the identified victims of human trafficking globally are children [7]. The Wall Street Journal reported in 2012 that an estimated 8 million children go missing around the world every year [8]. Children separated from their parents, such as refugees and migrants, are most vulnerable to trafficking. According to the FBI, in 2018 there were 424,066 NCIC (National Crime Information Center) entries for missing children in the United States [9]. As of 2018, juveniles under the age of 18 account for 34.8% of the total active missing records in NCIC [9]. The actual number of missing children is much higher than these official statistics, as only a limited number of cases are reported because of fear of traffickers, lack of information, and mistrust of authorities.

Face recognition is perhaps the most promising biometric technology for recovering missing children, since parents and relatives are more likely to have a lost child’s face photograph than other biometric modalities such as fingerprint or iris. While Automated Face Recognition (AFR) systems have been able to achieve high identification rates [3, 2, 11], their ability to recognize children as they age is still limited.

A human face undergoes various temporal changes, including changes in skin texture, weight, and facial hair (see the accompanying figure) [12, 13]. Several studies have analyzed the extent to which facial aging affects the performance of AFR (see Table 2). Two major conclusions can be drawn from these studies: (i) performance decreases as the time lapse between subsequent image acquisitions increases [14, 15, 16], and (ii) performance degrades more rapidly for younger individuals than for older individuals [16, 17]. Figure 7 illustrates that state-of-the-art face matchers fail considerably when matching an enrolled child in the gallery with the corresponding probe over large time lapses. Thus, it is essential to enhance the longitudinal performance of AFR systems, especially when the child is enrolled at a young age.

Locating missing children is analogous to the identification scenario (either open-set or closed-set) in face recognition, where we search a gallery of missing children to determine the identity of a child retrieved at a later age (the probe). As the time gap between a probe image and its true mate in the gallery grows, the search problem gets harder.
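
The identification scenario described above can be illustrated with a minimal sketch. It assumes that face features are fixed-length vectors compared by cosine similarity, which is a common choice but is not stated in this section. The function names (`l2_normalize`, `identify`) and the optional `threshold` used to mimic open-set rejection are hypothetical and introduced only for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def identify(probe, gallery, gallery_ids, threshold=None):
    """Search a gallery for the best match to a single probe feature.

    Closed-set: threshold is None, so the top-ranked identity is always returned.
    Open-set (assumed variant): if the best score is below threshold,
    the probe is treated as having no mate in the gallery and None is returned.
    """
    g = l2_normalize(gallery)          # shape (n_gallery, dim)
    p = l2_normalize(probe)            # shape (dim,)
    scores = g @ p                     # cosine similarity to every gallery entry
    order = np.argsort(-scores)        # indices sorted by decreasing similarity
    best = order[0]
    best_id = gallery_ids[best]
    if threshold is not None and scores[best] < threshold:
        best_id = None
    ranking = [gallery_ids[i] for i in order]
    return best_id, float(scores[best]), ranking


# Toy example with made-up 2-D "features".
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ids = ["a", "b", "c"]
best_id, score, ranking = identify([0.9, 0.1], gallery, ids)
```

Rank-1 identification accuracy is then the fraction of probes whose true mate appears first in `ranking`. As the time lapse grows, the probe feature drifts away from the enrolled feature, so the true mate tends to fall lower in the ranking.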

Figure 4: Face images of missing children in three high-profile cases who were successfully recovered after a large time lapse: (a) Saroo Brierley [18], (b) Jaycee Dugard [19], and (c) Carlina White [20].

Figure 7: Heat map of rank-1 identification accuracy (%) (a) without modifying FaceNet features by the proposed aging module and (b) with modifying FaceNet features by the proposed aging module (darker colors indicate higher accuracy). The age of the child in the gallery and the time lapse to the probe are shown along the two axes.

Prior studies on face recognition under aging, for both adults and children, have explored generative and discriminative models. Given a probe face image, generative models use Generative Adversarial Networks (GANs) to synthesize face images that either predict how the person will look over time (age progression) or estimate how the person looked in previous years (age regression) [22, 23, 24, 25, 26, 27]. Their primary motivation is to enhance the visual quality of the age-progressed or age-regressed face images rather than to enhance face recognition performance. Discriminative approaches, on the other hand, focus on age-invariant face recognition under the assumption that age-related and identity-related information can be separated [28, 29, 30, 31, 4, 5]. By separating out age-related components, only the identity-related information is used for face matching. Since age and identity are highly correlated in the feature space, the task of disentangling them from face embeddings is not only difficult but can also be detrimental to AFR performance [32, 33].

A majority of the prior studies on cross-age face recognition [28, 30, 31, 26, 5, 4] evaluate their models on longitudinal face datasets, such as MORPH (13,000 subjects in the age range of 16-77 years) and CACD (2,000 subjects in the age range of 16-62 years), which mainly comprise adult face images. Some benchmark face datasets, such as FG-NET (82 subjects in the age range of 0-69 years), do include a small number of children; however, the associated protocol is based on matching all possible comparisons for all ages, which does not explicitly provide child-to-adult matching performance. Moreover, earlier studies employ cross-sectional techniques in which temporal performance is analyzed according to differences between age groups [5, 23, 34]. In cross-sectional or cohort-based approaches, the choice of age groups or time lapses to evaluate is often arbitrary and varies from one study to another, thereby making comparisons between studies difficult [35, 36]. Furthermore, cross-sectional analysis with summary statistics does not investigate whether age-related face recognition performance trends are due to other noise factors such as variations in illumination, expression, and pose. For these reasons, and since facial aging is longitudinal by nature, cross-sectional analysis is not the correct model for exploring aggregated effects [35, 37, 38]. The correct model is the longitudinal model that has been utilized for temporal data for fingerprints [35], face [37, 15], and iris [38].

We propose an age-progression module that learns a projection in the feature space and can be used as a wrapper around any commodity face matcher. Our module can also synthesize the face image corresponding to aged features for a given individual and specified target age. Our empirical results show that the proposed module, based on an encoder-decoder architecture, can enhance the longitudinal face recognition performance of three face matchers (FaceNet [2], CosFace [3], and a commercial-off-the-shelf (COTS) matcher) for matching children as they age.
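
To make the idea of a feature-space wrapper concrete, the following is a minimal sketch and not the proposed module. The module described above is based on an encoder-decoder architecture whose details are not given in this section. The sketch substitutes a much simpler stand-in: an affine map, fit by ridge regression, from an enrolled feature and an age gap to the aged feature. The class `LinearAgingProjection`, the function `match_with_aging`, and all parameter names are hypothetical, and the data below are synthetic.

```python
import numpy as np


class LinearAgingProjection:
    """Hypothetical stand-in for a learned aging module (NOT the paper's model).

    Fits an affine map [feature, age_gap, 1] -> aged feature by ridge regression.
    """

    def __init__(self, ridge=1e-6):
        self.ridge = ridge
        self.W = None

    def _design(self, feats, age_gaps):
        # Stack features, age gap, and a bias column into one design matrix.
        feats = np.atleast_2d(np.asarray(feats, dtype=np.float64))
        gaps = np.asarray(age_gaps, dtype=np.float64).reshape(-1, 1)
        return np.hstack([feats, gaps, np.ones_like(gaps)])

    def fit(self, young_feats, age_gaps, old_feats):
        X = self._design(young_feats, age_gaps)
        Y = np.atleast_2d(np.asarray(old_feats, dtype=np.float64))
        A = X.T @ X + self.ridge * np.eye(X.shape[1])
        self.W = np.linalg.solve(A, X.T @ Y)   # closed-form ridge solution
        return self

    def transform(self, feats, age_gaps):
        return self._design(feats, age_gaps) @ self.W


def match_with_aging(probe_feat, gallery_feats, gallery_ages, probe_age, module):
    """Age every gallery feature to the probe's age, then score by cosine similarity.

    The base matcher is untouched: it only supplies the features.
    """
    gaps = probe_age - np.asarray(gallery_ages, dtype=np.float64)
    aged = module.transform(gallery_feats, gaps)
    aged = aged / np.linalg.norm(aged, axis=1, keepdims=True)
    p = np.asarray(probe_feat, dtype=np.float64)
    p = p / np.linalg.norm(p)
    return aged @ p


# Synthetic training pairs: features drift linearly with the age gap (an assumption).
rng = np.random.default_rng(0)
drift = rng.normal(size=8)
young = rng.normal(size=(200, 8))
gaps = rng.integers(1, 11, size=200).astype(float)
old = young + gaps[:, None] * drift
module = LinearAgingProjection().fit(young, gaps, old)

# Gallery of five children enrolled at age 5; the probe is subject 2 at age 15.
gallery = rng.normal(size=(5, 8))
probe = gallery[2] + 10.0 * drift
scores = match_with_aging(probe, gallery, [5] * 5, 15, module)
```

The design point this illustrates is that aging is applied to the gallery features before comparison, so any matcher that exposes its features could in principle be wrapped without retraining. Real aging is unlikely to follow the linear drift assumed here, which is one reason a learned nonlinear module would be preferred in practice.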