Evaluating Mixed-Methods Models for the Estimation of Ancestry from Skeletal Remains
Abstract
Many methods of ancestry estimation in forensic anthropology consider only a single type of data, e.g., metric or morphoscopic. These data are typically limited in scope, being derived from only a few skeletal regions (the cranium, dentition, or postcranial skeleton), and the resulting variables are not independent of one another. Methods of ancestry estimation should incorporate this potential covariation into their estimates.
Cranial morphoscopic and dental morphological traits were observed on a sample of 683 individuals of European American, African American, and Hispanic ancestry. These data were used to train and test models generated with random forest modeling and naive Bayesian analysis. Three models were generated with each method: one using only cranial data, one using only dental data, and one using a combination of those data sets. The models were then compared to evaluate how combining the cranial and dental data affects performance.
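The three-model design can be illustrated with a minimal Python sketch that fits a hand-written categorical naive Bayes classifier to a cranial-only, a dental-only, and a combined feature set. The trait names (`cranial_a`, `dental_a`), the group labels, and the six training rows are synthetic placeholders chosen for illustration; they are not variables or data from the study, and the random forest counterpart is omitted for brevity.

```python
from collections import Counter, defaultdict
from math import log

def train_naive_bayes(rows, labels, features):
    """Fit a categorical naive Bayes model on the chosen feature columns."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, label) -> counts of each trait score
    levels = defaultdict(set)            # feature -> set of trait scores seen in training
    for row, label in zip(rows, labels):
        for f in features:
            value_counts[(f, label)][row[f]] += 1
            levels[f].add(row[f])
    return class_counts, value_counts, levels, list(features)

def predict(model, row):
    """Return the label with the highest log posterior for one individual."""
    class_counts, value_counts, levels, features = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n in class_counts.items():
        score = log(n / total)  # log prior
        for f in features:
            count = value_counts[(f, label)][row[f]]
            # Laplace (add-one) smoothing avoids zero probabilities
            score += log((count + 1) / (n + len(levels[f])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Synthetic placeholder data: one cranial and one dental trait per individual.
rows = [
    {"cranial_a": 0, "dental_a": 0},
    {"cranial_a": 0, "dental_a": 1},
    {"cranial_a": 1, "dental_a": 0},
    {"cranial_a": 1, "dental_a": 0},
    {"cranial_a": 1, "dental_a": 1},
    {"cranial_a": 1, "dental_a": 1},
]
labels = ["g1", "g1", "g2", "g2", "g3", "g3"]

# One model per data set: cranial only, dental only, and combined.
feature_sets = {
    "cranial": ["cranial_a"],
    "dental": ["dental_a"],
    "combined": ["cranial_a", "dental_a"],
}
models = {name: train_naive_bayes(rows, labels, feats)
          for name, feats in feature_sets.items()}

new_individual = {"cranial_a": 1, "dental_a": 1}
for name, model in models.items():
    print(name, predict(model, new_individual))
```

In practice each model would be scored on held-out individuals rather than on a single hand-picked case, so that the three feature sets are compared on the same test sample.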
Model performance was evaluated using the positive predictive value, sensitivity, and the Matthews correlation coefficient, or where appropriate the RK statistic, and differences were examined both overall and by specific ancestry group. In general, models based on the combination of cranial and dental data perform better across metrics than models based on a single data set. There is some variation in the performance of specific data sets on particular ancestry groups. For example, dental-based models have high sensitivity scores when classifying European American individuals. However, even when applied to specific groups, combined-data models are better classifiers.
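For readers unfamiliar with these metrics, the following Python sketch shows how positive predictive value, sensitivity, and the RK statistic can be computed from a confusion matrix. It assumes the usual definition of RK as the multiclass generalization of the Matthews correlation coefficient, to which it reduces when there are two classes. The function names and the example counts are hypothetical and are not results from the study.

```python
from math import sqrt

def per_class_metrics(cm, k):
    """Positive predictive value and sensitivity for class k.

    cm[i][j] is the number of individuals of true class i assigned to class j.
    """
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # everyone assigned to class k
    actual_k = sum(cm[k])                    # everyone truly in class k
    ppv = tp / predicted_k if predicted_k else float("nan")
    sensitivity = tp / actual_k if actual_k else float("nan")
    return ppv, sensitivity

def rk_statistic(cm):
    """Multiclass correlation coefficient (equals MCC for two classes)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    true_counts = [sum(cm[k]) for k in range(n)]
    pred_counts = [sum(cm[i][k] for i in range(n)) for k in range(n)]
    numerator = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    denom = sqrt((total**2 - sum(p * p for p in pred_counts))
                 * (total**2 - sum(t * t for t in true_counts)))
    return numerator / denom if denom else 0.0

# Hypothetical three-group confusion matrix (rows: true group, columns: assigned group).
example_cm = [
    [40, 5, 5],
    [6, 38, 6],
    [8, 7, 35],
]

for k in range(len(example_cm)):
    ppv, sens = per_class_metrics(example_cm, k)
    print(f"class {k}: PPV={ppv:.3f}, sensitivity={sens:.3f}")
print(f"RK={rk_statistic(example_cm):.3f}")
```

Reporting positive predictive value and sensitivity per class alongside a single overall coefficient is what allows performance to be compared both overall and by ancestry group.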