As shown in Table 1, we analyzed 12 features covering demographic attributes, biomarkers, and sociomarkers of patients, and implemented a random forest (RF) classification model [38] to identify subjects at risk of a second hospital visit within one year. To assess the generalizability of the model, we trained the random forest on a randomly selected 80% of the data and tested it on the remaining 20%. Further, to guard against overfitting, we applied five-fold cross-validation on the training dataset. In Model 1, which uses all available features (demographics, biomarkers, and sociomarkers), we obtained an average classification accuracy of 66.11% for the training set and 66.05% for the test set, as presented in Table 2. To evaluate the performance of the proposed model, we also calculated specificity and sensitivity from the confusion matrix. We obtained a specificity and sensitivity of 67.67% and 64.82% from the five-fold cross-validated training set and 67.63% and 64.82% from the test set, respectively. The similarity between the training and test results indicates that the model performs stably. Overall, this all-inclusive model can predict which pediatric asthma patients are at risk of a hospital revisit with an accuracy of 66.05%.
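The evaluation protocol described above (an 80/20 hold-out split, five folds within the training portion, and specificity and sensitivity read off the confusion matrix) can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the study: the function names, the seed, and the toy labels are assumptions made here, and the classifier itself is omitted.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        fit = [j for m, fold in enumerate(folds) if m != held_out for j in fold]
        yield fit, folds[held_out]


def classification_stats(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# 80/20 hold-out split, then five folds inside the training portion.
train_idx, test_idx = holdout_split(100)
folds = list(k_fold_indices(train_idx, k=5))

# Toy labels (1 = revisit, 0 = no revisit); these are not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
stats = classification_stats(y_true, y_pred)
print(stats)
```

In this sketch a model would be fit on each `fit` list and scored on the matching `validate` list, with the five scores averaged to give the cross-validated training figures; the held-out `test_idx` samples are scored once at the end.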

Table 1 Variables and operationalization

Table 2 Classification statistics (in %) for each model with RF and SVM techniques

Using only demographic and biomarker attributes in Model 2, we achieved slightly lower accuracies of 65.48% and 65.39% from the five-fold cross-validated training set and the test set, respectively. The specificity and sensitivity were 67.12% and 64.14% for the training set and 67.11% and 64.07% for the test set. This suggests that symptom-related features can identify patients at risk with an accuracy of approximately 65%. With Model 3, based on demographics and sociomarkers, we obtained an average classification accuracy of 61.28% from the cross-validated training set and 61.17% from the test set. The specificity and sensitivity were 62.70% and 60.16% for the training set and 62.59% and 60.11% for the test set. Interestingly, without using any symptom-related predictors, simple information gathered at the ZIP code level together with the demographic characteristics of a patient still allows us to predict which patients will revisit the hospital with 61% accuracy, and the results are stable on the test data as well. We also implemented a secondary cross-validation for Model 3 in which the training and testing datasets are drawn from different ZIP codes, to evaluate whether our models would be valid for data coming from different neighborhoods. This secondary cross-validation yielded 56.66% accuracy on the training data and 66.60% on the test data. Our results showed that the models we develop can be generalized to the other neighborhoods considered in this study. In this supplementary analysis, we note that the testing performance is better than the training performance. This unknown fit, where validation error is low and training error is relatively high [39], may be due to the design of the ZIP code based cross-validation, in which the total sample size and the distribution of Class 0 and Class 1 cases vary significantly across ZIP code areas. However, deeper analysis on a larger dataset may be required to understand the reasons for the unknown fit in this problem.
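One way to build a cross-validation in which training and testing samples never share a ZIP code is to assign whole ZIP codes, rather than individual patients, to folds. The sketch below illustrates that idea only; the text does not specify how the ZIP-code-based splits were actually constructed, and the function name, the fold count, and the placeholder ZIP labels are hypothetical.

```python
import random


def group_k_folds(groups, k=5, seed=0):
    """Yield (fit, held_out) index lists so that no group spans both sides."""
    unique_groups = sorted(set(groups))
    random.Random(seed).shuffle(unique_groups)
    # Assign each whole group (here, a ZIP code) to exactly one fold.
    fold_of_group = {g: i % k for i, g in enumerate(unique_groups)}
    for fold in range(k):
        held_out = [i for i, g in enumerate(groups) if fold_of_group[g] == fold]
        fit = [i for i, g in enumerate(groups) if fold_of_group[g] != fold]
        yield fit, held_out


# Placeholder ZIP labels, one per patient record; not data from the study.
zip_codes = ["A", "A", "B", "C", "C", "C", "D", "E", "E", "F"]
splits = list(group_k_folds(zip_codes, k=3))

for fit, held_out in splits:
    print(len(fit), len(held_out))
```

Because ZIP codes contribute different numbers of records, the folds produced this way are generally unequal in size and class balance, which is the kind of imbalance the paragraph above suggests as a possible source of the unknown fit.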

We also evaluated a support vector machine (SVM) classifier [40]; however, the random forest classifier yielded the best classification performance on the training and test sets [41,42,43]. It can be observed from Table 2 that the proposed model does not overfit and provides similar results for the training and test datasets. The classification results from the SVM classifier are also presented in Table 2, and it can be seen that the SVM classifier did not perform as well as the random forest classifier. With the SVM classifier, the average accuracy, specificity, and sensitivity across the three models on the test set were 62.1%, 59.58%, and 57.83%, respectively. Interestingly, the contribution margin of the sociomarkers is larger in this case, as the difference between the mean accuracies of Model 1 and Model 2 is 2.52 (compared to 0.65 in the random forest case).

Furthermore, to evaluate the null hypothesis that the mean accuracies obtained from the models are equal, we conducted a two-sample t-test on the test-set accuracies obtained from 1000 iterations of each model. As shown in Table 3, there are statistically significant differences between all models. The difference between Model 1 and Model 2 shows the statistically significant contribution of sociomarkers to predictive modeling. With the random forest classifier, the difference between the two groups is statistically significant at the 0.001 level. The mean difference is 0.65, and the 95% confidence interval of the mean difference in accuracy ranges from 0.47 to 0.84. With the SVM classifier, the 1000 results of Model 1 are likewise statistically different from the 1000 results of Model 2. The mean difference is 2.51, and the 95% confidence interval of the mean difference in accuracy ranges from 2.31 to 2.70. Overall, the contribution of the sociomarkers is larger in the results using the SVM classifier.
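The comparison of two sets of 1000 accuracies can be sketched as below. This is an illustrative calculation, not the procedure used for Table 3: it assumes an unequal-variance (Welch-style) standard error and a normal approximation to the t distribution, which is reasonable only for large samples such as 1000 draws per group. The function name and the synthetic inputs are assumptions made here.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def two_sample_comparison(a, b, confidence=0.95):
    """Two-sample test statistic with a large-sample confidence interval."""
    diff = fmean(a) - fmean(b)
    # Standard error of the difference in means, without pooling variances.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t_stat = diff / se
    # Normal approximation to the t distribution (large-sample assumption).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))  # two-tailed
    return {
        "mean_diff": diff,
        "t": t_stat,
        "p": p_value,
        "ci": (diff - z * se, diff + z * se),
    }


# Synthetic values standing in for 1000 per-iteration accuracies per model;
# they are generated here for illustration and are not results from the study.
rng = random.Random(0)
acc_a = [rng.gauss(0.5, 1.0) for _ in range(1000)]
acc_b = [rng.gauss(0.0, 1.0) for _ in range(1000)]
result = two_sample_comparison(acc_a, acc_b)
print(result)
```

The null hypothesis of equal means is rejected at a given level when `p` falls below it, which corresponds to the confidence interval for the mean difference excluding zero.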

Table 3 Two-tailed t-test results to compare accuracies of models

To evaluate the relative significance of each feature in the random forest classifier, we conducted a feature importance analysis, as shown in Table 4. Each importance score is normalized by the sum of the importance scores in the corresponding model, so that it expresses the feature's relative importance in that model's prediction. In Model 1, age and length of hospitalization contribute the most to the classification. In addition, sociomarkers contributed more than the race variable, which was identified as an important factor in a previous study [29]. Among the sociomarkers, blight prevalence and neighborhood quality are the most critical features, and they are as important as gender. In Model 2, the length of hospitalization is the most critical feature in prediction. In Model 3, age is the most important feature, followed by gender and neighborhood inequality.
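The normalization described above amounts to dividing each raw importance score by the sum of the scores within one model, so that the relative importances in that model add up to 1. A minimal sketch follows; the feature names and raw scores are hypothetical placeholders and do not reproduce Table 4.

```python
def normalize_importances(raw_scores):
    """Scale raw importance scores so that they sum to 1 within one model."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: score / total for name, score in raw_scores.items()}


# Hypothetical raw scores for a single model; not values from Table 4.
raw = {
    "age": 4.0,
    "length_of_hospitalization": 3.0,
    "gender": 2.0,
    "blight_prevalence": 1.0,
}
relative = normalize_importances(raw)

# Rank features from most to least important.
ranked = sorted(relative, key=relative.get, reverse=True)
print(ranked, relative)
```

Normalizing within each model makes the scores comparable as shares of that model's total, but a share in Model 1 and a share in Model 3 are taken over different feature sets and are not directly comparable.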