Fig. 3 | Microbial Cell Factories

Fig. 3

From: Machine learning for data integration in human gut microbiome

The general workflow of machine learning modelling. a, The modelling pipeline commonly consists of four steps: feature engineering, model training and optimization, performance evaluation, and model application and explanation; b, Confusion matrix. It summarizes the four possible outcomes of a binary classification model: true positive (TP), false positive (FP), true negative (TN), and false negative (FN); c, ROC curve. It plots sensitivity against 1 − specificity at varying classification thresholds of the predictive model; d, k-fold cross-validation. The original samples are randomly split into k subsets of equal size. In each round of cross-validation, a predictive model is trained on k − 1 subsets and validated on the single remaining subset.
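
Panels b to d can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation; the function names (`confusion_counts`, `roc_points`, `k_fold_indices`) and the toy labels and scores are hypothetical and chosen only for illustration. The sketch assumes binary labels coded as 1 (positive) and 0 (negative), that both classes are present, and that a score at or above the threshold counts as a positive prediction.

```python
import random


def confusion_counts(y_true, y_score, threshold):
    """Count TP, FP, TN, FN at one classification threshold (panel b)."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0  # assumed decision rule
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_points(y_true, y_score):
    """Return (1 - specificity, sensitivity) pairs over all thresholds (panel c)."""
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(y_score), reverse=True):
        tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
        sensitivity = tp / (tp + fn)  # requires at least one positive sample
        specificity = tn / (tn + fp)  # requires at least one negative sample
        points.append((1.0 - specificity, sensitivity))
    return points


def k_fold_indices(n_samples, k, seed=0):
    """Yield (training, validation) index lists for each of k rounds (panel d)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random split, reproducible via seed
    # Subsets are equal in size when k divides n_samples; otherwise sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Toy data, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

print(confusion_counts(y_true, y_score, threshold=0.5))  # (2, 1, 2, 1)
print(roc_points(y_true, y_score))
for training, validation in k_fold_indices(n_samples=10, k=5):
    print(len(training), len(validation))
```

In practice, a library such as scikit-learn offers tested equivalents of all three steps; the hand-written version here only makes the definitions in the caption explicit.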
