
Results for ldo_run: LDO (leave-drug-out)



This diagram displays the mean rank of each model over all cross-validation splits: within each CV split, the models are ranked according to their MSE. Whether one model is significantly better than another is assessed using the Friedman test and the post-hoc Conover test. The Friedman test shows whether there are overall differences between the models. After a significant Friedman test, the pairwise Conover test is performed to identify which models significantly outperform others. A horizontal line connects models that are not significantly different from each other. The p-values are shown below. This diagram can only be rendered if at least 3 models were run.
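For reference, a minimal sketch of this ranking and testing procedure using scipy and scikit-posthocs (not the report's own implementation); the MSE values per CV split are placeholders, not results from this run:

```python
import numpy as np
import pandas as pd
from scipy.stats import friedmanchisquare
import scikit_posthocs as sp

# Rows: CV splits, columns: models; entries: MSE per split (placeholder values).
mse = pd.DataFrame(
    {
        "ElasticNet": [0.91, 0.91, 0.91, 0.90, 0.92],
        "RandomForest": [0.85, 0.84, 0.86, 0.85, 0.85],
        "NaivePredictor": [1.00, 1.01, 0.99, 1.00, 1.02],
    }
)

# Rank models within each CV split (lower MSE -> better, i.e. lower, rank).
ranks = mse.rank(axis=1)
print("Mean rank per model:\n", ranks.mean())

# Friedman test for overall differences between the models.
stat, p = friedmanchisquare(*[mse[col] for col in mse.columns])
print(f"Friedman test: statistic={stat:.3f}, p={p:.4f}")

# Only after a significant Friedman test, run the pairwise post-hoc Conover test.
if p < 0.05:
    p_matrix = sp.posthoc_conover_friedman(mse)  # pairwise p-value matrix
    print(p_matrix)
```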

Results of Post-Hoc Conover Test


|                           | DIPK   | ElasticNet | GradientBoosting | MultiOmicsNeuralNetwork | MultiOmicsRandomForest | NaiveMeanEffectsPredictor | NaivePredictor | RandomForest | SimpleNeuralNetwork |
|---------------------------|--------|------------|------------------|-------------------------|------------------------|---------------------------|----------------|--------------|---------------------|
| DIPK                      | 1.0000 | 0.9345     | 0.3675           | 0.8695                  | 1.0000                 | 0.8053                    | 0.1913         | 0.1422       | 0.6814              |
| ElasticNet                | 0.9345 | 1.0000     | 0.3258           | 0.9345                  | 0.9345                 | 0.7425                    | 0.2202         | 0.1653       | 0.6223              |
| GradientBoosting          | 0.3675 | 0.3258     | 1.0000           | 0.2874                  | 0.3675                 | 0.5116                    | 0.0291         | 0.0194       | 0.6223              |
| MultiOmicsNeuralNetwork   | 0.8695 | 0.9345     | 0.2874           | 1.0000                  | 0.8695                 | 0.6814                    | 0.2522         | 0.1913       | 0.5657              |
| MultiOmicsRandomForest    | 1.0000 | 0.9345     | 0.3675           | 0.8695                  | 1.0000                 | 0.8053                    | 0.1913         | 0.1422       | 0.6814              |
| NaiveMeanEffectsPredictor | 0.8053 | 0.7425     | 0.5116           | 0.6814                  | 0.8053                 | 1.0000                    | 0.1216         | 0.0877       | 0.8695              |
| NaivePredictor            | 0.1913 | 0.2202     | 0.0291           | 0.2522                  | 0.1913                 | 0.1216                    | 1.0000         | 0.8695       | 0.0877              |
| RandomForest              | 0.1422 | 0.1653     | 0.0194           | 0.1913                  | 0.1422                 | 0.0877                    | 0.8695         | 1.0000       | 0.0619              |
| SimpleNeuralNetwork       | 0.6814 | 0.6223     | 0.6223           | 0.5657                  | 0.6814                 | 0.8695                    | 0.0877         | 0.0619       | 1.0000              |

Violin Plots of Performance Measures over CV runs

Violin plots comparing all models

To focus on a specific metric, choose it from the dropdown menu in the top right corner. You can investigate the distribution of the performance measures by hovering over the plot. To select or exclude specific algorithms, (double-)click them in the legend.

Violin plots comparing all models with normalized metrics

Before calculating the evaluation metrics, all values were normalized by the predictions of the NaiveMeanEffectsPredictor. Since this only influences the R^2 and the correlation metrics, the error metrics are not shown.
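The exact normalization is not reproduced here; as a hedged sketch, assuming it amounts to subtracting the NaiveMeanEffectsPredictor's predictions from both the true and the predicted values, the example below also illustrates why the error metrics are unaffected (all values are placeholders):

```python
import numpy as np
from scipy.stats import pearsonr

y_true = np.array([0.2, 0.5, 0.9, 0.4, 0.7])
y_pred = np.array([0.25, 0.45, 0.85, 0.45, 0.65])  # some model's predictions
y_naive = np.array([0.3, 0.5, 0.8, 0.4, 0.6])      # NaiveMeanEffectsPredictor output (placeholder)

# Remove the part already explained by the naive baseline (assumed procedure).
y_true_norm = y_true - y_naive
y_pred_norm = y_pred - y_naive

r, _ = pearsonr(y_true_norm, y_pred_norm)
print(f"Pearson r after normalization: {r:.3f}")

# Error metrics such as MSE are unchanged when the same baseline is subtracted
# from both arrays, which is why they are omitted from the normalized plots:
mse_raw = np.mean((y_true - y_pred) ** 2)
mse_norm = np.mean((y_true_norm - y_pred_norm) ** 2)
assert np.isclose(mse_raw, mse_norm)
```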

Violin plots comparing performance measures for tests within each model

Heatmap Plots of Performance Measures over CV runs

Heatmap plots comparing all models

Unnormalized metrics aggregated over all CV runs with mean and standard deviation. The strictly standardized mean difference (SSMD) is a measure of effect size which is calculated pairwise: for two models, it is computed as (mean1 - mean2) / sqrt(var1 + var2) for a specific measure. The larger the absolute SSMD, the stronger the effect (e.g., a |SSMD| > 2 would indicate a strong effect).
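A small sketch of this pairwise SSMD, computed from per-CV-split metric values; the numbers are placeholders:

```python
import numpy as np

def ssmd(values_a: np.ndarray, values_b: np.ndarray) -> float:
    """Strictly standardized mean difference between two models' metric values."""
    return (values_a.mean() - values_b.mean()) / np.sqrt(values_a.var() + values_b.var())

# MSE per CV split for two hypothetical models.
mse_model_a = np.array([0.85, 0.84, 0.86, 0.85, 0.85])
mse_model_b = np.array([1.00, 1.01, 0.99, 1.00, 1.02])

value = ssmd(mse_model_a, mse_model_b)
print(f"SSMD = {value:.2f}")  # |SSMD| > 2 would indicate a strong effect
```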

Heatmap plots comparing all models with normalized metrics

Before calculating the evaluation metrics, all values were normalized by the predictions of the NaiveMeanEffectsPredictor. Since this only influences the R^2 and the correlation metrics, the error metrics are not shown.

Heatmap plots comparing performance measures for tests within each model

Regression plots

Comparison of normalized R^2 values

R^2 values can be compared here between models, either per cell line or per drug. This can show whether a model has consistently higher or lower R^2 values than another model, or identify cell lines/drugs on which models agree or disagree. The x-axis variable is selected via the first dropdown menu, the y-axis variable via the second.
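As an illustration of how to read such a comparison, a hypothetical sketch with made-up model names and R^2 values (points below the dashed diagonal favor the x-axis model):

```python
import matplotlib.pyplot as plt

# Hypothetical per-drug R^2 values for two models (not from this report).
r2_per_drug = {
    "Drug_A": {"RandomForest": 0.41, "ElasticNet": 0.35},
    "Drug_B": {"RandomForest": 0.12, "ElasticNet": 0.18},
    "Drug_C": {"RandomForest": 0.55, "ElasticNet": 0.30},
}

x = [v["RandomForest"] for v in r2_per_drug.values()]
y = [v["ElasticNet"] for v in r2_per_drug.values()]

fig, ax = plt.subplots()
ax.scatter(x, y)
for name, v in r2_per_drug.items():
    ax.annotate(name, (v["RandomForest"], v["ElasticNet"]))
ax.axline((0, 0), slope=1, linestyle="--")  # identity line: equal performance
ax.set_xlabel("RandomForest R^2")
ax.set_ylabel("ElasticNet R^2")
plt.show()
```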

Drug-wise comparison

Comparisons per model