I will try to predict PAIN which is a very subjective term in the med and psychology with variables that I think “maybe” relevant. You can access data and my kaggle code from here:
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.2 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.1 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
df <- read_delim("SVMdata.csv",";", escape_double = FALSE, trim_ws = TRUE)
## Rows: 1267 Columns: 24
## -- Column specification --------------------------------------------------------
## Delimiter: ";"
## chr (9): Chief_complain, NRS_pain, SBP, DBP, HR, RR, BT, Saturation, Diagno...
## dbl (14): Group, Sex, Age, Patients number per hour, Arrival mode, Injury, M...
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
t <- table(df$Pain)
t <- as.data.frame(t)
colnames(t) <- c("Pain","count")
ggplot(t, aes(x=Pain, y=count, fill=Pain)) +
geom_bar(stat="identity", color="black") +
theme_minimal() +
geom_text(aes(label=count), vjust=-0.6, size=6) +
scale_fill_brewer(palette="Set1")
Another painful part of this dataset is missing values. First, I would like to look at the total missing values for each column.
names(df)
## [1] "Group" "Sex"
## [3] "Age" "Patients number per hour"
## [5] "Arrival mode" "Injury"
## [7] "Chief_complain" "Mental"
## [9] "Pain" "NRS_pain"
## [11] "SBP" "DBP"
## [13] "HR" "RR"
## [15] "BT" "Saturation"
## [17] "KTAS_RN" "Diagnosis in ED"
## [19] "Disposition" "KTAS_expert"
## [21] "Error_group" "Length of stay_min"
## [23] "KTAS duration_min" "mistriage"
apply(is.na(df), 2, sum)
## Group Sex Age
## 0 0 0
## Patients number per hour Arrival mode Injury
## 0 0 0
## Chief_complain Mental Pain
## 0 0 0
## NRS_pain SBP DBP
## 0 0 0
## HR RR BT
## 0 0 0
## Saturation KTAS_RN Diagnosis in ED
## 688 0 2
## Disposition KTAS_expert Error_group
## 0 0 0
## Length of stay_min KTAS duration_min mistriage
## 0 0 0
# Plot missing values.
library(naniar)
gg_miss_var(df)
## Warning: It is deprecated to specify `guide = FALSE` to remove a guide. Please
## use `guide = "none"` instead.
This is obivious there are plenty of missing variables in the Saturation. But the more weird thing is in the data, there are some values called “??”. I seen it when I convert the values that I am interested in to numeric.
The simputation package in R (van der Loo, 2019) implements a variety of imputation strategies, such as group-wise median imputation, linear regression etc… The following code uses the method to impute the missing values in the Saturation column using linear regression.
library(simputation)
##
## Attaching package: 'simputation'
## The following object is masked from 'package:naniar':
##
## impute_median
df$Saturation<-as.numeric(df$Saturation)
## Warning: NAs introduced by coercion
df[11:15] <- sapply(df[11:15],as.numeric)
## Warning in lapply(X = X, FUN = FUN, ...): NAs introduced by coercion
## Warning in lapply(X = X, FUN = FUN, ...): NAs introduced by coercion
## Warning in lapply(X = X, FUN = FUN, ...): NAs introduced by coercion
## Warning in lapply(X = X, FUN = FUN, ...): NAs introduced by coercion
## Warning in lapply(X = X, FUN = FUN, ...): NAs introduced by coercion
# This part is purely improvisational. There are too many missings in this column and I'm not a fan of the mean imputation if there are many.
imp_df <- impute_lm(df[,-c(1)],Saturation~Age+DBP+SBP+HR+RR+BT)
library(imputeTS)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
imp_df<-na_mean(imp_df)
## Warning: imputeTS: No imputation performed for column 17 because of this Error in na_mean(data[, i], option, maxgap): Input x is not numeric
#sanity check
apply(is.na(imp_df), 2, sum)
## Sex Age Patients number per hour
## 0 0 0
## Arrival mode Injury Chief_complain
## 0 0 0
## Mental Pain NRS_pain
## 0 0 0
## SBP DBP HR
## 0 0 0
## RR BT Saturation
## 0 0 0
## KTAS_RN Diagnosis in ED Disposition
## 0 2 0
## KTAS_expert Error_group Length of stay_min
## 0 0 0
## KTAS duration_min mistriage
## 0 0
imp_df<-na.omit(imp_df)
gg_miss_var(imp_df)
## Warning: It is deprecated to specify `guide = FALSE` to remove a guide. Please
## use `guide = "none"` instead.
One Hot Encoding for Chief_complain
imp_df<-imp_df %>% mutate(value = 1) %>% spread(Chief_complain, value, fill = 0 )
# In Chief_complain there were values called "??". Selecting variables that I am interested in:
norm<-imp_df%>%select(c(1,2,10:15,32:439))
norm
## # A tibble: 1,265 x 416
## Sex Age DBP HR RR BT Saturation KTAS_RN `abd pain`
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2 71 100 84 18 36.6 100 2 0
## 2 1 56 75 60 20 36.5 97.5 4 0
## 3 1 68 80 102 20 36.6 98 4 0
## 4 2 71 94 88 20 36.5 96.8 4 0
## 5 2 58 67 93 18 36.5 98.4 4 0
## 6 1 54 90 94 20 38.1 98 3 0
## 7 2 49 70 70 20 36.2 98 2 0
## 8 2 78 86 80 20 36 97.2 2 0
## 9 2 32 75 91 20 36.6 98.6 4 0
## 10 1 38 80 80 20 36.3 97 4 0
## # ... with 1,255 more rows, and 407 more variables: abd. Distension <dbl>,
## # Abd. pain <dbl>, abdomen discomfort <dbl>, abdomen distension <dbl>,
## # abdomen pain <dbl>, Abdominal pain (finding) <dbl>,
## # abdominal pain, LUQ <dbl>, abdominal pain, periumbilical area <dbl>,
## # abdominal pain, Rt <dbl>, abnormal lab. <dbl>,
## # Abnormality, Visual Acuity <dbl>, acute delirium <dbl>,
## # acute dyspnea <dbl>, acute epigastric pain <dbl>, alcohol smell <dbl>, ...
SVM Support vector machines (SVMs) offer a direct approach to binary classification (so it’s applicable to only two classes). In the Pain which we are interested in, we have two classes: -the patient has pain (𝑌 = +1) and patients that do not have pain (𝑌 = −1).
normalize <- function(x) {
return((x - min(x)) / (max(x) - min(x)))
}
norm<-apply(norm,2,normalize)
library(caTools)
set.seed(123)
norm<-as.data.frame(norm)
norm$Pain<-as.factor(imp_df$Pain)
sample<-sample.split(norm,SplitRatio = 0.8)
train<-subset(norm,sample==T)
test<-subset(norm,sample==F)
library(caret)
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
set.seed(1854) # for reproducibility
first_churn_svm <- train(
Pain ~ .,
data = train,
method = "svmRadial",
trControl = trainControl(method = "cv", number = 5),
tuneLength = 8
)
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
confusionMatrix(first_churn_svm)
## Cross-Validated (5 fold) Confusion Matrix
##
## (entries are percentual average cell counts across resamples)
##
## Reference
## Prediction 0 1
## 0 32.7 8.2
## 1 11.2 47.8
##
## Accuracy (average) : 0.8056
Plotting the results, we see that smaller values of the cost parameter provide better cross-validated accuracy scores for these training data:
ggplot(first_churn_svm) + theme_light()
ctrl <- trainControl(
method = "cv",
number = 15,
classProbs = TRUE,
summaryFunction = twoClassSummary # also needed for AUC/ROC
)
levels(train$Pain) <- c("No", "Yes")
table(train$Pain)
##
## No Yes
## 443 565
SVMs classify new data by identifying which side of the decision boundary they fall on; as a result, they do not provide class probabilities automatically. Predicted class probabilities are more useful than predicted class labels in most scenarios.
# Tune an SVM
set.seed(123) # for reproducibility
Pain_svm_auc <- train(
Pain ~ .,
data = train,
method = "svmRadial",
metric = "ROC", # area under ROC curve (AUC)
trControl = ctrl,
tuneLength = 15)
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
Similar to before, we see that smaller values of the cost parameter provide better cross-validated AUC scores on the training data:
Pain_svm_auc$results
## sigma C ROC Sens Spec ROCSD SensSD
## 1 0.3753193 0.25 0.8599473 0.7016092 0.8693694 0.05612426 0.07941237
## 2 0.3753193 0.50 0.8751341 0.7244444 0.8568990 0.05172320 0.06818035
## 3 0.3753193 1.00 0.8791329 0.7517241 0.8569464 0.04794389 0.06729925
## 4 0.3753193 2.00 0.8775198 0.7358621 0.8551920 0.04417149 0.06189418
## 5 0.3753193 4.00 0.8709454 0.7223755 0.8480797 0.04884229 0.05431717
## 6 0.3753193 8.00 0.8698582 0.7403831 0.8444286 0.04948701 0.06128537
## 7 0.3753193 16.00 0.8660827 0.7495019 0.8374111 0.05206548 0.05858440
## 8 0.3753193 32.00 0.8527400 0.7311877 0.8128023 0.05191687 0.05749731
## 9 0.3753193 64.00 0.8313656 0.7203065 0.7880512 0.05736820 0.06186439
## 10 0.3753193 128.00 0.8090926 0.6818391 0.7597914 0.05861884 0.06240135
## 11 0.3753193 256.00 0.7912394 0.6368582 0.7385017 0.05922469 0.06971604
## 12 0.3753193 512.00 0.7677672 0.6095785 0.7436226 0.05729409 0.06880340
## 13 0.3753193 1024.00 0.7588243 0.6026820 0.7383120 0.05979422 0.06398500
## 14 0.3753193 2048.00 0.7570174 0.5914943 0.7418682 0.06422436 0.06074393
## 15 0.3753193 4096.00 0.7555826 0.5843678 0.7348032 0.06343554 0.08670660
## SpecSD
## 1 0.06056964
## 2 0.06419525
## 3 0.06167287
## 4 0.05835604
## 5 0.05809416
## 6 0.05229235
## 7 0.06485412
## 8 0.06418962
## 9 0.07843770
## 10 0.08397745
## 11 0.07380374
## 12 0.07215212
## 13 0.06105673
## 14 0.06472511
## 15 0.07028558
confusionMatrix(Pain_svm_auc)
## Cross-Validated (15 fold) Confusion Matrix
##
## (entries are percentual average cell counts across resamples)
##
## Reference
## Prediction No Yes
## No 33.0 8.0
## Yes 10.9 48.0
##
## Accuracy (average) : 0.8105
Like many other Machine Learning algorithms, support v.m do not emit any natural measures of feature importance but, we can use the VIP package to get importance results for each variable using the permutation approach
prob_yes <- function(object, newdata) {
predict(object, newdata = newdata, type = "prob")[, "Yes"]
}
# Variable importance plot
library(vip)
##
## Attaching package: 'vip'
## The following object is masked from 'package:utils':
##
## vi
set.seed(2827) # for reproducibility
vip(Pain_svm_auc, method = "permute", nsim = 5, train = train,
target = "Pain", metric = "auc", reference_class = "Yes",
pred_wrapper = prob_yes)
The results indicate that sex and dizziness is the most important feature in predicting pain in this model.
# construct PDPs forthe top four features according to the permutation-based variable importance scores
library(pdp)
##
## Attaching package: 'pdp'
## The following object is masked from 'package:purrr':
##
## partial
features <- c("Sex", "dizziness",
"dyspnea","fever","Age")
pdps <- lapply(features, function(x) {
partial(Pain_svm_auc, pred.var = x, which.class = 2,
prob = TRUE, plot = TRUE, plot.engine = "ggplot2") +
coord_flip()
})
grid.arrange(grobs = pdps, ncol = 2)
## Warning: Use of `object[[1L]]` is discouraged. Use `.data[[1L]]` instead.
## Warning: Use of `object[["yhat"]]` is discouraged. Use `.data[["yhat"]]`
## instead.
## Warning: Use of `object[[1L]]` is discouraged. Use `.data[[1L]]` instead.
## Warning: Use of `object[["yhat"]]` is discouraged. Use `.data[["yhat"]]`
## instead.
## Warning: Use of `object[[1L]]` is discouraged. Use `.data[[1L]]` instead.
## Warning: Use of `object[["yhat"]]` is discouraged. Use `.data[["yhat"]]`
## instead.
## Warning: Use of `object[[1L]]` is discouraged. Use `.data[[1L]]` instead.
## Warning: Use of `object[["yhat"]]` is discouraged. Use `.data[["yhat"]]`
## instead.
## Warning: Use of `object[[1L]]` is discouraged. Use `.data[[1L]]` instead.
## Warning: Use of `object[["yhat"]]` is discouraged. Use `.data[["yhat"]]`
## instead.
levels(test$Pain) <- c("No", "Yes")
y_pred = predict(Pain_svm_auc, test)
caret::confusionMatrix(table(y_pred,test$Pain))
## Confusion Matrix and Statistics
##
##
## y_pred No Yes
## No 86 31
## Yes 23 117
##
## Accuracy : 0.7899
## 95% CI : (0.7349, 0.838)
## No Information Rate : 0.5759
## P-Value [Acc > NIR] : 3.88e-13
##
## Kappa : 0.574
##
## Mcnemar's Test P-Value : 0.3408
##
## Sensitivity : 0.7890
## Specificity : 0.7905
## Pos Pred Value : 0.7350
## Neg Pred Value : 0.8357
## Prevalence : 0.4241
## Detection Rate : 0.3346
## Detection Prevalence : 0.4553
## Balanced Accuracy : 0.7898
##
## 'Positive' Class : No
##
%80. SVM is flexible enough to adapt to complex nonlinear decision boundaries. It directly attempts to maximize generalizability accuracy and is robust for outliers. But even with a very low tune length, it’s very slow to train data.