The tisthemachinelearner package provides a simple R
interface to scikit-learn models through Python’s
tisthemachinelearner package. This vignette demonstrates
how to use the package with the Boston housing dataset
from the MASS package.
We’ll use the Boston dataset to predict the median home
value (medv) from the other neighborhood characteristics:
# Load the data and split it into features and target
X <- as.matrix(MASS::Boston[, -14]) # all columns except medv (column 14)
y <- MASS::Boston[, 14] # medv column
# Create train/test split
set.seed(42)
train_idx <- sample(nrow(X), size = floor(0.8 * nrow(X)))
X_train <- X[train_idx, ]
X_test <- X[-train_idx, ]
y_train <- y[train_idx]
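y_test <- y[-train_idx]
Now let’s fit a Ridge regression with calibration = TRUE and compare three methods for computing prediction intervals: split conformal, surrogate, and bootstrap (the alphas grid for hyperparameter tuning is left commented out):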
# Fit ridge regression model
start <- proc.time()[3]
reg_ridge <- tisthemachinelearner::regressor(
  X_train, y_train, "Ridge",
  # alphas = c(0.01, 0.1, 1, 10),
  calibration = TRUE,
  venv_path = "../venv"
)
end <- proc.time()[3]
cat("Time taken:", end - start, "seconds\n")
# Make predictions with split conformal intervals
start <- proc.time()[3]
predictions_ridge_splitconformal <- predict(reg_ridge, X_test, method = "splitconformal")
end <- proc.time()[3]
cat("Split conformal time taken:", end - start, "seconds\n")
# Make predictions with surrogate intervals
start <- proc.time()[3]
predictions_ridge_surrogate <- predict(reg_ridge, X_test, method = "surrogate")
end <- proc.time()[3]
cat("Surrogate time taken:", end - start, "seconds\n")
# Make predictions with bootstrap intervals
start <- proc.time()[3]
predictions_ridge_bootstrap <- predict(reg_ridge, X_test, method = "bootstrap")
end <- proc.time()[3]
cat("Bootstrap time taken:", end - start, "seconds\n")
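The start/end timing pattern used above repeats for every call. As a minimal sketch, it could be wrapped in a small helper built on base R’s system.time(). The name timed() is hypothetical and not part of tisthemachinelearner; the toy computation stands in for a call such as predict().

```r
# Hypothetical helper (not part of tisthemachinelearner): evaluates an
# expression, prints the elapsed wall-clock time, and returns the result.
timed <- function(label, expr) {
  # system.time() forces the lazily evaluated `expr` and records the timing
  elapsed <- system.time(result <- expr)[["elapsed"]]
  cat(label, "time taken:", elapsed, "seconds\n")
  invisible(result)
}

# Usage on a toy computation; any expression can be passed the same way
total <- timed("Toy sum", sum(sqrt(seq_len(1e5))))
```

Because the helper returns the value of the expression, the assignment on the last line behaves exactly as it would without the timing wrapper.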
# Calculate coverage
coverage_ridge_splitconformal <- mean(y_test >= predictions_ridge_splitconformal[, "lwr"] & y_test <= predictions_ridge_splitconformal[, "upr"])
coverage_ridge_surrogate <- mean(y_test >= predictions_ridge_surrogate[, "lwr"] & y_test <= predictions_ridge_surrogate[, "upr"])
coverage_ridge_bootstrap <- mean(y_test >= predictions_ridge_bootstrap[, "lwr"] & y_test <= predictions_ridge_bootstrap[, "upr"])
cat("Ridge Regression Split Conformal Coverage:", coverage_ridge_splitconformal, "\n")
cat("Ridge Regression Surrogate Coverage:", coverage_ridge_surrogate, "\n")
cat("Ridge Regression Bootstrap Coverage:", coverage_ridge_bootstrap, "\n")
sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] tisthemachinelearner_0.10.0 Matrix_1.7-5
#> [3] reticulate_1.46.0 rmarkdown_2.31
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 R6_2.6.1 fastmap_1.2.0 xfun_0.57
#> [5] lattice_0.22-9 maketools_1.3.2 cachem_1.1.0 knitr_1.51
#> [9] htmltools_0.5.9 png_0.1-9 buildtools_1.0.0 lifecycle_1.0.5
#> [13] cli_3.6.6 grid_4.6.0 sass_0.4.10 jquerylib_0.1.4
#> [17] compiler_4.6.0 sys_3.4.3 tools_4.6.0 evaluate_1.0.5
#> [21] bslib_0.11.0 Rcpp_1.1.1-1.1 yaml_2.3.12 jsonlite_2.0.0
#> [25] rlang_1.2.0
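For readers unfamiliar with split conformal prediction, the general idea behind method = "splitconformal" can be sketched in base R. This is an illustration under stated assumptions, not the package’s implementation: it uses synthetic data, a plain lm() fit instead of Ridge, and absolute residuals as conformity scores. The internals of tisthemachinelearner may differ.

```r
# Split conformal prediction intervals around a plain linear model.
# Illustrative only: this is NOT the tisthemachinelearner implementation.
set.seed(42)
n <- 500
x <- runif(n, -2, 2)
y <- 1 + 2 * x + rnorm(n)  # synthetic data, assumed for the example
dat <- data.frame(x = x, y = y)

# Three disjoint subsets: proper training, calibration, and test
idx <- sample(n)
train <- dat[idx[1:250], ]
calib <- dat[idx[251:400], ]
test <- dat[idx[401:500], ]

# 1. Fit the model on the proper training set only
fit <- lm(y ~ x, data = train)

# 2. Absolute residuals on the calibration set are the conformity scores
scores <- abs(calib$y - predict(fit, newdata = calib))

# 3. Finite-sample-corrected quantile of the scores at level 1 - alpha
#    (capped at the largest score to keep the sketch simple)
alpha <- 0.05
n_cal <- length(scores)
k <- ceiling((n_cal + 1) * (1 - alpha))
q_hat <- sort(scores)[min(k, n_cal)]

# 4. Symmetric interval around each test prediction
pred <- predict(fit, newdata = test)
intervals <- cbind(fit = pred, lwr = pred - q_hat, upr = pred + q_hat)

# Empirical coverage and average interval width on the test set
coverage <- mean(test$y >= intervals[, "lwr"] & test$y <= intervals[, "upr"])
avg_width <- mean(intervals[, "upr"] - intervals[, "lwr"])
cat("Coverage:", coverage, "\n")
cat("Average width:", avg_width, "\n")
```

Reporting the average width next to the coverage is useful when comparing interval methods: two methods can reach similar coverage while one produces much wider, less informative intervals.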