--- title: "Introduction to tisthemachinelearner, S3 interface booster" author: "T. Moudiki" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to tisthemachinelearner, S3 interface booster} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Introduction The `tisthemachinelearner` package provides a simple R interface to scikit-learn models through Python's `tisthemachinelearner` package. This vignette demonstrates how to use the package with R's built-in `mtcars` dataset. ## Setup First, let's load the required packages: ```{r} library(reticulate) library(tisthemachinelearner) ``` ## Data Preparation We'll use the classic `mtcars` dataset to predict miles per gallon (mpg) based on other car characteristics: ```{r, eval=FALSE} # Load data # Split features and target X <- as.matrix(MASS::Boston[, -14]) # all columns except mpg y <- MASS::Boston[, 14] # mpg column # Create train/test split set.seed(42) train_idx <- sample(nrow(X), size = floor(0.8 * nrow(X))) X_train <- X[train_idx, ] X_test <- X[-train_idx, ] y_train <- y[train_idx] y_test <- y[-train_idx] ``` ## Ridge Regression with Cross-Validation Now let's try Ridge regression with cross-validation for hyperparameter tuning: ```{r, eval=FALSE} # Fit booster model time <- proc.time()[3] reg_booster <- tisthemachinelearner::booster(X_train, y_train, "ExtraTreeRegressor", n_estimators = 100L, learning_rate = 0.1, show_progress = TRUE, verbose = TRUE, venv_path = "../venv") time <- proc.time()[3] - time cat("Time taken:", time, "seconds\n") # Make predictions time <- proc.time()[3] predictions <- predict(reg_booster, X_test) time <- proc.time()[3] - time cat("Time taken:", time, "seconds\n") # RMSE rmse <- sqrt(mean((y_test - predictions)^2)) cat("RMSE:", rmse, "\n") ``` ## Session Info ```{r} sessionInfo() ```