`easypheno.optimization.paramfree_fitting`

Module Contents

Classes

ParamFreeFitting

Class that contains all info for the whole optimization using optuna for one model and dataset.

class easypheno.optimization.paramfree_fitting.ParamFreeFitting(save_dir, genotype_matrix_name, phenotype_matrix_name, phenotype, n_outerfolds, n_innerfolds, val_set_size_percentage, test_set_size_percentage, maf_percentage, save_final_model, task, current_model_name, dataset, models_start_time)

Class that contains all info for the whole optimization using optuna for one model and dataset.

Attributes

task (str): ML task (regression or classification) depending on target variable

current_model_name (str): name of the current model according to naming of .py file in package model

dataset (Dataset): dataset to use for optimization run

datasplit_subpath (str): subpath with datasplit info relevant for saving / naming

base_path (str): base_path for save_path

save_path (str): path for model and results storing

user_input_params (dict): all params handed over to the constructor that are needed in the whole class

Parameters

save_dir (pathlib.Path) – directory for saving the results
genotype_matrix_name (str) – name of the genotype matrix including datatype ending
phenotype_matrix_name (str) – name of the phenotype matrix including datatype ending
phenotype (str) – name of the phenotype to predict
n_outerfolds (int) – number of outer folds relevant for nested-cv
n_innerfolds (int) – number of inner folds relevant for nested-cv and cv-test
test_set_size_percentage (int) – size of the test set relevant for cv-test and train-val-test
val_set_size_percentage (int) – size of the validation set relevant for train-val-test
maf_percentage (int) – threshold for MAF filter as percentage value
save_final_model (bool) – specify if the final model should be saved
task (str) – ML task (regression or classification) depending on target variable
current_model_name (str) – name of the current model according to naming of .py file in package model
dataset (easypheno.preprocess.base_dataset.Dataset) – dataset to use for optimization run
models_start_time (str) – optimized models and starting time of the optimization run for saving purposes
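
As a rough sketch of how the constructor arguments fit together, the snippet below collects them in a dictionary. Only the parameter names come from the signature above. Every value (file names, the model name `blup`, the format of `models_start_time`) is a hypothetical placeholder, and the helper `check_percentages` with its 0 to 100 range check is an assumption for illustration, not part of easypheno.

```python
from pathlib import Path

# Hypothetical example values; only the keys mirror the documented signature.
paramfree_kwargs = {
    "save_dir": Path("results"),
    "genotype_matrix_name": "genotypes.h5",
    "phenotype_matrix_name": "phenotypes.csv",
    "phenotype": "trait_1",
    "n_outerfolds": 5,
    "n_innerfolds": 5,
    "val_set_size_percentage": 20,
    "test_set_size_percentage": 20,
    "maf_percentage": 5,
    "save_final_model": False,
    "task": "regression",
    "current_model_name": "blup",  # assumed model file name
    "dataset": None,  # placeholder: a real run needs an easypheno Dataset instance
    "models_start_time": "blup_2024-01-01_12-00-00",  # assumed format
}


def check_percentages(kwargs):
    """Return the names of percentage arguments outside the 0..100 range.

    Hypothetical helper; the range check is an assumption, not library behavior.
    """
    keys = ("val_set_size_percentage", "test_set_size_percentage", "maf_percentage")
    return [key for key in keys if not 0 <= kwargs[key] <= 100]


# With easypheno installed and a real Dataset in place of None, the call
# would presumably look like this:
# from easypheno.optimization.paramfree_fitting import ParamFreeFitting
# fitter = ParamFreeFitting(**paramfree_kwargs)
# results = fitter.run_fitting()
```

Passing the arguments by keyword avoids depending on their order, which differs slightly between the signature and the parameter list.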

run_fitting(self)

Run fitting of parameter-free models

Returns: dictionary with results overview

write_runtime_csv(self, dict_runtime)

Write runtime info to runtime csv file

Parameters: dict_runtime (dict) – dictionary with runtime information
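
The idea of writing runtime information to a CSV file can be sketched with the standard library alone. The function name `append_runtime_row`, the file path, and the dictionary keys are hypothetical. The actual file layout used by `write_runtime_csv` is not documented here, so this is only one plausible approach, and it assumes every row uses the same keys.

```python
import csv
from pathlib import Path


def append_runtime_row(csv_path, dict_runtime):
    """Append one row of runtime info, writing the header if the file is new.

    Hypothetical helper; assumes all rows share the same keys.
    """
    csv_path = Path(csv_path)
    is_new = not csv_path.exists()
    with csv_path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(dict_runtime))
        if is_new:
            # Only the first write to the file emits the column names.
            writer.writeheader()
        writer.writerow(dict_runtime)
```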

get_feature_importance(self, model, top_n=1000)

Get feature importances for models that provide them, e.g. BLUP

Parameters

model (easypheno.model._param_free_base_model.ParamFreeBaseModel) – model to analyze
top_n (int) – top n features to select

Returns

DataFrame with feature importance information

Return type

pandas.DataFrame
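
To illustrate what selecting the `top_n` features could mean, here is a minimal sketch in plain Python. The function `top_n_features` is hypothetical, and ranking by absolute importance is an assumption, since the page does not state the ranking criterion. The documented method returns a `pandas.DataFrame`; the sketch returns a list of tuples to stay dependency-free.

```python
def top_n_features(feature_names, importances, top_n=1000):
    """Return (name, importance) pairs with the top_n largest absolute importances.

    Hypothetical helper; ranking by absolute value is an assumption.
    """
    if len(feature_names) != len(importances):
        raise ValueError("feature_names and importances must have equal length")
    ranked = sorted(
        zip(feature_names, importances),
        key=lambda pair: abs(pair[1]),
        reverse=True,
    )
    return ranked[:top_n]
```

If `top_n` exceeds the number of features, slicing simply returns all of them in ranked order.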
