utils

Tools for file handling, logger parametrization, dataframe manipulation…

Functions

column_names_to_snake_case(df)

Convert the dataframe's column names from camel case to snake case, and replace spaces with underscores.
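The renaming rule described above can be sketched on a single column name. This is a minimal illustration and not the library's implementation. The helper name to_snake_case is hypothetical, and its handling of acronyms and digits is an assumption.

```python
import re


def to_snake_case(name: str) -> str:
    """Hypothetical helper: convert one camelCase column name to snake_case."""
    # Insert an underscore wherever a lowercase letter or digit is followed by an uppercase letter.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Replace each run of whitespace with a single underscore, then lowercase everything.
    return re.sub(r"\s+", "_", name.strip()).lower()
```

With pandas, such a helper could be applied to every column through df.rename(columns=to_snake_case), which returns a renamed copy. Whether the real function works in place or returns a new dataframe is not stated here.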

concatenate_non_na(row, col_names)

Concatenate the values of N columns into a list, excluding NaN.
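As a rough sketch of this behaviour, the function below collects the values of the requested columns and skips missing ones. The name concat_non_na_sketch is hypothetical, and treating None as missing alongside NaN is an assumption. The row is a plain dict here, although a pandas Series supports the same row[col] access.

```python
import math


def concat_non_na_sketch(row, col_names):
    """Collect row[col] for each requested column, skipping None and float NaN."""
    values = []
    for col in col_names:
        value = row[col]
        # NaN is a float that is not equal to itself; math.isnan makes the check explicit.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        values.append(value)
    return values


# Illustrative row; the column names and values are made up for this example.
row = {"gene_a": "TP53", "gene_b": float("nan"), "gene_c": "BRCA1"}
genes = concat_non_na_sketch(row, ["gene_a", "gene_b", "gene_c"])
```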

convert_to_path(input_path)

Return the input_path in a PathLike format.
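A conversion of this kind can be sketched with the standard library alone. The function name to_path_sketch is hypothetical, and the sketch only covers str and os.PathLike inputs; any other types the real function may accept, such as MultiplexedPath, are not handled.

```python
import os
from pathlib import Path


def to_path_sketch(input_path):
    """Return a pathlib.Path for a str or os.PathLike input; Path objects pass through unchanged."""
    if isinstance(input_path, Path):
        return input_path
    # os.fspath raises TypeError for objects that are not str, bytes or os.PathLike.
    return Path(os.fspath(input_path))
```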

download_from_geo(gsm_ids_to_download, ...)

Download IDAT files from GEO (Gene Expression Omnibus) given one or several GSM IDs.

download_from_link(dl_link, output_folder[, ...])

Download a file and save it to the target.

get_chromosome_number(chromosome_id[, ...])

From a string representing the chromosome ID, get the chromosome number.
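The extraction step can be illustrated with a regular expression. This is a sketch under assumptions: the name chromosome_number_sketch is hypothetical, an optional "chr" prefix is assumed, and non-numeric chromosomes such as X or Y simply return None, because the way the real function and its optional arguments treat them is not described here.

```python
import re


def chromosome_number_sketch(chromosome_id: str):
    """Return the leading number of an ID such as 'chr7' or '22', or None if there is none."""
    match = re.match(r"(?:chr)?(\d+)", chromosome_id.strip(), flags=re.IGNORECASE)
    return int(match.group(1)) if match else None
```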

get_column_as_flat_array(df, column[, remove_na])

Get values from one or several columns of a pandas dataframe, and return a flattened list of the values.
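One way to picture the flattening is the sketch below. It is not the library's code: the name flat_values_sketch is hypothetical, cells are assumed to hold scalar values, and the row-by-row ordering of the result is a property of this sketch only.

```python
import pandas as pd


def flat_values_sketch(df, column, remove_na=False):
    """Return the values of one or several columns as a single flat Python list."""
    columns = [column] if isinstance(column, str) else list(column)
    # to_numpy() gives a 2D array (rows x selected columns); ravel() flattens it row by row.
    flat = df[columns].to_numpy().ravel().tolist()
    if remove_na:
        flat = [value for value in flat if not pd.isna(value)]
    return flat
```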

get_files_matching(root_path, pattern)

Equivalent to Path.rglob() for MultiplexedPath.
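A recursive search of this kind can be sketched using only iterdir(), is_dir(), is_file() and name, which both pathlib.Path and importlib.resources traversable objects provide. The function name find_matching_sketch is hypothetical, and matching the pattern against the file name only (not the full relative path) is an assumption.

```python
from fnmatch import fnmatch


def find_matching_sketch(root, pattern):
    """Recursively collect the files under root whose name matches a glob pattern."""
    matches = []
    for entry in root.iterdir():
        if entry.is_dir():
            # Descend into sub-directories and gather their matches as well.
            matches.extend(find_matching_sketch(entry, pattern))
        elif entry.is_file() and fnmatch(entry.name, pattern):
            matches.append(entry)
    return matches
```

For example, find_matching_sketch(Path("annotations"), "*.csv") would return every CSV file below a hypothetical annotations folder.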

get_logger([level])

Get the current logger and set its level if the level parameter is defined.

get_logger_level()

Return the current logger level.
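The two logger helpers above can be approximated with the standard logging module. This is a sketch: the logger name example_package and the function names are hypothetical, and the real functions may configure handlers or formats that are not shown.

```python
import logging

LOGGER_NAME = "example_package"  # hypothetical logger name, chosen for this sketch only


def get_logger_sketch(level=None):
    """Return the shared logger, setting its level only when one is given."""
    logger = logging.getLogger(LOGGER_NAME)
    if level is not None:
        # setLevel accepts either an int (logging.DEBUG) or a level name ("DEBUG").
        logger.setLevel(level)
    return logger


def get_logger_level_sketch():
    """Return the effective numeric level of the shared logger."""
    return logging.getLogger(LOGGER_NAME).getEffectiveLevel()
```

Because logging.getLogger returns the same object for the same name, every call shares one logger and one level.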

get_resource_folder(module_path[, ...])

Find the resource folder, and create it if it doesn't exist and the corresponding parameter is set to True (default).

load_object(filepath[, object_type])

Load any object from a pickle file.

merge_alt_chromosomes(chromosome_id)

Merge the alternative chromosomes with their respective reference chromosome, e.g. 22_KI270928V1_ALT -> 22.
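Based on the example in the description, the mapping can be sketched as keeping the text before the first underscore. The name merge_alt_sketch is hypothetical, and the real function may apply extra rules (case handling, other contig naming schemes) that are not documented here.

```python
def merge_alt_sketch(chromosome_id: str) -> str:
    """Map an alternative contig ID to its reference chromosome by dropping everything after the first underscore."""
    return chromosome_id.split("_", 1)[0]
```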

merge_dataframe_by(df, by, **kwargs)

Merge a dataframe by one or several columns, using the merge_series_values function. The df parameter (pandas.DataFrame) is the input dataframe.

merge_series_values(items[, how])

Merge the values of a series into a single value.

remove_probe_suffix(probe_id)

Remove the last part of a probe ID, split by underscore.
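The description suggests splitting once from the right, which can be sketched as follows. The name strip_probe_suffix_sketch is hypothetical, the probe ID in the example is only illustrative, and returning IDs without an underscore unchanged is an assumption.

```python
def strip_probe_suffix_sketch(probe_id: str) -> str:
    """Drop the part of a probe ID that follows the last underscore, if any."""
    # rsplit with maxsplit=1 splits at the last underscore only.
    return probe_id.rsplit("_", 1)[0]


base_id = strip_probe_suffix_sketch("cg00002033_TC12")
```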

save_object(object_to_save, filepath)

Save any object as a pickle file.
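The save and load pair can be sketched as a pickle round trip. The function names are hypothetical, and using object_type as an isinstance check on the loaded object is an assumption about what that parameter is for.

```python
import pickle


def save_object_sketch(object_to_save, filepath):
    """Serialize any picklable object to filepath."""
    with open(filepath, "wb") as handle:
        pickle.dump(object_to_save, handle)


def load_object_sketch(filepath, object_type=None):
    """Load a pickled object, optionally checking its type (assumed meaning of object_type)."""
    with open(filepath, "rb") as handle:
        loaded = pickle.load(handle)
    if object_type is not None and not isinstance(loaded, object_type):
        raise TypeError(f"expected {object_type.__name__}, got {type(loaded).__name__}")
    return loaded
```

Unpickling can execute arbitrary code, so a loader like this should only be pointed at files from a trusted source.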

set_channel_index_as(df, column[, drop])

Use an existing column specified by argument column as the new channel index.

set_level_as_index(df, level[, drop_others])

Change the index of a MultiIndexed DataFrame to a single Index, using one level of the MultiIndex.
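In pandas terms, this operation can be sketched with reset_index. The name level_as_index_sketch is hypothetical, index levels are assumed to be named, and the reading of drop_others (discard the other levels instead of turning them into columns) is an assumption.

```python
import pandas as pd


def level_as_index_sketch(df, level, drop_others=False):
    """Keep one named level of a MultiIndex as the index; move or drop the other levels."""
    others = [name for name in df.index.names if name != level]
    # drop=True discards the other levels; drop=False turns them into regular columns.
    return df.reset_index(level=others, drop=drop_others)


# Illustrative data; the level names and values are made up for this example.
index = pd.MultiIndex.from_tuples([("cg1", "I"), ("cg2", "II")], names=["probe_id", "type"])
df = pd.DataFrame({"beta": [0.1, 0.9]}, index=index)
by_probe = level_as_index_sketch(df, "probe_id")
```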

set_logger(level)

Set the logger verbosity level.