IdatDataset

class pylluminator.read_idat.IdatDataset(filepath: str, bit='float32')

Bases: object

Validates and parses an Illumina IDAT file.

Variables:
  • barcode

  • chip_type

  • n_snps_read – Default 0

  • run_info – Default []

  • bit (str) – Defines the data type, hence the precision. Default: ‘float32’

  • probes_df (pandas.DataFrame) – DataFrame holding the parsed .idat file data (Illumina IDs, mean_value, std_dev, n_beads)

Methods

__init__(filepath[, bit])

Initializes the IdatDataset by reading the .idat file provided.

get_section_offsets(infile, *args, **kwargs)

is_correct_version(infile, *args, **kwargs)

is_idat_file(infile, *args, **kwargs)

overflow_check()

Checks whether the dataframe contains any negative value, which indicates that an overflow occurred.

read(idat_file)

Reads the IDAT file and parses the appropriate sections.

Methods and attributes detail

__init__(filepath: str, bit='float32')

Initializes the IdatDataset by reading the .idat file provided.

Parameters:
  • filepath (str) – the IDAT file to parse.

  • bit (str) – Either ‘float32’ or ‘float16’. ‘float16’ pre-normalizes intensities, capping the maximum intensity at 32127 (effectively downscaling fluorescence). This halves the data size but reduces precision on ~0.01% of probes. Default: ‘float32’

Raises:

ValueError: The IDAT file has an incorrect identifier or version specifier.
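
The size trade-off described for the bit parameter can be illustrated with NumPy alone. This is a minimal sketch, not the library's implementation: the intensity values are made up, and modelling the cap as a plain element-wise minimum is an assumption about what "capping" means here.

```python
import numpy as np

# Hypothetical intensities; not read from a real IDAT file.
intensities = np.array([120.0, 4521.0, 32127.0, 50000.0], dtype=np.float32)

# Assumption: the cap is modelled as an element-wise minimum at 32127.
MAX_INTENSITY = 32127
capped = np.minimum(intensities, MAX_INTENSITY)

# Converting to float16 stores each value in 2 bytes instead of 4.
as_f16 = capped.astype(np.float16)

print(intensities.nbytes, as_f16.nbytes)
```

The float16 array occupies half the bytes of the float32 one, which is the saving the parameter description refers to.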

overflow_check() → bool

Checks whether the dataframe contains any negative value, which indicates that an overflow occurred.

Returns:

True if an overflow was detected in any value

Return type:

bool
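
The check described above amounts to testing a dataframe for negative entries. The sketch below shows one way to express that with pandas. The helper name has_overflow and the sample values are hypothetical, and this is not necessarily how overflow_check is implemented.

```python
import pandas as pd


def has_overflow(df: pd.DataFrame) -> bool:
    """Return True if any cell in df is negative (hypothetical helper)."""
    # (df < 0) gives a boolean frame; the first .any() reduces each
    # column, the second reduces across columns.
    return bool((df < 0).any().any())


# Made-up probe data using the column names listed for probes_df.
clean = pd.DataFrame({"mean_value": [284, 9405], "std_dev": [51, 730], "n_beads": [12, 9]})
overflowed = pd.DataFrame({"mean_value": [284, -25536], "std_dev": [51, 730], "n_beads": [12, 9]})

print(has_overflow(clean), has_overflow(overflowed))
```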

read(idat_file) → DataFrame

Reads the IDAT file and parses the appropriate sections. Joins the mean probe intensity values with their Illumina probe ID.

Parameters:

idat_file (file-like) – the IDAT file to process.

Returns:

mean probe intensity values indexed by Illumina ID.

Return type:

pandas.DataFrame
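
The shape of the result, intensity values indexed by Illumina ID, can be sketched with pandas. Everything here is illustrative: the IDs and values are invented, the index name illumina_id is an assumption, and the real method reads these arrays from the binary file rather than from Python lists.

```python
import pandas as pd

# Hypothetical arrays standing in for sections parsed from an IDAT file.
illumina_ids = [10600313, 10600322, 10600328]
mean_values = [284, 9405, 3538]
std_devs = [51, 730, 412]
n_beads = [12, 9, 15]

# Join the per-probe values with their Illumina ID by using the ID as index.
probes_df = pd.DataFrame(
    {"mean_value": mean_values, "std_dev": std_devs, "n_beads": n_beads},
    index=pd.Index(illumina_ids, name="illumina_id"),  # index name is an assumption
)

# Look up a probe by its Illumina ID.
print(probes_df.loc[10600322, "mean_value"])
```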