IdatDataset

class pylluminator.read_idat.IdatDataset(filepath: str, bit='float32')

Validates and parses an Illumina IDAT file.

Variables:

barcode
chip_type
n_snps_read – Default: 0
run_info – Default: []
bit (str) – Defines the data type, hence the precision. Default: ‘float32’
probes_df (pandas.DataFrame) – DataFrame holding the parsed .idat file data (Illumina IDs, mean_value, std_dev, n_beads)

Methods

`__init__`(filepath[, bit])	Initializes the IdatDataset by reading the .idat file provided.
`get_section_offsets`(infile, *args, **kwargs)
`is_correct_version`(infile, *args, **kwargs)
`is_idat_file`(infile, *args, **kwargs)
`overflow_check`()	Checks whether the dataframe contains any negative value, which would indicate an overflow.
`read`(idat_file)	Reads the IDAT file and parses the appropriate sections.

Methods and attributes detail

__init__(filepath: str, bit='float32')

Initializes the IdatDataset by reading the .idat file provided.

Parameters:

filepath (str) – the IDAT file to parse.
bit (str) – Either ‘float32’ or ‘float16’. ‘float16’ pre-normalizes intensities, capping the maximum intensity at 32127. This halves the data size but reduces precision on ~0.01% of probes, effectively downscaling fluorescence. Default: ‘float32’
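
The size and precision trade-off between the two `bit` options can be illustrated with the standard library alone. The sketch below is not pylluminator code: it uses the `struct` format codes `e` (IEEE 754 half precision) and `f` (single precision) as stand-ins for ‘float16’ and ‘float32’, and the helper name `roundtrip` is hypothetical.

```python
import struct


def roundtrip(value: float, fmt: str) -> float:
    """Pack a value with the given struct format, then unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# Bytes needed per stored value: half precision uses 2, single precision 4.
half_size = struct.calcsize("<e")
single_size = struct.calcsize("<f")

# Half precision has an 11-bit significand, so large integers are rounded
# to the nearest representable value; single precision keeps this one exact.
half_value = roundtrip(32127.0, "<e")
single_value = roundtrip(32127.0, "<f")

print(half_size, single_size)
print(half_value, single_value)
```

Only large intensities lose precision in half-precision storage; small integer values survive the round trip unchanged.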

Raises:

ValueError: The IDAT file has an incorrect identifier or version specifier.
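
The two conditions behind this ValueError can be sketched on an in-memory buffer. This is a minimal illustration, not the library's implementation: it assumes the file starts with the 4-byte ASCII identifier `IDAT` followed by an 8-byte little-endian integer version equal to 3. Neither detail is stated on this page, and the function name `check_header` is hypothetical.

```python
import io
import struct

EXPECTED_ID = b"IDAT"   # assumed file identifier
EXPECTED_VERSION = 3    # assumed supported version


def check_header(infile) -> None:
    """Raise ValueError if the identifier or version is not the expected one."""
    file_id = infile.read(4)
    if file_id != EXPECTED_ID:
        raise ValueError(f"Not an IDAT file: identifier {file_id!r}")
    raw = infile.read(8)
    if len(raw) != 8:
        raise ValueError("Truncated header: version field missing")
    (version,) = struct.unpack("<q", raw)
    if version != EXPECTED_VERSION:
        raise ValueError(f"Unsupported IDAT version: {version}")


# A header that matches the assumed layout passes silently.
good = io.BytesIO(b"IDAT" + struct.pack("<q", 3))
check_header(good)

# A wrong identifier is rejected before the version is even read.
bad = io.BytesIO(b"JPEG" + struct.pack("<q", 3))
try:
    check_header(bad)
except ValueError as err:
    message = str(err)
    print(message)
```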

overflow_check() → bool

Checks whether the dataframe contains any negative value, which would indicate an overflow.
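
Intensities are never negative, so a negative value can only come from a storage problem. One common way this happens, assumed here rather than taken from this page, is an unsigned 16-bit value above 32767 being reinterpreted as a signed 16-bit integer. The sketch below uses a plain list in place of the dataframe; `as_signed_16` and `has_overflow` are hypothetical helper names and the input values are made up.

```python
import struct


def as_signed_16(value: int) -> int:
    """Reinterpret an unsigned 16-bit integer as a signed 16-bit integer."""
    return struct.unpack("<h", struct.pack("<H", value))[0]


def has_overflow(values) -> bool:
    """Return True if any value is negative."""
    return any(v < 0 for v in values)


raw = [1200, 32767, 40000]               # made-up intensities
stored = [as_signed_16(v) for v in raw]  # 40000 wraps around to a negative

print(stored)
print(has_overflow(stored))
```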

read(idat_file) → DataFrame

Reads the IDAT file and parses the appropriate sections. Joins the mean probe intensity values with their Illumina probe ID.
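
The join step can be pictured as pairing two parallel arrays by position. The sketch below is only an illustration under that assumption: it uses plain Python containers rather than the DataFrame the real method returns, the function name `join_by_probe_id` is hypothetical, and the IDs and intensities are made up.

```python
def join_by_probe_id(illumina_ids, mean_values):
    """Pair each Illumina probe ID with the mean intensity at the same index."""
    if len(illumina_ids) != len(mean_values):
        raise ValueError("ID and intensity arrays differ in length")
    return dict(zip(illumina_ids, mean_values))


# Made-up values standing in for two parsed sections of a file.
illumina_ids = [10600313, 10600322, 10600328]
mean_values = [284, 1092, 5673]

probes = join_by_probe_id(illumina_ids, mean_values)
print(probes[10600322])
```

Checking the lengths first avoids silently dropping probes, since `zip` stops at the shorter input.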