Samples
- class pylluminator.samples.Samples(sample_sheet_df: DataFrame | None = None)
Bases: object
Samples objects hold sample methylation signal in a dataframe, as well as annotation information, sample sheet data and probe masks.
- Variables:
annotation (Annotations | None) – probes metadata. Default: None.
sample_sheet (pandas.DataFrame | None) – samples information given by the csv sample sheet. Default: None
min_beads (int | None) – minimum number of beads required for a probe to be considered. Default: None
idata (dict[str, dict[Channel, pandas.DataFrame]]) – dictionary of dataframes containing the raw signal values for each sample and channel. Default: {}
masks (MaskCollection) – collection of probes masks. Default: MaskCollection()
Methods
__init__([sample_sheet_df])Initialize the object with only a sample-sheet.
add_annotation_info(annotation, label_column)Merge manifest dataframe with probe signal values read from idat files to build the signal dataframe, adding channel information, methylation state and mask names for each probe.
batch_correction(batch[, apply_mask, ...])Applies ComBat algorithm for batch correction.
calculate_betas([include_out_of_band])Calculate beta values for all probes.
cg_probes([apply_mask, sigdf])Get CG (CpG) type probes, and apply the mask if apply_mask is True
ch_probes([apply_mask, sigdf])Get CH (CpH) type probes, and apply the mask if apply_mask is True
controls([apply_mask, pattern, sigdf])Get the subset of control probes, matching the pattern with the probe_ids if a pattern is provided
copy()Create a copy of the Samples object
drop_samples(sample_labels)Remove some samples.
dye_bias_correction([sample_label, ...])Dye bias correction using normalization control probes.
dye_bias_correction_l([sample_label, ...])Linear dye bias correction. Scale both the green and red signal to a reference level. If the reference level is not given, it is set to the mean intensity of all the in-band signals.
dye_bias_correction_nl([sample_labels, ...])Non-linear dye bias correction by matching green and red to mid-point.
get_betas([sample_label, drop_na, ...])Get the beta values for the sample.
get_m_values([sample_label, drop_na, ...])Get the M-values for the sample.
get_mean_ib_intensity([sample_label, apply_mask])Computes the mean intensity of all the in-band measurements.
get_nb_probes_per_chr_and_type()Count the number of probes covered by the sample(s) per chromosome and design type
get_negative_controls([apply_mask, sigdf])Get negative control signal
get_normalization_controls([apply_mask, ...])Returns the control values to normalize green and red probes.
get_probes(probe_ids[, apply_mask, sigdf])Returns the probes dataframe filtered on a list of probe IDs
get_probes_with_probe_type(probe_type[, ...])Select probes by probe type, meaning e.g. CG, Control, SNP.
get_signal_df([apply_mask])Get the methylation signal dataframe, and apply the mask if apply_mask is True
get_total_ib_intensity([sample_label, ...])Computes the total intensity of all the in-band measurements.
Return True if the beta values have already been calculated
ib([apply_mask, sigdf])Get the subset of in-band probes (for type I probes only), and apply the mask if apply_mask is True
ib_green([apply_mask, sigdf])Get the subset of in-band green probes (for type I probes only), and apply the mask if apply_mask is True
ib_red([apply_mask, sigdf])Get the subset of in-band red probes (for type I probes only), and apply the mask if apply_mask is True
infer_type1_channel([sample_labels, ...])For Infinium type I probes, infer the channel from the signal values, setting it to the channel with the max signal.
load(filepath)Load a pickled Samples object from filepath
mask_control_probes([sample_label])Shortcut to mask control probes
mask_non_cg_probes([sample_label])Shortcut to mask non-CpG probes
mask_non_unique_probes([sample_label])Shortcut to mask non-unique probes on this sample
mask_probes_by_names(names_to_mask[, ...])Match the names provided in names_to_mask with the probes mask info and mask these probes, adding them to the current mask if there is any.
mask_quality_probes([sample_label])Shortcut to mask quality probes
mask_snp_probes([sample_label])Shortcut to mask SNP probes
mask_xy_probes([sample_label])Shortcut to mask probes from the X and Y chromosomes
merge_samples_by(by[, apply_mask])Merge the beads signal values of different samples by averaging them.
meth([apply_mask, sigdf])Get the subset of methylated probes, and apply the mask if apply_mask is True
noob_background_correction([sample_labels, ...])Subtract the background for a sample.
oob([apply_mask, sigdf])Get the subset of out-of-band probes (for type I probes only), and apply the mask if apply_mask is True
oob_green([apply_mask, sigdf])Get the subset of out-of-band green probes (for type I probes only), and apply the mask if apply_mask is True
oob_red([apply_mask, sigdf])Get the subset of out-of-band red probes (for type I probes only), and apply the mask if apply_mask is True
poobah([sample_labels, apply_mask, ...])Detection P-value based on empirical cumulative distribution function (ECDF) of out-of-band signal aka pOOBAH (P-value with out-of-band (OOB) array hybridization).
remove_probes_suffix([apply_mask])Merge probes that have the same ID but different suffixes (e.g. _BC11, _TC21..) by averaging their signal values.
Remove betas dataframe
Remove poobah p-values from the signal dataframe
save(filepath)Save the current Samples object to filepath, as a pickle file
scrub_background_correction([sample_label, ...])Subtract residual background using background median.
snp_probes([apply_mask, sigdf])Get SNP type probes ('rs' probes in manifest, but replaced by 'snp' when loaded), and apply the mask if apply_mask is True
subset(sample_labels)Keep only the specified samples.
type1([apply_mask, sigdf])Get the subset of Infinium type I probes, and apply the mask if apply_mask is True
type1_green([apply_mask, sigdf])Get the subset of type I green probes, and apply the mask if apply_mask is True
type1_red([apply_mask, sigdf])Get the subset of type I red probes, and apply the mask if apply_mask is True
type2([apply_mask, sigdf])Get the subset of Infinium type II probes, and apply the mask if apply_mask is True
unmeth([apply_mask, sigdf])Get the subset of unmethylated probes, and apply the mask if apply_mask is True
Attributes
nb_probesCount the number of probes in the signal dataframe
nb_samplesCount the number of samples contained in the object
Return the list of probe IDs contained in the signal dataframe
sample_label_nameReturn the name of the sample sheet column used as sample labels.
sample_labelsReturn the names of the samples contained in this object, that also exist in the sample sheet.
Methods and attributes detail
- __init__(sample_sheet_df: DataFrame | None = None)
Initialize the object with only a sample-sheet.
- Parameters:
sample_sheet_df (pandas.DataFrame | None) – sample sheet dataframe. Default: None
- add_annotation_info(annotation: Annotations, label_column: str, keep_idat=False, min_beads=1) None
Merge manifest dataframe with probe signal values read from idat files to build the signal dataframe, adding channel information, methylation state and mask names for each probe.
For manifest file, merging is done on Illumina IDs, contained in columns address_a and address_b of the manifest file.
- Parameters:
annotation (Annotations) – annotation data corresponding to the sample
label_column (str) – the name of the sample sheet column used for sample labels (eg sample_id, sample_name)
min_beads (int) – filter probes with less than min_beads beads. Default: 1
keep_idat (bool) – if set to True, keep idat data after merging the annotations. Default: False
- Returns:
None
- batch_correction(batch: list | str, apply_mask: bool = True, covariates: str | list[str] | None = None, par_prior=True, mean_only=False, ref_batch=None, precision=None, na_cov_action='raise') None
Applies the ComBat algorithm for batch correction. To correct the beta values while staying in the [0, 1] range, the algorithm is applied to M-values, which are then converted back to beta values. If the batch correction fails, the beta values are reset to None.
- Parameters:
batch (str | list) – If a string is provided, it’s interpreted as the name of the column in the sample sheet that contains the batch information. If a list is provided, it should contain the batch indices, with as many values as samples.
apply_mask (bool) – set to False if you don’t want any mask to be applied. Default: True
covariates (str | list[str] | None) – a list of column names from the sample sheet to use as covariates in the model. It only supports categorical or string variables. Default: None
par_prior (bool) – False for non-parametric estimation of batch effects. Default: True
mean_only (bool) – True to adjust only the means and not the individual batch effects. Default: False
ref_batch – batch id of the batch to use as reference. Default: None
precision (float) – level of precision for precision computing. Default: None
na_cov_action (str) – how to handle missing covariates: 'raise' raises an error and stops, 'remove' removes samples with missing covariates and raises a warning, 'fill' handles missing covariates by creating a distinct covariate per batch. Default: 'raise'
- Returns:
None
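The reason for correcting on M-values rather than on beta values can be illustrated with a minimal sketch. It assumes the usual logit (base 2) relationship between beta and M-values; the `shift` value is a hypothetical stand-in for a batch adjustment and is not produced by ComBat here.

```python
import math


def beta_to_m(beta: float) -> float:
    """Logit (base 2) transform of a beta value in the open interval (0, 1)."""
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m: float) -> float:
    """Inverse transform: the result always lies strictly between 0 and 1."""
    return 2.0 ** m / (2.0 ** m + 1.0)


betas = [0.05, 0.50, 0.95]
shift = 2.0  # hypothetical batch adjustment, expressed in M-value units

# Adjusting in M-value space and converting back keeps every value in (0, 1).
corrected = [m_to_beta(beta_to_m(b) + shift) for b in betas]

# Adjusting beta values directly can push them outside the valid range.
naive = [b + 0.1 for b in betas]
```

Here every element of `corrected` remains a valid beta value, while the last element of `naive` exceeds 1.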
- calculate_betas(include_out_of_band=False) None
Calculate beta values for all probes. Values are stored in a dataframe and can be accessed via the get_betas() function.
- Parameters:
include_out_of_band (bool) – if set to True, the type I probes beta values will be calculated on in-band AND out-of-band signal values. If set to False, they will be calculated on in-band values only. Equivalent to sumTypeI in sesame. Default: False
- Returns:
None
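As an illustration of what include_out_of_band changes, here is a minimal sketch assuming the common definition beta = M / (M + U). The probe values and dictionary keys are hypothetical, and the library's own computation may differ in details such as clipping or handling of missing values.

```python
def beta_value(meth: float, unmeth: float) -> float:
    """Fraction of the total signal that comes from the methylated allele."""
    total = meth + unmeth
    if total <= 0:
        return float("nan")  # no usable signal for this probe
    return meth / total


# Hypothetical type I probe: in-band and out-of-band intensities per allele.
probe = {"ib_meth": 900.0, "ib_unmeth": 100.0, "oob_meth": 60.0, "oob_unmeth": 140.0}

# In-band values only (the behaviour described for include_out_of_band=False).
in_band_only = beta_value(probe["ib_meth"], probe["ib_unmeth"])

# In-band and out-of-band values summed per allele (include_out_of_band=True).
with_out_of_band = beta_value(
    probe["ib_meth"] + probe["oob_meth"],
    probe["ib_unmeth"] + probe["oob_unmeth"],
)
```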
- cg_probes(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get CG (CpG) type probes, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- ch_probes(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get CH (CpH) type probes, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- controls(apply_mask: bool = True, pattern: str | None = None, sigdf: DataFrame | None = None) DataFrame | None
Get the subset of control probes, matching the pattern with the probe_ids if a pattern is provided
- Parameters:
- Returns:
methylation signal dataframe of the control probes, or None if none were found
- Return type:
pandas.DataFrame | None
- drop_samples(sample_labels: str | list[str]) None
Remove some samples. Delete the signal information, beta values, sample sheet rows and masks. Ignores non-existent sample names
- dye_bias_correction(sample_label: str | None = None, apply_mask: bool = True, reference: dict | None = None) None
Dye bias correction using normalization control probes.
- Parameters:
sample_label (str | None) – the name of the sample to correct dye bias for. If None, correct dye bias for all samples.
apply_mask (bool) – set to False if you don’t want any mask to be applied. Default: True
reference (dict | None) – values to use as reference to scale red and green signal for each sample (=dict keys). Default: None
- Returns:
None
- dye_bias_correction_l(sample_label: str | None = None, apply_mask: bool = True, reference: dict | None = None) None
Linear dye bias correction. Scale both the green and red signal to a reference level. If the reference level is not given, it is set to the mean intensity of all the in-band signals.
- Parameters:
sample_label (str | None) – the name of the sample to correct dye bias for. If None, correct dye bias for all samples.
apply_mask (bool) – set to False if you don’t want any mask to be applied. Default: True
reference (dict | None) – values to use as reference to scale red and green signal for each sample (=dict keys). Default: None
- Returns:
None
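The scaling idea behind the linear correction can be sketched as follows. This is a simplified illustration, not the library's implementation: the function name and the intensity lists are hypothetical, and the default reference is taken here as the mean of the pooled green and red values.

```python
from statistics import mean


def linear_dye_bias_factors(green, red, reference=None):
    """Return (green_factor, red_factor) that bring both channel means to `reference`."""
    if reference is None:
        # Assumption: default reference is the mean of all provided intensities.
        reference = mean(list(green) + list(red))
    return reference / mean(green), reference / mean(red)


# Hypothetical in-band intensities for one sample.
green = [800.0, 1200.0, 1000.0]
red = [1500.0, 2500.0, 2000.0]

factor_g, factor_r = linear_dye_bias_factors(green, red)
scaled_green = [g * factor_g for g in green]
scaled_red = [r * factor_r for r in red]
```

After scaling, both channels share the same mean intensity, which is the reference level.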
- dye_bias_correction_nl(sample_labels: str | list[str] | None = None, apply_mask: bool = True) None
Non-linear dye bias correction by matching green and red to mid-point. Each sample is handled separately.
This function compares the type I red probes and the type I green probes and generates a mapping to correct the signal of the two channels to the middle.
- Parameters:
- Returns:
None
- get_betas(sample_label: str | None = None, drop_na: bool = False, probe_ids: list[str] | str | None = None, custom_sheet: DataFrame | None = None, apply_mask: bool = True) DataFrame | Series | None
Get the beta values for the sample. If no sample name is provided, return beta values for all samples.
- Parameters:
sample_label (str | None) – the name of the sample to get beta values for. If None, return beta values for all samples.
drop_na (bool) – if set to True, drop rows with NA values. Default: False
custom_sheet (pandas.DataFrame | None) – a custom sample sheet to filter samples. Ignored if sample_label is provided. Default: None
apply_mask (bool) – set to False if you don’t want any mask to be applied. Default: True
- Returns:
beta values as a DataFrame, or Series if sample_label is provided. If no beta values are found, return None
- Return type:
pandas.DataFrame | pandas.Series | None
- get_m_values(sample_label: str | None = None, drop_na: bool = False, probe_ids: list[str] | str | None = None, custom_sheet: DataFrame | None = None, apply_mask: bool = True) DataFrame | Series | None
Get the M-values for the sample. If no sample name is provided, return M-values for all samples. They are calculated from the beta values, so they need to be calculated first using the calculate_betas() function.
- Parameters:
sample_label (str | None) – the name of the sample to get M-values for. If None, return M-values for all samples.
drop_na (bool) – if set to True, drop rows with NA values. Default: False
custom_sheet (pandas.DataFrame | None) – a custom sample sheet to filter samples. Ignored if sample_label is provided. Default: None
apply_mask (bool) – set to False if you don’t want any mask to be applied. Default: True
- Returns:
M-values as a DataFrame, or Series if sample_label is provided. If no beta values are found, return None
- Return type:
pandas.DataFrame | pandas.Series | None
- get_mean_ib_intensity(sample_label: str | None = None, apply_mask=True) dict
Computes the mean intensity of all the in-band measurements. This includes all Type-I in-band measurements and all Type-II probe measurements. Both methylated and unmethylated alleles are considered.
- Parameters:
- Returns:
mean in-band intensity value
- Return type:
dict
- get_nb_probes_per_chr_and_type() tuple[DataFrame, DataFrame]
Count the number of probes covered by the sample(s) per chromosome and design type
- Returns:
two dataframes: number of probes per chromosome and number of probes per design type (masked and not masked)
- get_negative_controls(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame | None
Get negative control signal
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
the negative controls, or None if none were found
- Return type:
pandas.DataFrame | None
- get_normalization_controls(apply_mask: bool = True, average=False, sigdf: DataFrame | None = None) dict | DataFrame | None
Returns the control values to normalize green and red probes.
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Default: True
average (bool) – if set to True, returns a dict with keys ‘G’ and ‘R’ containing the average of the control probes. Otherwise, returns a dataframe with selected probes. Default: False
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
the normalization controls as a dict or a dataframe, or None if none were found
- Return type:
dict | pandas.DataFrame | None
- get_probes(probe_ids: list[str] | str, apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Returns the probes dataframe filtered on a list of probe IDs
- Parameters:
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- get_probes_with_probe_type(probe_type: str, apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Select probes by probe type, meaning e.g. CG, Control, SNP… (not infinium type I/II type), and apply the mask if apply_mask is True
- Parameters:
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- get_signal_df(apply_mask: bool = True) DataFrame
Get the methylation signal dataframe, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True sets masked probe values to None. Default: True
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- get_total_ib_intensity(sample_label: str | list[str] | None = None, apply_mask: bool = True) DataFrame
Computes the total intensity of all the in-band measurements. This includes all Type-I in-band measurements and all Type-II probe measurements. Both methylated and unmethylated alleles are considered.
- Parameters:
- Returns:
the total in-band intensity values
- Return type:
pandas.DataFrame
- ib(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of in-band probes (for type I probes only), and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- ib_green(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of in-band green probes (for type I probes only), and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- ib_red(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of in-band red probes (for type I probes only), and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- infer_type1_channel(sample_labels: str | list[str] | None = None, switch_failed=False, mask_failed=False, summary_only=False) DataFrame
For Infinium type I probes, infer the channel from the signal values, setting it to the channel with the max signal. If the max values are equal, the channel is set to R (as opposed to G in sesame).
- Parameters:
sample_labels (str | list[str] | None) – the name(s) of the sample(s) to infer the channel for. If None, infer with all samples. Default: None
switch_failed (bool) – if set to True, probes with NA values or whose max values are under a threshold (the 95th percentile of the background signals) will be switched back to their original value. Default: False.
mask_failed (bool) – mask failed probes (same probes as switch_failed). Default: False.
summary_only (bool) – does not replace the sample dataframe, only return the summary (useful for QC). Default: False
- Returns:
the summary of the switched channels
- Return type:
pandas.DataFrame
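The selection rule described above (channel with the max signal, ties resolved to R) can be sketched in a few lines. The probe names and intensities are hypothetical, and the switch_failed and mask_failed handling is deliberately left out.

```python
def infer_channel(max_red: float, max_green: float) -> str:
    """Return 'R' or 'G' for one type I probe; ties are resolved to red."""
    return "R" if max_red >= max_green else "G"


# Hypothetical probes: (annotated channel, max red signal, max green signal)
probes = {
    "cg_a": ("R", 1200.0, 300.0),
    "cg_b": ("R", 150.0, 900.0),  # annotation and signal disagree
    "cg_c": ("G", 500.0, 500.0),  # tie, resolved to R
}

inferred = {pid: infer_channel(r, g) for pid, (_, r, g) in probes.items()}

# Probes whose inferred channel differs from the annotated one.
switched = [pid for pid, (annotated, _, _) in probes.items() if inferred[pid] != annotated]
```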
- static load(filepath: str) Samples
Load a pickled Samples object from filepath
- Parameters:
filepath (str) – path to the file to read
- Returns:
the loaded object
- mask_control_probes(sample_label: str | None = None) None
Shortcut to mask control probes
- Parameters:
sample_label (str | None) – The name of the sample to mask. If None, mask indexes for all samples.
- Returns:
None
- mask_non_cg_probes(sample_label: str | None = None) None
Shortcut to mask non-CpG probes
- Parameters:
sample_label (str | None) – The name of the sample to mask. If None, mask indexes for all samples.
- Returns:
None
- mask_non_unique_probes(sample_label: str | None = None) None
Shortcut to mask non-unique probes on this sample
- Parameters:
sample_label (str | None) – The name of the sample to mask. If None, mask indexes for all samples.
- Returns:
None
- mask_probes_by_names(names_to_mask: str | list[str], sample_label: str | None = None, mask_name: str | None = None) None
Match the names provided in names_to_mask with the probes mask info and mask these probes, adding them to the current mask if there is any.
- mask_quality_probes(sample_label: str | None = None) None
Shortcut to mask quality probes
- Parameters:
sample_label (str | None) – The name of the sample to mask. If None, mask indexes for all samples.
- Returns:
None
- mask_snp_probes(sample_label: str | None = None) None
Shortcut to mask SNP probes
- Parameters:
sample_label (str | None) – The name of the sample to mask. If None, mask indexes for all samples.
- Returns:
None
- mask_xy_probes(sample_label: str | None = None) None
Shortcut to mask probes from the X and Y chromosomes
- Parameters:
sample_label (str | None) – The name of the sample to mask. If None, mask indexes for all samples.
- Returns:
None
- merge_samples_by(by: str, apply_mask=True) None
Merge the bead signal values of different samples by averaging them. Modifies the signal dataframe directly and removes the p-values column, since its values need to be updated. Beta values are averaged so as not to lose the batch correction result if needed. Masks are reset, but masked probe values are ignored if apply_mask is True.
- meth(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of methylated probes, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- property nb_probes: int
Count the number of probes in the signal dataframe
- Returns:
number of probes
- Return type:
int
- property nb_samples: int
Count the number of samples contained in the object
- Returns:
number of samples
- Return type:
int
- noob_background_correction(sample_labels: str | list[str] | None = None, apply_mask: bool = True, use_negative_controls=True, offset=15) None
Subtract the background for a sample.
The background is modelled as a normal distribution and the true signal as an exponential distribution. The Norm-Exp deconvolution is parameterized using Out-Of-Band (oob) probes. Multi-mapping probes are excluded.
- Parameters:
sample_labels (str | list[str] | None) – the name(s) of the sample(s) to correct the background for. If None, correct the background for all samples.
apply_mask (bool) – True removes masked probes, False keeps them. Default: True
use_negative_controls (bool) – if True, the background will be calculated with both negative control and out-of-band probes. Default: True
offset (int | float) – A constant value to add to the corrected signal for padding. Default: 15
- Returns:
None
- oob(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame | None
Get the subset of out-of-band probes (for type I probes only), and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame | None
- oob_green(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of out-of-band green probes (for type I probes only), and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- oob_red(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of out-of-band red probes (for type I probes only), and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- poobah(sample_labels: str | list[str] | None = None, apply_mask: bool = True, use_negative_controls=True, threshold=0.05) None
Detection P-value based on empirical cumulative distribution function (ECDF) of out-of-band signal aka pOOBAH (P-value with out-of-band (OOB) array hybridization). Each sample is handled separately.
Adds two columns to the signal dataframe, ‘p_value’ and ‘poobah_mask’. Probes whose p-value is strictly above the defined threshold are added to the mask.
- Parameters:
sample_labels (str | list[str] | None) – the name(s) of the sample(s) to use for the pOOBAH calculation. If None, use all samples. Default: None
apply_mask (bool) – True removes masked probes from background, False keeps them. Default: True
use_negative_controls (bool) – add negative controls as part of the background. Default: True
threshold (float) – used to output a mask based on the p_values. Default: 0.05
- Returns:
None
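The ECDF-based detection p-value can be illustrated with a minimal sketch. It is a simplification under stated assumptions: a single pooled background list stands in for the out-of-band signal, one intensity per probe stands in for the probe signal, and all names and numbers are hypothetical.

```python
from bisect import bisect_right


def ecdf_p_value(signal: float, sorted_background: list) -> float:
    """1 - ECDF(signal): share of background values strictly above the signal."""
    rank = bisect_right(sorted_background, signal)
    return 1.0 - rank / len(sorted_background)


# Hypothetical out-of-band (background) intensities, sorted for bisect.
background = sorted([40.0, 55.0, 60.0, 70.0, 80.0, 95.0, 110.0, 130.0, 150.0, 200.0])

# Hypothetical probe signals.
signals = {"cg_a": 1800.0, "cg_b": 90.0, "cg_c": 175.0}
threshold = 0.05

p_values = {pid: ecdf_p_value(s, background) for pid, s in signals.items()}

# Probes with a p-value strictly above the threshold are the ones to mask.
masked = [pid for pid, p in p_values.items() if p > threshold]
```

A probe whose signal is well above the whole background distribution gets a p-value of 0 and is kept; probes that are hard to tell apart from the background are masked.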
- remove_probes_suffix(apply_mask=True)
Merge probes that have the same ID but different suffixes (e.g. _BC11, _TC21...) by averaging their signal values. Resets calculated p-values and betas.
- Parameters:
apply_mask (bool) – skip masked probes values when merging samples if True. Default: True
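The merging step can be pictured with the following sketch, which groups probe IDs by the part before the first underscore and averages the values. The function name, the probe IDs and the use of None for masked values are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Optional


def merge_by_probe_prefix(signals: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Average the values of probes sharing the same ID before the first underscore."""
    grouped = defaultdict(list)
    for probe_id, value in signals.items():
        base_id = probe_id.split("_", 1)[0]
        if value is not None:  # masked values (None here) are skipped
            grouped[base_id].append(value)
    return {base_id: mean(values) for base_id, values in grouped.items()}


# Hypothetical signal values for suffixed probe IDs.
signals = {
    "cg00000029_BC11": 1000.0,
    "cg00000029_TC21": 1400.0,
    "cg00000103_BC21": 500.0,
    "cg00000103_TC21": None,  # masked replicate, ignored in the average
}

merged = merge_by_probe_prefix(signals)
```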
- property sample_label_name: str
Return the name of the sample sheet column used as sample labels. By default, sample_name is used when creating the signal dataframe, but it can be changed by using the merge_samples_by() function.
- Returns:
the name of the identifier
- Return type:
str
- property sample_labels: list[str]
Return the names of the samples contained in this object, that also exist in the sample sheet.
- save(filepath: str) None
Save the current Samples object to filepath, as a pickle file
- Parameters:
filepath (str) – path to the file to create
- Returns:
None
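Since save() and load() rely on pickle files, the round trip can be sketched with the standard pickle module. The dictionary below is a hypothetical stand-in for the object state, not the actual content of a Samples object.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the state of an object to persist.
state = {"sample_labels": ["sample_1", "sample_2"], "min_beads": 1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "samples.pkl")

    # Write the object to disk in binary mode.
    with open(path, "wb") as handle:
        pickle.dump(state, handle)

    # Read it back; the restored object is an independent copy.
    with open(path, "rb") as handle:
        restored = pickle.load(handle)
```

Unpickling can execute arbitrary code, so pickle files should only be loaded from trusted sources.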
- scrub_background_correction(sample_label: str | None = None, apply_mask: bool = True) None
Subtract residual background using background median.
This function is meant to be used after noob.
- snp_probes(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get SNP type probes (‘rs’ probes in manifest, but replaced by ‘snp’ when loaded), and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- subset(sample_labels: str | list[str]) None
Keep only the specified samples. Delete the signal information, beta values, sample sheet rows and masks of all the samples that are not in the list. Ignores non-existent sample names
- type1(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of Infinium type I probes, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- type1_green(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of type I green probes, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- type1_red(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of type I red probes, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- type2(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of Infinium type II probes, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame
- unmeth(apply_mask: bool = True, sigdf: DataFrame | None = None) DataFrame
Get the subset of unmethylated probes, and apply the mask if apply_mask is True
- Parameters:
apply_mask (bool) – True removes masked probes, False keeps them. Ignored if sigdf is provided. Default: True
sigdf (pd.DataFrame | None) – signal dataframe to use. Useful to save time applying the mask. Default: None
- Returns:
methylation signal dataframe
- Return type:
pandas.DataFrame