3. DMP and DMR
[1]:
from pylluminator.samples import Samples
from pylluminator.visualizations import manhattan_plot_dmr, plot_dmp_heatmap, visualize_gene
from pylluminator.dm import DM
from pylluminator.utils import save_object, load_object
from pylluminator.utils import set_logger
set_logger('WARNING') # set the verbosity level, can be DEBUG, INFO, WARNING, ERROR
3.1. Load pylluminator Samples
We assume that you have already processed the .idat files according to your preferences and saved them. If not, please refer to notebook 1 - Read data and get beta values before going any further.
[2]:
my_samples = Samples.load('preprocessed_samples')
Here, we want to filter out the probes on the X or Y chromosomes.
[3]:
my_samples.mask_xy_probes()
To speed up the demo, we will only calculate DMP and DMR on 10% of the probes
[4]:
ten_pct_probes = int(0.1 * my_samples.nb_probes)
probe_ids = my_samples.probe_ids[:ten_pct_probes]
print(f'Selected {ten_pct_probes:,} first probes')
Selected 93,768 first probes
3.2. Differentially Methylated Probes
The second parameter of get_dmp() is a R-like formula used in the design matrix to describe the statistical model, e.g. ‘~age + sex’. The names must be the column names of the sample sheet provided as third parameter
More info on design matrices and formulas:
[5]:
my_samples.sample_sheet
[5]:
| sample_id | sample_name | sample_type | |
|---|---|---|---|
| 0 | GSM7698462 | LNCAP_500_3 | LNCAP |
| 1 | GSM7698443 | PREC_500_2 | PREC |
| 2 | GSM7698435 | PREC_500_1 | PREC |
| 3 | GSM7698446 | LNCAP_500_2 | LNCAP |
| 4 | GSM7698459 | PREC_500_3 | PREC |
| 5 | GSM7698438 | LNCAP_500_1 | LNCAP |
[6]:
my_dms = DM(my_samples, '~ sample_type', probe_ids=probe_ids)
You can now plot the results, for the 25 most variable probes:
[7]:
plot_dmp_heatmap(my_dms, my_dms.contrasts[0], nb_probes=25)
3.3. Differentially Methylated Regions
[8]:
my_dms.compute_dmr(my_dms.contrasts)
save_object(my_dms, 'dms')
[9]:
my_dms.get_top('DMR', my_dms.contrasts[0])
[9]:
| genes | |||
|---|---|---|---|
| segment_id | sample_type[T.PREC]_p_value | chromosome | |
| 42291 | 6.822830e-66 | 20 | ISM1 |
| 13864 | 7.581671e-63 | 6 | ENSG00000237669;ZNRD1ASP;HLA-J |
| 8774 | 8.426901e-61 | 3 | FOXL2;LINC01391;FOXL2NB |
| 11768 | 6.430889e-58 | 5 | PDE4D;ENSG00000247345 |
| 24684 | 5.611748e-56 | 11 | ADM-DT;SBF2;ADM |
| 13807 | 8.651960e-55 | 6 | OR2I1P |
| 26199 | 9.034690e-53 | 11 | PIWIL4-AS1;AMOTL1 |
| 20378 | 4.427272e-52 | 9 | CLTA;GNE |
| 16618 | 2.642552e-51 | 7 | ENSG00000278334;ENSG00000278020;HOXA11-AS;HOXA... |
| 28614 | 8.382595e-50 | 12 | DNAH10 |
[10]:
manhattan_plot_dmr(my_dms, my_dms.contrasts[0])
3.4. Gene visualization
We can then have a look at a particular gene identified as differentially methylated, for example the first one, ISM1.
[11]:
visualize_gene(my_samples, 'ISM1', figsize=(8, 7))