MethylMix: an R package for identifying DNA methylation-driven genes

The paper I am going to explore today introduces MethylMix, an R package designed to identify DNA methylation-driven genes. DNA methylation is one of the most extensively studied processes in biomedicine, since it has been found to be a principal mechanism of gene regulation in many diseases. Although high-throughput methods can produce huge amounts of DNA methylation measurements, there are only a few tools to formally identify hypo- and hypermethylated genes.

This is why Olivier Gevaert from Stanford proposed MethylMix, an algorithm to identify disease-specific hyper- and hypomethylated genes, published online yesterday in the Oxford journal Bioinformatics.

The key idea of this work is that an arbitrary threshold cannot be relied on to determine whether a gene is differentially methylated: the assessment has to be made in comparison to normal tissue. Moreover, the identification of differentially methylated genes must come with a transcriptionally predictive effect, which implies that the methylation is functionally relevant.

MethylMix first calculates a set of possible methylation states for each CpG site that is found to be associated with genes showing differential expression. This set is derived from the clinical samples, using the Bayesian Information Criterion (BIC) to select it. Then, a normal methylation state is defined as the mean DNA methylation level in normal tissue samples. Each methylation state is compared with the normal methylation state to calculate the Differential Methylation value, or DM-value, defined as the difference between the methylation state and the mean DNA methylation in control samples. The output is thus an indication of which genes are both differentially methylated and differentially expressed.
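To make the steps above more concrete, here is a minimal Python sketch of the idea. It is not the package's code. It assumes a plain Gaussian mixture fitted by EM as a stand-in for the statistical model MethylMix actually uses, it works on simulated values for a single CpG site, and all function names (fit_mixture, methylation_states, dm_values) are hypothetical names chosen for illustration. The number of states is chosen by the lowest BIC, and each state is then compared with the mean of the normal samples.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(values, k, iterations=100, min_var=1e-4):
    """Fit a k-component 1D Gaussian mixture with EM (illustrative stand-in).

    Returns (weights, means, variances, log_likelihood).
    """
    ordered = sorted(values)
    n = len(ordered)
    grand_mean = sum(ordered) / n
    grand_var = max(sum((v - grand_mean) ** 2 for v in ordered) / n, min_var)
    # Deterministic start: component means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances (variance is floored).
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            spread = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            variances[j] = max(spread, min_var)

    log_lik = 0.0
    for v in values:
        density = sum(w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, variances))
        log_lik += math.log(density or 1e-300)
    return weights, means, variances, log_lik


def bic(log_lik, k, n):
    """BIC with 3k - 1 free parameters (k means, k variances, k - 1 weights)."""
    return -2.0 * log_lik + (3 * k - 1) * math.log(n)


def methylation_states(values, max_k=3):
    """Return the sorted component means of the mixture with the lowest BIC."""
    best = None
    for k in range(1, max_k + 1):
        _, means, _, log_lik = fit_mixture(values, k)
        score = bic(log_lik, k, len(values))
        if best is None or score < best[0]:
            best = (score, sorted(means))
    return best[1]


def dm_values(states, normal_values):
    """DM-value of each state: state mean minus mean methylation in normal samples."""
    normal_mean = sum(normal_values) / len(normal_values)
    return [state - normal_mean for state in states]


# Simulated methylation values in [0, 1] for one CpG site (made-up data).
rng = random.Random(0)


def simulate(center, count):
    return [min(1.0, max(0.0, rng.gauss(center, 0.05))) for _ in range(count)]


disease = simulate(0.2, 40) + simulate(0.8, 40)  # some patients hypermethylated
normal = simulate(0.2, 20)

states = methylation_states(disease)
dms = dm_values(states, normal)
print("methylation states:", [round(s, 3) for s in states])
print("DM-values:", [round(d, 3) for d in dms])
```

In this toy setting, a state with a DM-value close to zero looks like normal tissue, while a clearly positive DM-value points to hypermethylation in a subgroup of patients. The real package additionally requires methylation to be predictive of gene expression, which this sketch leaves out.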

As mentioned, the algorithm is implemented as an R package, and it is already available from Bioconductor.

NGS and data analysis for epigenetics. A couple of reviews to take stock of the situation.

Epigenetic regulation of DNA is a topic of growing interest. While a lot of evidence suggests that genotype and phenotype are linked by a very complex relationship, genome architecture is the main player in this puzzling game, affecting gene frequencies. Both the 3D arrangement of chromatin and the positioning of nucleosomes along the DNA have been found to affect genomic regulation, and to be crucially important in evolution, adaptation, disease etiology and possible therapeutic approaches. Facing this complexity is a big challenge, and the best weapons we have to tackle it are high-throughput methods and Next-Generation Sequencing. I have been focusing on this over the last few days, and have found a couple of good reviews to understand the state of the art: NGS, in epigenetics, planet Earth, year 2014.

Epigenetics: an updated overview

If you are looking for a good, up-to-date overview of what epigenetics is, and a good summary of the main topics, I can definitely suggest this paper. Authored by Shrutii Sarda and Sridhar Hannenhalli, bioinformaticians at the University of Maryland, the paper summarizes in a very brief but still really clear way, in the authors' words, “the various types of epigenomic data afforded by NGS, and some of the novel discoveries yielded by the epigenomics projects”. In a few words: what epigenomics is, how to do it, and why it is worth a try.

NGS: one acronym, many techniques

The NGS technologies are a set of experimental methods, implemented by different companies in several commercial solutions. A proper discussion of which solution best fits a specific experimental need would be quite long, and very often the choice of a specific machine is determined by what is available in the lab and by the specific needs of the project, and influenced by the personal preference of the researcher. Anyway, here are some papers that give a fairly good overview of the main techniques.

A comparison of NGS systems is the topic of a paper published by Linda Liu at the Beijing Genomics Institute, available as a PDF on atcgeek. After briefly telling the story of nucleic acid sequencing, the authors compare in detail the features of the Roche 454, AB SOLiD, Illumina GA and compact PGM systems.

Talking about epigenomics, we can’t help focusing on ChIP-seq in particular. The most recent and clearest review about ChIP-seq I managed to find was published in Nature in 2012. Authored by Terence Furey, a chromatin structure expert at the University of North Carolina, the article reports the main methods to detect and functionally characterize DNA-bound proteins, starting from ChIP-seq but going beyond it.

The possible applications are countless, ranging from evolutionary biology to biomedicine, and I think it would be quite pointless to spend further words on this. This post aims to give a few tips to those who are approaching massive sequencing and epigenomics for the first time, or to those who need to keep up to date in this field.