A research team led by investigators at the Children’s Hospital of Philadelphia (CHOP) and New Jersey Institute of Technology (NJIT) has developed a machine learning algorithm that predicts sites of DNA methylation, a process that can change the activity of DNA without changing its underlying sequence. The algorithm could identify disease-causing mechanisms that conventional screening methods would otherwise miss.
Findings from the new study were published recently in Nature Machine Intelligence in an article titled “Elucidation of DNA methylation on N6-adenine with deep learning.”
DNA methylation is involved in many key cellular processes and is an important component of gene expression. Errors in methylation have likewise been linked to a variety of human diseases. While genomic sequencing tools are effective at pinpointing polymorphisms that may cause disease, those same methods cannot capture the effects of methylation because the sequences of the affected genes still look the same. In particular, considerable effort has gone into studying DNA methylation on N6-adenine (6mA) in eukaryotic cells, which include human cells, but although genomic data are available, the role of this form of methylation in these cells remains elusive.
“Previously, methods that had been developed to identify these methylation sites in the genome were very conservative and could only look at certain nucleotide lengths at a given time, so a large number of methylation sites were missed,” explained senior study investigator Hakon Hakonarson, MD, PhD, director of the Center for Applied Genomics (CAG) at CHOP. “We needed to develop a better way of identifying and predicting methylation sites with a tool that could identify these motifs throughout the genome that may have a robust functional impact and are potentially disease-causing.”
To address this issue, CAG and its partners at NJIT turned to deep learning. Zhi Wei, PhD, a professor of computer science at NJIT and a senior co-author of the study, worked with Hakonarson and his team to develop a deep learning algorithm that predicts where these methylation sites occur, which in turn helps researchers determine the effect they might have on nearby genes.
“To exploit existing 6mA genomic data and address this challenge, here we develop a deep-learning-based algorithm for predicting potential DNA 6mA sites de novo from sequence at single-nucleotide resolution, with application to three representative model organisms, Arabidopsis thaliana, Drosophila melanogaster, and Escherichia coli,” the authors wrote. “Extensive experiments demonstrate the accuracy of our algorithm and its superior performance compared with conventional k-mer-based approaches. Furthermore, our saliency maps-based context analysis protocol reveals interesting cis-regulatory patterns around the 6mA sites that are missed by conventional motif analysis. Our proposed analytical tools and findings will help to elucidate the regulatory mechanisms of 6mA and benefit the in-depth exploration of their functional effects.”[…]
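The contrast the authors draw between k-mer-based approaches and prediction "from sequence at single-nucleotide resolution" can be illustrated with a small sketch. This is not the authors' code or model. The function names, the window size, and the toy sequence are all hypothetical, and only the forward strand is considered. It shows two common ways of turning DNA into model input: counting fixed-length k-mers, which discards position, and one-hot encoding a window centered on each candidate adenine, which keeps every base in order, as sequence-based deep learning models typically require.

```python
from collections import Counter

BASES = "ACGT"  # column order of the one-hot encoding (an arbitrary choice)


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq; positional information is lost."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def candidate_sites(seq):
    """Return the index of every adenine on the given strand."""
    return [i for i, base in enumerate(seq) if base == "A"]


def one_hot_window(seq, center, flank):
    """One-hot encode `flank` bases on each side of `center`.

    Positions that fall outside the sequence are encoded as all zeros.
    """
    rows = []
    for pos in range(center - flank, center + flank + 1):
        base = seq[pos] if 0 <= pos < len(seq) else None
        rows.append([1 if base == b else 0 for b in BASES])
    return rows


if __name__ == "__main__":
    seq = "GATCAGGT"  # toy sequence, for illustration only
    print(kmer_counts(seq, 2))
    for site in candidate_sites(seq):
        # Each window would be one input example for a classifier.
        print(site, one_hot_window(seq, site, flank=2))
```

In a real pipeline the windows would be much wider and would be passed to a trained network. The sketch covers only the input representation, not the prediction step or the saliency analysis described in the paper.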