The majority of genome-wide association study (GWAS) risk variants reside in non-coding DNA sequences. Understanding how these sequence modifications lead to transcriptional alterations and cell-to-cell variability can help unravel genotype-phenotype relationships. Here, we describe a computational method, dubbed CAPE, which estimates the likelihood that a genetic variant deactivates enhancers by disrupting the binding of transcription factors (TFs) in a given cellular context. CAPE learns sequence signatures associated with putative enhancers identified in large-scale sequencing experiments (such as ChIP-seq or DNase-seq) and models the change in enhancer signature upon a single nucleotide substitution. By coupling the sequence signatures associated with a genetic variant, CAPE accurately identifies causative cis-regulatory variation, including expression quantitative trait loci (eQTLs) and DNase I sensitivity quantitative trait loci (dsQTLs), in a tissue-specific manner and with precision superior to several currently available methods. The presented method can be trained on any tissue-specific dataset of enhancers and known functional variants and applied to prioritize disease-associated variants in the corresponding tissue.
Description of web server
Prediction of deactivating mutations in enhancer regions
This study aimed to develop a classifier that accurately identifies mutations in enhancers that disrupt the binding of TFs and thereby deactivate the enhancers. Such mutations were defined as candidate killer mutations or deactivating mutations (deSNPs) in our previous study (1). To identify potential causal regulatory SNPs that affect target gene expression or modulate chromatin states with higher accuracy, we developed a new method that identifies CellulAr dePendent dEactivating mutations (CAPE). Our approach learns regulatory sequence signatures from a large-scale profile of regulatory signal tracks associated with enhancers (including DNase I sensitivity and ChIP-seq of histone marks and major TFs), and models the change in enhancer activity caused by a mutation. By integrating two characteristics of a causal regulatory SNP (the variant's disruptive effect on its cognate TF binding and the binding capability of the sequence surrounding the variant), we constructed a set of support vector machine (SVM) models to prioritize genetic variants that deactivate enhancers in a particular cellular context. To support efforts to systematically prioritize regulatory variants and elucidate how these variants contribute to human diseases, we packaged CAPE as a publicly accessible web server (https://cape.dcode.org). The web server takes a list of genetic variants as input; the genomic location and the reference and alternative alleles are required for each variant. Once a tissue or cell line is selected, the server scores the input variants using the eQTL or dsQTL model. Users can also download the stand-alone CAPE package and run it in their local computing environment. The output CAPE score of each variant assesses its probability of deactivating enhancers in a specific tissue.
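As a rough illustration of the two characteristics described above, the following Python sketch parses a variant record and derives a disruption feature and a binding-capability feature from a toy table of k-mer weights. It is a minimal sketch under stated assumptions, not the CAPE implementation: the whitespace-separated `chrom pos ref alt` layout, the names `Variant`, `parse_variant`, `sequence_score` and `variant_features`, and the k-mer weights are all hypothetical, and the actual input format accepted by the web server and the features used by the SVM models may differ.

```python
from typing import Dict, NamedTuple, Tuple

BASES = set("ACGT")


class Variant(NamedTuple):
    chrom: str
    pos: int  # genomic position as given in the input
    ref: str  # reference allele
    alt: str  # alternative allele


def parse_variant(line: str) -> Variant:
    """Parse one 'chrom pos ref alt' record (hypothetical layout)."""
    chrom, pos, ref, alt = line.split()
    ref, alt = ref.upper(), alt.upper()
    # Only single nucleotide substitutions are handled in this sketch.
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single nucleotide substitution: {line!r}")
    return Variant(chrom, int(pos), ref, alt)


def sequence_score(seq: str, kmer_weights: Dict[str, float], k: int) -> float:
    """Sum the weights of all overlapping k-mers; unknown k-mers count as 0."""
    return sum(kmer_weights.get(seq[i:i + k], 0.0)
               for i in range(len(seq) - k + 1))


def variant_features(left: str, variant: Variant, right: str,
                     kmer_weights: Dict[str, float], k: int) -> Tuple[float, float]:
    """Return (disruption, binding) for a variant and its flanking sequence.

    disruption: drop in sequence score when ref is replaced by alt.
    binding:    score of the reference sequence surrounding the variant.
    """
    ref_score = sequence_score(left + variant.ref + right, kmer_weights, k)
    alt_score = sequence_score(left + variant.alt + right, kmer_weights, k)
    return ref_score - alt_score, ref_score


# Toy example: the weights below are invented for illustration only.
toy_weights = {"GATA": 2.0, "TATC": 1.0}
example = parse_variant("chr1 12345 A C")
disruption, binding = variant_features("TG", example, "TAAG", toy_weights, k=4)
```

In a classifier along the lines described above, the two values would form the feature vector for one variant; a large positive disruption combined with a high binding score would mark a variant that removes a strong putative binding site.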
- CAPE achieves higher accuracy than two well-known existing methods, CATO and deltaSVM.
- Unlike DeepSEA and CATO, CAPE can accurately pinpoint functional regulatory variants with regard to cellular context.
- The output score of CAPE for each variant quantifies its likelihood to impact binding of major TFs in the corresponding tissue.
- CAPE is available as a web-server-based application.
- Li, S et al. Human enhancers are fragile and prone to deactivating mutations. Molecular Biology and Evolution (2015).
- Maurano, M et al. Large-scale identification of sequence variants influencing human transcription factor occupancy in vivo. Nature Genetics (2015).
- Lee, D et al. A method to predict the impact of regulatory variants from DNA sequence. Nature Genetics (2015).
- Zhou, J et al. Predicting effects of noncoding variants with deep learning-based sequence model. Nature Methods (2015).