Introduction

epiCOLOC is implemented as a web-based tool with built-in large-scale, context-dependent epigenomic annotations. The epigenomic profiles are indexed using GIGGLE (https://github.com/ryanlayer/giggle). The web server was developed with Python, jQuery, igv.js, amcharts.js and related JavaScript modules.

epiCOLOC database

current status of version 1

 

 

Biological examples

 

Step1: Input queries

epiCOLOC accepts two genomic formats: BED-like (http://genome.ucsc.edu/FAQ/FAQformat.html#format1) and VCF-like (http://samtools.github.io/hts-specs/VCFv4.2.pdf). Queries, either regions of interest (ROIs) or single nucleotide variants, can be pasted as plain text or uploaded as a file. An uploaded file can be a plain-text BED or VCF file, or one compressed with bgzip (https://github.com/samtools/htslib). Users must enter their query regions or upload a query file (the upload limit is 20 MB), and select the tissues and epigenetic categories of interest before submitting.
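To check a query file locally before uploading it, the two accepted formats can be read roughly as in the sketch below. This is an illustration only, not the server's parser: the function names are hypothetical, and the BED/VCF detection is a simple heuristic (a numeric third column is taken to mean BED). It relies on two facts from the format specifications: BED coordinates are 0-based with an exclusive end, and VCF positions are 1-based.

```python
import gzip
from typing import Iterable, Iterator, List, Tuple

Interval = Tuple[str, int, int]  # (chrom, 0-based start, exclusive end)


def parse_query_lines(lines: Iterable[str]) -> Iterator[Interval]:
    """Yield intervals from BED-like or VCF-like text lines (heuristic)."""
    for line in lines:
        line = line.strip()
        # Skip blank lines, VCF header lines and BED track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            # BED-like: chrom, start (0-based), end (exclusive).
            yield fields[0], int(fields[1]), int(fields[2])
        elif len(fields) >= 2 and fields[1].isdigit():
            # VCF-like: chrom, 1-based POS; keep it as a single-base interval.
            pos = int(fields[1])
            yield fields[0], pos - 1, pos
        else:
            raise ValueError(f"Unrecognized query line: {line!r}")


def read_query_file(path: str) -> List[Interval]:
    """Read a plain-text or bgzip-compressed query file."""
    # bgzip output is gzip-compatible, so the gzip module can read it.
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as text:
        return list(parse_query_lines(text))
```

For example, `list(parse_query_lines(["chr1\t100\t200", "chr2\t500\trs1\tA\tG"]))` yields one BED interval and one single-base interval for the variant.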

epiCOLOC provides two examples.

Click an example above the Input Data box and submit it for a quick start.

 

Step2: Select Tissues & Categories

Users must select one or more tissues before submitting the job.

 

Step3: Options

We provide a few basic options: whole genome size, flank length on both sides, window size centered on ROIs, and maximum region length.

Optionally, users can tailor one or more of these options to suit their own needs.

Note: The last three options manipulate genomic intervals. If more than one is selected, the operations are applied in order from top to bottom.
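The top-to-bottom order of the three interval options can be pictured with the sketch below. The exact behavior of each option on the server is not documented here, so every rule in the code is an assumption: flank extends both ends, window replaces the region with a fixed-size window around its midpoint, and regions longer than the maximum length are dropped. The function and parameter names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end)


def apply_interval_options(
    intervals: Iterable[Interval],
    flank: Optional[int] = None,
    window: Optional[int] = None,
    max_length: Optional[int] = None,
) -> List[Interval]:
    """Apply the selected options to each interval, in top-to-bottom order."""
    result = []
    for chrom, start, end in intervals:
        if flank is not None:
            # Assumption: extend the region by `flank` bases on both sides.
            start, end = max(0, start - flank), end + flank
        if window is not None:
            # Assumption: replace the region by a window centered on its midpoint.
            mid = (start + end) // 2
            start = max(0, mid - window // 2)
            end = start + window
        if max_length is not None and end - start > max_length:
            # Assumption: regions longer than the maximum are discarded.
            continue
        result.append((chrom, start, end))
    return result
```

Because the steps run in a fixed order, a flank applied first can push a region over the maximum length, so the order matters for the final set of intervals.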

 

Job submission

Once the job is submitted, epiCOLOC performs colocalization analysis between the user's input and the selected epiCOLOC datasets. During execution, epiCOLOC displays a progress bar to track the job status.

 

When the submitted job finishes, epiCOLOC redirects to the results page. A job can also be retrieved by searching for its job id on the epiCOLOC home page, or through the email notification.

Alternatively, go to http://mulinlab.org/epicoloc/results/<job_id> to see the results directly.
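When tracking several jobs, the results address can be built from the job id following the URL pattern given above. This is a small convenience sketch; the function name is hypothetical and the job id in the usage note is made up.

```python
from urllib.parse import quote

# URL pattern taken from the help text: .../results/<job_id>
RESULTS_BASE = "http://mulinlab.org/epicoloc/results/"


def results_url(job_id: str) -> str:
    """Return the results page address for a given job id."""
    # Percent-encode the id so unexpected characters cannot break the URL.
    return RESULTS_BASE + quote(job_id.strip(), safe="")
```

For a hypothetical job id `abc123`, `results_url("abc123")` gives the address to open in a browser.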

 

Results

Note: Only significant results are displayed on the results page. To obtain all results, click the download button at the top of the results page.

Prioritization table

The table shows the statistical metrics of colocalization, including the combo score, Fisher's exact P-value, the odds ratio and the number of overlaps. Users can sort the table by any column.

Apart from the statistical metrics, the table also lists the source dataset of each functional profile file.

Click the add icon or the Expand All button to expand the table and show more details.
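To make the table's metrics concrete, the sketch below computes them from a 2x2 contingency table. It rests on assumptions that are not stated in this help page: the four counts are overlaps, query-only, database-only and neither, and the combo score is taken to be the product of -log10(P) and log2(odds ratio), as in the GIGGLE scoring scheme. The function names are hypothetical, and the pure-Python test is only practical for small counts.

```python
from math import comb, log2, log10


def fisher_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum all tables that are at most as likely as the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def colocalization_metrics(overlap: int, query_only: int, db_only: int, neither: int) -> dict:
    """Return P-value, odds ratio and combo score for one profile."""
    if min(overlap, query_only, db_only, neither) == 0:
        raise ValueError("this sketch does not handle zero cells")
    p_value = fisher_two_sided(overlap, query_only, db_only, neither)
    odds_ratio = (overlap * neither) / (query_only * db_only)
    # Assumed definition: positive for enrichment, negative for depletion.
    combo_score = -log10(p_value) * log2(odds_ratio)
    return {"p_value": p_value, "odds_ratio": odds_ratio, "combo_score": combo_score}
```

Under this definition the sign of the combo score follows the odds ratio, which matches the enriched (> 0) and depleted (< 0) split used on the results page.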

 

 

Pies

epiCOLOC provides two tissue-wise pie charts that depict the number of significant colocalization results.

One shows enriched results across tissues (combo score > 0 and P <= 0.05), and the other shows depleted results across tissues (combo score < 0 and P <= 0.05).

Users can click the slice for a tissue in either pie chart to see detailed sub-tissue results.
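The counts behind the two pie charts can be reproduced from downloaded results with a few lines. This is a minimal sketch using the thresholds stated above; the record layout (dictionaries with `tissue`, `combo_score` and `p_value` keys) is an assumption, not the format of the download file.

```python
from collections import Counter
from typing import Iterable, Mapping, Tuple


def tissue_counts(records: Iterable[Mapping], alpha: float = 0.05) -> Tuple[Counter, Counter]:
    """Count significant enriched and depleted results per tissue."""
    enriched, depleted = Counter(), Counter()
    for rec in records:
        if rec["p_value"] > alpha:
            continue  # not significant: shown in neither pie chart
        if rec["combo_score"] > 0:
            enriched[rec["tissue"]] += 1
        elif rec["combo_score"] < 0:
            depleted[rec["tissue"]] += 1
    return enriched, depleted
```

Each counter maps a tissue name to its slice size in the corresponding pie chart.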

Barplots

The last part of the epiCOLOC results consists of three barplots, with scroll, zoom and search functions for easier exploration.

The barplots comprise a GIGGLE combo score barplot, an odds ratio barplot and a log10-transformed P-value barplot. Bars are sorted by GIGGLE combo score.

 

 

Hover the mouse cursor over any bar to see its details in the tooltip.

 

Due to web page capacity limits, the tooltip only shows the assay id of the file with the largest combo score. When the epiCOLOC results contain more than 2k significant records, the barplots only show the results with the top 2k absolute combo scores.
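The 2k cap can be mimicked on a downloaded result set as sketched below. The function name and the record layout are hypothetical, the limit of 2000 is read from the "2k" stated above, and the descending sort order is an assumption, since the page only says that bars are sorted by combo score.

```python
import heapq
from typing import Iterable, List, Mapping


def bars_to_plot(records: Iterable[Mapping], limit: int = 2000) -> List[Mapping]:
    """Keep the records with the largest absolute combo scores, then sort them."""
    # Select by absolute value so strong depletion is kept alongside enrichment.
    top = heapq.nlargest(limit, records, key=lambda r: abs(r["combo_score"]))
    # Assumption: bars are drawn from the highest to the lowest combo score.
    return sorted(top, key=lambda r: r["combo_score"], reverse=True)
```

Selecting by absolute value means that a strongly depleted profile is retained in preference to a weakly enriched one.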

 

Clicking a label under the tissue-wise barplots opens a popup window with cell type-wise bars, which depict the enrichment pattern of the top 20 enriched cell types with the highest GIGGLE combo scores for the selected label.

 

 

Users can filter the results using different combinations of empirical combo score, P-value and odds ratio thresholds. The maximum and minimum values of the combo score and odds ratio cutoffs are dynamically adjusted to fit the actual data.
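The same kind of filtering can be applied offline to downloaded results. The sketch below is one possible reading of the three thresholds (combo score and odds ratio as lower bounds, P-value as an upper bound); the direction of each cutoff, the function names and the record keys are all assumptions. `cutoff_bounds` mirrors the idea of deriving the slider range from the data itself.

```python
from typing import Iterable, List, Mapping, Optional, Tuple


def cutoff_bounds(records: Iterable[Mapping], field: str) -> Tuple[float, float]:
    """Return the (min, max) of a field, usable as the range of a cutoff slider."""
    values = [rec[field] for rec in records]
    return min(values), max(values)


def filter_results(
    records: Iterable[Mapping],
    min_combo: Optional[float] = None,
    max_p: Optional[float] = None,
    min_odds: Optional[float] = None,
) -> List[Mapping]:
    """Keep the records that pass every threshold that was set."""
    kept = []
    for rec in records:
        if min_combo is not None and rec["combo_score"] < min_combo:
            continue
        if max_p is not None and rec["p_value"] > max_p:
            continue
        if min_odds is not None and rec["odds_ratio"] < min_odds:
            continue
        kept.append(rec)
    return kept
```

Thresholds left as `None` are ignored, so any combination of the three cutoffs can be applied.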

 

 

IGV plots

 

Data source information

 

Name | URL | PMID | Version
ENCODE | https://www.encodeproject.org/ | 22955616 | 20190621
FANTOM5 | http://fantom.gsc.riken.jp/5/datafiles/latest/extra/Enhancers/ | 25723102 | 20150112
Cistrome | http://cistrome.org/db/#/bdown | 27789702 | 20190108
DeepBlue(Blueprint) | https://deepblue.mpi-inf.mpg.de/dashboard.php#ajax/deepblue_view_grid.php | 28334349 | 20180924
ChIP Atlas | https://chip-atlas.org | 30413482 | 20181101
ReMap | http://pedagogix-tagc.univ-mrs.fr/remap/index.php?page=download | 29126285 | v1.2
BOCA | https://bendlj01.u.hpc.mssm.edu/multireg/resources/boca_peaks.zip | 29945882 | 20180626
TCGA | https://gdc.cancer.gov/about-data/publications/ATACseq-AWG | 30361341 | 20181026
HACER | http://bioinfo.vanderbilt.edu/AE/HACER/download.html | 30247654 | 20190108
ChromHMM RoadMap Segmentation | http://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/coreMarks/jointModel/final/all.mnemonics.bedFiles.tgz | 21441907 | 20110323
Segway | https://noble.gs.washington.edu/proj/encyclopedia/interpreted/ | 31462275 | 20190828
scATAC-seq | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE65360 | 26083756 | 20150617
scATAC-seq | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE96769 | 29706549 | 20160207
scATAC-seq | https://figshare.com/articles/Human_CD8_T_cells_uATACseq_/7005701 | 30194434 | 20180426

 

Collected epigenomic profiles in epiCOLOC

http://mulinlab.tmu.edu.cn/epicoloc/metadata.txt