Data Example
File Format
Note
- For local usage: all formats, except rsID, require data to be sorted first by sequence name and then by leftmost coordinate.
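The sorting requirement above can be checked before running a local query. The following is a minimal sketch and not part of the tool itself: the function name `is_sorted`, the whitespace-separated columns, and the default column positions are illustrative assumptions. It reads "sorted by sequence name" as "all records of one sequence are contiguous", and does not enforce any particular ordering of the sequence names themselves.

```python
from typing import Iterable


def is_sorted(lines: Iterable[str], chrom_col: int = 0, pos_col: int = 1) -> bool:
    """Return True if records are grouped by sequence name and, within each
    sequence, ordered by non-decreasing leftmost coordinate.

    Column positions are 0-based and hypothetical defaults for illustration.
    """
    finished = set()   # sequence names whose block has already ended
    current = None     # sequence name of the block being read
    last_pos = -1
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue   # skip blank, meta and header lines
        fields = line.split()
        chrom, pos = fields[chrom_col], int(fields[pos_col])
        if chrom != current:
            if chrom in finished:
                return False   # sequence name reappears after another one
            if current is not None:
                finished.add(current)
            current, last_pos = chrom, pos
        elif pos < last_pos:
            return False       # coordinate goes backwards within a sequence
        else:
            last_pos = pos
    return True
```

For example, `is_sorted(["1 64649", "1 81125", "2 100"])` is true, while `is_sorted(["1 69092", "1 13116"])` is false because the coordinate decreases within sequence 1.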
rsID Format
The first column of rsID format must be the
Local Usage:
rs11191416 4.67E-15
rs4918072 9.63E-10
rs61848342 6.38E-10
rs4752700 8.02E-11
rs1887318 1.73E-17
...
VCF Format
VCF format must have
Local Usage:
##VCF meta lines...
#VCF header line
1 64649 rs181431124 A C 100 PASS .
1 81125 rs560365426 T C 100 PASS .
1 81712 rs558839829 C T 100 PASS .
1 88230 rs543088928 T C 100 PASS .
...
VCF-Like Format
Note
- Meta information and header line are optional for VCF-like format.
- The first five columns are the same as the VCF format.
Local Usage:
1 64649 rs181431124 A C col6 col7
1 81125 rs560365426 T C col6 col7
1 81712 rs558839829 C T col6 col7
1 88230 rs543088928 T C col6 col7
1 99687 rs139153227 C T col6 col7
...
BED-Like Format
The first three columns of BED-like format must be the
Local Usage:
chr1 10141 10237
chr1 235503 235929
chr1 237727 237953
chr1 565508 565728
chr1 567451 567955
...
BED-Like Allele Format
The first five columns of BED-like allele format must be the
Local Usage:
1 10000 10001 T A
1 10000 10001 T C
1 10000 10001 T G
1 10001 10002 A C
1 10002 10003 A C
...
Coord-Only Format
The first two columns of Coord-Only format must be the
Local Usage:
1 69091
1 69092
1 13116
1 1645399
1 3706538
1 3706816
...
Coord-Allele Format
The first two columns of Coord-Allele format must be the
Local Usage:
1 10000 10001 T A
1 10000 10001 T C
1 10000 10001 T G
1 10001 10002 A C
1 10002 10003 A C
...
TAB Format
The TAB format should be used with attributes:
- Required: c, b, e
- Optional: ref, alt, 0, ci, sep
Local Usage:
1 13482 G C
1 48204 G A
1 52152 ATAAT A
1 54712 T TTTTC
1 57226 G A
1 62863 CACTT C
1 63336 C T
1 68082 T C
1 73269 T A
1 76856 T A
...
Annotation Database File
A compressed
Annotation Database File must be indexed with
VCF Format Example(1000G_p3.sort.vcf.gz)
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Phase 3 of the 1000 Genomes Project is the completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
##VCF meta lines...
#VCF header line
1 67948 rs556268856 T C 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=25933;EAS_AF=0.003;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
1 67955 rs576545302 T A 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=25869;EAS_AF=0;AMR_AF=0;AFR_AF=0.0023;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
1 68082 rs367789441 T C 100 PASS AC=169;AF=0.033746;AN=5008;NS=2504;DP=25952;EAS_AF=0.001;AMR_AF=0.0331;AFR_AF=0.003;EUR_AF=0.0984;SAS_AF=0.0429;AA=.|||;VT=SNP
1 68118 rs562240137 T C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=25464;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
1 68247 rs527989887 G A 100 PASS AC=4;AF=0.000798722;AN=5008;NS=2504;DP=22292;EAS_AF=0;AMR_AF=0;AFR_AF=0.003;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
1 68337 rs540193047 C G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=19265;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
...
VCF Format Example(cosmic.sort.vcf.gz)
COSMIC, the Catalogue Of Somatic Mutations In Cancer, is the world's largest and most comprehensive resource for exploring the impact of somatic mutations in human cancer. COSMIC collects somatic mutation data from a variety of public sources into one standardized repository and makes them easily explorable in a variety of graphical, tabulated and downloadable forms.
##VCF meta lines...
#VCF header line
1 69224 COSV58737130 A C . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM3677745;CDS=c.134A>C;AA=p.D45A;CNT=1
1 69230 COSV58737076 A C . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM3677746;CDS=c.140A>C;AA=p.H47P;CNT=1
1 69236 COSV58737142 A C . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM3677747;CDS=c.146A>C;AA=p.H49P;CNT=1
1 69270 COSV58736820 A G . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM5424184;SNP;CDS=c.180A>G;AA=p.S60=;CNT=1
1 69345 COSV58736780 C A . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM911918;CDS=c.255C>A;AA=p.I85=;CNT=1
1 69359 COSV58736910 G T . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM6401900;CDS=c.269G>T;AA=p.C90F;CNT=2
1 69486 COSV58736947 C T . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM6734473;CDS=c.396C>T;AA=p.N132=;CNT=1
1 69511 COSV58736924 A G . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM4144171;SNP;CDS=c.421A>G;AA=p.T141A;CNT=1
1 69517 COSV58737059 G A . . GENE=OR4F5;STRAND=+;LEGACY_ID=COSM3492078;CDS=c.427G>A;AA=p.G143R;CNT=1
...
BED-like Format Example(roadmap.sort.bed.gz)
The NIH Roadmap Epigenomics Mapping Consortium was launched with the goal of producing a public resource of human epigenomic data to catalyze basic biology and disease-oriented research. The Consortium leverages experimental pipelines built around next-generation sequencing technologies to map DNA methylation, histone modifications, chromatin accessibility and small RNA transcripts in stem cells and primary ex vivo tissues selected to represent the normal counterparts of tissues and organ systems frequently involved in human disease.
#chrom chromStart chromEnd name score strand signalValue pValue qValue peak cellMark cellID cellName
1 9959 10511 Rank_24984 68 . 2.62240 6.83866 4.25719 287 E062-H3K9me3 E062 Primary mononuclear cells from peripheral blood
1 9975 10710 Rank_1550 475 . 12.86108 47.54574 44.03423 370 E080-H3K9me3 E080 Fetal Adrenal Gland
1 10003 10563 Rank_1363 488 . 13.42901 48.85612 44.82438 248 E084-H3K9me3 E084 Fetal Intestine Large
1 10011 10631 Rank_3377 217 . 8.95720 21.74096 18.31963 239 E092-H3K9me3 E092 Fetal Stomach
1 10012 10397 Rank_15590 193 . 7.50000 19.36886 16.46100 157 E061-H3K36me3 E061 Foreskin Melanocyte Primary Cells skin03
1 10012 10425 Rank_15065 71 . 3.84862 7.13214 4.01348 237 E012-H3K9me3 E012 hESC Derived CD56+ Ectoderm Cultured Cells
...
Coord-Allele Format Example(dbscSNV.sort.tab.gz)
dbscSNV includes all potential human SNVs within splicing consensus regions (−3 to +8 at the 5’ splice site and −12 to +2 at the 3’ splice site), i.e. scSNVs, related functional annotations and two ensemble prediction scores for predicting their potential of altering splicing.
1 860326 A C 1 924946 n y upstream SAMD11 . . UTR5 ENSG00000187634 . . 0.00764482882370293 0.03
1 860326 A G 1 924946 n y upstream SAMD11 . . UTR5 ENSG00000187634 . . 0.00764482882370293 0.032
1 860326 A T 1 924946 n y upstream SAMD11 . . UTR5 ENSG00000187634 . . 0.00692000194525311 0.03
1 860327 A C 1 924947 n y upstream SAMD11 . . UTR5 ENSG00000187634 . . 0.00430955476585136 0.04
1 860327 A G 1 924947 n y upstream SAMD11 . . UTR5 ENSG00000187634 . . 0.00430955476585136 0.04
1 860327 A T 1 924947 n y upstream SAMD11 . . UTR5 ENSG00000187634 . . 0.00430955476585136 0.042
...
Result File
Count Result
Query: q2.sort.bed Database: 1000G_p3.sort.vcf.gz, roadmap.sort.bed.gz
Output file: q2.sort.bed.count.gz
Chr Begin End 1000g roadmap Total
chr1 10141 10237 2 171 173
chr1 235503 235929 4 48 52
chr1 237727 237953 4 51 55
chr1 565508 565728 29 94 123
chr1 567451 567955 51 132 183
chr1 569701 570067 34 154 188
chr1 570129 570274 7 37 44
...
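In the sample above, the Total column equals the sum of the per-database counts on each row (for example 2 + 171 = 173). A minimal sketch for reading such a table and checking that relationship follows. The function names are hypothetical, and the layout (three coordinate columns, then one column per database tag, then Total, all whitespace-separated) is assumed from the sample rather than from a format specification.

```python
from typing import Dict, List, Tuple

CountRow = Tuple[str, int, int, Dict[str, int], int]


def read_counts(text: str) -> List[CountRow]:
    """Parse a count table whose first line is the header
    'Chr Begin End <tag1> <tag2> ... Total'."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, data = rows[0], rows[1:]
    db_tags = header[3:-1]  # columns between End and Total
    parsed = []
    for row in data:
        counts = dict(zip(db_tags, (int(v) for v in row[3:-1])))
        parsed.append((row[0], int(row[1]), int(row[2]), counts, int(row[-1])))
    return parsed


def totals_consistent(text: str) -> bool:
    """Return True if Total equals the sum of the database counts on every row."""
    return all(
        sum(counts.values()) == total
        for _, _, _, counts, total in read_counts(text)
    )
```

The text passed in should be the decompressed content of the count file without any trailing "..." line, which is only a documentation ellipsis.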
QueryRegion Result
Query: 1:2298288-2298289 Database: 1000G_p3.sort.vcf.gz
Output1 in console:
1000G_p3.sort.vcf.gz 1 2298289 rs182863424 T C 100 PASS AC=4;AF=0.000798722;AN=5008;NS=2504;DP=17505;EAS_AF=0.003;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=T|||;VT=SNP
Query: 1:959100-959200 Database: 1000G_p3.sort.vcf.gz, cosmic.sort.vcf.gz
Output2 in console:
1000g 1 959104 rs538473605 G A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=17741;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=G|||;VT=SNP
1000g 1 959128 rs548777990 C T 100 PASS AC=2;AF=0.000399361;AN=5008;NS=2504;DP=17304;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0.001;AA=C|||;VT=SNP
1000g 1 959137 rs568781463 G C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=17086;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0.001;AA=G|||;VT=SNP
1000g 1 959155 rs3845291 G A 100 PASS AC=2404;AF=0.480032;AN=5008;NS=2504;DP=16907;EAS_AF=0.7361;AMR_AF=0.6182;AFR_AF=0.1309;EUR_AF=0.5398;SAS_AF=0.5286;AA=G|||;VT=SNP
1000g 1 959169 rs3845292 G C 100 PASS AC=2830;AF=0.565096;AN=5008;NS=2504;DP=16335;EAS_AF=0.8264;AMR_AF=0.6398;AFR_AF=0.3154;EUR_AF=0.5497;SAS_AF=0.5961;AA=c|||;VT=SNP
1000g 1 959193 rs188044457 G A 100 PASS AC=6;AF=0.00119808;AN=5008;NS=2504;DP=15909;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0.0051;AA=G|||;VT=SNP
cosmic 1 959109 COSV65072933 C T . . GENE=AGRN;STRAND=+;LEGACY_ID=COSN6051067;CDS=c.463+1267C>T;AA=p.?;CNT=1
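The region strings used in these queries follow a `chrom:begin-end` pattern. As an illustration only, the sketch below parses such a string and tests whether a position falls inside it. The names `Region`, `parse_region` and `contains` are hypothetical, and treating both ends as inclusive is an assumption: the sample outputs are consistent with an inclusive end (the query 1:2298288-2298289 returns a record at 2298289), but the exact semantics are not stated here.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    begin: int
    end: int


_REGION_RE = re.compile(r"^([^:]+):(\d+)-(\d+)$")


def parse_region(text: str) -> Region:
    """Parse 'chrom:begin-end' into a Region; raise ValueError if malformed."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a chrom:begin-end region: {text!r}")
    chrom, begin, end = match.group(1), int(match.group(2)), int(match.group(3))
    if begin > end:
        raise ValueError(f"begin is greater than end in {text!r}")
    return Region(chrom, begin, end)


def contains(region: Region, chrom: str, pos: int) -> bool:
    """True if pos lies in the region (both ends inclusive, by assumption)."""
    return chrom == region.chrom and region.begin <= pos <= region.end
```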
Intersect Result("OVERLAP" file)
An
Description of OVERLAP file format:
- Comment lines start with '@';
- Query lines start with '#';
- Database lines start with the database tag.
Note
- Comment lines can be removed by setting "-RC true".
- Comment lines are required for using the AnnotationIntersectFile program. Please refer to the AnnotationIntersectFile section for details.
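The three line types described above ('@' comment lines, '#' query lines, and database lines that start with the database tag) can be read with a small parser. This is a minimal sketch and not the tool's own reader: the function name `parse_overlap` is hypothetical, and it assumes that the tag or `#query` marker is separated from the rest of the record by whitespace. Comment lines are kept as a list of key/value pairs because keys such as `db_path` repeat once per database.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, Dict[str, List[str]]]  # (query line, {db tag: [hit lines]})


def parse_overlap(lines: Iterable[str]) -> Tuple[List[Tuple[str, str]], List[Record]]:
    """Split OVERLAP lines into comment metadata and per-query hit groups."""
    meta: List[Tuple[str, str]] = []
    records: List[Record] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("@"):
            # comment line such as '@db_tag=1000g' or '@end'
            key, _, value = line[1:].partition("=")
            meta.append((key, value))
        elif line.startswith("#"):
            # query line: '#query' marker followed by the original query record
            parts = line.split(None, 1)
            records.append((parts[1] if len(parts) > 1 else "", {}))
        else:
            # database line: database tag followed by the matching record
            if not records:
                raise ValueError("database line found before any query line")
            parts = line.split(None, 1)
            tag, hit = parts[0], (parts[1] if len(parts) > 1 else "")
            records[-1][1].setdefault(tag, []).append(hit)
    return meta, records
```

A query with no overlapping record simply yields an empty dictionary of hits. To read a compressed file, the lines could be supplied from `gzip.open(path, "rt")`.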
Output file: q1.sort.vcf.overlap.gz
@query_file=/test_data/q1.sort.vcf
@query_format=2,1,2,2,0,##,4,5,true
@header=CHROM POS ID REF ALT QUAL FILTER INFO
@db_path=/test_data/1000G_p3.sort.vcf.gz
@db_index_type=VARNOTE
@db_tag=1000g
@db_path=/test_data/cosmic.sort.vcf.gz
@db_index_type=TBI
@db_tag=cosmic
@out_file=/test_data/q1.sort.vcf.overlap.gz
@end
#query 1 81125 rs560365426 T C 100 PASS .
1000g 1 81125 rs560365426 T C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=22536;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
#query 1 81712 rs558839829 C T 100 PASS .
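1000g 1 81712 rs558839829 C T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=20171;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP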
...
#query 1 914333 rs13302979 C G 100 PASS .
1000g 1 914333 rs13302979 C G 100 PASS AC=2786;AF=0.55631;AN=5008;NS=2504;DP=12459;EAS_AF=0.8056;AMR_AF=0.6542;AFR_AF=0.2436;EUR_AF=0.6034;SAS_AF=0.6043;AA=G|||;VT=SNP
cosmic 1 914333 COSV58020681 C G . . GENE=PERM1;STRAND=-;LEGACY_ID=COSM4591185;SNP;CDS=c.1795G>C;AA=p.E599Q;CNT=18
cosmic 1 914333 COSV58020681 C G . . GENE=PERM1_ENST00000341290;STRAND=-;LEGACY_ID=COSM4591185;SNP;CDS=c.1735G>C;AA=p.E579Q;CNT=18
#query 1 923978 rs70949537 A AG 100 PASS .
1000g 1 923978 rs70949537 A AG 100 PASS AC=4568;AF=0.912141;AN=5008;NS=2504;DP=18544;EAS_AF=0.9554;AMR_AF=0.9424;AFR_AF=0.7761;EUR_AF=0.9573;SAS_AF=0.9836;AA=|||unknown(HR);VT=INDEL
...
Output file with option
...
#query 1 64649 rs181431124 A C 100 PASS .
#query 1 81125 rs560365426 T C 100 PASS .
1000g 1 81125 rs560365426 T C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=22536;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
#query 1 81712 rs558839829 C T 100 PASS .
1000g 1 81712 rs558839829 C T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=20171;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
...
Output file: q2.sort.bed.overlap.gz
@query_file=/test_data/q2.sort.bed
@query_format=65536,1,2,3,0,##,-1,-1,false
@header=col1 col2 col3
@db_path=/test_data/roadmap.sort.bed.gz
@db_index_type=VARNOTE
@db_tag=roadmap.sort.bed.gz
@out_file=/test_data/q2.sort.bed.overlap.gz
@end
#query chr1 10141 10237
roadmap.sort.bed.gz 1 9959 10511 Rank_24984 68 . 2.62240 6.83866 4.25719 287 E062-H3K9me3 E062 Primary mononuclear cells from peripheral blood
roadmap.sort.bed.gz 1 9975 10710 Rank_1550 475 . 12.86108 47.54574 44.03423 370 E080-H3K9me3 E080 Fetal Adrenal Gland
roadmap.sort.bed.gz 1 10003 10563 Rank_1363 488 . 13.42901 48.85612 44.82438 248 E084-H3K9me3 E084 Fetal Intestine Large
roadmap.sort.bed.gz 1 10011 10631 Rank_3377 217 . 8.95720 21.74096 18.31963 239 E092-H3K9me3 E092 Fetal Stomach
roadmap.sort.bed.gz 1 10012 10397 Rank_15590 193 . 7.50000 19.36886 16.46100 157 E061-H3K36me3 E061 Foreskin Melanocyte Primary Cells skin03
roadmap.sort.bed.gz 1 10012 10425 Rank_15065 71 . 3.84862 7.13214 4.01348 237 E012-H3K9me3 E012 hESC Derived CD56+ Ectoderm Cultured Cells
roadmap.sort.bed.gz 1 10012 10643 Rank_6221 143 . 7.06203 14.32418 10.81042 249 E093-H3K9me3 E093 Fetal Thymus
roadmap.sort.bed.gz 1 10013 10542 Rank_10363 165 . 6.79631 16.50425 13.62928 243 E016-H3K9me3 E016 HUES64 Cells
...
#query chr1 235503 235929
roadmap.sort.bed.gz 1 235476 235948 Rank_67025 61 . 3.83294 6.17368 4.21646 262 E118-H3K4me1 E118 HepG2 Hepatocellular Carcinoma Cell Line
roadmap.sort.bed.gz 1 235502 235748 Rank_76537 93 . 4.95012 9.30910 7.14876 151 E034-H3K27me3 E034 Primary T cells from peripheral blood
roadmap.sort.bed.gz 1 235538 235953 Rank_78927 47 . 2.99401 4.70182 2.87583 135 E123-H3K4me2 E123 K562 Leukemia Cells
roadmap.sort.bed.gz 1 235538 236066 Rank_36164 97 . 4.30883 9.79599 7.66409 129 E123-H3K27ac E123 K562 Leukemia Cells
roadmap.sort.bed.gz 1 235541 235947 Rank_185476 35 . 2.60000 3.56270 1.77043 198 E114-H3K27me3 E114 A549 EtOH 0.02pct Lung Carcinoma Cell Line
roadmap.sort.bed.gz 1 235545 235761 Rank_84115 57 . 4.18234 5.75779 4.17927 191 E030-H3K4me1 E030 Primary neutrophils from peripheral blood
...
Output file: q1.twomode.overlap.gz
@query_file=/test_data/q1.sort.vcf
@query_format=2,1,2,2,0,##,4,5,true
@header=CHROM POS ID REF ALT QUAL FILTER INFO
@db_path=/test_data/1000G_p3.sort.vcf.gz
@db_index_type=VARNOTE
@db_tag=1000g
@db_path=/test_data/roadmap.sort.bed.gz
@db_index_type=VARNOTE
@db_tag=roadmap
@out_file=/test_data/q1.twomode.overlap.gz
@end
#query 1 81125 rs560365426 T C 100 PASS .
1000g 1 81125 rs560365426 T C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=22536;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
#query 1 81712 rs558839829 C T 100 PASS .
1000g 1 81712 rs558839829 C T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=20171;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
#query 1 88230 rs543088928 T C 100 PASS .
1000g 1 88230 rs543088928 T C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=18579;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP
roadmap 1 88072 88328 Rank_137612 41 . 3.51627 4.16325 2.13235 117 E107-H3K27me3 E107 Skeletal Muscle Male
roadmap 1 88103 88478 Rank_26335 76 . 4.25000 7.66459 4.62783 188 E117-H3K27me3 E117 HeLa-S3 Cervical Carcinoma Cell Line
roadmap 1 88106 88400 Rank_14127 90 . 6.46950 9.08008 6.56416 154 E070-H3K27me3 E070 Brain Germinal Matrix
roadmap 1 88126 88414 Rank_14959 103 . 5.72283 10.34884 7.39796 170 E108-H3K27me3 E108 Skeletal Muscle Female
...
Intersection with remote DB
Query: q3.sort.tab Database: http://202.113.53.226/VarNoteDB/VarNoteDB_AF_gnomAD_Genome.vcf.gz
Output file: q3.sort.tab.overlap.gz
@query_file=/test_data/q3.sort.tab
@query_format=0,1,2,2,0,##,3,4,false
@header=col1 col2 col3 col4
@db_path=http://202.113.53.226/VarNoteDB/VarNoteDB_AF_gnomAD_Genome.vcf.gz
@db_index_type=VARNOTE
@db_tag=gnomAD
@out_file=/test_data/q3.sort.tab.overlap.gz
@end
#query 1 13482 G C
gnomAD 1 13482 rs537951473 G C 624.47 RF;AC0 AC=0;AF=0.00000e+00;AN=29724;BaseQRankSum=-1.74200e+00;ClippingRankSum=-3.87000e-01;DP=1059428;FS=4.58900e+00;InbreedingCoeff=-1.20000e-03;MQ=2.99300e+01;MQRankSum=-1.70000e-01;QD=4.40000e-01;ReadPosRankSum=6.37000e-01;SOR=1.39200e+00;VQSLOD=-5.04700e+01;VQSR_culprit=MQ;GQ_HIST_ALT=1|0|1|1|0|0|1|0|1|1|2|1|1|1|0|0|3|0|1|3;DP_HIST_ALT=0|0|0|0|0|0|0|0|1|2|0|0|1|4|2|2|0|1|3|0;AB_HIST_ALT=0|0|11|7|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0;GQ_HIST_ALL=31|144|91|140|130|49|66|30|16|14|13|6|16|4|16|7|36|11|44|14593;DP_HIST_ALL=250|310|107|26|31|35|1005|2734|2043|1549|1445|1280|1150|917|748|510|384|265|202|142;AB_HIST_ALL=0|0|11|7|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0;AC_Male=0;AC_Female=0;AN_Male=16338;AN_Female=13386;AF_Male=0.00000e+00;AF_Female=0.00000e+00;GC_Male=8169,0,0;GC_Female=6693,0,0;GC_raw=15439,18,0;AC_raw=18;AN_raw=30914;GC=14862,0,0;AF_raw=5.82260e-04;Hom_AFR=0;Hom_AMR=0;Hom_ASJ=0;Hom_EAS=0;Hom_FIN=0;Hom_NFE=0;Hom_OTH=0;Hom=0;Hom_raw=0;AC_AFR=0;AC_AMR=0;AC_ASJ=0;AC_EAS=0;AC_FIN=0;AC_NFE=0;AC_OTH=0;AN_AFR=8662;AN_AMR=798;AN_ASJ=240;AN_EAS=1618;AN_FIN=3494;AN_NFE=13990;AN_OTH=922;AF_AFR=0.00000e+00;AF_AMR=0.00000e+00;AF_ASJ=0.00000e+00;AF_EAS=0.00000e+00;AF_FIN=0.00000e+00;AF_NFE=0.00000e+00;AF_OTH=0.00000e+00;POPMAX=.;AC_POPMAX=.;AN_POPMAX=.;AF_POPMAX=.;DP_MEDIAN=72;DREF_MEDIAN=1.24822e-06;GQ_MEDIAN=60;AB_MEDIAN=1.36395e-01;AS_RF=9.76716e-03;AS_FilterStatus=RF|AC0;AS_RF_NEGATIVE_TRAIN=1;CSQ=C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000423562|unprocessed_pseudogene||||||||||rs537951473|1|881|-1||SNV|1|HGNC|38034||||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||,C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000438504|unprocessed_pseudogene||||||||||rs537951473|1|881|-1||SNV|1|HGNC|38034|YES|||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0|||||||||
|||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000450305|transcribed_unprocessed_pseudogene|6/6||ENST00000450305.2:n.444G>C||444|||||rs537951473|1||1||SNV|1|HGNC|37102||||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000456328|processed_transcript|3/3||ENST00000456328.2:n.730G>C||730|||||rs537951473|1||1||SNV|1|HGNC|37102|YES|||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||,C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000488147|unprocessed_pseudogene||||||||||rs537951473|1|922|-1||SNV|1|HGNC|38034||||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000515242|transcribed_unprocessed_pseudogene|3/3||ENST00000515242.2:n.723G>C||723|||||rs537951473|1||1||SNV|1|HGNC|37102||||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000518655|transcribed_unprocessed_pseudogene|3/4||ENST00000518655.2:n.561G>C||561|||||rs537951473|1||1||SNV|1|HGNC|37102||||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||,C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000538476|unprocessed_pseudogene||||||||||rs537951473|1|929|-1||SNV|1|HGNC|38034||||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||,C|downstream_gene_varia
nt|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000541675|unprocessed_pseudogene||||||||||rs537951473|1|881|-1||SNV|1|HGNC|38034||||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||,C|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00001576075|CTCF_binding_site||||||||||rs537951473|1||||SNV|1||||||||||||||||C:0.0004|C:0|C:0.0015|C:0|C:0|C:0|C:0|||C:0|C:4.500e-05|C:0.003195|C:0.0001704|C:0|C:0|C:0|C:0||||||||||||;GC_AFR=4331,0,0;GC_AMR=399,0,0;GC_ASJ=120,0,0;GC_EAS=809,0,0;GC_FIN=1747,0,0;GC_NFE=6995,0,0;GC_OTH=461,0,0;Hom_Male=0;Hom_Female=0
#query 1 48204 G A
gnomAD 1 48204 rs548809068 G A,T 2087.50 PASS AC=9,2;AF=3.16589e-04,7.03532e-05;AN=28428;BaseQRankSum=7.67000e-01;ClippingRankSum=1.70000e-02;DP=526368;FS=1.31000e+00;InbreedingCoeff=3.64000e-02;MQ=2.71500e+01;MQRankSum=-2.23000e-01;QD=3.16000e+00;ReadPosRankSum=5.61000e-01;SOR=8.28000e-01;VQSLOD=-4.89500e+01;VQSR_culprit=MQ;GQ_HIST_ALT=0|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|0|1|1|10,0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0;DP_HIST_ALT=0|0|0|0|0|1|1|3|2|0|2|2|0|0|1|0|0|0|1|0,0|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0;AB_HIST_ALT=0|0|2|2|4|3|1|0|0|0|0|1|0|0|0|0|0|0|0|0,0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0;GQ_HIST_ALL=98|54|55|191|382|280|594|623|322|656|714|356|1381|267|869|320|1060|100|935|5870;DP_HIST_ALL=181|702|1383|1550|2125|1843|3002|3745|548|37|6|2|1|0|1|0|0|0|1|0;AB_HIST_ALL=0|0|2|2|4|3|1|0|0|0|0|1|0|0|0|0|0|0|0|0;AC_Male=3,2;AC_Female=6,0;AN_Male=15750;AN_Female=12678;AF_Male=1.90476e-04,1.26984e-04;AF_Female=4.73261e-04,0.00000e+00;GC_Male=7871,3,0,0,0,1;GC_Female=6333,6,0,0,0,0;GC_raw=15113,13,0,0,0,1;AC_raw=13,2;AN_raw=30254;GC=14204,9,0,0,0,1;AF_raw=4.29695e-04,6.61070e-05;Hom_AFR=0,0;Hom_AMR=0,1;Hom_ASJ=0,0;Hom_EAS=0,0;Hom_FIN=0,0;Hom_NFE=0,0;Hom_OTH=0,0;Hom=0,1;Hom_raw=0,1;AC_AFR=9,0;AC_AMR=0,2;AC_ASJ=0,0;AC_EAS=0,0;AC_FIN=0,0;AC_NFE=0,0;AC_OTH=0,0;AN_AFR=8212;AN_AMR=726;AN_ASJ=250;AN_EAS=1588;AN_FIN=3076;AN_NFE=13686;AN_OTH=890;AF_AFR=1.09596e-03,0.00000e+00;AF_AMR=0.00000e+00,2.75482e-03;AF_ASJ=0.00000e+00,0.00000e+00;AF_EAS=0.00000e+00,0.00000e+00;AF_FIN=0.00000e+00,0.00000e+00;AF_NFE=0.00000e+00,0.00000e+00;AF_OTH=0.00000e+00,0.00000e+00;POPMAX=AFR,AMR;AC_POPMAX=9,2;AN_POPMAX=8212,726;AF_POPMAX=1.09596e-03,2.75482e-03;DP_MEDIAN=42,11;DREF_MEDIAN=1.58489e-16,7.93930e-32;GQ_MEDIAN=99,33;AB_MEDIAN=2.22222e-01,4.90833e-01;AS_RF=1.07496e-01,4.06749e-01;AS_FilterStatus=RF,PASS;AS_RF_NEGATIVE_TRAIN=1,2;CSQ=A|upstream_gene_variant|MODIFIER|OR4G4P|ENSG00000268020|Transcript|ENST00000594647|unprocessed_pseudogene||||||||||rs548809068|1|4845|1||SNV||HGNC|14822|||||||
|||||||A:0.0004||A:0.0008|A:0|A:0.001|A:0|A:0||||||||||||||||||||||,T|upstream_gene_variant|MODIFIER|OR4G4P|ENSG00000268020|Transcript|ENST00000594647|unprocessed_pseudogene||||||||||rs548809068|2|4845|1||SNV||HGNC|14822||||||||||||||A:0.0004||A:0.0008|A:0|A:0.001|A:0|A:0||||||||||||||||||||||,A|upstream_gene_variant|MODIFIER|OR4G4P|ENSG00000268020|Transcript|ENST00000606857|unprocessed_pseudogene||||||||||rs548809068|1|4269|1||SNV||HGNC|14822|YES|||||||||||||A:0.0004||A:0.0008|A:0|A:0.001|A:0|A:0||||||||||||||||||||||,T|upstream_gene_variant|MODIFIER|OR4G4P|ENSG00000268020|Transcript|ENST00000606857|unprocessed_pseudogene||||||||||rs548809068|2|4269|1||SNV||HGNC|14822|YES|||||||||||||A:0.0004||A:0.0008|A:0|A:0.001|A:0|A:0||||||||||||||||||||||;GC_AFR=4097,9,0,0,0,0;GC_AMR=362,0,0,0,0,1;GC_ASJ=125,0,0,0,0,0;GC_EAS=794,0,0,0,0,0;GC_FIN=1538,0,0,0,0,0;GC_NFE=6843,0,0,0,0,0;GC_OTH=445,0,0,0,0,0;Hom_Male=0,1;Hom_Female=0,0
#query 1 52152 ATAAT A
gnomAD 1 52152 rs568235219 ATAAT A 7526.08 PASS AC=14;AF=5.34392e-04;AN=26198;BaseQRankSum=3.51000e-01;ClippingRankSum=1.96000e-01;DP=472975;FS=1.11430e+01;InbreedingCoeff=-4.00000e-04;MQ=5.10900e+01;MQRankSum=-5.23000e-01;QD=1.23400e+01;ReadPosRankSum=7.69000e-01;SOR=8.23000e-01;VQSLOD=-4.27800e-01;VQSR_culprit=MQRankSum;VQSR_NEGATIVE_TRAIN_SITE;GQ_HIST_ALT=0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|0|1|0|14;DP_HIST_ALT=0|0|0|4|1|2|4|1|2|1|0|0|0|0|0|0|1|0|0|0;AB_HIST_ALT=0|0|1|1|0|3|2|0|0|1|2|2|0|0|1|1|1|1|0|0;GQ_HIST_ALL=199|284|157|419|523|341|774|765|412|816|809|414|1312|254|844|293|1014|97|845|4278;DP_HIST_ALL=516|1175|1724|1894|2097|1709|1741|2319|990|337|164|77|45|27|17|5|7|1|1|3;AB_HIST_ALL=0|0|1|1|0|3|2|0|0|1|2|2|0|0|1|1|1|1|0|0;AC_Male=12;AC_Female=2;AN_Male=14590;AN_Female=11608;AF_Male=8.22481e-04;AF_Female=1.72295e-04;GC_Male=7283,12,0;GC_Female=5802,2,0;GC_raw=14834,16,0;AC_raw=16;AN_raw=29700;GC=13085,14,0;AF_raw=5.38721e-04;Hom_AFR=0;Hom_AMR=0;Hom_ASJ=0;Hom_EAS=0;Hom_FIN=0;Hom_NFE=0;Hom_OTH=0;Hom=0;Hom_raw=0;AC_AFR=0;AC_AMR=0;AC_ASJ=0;AC_EAS=0;AC_FIN=0;AC_NFE=14;AC_OTH=0;AN_AFR=7434;AN_AMR=610;AN_ASJ=240;AN_EAS=1582;AN_FIN=2490;AN_NFE=13048;AN_OTH=794;AF_AFR=0.00000e+00;AF_AMR=0.00000e+00;AF_ASJ=0.00000e+00;AF_EAS=0.00000e+00;AF_FIN=0.00000e+00;AF_NFE=1.07296e-03;AF_OTH=0.00000e+00;POPMAX=NFE;AC_POPMAX=14;AN_POPMAX=13048;AF_POPMAX=1.07296e-03;DP_MEDIAN=31;DREF_MEDIAN=6.25594e-46;GQ_MEDIAN=99;AB_MEDIAN=4.96838e-01;AS_RF=9.72455e-01;AS_FilterStatus=PASS;CSQ=-|upstream_gene_variant|MODIFIER|OR4G4P|ENSG00000268020|Transcript|ENST00000594647|unprocessed_pseudogene||||||||||rs568235219|1|893|1||deletion|1|HGNC|14822||||||||||||||-:0.0006||-:0.0008|-:0|-:0|-:0|-:0.002||||||||||||||||||||||,-|upstream_gene_variant|MODIFIER|OR4G4P|ENSG00000268020|Transcript|ENST00000606857|unprocessed_pseudogene||||||||||rs568235219|1|317|1||deletion|1|HGNC|14822|YES|||||||||||||-:0.0006||-:0.0008|-:0|-:0|-:0|-:0.002||||||||||||||||||||||;GC_AFR=3717,0,0;GC_AMR=305,0,0;GC_ASJ=120,0,0;GC
_EAS=791,0,0;GC_FIN=1245,0,0;GC_NFE=6510,14,0;GC_OTH=397,0,0;Hom_Male=0;Hom_Female=0
...
Annotation Configuration File
Format of annotation configuration file:
- Comment lines start with '#';
- Database tag lines start with '@';
- Configuration fields include: fields, info_fields, cols, out_names, has_header, header_path, comment_indicator, vcf_info_path
Description of fields:
- fields: Specify the fields to extract. Valid fields for VCF are CHROM, BEGIN, REF, ALT, QUAL, FILTER, INFO. For BED and TAB formats, use field names taken from the header or from an external header file.
- info_fields: Used for VCF format only; should be applied together with fields=[INFO].
- cols: BED and TAB files without a header can use cols to specify the columns to extract. The default column name is "col"+n.
- out_names: Rename extracted fields.
- header_path: Used for BED and TAB formats to specify the header path.
- comment_indicator: Used for BED and TAB formats to specify the comment line indicator.
- vcf_info_path: Used with BED format (when the annotation output format is VCF) to specify the INFO field.
all_dbs.annoc
@1000g
fields=[INFO]
info_fields=[AC, AF]
out_names=[AC:1000g_AC, AF:1000g_AF]
@gnomAD
fields=[INFO]
info_fields=[AC, AF]
out_names=[AC:gnomAD_AC, AF:gnomAD_AF]
@cosmic
fields=[INFO]
info_fields=[GENE, STRAND]
out_names=[GENE:cosmic_GENE, STRAND:cosmic_STRAND]
@dbscSNV
header_path=/path/to/config/dbscSNV.header
fields=[RefSeq_gene, rf_score]
out_names=[RefSeq_gene:dbscSNV_refGene, rf_score:dbscSNV_rf_score]
dbscSNV.header
chr pos ref alt hg38_chr hg38_pos RefSeq? Ensembl? RefSeq_region RefSeq_gene RefSeq_functional_consequence RefSeq_id_c_change_p_change Ensembl_region Ensembl_gene Ensembl_functional_consequence Ensembl_id_c_change_p_change ada_score rf_score
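To make the structure of the configuration file above concrete, here is a minimal parsing sketch. It is not the tool's own parser: the function name `parse_annoc` is hypothetical, and the handling is inferred only from the sample (a line starting with '@' opens a database section, values in square brackets are comma-separated lists, and `out_names` entries have the form `old:new`).

```python
from typing import Dict


def parse_annoc(text: str) -> Dict[str, Dict[str, object]]:
    """Parse an annotation configuration into {db_tag: {field: value}}."""
    config: Dict[str, Dict[str, object]] = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment line
        if line.startswith("@"):
            section = config.setdefault(line[1:].strip(), {})
            continue
        if section is None:
            raise ValueError(f"setting outside a database section: {line!r}")
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {line!r}")
        key, value = key.strip(), value.strip()
        if value.startswith("[") and value.endswith("]"):
            items = [v.strip() for v in value[1:-1].split(",") if v.strip()]
            if key == "out_names":
                # 'AC:1000g_AC' maps original field AC to output name 1000g_AC
                section[key] = dict(item.split(":", 1) for item in items)
            else:
                section[key] = items
        else:
            section[key] = value  # plain value such as header_path
    return config
```

Applied to the `@1000g` section of all_dbs.annoc, this yields `fields` as `["INFO"]`, `info_fields` as `["AC", "AF"]`, and `out_names` as a mapping from `AC` to `1000g_AC` and from `AF` to `1000g_AF`.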
Annotation Result("ANNO" file)
Annotation without configuration file
Query: q1.sort.vcf Database: 1000G_p3.sort.vcf.gz, cosmic.sort.vcf.gz
Output file: q1.sort.vcf.allfields.anno.gz
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=1000g_CHROM,Number=.,Type=String,Description="">
##INFO=<ID=1000g_POS,Number=.,Type=String,Description="">
##INFO=<ID=1000g_REF,Number=.,Type=String,Description="">
...
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele. Format: AA|REF|ALT|IndelType. AA: Ancestral allele, REF:Reference Allele, ALT:Alternate Allele, IndelType:Type of Indel (REF, ALT and IndelType are only defined for indels)">
##INFO=<ID=AC,Number=A,Type=Integer,Description="Total number of alternate alleles in called genotypes">
##INFO=<ID=AF,Number=A,Type=Float,Description="Estimated allele frequency in the range (0,1)">
...
##INFO=<ID=cosmic_ALT,Number=.,Type=String,Description="">
##INFO=<ID=cosmic_CHROM,Number=.,Type=String,Description="">
##INFO=<ID=cosmic_FILTER,Number=.,Type=String,Description="">
##INFO=<ID=cosmic_ID,Number=.,Type=String,Description="">
##INFO=<ID=cosmic_POS,Number=.,Type=String,Description="">
...
##contig=<ID=1,assembly=b37,length=249250621>
##fileDate=20150218
##reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
##source=1000GenomesPhase3Pipeline
#CHROM POS ID REF ALT QUAL FILTER INFO
...
1 81125 rs560365426 T C 100 PASS .;1000g_CHROM=1;1000g_POS=81125;1000g_ID=rs560365426;1000g_REF=T;1000g_ALT=C;1000g_QUAL=100;1000g_FILTER=PASS;1000g_AC=1;1000g_AF=0.000199681;1000g_NS=2504;1000g_AN=5008;1000g_EAS_AF=0.001;1000g_EUR_AF=0;1000g_AFR_AF=0;1000g_AMR_AF=0;1000g_SAS_AF=0;1000g_DP=22536;1000g_AA=.|||;1000g_VT=SNP
1 81712 rs558839829 C T 100 PASS .;1000g_CHROM=1;1000g_POS=81712;1000g_ID=rs558839829;1000g_REF=C;1000g_ALT=T;1000g_QUAL=100;1000g_FILTER=PASS;1000g_AC=1;1000g_AF=0.000199681;1000g_NS=2504;1000g_AN=5008;1000g_EAS_AF=0;1000g_EUR_AF=0;1000g_AFR_AF=0.0008;1000g_AMR_AF=0;1000g_SAS_AF=0;1000g_DP=20171;1000g_AA=.|||;1000g_VT=SNP
1 88230 rs543088928 T C 100 PASS .;1000g_CHROM=1;1000g_POS=88230;1000g_ID=rs543088928;1000g_REF=T;1000g_ALT=C;1000g_QUAL=100;1000g_FILTER=PASS;1000g_AC=1;1000g_AF=0.000199681;1000g_NS=2504;1000g_AN=5008;1000g_EAS_AF=0;1000g_EUR_AF=0;1000g_AFR_AF=0.0008;1000g_AMR_AF=0;1000g_SAS_AF=0;1000g_DP=18579;1000g_AA=.|||;1000g_VT=SNP
1 99687 rs139153227 C T 100 PASS .;1000g_CHROM=1;1000g_POS=99687;1000g_ID=rs139153227;1000g_REF=C;1000g_ALT=T;1000g_QUAL=100;1000g_FILTER=PASS;1000g_AC=161;1000g_AF=0.0321486;1000g_NS=2504;1000g_AN=5008;1000g_EAS_AF=0.001;1000g_EUR_AF=0.0895;1000g_AFR_AF=0.0045;1000g_AMR_AF=0.0331;1000g_SAS_AF=0.0419;1000g_DP=17422;1000g_AA=.|||;1000g_VT=SNP
...
Annotation with configuration file: config/all_dbs.annoc
Query: q1.sort.vcf Database: 1000G_p3.sort.vcf.gz, cosmic.sort.vcf.gz
Output file: q1.sort.vcf.anno.gz
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=1000g_AC,Number=A,Type=Integer,Description="Total number of alternate alleles in called genotypes">
##INFO=<ID=1000g_AF,Number=A,Type=Float,Description="Estimated allele frequency in the range (0,1)">
##INFO=<ID=cosmic_GENE,Number=1,Type=String,Description="Gene name">
##INFO=<ID=cosmic_STRAND,Number=1,Type=String,Description="Gene strand">
##contig=<ID=1,assembly=b37,length=249250621>
##fileDate=20150218
##reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
##source=1000GenomesPhase3Pipeline
#CHROM POS ID REF ALT QUAL FILTER INFO
...
1 895903 rs544271560 G A 100 PASS .;1000g_AC=7;1000g_AF=0.00139776
1 899747 rs368028255 G A 100 PASS .;1000g_AC=1;1000g_AF=0.000199681
1 901328 rs574254395 A G 100 PASS .;1000g_AC=4;1000g_AF=0.000798722
1 902018 rs567034360 A G 100 PASS .;1000g_AC=1;1000g_AF=0.000199681
1 905307 rs528578943 G A 100 PASS .;1000g_AC=18;1000g_AF=0.00359425
1 908062 rs527519589 C T 100 PASS .;1000g_AC=9;1000g_AF=0.00179712
1 914333 rs13302979 C G 100 PASS .;1000g_AC=2786;1000g_AF=0.55631;cosmic_GENE=PERM1,PERM1_ENST00000341290;cosmic_STRAND=-,-
1 923978 rs70949537 A AG 100 PASS .;1000g_AC=4568;1000g_AF=0.912141
1 924628 rs552249487 C T 100 PASS .;1000g_AC=1;1000g_AF=0.000199681
1 929190 rs9777939 A G 100 PASS .;1000g_AC=4688;1000g_AF=0.936102
...
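In the annotated records above, each extracted value appears in the INFO column under its output name, which here carries the database tag as a prefix (`1000g_AC`, `cosmic_GENE`). The following sketch groups such an INFO string by tag. The function name `group_info_by_tag` is hypothetical, the list of tags must be supplied by the caller, and the prefix convention only holds when `out_names` was configured that way, as in all_dbs.annoc.

```python
from typing import Dict, Sequence, Union

Value = Union[str, bool]


def group_info_by_tag(info: str, tags: Sequence[str]) -> Dict[str, Dict[str, Value]]:
    """Group 'tag_KEY=value' INFO entries by database tag.

    Entries that match no tag are collected under the empty string key.
    Flag entries without '=' are stored as True.
    """
    grouped: Dict[str, Dict[str, Value]] = {tag: {} for tag in tags}
    grouped[""] = {}
    for entry in info.split(";"):
        if entry in ("", "."):
            continue  # '.' is the placeholder left from an empty query INFO
        key, sep, value = entry.partition("=")
        val: Value = value if sep else True
        for tag in tags:
            prefix = tag + "_"
            if key.startswith(prefix):
                grouped[tag][key[len(prefix):]] = val
                break
        else:
            grouped[""][key] = val
    return grouped
```

Values are kept as strings. When several database records match one query, as for rs13302979 above, the value holds a comma-separated list that can be split by the caller.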
Output file: q1.sort.bed.anno.gz
CHROM BEGIN END REF ALT QUAL FILTER 1000g_AC 1000g_AF
1 81124 81125 T C 100 PASS 1 0.000199681
1 81711 81712 C T 100 PASS 1 0.000199681
1 88229 88230 T C 100 PASS 1 0.000199681
1 99686 99687 C T 100 PASS 161 0.0321486
1 254262 254263 C T 100 PASS 1 0.000199681
1 534168 534169 G A 100 PASS 6 0.00119808
1 565078 565079 A G 100 PASS 2 0.000399361
1 570093 570094 G A 100 PASS 78 0.0155751
1 715845 715846 G A 100 PASS 40 0.00798722
1 722602 722603 T C 100 PASS 89 0.0177716
...
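Comparing this BED output with the VCF records of the same variants shows the usual coordinate shift between the two formats: the VCF position 81125 becomes BEGIN 81124 and END 81125. A small sketch of that conversion follows. The function name is hypothetical, and the rule used for multi-base REF alleles (END = BEGIN + length of REF) is an assumption, since the sample only contains single-base records.

```python
from typing import Tuple


def vcf_to_bed_interval(pos: int, ref: str) -> Tuple[int, int]:
    """Convert a 1-based VCF position to a 0-based, half-open BED interval.

    For a single-base REF this gives (pos - 1, pos), matching the sample.
    The end for longer REF alleles is an assumption, not taken from the sample.
    """
    if pos < 1:
        raise ValueError("VCF positions are 1-based and must be >= 1")
    begin = pos - 1
    return begin, begin + max(len(ref), 1)
```

For instance, `vcf_to_bed_interval(99687, "C")` gives the interval 99686 to 99687, which matches the fourth data row of the sample.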
Remote database
Query: q1.sort.vcf Database: 1000G_p3.sort.vcf.gz, VarNoteDB_AF_gnomAD_Genome.vcf.gz
Output file: q1.sort.vcf.remote.anno.gz
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=1000g_AC,Number=A,Type=Integer,Description="Total number of alternate alleles in called genotypes">
##INFO=<ID=1000g_AF,Number=A,Type=Float,Description="Estimated allele frequency in the range (0,1)">
##INFO=<ID=gnomAD_AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
##INFO=<ID=gnomAD_AF,Number=A,Type=Float,Description="Allele Frequency among genotypes, for each ALT allele, in the same order as listed">
##contig=<ID=1,assembly=b37,length=249250621>
##fileDate=20150218
##reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
##source=1000GenomesPhase3Pipeline
#CHROM POS ID REF ALT QUAL FILTER INFO
1 64649 rs181431124 A C 100 PASS .;gnomAD_AC=584;gnomAD_AF=2.05402e-02
1 81125 rs560365426 T C 100 PASS .;1000g_AC=1;1000g_AF=0.000199681
1 81712 rs558839829 C T 100 PASS .;1000g_AC=1;1000g_AF=0.000199681;gnomAD_AC=14;gnomAD_AF=5.39915e-04
1 88230 rs543088928 T C 100 PASS .;1000g_AC=1;1000g_AF=0.000199681
1 99687 rs139153227 C T 100 PASS .;1000g_AC=161;1000g_AF=0.0321486;gnomAD_AC=1434;gnomAD_AF=6.64689e-02
1 254263 rs558650540 C T 100 PASS .;1000g_AC=1;1000g_AF=0.000199681
1 534169 rs59089120 G A 100 PASS .;1000g_AC=6;1000g_AF=0.00119808;gnomAD_AC=37;gnomAD_AF=1.24899e-03
...
Run program with configuration file
Configuration file with
# configuration file with all possible arguments
# each database section should start with [db]
query_file=./q1.sort.vcf
# format options of query
query_format=vcf
chrom=1
begin=2
end=2
ref=4
alt=5
zero_based=false
comment_indicator=##
header_indicator=#
has_header=true
# output options
out_file=./q1.sort.vcf.overlap.gz
is_loj=false
is_zip=true
# other options
thread=4
is_log=true
use_jdk_inflater=false
[db]
db_path=./1000G_p3.sort.vcf.gz
db_index_type=VarNote
db_tag=1000g
db_mode=1
[db]
db_path=./cosmic.sort.vcf.gz
db_index_type=TBI
db_tag=cosmic
db_mode=1
Configuration file with
# configuration file with all required arguments
query_file=./q1.sort.vcf
# other options
thread=4
[db]
db_path=./1000G_p3.sort.vcf.gz
db_tag=1000g
db_mode=1
[db]
db_path=./cosmic.sort.vcf.gz
db_index_type=TBI
db_tag=cosmic
db_mode=1
Configuration file with all possible arguments for
# configuration file with all possible arguments
# each database section should start with [db]
query_file=./q1.sort.vcf
# format options of query
query_format=vcf
chrom=1
begin=2
end=2
ref=4
alt=5
zero_based=false
comment_indicator=##
header_indicator=#
has_header=true
# annotation options
anno_config=./config/all_dbs.annoc
force_overlap=false
out_format=VCF
# output options
out_file=./q1.sort.vcf.overlap.gz
is_loj=false
is_zip=true
# other options
thread=4
is_log=true
use_jdk_inflater=false
[db]
db_path=./1000G_p3.sort.vcf.gz
db_index_type=VarNote
db_tag=1000g
db_mode=1
[db]
db_path=./cosmic.sort.vcf.gz
db_index_type=TBI
db_tag=cosmic
db_mode=1
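The configuration files above consist of top-level `key=value` options followed by one `[db]` block per database. Because the `[db]` header repeats, Python's standard `configparser` rejects such a file in its default strict mode, so a small hand-written reader is sketched here instead. The function name `parse_run_config` is hypothetical, and the rules (lines starting with '#' are comments, every option after a `[db]` line belongs to that database) are inferred from the samples only.

```python
from typing import Dict, List, Tuple


def parse_run_config(text: str) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Return (global options, list of per-database option dictionaries)."""
    options: Dict[str, str] = {}
    databases: List[Dict[str, str]] = []
    target = options  # options before the first [db] line are global
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment line
        if line == "[db]":
            target = {}
            databases.append(target)
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {line!r}")
        target[key.strip()] = value.strip()
    return options, databases
```

Values such as `comment_indicator=##` are read correctly because only lines that begin with '#' are treated as comments. All values are returned as strings; converting `thread` to an integer or `is_zip` to a boolean is left to the caller.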