Calling
Extraction of methylation information for each cytosine including
subsequent filtering.
BAT_calling
BAT_calling extracts the methylation information at all cytosines
from the bisulfite read alignments produced by BAT_mapping. It
requires only the reference genome in FastA format and the sorted read
alignments in BAM or (gzip'ed) SAM format. Note that methylation
calling uses haarz of the segemehl suite, so read alignments given as
BAM are converted to gzip'ed SAM format and indexed. The position-wise
methylation information is reported in a novel methylation VCF format
(gzip'ed). Therein, the INFO fields give the cytosine strand (CS) and
its sequence context (CC, e.g. CC=CG). In addition, the FORMAT fields
contain the methylation mapping coverage (MDP), the detailed
nucleotide composition at this position (MDP3), and the estimated
methylation rate (MR). The methylation rates are estimated as
#C/(#C+#T), where #C and #T are the numbers of read alignments with a
cytosine nucleotide (= unconverted, methylated) and a thymine
nucleotide (= converted, unmethylated) at this position, respectively.
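As a minimal illustration of the rate formula #C/(#C+#T), the following Python sketch computes a methylation rate from two read counts. The function name and the choice to return None at positions without any C or T reads are assumptions made for this example; they do not describe how BAT_calling or haarz handle such positions internally.

```python
def methylation_rate(n_c, n_t):
    """Return #C / (#C + #T), or None when no C or T reads cover the position."""
    if n_c < 0 or n_t < 0:
        raise ValueError("read counts must be non-negative")
    total = n_c + n_t
    if total == 0:
        # Assumption for this sketch: the rate is undefined without informative reads.
        return None
    return n_c / total


# 15 unconverted (C) and 5 converted (T) reads at one position
print(methylation_rate(15, 5))  # prints 0.75
```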
Basic usage
BAT_calling -d <file> -q <file>
Output files
File | Description |
input.sam.gz | Gzip'ed and indexed SAM file containing all read alignments, if not already present. |
input.vcf.gz | VCF file containing the cytosine methylation information used for further analyses. |
prefix.calling.log | Log file. |
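To show how the INFO and FORMAT fields of the methylation VCF might be read, here is a small Python sketch that splits one data line according to the general VCF column layout (INFO in column 8, FORMAT in column 9, samples from column 10 on). The example record is hypothetical: its coordinates, the CS value "+", and the numbers are invented for illustration, and MDP3 is left out because its exact encoding is not described here.

```python
def parse_methylation_record(line):
    """Split one tab-separated VCF data line into INFO and per-sample FORMAT values."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos = cols[0], int(cols[1])

    # INFO: semicolon-separated key=value pairs (keys without '=' are flags).
    info = {}
    for item in cols[7].split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True

    # FORMAT: colon-separated keys; each sample column lists values in the same order.
    keys = cols[8].split(":")
    samples = [dict(zip(keys, sample.split(":"))) for sample in cols[9:]]
    return chrom, pos, info, samples


# Hypothetical record, not taken from real BAT_calling output.
example = "chr1\t10469\t.\tC\t.\t.\t.\tCS=+;CC=CG\tMDP:MR\t20:0.75"
chrom, pos, info, samples = parse_methylation_record(example)
print(info["CC"], samples[0]["MDP"], samples[0]["MR"])
```

All values are returned as strings; converting MDP to int and MR to float is left to the caller.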
Option | Description |
-d | Filename of reference genome FastA. |
-q | Filename of input BAM or (gzip'ed) SAM file containing the read alignments. |
-o | Path for output files (default: path of input file). |
Option | Description |
--haarz | Path to haarz executable. Required if haarz executable is not in PATH. For installation, manual or problems please go to the segemehl (haarz) website. |
--samtools | Path to samtools executable. Required if samtools executable is not in PATH. For installation, manual or problems please go to the samtools website. |
BAT_filter_vcf
BAT_filter_vcf filters methylation information in VCF format based on
several criteria (e.g., genomic context, bisulfite mapping coverage,
methylation rate). In addition to the VCF output file containing only
the filtered positions, BAT_filter_vcf reports the methylation rates
per filtered cytosine as a bedGraph file and automatically generates a
PDF file containing plots of the methylation rate and coverage
distributions, separately for all positions and for the filtered
positions only. Note that if only the input VCF file is provided
without any filtering parameters, BAT_filter_vcf will simply produce
the bedGraph and PDF files.
Basic usage
BAT_filter_vcf --vcf <file> --out <prefix>
Output files
File | Description |
prefix.vcf.gz | Gzip'ed VCF file containing only positions passing the filtering criteria (if defined). |
prefix.bedgraph | BedGraph file of methylation rates at positions passing the filtering criteria (if defined). |
prefix.pdf | PDF file containing plots of coverage and methylation rate distributions over all positions and positions passing the filtering criteria (if defined). |
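For orientation, the sketch below formats one cytosine as a bedGraph line. It relies on the general bedGraph convention of four tab-separated columns (chromosome, zero-based start, end, value) and on VCF positions being 1-based. Whether BAT_filter_vcf writes exactly this layout, and how it formats the rate, is an assumption here; the function name is hypothetical.

```python
def bedgraph_line(chrom, pos, rate):
    """Format a 1-based position as a zero-based, half-open bedGraph interval."""
    if pos < 1:
        raise ValueError("VCF positions are 1-based")
    return f"{chrom}\t{pos - 1}\t{pos}\t{rate}"


# Hypothetical cytosine at chr1:10469 with a methylation rate of 0.75
print(bedgraph_line("chr1", 10469, 0.75))
```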
Option | Description |
--vcf | Filename of gzip'ed VCF file produced by BAT_calling that contains the cytosine-wise methylation information. |
--out | Prefix of output files (i.e., gzip'ed VCF file, bedGraph file, and PDF file). |
Filtering options
Option | Description |
--context | Comma-separated list of genomic contexts (e.g., CG). |
--MDP_min | Minimum number of reads (i.e. bisulfite mapping coverage) per sample. |
--MDP_max | Maximum number of reads (i.e. bisulfite mapping coverage) per sample. |
--MR_min | Minimum methylation rate. |
--MR_max | Maximum methylation rate. |
--MR | Indicate whether MR filter should be applied for all samples or only to the mean methylation/difference in methylation rate. Only relevant for VCF files containing multiple samples. |
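The filtering criteria above can be pictured as a simple per-position predicate. The following Python sketch covers the single-sample case only and treats all bounds as inclusive; both points are assumptions for illustration, as is the function itself, which is not part of BAT_filter_vcf. Criteria left as None are not applied, mirroring the case where no filtering parameter is given.

```python
def passes_filter(context, coverage, rate, contexts=None,
                  mdp_min=None, mdp_max=None, mr_min=None, mr_max=None):
    """Return True if a position satisfies every criterion that is set."""
    if contexts is not None and context not in contexts:
        return False
    if mdp_min is not None and coverage < mdp_min:
        return False
    if mdp_max is not None and coverage > mdp_max:
        return False
    if mr_min is not None and rate < mr_min:
        return False
    if mr_max is not None and rate > mr_max:
        return False
    return True


# A comma-separated context list, as accepted by --context, split into a set
wanted = set("CG,CHG".split(","))
print(passes_filter("CG", 20, 0.75, contexts=wanted, mdp_min=10, mdp_max=100))
print(passes_filter("CHH", 20, 0.75, contexts=wanted))
```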
Option | Description |
-R | Path to R executable. Required if R executable is not in PATH. For installation, manual or problems please go to the R website. |