Functionality for genomic sequence analysis.
Finds genomic features, such as exons, that overlap genomic positions, reconstructs offsets into transcripts, and computes the amino-acid changes caused by variants. It also finds mutations in exon junctions and genes with high mutation frequencies.
Positions, ranges, and mutations are specified using a common format. Each is
identified by several fields separated by the character `:`. Mutations have an
additional field representing the mutant allele, as in `12:4234412:T` (the
reference allele is redundant, as it is determined by the chromosome and
position). Indels are represented as follows: `+A` or `+ATC` for one- and
three-base insertions respectively, or `-` or `---` for one- or three-base
deletions respectively. Chromosome ranges are specified as
`chromosome:start:end`.
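
The format above can be illustrated with a small parser. This is a minimal sketch in Python, not the code behind these tasks: the `Mutation` class and the `parse_mutation` and `parse_range` helpers are hypothetical names introduced here, and the sketch assumes the field separator is `:` as in the `12:4234412:T` example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    chromosome: str
    position: int
    allele: str  # mutant allele; "+ATC" style for insertions, "---" style for deletions

    @property
    def kind(self) -> str:
        """Classify the allele field as insertion, deletion or substitution."""
        if self.allele.startswith("+"):
            return "insertion"
        if self.allele and set(self.allele) == {"-"}:
            return "deletion"
        return "substitution"

    @property
    def indel_length(self) -> int:
        """Number of inserted or deleted bases (0 for substitutions)."""
        if self.kind == "insertion":
            return len(self.allele) - 1  # drop the leading "+"
        if self.kind == "deletion":
            return len(self.allele)  # one "-" per deleted base
        return 0


def parse_mutation(text: str) -> Mutation:
    """Parse a 'chromosome:position:allele' string."""
    chromosome, position, allele = text.strip().split(":")[:3]
    return Mutation(chromosome, int(position), allele)


def parse_range(text: str):
    """Parse a 'chromosome:start:end' string into a tuple."""
    chromosome, start, end = text.strip().split(":")[:3]
    return chromosome, int(start), int(end)
```

For example, `parse_mutation("12:4234412:+ATC").kind` is `"insertion"` with an `indel_length` of 3.
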
It supports multiple organisms. The format of the organism input is the
organism short code (for example `Hsa`), optionally followed by the date of the
build. For example, `Hsa/jan2013` for a recent build or `Hsa/may2009` for the
hg18 build.
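
As an illustration of the organism format, the following Python sketch splits an organism string into its short code and optional build date. The `parse_organism` helper is a hypothetical name, and the assumption that the build is always a three-letter month followed by a four-digit year is inferred only from the two examples above.

```python
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def parse_organism(organism: str):
    """Split 'Hsa/jan2013' into ('Hsa', (2013, 1)); the build part is optional."""
    code, _, build = organism.strip().partition("/")
    if not build:
        return code, None
    # Assumption: the build looks like <3-letter month><4-digit year>.
    month = MONTHS.index(build[:3].lower()) + 1
    year = int(build[3:])
    return code, (year, month)
```

Returning the build as a `(year, month)` tuple makes builds directly comparable, so `parse_organism("Hsa/may2009")[1] < parse_organism("Hsa/jan2013")[1]` holds.
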
The `watson` input specifies whether the variants are described in reference to
the Watson (forward) strand, or in reference to the strand that holds the
overlapping gene. Using the wrong convention may make some mutations coincide
with the reference. The `is_watson` method can take a guess by checking this
criterion.
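
The idea behind this check can be sketched as follows. This is only an illustration of the criterion described above, not the actual `is_watson` implementation: `guess_is_watson`, its `tolerance` parameter, and the `forward_reference` callable (a stand-in for a real genome lookup) are all assumptions made for the example.

```python
def guess_is_watson(mutations, forward_reference, tolerance=0.0):
    """Guess whether mutations are given on the forward (Watson) strand.

    mutations: iterable of 'chromosome:position:allele' strings.
    forward_reference: callable (chromosome, position) -> forward-strand base.
    tolerance: largest fraction of alleles allowed to equal the reference.
    """
    checked = 0
    matches = 0
    for mutation in mutations:
        chromosome, position, allele = mutation.split(":")[:3]
        allele = allele.upper()
        if allele not in ("A", "C", "G", "T"):
            continue  # skip indels and anything that is not a single base
        checked += 1
        if forward_reference(chromosome, int(position)).upper() == allele:
            matches += 1  # a "mutation" equal to the reference is suspicious
    if checked == 0:
        return True  # nothing contradicts the forward-strand assumption
    return matches / checked <= tolerance
```

A toy lookup such as `lambda c, p: {("1", 100): "A", ("1", 200): "C"}[(c, p)]` is enough to try it: `["1:100:G", "1:200:T"]` is consistent with the forward strand, while `["1:100:A", "1:200:T"]` is not, because the first allele equals the reference base.
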
The `vcf` parameter will interpret the input as a VCF file, and will use the
`genomic_mutations` task to extract the mutations from it.
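
To make the VCF step concrete, here is a rough Python sketch of turning VCF records into `chromosome:position:allele` strings. It is not the `genomic_mutations` task itself: the `vcf_to_mutations` name and the `min_quality` threshold on the QUAL column are assumptions, and the sketch keeps only single-base substitutions because the rules the task uses for indels and for its quality criterion are not described here.

```python
def vcf_to_mutations(lines, min_quality=0.0):
    """Extract single-base substitutions from VCF text lines."""
    mutations = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header
        # Standard VCF column order: CHROM POS ID REF ALT QUAL ...
        chrom, pos, _id, ref, alt, qual = line.split("\t")[:6]
        if qual != "." and float(qual) < min_quality:
            continue  # below the (assumed) quality threshold
        for allele in alt.split(","):  # ALT may list several alleles
            if len(ref) == 1 and allele.upper() in ("A", "C", "G", "T"):
                mutations.append(f"{chrom}:{pos}:{allele}")
    return mutations
```

Records with a missing quality (`.`) are kept in this sketch; a stricter filter could drop them instead.
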
The main tasks are:
- add_reference Add reference to mutations as (ref>mut)
- affected_genes Finds genes affected by genomic mutations, either by amino-acid changes on their protein products, or by changes in splicing sequences
- binomial_significance_syn For a list of mutations, find genes that suffer a higher rate of mutation than expected. Also considers synonymous mutations
- exon_junctions Report exon junctions overlapping positions
- exons Report exons overlapping positions
- gene_strand_reference Report the reference base at the provided positions on the gene coding strand
- genes Report genes overlapping positions
- genes_at_ranges Report genes overlapping ranges
- genomic_mutations Extract genomic mutations from a VCF file that meet a quality criterion
- is_watson Guess whether the mutations provided are given in the watson strand or the gene strand
- mutated_isoforms Computes the consequence of genomic mutations in terms of amino-acid changes in protein isoforms
- mutated_isoforms_fast One-step implementation of the `mutated_isoforms` task
- reference Report the reference base at the provided positions
- splicing_mutations Find mutations that may affect the splicing of protein coding transcripts
- transcript_offsets Computes the offset inside the coding sequence of the transcripts overlapped by the genomic mutations
- transcripts Report transcripts overlapping positions
- type Report the type of base change: transition, transversion, indel, unknown or none at all
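
As an example of what one of these tasks computes, the categories reported by `type` can be sketched in Python using the standard definitions: a transition stays within purines (A, G) or within pyrimidines (C, T), and a transversion switches between the two groups. The `base_change_type` helper is a hypothetical name, and the exact rules the task applies (for instance how ambiguous bases are handled) are assumptions here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def base_change_type(reference: str, allele: str) -> str:
    """Classify a change as transition, transversion, indel, unknown or none."""
    reference, allele = reference.upper(), allele.upper()
    # Indels use the '+ATC' (insertion) and '---' (deletion) notation.
    if allele.startswith("+") or (allele and set(allele) == {"-"}):
        return "indel"
    bases = PURINES | PYRIMIDINES
    if reference not in bases or allele not in bases:
        return "unknown"  # e.g. ambiguous bases such as N
    if reference == allele:
        return "none"
    pair = {reference, allele}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"
```

For instance, `base_change_type("A", "G")` gives `"transition"` and `base_change_type("A", "C")` gives `"transversion"`.
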