Administration
For administrative tasks, Mutalyzer comes with the mutalyzer-admin command line utility. Use its -h argument in combination with any subcommand for detailed usage information, for example:
$ mutalyzer-admin setup-database -h
usage: mutalyzer-admin setup-database [-h] [--destructive] [-c ALEMBIC_CONFIG]

Setup database tables (if they do not yet exist).

optional arguments:
  -h, --help            show this help message and exit
  --destructive         delete any existing tables and data
  -c ALEMBIC_CONFIG, --alembic-config ALEMBIC_CONFIG
                        path to Alembic configuration file

If Alembic config is given (--alembic-config), this also prepares the database
for future migrations with Alembic (recommended).
Managing genome assemblies
Mutalyzer can be loaded with any number of genome assemblies. Each assembly includes a list of chromosomes. To list the currently loaded genome assemblies:
$ mutalyzer-admin assemblies list
GRCh37 (hg19), Homo sapiens (9606)
GRCm38 (mm10), Mus musculus (10090)
A new genome assembly is loaded from a JSON file. Some examples are included in the Mutalyzer source tree under the extras/assemblies directory, for example:
$ mutalyzer-admin assemblies add extras/assemblies/GRCh37.json
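To load every assembly definition in a directory at once, the command above can be wrapped in a loop. The following is a minimal sketch, not part of Mutalyzer itself: the function name add_all_assemblies and the MUTALYZER_ADMIN override variable are hypothetical, and how the tool behaves when an assembly is already loaded is not covered here.

```shell
# Command to run; set MUTALYZER_ADMIN=echo for a dry run that only prints
# the commands (hypothetical convenience, not a Mutalyzer feature).
MUTALYZER_ADMIN="${MUTALYZER_ADMIN:-mutalyzer-admin}"

add_all_assemblies() {
    # $1 = directory containing assembly definition files (*.json)
    for json in "$1"/*.json; do
        # Skip the unexpanded pattern when the directory has no JSON files.
        [ -e "$json" ] || continue
        # Stop at the first failure so errors are not silently skipped.
        "$MUTALYZER_ADMIN" assemblies add "$json" || return 1
    done
}
```

Calling add_all_assemblies extras/assemblies from the root of the source tree would then run one assemblies add per JSON file, in alphabetical order.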
For any genome assembly, transcript mappings can be imported. These include the genomic coordinate mappings of the CDS and exons of each transcript. The supported sources of transcript mappings are described below.
Note
The following mutalyzer-admin assemblies subcommands all accept an optional --assembly argument to specify the genome assembly to import to.
Import mappings from an NCBI mapview file
The NCBI provides FTP downloads of transcript mappings for a large number of genome assemblies as used by their Map Viewer service. These can be imported with mutalyzer-admin, but only after sorting by the feature_id and chromosome columns.
For example, to import transcript mappings for the GRCh37 assembly, run the following:
$ wget ftp://ftp.ncbi.nlm.nih.gov/genomes/MapView/Homo_sapiens/sequence/ANNOTATION_RELEASE.105/initial_release/seq_gene.md.gz
$ zcat seq_gene.md.gz | sort -t $'\t' -k 11,11 -k 2,2 > seq_gene.sorted.md
$ mutalyzer-admin assemblies import-mapview seq_gene.sorted.md 'GRCh37.p13-Primary Assembly'
Note
The last argument, GRCh37.p13-Primary Assembly, defines the group label to filter the file on. You would usually want to include it.
Examples for other assemblies can be found in this Gist.
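The sort step can be tried on a few synthetic rows before running it on the full file. This sketch assumes, as in the command above, that the chromosome is column 2 and feature_id is column 11 of the tab-separated file. The helper names make_row and sort_mapview and the row contents are made up for illustration. Setting LC_ALL=C is an addition of this sketch to make the ordering independent of the locale, and printf is used to produce the tab character so that the snippet does not depend on the $'\t' syntax.

```shell
make_row() {
    # $1 = chromosome (column 2), $2 = feature_id (column 11).
    # All other columns are filler values.
    printf '9606\t%s\t-\t-\t-\t-\t-\t-\t-\t-\t%s\n' "$1" "$2"
}

sort_mapview() {
    # Sort tab-separated input by column 11, then by column 2.
    tab="$(printf '\t')"
    LC_ALL=C sort -t "$tab" -k 11,11 -k 2,2
}

# Three unsorted rows; print only the chromosome and feature_id columns.
{
    make_row 2 GeneID:200
    make_row 1 GeneID:300
    make_row 1 GeneID:100
} | sort_mapview | cut -f 2,11
```

Rows that share a feature_id end up adjacent, with the chromosome column deciding their relative order.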
Import mappings from an EBI LRG transcripts map file
The EBI provides FTP downloads of transcript mappings for all of the LRG sequences on the latest human genome assembly. These can be imported with mutalyzer-admin.
For example, to import LRG transcript mappings for the GRCh37 assembly, run the following:
$ wget ftp://ftp.ebi.ac.uk/pub/databases/lrgex/list_LRGs_transcripts_GRCh37.txt -O /tmp/hg19.lrgmap.txt
$ mutalyzer-admin assemblies import-lrgmap -a hg19 /tmp/hg19.lrgmap.txt
Import mappings from the UCSC Genome Browser MySQL database
Transcript mappings from the UCSC Genome Browser MySQL database can be imported on a per-gene basis. This is useful when the NCBI mappings do not (yet) include a certain gene or transcript.
For example, to import all TTN transcript mappings:
$ mutalyzer-admin assemblies import-gene TTN
Note
This subcommand chooses the UCSC genome assembly by using the alias of the specified Mutalyzer genome assembly (hg19 by default).
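Because the import works per gene, importing several genes means running the subcommand once for each. A minimal sketch of such a loop follows. The function name import_genes and the MUTALYZER_ADMIN override variable are hypothetical, and placing --assembly before the gene symbol is an assumption based on the import-lrgmap example above.

```shell
# Command to run; set MUTALYZER_ADMIN=echo for a dry run that only prints
# the commands (hypothetical convenience, not a Mutalyzer feature).
MUTALYZER_ADMIN="${MUTALYZER_ADMIN:-mutalyzer-admin}"

import_genes() {
    # $1 = genome assembly name or alias, remaining arguments = gene symbols
    assembly="$1"
    shift
    for gene in "$@"; do
        # Report a failed import on stderr and continue with the next gene.
        "$MUTALYZER_ADMIN" assemblies import-gene --assembly "$assembly" "$gene" \
            || echo "import failed for $gene" >&2
    done
}
```

For example, import_genes hg19 TTN DMD would import the mappings for both genes into the assembly with alias hg19 (DMD is only an illustrative second gene).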
Import mappings from a reference file
For transcript mappings that are not available from our usual sources, importing from a genomic reference is supported:
$ mutalyzer-admin assemblies import-reference NC_012920.1
Note
Currently, this subcommand is restricted to importing mtDNA transcripts, since it has the chromosome hard-coded and only supports one exon per transcript.
Showing announcements to users
It is possible to define an announcement to be shown on the website interface. For example, to display Hello World! with a link to the GNU Hello World! page:
$ mutalyzer-admin announcement set 'Hello World!' \
--url http://www.gnu.org/fun/jokes/helloworld.html
To remove the announcement, use unset:
$ mutalyzer-admin announcement unset
Synchronizing the cache with other installations
Using the sync-cache subcommand, the reference file cache of a remote Mutalyzer installation can be queried for new entries, which are then retrieved and added to the local cache.
The primary purpose for this is synchronizing reference files loaded by users with the reference file loader between different servers. These reference files are assigned a unique accession number (starting with UD_) upon creation, which is at that point unknown to any other Mutalyzer server.
For example, to synchronize the local reference file cache with the primary Mutalyzer server:
$ mutalyzer-admin sync-cache 'https://mutalyzer.nl/services/?wsdl' \
'https://mutalyzer.nl/Reference/{file}'
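The second argument is a URL template. Assuming, as the placeholder name suggests, that {file} is replaced by the name of each reference file to download, the substitution can be sketched as follows. The helper expand_template and the file name UD_000000000000.gb are hypothetical and only illustrate the idea. Mutalyzer performs the actual substitution internally.

```shell
expand_template() {
    # $1 = URL template containing the literal placeholder {file}
    # $2 = file name to substitute (must not contain '|' or '&',
    #      which are special in the sed replacement below)
    printf '%s\n' "$1" | sed "s|{file}|$2|"
}

# Print the download URL for one hypothetical reference file name.
expand_template 'https://mutalyzer.nl/Reference/{file}' 'UD_000000000000.gb'
```

The template is passed in single quotes on the command line so that the shell leaves the braces untouched.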
Mutalyzer database setup
After installation, a database needs to be set up for Mutalyzer to run (see Mutalyzer setup):
$ mutalyzer-admin setup-database --alembic-config migrations/alembic.ini
The --alembic-config argument points to the alembic.ini file in the Mutalyzer source tree and enables initialization of database migration management. It is recommended to include it, but you don’t need it if you don’t plan to ever upgrade your Mutalyzer installation.
This subcommand also takes an optional --destructive argument, which can be used to remove any existing database content.
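Since --destructive deletes existing tables and data, it can be worth guarding it behind an explicit confirmation in scripts. The following is a minimal sketch of such a guard. The function name reset_database, the "yes" confirmation argument, and the MUTALYZER_ADMIN override variable are hypothetical and not part of Mutalyzer.

```shell
# Command to run; set MUTALYZER_ADMIN=echo for a dry run that only prints
# the command (hypothetical convenience, not a Mutalyzer feature).
MUTALYZER_ADMIN="${MUTALYZER_ADMIN:-mutalyzer-admin}"

reset_database() {
    # $1 = path to alembic.ini
    # $2 = must be the literal word "yes" to proceed
    if [ "$2" != "yes" ]; then
        echo "refusing to drop tables without explicit 'yes'" >&2
        return 1
    fi
    # Recreate all tables from scratch and prepare for Alembic migrations.
    "$MUTALYZER_ADMIN" setup-database --destructive --alembic-config "$1"
}
```

With this in place, reset_database migrations/alembic.ini yes performs the destructive setup, while any other second argument leaves the database untouched and returns a non-zero status.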