Administration

For administrative tasks, Mutalyzer comes with the mutalyzer-admin command line utility. Use its -h argument in combination with any subcommand for detailed usage information, for example:

$ mutalyzer-admin setup-database -h
usage: mutalyzer-admin setup-database [-h] [--destructive] [-c ALEMBIC_CONFIG]

Setup database tables (if they do not yet exist).

optional arguments:
  -h, --help            show this help message and exit
  --destructive         delete any existing tables and data
  -c ALEMBIC_CONFIG, --alembic-config ALEMBIC_CONFIG
                        path to Alembic configuration file

If Alembic config is given (--alembic-config), this also prepares the database
for future migrations with Alembic (recommended).

Managing genome assemblies

Mutalyzer can be loaded with any number of genome assemblies. Each assembly includes a list of chromosomes. To list the currently loaded genome assemblies:

$ mutalyzer-admin assemblies list
GRCh37 (hg19), Homo sapiens (9606)
GRCm38 (mm10), Mus musculus (10090)

Loading a new genome assembly is done with information in a JSON file, of which there are some examples in the Mutalyzer source tree under the extras/assemblies directory, for example:

$ mutalyzer-admin assemblies add extras/assemblies/GRCh37.json

For any genome assembly, transcript mappings can be imported. These include genomic coordinate mappings of their CDS and exons. Currently, three sources of transcript mappings are supported.

Note

The following mutalyzer-admin assemblies subcommands all accept an optional --assembly argument to specify the genome assembly to import to.

Import mappings from an NCBI mapview file

The NCBI provides FTP downloads of transcript mappings for a large number of genome assemblies as used by their Map Viewer service. These can be imported with mutalyzer-admin, but only after sorting by the feature_id and chromosome columns.

For example, to import transcript mappings for the GRCh37 assembly, run the following:

$ wget ftp://ftp.ncbi.nlm.nih.gov/genomes/MapView/Homo_sapiens/sequence/ANNOTATION_RELEASE.105/initial_release/seq_gene.md.gz
$ zcat seq_gene.md.gz | sort -t $'\t' -k 11,11 -k 2,2 > seq_gene.sorted.md
$ mutalyzer-admin assemblies import-mapview seq_gene.sorted.md 'GRCh37.p13-Primary Assembly'

Note

The last argument, GRCh37.p13-Primary Assembly, defines the group label to filter the file on. You would usually want to include it.

Examples for other assemblies can be found in this Gist.

Import mappings from an EBI LRG transcripts map file

The EBI provides FTP downloads of transcript mappings for all of the LRG sequences on the latest human genome assembly. These can be imported with mutalyzer-admin.

For example, to import LRG transcript mappings for the GRCh37 assembly, run the following:

$ wget ftp://ftp.ebi.ac.uk/pub/databases/lrgex/list_LRGs_transcripts_GRCh37.txt -O /tmp/hg19.lrgmap.txt
$ mutalyzer-admin assemblies import-lrgmap -a hg19 /tmp/hg19.lrgmap.txt

Import mappings from the UCSC Genome Browser MySQL database

Transcript mappings from the UCSC Genome Browser MySQL database can be imported on a per-gene basis. This is useful when the NCBI mappings do not (yet) include a certain gene or transcript.

For example, to import all TTN transcript mappings:

$ mutalyzer-admin assemblies import-gene TTN

Note

This subcommand chooses the UCSC genome assembly by using the alias of the specified Mutalyzer genome assembly (hg19 by default).

Import mappings from a reference file

For transcript mappings that are not available from our usual sources, importing from a genomic reference is supported:

$ mutalyzer-admin assemblies import-reference NC_012920.1

Note

Currently this subcommand is restricted to importing mtDNA transcripts, since it has the chromosome hard coded and only supports one exon per transcript.

Showing announcements to users

It is possible to define an announcement to be shown on the website interface. For example, to display Hello World! with a link to the GNU Hello World! page:

$ mutalyzer-admin announcement set 'Hello World!' \
    --url http://www.gnu.org/fun/jokes/helloworld.html

To remove the announcement, use unset:

$ mutalyzer-admin announcement unset

Synchronizing the cache with other installations

Using the sync-cache subcommand, the reference file cache of a remote Mutalyzer installation can be queried for new entries which are then retrieved and added to the local cache.

The primary purpose for this is synchronizing reference files loaded by users with the reference file loader between different servers. These reference files are assigned a unique accession number (starting with UD_) upon creation, which is at that point unknown to any other Mutalyzer server.

For example, to synchronize the local reference file cache with the primary Mutalyzer server:

$ mutalyzer-admin sync-cache 'https://mutalyzer.nl/services/?wsdl' \
    'https://mutalyzer.nl/Reference/{file}'

Mutalyzer database setup

After installation, a database needs to be setup for Mutalyzer to run (see Mutalyzer setup):

$ mutalyzer-admin setup-database --alembic-config migrations/alembic.ini

The --alembic-config argument points to the alembic.ini file in the Mutalyzer source tree and it enables initialization of database migration management. It is recommended to include it, but you don’t need it if you don’t plan to ever upgrade your Mutalyzer installation.

This subcommand also takes an optional --destructive argument, which can be used to remove any existing database content.