.. _clamlst_initialise:
.. toctree::
:glob:
==========================
Initialise a MLST database
==========================
A MLST database contains the different alleles for each gene of the
scheme and a table of association of the alleles to determined the
sequence type (ST).
Import from pubMLST
===================
You can automatically import a MLST resource from `pubmlst
`_ or `pasteur `_.
.. code-block:: bash
claMLST import -h
Usage: claMLST import [OPTIONS] DATABASE [SPECIES]...
Creates a claMLST DATABASE from an online resource.
The research can be filtered by adding a SPECIES name.
Options:
--prompt / --no-prompt Do not prompt if multiple choices are found,
fail instead.
-f, --force Overwrites alrealdy existing DATABASE
-m, --mlst TEXT Specifies the desired MLST scheme name.
-r, --repository Choose the online repository to use [pubmlst|pasteur]
Create from other resource
==========================
Alternatively, you can create a database with the allele sequence and
MLST profile of your favorite species. To create a database, pyMLST
needs the gene name in the MLST profile header to match the name in
the fasta file. For example, the rpoB gene in the MLST profile header
must match the rpoB.fas file. You will also need to remove the
additional column corresponding to the clonal complex in the MLST
profile file, if present.
.. code-block:: bash
claMLST create --help
Usage: claMLST create [OPTIONS] DATABASE PROFILE ALLELES...
Creates a classical MLST DATABASE from a txt PROFILE and fasta ALLELES files.
Options:
-f, --force Overwrites alrealdy existing DATABASE
-s, --species TEXT Name of the species
-V, --version TEXT Version of the database
Scheme example
--------------
.. code::
ST cpn60 fusA gltA pyrG recA rplB rpoB
1 1 1 1 1 5 1 1
2 2 2 2 2 2 2 2
3 3 3 2 2 3 1 3
...
Allele example
--------------
.. code::
>cpn60_1
ATGAACCCAATGGATTTAAAACGCGGTATCGACATTGCAGTAAAAACTGTAGTTGAAAAT
ATCCGTTCTATTGCTAAACCAGCTGATGATTTCAAAGCAATTGAACAAGTAGGTTCAATC
TCTGCTAACTCTGATACTACTGTTGGTAAACTTATTGCTCAAGCAATGGAAAAAGTAGGT
AAAGAAGGCGTAATCACTGTAGAAGAAGGTTCTGGCTTCGAAGACGCATTAGACGTTGTA
GAAGGTATGCAGTTTGACCGTGGTTATATCTCTCCGTACTTTGCAAACAAACAAGATACT
TTAACTGCTGAACTTGAAAATCCGTTCATTCTTCTTGTTGATAAAAAAATCAGCAACATT
CGTGAATTGATTTCTGTTTTAGAAGCAGTTGCTAAAACTGGTAAA
>cpn60_2
ATGAACCCAATGGATTTAAAACGCGGTATCGACATTGCAGTAAAAACTGTAGTTGAAAAT
ATCCGTTCTATTGCTAAACCAGCTGATGATTTCAAAGCAATTGAACAAGTAGGTTCAATC
TCTGCTAACTCTGATACTACTGTTGGTAAACTTATTGCTCAAGCAATGGAAAAAGTAGGT
AAAGAAGGCGTAATCACTGTAGAAGAAGGCTCAGGCTTCGAAGACGCATTAGACGTTGTA
GAAGGTATGCAGTTTGACCGTGGTTATATCTCTCCGTACTTTGCAAACAAACAAGATACT
TTAACTGCTGAACTTGAAAATCCGTTCATCCTTCTTGTTGATAAAAAAATCAGCAACATT
CGTGAATTGATTTCTGTTTTAGAAGCAGTTGCTAAAACTGGTAAA
...