Check quality of the database
After loading all your strains into the database, check the allele-calling quality before exporting results.
Note
You can get a summary of the data currently in the database with the stats command.
wgMLST stats -h
Usage: wgMLST stats [OPTIONS] DATABASE
Extract stats from a wgMLST DATABASE.
Validate strains
To find strains with potential problems, such as a bad assembly or the wrong species, use the strain command with the -c option.
wgMLST strain -h
Usage: wgMLST strain [OPTIONS] DATABASE
Extracts a list of strains from a wgMLST DATABASE.
Options:
-m, --mincover INTEGER Minimun number of strain found to keep a gene
(default:0)
-k, --keep Keep only gene with different allele (omit missing).
-d, --duplicate Conserve duplicate gene (default remove).
-V, --inverse Keep only gene that do not meet the filter
of mincover or keep options.
-c, --count Count the number of gene present in the database for
each strains.
-o, --output FILENAME Export strain list to (default=stdout).
Note
If some strains show a low number of genes compared with the others, you can remove them with the remove command.
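One way to spot such strains is to compare each strain's gene count with the median count. The Python sketch below assumes the output of `wgMLST strain -c` was saved with one strain per line, the name and the count separated by a tab. That layout is an assumption, not something this page specifies, so adjust the parsing to the real output. The function name `low_coverage_strains`, the 0.9 ratio and the example counts are illustrative only.

```python
import statistics


def low_coverage_strains(text, ratio=0.9):
    """Return strains whose gene count is below ratio * median count.

    Assumes (hypothetically) one "name<TAB>count" pair per line.
    """
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last tab so the count is always the final field.
        name, value = line.rsplit("\t", 1)
        counts[name] = int(value)
    if not counts:
        return []
    threshold = ratio * statistics.median(counts.values())
    return sorted(name for name, n in counts.items() if n < threshold)


# Made-up counts for illustration only.
example = "strainA\t3050\nstrainB\t3010\nstrainC\t1200\n"
print(low_coverage_strains(example))
```

With these made-up counts only strainC falls below 90% of the median, so it would be the candidate to inspect and possibly remove.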
Validate genes
As with strains, it can be useful to save the list of genes to keep for the rest of the analysis, using the gene command.
wgMLST gene -h
Usage: wgMLST gene [OPTIONS] DATABASE
Extracts a list of genes from a wgMLST DATABASE.
Options:
-m, --mincover INTEGER Minimun number of strain found to keep a gene
(default:0)
-k, --keep Keep only gene with different allele (omit missing).
-d, --duplicate Conserve duplicate gene (default remove).
-V, --inverse Keep only gene that do not meet the filter of
mincover or keep options.
-o, --output FILENAME Export GENE list to (default=stdout).
Note
The list of genes that pass your threshold can be used later when exporting sequences.
Warning
An important parameter is the -m option, which defines the minimum number of strains in which a gene must be found for it to be kept.
If you are interested in core genes, you can set this number to 95% of the strains in the database. For example, if you have 100 strains in your database, set this parameter to 95.
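The arithmetic for the -m value can be sketched in Python as follows. The helper name `core_mincover` is hypothetical. Rounding up when 95% of the strain count is not a whole number is an assumption, since this page only gives the 100-strain case.

```python
def core_mincover(n_strains, percent=95):
    """Smallest whole number of strains covering at least `percent` % of them.

    Integer arithmetic avoids floating-point rounding surprises.
    Rounding up is an assumption made for this sketch.
    """
    # Ceiling division: -(-a // b) == ceil(a / b) for positive b.
    return -(-n_strains * percent // 100)


print(core_mincover(100))  # 95
print(core_mincover(87))   # 83
```

The resulting number would then be passed to the gene command, for example `wgMLST gene -m 95 -o FILENAME DATABASE`, using only the options listed in the help text above.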
Remove strains or genes
After checking the database, if some strains or genes need to be removed, you can use the remove command.
wgMLST remove -h
Usage: wgMLST remove [OPTIONS] DATABASE [GENES_OR_STRAINS]...
Removes STRAINS or GENES from a wgMLST DATABASE.
Options:
--strains / --genes Choose the item you wish to remove [default: strains]
-f, --file FILENAME File list of genes or strains to removed on the wgMLST
database.
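Using only the options listed in the help text above, a removal call can be assembled in Python as sketched below. The helper name `remove_command`, the database name `mydb` and the strain and file names are hypothetical. The sketch only builds the argument list. Running it, for example with `subprocess.run`, is left to the reader.

```python
def remove_command(database, items=(), genes=False, list_file=None):
    """Build the argument list for `wgMLST remove` from the documented options.

    Follows the usage line: wgMLST remove [OPTIONS] DATABASE [GENES_OR_STRAINS]...
    """
    # --strains is the documented default; it is passed explicitly for clarity.
    cmd = ["wgMLST", "remove", "--genes" if genes else "--strains"]
    if list_file is not None:
        # -f takes a file listing the genes or strains to remove.
        cmd += ["-f", str(list_file)]
    cmd.append(database)
    cmd.extend(items)
    return cmd


# Hypothetical names, for illustration only.
print(remove_command("mydb", ["strainC"]))
print(remove_command("mydb", genes=True, list_file="genes_to_drop.txt"))
```

Building the list separately makes it easy to print and review the exact command before anything is deleted from the database.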