
Adding new reference data

DominikS edited this page Jun 2, 2022 · 2 revisions

You can add new reference data to XspecT's database for species assignment, strain typing, or Oxa-gene screening.

Adding new Acinetobacter species:

When new species are discovered, you can add them to XspecT. XspecT needs reference data to build the Bloom filter. For an accurate assignment, use up to four assemblies for each new species. Concatenate those assemblies into one file, which should not contain more than ~11.9 million distinct k-mers. You can use the tool Jellyfish to count k-mers.
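As a rough illustration of this preparation step, the following Python sketch concatenates several assemblies into one file and counts its distinct k-mers. The function names, the k-mer length of 21, and the choice to count forward-strand k-mers consisting only of A, C, G and T are assumptions made for the example; they are not taken from XspecT, so check which k XspecT actually uses before relying on the number. For full-size genomes a dedicated counter such as Jellyfish will be far more memory-efficient than this in-memory set.

```python
from pathlib import Path

K = 21  # assumed k-mer length, not confirmed by XspecT
MAX_DISTINCT_KMERS = 11_900_000  # approximate limit stated above
VALID_BASES = set("ACGT")


def concatenate_fasta(assemblies, out_path):
    """Write all assembly files one after another into out_path."""
    with open(out_path, "w") as out:
        for path in assemblies:
            text = Path(path).read_text()
            # Make sure each file ends with a newline before the next header.
            out.write(text if text.endswith("\n") else text + "\n")


def read_sequences(fasta_path):
    """Yield each FASTA record's sequence as one upper-case string."""
    chunks = []
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if chunks:
                    yield "".join(chunks).upper()
                chunks = []
            elif line:
                chunks.append(line)
    if chunks:
        yield "".join(chunks).upper()


def count_distinct_kmers(fasta_path, k=K):
    """Count distinct k-mers (forward strand only, A/C/G/T only)."""
    seen = set()
    for seq in read_sequences(fasta_path):
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= VALID_BASES:
                seen.add(kmer)
    return len(seen)
```

A typical check would compare `count_distinct_kmers("NewSpecies1.fasta")` against `MAX_DISTINCT_KMERS` before copying the file into XspecT; the file name here is only a placeholder.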

Follow these steps to add new species:

  1. Make sure that the name of the concatenated file is the species name

  2. Copy the concatenated file into the folder filter/new_species

  3. Copy the training data into the folder Training-data/genomes

  4. Make sure that each file name contains an assembly accession number (for better readability in the CSV file)

  5. Run the script Add_Species.py with the species names as additional parameters

    python Add_Species.py NewSpecies1 NewSpecies2

  6. Restart XspecT to apply the changes

  7. Important: after the script has finished, delete the assembly files from the filter/new_species folder. The newly added species will be trained into a Bloom filter, and new SVM training data will be generated.
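The file handling in the steps above could be sketched in Python roughly as follows. This is only an illustration, not part of XspecT: `stage_species` and `clean_up` are hypothetical helpers, the species name is compared against the file name without its extension, step 4 is assumed to refer to the training-data files, and the accession check assumes NCBI-style assembly accessions (`GCA_` or `GCF_`, nine digits, a version suffix).

```python
import re
import shutil
from pathlib import Path

# Folder names taken from steps 2 and 3, relative to the XspecT directory.
NEW_SPECIES_DIR = Path("filter/new_species")
TRAINING_DIR = Path("Training-data/genomes")
# Assumed accession format: NCBI assembly accessions such as GCF_000000001.1.
ACCESSION = re.compile(r"GC[AF]_\d{9}\.\d+")


def stage_species(species, concatenated_file, training_files, root="."):
    """Check file names (steps 1 and 4) and copy the files (steps 2 and 3)."""
    root = Path(root)
    concatenated_file = Path(concatenated_file)
    if concatenated_file.stem != species:
        raise ValueError(f"{concatenated_file.name} is not named after {species}")
    unnamed = [str(f) for f in training_files
               if not ACCESSION.search(Path(f).name)]
    if unnamed:
        raise ValueError(f"no assembly accession in file name: {unnamed}")

    filter_dir = root / NEW_SPECIES_DIR
    training_dir = root / TRAINING_DIR
    filter_dir.mkdir(parents=True, exist_ok=True)
    training_dir.mkdir(parents=True, exist_ok=True)

    staged = filter_dir / concatenated_file.name
    shutil.copy(concatenated_file, staged)
    for f in training_files:
        shutil.copy(f, training_dir / Path(f).name)
    return staged


def clean_up(staged_files):
    """Delete the staged assembly files once the script has finished (step 7)."""
    for path in staged_files:
        Path(path).unlink()
```

In this sketch you would call `stage_species` first, run Add_Species.py as in step 5, and then pass the returned path to `clean_up`. Deleting only the files that were staged, rather than everything in the folder, avoids removing unrelated files by accident.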

Adding new A. baumannii sub-types:

Coming soon

Adding new Oxa-gene families:

Coming soon