
File Formats

Vinh Tran edited this page Nov 7, 2023 · 16 revisions

Featuretypes file template

The featuretypes file lists which feature types are used for the calculation and whether they should be linearized. Note that for non-core feature types, you need to provide the annotation.

#linearized
Pfam
SMART
#normal
fLPS
COILS2
SEG
TMHMM
SignalP
#checked 
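
As a rough illustration of the layout above, the following Python sketch reads a featuretypes file into a dictionary keyed by section header. The function name `read_featuretypes` is hypothetical and not part of the tool. The sketch assumes only that lines starting with `#` open a section and that every other non-empty line names a feature type; it does not interpret what `#checked` means.

```python
def read_featuretypes(text):
    """Group feature-type names under the '#section' header that precedes them.

    Hypothetical helper for illustration; not part of the tool itself.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            # A header such as '#linearized' opens a new section.
            current = line[1:]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


example = """#linearized
Pfam
SMART
#normal
fLPS
COILS2
SEG
TMHMM
SignalP
#checked
"""

parsed = read_featuretypes(example)
print(parsed["linearized"])
print(parsed["normal"])
```

A header with no entries below it (here `#checked`) simply ends up as an empty list.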

Weight constraints file template

#class
pfam 0.3
fLPS 0.1
#type
pfam_WASH_WAHD 0.1

Pairwise file template

The pairwise file is a simple tab-separated file with 2 columns:

seed_id_1 query_id_1
seed_id_2 query_id_2

Include one such line for each pair that you want to calculate the score for. All protein IDs must be present in the seed/query input files.
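
The requirement that every ID appears in the seed/query input can be checked before running the calculation. The sketch below is one way to do that in Python. The function `missing_pair_ids` is a hypothetical helper, and it assumes the seed and query IDs have already been collected into sets by some other means.

```python
import csv
import io


def missing_pair_ids(pairs_text, seed_ids, query_ids):
    """Return (line_number, id) tuples for IDs absent from the seed/query sets.

    Hypothetical helper; seed_ids and query_ids are assumed to be sets of
    protein IDs taken from the seed and query input files.
    """
    missing = []
    reader = csv.reader(io.StringIO(pairs_text), delimiter="\t")
    for lineno, row in enumerate(reader, start=1):
        if not row:
            continue  # ignore empty lines
        if len(row) != 2:
            raise ValueError(f"line {lineno}: expected 2 columns, got {len(row)}")
        seed_id, query_id = row
        if seed_id not in seed_ids:
            missing.append((lineno, seed_id))
        if query_id not in query_ids:
            missing.append((lineno, query_id))
    return missing


pairs = "seed_id_1\tquery_id_1\nseed_id_2\tquery_id_2\n"
problems = missing_pair_ids(pairs, {"seed_id_1", "seed_id_2"}, {"query_id_1"})
print(problems)  # [(2, 'query_id_2')]
```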

Pairwise file with different pairs of taxa template

Tab-separated file with 4 columns:

id_A taxon_1 id_A taxon_2
id_B taxon_1 id_B taxon_2
id_A taxon_1 id_C taxon_3

Note: with the current output file format, protein IDs must be unique across taxa (e.g. two taxa cannot have proteins with the same ID).
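
To catch violations of the uniqueness constraint early, a 4-column pairwise file can be scanned for IDs that occur under more than one taxon. The Python sketch below assumes the column order is ID, taxon, ID, taxon as in the template; the function name and the example IDs are made up for illustration.

```python
from collections import defaultdict


def ids_shared_between_taxa(pairs_text):
    """Map each protein ID that appears under more than one taxon to those taxa.

    Hypothetical helper; assumes tab-separated columns: id, taxon, id, taxon.
    """
    taxa_by_id = defaultdict(set)
    for lineno, line in enumerate(pairs_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore empty lines
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(f"line {lineno}: expected 4 columns, got {len(fields)}")
        id_1, taxon_1, id_2, taxon_2 = fields
        taxa_by_id[id_1].add(taxon_1)
        taxa_by_id[id_2].add(taxon_2)
    return {pid: sorted(taxa) for pid, taxa in taxa_by_id.items() if len(taxa) > 1}


# Made-up example: P1 is used for both 'yeast' and 'mouse'.
pairs = "P1\tyeast\tP2\thuman\nP1\tyeast\tP1\tmouse\n"
clashes = ids_shared_between_taxa(pairs)
print(clashes)  # {'P1': ['mouse', 'yeast']}
```

An empty result means no ID is shared between taxa in the file.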

Phyloprofile mapping file template

The phyloprofile mapping file is a tab-separated file that contains the NCBI ID of the source proteome of each query protein.

A2P2R3  ncbi559292
A5Z2X5  ncbi559292
D6VPM8  ncbi559292
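
For illustration, a mapping file in this format could be loaded into a dictionary as sketched below. The function `read_phyloprofile_mapping` is a hypothetical name, and the sketch assumes exactly two tab-separated columns per line (protein ID, then NCBI ID).

```python
def read_phyloprofile_mapping(text):
    """Return a dict of protein ID -> NCBI ID from a two-column, tab-separated file.

    Hypothetical helper for illustration; not part of the tool itself.
    """
    mapping = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore empty lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 2 columns, got {len(fields)}")
        protein_id, ncbi_id = (field.strip() for field in fields)
        mapping[protein_id] = ncbi_id
    return mapping


example = "A2P2R3\tncbi559292\nA5Z2X5\tncbi559292\nD6VPM8\tncbi559292\n"
mapping = read_phyloprofile_mapping(example)
print(mapping["A2P2R3"])  # ncbi559292
```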