This repository contains some of the materials used for the manuscript "GTRpmix: A linked general-time reversible model for profile mixture models".
-
The file Matrices_GTR20.nex contains, in NEXUS format, the matrices EAL, ELM, ELM50, MXM, MXMfix, and FM described in the main manuscript.
-
The folder trees contains, in Newick format, the trees depicted in Supplementary Figures S1, S10, and S11 (denoted Sim1.treefile, ELM.treefile, and EAL.treefile) and one additional tree, ELM_50tax.treefile. The tree Sim1.treefile was used to generate the simulations described in the section 'Parameter Estimation Performance' of the main manuscript. The trees ELM.treefile, ELM_50tax.treefile, and EAL.treefile were used to create the matrices ELM, ELM50, and EAL, respectively, as described in the main manuscript.
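-
As a quick sanity check on the tree files, the tip labels of a Newick tree can be listed with a few lines of Python. This is only a minimal sketch, not part of the repository: the function names `leaf_names` and `leaf_names_from_file` are hypothetical, and the regular expression assumes unquoted labels that contain no whitespace, parentheses, commas, colons, or semicolons.

```python
import re

# A tip label is a label that directly follows "(" or ",".
# Internal-node labels and support values follow ")" and are therefore skipped.
# Assumption: labels are unquoted and free of whitespace and Newick punctuation.
_LEAF = re.compile(r"[(,]\s*([^\s(),:;]+)")


def leaf_names(newick):
    """Return the tip labels of a Newick string, in order of appearance."""
    return _LEAF.findall(newick)


def leaf_names_from_file(path):
    """Read a Newick file and return its tip labels."""
    with open(path) as handle:
        return leaf_names(handle.read())
```

For example, `len(leaf_names_from_file("trees/ELM_50tax.treefile"))` (the relative path is an assumption about the repository layout) would be expected to match the number of taxa in the corresponding alignment.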
-
The folder Dataset contains two subdirectories. The subdirectory ELM contains the Pan-Eukaryotic data sets described in the section 'Data Sets' of the paper: ELM_data.fas is the 78-taxon dataset, while ELM50_data.fas is the 50-taxon dataset. Additionally, the files ELM.indices.tsv and ELM50.indices.tsv contain the per-protein partitions for the datasets ELM and ELM50, respectively. The subdirectory EAL contains the Eukaryotic-Archaeal data set, denoted EAL_data.fas, described in the section 'Data Sets' of the paper. Additionally, the file EAL_partition.out contains the per-protein partitions for that dataset.
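-
To confirm that an alignment has the expected number of taxa, the .fas files can be read with a small parser. The sketch below is not part of the repository and assumes the files are standard FASTA (a ">" header line followed by one or more sequence lines); the helper names `parse_fasta` and `summarize` are hypothetical.

```python
def parse_fasta(lines):
    """Return a dict mapping each FASTA header (without '>') to its sequence.

    If a header occurs more than once, the later record replaces the earlier one.
    """
    records = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def summarize(path):
    """Return (number of sequences, set of distinct sequence lengths) for a file."""
    with open(path) as handle:
        records = parse_fasta(handle)
    return len(records), {len(seq) for seq in records.values()}
```

Calling `summarize("Dataset/ELM/ELM50_data.fas")` (the relative path is an assumption about the repository layout) should report 50 sequences if the file matches its description, and a single sequence length if the file is an alignment.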