Script to make data library of RefSeq reference genomes for specified genus

gvlproject/galaxy_refseq_libraries

galaxy_refseq_libraries

refseq_to_library.py

Script to make data library of RefSeq reference genomes for specified genus

usage: refseq_to_library.py [-h] [-s SPECIES] [-u URL] [-d DIR] [-k KEY] [-v] genus

 Add RefSeq reference genomes to galaxy data libraries.

 positional arguments:
  genus                 the genus to create a library for

 optional arguments:
   -h, --help            show this help message and exit
   -s SPECIES, --species SPECIES     the species to create the library for
   -u URL, --url URL     the galaxy URL
   -d DIR, --dir DIR     the RefSeq directory containing all species
   -k KEY, --key KEY     the Galaxy API key to use

Needs an API key in GALAXY_KEY unless specified via command line

If a species is specified, the library will contain all RefSeq data for that species. If no species is specified, the library will contain every species in the genus. In both cases the RefSeq folder hierarchy is preserved in the library.

Assumes the RefSeq folder has the following structure:

refseq_folder/
    species/
        fna files
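
As a rough illustration of the selection described above, the sketch below lists the species folders under a RefSeq directory that belong to a genus, optionally narrowed to one species. It assumes species folders are named with the genus first (for example Escherichia_coli_K_12), which this README does not state. find_species_dirs and list_fna_files are hypothetical helpers, not functions taken from refseq_to_library.py.

```python
import os


def find_species_dirs(refseq_dir, genus, species=None):
    """Return sorted names of species folders under refseq_dir for a genus.

    Assumption (not from the README): folders are named Genus_species[_strain].
    """
    prefix = genus if species is None else f"{genus}_{species}"
    matches = []
    for name in sorted(os.listdir(refseq_dir)):
        # Only folders count as species entries; skip stray files.
        if not os.path.isdir(os.path.join(refseq_dir, name)):
            continue
        # Match the prefix exactly or followed by "_" so that a genus
        # such as "Bacillus" does not also pick up "Bacillusoid_x".
        if name == prefix or name.startswith(prefix + "_"):
            matches.append(name)
    return matches


def list_fna_files(species_dir):
    """Return the sorted .fna file names directly inside one species folder."""
    return sorted(f for f in os.listdir(species_dir) if f.endswith(".fna"))
```

Calling find_species_dirs(refseq_dir, "Escherichia") would cover the whole genus, while passing "coli" as the third argument would restrict the result to that species.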

Adding to a remote Galaxy server

Ensure you specify the Galaxy URL using the -u URL or --url URL options.

directory_to_library.py

Script to make data library of local file/directory structure.

usage: directory_to_library.py [-h] [-u URL] [-k KEY] [-n NAME] [-v]
                               [-t [FILETYPES [FILETYPES ...]]] [-e]
                               [-a [ALLOW_USERS [ALLOW_USERS ...]]]
                               directory

Make a galaxy data library from a file/directory structure.

positional arguments:
  directory             the directory to make a data library from

optional arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     the Galaxy URL
  -k KEY, --key KEY     the Galaxy API key to use (overrides default)
  -n NAME, --name NAME  the name of the data library to create (overrides
                        default). Using an existing data library name will
                        update the existing library.
  -v, --verbose         Print out debugging information
  -t [FILETYPES [FILETYPES ...]], --filetypes [FILETYPES [FILETYPES ...]]
                        A space-separated list of filetypes to include in the
                        data library. Defaults to fna, faa, ffn, gbk, gff
  -e, --exclude         Exclude the file types specified in -t. Defaults to
                        excluding fna, faa, ffn, gbk, gff
  -a [ALLOW_USERS [ALLOW_USERS ...]], --allow_users [ALLOW_USERS [ALLOW_USERS ...]]
                        A space-separated list of emails of users to allow
                        access to the data library. Defaults to None (a public
                        library).
  • Needs an API key in galaxy_key variable, unless specified via command line.
  • Assumes Galaxy instance exists at localhost unless otherwise specified (see section below).

Example use

python directory_to_library.py test/test_directory -u http://116.156.70.12/galaxy -k 123ABC456

Will make a data library called 'test_directory' on the Galaxy instance at http://116.156.70.12/galaxy
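
The example above suggests that, without -n, the library is named after the last component of the directory path. A minimal sketch of that behaviour follows. default_library_name is a hypothetical helper, and the script itself may derive the name differently.

```python
import os


def default_library_name(directory):
    """Return the last path component of directory, ignoring a trailing slash."""
    # normpath drops a trailing separator so basename is never empty.
    return os.path.basename(os.path.normpath(directory))
```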

python directory_to_library.py test/test_directory -u http://116.156.70.12/galaxy -k 123ABC456 -n MyLibrary

Will make a data library called 'MyLibrary' on the Galaxy instance at http://116.156.70.12/galaxy

python directory_to_library.py test/test_directory -u http://116.156.70.12/galaxy -k 123ABC456 -n MyLibrary -t fna faa

Will make a data library called 'MyLibrary' on the Galaxy instance at http://116.156.70.12/galaxy only including files with .fna or .faa extensions.

python directory_to_library.py test/test_directory -u http://116.156.70.12/galaxy -k 123ABC456 -n MyLibrary -t fna faa -e

Will make a data library called 'MyLibrary' on the Galaxy instance at http://116.156.70.12/galaxy including all files except those with .fna or .faa extensions.
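
The interaction between -t and -e can be sketched as a single filter over file names. This is an illustrative sketch only: select_files is a hypothetical helper, and matching on the lower-cased file extension is an assumption about how the script compares filetypes.

```python
import os

# Default filetypes listed in the help text for -t.
DEFAULT_FILETYPES = ("fna", "faa", "ffn", "gbk", "gff")


def select_files(filenames, filetypes=DEFAULT_FILETYPES, exclude=False):
    """Keep files whose extension is in filetypes, or drop them if exclude is set."""
    wanted = {ft.lower().lstrip(".") for ft in filetypes}
    selected = []
    for name in filenames:
        ext = os.path.splitext(name)[1].lstrip(".").lower()
        matches = ext in wanted
        # Include mode keeps matching files; exclude mode keeps the rest.
        if matches != exclude:
            selected.append(name)
    return selected
```

With filetypes fna and faa, include mode would keep only .fna and .faa files, and the same call with exclude=True would keep everything else.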

python directory_to_library.py test/test_directory -u http://116.156.70.12/galaxy -k 123ABC456 -n MyLibrary -a madi@madi.com madi2@madi.com

Will make a data library called 'MyLibrary' on the Galaxy instance at http://116.156.70.12/galaxy and give read permissions only to madi@madi.com and madi2@madi.com. All other non-admin users will be unable to access the data library.

Adding to a remote Galaxy server

Ensure you specify the Galaxy URL using the -u URL or --url URL options.

Updating an existing Galaxy data library

Ensure you specify the data library name you wish to update using the -n NAME or --name NAME options.
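
One way to picture the update behaviour is a lookup by name that falls back to creating the library. The sketch below assumes a client object shaped like BioBlend's gi.libraries (get_libraries() returning a list of dicts, and create_library(name)). Whether directory_to_library.py uses BioBlend is not stated in this README, and get_or_create_library is a hypothetical helper.

```python
def get_or_create_library(libraries, name):
    """Return (library, created) for the data library called name.

    libraries is assumed to offer get_libraries() and create_library(name),
    as BioBlend's LibraryClient does; this is an assumption, not the
    script's confirmed implementation.
    """
    for lib in libraries.get_libraries():
        # Reuse the first non-deleted library with a matching name.
        if lib.get("name") == name and not lib.get("deleted", False):
            return lib, False
    # No match: create a new library with that name.
    return libraries.create_library(name), True
```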

User permissions

  • For new libraries, if -a is unspecified, the library will be public.
  • For existing libraries, if -a is unspecified, no permissions will change.
  • For existing libraries, if -a is specified, the specified users will be appended to the list of existing users with permission, i.e. existing permissions are not overwritten.
  • If you need a more powerful permission manager, see library_permissions.py
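
The rules above can be summarised as a small merge function. This is a sketch of the stated rules only, not the script's code: merge_allowed_users is a hypothetical helper, and returning None to mean "public" is an assumption borrowed from the -a help text.

```python
def merge_allowed_users(requested, existing=(), is_new=True):
    """Return the emails that should have access after applying -a.

    None means the library is public (assumed convention).
    """
    requested = list(requested or [])
    if is_new:
        # New library: public unless -a was given.
        return list(dict.fromkeys(requested)) or None
    if not requested:
        # Existing library without -a: permissions stay unchanged.
        return list(existing)
    # Existing library with -a: append new users, keep order, drop duplicates.
    return list(dict.fromkeys(list(existing) + requested))
```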
