Merge pull request #309 from BU-ISCIII/develop
Release v1.1.0
Shettland authored Sep 16, 2024
2 parents b110cfa + 0d175ca commit 9ec59c7
Showing 15 changed files with 802 additions and 469 deletions.
12 changes: 6 additions & 6 deletions .github/workflows/pypi_publish.yml
@@ -1,8 +1,9 @@
name: Publish package python distribution to Pypi

on:
push:
branches: "main"
release:
types: [published]
workflow_dispatch:

jobs:
build:
@@ -23,14 +24,13 @@ jobs:
- name: Build a binary wheel and a source tarball
run: python3 -m build
- name: Store the distribution packages
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: python-package-distributions
path: dist/

publish-to-pypi:
name: Publish dist to PyPI
if: startsWith(github.ref, 'refs/tags/') # only publish to PyPI on tag pushes
needs:
- build
runs-on: ubuntu-latest
@@ -41,7 +41,7 @@ jobs:
id-token: write
steps:
- name: Download all the dists
uses: actions/download-artifact@v3
uses: actions/download-artifact@v4
with:
name: python-package-distributions
path: dist/
@@ -58,7 +58,7 @@ jobs:
id-token: write
steps:
- name: Download all the dists
uses: actions/download-artifact@v3
uses: actions/download-artifact@v4
with:
name: python-package-distributions
path: dist/
4 changes: 2 additions & 2 deletions .github/workflows/test_modules.yml
@@ -34,7 +34,7 @@ jobs:
env:
OUTPUT_LOCATION: ${{ github.workspace }}/tests/
- name: Upload output file
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: test-output
path: output.txt
@@ -73,7 +73,7 @@ jobs:
env:
OUTPUT_LOCATION: ${{ github.workspace }}/tests/
- name: Upload output file
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: test-output
path: output.txt
11 changes: 10 additions & 1 deletion .github/workflows/test_sftp_handle.yml
@@ -24,9 +24,18 @@ jobs:
echo "Current permission level is ${{ steps.checkAccess.outputs.user-permission }}"
echo "Job originally triggered by ${{ github.actor }}"
exit 1
sleep_to_ensure_concurrency:
needs: security_check
runs-on: ubuntu-latest
steps:
- name:
run: sleep 10s
shell: bash

test_sftp_handle:
needs: security_check
needs: [security_check, sleep_to_ensure_concurrency]
if: github.repository_owner == 'BU-ISCIII'
concurrency:
group: ${{ github.repository }}-test_sftp_handle
cancel-in-progress: false
39 changes: 37 additions & 2 deletions CHANGELOG.md
@@ -4,7 +4,7 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.1.0dev] - 2024-0X-0X : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.1.X
## [1.X.Xdev] - 2024-XX-XX : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.X.X

### Credits

@@ -22,7 +22,42 @@ Code contributions to the hotfix:

### Requirements

## [1.0.0] - 2024-0X-0X : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.0.0
## [1.1.0] - 2024-09-13 : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.1.0

### Credits

Code contributions to the hotfix:

- [Pablo Mata](https://github.com/Shettland)
- [Sara Monzón](https://github.com/saramonzon)

### Modules

- New logs-to-excel function to create an excel file given a list of log-summary.json files [#300](https://github.com/BU-ISCIII/relecov-tools/pull/300)

#### Added enhancements

- Included a way to extract pango-designation version in read-bioinfo-metadata [#299](https://github.com/BU-ISCIII/relecov-tools/pull/299)
- Now log_summary.py also creates an excel file with the process logs [#300](https://github.com/BU-ISCIII/relecov-tools/pull/300)
- Read-bioinfo-metadata splits files and data by batch of samples [#306](https://github.com/BU-ISCIII/relecov-tools/pull/306)
- Included a sleep time in test_sftp-handle to avoid concurrency check failure [#308](https://github.com/BU-ISCIII/relecov-tools/pull/308)

#### Fixes

- Fixes in launch_pipeline including creation of samples_id.txt and joined validated json [#303](https://github.com/BU-ISCIII/relecov-tools/pull/303)
- Fixed failing module_tests.yml workflow due to deprecated upload-artifact version [#308](https://github.com/BU-ISCIII/relecov-tools/pull/308)

#### Changed

- Changed pypi_publish action to publish on every release, no need to push tags [#308](https://github.com/BU-ISCIII/relecov-tools/pull/308)

#### Removed

- Removed only_samples argument in log_summary.py as it was not used in any module. [#300](https://github.com/BU-ISCIII/relecov-tools/pull/300)

### Requirements

## [1.0.0] - 2024-09-02 : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.0.0

### Credits

52 changes: 33 additions & 19 deletions README.md
@@ -24,9 +24,10 @@ relecov-tools is a set of helper tools for the assembly of the different element
- [upload-to-ena](#upload-to-ena)
- [upload-to-gisaid](#upload-to-gisaid)
- [update-db](#update-db)
- [launch-pipeline](#launch-pipeline)
- [logs-to-excel](#logs-to-excel)
- [build-schema](#build-schema)
- [Mandatory Fields](#mandatory-fields)
- [launch-pipeline](#launch-pipeline)
- [custom logs](#custom-logs)
- [Python package mode](#python-package-mode)
- [Acknowledgements](#acknowledgements)
@@ -62,7 +63,7 @@ $ relecov-tools --help
\ \ / |__ / |__ | |___ | | | \ /
/ / \ | \ | | | | | | \ /
/ |--| | \ |___ |___ |___ |___ |___| \/
RELECOV-tools version 1.0.0
RELECOV-tools version 1.1.0
Usage: relecov-tools [OPTIONS] COMMAND [ARGS]...
Options:
@@ -160,7 +161,7 @@ Usage: relecov-tools read-bioinfo-metadata [OPTIONS]
- Note: Software-specific configurations are available in [bioinfo_config.json](./relecov_tools/conf/bioinfo_config.json).

#### validate
`validate` commands validate the data in json format outputted by `read-metadata` command against a json schema, in this case the relecov [schema specification](./relecov_tools/schema/relecov_schema.json).
`validate` commands validate the data in json format outputted by `read-metadata` command against a json schema, in this case the relecov [schema specification](./relecov_tools/schema/relecov_schema.json). It also creates a summary of the errors and warnings found in excel format as a report to the users.

```
$ relecov-tools validate --help
@@ -246,6 +247,35 @@ Usage: relecov-tools upload-to-gisaid [OPTIONS]
-t, --type Select the type of information to upload to database [sample,bioinfodata,variantdata]
-d, --databaseServer Name of the database server receiving the data [iskylims,relecov]

#### launch-pipeline
Create the folder structure to execute the given pipeline for the latest sample batches after executing download, read-lab-metadata and validate modules. This module will create symbolic links for each sample and generate the necessary files for pipeline execution using the information from validated_BATCH-NAME_DATE.json.
```
Usage: relecov-tools launch-pipeline [OPTIONS]
Create the symbolic links for the samples which are validated to prepare for
bioinformatics pipeline execution.
Options:
-i, --input PATH Path to the input folder where sample files are located
-t, --template PATH Path to the pipeline template folder to be copied in the output folder
-c, --config PATH Path to the template config file
-o, --out_dir PATH Path to output folder
--help Show this message and exit.
```
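
The symbolic-link step described above can be sketched roughly as follows. This is only an illustration of the idea, not the module's actual implementation: the function name `link_samples`, the file names, and the `analysis` folder are hypothetical.

```python
import os
import tempfile


def link_samples(sample_files, out_dir):
    """Create one symbolic link per sample file inside out_dir (hypothetical helper)."""
    os.makedirs(out_dir, exist_ok=True)
    links = []
    for src in sample_files:
        dest = os.path.join(out_dir, os.path.basename(src))
        if not os.path.lexists(dest):  # do not overwrite an existing link or file
            os.symlink(os.path.abspath(src), dest)
        links.append(dest)
    return links


# Demo with empty throwaway files standing in for validated sample reads
with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, "input")
    os.makedirs(src_dir)
    samples = []
    for name in ("sample1_R1.fastq.gz", "sample1_R2.fastq.gz"):
        path = os.path.join(src_dir, name)
        open(path, "w").close()
        samples.append(path)
    created = link_samples(samples, os.path.join(tmp, "analysis"))
    all_links = all(os.path.islink(p) for p in created)
```

Linking instead of copying keeps the original downloads untouched and avoids duplicating large sequencing files in the pipeline folder.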

#### logs-to-excel
Creates an xlsx file with all the entries found for a specified laboratory in a given set of log_summary.json files (from log-summary module). The laboratory name must match the name of one of the keys in the provided logs to work.
```
Usage: relecov-tools logs-to-excel [OPTIONS]
Creates a merged xlsx report from all the log summary jsons given as input
Options:
-l, --lab_code Name for target laboratory in log-summary.json files
-o, --output_folder Path to output folder where xlsx file is saved
-f, --files Paths to log_summary.json files to merge into xlsx file, called once per file
```
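
As a rough sketch of why the laboratory name has to match a key in each log file: the entries for a laboratory are looked up by key, so a file without that key contributes nothing. The helper `collect_lab_logs`, the lab names, and the `valid` field below are hypothetical and only serve the illustration; the real structure of log_summary.json may differ.

```python
import json
import os
import tempfile


def collect_lab_logs(paths, lab_code):
    """Return (logs, skipped): entries stored under lab_code, and the files skipped."""
    logs, skipped = [], []
    for path in paths:
        try:
            with open(path, "r") as fh:
                logs.append(json.load(fh)[lab_code])
        except (OSError, json.JSONDecodeError, KeyError, TypeError) as err:
            # Missing file, invalid JSON, or no entry for this laboratory
            skipped.append((path, repr(err)))
    return logs, skipped


# Demo: only the first file contains the requested laboratory key
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "a_log_summary.json")
    other = os.path.join(tmp, "b_log_summary.json")
    with open(good, "w") as fh:
        json.dump({"LAB_A": {"valid": True}}, fh)
    with open(other, "w") as fh:
        json.dump({"LAB_B": {"valid": False}}, fh)
    logs, skipped = collect_lab_logs([good, other], "LAB_A")
```

Returning the skipped files alongside the collected entries makes it easy to report which inputs did not match the requested laboratory.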

### build-schema
The `build-schema` module provides functionality to generate and manage JSON Schema files based on database definitions from Excel spreadsheets. It automates the creation of JSON Schemas, including validation, drafting, and comparison with existing schemas.

@@ -289,22 +319,6 @@ required (Y/N): Indicates if the property is required (Y) or optional (N).
complex_field (Y/N): Indicates if the property is a complex (nested) field (Y) or a standard field (N).
```

#### launch-pipeline
Create the folder structure to execute the given pipeline for the latest sample batches after executing download, read-lab-metadata and validate modules. This module will create symbolic links for each sample and generate the necessary files for pipeline execution using the information from validated_BATCH-NAME_DATE.json.
```
Usage: relecov-tools launch-pipeline [OPTIONS]
Create the symbolic links for the samples which are validated to prepare for
bioinformatics pipeline execution.
Options:
-i, --input PATH Path to the input folder where sample files are located
-t, --template PATH Path to the pipeline template folder to be copied in the output folder
-c, --config PATH Path to the template config file
-o, --out_dir PATH Path to output folder
--help Show this message and exit.
```

#### custom logs
After executing each of these modules, you may find a custom log report in json format named "DATE_EXECUTED-MODULE_log_summary.json". These custom log summaries can be useful to detect errors in metadata in order to fix them and/or notify the users.

76 changes: 72 additions & 4 deletions relecov_tools/__main__.py
@@ -1,11 +1,12 @@
#!/usr/bin/env python
import logging

# import re
import os
import json

# from rich.prompt import Confirm
import click
import relecov_tools.download_manager
import relecov_tools.log_summary
import rich.console
import rich.logging
import rich.traceback
@@ -60,7 +61,7 @@ def run_relecov_tools():
)

# stderr.print("[green] `._,._,'\n", highlight=False)
__version__ = "1.0.0"
__version__ = "1.1.0"
stderr.print(
"\n" "[grey39] RELECOV-tools version {}".format(__version__), highlight=False
)
@@ -487,7 +488,7 @@ def launch_pipeline(input, template, output, config):


# schema builder
@relecov_tools_cli.command(help_priority=13)
@relecov_tools_cli.command(help_priority=14)
@click.option(
"-i",
"--input_file",
@@ -523,5 +524,72 @@ def build_schema(input_file, schema_base, draft_version, diff, out_dir):
schema_update.handle_build_schema()


@relecov_tools_cli.command(help_priority=15)
@click.option(
    "-l",
    "--lab_code",
    type=click.Path(),
    help="Name for target laboratory in log-summary.json files",
    required=True,
)
@click.option(
    "-o",
    "--output_folder",
    type=click.Path(),
    help="Path to output folder where xlsx file is saved",
    required=False,
)
@click.option(
    "-f",
    "--files",
    help="Paths to log_summary.json files to merge into xlsx file, called once per file",
    required=True,
    multiple=True,
)
def logs_to_excel(lab_code, output_folder, files):
    """Creates a merged xlsx report from all the log summary jsons given as input"""
    all_logs = []
    full_paths = [os.path.realpath(f) for f in files]
    for file in full_paths:
        if not os.path.exists(file):
            stderr.print(f"[red]File {file} does not exist")
            continue
        try:
            with open(file, "r") as f:
                all_logs.append(json.load(f)[lab_code])
        except Exception as e:
            stderr.print(f"[red]Could not extract data from {file}: {e}")
    if not all_logs:
        stderr.print("All provided files were empty.")
        exit(1)
    logsum = relecov_tools.log_summary.LogSum(output_location=output_folder)
    merged_logs = logsum.merge_logs(key_name=lab_code, logs_list=all_logs)
    final_logs = logsum.prepare_final_logs(logs=merged_logs)
    logsum.create_logs_excel(logs=final_logs)


@relecov_tools_cli.command(help_priority=16)
@click.option(
    "-c",
    "--config_file",
    type=click.Path(),
    help="Path to config file in yaml format",
    required=True,
)
@click.option(
    "-o",
    "--output_folder",
    type=click.Path(),
    help="Path to output folder",
    required=False,
)
def wrapper(config_file, output_folder):
    """Executes the modules in config file sequentially"""
    process_wrapper = relecov_tools.dataprocess_wrapper.ProcessWrapper(
        config_file=config_file, output_folder=output_folder
    )
    process_wrapper.run_wrapper()


if __name__ == "__main__":
run_relecov_tools()