
Commit

Merge pull request #472 from Knowledge-Graph-Hub/469-merge-fails-due-to-error-tokenizing-data

Addressing errors in tokenizing input data
caufieldjh authored Aug 3, 2023
2 parents 66e310a + 2f8a167 commit 61b4ac9
Showing 5 changed files with 8 additions and 9 deletions.
11 changes: 5 additions & 6 deletions download.yaml
@@ -35,9 +35,6 @@
# drug - target interactions from Drug Central
url: https://unmtid-shinyapps.net/download/DrugCentral/2021_09_01/drug.target.interaction.tsv.gz
local_name: drug.target.interaction.tsv.gz
- -
- url: http://juniper.health.unm.edu/tcrd/download/TCRDv6.12.4.tsv
- local_name: tcrd.zip

#
# PPI from STRING DB
@@ -46,8 +43,8 @@

# This is protein network data (incl. distinction: direct vs. interologs)
# constrained to just human interactions.
- url: https://stringdb-static.org/download/protein.links.full.v11.0/9606.protein.links.full.v11.0.txt.gz
- local_name: 9606.protein.links.full.v11.0.txt.gz
+ url: https://stringdb-static.org/download/protein.links.full.v11.5/9606.protein.links.full.v11.5.txt.gz
+ local_name: 9606.protein.links.full.v11.5.txt.gz

-
# gene to ensembl IDs
@@ -81,7 +78,8 @@

#
# SciBite CORD-19 annotations v1.6
#
+ # As of June 2023 these are no longer available and must be retrieved from previous caches

-
url: https://media.githubusercontent.com/media/SciBiteLabs/CORD19/master/annotated-CORD-19/1.6/pdf_json_part_1.zip
local_name: pdf_json_part_1.zip
@@ -95,6 +93,7 @@
local_name: pmc_json.zip

# SciBite CORD-19 entity co-occurrences v1.2
+ # Also a dead link
-
url: https://media.githubusercontent.com/media/SciBiteLabs/CORD19/master/sentence-co-occurrence-CORD-19/1.2/cv19_scc_1_2.zip
local_name: cv19_scc_1_2.zip
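
The entry removed in this commit saved a `.tsv` URL under the local name `tcrd.zip`. Whether that mismatch is what triggered the tokenizing error is not stated here, but a check along these lines could flag such entries before a merge. This is a minimal sketch, assuming the entries have already been loaded from download.yaml into a list of dicts with `url` and `local_name` keys. `find_extension_mismatches` is a hypothetical helper, not part of kg-covid-19.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def find_extension_mismatches(entries):
    """Return (url, local_name) pairs whose final file extensions differ.

    Hypothetical helper: `entries` is assumed to be a list of dicts with
    "url" and "local_name" keys, as in the download.yaml entries above.
    """
    mismatches = []
    for entry in entries:
        # Compare only the last extension, e.g. ".gz" for "x.tsv.gz".
        url_suffix = PurePosixPath(urlparse(entry["url"]).path).suffix.lower()
        local_suffix = PurePosixPath(entry["local_name"]).suffix.lower()
        if url_suffix != local_suffix:
            mismatches.append((entry["url"], entry["local_name"]))
    return mismatches


# Entries copied from the diff above, including the one this commit removes.
entries = [
    {
        "url": "https://unmtid-shinyapps.net/download/DrugCentral/2021_09_01/drug.target.interaction.tsv.gz",
        "local_name": "drug.target.interaction.tsv.gz",
    },
    {
        "url": "http://juniper.health.unm.edu/tcrd/download/TCRDv6.12.4.tsv",
        "local_name": "tcrd.zip",
    },
    {
        "url": "https://stringdb-static.org/download/protein.links.full.v11.5/9606.protein.links.full.v11.5.txt.gz",
        "local_name": "9606.protein.links.full.v11.5.txt.gz",
    },
]

mismatches = find_extension_mismatches(entries)
for url, local_name in mismatches:
    print(f"extension mismatch: {url} saved as {local_name}")
```

The check compares file names only. It does not fetch anything, so it would not catch the dead SciBite links noted in the comments.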
2 changes: 1 addition & 1 deletion kg_covid_19/transform_utils/drug_central/drug_central.py
@@ -17,7 +17,7 @@
Ingest drug - drug target interactions from Drug Central.
Essentially just ingests and transforms this file:
- http://unmtid-shinyapps.net/download/drug.target.interaction.tsv.gz
+ https://unmtid-shinyapps.net/download/DrugCentral/2021_09_01/drug.target.interaction.tsv.gz
And extracts Drug -> Protein interactions.
"""
2 changes: 1 addition & 1 deletion kg_covid_19/transform_utils/string_ppi/string_ppi.py
@@ -141,7 +141,7 @@ def run(self, data_file: Optional[str] = None) -> None:
"""
if not data_file:
data_file = os.path.join(
- self.input_base_dir, "9606.protein.links.full.v11.0.txt.gz"
+ self.input_base_dir, "9606.protein.links.full.v11.5.txt.gz"
)
os.makedirs(self.output_dir, exist_ok=True)
protein_node_type = "biolink:Protein"
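
This commit changes the STRING version in three places: the `url` and `local_name` in download.yaml and the default path in this file. One way to keep them in step would be to derive both names from a single version constant. This is a minimal sketch, not the repository's approach. `STRING_VERSION`, `HUMAN_TAXON_ID`, `string_links_filename`, and `string_links_url` are hypothetical names, and the URL pattern is simply the one that appears in download.yaml.

```python
# Hypothetical constants: values taken from the file names in this commit.
STRING_VERSION = "11.5"
HUMAN_TAXON_ID = "9606"


def string_links_filename(version: str = STRING_VERSION,
                          taxon: str = HUMAN_TAXON_ID) -> str:
    """Build the local file name for the STRING full-links download."""
    return f"{taxon}.protein.links.full.v{version}.txt.gz"


def string_links_url(version: str = STRING_VERSION,
                     taxon: str = HUMAN_TAXON_ID) -> str:
    """Build the download URL, following the pattern used in download.yaml."""
    filename = string_links_filename(version, taxon)
    return (
        "https://stringdb-static.org/download/"
        f"protein.links.full.v{version}/{filename}"
    )


print(string_links_filename())
print(string_links_url())
```

With helpers like these, a later version bump would touch one constant instead of three strings.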
2 changes: 1 addition & 1 deletion poetry.lock

Binary file not shown.
