Novelty Score Calculation Update #30

pg427 · 2024-09-26T20:05:52Z

Updated novelty score calculations, including integrations of:

1. Clinical Trials/Clinical Approvals - MVP1
2. TDLs - MVP2
3. Gene Distinctiveness - MVP2

maximusunc · 2024-09-27T20:17:03Z

app/novelty/compute_novelty.py

+from known import find_known_results
+from extr_smile_molpro_by_id import mol_to_smile_molpro
+from mol_similarity import find_nearest_neighbors


These three lines are local imports, not package imports, and they fail at import time.

Got it. I've moved the two necessary functions into compute_novelty.py to keep things simpler, so the three files no longer need to be imported. I'll keep them in the repo in case they are needed in the future.

If those files are no longer being used, could you remove them please? If needed in the future, we can always reference them in GitHub history. Ideally any commented out code would also be deleted.

Understood. I have removed the files from the repo and the commented-out code from compute_novelty.py.
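For context on why the original imports failed, here is a minimal sketch. It assumes compute_novelty.py is loaded as part of a package (such as app.novelty) rather than run as a standalone script, which the thread does not state explicitly. Under that assumption, `from known import ...` is looked up as a top-level module, while a relative import resolves against the package. The package name `demo_novelty` and the files `absolute_style.py` and `relative_style.py` are hypothetical and exist only for this demonstration; the PR itself resolved the problem differently, by moving the functions into compute_novelty.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a throwaway package that mimics a module with local helper files."""
    pkg = root / "demo_novelty"  # hypothetical package name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "known.py").write_text(
        "def find_known_results():\n    return 'known'\n"
    )
    # Same style as the imports in the diff: treated as a top-level module.
    (pkg / "absolute_style.py").write_text(
        "from known import find_known_results\n"
    )
    # Relative import: resolved inside the package.
    (pkg / "relative_style.py").write_text(
        "from .known import find_known_results\n"
    )


def try_import(module_name: str) -> str:
    """Return 'ok' if the module imports, otherwise a short failure message."""
    try:
        importlib.import_module(module_name)
        return "ok"
    except ModuleNotFoundError as exc:
        return f"failed: {exc}"


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        absolute_result = try_import("demo_novelty.absolute_style")
        relative_result = try_import("demo_novelty.relative_style")
    finally:
        sys.path.remove(tmp)

print("absolute:", absolute_result)
print("relative:", relative_result)
```

Either a relative import or inlining the helpers avoids the failure; inlining, as done in this PR, also removes the extra files.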

maximusunc · 2024-09-27T20:55:33Z

app/novelty/compute_novelty.py

+                           'Gene Distinctiveness', 'TDLs', 'novelty_score']
+        df_numpy = pd.DataFrame([["NAN"]*len(column_list)]*len(message['results']), columns = column_list)
+
+    return df_numpy


The final output used to only have drug and novelty_score, with drug being the curie. Is that now Result ID? Also, at least for the query I'm trying, every value is NaN. If that's expected, could we make the default value be 0 or something?

Yes. Since now we have MVP1 and MVP2 queries I changed the 'drug' to 'Result ID'. Can you send me the query? The lowest value is 0 for the novelty score. If it's showing NaN then there might be an issue I'd like to look a bit closer into. I'm assuming the "message" key from the json response is being passed from the merged Response from the ARS.

I'm working with a smaller message, and a large message. The smaller message seems to be fine, but the larger message returns all NaNs. The CI PK for that larger message is d1e6d43f-cb3e-4c6b-a6bc-5393f576053f.

I checked the PK, and it seems the message returned error code 422. That is probably why you're seeing the NaN. I kept the NaN on purpose to distinguish a novelty score of 0 from cases where the code errors out. At the moment the code only works for messages with status code 200 and shows NaNs otherwise. Do you want me to switch these to 0s?

The 422 error was given by the ARS, but that's not necessarily indicative of what the Appraiser did. I can pass that message to the Appraiser and it handles it fine, other than the novelty code, so I think something needs to be fixed in this code.

FYI, ALL fields are set to NaN, not just novelty_score.
FYI FYI, if the novelty code errors out, I'm already setting the default novelty to 0.0, so I think it would be a good idea to have that same default value in your code instead of a NaN (which I think is actually the string "NaN" instead of the value NaN).
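The string-versus-value distinction raised here can be illustrated with a minimal sketch in plain Python. `is_missing` is a hypothetical helper written for this illustration and is not part of the PR.

```python
import math

string_default = "NAN"         # what the diff above writes into every cell
float_default = float("nan")   # an actual floating-point NaN
numeric_default = 0.0          # the default already used by the caller


def is_missing(value) -> bool:
    """True only for a real float NaN, not for a string that spells it."""
    return isinstance(value, float) and math.isnan(value)


print(is_missing(string_default))   # the string is an ordinary value
print(is_missing(float_default))    # the float NaN is detected
print(is_missing(numeric_default))  # 0.0 is a valid score
```

Downstream code that checks for missing values (for example `pd.isna`) would likewise treat the string as a regular value, which is one reason a numeric default is easier to consume.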

Ok. I've changed the default "NaN" strings (for instances when an error is encountered) to 0 for all entries in the DataFrame generated.

So now the default value is 0, but ALL fields are still 0. I'm also not seeing any error messages come out of novelty, so it looks like everything went fine. In my opinion, Query ID and Result ID should always have values and not be 0, regardless of any errors.
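One way to meet this request, sketched under stated assumptions: build the fallback rows so that the identifier columns are always filled in and only the score columns receive the default. The function name `fallback_rows` and its parameters are hypothetical; how the query ID and result IDs are extracted from the message is outside this sketch. Only the three score columns visible in the diff are used here, and the full column list in the PR is longer.

```python
from typing import Dict, List, Union

# Score columns visible in the diff; the PR's full column list is not reproduced.
SCORE_COLUMNS = ["Gene Distinctiveness", "TDLs", "novelty_score"]
DEFAULT_SCORE = 0.0


def fallback_rows(
    query_id: str, result_ids: List[str]
) -> List[Dict[str, Union[str, float]]]:
    """Return one row per result with real identifiers and default scores."""
    rows = []
    for result_id in result_ids:
        row: Dict[str, Union[str, float]] = {
            "Query ID": query_id,
            "Result ID": result_id,
        }
        # Only the score columns fall back to the default value.
        for column in SCORE_COLUMNS:
            row[column] = DEFAULT_SCORE
        rows.append(row)
    return rows


rows = fallback_rows("example-query", ["CHEBI:0001", "CHEBI:0002"])
for row in rows:
    print(row)
```

Passing these rows to `pd.DataFrame(rows)` would give a frame with the same shape as the error-path output, with identifiers preserved.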

Novelty Score Calculation Update

79254fa

Updated novelty score calculations, including integrations of: 1. Clinical Trials/Clinical Approvals - MVP1 2. TDLs - MVP2 3. Gene Distinctiveness - MVP2

maximusunc requested changes Sep 27, 2024

pg427 added 5 commits September 27, 2024 17:26

Updated compute_novelty.py

b483302

Removed local imports from files and transferred functions from the files to compute_novelty.py

Obsolete files removed

3b2f40c

Due to code import in compute_novelty.py, removed files: known.py, molecular_similarity.py, extr_smile_molpro_by_id.py

Removed commented code

c291ae1

Removed commented code from compute_novelty.py

Updated compute_novelty.py

0d257b1

Switched error cases to use 0s instead of "NAN" in the DataFrame.

Recency computation latency resolved

1a36292

Redis is now used in the function get_publication_info so that the recency score is computed much faster.
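The commit does not show how Redis is wired in, so the following is only a sketch of a common cache-aside pattern that fits the description. The function `get_publication_info_cached`, the `pubinfo:` key prefix, and the `fetch` callback are all hypothetical. `InMemoryCache` is a stand-in that mimics the `get` and `set` calls of a redis-py client so the example runs without a Redis server.

```python
import json
from typing import Any, Callable, Dict, Optional


class InMemoryCache:
    """Stand-in with the same get/set call shape as a redis-py client."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def get(self, name: str) -> Optional[str]:
        return self._store.get(name)

    def set(self, name: str, value: str) -> bool:
        self._store[name] = value
        return True


def get_publication_info_cached(
    pub_id: str, cache: Any, fetch: Callable[[str], dict]
) -> dict:
    """Return cached publication info, calling fetch only on a cache miss."""
    key = f"pubinfo:{pub_id}"  # hypothetical key scheme
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    info = fetch(pub_id)
    cache.set(key, json.dumps(info))
    return info


fetch_calls = []


def slow_fetch(pub_id: str) -> dict:
    """Placeholder for the expensive lookup; records each call."""
    fetch_calls.append(pub_id)
    return {"id": pub_id, "year": 2020}


cache = InMemoryCache()
first = get_publication_info_cached("PMID:1", cache, slow_fetch)
second = get_publication_info_cached("PMID:1", cache, slow_fetch)
print(first == second, len(fetch_calls))
```

With a real Redis client in place of `InMemoryCache`, repeated lookups for the same publication would skip the slow fetch, which is consistent with the latency improvement described in the commit.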
