feat(scrape_pacer_free_opinions): apply task recap_document_into_opinions #4638

grossir · 2024-10-31T01:03:38Z

Features:

- Add `corpus_importer.tasks.recap_document_into_opinions` to the `scrape_pacer_free_opinions` chain, so that RECAP documents are ingested into the case law database in real time.
- Tweak the task so it can be used by both the `scrape_pacer_free_opinions` and `recap_into_opinions` commands.

Refactors:

- Renamed task `ingest_recap_document` to `recap_document_into_opinions`.
- Renamed task `extract_recap_document` to `extract_recap_document_for_opinions`.
- Deleted the duplicated function `extract_recap_document` in the `recap_into_opinions` command file; deleted unused imports and code.
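
For readers unfamiliar with task chains, here is a minimal plain-Python sketch of the idea behind appending an ingestion step to an existing chain while keeping that step callable on its own. It is not the code in this PR: `run_chain`, `fetch_free_document`, `recap_document_into_opinions_sketch`, and the `INGESTED` list are hypothetical stand-ins, and the assumption that the shared task tolerates a `None` input from an upstream step is mine, not something the description states.

```python
from typing import Any, Callable, List, Optional

Step = Callable[[Any], Any]


def run_chain(steps: List[Step], initial: Any) -> Any:
    """Run steps in order, passing each result to the next step."""
    result = initial
    for step in steps:
        result = step(result)
    return result


def fetch_free_document(pk: int) -> Optional[int]:
    """Hypothetical upstream step: returns the pk, or None if nothing was fetched."""
    return pk if pk > 0 else None


# Hypothetical record of ingested document ids, standing in for the database.
INGESTED: List[int] = []


def recap_document_into_opinions_sketch(pk: Optional[int]) -> Optional[int]:
    """Hypothetical shared step.

    Assumption: it accepts None so it can sit at the end of a chain whose
    earlier step may produce nothing, and it can also be called directly
    with a document id by a batch command.
    """
    if pk is None:
        return None
    INGESTED.append(pk)
    return pk
```

Under these assumptions, a scraper-style caller would run `run_chain([fetch_free_document, recap_document_into_opinions_sketch], pk)`, while a batch command would call `recap_document_into_opinions_sketch(pk)` directly for each document it selects.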

sentry-io · 2024-10-31T01:03:46Z

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: cl/corpus_importer/tasks.py

| Function | Unhandled Issue |
| --- | --- |
| `ingest_recap_document` | HTTPStatusError: Server error '500 Internal Server Error' for url 'http://cl-doctor:5050/extract/recap/text/?strip... ... `Event Count:` 3 |



mlissner · 2024-10-31T19:39:00Z

cl/corpus_importer/management/commands/recap_into_opinions.py

@@ -89,6 +49,8 @@ def import_opinions_from_recap(

        # Manually select the replica db which has an addt'l index added to
        # improve this query.


This is a one-time script, right? Let's make sure we remove that index when we're done. Can we make a sub-issue to remember to do that? We need to be really careful not to permanently add indexes we won't use in the future.

grossir force-pushed the regular_ingestion_pacerfree_to_caselaw branch 2 times, most recently from 3d4ab38 to 79957bc Compare October 31, 2024 01:13

grossir requested a review from flooie October 31, 2024 01:17

grossir force-pushed the regular_ingestion_pacerfree_to_caselaw branch from 79957bc to 7ca17a0 Compare October 31, 2024 03:31

mlissner reviewed Oct 31, 2024


Merge branch 'main' into regular_ingestion_pacerfree_to_caselaw

6891ce0
