Possible "broken" URLs in FIBO PROD literals #2018

Open
mereolog opened this issue May 13, 2024 · 3 comments

@mereolog
Contributor

The attached table collects 500+ triples from FIBO PROD (commit: dbdd526) where the object value is a URL that possibly no longer resolves.

fibo_prod_possibly_broken_urls_dbdd526.xlsx

The check is not 100% precise. For example, links from the https://www.ffiec.gov domain, such as https://www.ffiec.gov/npw/Help/InstitutionTypes or https://www.ffiec.gov/nicpubweb/Content/DataDownload/NPW%20Data%20Dictionary.pdf, are classified as 403 although they can be accessed via a web browser.
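One possible reason for a scripted check seeing 403 where a browser succeeds is that some servers refuse clients that do not send browser-like headers. The following is a minimal sketch using only the Python standard library, not the check that produced the attached table; the function names, the User-Agent string, and the "inconclusive" verdict are all assumptions made for illustration.

```python
import urllib.error
import urllib.request

# Hypothetical header value; any browser-like string could be tried instead.
BROWSER_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) link-check-sketch"


def build_request(url: str) -> urllib.request.Request:
    """Build a HEAD request that carries a browser-like User-Agent."""
    return urllib.request.Request(
        url, headers={"User-Agent": BROWSER_USER_AGENT}, method="HEAD"
    )


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse verdict for the report."""
    if 200 <= status < 400:
        return "ok"
    if status in (401, 403, 429):
        # The server answered but refused this client: not proof of a dead link.
        return "inconclusive"
    return "broken"


def check_url(url: str, timeout: float = 10.0) -> str:
    """Request the URL and return a verdict (performs network I/O)."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx responses; the status code is still available.
        return classify_status(err.code)
    except OSError:
        # DNS failures, refused connections, and timeouts end up here.
        return "broken"
```

Reporting 403 as "inconclusive" rather than "broken" keeps such rows visible for a manual look without counting them as dead links.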

In many of the other cases, though not all, the 'http' scheme should be updated to 'https'.
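A scheme update of this kind could be proposed mechanically. The sketch below assumes the Python standard library; `upgrade_to_https` is a hypothetical helper name, and its result is only a candidate that would still need to be requested and verified before replacing a literal, since not every host serves the same content over https.

```python
from urllib.parse import urlsplit, urlunsplit


def upgrade_to_https(url: str) -> str:
    """Return the URL with an 'http' scheme replaced by 'https'."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url  # already https, or some other scheme: leave untouched
    return urlunsplit(parts._replace(scheme="https"))
```

URLs with an explicit port (for example `:80`) are passed through with the port unchanged, so those would need separate handling.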

@ElisaKendall
Contributor

@mereolog - after discussion on the FIBO DER telecon this afternoon, we think you should exclude 403 errors. We did a spot check on those, and they resolved via a browser, as you mentioned, so that "doesn't count" as broken.

@ElisaKendall
Contributor

@mereolog - also per our conversation on the FIBO DER telecon, please exclude all of the URLs that appear in the MarketsIndividuals ontology in FBC. They are provided by ISO, so we have no control over what ISO lists as the website of an exchange or other market participant.
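The two exclusions requested in this thread (403 responses and anything from MarketsIndividuals) could be applied as a post-filter on the collected rows. This is a sketch under stated assumptions: the row keys `subject` and `status` are hypothetical, since the column layout of the spreadsheet is not described here, and matching on the `/MarketsIndividuals/` path segment is an assumed way of recognizing IRIs from that ontology.

```python
from typing import Iterable, List, Mapping

# Status codes the thread asks to leave out of the report.
EXCLUDED_STATUSES = frozenset({403})
# Assumption: subjects defined in that ontology carry this segment in their IRI.
EXCLUDED_IRI_FRAGMENT = "/MarketsIndividuals/"


def is_reportable(row: Mapping[str, object]) -> bool:
    """Return True if the row should stay in the broken-URL report."""
    if row["status"] in EXCLUDED_STATUSES:
        return False
    return EXCLUDED_IRI_FRAGMENT not in str(row["subject"])


def filter_rows(rows: Iterable[Mapping[str, object]]) -> List[Mapping[str, object]]:
    """Keep only the rows that pass both exclusion rules."""
    return [row for row in rows if is_reportable(row)]
```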

@mereolog
Contributor Author

> @mereolog - after discussion on the FIBO DER telecon this afternoon, we think you should exclude 403 errors. We did a spot check on those, and they resolved via a browser, as you mentioned, so that "doesn't count" as broken.

I would suggest checking which URL the browser actually resolves to. For example, a request to http://www.otcmarkets.com is redirected to https://www.otcmarkets.com, so some of these 403s can be interpreted as warnings that the scheme needs updating.
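Comparing the requested URL with the URL finally served would make that distinction explicit. The sketch below assumes the Python standard library; both function names are hypothetical, and `resolve_final_url` lets any HTTP error (including a 403) propagate to the caller as `urllib.error.HTTPError`.

```python
import urllib.request
from urllib.parse import urlsplit


def differs_only_by_scheme(original: str, final: str) -> bool:
    """True when the final URL equals the original apart from http -> https."""
    a, b = urlsplit(original), urlsplit(final)
    same_rest = (
        a.netloc.lower() == b.netloc.lower()
        and (a.path or "/") == (b.path or "/")  # treat "" and "/" as the same path
        and a.query == b.query
    )
    return a.scheme == "http" and b.scheme == "https" and same_rest


def resolve_final_url(url: str, timeout: float = 10.0) -> str:
    """Follow redirects and return the URL finally served (network I/O)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.geturl()
```

A row for which `differs_only_by_scheme` returns true could then be reported as "update scheme" rather than "broken".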
