Fetch OSM for any region from Protomaps extract API #842
Fetch OSM data with minutely updates applied using the OSM extract service from https://protomaps.com. This is a minimum viable endpoint and backend component with background TaskAction progress reporting. A successful download yields an OSMPBF DataSource. This is very similar to what we were doing in the past with Vanilla Extract, but it uses a much more mature tool, https://github.com/protomaps/OSMExpress. (In fact, there's an experimental branch of Vanilla Extract taking some cues from OSMExpress at https://github.com/conveyal/vanilla-extract/tree/lmdb.)
This is not really intended to be used as-is, but as a sketch of how we might treat OSM (and eventually GTFS) as both uploadable and auto-fetchable data sources, which can then be selected and combined into networks. It also serves as working example code for interacting with this particular extract service.
This requires a Protomaps API token from https://app.protomaps.com/dashboard. I believe the sign-up process is not public yet. I have a key and have tested this out; it seems to work exactly as intended.
Some important caveats:
- The extracts appear to be cut in a way equivalent to `--strategy complete_ways` in Osmium, so long ways may extend very far outside the bounding box. This can be relatively harmless if it's a few short segments, but could also lead to very oversized bounds or analysis areas, or at the very least the inclusion of lots of data that don't actually affect the analysis at hand.
- For production use, it would be advantageous to perform some simple filtering on the fly in OSMExpress, skipping over ways with certain tags, and possibly truncating ways that extend too far outside the bounding box.
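
To make the filtering idea above concrete, here is a minimal sketch in plain Python. It is not OSMExpress or R5 code: the `BBox` and `Way` types, the `filter_ways` function, the `SKIPPED_TAGS` set and the margin value are all hypothetical names and choices made for illustration. It assumes ways are already available in memory as lists of (lon, lat) pairs, drops ways carrying certain tags, and trims the leading and trailing nodes of any way that runs beyond a margin around the requested bounding box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Hypothetical bounding box in degrees (WGS84)."""
    west: float
    south: float
    east: float
    north: float

    def expanded(self, margin_deg: float) -> "BBox":
        # Grow the box by the same margin on every side.
        return BBox(self.west - margin_deg, self.south - margin_deg,
                    self.east + margin_deg, self.north + margin_deg)

    def contains(self, lon: float, lat: float) -> bool:
        return self.west <= lon <= self.east and self.south <= lat <= self.north


@dataclass
class Way:
    """Hypothetical in-memory way: an id, its tags, and (lon, lat) nodes."""
    way_id: int
    tags: dict
    nodes: list


# Assumed examples of tags that tend to produce very long ways.
SKIPPED_TAGS = {("route", "ferry"), ("boundary", "administrative")}


def filter_ways(ways, bbox, margin_deg=0.05, skipped_tags=SKIPPED_TAGS):
    """Yield ways with unwanted tags removed and far-reaching ends trimmed."""
    limit = bbox.expanded(margin_deg)
    for way in ways:
        # Skip ways carrying any of the unwanted key/value pairs.
        if any((k, v) in skipped_tags for k, v in way.tags.items()):
            continue
        inside = [i for i, (lon, lat) in enumerate(way.nodes)
                  if limit.contains(lon, lat)]
        # Drop ways that have fewer than two nodes within the margin.
        if len(inside) < 2:
            continue
        # Keep the span from the first to the last node within the margin.
        # Nodes between them are kept even if they briefly leave the box.
        trimmed = way.nodes[inside[0]:inside[-1] + 1]
        yield Way(way.way_id, way.tags, trimmed)


if __name__ == "__main__":
    box = BBox(west=2.0, south=-1.0, east=4.0, north=1.0)
    long_road = Way(1, {"highway": "primary"},
                    [(float(x), 0.0) for x in range(11)])
    ferry = Way(2, {"route": "ferry"}, [(3.0, 0.0), (9.0, 0.0)])
    for kept in filter_ways([long_road, ferry], box, margin_deg=0.5):
        print(kept.way_id, len(kept.nodes))
```

Trimming by node index keeps the sketch simple, but it cuts ways at existing nodes rather than at the box edge, so a long segment crossing the margin is dropped entirely. A production version would probably need to split segments geometrically and decide how to treat relations that reference trimmed ways.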