Ogr2OgrExecOutput issues #76

fsteggink · 2018-06-13T08:45:53Z

I've noticed a couple of issues which make Ogr2OgrExecOutput a bit less flexible than necessary. I'm writing them here, because of pending changes in execoutput in PR #75, so these changes can be done once that PR has been closed (either accepted or dismissed). I'm prepared to do these changes myself. For now it would be wise to have a discussion about the proposed changes.

Handling of position arguments. The position arguments of ogr2ogr are dst_datasource_name src_datasource_name [layer [layer ...]]. The rest are keyword arguments and they can be ignored for this matter. So you'll need at least two position arguments for ogr2ogr to work. They can be expanded indefinitely by adding layers. So, I'd like to propose that, when composing the ogr2ogr command string the keyword arguments are all put in front of the position arguments. This means that self.dest_data_source should only be added to the command in the execute method. We're not handling layers yet, but they could be added to in the future.
The dest_data_source should be optional. For my current use case I'm exporting data from PostgreSQL to a couple of different files. I'm using a combination of LineStreamerFileInput (list of data to be exported) + FormatConverter (split list into separate strings) + StringSubstitutionFilter (handling temp dir) + Ogr2OgrExecOutput. So, ogr2ogr is invoked multiple times. In my command file I'm specifying the dest source, as well as an -sql option, which is also different for every invocation. In my config I've defined an empty dest_data_source.
Add an optional parameter for the source data source. In my use case (see above) this is always the same, namely the Postgres connection string.
Make it configurable whether the lco options should be only called once or during every invocation of ogr2ogr. Currently I've put a -lco argument in the options string.

As you might guess, I'm looking for a solution where not only the source can change after each invocation, but also the destination and parameters. Ideally they should be passed in through a record. While this is possible, I think this is a next step in the evolution of this output object.

justb4 · 2018-07-04T21:06:20Z

OK, #75 has been integrated, my proposal is to first release this week v.1.2 (Milestone 1.2) for which all issues, including #75 are closed/merged now.

How to progress further? Actually Stetl started as a pipeline of Bash scripts, calling xsltproc, ogr2ogr etc via Linux pipeline symbols (|), hence this notation still in [etl] Chains.

When moving to Python the first approach was to embed ogr2ogr in an ogr2ogr Input class that evolved into OgrInput, i.e. using GDAL Python bindings. The ogr2ogr Output class was (and is) initially Ogr2OgrOutput, which had limitations, but had one parameter specifying the entire ogr2ogr command line with options. See here.

Later I tried to develop a 'native' OgrOutput (see this example) with GDAL Python bindings, but this was never really completed.

IMHO it is still a bit hacky to embed commands/programs in any Stetl Component. Though Ogr2OgrExecOutput, is powerful, like in NLExtract, we, as this issue states, still have to deal with the myriad of commandline options. Plus:

it requires special installation and invocation of ogr2ogr on different platforms (Windows, Linux, Mac)
it is hard to control any error situations
we need the "temp input file"

Long story short: I would rather like to move to and enhance the native Python-binded OgrOutput. At this point I can't assess all the issues/complexity involved, but this (native OgrOutput) seems a much cleaner solution for the future...

justb4 · 2020-07-22T13:43:34Z

Hasn't this been (partly) solved? e.g. "lco options" via #95?

fsteggink · 2020-08-05T08:47:59Z

The last issue (LCO) has been solved via #95. AFAIK, the rest still stands. When I'm in the occasion, I can submit a few PRs.
I agree that we should give more attention to OgrOutput, but I have no idea how much work still needs to be done.

justb4 added the enhancement label Jul 4, 2018

justb4 added this to the Version 1.3 milestone Jul 4, 2018

justb4 assigned justb4 and fsteggink Jul 4, 2018

justb4 modified the milestones: Version 1.3, Version 2.0, Version 2.1 Mar 18, 2019

justb4 modified the milestones: Version 2.1, Version 2.2 Jul 22, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ogr2OgrExecOutput issues #76

Ogr2OgrExecOutput issues #76

fsteggink commented Jun 13, 2018

justb4 commented Jul 4, 2018

justb4 commented Jul 22, 2020

fsteggink commented Aug 5, 2020

Ogr2OgrExecOutput issues #76

Ogr2OgrExecOutput issues #76

Comments

fsteggink commented Jun 13, 2018

justb4 commented Jul 4, 2018

justb4 commented Jul 22, 2020

fsteggink commented Aug 5, 2020