Stetl bgt improvements #69

fsteggink · 2018-02-08T17:14:48Z

While working on improving NLExtract's BGT Extract, I've found it necessary to add two filters, and improve two other filters. The changes should be self-explanatory. If not, please let me know.

justb4 · 2018-02-10T16:47:52Z

The Travis build can easily be fixed: just run flake8 in the project root. There are only some minor code formatting issues; all regular tests passed.

fsteggink · 2018-02-27T08:56:11Z

Thanks Just, I didn't know that tool. Could be useful for NLExtract as well ;)

fsteggink · 2018-02-27T09:07:58Z

I'm also considering more improvements to Stetl for the BGT extract. Right now I have "hacked" a way in NLExtract to prepare a custom GFS which contains only the feature type and also the feature count. This greatly improves the import speed with ogr2ogr. I think it is useful to add this as a filter in Stetl as well. It will depend on OGR on the command line and LXML.
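
To make the idea concrete, here is a minimal sketch of writing such a stripped-down GFS. It rests on several assumptions: the element names follow the .gfs layout that OGR's GML driver writes, the function name `build_minimal_gfs` and the sample values are hypothetical, and the standard library's ElementTree is used in place of LXML to keep the example self-contained. What NLExtract actually generates is not shown in this thread and may differ.

```python
import xml.etree.ElementTree as ET


def build_minimal_gfs(feature_type, feature_count):
    """Return GFS XML holding only a feature type and its feature count.

    Hypothetical helper for illustration; not part of Stetl or NLExtract.
    """
    root = ET.Element("GMLFeatureClassList")
    feature_class = ET.SubElement(root, "GMLFeatureClass")
    # Name and ElementPath identify the feature type in the GML file.
    ET.SubElement(feature_class, "Name").text = feature_type
    ET.SubElement(feature_class, "ElementPath").text = feature_type
    # A known feature count is recorded here so it need not be recomputed.
    info = ET.SubElement(feature_class, "DatasetSpecificInfo")
    ET.SubElement(info, "FeatureCount").text = str(feature_count)
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
gfs_xml = build_minimal_gfs("wegdeel", 1234)
print(gfs_xml)
```

In a real filter the count would presumably come from the OGR command line tools rather than being passed in by hand.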

justb4 · 2018-02-27T10:04:46Z

stetl/filters/templatingfilter.py

+    @Config(ptype=bool, default=False, required=False)
+    def safe_substitution(self):
+        """
+        Apply safe substitution?


Possibly add more explanation to the comment (I did not know, for example, about this standard option in Python Templates), such as:
if placeholders are missing from mapping and keywords, instead of raising an exception, the original placeholder will appear in the resulting string intact.

fsteggink

Fair point. Usually I don't add comments for things which can be easily looked up.
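
For readers who have not met this option either, a short sketch of the difference using Python's standard `string.Template`. The template text and values are made up for illustration and are not taken from Stetl.

```python
from string import Template

# Illustrative template with two placeholders; only one value is supplied.
template = Template("Hello $name, your layer is $layer")
values = {"name": "BGT"}

# safe_substitute leaves placeholders without a value intact.
safe_result = template.safe_substitute(values)

# substitute raises KeyError for the first missing placeholder.
try:
    template.substitute(values)
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print(safe_result)
print(missing_key)
```

With `safe_substitution` enabled, a templating filter would presumably behave like the first call; with the default, like the second.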

justb4 · 2018-02-27T10:11:26Z

stetl/filters/regexfilter.py

@@ -0,0 +1,61 @@
+#!/usr/bin/env python


Useful, but an example would help; it is hard to grasp otherwise. Suggestions:

- Can't the regexes be compiled once during init?
- Are more uses expected? Maybe a base class RegexFilter with subclasses such as RegexToRecordFilter?

fsteggink

Compilation: good point.
More uses: I haven't thought about it yet. It is possible, but at the moment I don't have any other concrete use cases. Looking at the possible formats, I think only struct would be a good option. Although formats like geojson_feature, ogr_feature and etree_element could represent the parsed data, they are too specialized. The output of regexfilter, a dictionary, is not something you would typically write directly.
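
As an illustration of both points (compiling once, and returning a record from named groups), a minimal sketch follows. The class `RegexToRecord`, its method names, the pattern, and the input string are all hypothetical; this is not the code of the RegexFilter in this PR.

```python
import re


class RegexToRecord:
    """Hypothetical stand-in for a regex-to-record filter."""

    def __init__(self, pattern_string):
        # Compile once at construction time instead of on every call.
        self.regex = re.compile(pattern_string)

    def parse(self, text):
        # Named groups become the keys of the returned record (a dict).
        match = self.regex.search(text)
        return match.groupdict() if match else None


# Illustrative pattern and input.
parser = RegexToRecord(r"(?P<layer>\w+)\s+\((?P<count>\d+) features\)")
record = parser.parse("wegdeel (1234 features)")
no_record = parser.parse("no match here")
print(record)
```

Note that all group values come back as strings; any type conversion would be up to a later step in the chain.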

justb4

Ok with PR, you may fill in the suggestions.

fsteggink · 2018-02-27T13:53:09Z

I have added unit tests for the new filter classes. I haven't added a unit test for my change to the StringTemplatingFilter, since unit tests for that filter are missing entirely.

We should continue working on them, but I'd like to do that outside of the scope of this PR.

justb4 · 2018-02-27T14:09:50Z

Ok, great, merged. Yes, I also started adding tests, e.g. for splitting, and there is a separate issue #40 for unit testing.

fsteggink · 2018-02-27T14:11:07Z

Thanks for the quick merge after these fixes!

fsteggink added 4 commits February 8, 2018 18:03

Make deletion of extracted files optional in ZipFileExtractor

030521c

Added option to use safe_substitute instead of substitute in StringTemplatingFilter

c6f5e5a

Added new ExecFilter, which executes a command and returns it output as packet data

2c06228

Added new RegexFilter, which parses data from a string using a regex with named groups. The extracted data is returned as a record.

bf020b4

justb4 self-requested a review February 10, 2018 16:44

justb4 assigned fsteggink Feb 10, 2018

justb4 added the enhancement label Feb 10, 2018

justb4 added this to the Version 1.2 milestone Feb 10, 2018

Solved some flake8 issues

a5bfef6

justb4 reviewed Feb 27, 2018

fsteggink added 3 commits February 27, 2018 13:37

Small improvements, per Just's recommendations

4d75bba

Added unit test for RegexFilter

3db9655

Added unit test for CommandExecFilter

07670a6

justb4 merged commit d50832b into geopython:master Feb 27, 2018

fsteggink deleted the stetl_bgt_improvements branch February 27, 2018 14:17
