ProcessOutput #233

saltzberg · 2017-04-07T20:18:57Z

About 2x faster at reading stat files and can parse stat files based on a list of score criteria.

Support for statv1

benmwebb

Looks generally OK, and seems like the tests all pass now, so I'll be happy to merge this once you've made the changes I suggested. Oh, and run the file through tools/dev_tools/python_tools/reindent.py or autopep8. The indentation looks a little wonky to me in places.

benmwebb · 2017-04-19T22:23:18Z

pyext/src/output.py

+
+        #Store these keys in a dictionary. Example pair: {109 :'Total_Score'}
+        self.dict = self.parse_line(line, header=True)
+        #self.dict = ast.literal_eval(line)


Remove, not comment out, old stuff. If we want to go back, that's what git is for.

benmwebb · 2017-04-19T22:24:16Z

pyext/src/output.py

+            for k in kkeys:
+                self.inv_dict.update({self.dict[k]: k})
+        else:
+            print("WARNING: statfile v1 is deprecated.  Please convert to statfile v2")


Nothing wrong with your patch here, but when you update it, you should use the "proper" deprecation function, as in current develop.
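For illustration only, a minimal sketch of replacing the `print` with a real deprecation warning. It uses the standard library's `warnings` module as a stand-in, not the deprecation function in current develop, and `warn_statfile_v1` is a hypothetical name.

```python
import warnings


def warn_statfile_v1():
    """Emit a deprecation warning for statfile v1 (hypothetical helper)."""
    # Stand-in for the project's own deprecation function, which is not
    # shown in this review; warnings.warn is plain standard-library Python.
    warnings.warn(
        "statfile v1 is deprecated. Please convert to statfile v2",
        DeprecationWarning,
        stacklevel=2,
    )
```

Unlike a bare `print`, a warning of this kind can be filtered, logged, or turned into an error by the caller.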

benmwebb · 2017-04-19T22:25:54Z

pyext/src/output.py

        f.close()

+
+    def parse_line(self, line, header=False):
+        # Parses a line and returns a dictionary of key:value pairs


Use a docstring ("foo") rather than comment (#foo) here so it gets picked up by doxygen. (If you don't want it to be part of the public interface - which requires documentation - make it a private function by calling it _parse_line rather than parse_line.)

benmwebb · 2017-04-19T22:26:27Z

pyext/src/output.py

+            fd = split[i]
+            fields = fd.split(",")   # split via commas to get key:value pair
+            for h in fields[0:-1]: 
+                if h != "":  # For some reason, there is occasionally an empty field. Ignoring these seems to work.


Don't duplicate all this code from above. Put it in a function.

benmwebb · 2017-04-19T22:27:32Z

pyext/src/output.py

    def get_keys(self):
+        """ Returns a list of the string keys that are included in this dictionary
+        """
+        self.klist = [k[1]


Do this only once, not each time the function is called. Easiest would just be to setup klist when the file is first parsed.
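A sketch of the idea, using a hypothetical stand-in class rather than the real `ProcessOutput`. It assumes the key list is simply the names stored in the header dictionary; the actual expression in the patch is truncated in this diff.

```python
class KeyCache:
    """Hypothetical stand-in showing only the key-list caching."""

    def __init__(self, header_dict):
        # header_dict maps integer keys to names, e.g. {109: 'Total_Score'}
        self.dict = header_dict
        # Build the list once, when the header is parsed
        # (assumption: the keys of interest are the dictionary's values).
        self.klist = list(header_dict.values())

    def get_keys(self):
        """Return the list of string keys included in this dictionary."""
        return self.klist
```

Repeated calls to `get_keys()` then return the same precomputed list instead of rebuilding it.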

benmwebb · 2017-04-19T22:29:59Z

pyext/src/output.py

+        # otherwise, just return the string
+        try:
+            float(c)
+        except:


no bare except:
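A minimal sketch of the same check with a named exception. It assumes the helper is meant to return a float when the string parses as one and the original string otherwise; `float_string` here is an illustrative stand-in for the method in the patch.

```python
def float_string(c):
    """Return c as a float if it parses as one, otherwise return c unchanged."""
    try:
        return float(c)
    except ValueError:
        # float() raises ValueError for strings that are not numbers;
        # catching only that keeps KeyboardInterrupt and real bugs visible.
        return c
```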

benmwebb · 2017-04-19T22:30:07Z

pyext/src/output.py

+
+
+    def does_line_pass_criteria(self, fields, c):
+        # Given a stat file line (as a dictionary) and a criteria tuple, decide whether 


benmwebb · 2017-04-19T22:31:12Z

pyext/src/output.py

+
+
+        comparison = c[2]
+        if comparison not in ["==", "<", ">"]:


Generally considered better style to use () rather than [] here, since the set of items is immutable.

benmwebb · 2017-04-19T22:31:53Z

pyext/src/output.py

+
+        comparison = c[2]
+        if comparison not in ["==", "<", ">"]:
+            raise Exception('ERROR: IMP.pmi.output.ProcessOutput.does_line_pass_criteria() - Comparison string must be \'>\', \'<\' or \'==\', instead of %s' % (comparison))  


Raise a more specific exception. ValueError is probably the most appropriate here.
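Combining this with the tuple suggestion, the check might look roughly like the sketch below. `check_comparison` is a hypothetical standalone function used only to keep the example self-contained.

```python
def check_comparison(comparison):
    """Raise ValueError unless comparison is one of '==', '<' or '>'."""
    # A tuple is used because the set of allowed operators never changes.
    if comparison not in ("==", "<", ">"):
        raise ValueError(
            "Comparison string must be '>', '<' or '==', instead of %s"
            % comparison)
```

Callers can then catch `ValueError` specifically rather than a blanket `Exception`.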

benmwebb · 2017-04-19T22:32:08Z

pyext/src/output.py

+        model_value = self._float_string(fields[intkey])
+
+        if type(value) is not type(model_value):
+            raise Exception('ERROR: IMP.pmi.output.ProcessOutput.does_line_pass_criteria() - Comparison field %s is of type %s while you tried to compare it to a %s' % (key, type(model_value), type(value)))  


more specific exception
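For a type mismatch, `TypeError` is one natural choice. A sketch under that assumption, with `check_same_type` as a hypothetical standalone version of the check:

```python
def check_same_type(key, value, model_value):
    """Raise TypeError if value and model_value are of different types."""
    if type(value) is not type(model_value):
        raise TypeError(
            "Comparison field %s is of type %s while you tried to "
            "compare it to a %s" % (key, type(model_value), type(value)))
```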

ProcessOutput

0c32376

About 2x faster at reading stat files and can parse stat files based on a list of score criteria. Support for statv1

benmwebb self-requested a review April 19, 2017 22:21

benmwebb requested changes Apr 19, 2017
