-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
698: library API for non-path input, accept markdown and dict input #712
Conversation
- add: xls2xform.convert for library users to call pyxform without needing to use files (accepts bytes/file handles/strings) - accepts markdown input since this is widely used by pyxform - accepts dict to avoid needing to use internal funcs that may change - chg: avoid writing to files unless validate=True (for ODK Validate) - also avoid assuming any files were written, e.g. missing_ok=True - chg: move xls/x_sheet_to_csv, sheet_to_csv from utils.py to xls2json_backends.py because they are backends for csv input. - chg: move md_to_dict from test directory into xls2json_backends.py - chg: refactor pyxform_test_case.py to use xls2xform.convert only, instead of internal funcs associated with md_to_dict, so that the existing tests check API stability e.g. file types, dict input, etc.
1a1462a
to
e8f8fdc
Compare
FYI @spwoodcock |
Forum thread for community input: https://forum.getodk.org/t/handling-xlsform-xform-conversion-in-memory/45830/1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a bit of trouble getting through this one! Maybe I should have ignored all the test changes initially. convert
looks good. I almost recommended some parameter name changes like enketo
-> enketo_validate
but matching the existing xls2xform_convert
feels right after all.
Couple of tiny questions/suggestions inline but I don't think they need to block merge.
@lindsay-stevens |
@lognaturel it's still public, just moved to be with the other backend functions.
This PR set the scene for making parts of pyxform non-public (by offering a library API that does what most users would want to do), but didn't make changes in that regard. |
That's a breaking change, then, and should be released as a major version bump, right? I believe that otherwise the changes are non-breaking so maybe it's worth preserving the prior APIs (calling the new functions) so it's non-breaking? Or we bend the semver rules because we think usage as a library is rare? |
Agree it's a breaking change, and agree it's probably not going to affect many people (possibly no-one?) - the solution is to update the import path (the function bodies are unchanged). Equally, incrementing the version is OK. |
Closes #698
Closes #599
open()
)/strings)Why is this the best possible solution? Were any other approaches considered?
Differences to #698: as shown in the new tests in
test_xls2xform.py
, there are a few reasonable call patterns that don't return BytesIO; and adding a new function allows more easily tailoring it to the library use case.Was not initially planning on the markdown/dict changes but considering a) #599 has been open for a while, b) the recent review of pyxform-related projects showed there is a need for alternative input formats, or at least more flexibility, c) I figured that it would be best to support markdown/dict now so that the API was designed for that, rather than needing to potentially refactor for it later. I think the main value in pyxform is in workbook_to_json -> survey.py -> xform, so by adding the option of providing a dict, library users can do whatever deserialisation is necessary and pass the data to pyxform.
What are the regression risks?
There are some functions moved around as mentioned above, so at upgrade time, library users would need to change import locations or switch to using
xls2xform.convert
. Functionality should be the same.Does this change require updates to documentation? If so, please file an issue here and include the link below.
I think there's a good amount of info in the docstrings and tests, but a readme section could be added to make it easier to take in, if required.
Before submitting this PR, please make sure you have:
tests
python -m unittest
and verified all tests passruff format pyxform tests
andruff check pyxform tests
to lint code