Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

647: include all choice lists in output #736

Merged
merged 3 commits into from
Nov 2, 2024

Conversation

lindsay-stevens
Copy link
Contributor

Closes #647
Closes #630

Why is this the best possible solution? Were any other approaches considered?

  • Allows users to include reference data for expressions via choices without needing a hidden dummy question. Includes all choices rather than parsing expressions to find references, because it seems unlikely that both a XLSForm would include completely irrelevant choices and have so many such choices that the XForm document size is a problem.
  • Improves choices validation by including the row number in the error message and using a consistent structure
  • Improves xml tag validation by using the parsing module rather than a separate regex. This was initially a tidy-up thing but I realised it also mostly addressed Forbid colon as first character in XML identifiers #630. Forbidding colons altogether seemed uncertain there so they're still allowed in this PR.

What are the regression risks?

  • If users have XLSForms with giant reference choice lists, and only a subset of which are used in the form, and the use case for that XForm has slow internet connectivity, or the choices list is just too big for Collect/Webforms, then it's a problem. It can be avoided easily, by removing choice lists that are not being used, or cut/paste them to a different sheet like "reference choices" which pyxform would ignore.
  • If users were expecting the exact error messages about choices as before, well hopefully they are momentarily delighted by these new ones.
  • If users have question names starting with a colon they will need to change them.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

I don't think so. The problem scenario for the choices sheet is probably not relevant to most users, and I don't think the dummy question workaround was documented (except on the forum?) but for those who wanted it to work and had to go looking for the workaround, well now it just works. As for the leading colon in the name, xlsform.org says what characters names can contain and colon isn't one of them.

Before submitting this PR, please make sure you have:

  • included test cases for core behavior and edge cases in tests
  • run python -m unittest and verified all tests pass
  • run ruff format pyxform tests and ruff check pyxform tests to lint code
  • verified that any code or assets from external sources are properly credited in comments

- no formatting changes, just keeping pace with ruff really
- allows users to include reference data for expressions via choices
  without needing a hidden dummy question. Includes all choices rather
  than parsing expressions to find references, because it seems unlikely
  that both a XLSForm would include completely irrelevant choices and
  have so many such choices that the XForm document size is a problem.
- test_xls2json_xls.py
  - moved loop test into test_loop.py and modified to use the same style
    as the existing test
- test_loop.py
  - use xls2xform.convert rather than internals directly
- xls2json.py:
  - move choices checks to new validator module
  - include row reference in all errors/warnings. To have a consistent
    error message structure, each duplicate choice gets a copy of the
    message (in one error) rather than being smushed into one message.
- xlsparseutils.py
  - replace separate regex for xml tags with expression parser and use
    that instead, for consistent parsing rules and to use cache
  - move the remaining sheet misspellings check to new validator module
- expression.py
  - fix issue where a name containing "or" or "and" was parsed as an
    operator rather than a name. These tokens are only valid when
    surrounded by spaces so the regex is updated accordingly.
  - add tests for positive and negative match cases for tag names.
Copy link
Contributor

@lognaturel lognaturel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much delight will be had, indeed!

One wording piece I want to check my understanding of. Otherwise looks great.

)
INVALID_LABEL = (
"[row : {row}] On the 'choices' sheet, the 'label' value is invalid. "
"Choices should have a label. "
Copy link
Contributor

@lognaturel lognaturel Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is optional/should because a choice list might now just be used for reference and not for a select?

Copy link
Contributor Author

@lindsay-stevens lindsay-stevens Nov 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for this check being a "should" / warning is to match existing behaviour - historically choice labels aren't required, but that might change per #534

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhh, thank you for introducing me to my past self. :D

@lognaturel
Copy link
Contributor

I agree this can close #630 and we can continue allowing : in the middle of an identifier.

@tiritea
Copy link

tiritea commented Nov 1, 2024

Much delight will be had

Hidden selects Begone! :-)

+1

@lindsay-stevens lindsay-stevens merged commit b65e727 into XLSForm:master Nov 2, 2024
10 checks passed
@lindsay-stevens lindsay-stevens deleted the pyxform-647 branch November 2, 2024 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants