Enhancement for sch:phase - @when #71

Open
rjelliffe opened this issue Mar 29, 2024 · 10 comments
Labels
2025 A change made in preparing the 2025 edition

Comments

@rjelliffe
Member

rjelliffe commented Mar 29, 2024

(Added: In my Schematron users meeting presentation [Prague 2024] I identified this proposal as one of the most important, IMHO.)

Motivating Use-Case

The user has a stream of XML documents they want to validate. The documents can be from several different schemas, perhaps different versions of schemas, perhaps entirely new namespaces. We want to cope with these inside Schematron rather than relying on some external mechanism to look at the document and select the appropriate Schematron schema or phase. We do not want to have to add lots of conditions to every sch:rule/@context.

An example might be a Schematron schema that can validate every kind of XSLT document, so that phases test the /xsl:stylesheet/@version attribute and only run if matching: e.g. a phase for 1.0, 2.0, 3.0, 3.1, and so-called 4.0.

Suggestion
We introduce an attribute sch:phase/@test which takes an XPath expression evaluated in the current global scope of variables (i.e. on the initial document) and on the selected document. When attempting to run a phase (because it is selected or because the #ALL default is operating) the test is first evaluated as boolean; if the test succeeds then the phase is selected.

I considered names like @when, @for, @because, @if etc. however I thought re-using @test was better, for not multiplying names. However, I am 100% not wedded to the name, and another might be preferred.

(I considered allowing it on sch:pattern as well, in particular to interact with sch:pattern/@document, but I thought that the use-case was not as clear, and it seemed to create messiness and incomprehensibility rather than reduce it. I think it would be better to support e.g. sch:pattern/@document so that phases can apply to sub-documents. But that needs more work and thought and is not part of this proposal.)

Example

<sch:phase id="vanilla-html" test="/html"> ...
<sch:phase id="xhtml" test="/xhtml:html"> ...
<sch:phase id="not-html" test="not(/html or /xhtml:html)">
   <sch:active pattern="report-not-html-as-fatal-error"/>
</sch:phase>

<sch:pattern id="report-not-html-as-fatal-error">
   <sch:rule context="/*">
      <sch:report test="true()" severity="FATAL">The document must be HTML or XHTML</sch:report>
   </sch:rule>
</sch:pattern>

In this example, the @test allows the incoming document to be HTML-in-XML or XHTML; otherwise it reports a fatal error without attempting any other validation.

I considered having some default message that would be activated if no sch:phase/@test tests true, but I thought the above was the minimum to declare victory and the simplest to implement and understand.
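
The XSLT use-case from the motivation can be sketched the same way. This is only an illustration under stated assumptions: the phase and pattern ids and the two rules are hypothetical, and the test attribute on sch:phase is the one proposed in this issue (later renamed @when in the title), not part of the current standard. It also only looks at /xsl:stylesheet, so xsl:transform roots and simplified stylesheets are not handled.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="xsl" uri="http://www.w3.org/1999/XSL/Transform"/>

  <!-- Proposed attribute: the phase is only selected if its test is true
       for the document being validated. -->
  <sch:phase id="xslt-1.0" test="/xsl:stylesheet/@version = '1.0'">
    <sch:active pattern="no-2.0-instructions"/>
    <sch:active pattern="no-3.0-instructions"/>
  </sch:phase>
  <sch:phase id="xslt-2.0" test="/xsl:stylesheet/@version = '2.0'">
    <sch:active pattern="no-3.0-instructions"/>
  </sch:phase>

  <!-- Hypothetical check: flag an instruction that first appeared in XSLT 2.0 -->
  <sch:pattern id="no-2.0-instructions">
    <sch:rule context="xsl:for-each-group">
      <sch:report test="true()">xsl:for-each-group was introduced in XSLT 2.0.</sch:report>
    </sch:rule>
  </sch:pattern>

  <!-- Hypothetical check: flag an instruction that first appeared in XSLT 3.0 -->
  <sch:pattern id="no-3.0-instructions">
    <sch:rule context="xsl:iterate">
      <sch:report test="true()">xsl:iterate was introduced in XSLT 3.0.</sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

With a guard on each phase, none of the rule contexts needs to repeat the version condition, which is the point made in the motivating use-case.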

Implementation
I think sch:phase/@test is quite easy to implement, e.g. to generate on the lines of

<xsl:template match="/">
   <xsl:if test="contains($phase, 'vanilla-html') or contains($phase, '#ALL')
        or (not($phase) and (contains($defaultPhase, 'vanilla-html') or contains($defaultPhase, '#ALL')))">
      <!-- the phase's own test is evaluated only once the phase has been selected -->
      <xsl:if test="/html">
         <xsl:call-template name="vanilla-html"/>
      </xsl:if>
   </xsl:if>
 ...


@rkottmann

I also have this kind of use case. Hence, I support this proposal.

I would like to propose NOT naming the attribute test, because it is semantically quite different from (report|assert)/@test.

However, I have no positive suggestion.

XForms uses e.g. relevance for a similar use-case.

The Ant build tool uses if and unless.

when is also a good name.

@rjelliffe
Member Author

rjelliffe commented Jun 6, 2024 via email

@AndrewSales
Collaborator

> or because the #ALL default is operating

I want to point out (mainly for the benefit of implementers) that the standard defines #ALL as denoting "that all patterns are active" [my italics]. Note the same wording also applies if #DEFAULT is specified but no @defaultPhase is given in the schema.

This is subtly different from all phases being active, and this proposal would need to take account of the difference: the text of the current standard implies that the presence of phases is effectively immaterial if #ALL is specified. This proposal would require implementations to retain which phase each pattern belongs to, because of the need to evaluate phase/@when.

If this proposal is included in the standard, I would suggest also clarifying that #ALL (and #DEFAULT where no @defaultPhase is present in the schema) mean all phases are active, and all patterns which do not belong to a phase are active. I feel this would clarify the processing model.

@rjelliffe
Member Author

Andrew is right about the wording problem but I don't think we need to change #ALL.

I suggest that a new defaultPhase called "#ALL-PHASES" be defined, which means all phases are tried in implementation-dependent order (each with their @when test). If #ALL is specified, then no phases are active, so no sch:phase/@when is tested. If a phase is specified in @defaultPhase, then any @when is tested.

@AndrewSales
Collaborator

> I don't think we need to change #ALL

I think that horse has already exited the stable in this case, unfortunately.

The current text has: "Two strings, #ALL and #DEFAULT, have special meanings when specifying active phases." [my italics]
Although it then goes on to mention patterns explicitly and not phases, the net result is that #ALL and the proposed #ALL-PHASES might be similar enough to cause confusion.

Is it an implicit part of the use case here to exclude from processing patterns which don't belong to a phase? I can see there might be cases where you would want some non-phase patterns applied regardless. But if the idea is to exclude them, perhaps #PHASES-ONLY would work as an active phase specifier, whose semantics would be to process only patterns belonging to a phase (a useful side-effect feature?) and pave the way for phase/@when.

If as a user you do want to benefit from phase/@when and have non-phase patterns processed too, then you can just use #ALL in the re-worded definition I gave earlier.
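
To make the difference between the specifiers discussed so far concrete, here is a small sketch. The schema, its ids and its rules are hypothetical; `when` is the attribute proposed in this issue, and `#ALL-PHASES` and `#PHASES-ONLY` are only suggestions made in this thread. The notes after the schema are one reading of the comments above, not agreed text.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:ns prefix="xhtml" uri="http://www.w3.org/1999/xhtml"/>

  <!-- when is the proposed guard; it is not part of the current standard -->
  <sch:phase id="vanilla-html" when="/html">
    <sch:active pattern="html-checks"/>
  </sch:phase>
  <sch:phase id="xhtml" when="/xhtml:html">
    <sch:active pattern="xhtml-checks"/>
  </sch:phase>

  <!-- Belongs to the vanilla-html phase -->
  <sch:pattern id="html-checks">
    <sch:rule context="/html">
      <sch:assert test="head/title">An HTML document should have a title.</sch:assert>
    </sch:rule>
  </sch:pattern>

  <!-- Belongs to the xhtml phase -->
  <sch:pattern id="xhtml-checks">
    <sch:rule context="/xhtml:html">
      <sch:assert test="xhtml:head/xhtml:title">An XHTML document should have a title.</sch:assert>
    </sch:rule>
  </sch:pattern>

  <!-- Belongs to no phase -->
  <sch:pattern id="general-checks">
    <sch:rule context="/*">
      <sch:assert test="@xml:lang">The root element should declare a language.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

For an XHTML input, #ALL as currently worded makes all three patterns active and never looks at @when. #ALL in the re-worded sense would evaluate each phase's @when, so xhtml-checks and general-checks would run but html-checks would not. #ALL-PHASES would likewise try each phase with its @when; whether general-checks runs is not stated in the suggestion. #PHASES-ONLY would run xhtml-checks and skip general-checks.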

@rjelliffe
Member Author

rjelliffe commented Jun 23, 2024 via email

@rjelliffe
Member Author

rjelliffe commented Jun 23, 2024 via email

@AndrewSales
Collaborator

> #ALL is the name of a phase: the built-in default one which invokes all patterns, regardless of any phase declarations.
>
> Redefining phases so that if you select a phase then it will also activate patterns that are not in any phase is a breaking change.

These two statements are contradictory, because the latter is what #ALL already does.
It's also not what I was suggesting.

> Is it an implicit part of the use case here to exclude from processing patterns which don't belong to a phase?

To be clear, I was asking the question above (as usual) to try to get to the bottom of the requirement, so that I and the ISO Working Group can understand what is proposed and see what we need to do to capture it in the text of the international standard and what implications it might have for that document as a whole. Also, to be very clear: there is no requirement for either me or any other member of the Working Group to engage with the proposals registered here; we do so voluntarily and at our discretion.

> #ALL is the name of a phase

You may believe it is, but it is not what the standard defines it as, and that is central to the current issue.
I think it would be clearer then if #ALL and #DEFAULT were explicitly defined as implicit phases containing all patterns in the schema.

@rjelliffe
Member Author

rjelliffe commented Jun 24, 2024 via email

AndrewSales added the "2025" label (A change made in preparing the 2025 edition) on Jul 2, 2024
@AndrewSales
Collaborator

No, as I said above, there are two such cases, #ALL and:

> #DEFAULT where no @defaultPhase is present in the schema

I indicated the need for clarity above, and I will now take this forward with the Working Group instead, if we have time to alter the standard appropriately.

AndrewSales changed the title from "Enhancement for sch:phase - @test" to "Enhancement for sch:phase - @when" on Jul 8, 2024