Overview
Comment: | Enforce that only an element defined by defelement can be document element of the xml to validate. Added documentation for the content definition command "elementtype". |
---|---|
SHA3-256: | f84ee3522656e610e11b48fcf66a307d |
User & Date: | rolf 2020-02-11 20:02:02.460 |
Context
2020-02-12
00:39 | Code gardening. check-in: e7f010f104 user: rolf tags: schema |
2020-02-11
20:02 | Enforce that only an element defined by defelement can be document element of the xml to validate. Added documentation for the content definition command "elementtype". check-in: f84ee35226 user: rolf tags: schema |
16:28 | Better handling of validation command using SAX parser in partial parsing mode (-final 0). check-in: 266d76531f user: rolf tags: schema |
Changes
Changes to doc/schema.html.
1 2 3 4 5 6 | <html> <head> <link rel="stylesheet" href="manpage.css"><title>tDOM manual: schema</title><meta name="xsl-processor" content="Jochen Loewer ([email protected]), Rolf Ade ([email protected]) et. al."><meta name="generator" content="$RCSfile: tmml-html.xsl,v $ $Revision: 1.11 $"><meta charset="utf-8"> </head><body> <div class="header"> <div class="navbar" align="center"> | | | | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | <html> <head> <link rel="stylesheet" href="manpage.css"><title>tDOM manual: schema</title><meta name="xsl-processor" content="Jochen Loewer ([email protected]), Rolf Ade ([email protected]) et. al."><meta name="generator" content="$RCSfile: tmml-html.xsl,v $ $Revision: 1.11 $"><meta charset="utf-8"> </head><body> <div class="header"> <div class="navbar" align="center"> <a href="#SECTid0x55d608b71910">NAME</a> · <a href="#SECTid0x55d608adeab0">SYNOPSIS</a> · <a href="#SECTid0x55d608b68810">DESCRIPTION </a> · <a href="#SECTid0x55d608bcac50">Schema definition scripts</a> · <a href="#SECTid0x55d608bd7500">Quantity specifier</a> · <a href="#SECTid0x55d608bd9350">Text constraint scripts</a> · <a href="#SECTid0x55d608be73d0">Local key constraints</a> · <a href="#SECTid0x55d608be91e0">Exampels</a> · <a href="#SECTid0x55d608beb190">KEYWORDS</a> </div><hr class="navsep"> </div><div class="body"> <h2><a name="SECTid0x55d608b71910">NAME</a></h2><p class="namesection"> <b class="names">tdom::schema - </b><br>Create a schema validation command</p> <h2><a name="SECTid0x55d608adeab0">SYNOPSIS</a></h2><pre class="syntax">package require tdom <b class="cmd">tdom::schema</b> <i class="m">?create?</i> <i class="m">cmdName</i> </pre> <h2><a name="SECTid0x55d608b68810">DESCRIPTION </a></h2><p>This command creates validation commands with a simple API. 
The validation commands have methods to define a schema and are able to validate XML data or to post-validate a tDOM DOM tree (and to some degree other kind of hierarchical data) against this schema.</p><p>Additionally, a validation command may be used as argument to the <i class="m">-validateCmd</i> option of the <i class="m">dom parse</i> and the <i class="m">expat</i> commands to enable validation additional to what they otherwise do.</p><p>The valid methods of the created commands are:</p><dl class="commandlist"> |
︙ | ︙ | |||
71 72 73 74 75 76 77 | elments with the same name and namespache but different content models. The <i class="m">definition script</i> is evaluated and defines the content model of the element. If the <i class="m">namespace</i> argument is given, any <i class="m">element</i> or <i class="m">ref</i> references in the definition script not wrapped inside a <i class="m">namespace</i> command are resolved in that namespace. If there is already a elementtype definition for | | > | | 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 | elments with the same name and namespache but different content models. The <i class="m">definition script</i> is evaluated and defines the content model of the element. If the <i class="m">namespace</i> argument is given, any <i class="m">element</i> or <i class="m">ref</i> references in the definition script not wrapped inside a <i class="m">namespace</i> command are resolved in that namespace. If there is already a elementtype definition for the name/namespace combination the command raises error. The document element of any XML to validate cannot be a <i class="m">defelementtype</i> defined element.</dd> <dt> <b class="method">defpattern</b> <i class="m">name</i> <i class="m">?namespace?</i> <i class="m"><definition script></i> </dt> <dd>This method defines a (maybe complex) content particle |
︙ | ︙ | |||
406 407 408 409 410 411 412 | <dt><b class="method">reset</b></dt> <dd>This method resets the validation command into state READY (while preserving the defined grammer).</dd> </dl> | | | 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 | <dt><b class="method">reset</b></dt> <dd>This method resets the validation command into state READY (while preserving the defined grammer).</dd> </dl> <h2><a name="SECTid0x55d608bcac50">Schema definition scripts</a></h2><p>Schema definition scripts are ordinary Tcl scripts that are evaluatend in the namespace tdom::schema. The below listed schema definition commands in this tcl namespace allow to define a wide variety of document structures. Every schema definition command establish a validation constraint on the content which has to match or must be optional to qualify the content as valid. It is a validation error if there is additional (not matched) content.</p><p>The schema definition commands are:</p><dl class="commandlist"> |
︙ | ︙ | |||
434 435 436 437 438 439 440 | defined until validation then only an empty element with name <i class="m">name</i> and namespace <i class="m">namespace</i> and no attributes matches. </dd> <dt> | > > > > > > > > > > > > | > | 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 | defined until validation then only an empty element with name <i class="m">name</i> and namespace <i class="m">namespace</i> and no attributes matches. </dd> <dt> <b class="method">elementtype</b> <i class="m">name</i> <i class="m">?quant?</i> </dt> <dd>This command refers to the element definined with <i class="m">defelementtype</i> with the type name <i class="m">name</i> in the current context namespace. Forward references to a so far not defined element types or recursive references are allowed. If a forward referenced element type isn't defined until validation no content or attributes are expected.</dd> <dt> <b class="method">ref</b> <i class="m">name</i> <i class="m">?quant?</i> </dt> <dd>This command refers to the content particle defined with <i class="m">defpattern</i> with the name <i class="m">name</i> in the current context namespace. Forward references to a so far not defined pattern or recursive references are allowed. If a forward referenced pattern isn't defined until validation no content whatsoever is expected ("empty match").</dd> |
︙ | ︙ | |||
637 638 639 640 641 642 643 | call. This is meant as toplevel command of a <i>schemacmd define</i> script. This command is not allowed nested in an other definition script command and will raise error, if you call it there.</dd> </dl> | | | 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 | call. This is meant as toplevel command of a <i>schemacmd define</i> script. This command is not allowed nested in an other definition script command and will raise error, if you call it there.</dd> </dl> <h2><a name="SECTid0x55d608bd7500">Quantity specifier</a></h2><p>Serveral schema definition commands expects a quantifier as one of their arguments, which specifies how often the content particle specified by the command is expected. The valid values for a <i class="m">quant</i> argument are:</p><dl class="optlist"> <dt><b>!</b></dt> <dd>The content particle must occur exactly once in valid documents.</dd> |
︙ | ︙ | |||
681 682 683 684 685 686 687 | n to m times (both inclusive) in a row in valid documents. The quantifier must be a tcl list with two elements. Both elements must be integers, with n >= 0 and n < m.</dd> </dl><p>If an optional quantifier is not given then it defaults to * in case of the mixed command and to ! for all other commands.</p> | | | 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 | n to m times (both inclusive) in a row in valid documents. The quantifier must be a tcl list with two elements. Both elements must be integers, with n >= 0 and n < m.</dd> </dl><p>If an optional quantifier is not given then it defaults to * in case of the mixed command and to ! for all other commands.</p> <h2><a name="SECTid0x55d608bd9350">Text constraint scripts</a></h2><p>Text - parsed character data, as XML calles it - must sometimes be of a certain kind, must comply to some rules etc to be valid. The text constraint script arguments to the text, attribute, nsattribute and deftext commands allow the following text constraint commands to check text for certain properties.</p><p>The text constraint commands are:</p><dl class="commandlist"> <dt> <b class="cmd">integer</b> <i class="m">?(xsd|tcl)?</i> |
︙ | ︙ | |||
984 985 986 987 988 989 990 | <dd>This text constraint match if the text value is a xsd:unsignedLong. This is an integer between 0 and 18446744073709551615, both included, optionally preceded by a + sign and leading zeros.</dd> </dl> | | | 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 | <dd>This text constraint match if the text value is a xsd:unsignedLong. This is an integer between 0 and 18446744073709551615, both included, optionally preceded by a + sign and leading zeros.</dd> </dl> <h2><a name="SECTid0x55d608be73d0">Local key constraints</a></h2><p>Document wide uniqueness and foreign key constraints are available with the text constraint commands id and idref. Keyspaces allow for sub-tree local uniqueness and foreign key constraints.</p><dl class="commandlist"> <dt> <b class="cmd">keyspace</b> <i class="m"><names list></i> <i class="m"><constraint script></i> </dt> |
︙ | ︙ | |||
1020 1021 1022 1023 1024 1025 1026 | active always matches. If the keyspace is active then reports error if there is still no key as the value at the end of the keyspace <i class="m"><name></i>. Otherwise it matches.</dd> </dl> | | | 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 | active always matches. If the keyspace is active then reports error if there is still no key as the value at the end of the keyspace <i class="m"><name></i>. Otherwise it matches.</dd> </dl> <h2><a name="SECTid0x55d608be91e0">Exampels</a></h2><p>The XML Schema Part 0: Primer Second Edition (<a href="https://www.w3.org/TR/xmlschema-0/">https://www.w3.org/TR/xmlschema-0/</a>) starts with this example schema:</p><pre class="example"> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:annotation> <xsd:documentation xml:lang="en"> Purchase order schema for Example.com. |
︙ | ︙ | |||
1189 1190 1191 1192 1193 1194 1195 | foreach e {name email} { defelement $e {text} } } </pre> | | | 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 | foreach e {name email} { defelement $e {text} } } </pre> <h2><a name="SECTid0x55d608beb190">KEYWORDS</a></h2><p class="keywords"> <a class="keyword" href="keyword-index.html#KW-Validation">Validation</a>, <a class="keyword" href="keyword-index.html#KW-Postvalidation">Postvalidation</a>, <a class="keyword" href="keyword-index.html#KW-DOM">DOM</a>, <a class="keyword" href="keyword-index.html#KW-SAX">SAX</a> </p> </div><hr class="navsep"><div class="navbar" align="center"> <a class="navaid" href="index.html">Contents</a> · <a class="navaid" href="category-index.html">Index</a> · <a class="navaid" href="keyword-index.html">Keywords</a> · <a class="navaid" href="http://tdom.org">Repository</a> </div> </body> </html> |
Changes to doc/schema.n.
︙ | ︙ | |||
220 221 222 223 224 225 226 | elments with the same name and namespache but different content models. The \fIdefinition script\fR is evaluated and defines the content model of the element. If the \&\fInamespace\fR argument is given, any \fIelement\fR or \&\fIref\fR references in the definition script not wrapped inside a \fInamespace\fR command are resolved in that namespace. If there is already a elementtype definition for | | | > | 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 | elments with the same name and namespache but different content models. The \fIdefinition script\fR is evaluated and defines the content model of the element. If the \&\fInamespace\fR argument is given, any \fIelement\fR or \&\fIref\fR references in the definition script not wrapped inside a \fInamespace\fR command are resolved in that namespace. If there is already a elementtype definition for the name/namespace combination the command raises error. The document element of any XML to validate cannot be a \&\fIdefelementtype\fR defined element. .TP \&\fB\fBdefpattern\fP \fIname\fB \fI?namespace?\fB \fI<definition script>\fB \&\fRThis method defines a (maybe complex) content particle with the \fIname\fR (optional in the namespace \&\fInamespace\fR) in the schema, to be referenced in other definition scripts with the definition command \fIref\fR. The \&\fIdefinition script\fR is evaluated and defines the content |
︙ | ︙ | |||
497 498 499 500 501 502 503 | references to so far not defined elements or pattern or other local definitions of the same name inside the \fIdefinition script\fR are allowed. If a forward referenced element isn't defined until validation then only an empty element with name \&\fIname\fR and namespace \fInamespace\fR and no attributes matches. .TP | > > > > > > > > | | 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 | references to so far not defined elements or pattern or other local definitions of the same name inside the \fIdefinition script\fR are allowed. If a forward referenced element isn't defined until validation then only an empty element with name \&\fIname\fR and namespace \fInamespace\fR and no attributes matches. .TP \&\fB\fBelementtype\fP \fIname\fB \fI?quant?\fB \&\fRThis command refers to the element definined with \&\fIdefelementtype\fR with the type name \fIname\fR in the current context namespace. Forward references to a so far not defined element types or recursive references are allowed. If a forward referenced element type isn't defined until validation no content or attributes are expected. .TP \&\fB\fBref\fP \fIname\fB \fI?quant?\fB \&\fRThis command refers to the content particle defined with \&\fIdefpattern\fR with the name \fIname\fR in the current context namespace. Forward references to a so far not defined pattern or recursive references are allowed. If a forward referenced pattern isn't defined until validation no content whatsoever is expected ("empty match"). .TP |
︙ | ︙ |
Changes to doc/schema.xml.
︙ | ︙ | |||
70 71 72 73 74 75 76 | elments with the same name and namespache but different content models. The <m>definition script</m> is evaluated and defines the content model of the element. If the <m>namespace</m> argument is given, any <m>element</m> or <m>ref</m> references in the definition script not wrapped inside a <m>namespace</m> command are resolved in that namespace. If there is already a elementtype definition for | | > | | 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 | elments with the same name and namespache but different content models. The <m>definition script</m> is evaluated and defines the content model of the element. If the <m>namespace</m> argument is given, any <m>element</m> or <m>ref</m> references in the definition script not wrapped inside a <m>namespace</m> command are resolved in that namespace. If there is already a elementtype definition for the name/namespace combination the command raises error. The document element of any XML to validate cannot be a <m>defelementtype</m> defined element.</desc> </commanddef> <commanddef> <command><method>defpattern</method> <m>name</m> <m>?namespace?</m> <m><definition script></m></command> <desc>This method defines a (maybe complex) content particle with the <m>name</m> (optional in the namespace <m>namespace</m>) in the schema, to be referenced in other |
︙ | ︙ | |||
412 413 414 415 416 417 418 | script</m> are allowed. If a forward referenced element isn't defined until validation then only an empty element with name <m>name</m> and namespace <m>namespace</m> and no attributes matches. </desc> </commanddef> <commanddef> | > > > > > > > > > > | > | 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 | script</m> are allowed. If a forward referenced element isn't defined until validation then only an empty element with name <m>name</m> and namespace <m>namespace</m> and no attributes matches. </desc> </commanddef> <commanddef> <command><method>elementtype</method> <m>name</m> <m>?quant?</m></command> <desc>This command refers to the element definined with <m>defelementtype</m> with the type name <m>name</m> in the current context namespace. Forward references to a so far not defined element types or recursive references are allowed. If a forward referenced element type isn't defined until validation no content or attributes are expected.</desc> </commanddef> <commanddef> <command><method>ref</method> <m>name</m> <m>?quant?</m></command> <desc>This command refers to the content particle defined with <m>defpattern</m> with the name <m>name</m> in the current context namespace. Forward references to a so far not defined pattern or recursive references are allowed. If a forward referenced pattern isn't defined until validation no content whatsoever is expected ("empty match").</desc> </commanddef> |
︙ | ︙ |
Changes to generic/schema.c.
︙ | ︙ | |||
      const char *name,
      void *namespace
      )
  {
      Tcl_HashEntry *h;
      void *namespacePtr, *namePtr;
      SchemaCP *pattern;
|     int rc = 1, reportError;

      if (sdata->skipDeep) {
          sdata->skipDeep++;
          return TCL_OK;
      }
      if (sdata->validationState == VALIDATION_FINISHED) {
          SetResult ("Validation finished.");
︙ | ︙ | |||
          }
      } else {
          pattern = NULL;
      }

      if (!sdata->stack) {
          sdata->validationState = VALIDATION_STARTED;
>         reportError = 0;
|         if (pattern) {
>             if (pattern->flags & PLACEHOLDER_PATTERN_DEF
>                 || pattern->flags & FORWARD_PATTERN_DEF) {
>                 reportError = 1;
>             }
>         } else {
>             reportError = 1;
>         }
>         if (reportError) {
              if (recover (interp, sdata, UNKNOWN_ROOT_ELEMENT,
                           name, namespace, NULL, 0)) {
                  sdata->skipDeep = 1;
                  return TCL_OK;
              }
              SetResult ("Unknown element");
              return TCL_ERROR;
︙ | ︙ |
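Reading aid for the schema.c hunk above: the decision whether the document element is rejected can be isolated into a small predicate. The following is only a sketch of that decision, not code from the check-in. The type SketchCP, the function rootElementIsUnknown and the numeric flag values are assumptions made for illustration; only the flag names PLACEHOLDER_PATTERN_DEF and FORWARD_PATTERN_DEF and the three-way outcome are taken from the diff.

```c
#include <stddef.h>

/* Flag values are placeholders chosen for this sketch; the real
 * definitions live in the tDOM sources and may differ. */
#define FORWARD_PATTERN_DEF     1
#define PLACEHOLDER_PATTERN_DEF 2

/* Hypothetical stand-in for SchemaCP, reduced to the one field the
 * root element check looks at. */
typedef struct {
    unsigned int flags;
} SketchCP;

/* Mirrors the reportError logic of the hunk: returns 1 if the
 * document element has to be reported as unknown, 0 otherwise. */
static int rootElementIsUnknown(const SketchCP *pattern)
{
    if (pattern == NULL) {
        /* No pattern was found for the element name at all. */
        return 1;
    }
    if (pattern->flags & (PLACEHOLDER_PATTERN_DEF | FORWARD_PATTERN_DEF)) {
        /* A pattern exists, but it is flagged as a placeholder or
         * forward definition. */
        return 1;
    }
    return 0;
}
```

In the actual hunk a positive result leads to the recover call with UNKNOWN_ROOT_ELEMENT and, if recovery is not possible, to the "Unknown element" error.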
Changes to tests/schema.test.
︙ | ︙ | |||
      append xml [string repeat "</n></n></n></n></n></n></n></n></n></n>" 20000]
      append xml "</doc>"
      set result [s validate $xml errMsg]
      s delete
      list $result $errMsg
  } {0 {error "Element "a" doesn't match" at line 1 character 600009}}

> test schema-1.32 {Unknown root element} {
>     tdom::schema s
>     s define {
>         defelement e {
>             element doc ? {
>                 element e
>             }
>         }
>     }
>     set result [s validate <doc><e><doc><e/></doc></e></doc>]
>     s delete
>     set result
> } 0
>
  test schema-2.1 {grammar definition: ref} {
      tdom::schema create grammar
      grammar defpattern thisPattern {
          element a
          element b
      }
      grammar defpattern thatPattern {
︙ | ︙ | |||
      s defelementtype a2 a ns {
          elementtype e2
      }
      set result [lsort -index 0 [s info definedElementtypes]]
      s delete
      set result
  } {{a http://my.foo} {a2 http://my.foo}}
>
> test schema-22.7 {defelementtype} {
>     tdom::schema s
>     s defelement doc {
>         elementtype e1type
>         elementtype e2type *
>     }
>     foreach e {e1 e2} {
>         s defelementtype ${e}type $e {}
>     }
>     set result [list]
>     foreach xml {
>         <doc/>
>         <doc><e1/></doc>
>         <doc><e1/><e2/></doc>
>         <doc><e1/><e2/><e2/></doc>
>         <doc><e1/><e2/><e2/><e2/></doc>
>         <doc><e1/><e2/><e2/><e2/><e1/></doc>
>         <doc><e2/></doc>
>     } {
>         lappend result [s validate $xml]
>     }
>     s delete
>     set result
> } {0 1 1 1 1 0 0}
>
> test schema-22.8 {defelementtype} {
>     tdom::schema s
>     s defelementtype doctype doc {
>         elementtype e1type
>         elementtype e2type *
>     }
>     foreach e {e1 e2} {
>         s defelementtype ${e}type $e {}
>     }
>     set result [list]
>     foreach xml {
>         <doc/>
>         <doc><e1/></doc>
>         <doc><e1/><e2/></doc>
>         <doc><e1/><e2/><e2/></doc>
>         <doc><e1/><e2/><e2/><e2/></doc>
>         <doc><e1/><e2/><e2/><e2/><e1/></doc>
>         <doc><e2/></doc>
>     } {
>         lappend result [s validate $xml]
>     }
>     s delete
>     set result
> } {0 0 0 0 0 0 0}

  test schema-23.1 {validatefile} {
      tdom::schema s
      s define {
          set fd [open [file join [file dir [info script]] ../doc/tmml.schema] r]
          eval [read $fd]
          close $fd
︙ | ︙ |