Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Wip (not well working inbetween state). |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | wip |
Files: | files | file ages | folders |
SHA3-256: |
0da31f4fcb8d1805aa164d0fde651059 |
User & Date: | rolf 2019-03-23 15:20:11.373 |
Context
2019-03-26
| ||
01:43 | Clean up the mess, a bit. check-in: 7154504138 user: rolf tags: wip | |
2019-03-23
| ||
15:20 | Wip (not well working inbetween state). check-in: 0da31f4fcb user: rolf tags: wip | |
2019-03-20
| ||
16:00 | Still wip. check-in: 3bf5b1e10d user: rolf tags: wip | |
Changes
Changes to doc/expat.xml.
︙ | ︙ | |||
28 29 30 31 32 33 34 | to parse any kind of well-formed XML. The parsers are stream oriented XML parser. This means that you register handler scripts with the parser prior to starting the parse. These handler scripts are called when the parser discovers the associated structures in the document being parsed. A start tag is an example of the kind of structures for which you may register a handler script.</p> | > | | | | > | 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 | to parse any kind of well-formed XML. The parsers are stream oriented XML parser. This means that you register handler scripts with the parser prior to starting the parse. These handler scripts are called when the parser discovers the associated structures in the document being parsed. A start tag is an example of the kind of structures for which you may register a handler script.</p> <p>The parsers always check for XML well-formedness of the input (and report error, if the input isn't well-formed). They parse the internal DTD and, at request, external DTD and external entities, if you resolve the identifier of the external entities with the -externalentitycommand script (see there). If you use the -validateCmd option (see there), the input is additionally validated.</p> <p>Additionly, the Tcl extension code that implements this command provides an API for adding C level coded handlers. Up to now, there exists the parser extension command "tdom". The handler set installed by this extension build an in memory "tDOM" DOM tree, while the parser is parsing the input.</p> <p>It is possible to register an arbitrary amount of different handler scripts |
︙ | ︙ | |||
77 78 79 80 81 82 83 84 85 86 87 88 89 90 | until the next call to the parse method, thus delaying the reporting of some of the data.</p> <p>If this option is set to "1" then documents which are not well-formed upon end of input will generate an error.</p></desc> </optdef> <optdef> <optname>-baseurl</optname> <optarg>url</optarg> <desc><p>Reports the base url of the document to the parser.</p></desc> </optdef> | > > > > > > > > > > > | 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 | until the next call to the parse method, thus delaying the reporting of some of the data.</p> <p>If this option is set to "1" then documents which are not well-formed upon end of input will generate an error.</p></desc> </optdef> <optdef> <optname>-validateCmd</optname> <optarg><tdom schema cmd></optarg> <desc><p>This option expects the name of a tDOM schema command. If this option is given, then the input is also validated. If the schema command hasn't set a reportcmd then the first validation error will stop further parsing (as a well-formedness error).</p></desc> </optdef> <optdef> <optname>-baseurl</optname> <optarg>url</optarg> <desc><p>Reports the base url of the document to the parser.</p></desc> </optdef> |
︙ | ︙ |
Changes to generic/schema.c.
︙ | ︙ | |||
864 865 866 867 868 869 870 | return 1; } popStack (sdata); if (rc == -1) mayskip = 1; break; case SCHEMA_CTYPE_VIRTUAL: | | < | 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 | return 1; } popStack (sdata); if (rc == -1) mayskip = 1; break; case SCHEMA_CTYPE_VIRTUAL: Tcl_Panic ("Virtual constrain in MIXED or CHOICE"); } if (!mayskip && mayMiss (candidate->quants[i])) mayskip = 1; } break; |
︙ | ︙ | |||
912 913 914 915 916 917 918 | } ac++; hm = 0; } if (isName) { if (recover (interp, sdata, "UNEXPECTED_ELEMENT", 15)) { /* Skip the just opened element tag and the following | | | < > > > > > > > | 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 | } ac++; hm = 0; } if (isName) { if (recover (interp, sdata, "UNEXPECTED_ELEMENT", 15)) { /* Skip the just opened element tag and the following * content of the current. */ sdata->skipDeep = 2; return 1; } return 0; } return -1; case SCHEMA_CTYPE_VIRTUAL: case SCHEMA_CTYPE_CHOICE: case SCHEMA_CTYPE_TEXT: case SCHEMA_CTYPE_ANY: /* Never pushed onto stack */ Tcl_Panic ("Invalid CTYPE onto the validation stack!"); case SCHEMA_CTYPE_INTERLEAVE: mayskip = 1; for (i = 0; i < cp->nc; i++) { if (se->interleaveState[i]) { if (maxOne (cp->quants[i])) continue; } icp = cp->content[i]; switch (icp->type) { case SCHEMA_CTYPE_TEXT: if (icp->nc) { if (!checkText (interp, icp, "")) { mayskip = 0; } } break; case SCHEMA_CTYPE_ANY: sdata->skipDeep = 1; if (mayskip && minOne (cp->quants[i])) mayskip = 0; se->hasMatched = 1; se->interleaveState[i] = 1; return 1; case SCHEMA_CTYPE_NAME: if (icp->name == name && icp->namespace == namespace) { |
︙ | ︙ | |||
967 968 969 970 971 972 973 | rc = matchElementStart (interp, sdata, name, namespace); if (rc == 1) { se->hasMatched = 1; se->interleaveState[i] = 1; return 1; } popStack (sdata); | | > | > > > | | > | | 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 | rc = matchElementStart (interp, sdata, name, namespace); if (rc == 1) { se->hasMatched = 1; se->interleaveState[i] = 1; return 1; } popStack (sdata); if (mayskip && rc != -1) mayskip = 0; break; case SCHEMA_CTYPE_VIRTUAL: Tcl_Panic ("Virtual constraint child of INTERLEAVE"); break; } } if (mayskip) break; if (recover (interp, sdata, S("UNCOMPLET_CP"))) { sdata->skipDeep = 2; return 1; } } return -1; } int probeElement ( Tcl_Interp *interp, SchemaData *sdata, const char *name, |
︙ | ︙ | |||
1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 | } if (sdata->stack) { SchemaValidationStack *se; se = sdata->stack; if (se->pattern->type == SCHEMA_CTYPE_NAME && se->activeChild >= se->pattern->nc) { SetResult ("Unexpected child element \""); if (namespacePtr) { Tcl_AppendResult (interp, namespacePtr, ":", NULL); } Tcl_AppendResult (interp, name, "\" for element \"", NULL); if (se->pattern->namespace) { Tcl_AppendResult (interp, namespace, ":", NULL); } Tcl_AppendResult (interp, name, "\"", NULL); return TCL_ERROR; } } else { if (!pattern) { SetResult ("Unknown element"); return TCL_ERROR; } pushToStack (sdata, pattern); sdata->validationState = VALIDATION_STARTED; return TCL_OK; } | > > > > > > > > > | 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 | } if (sdata->stack) { SchemaValidationStack *se; se = sdata->stack; if (se->pattern->type == SCHEMA_CTYPE_NAME && se->activeChild >= se->pattern->nc) { if (recover (interp, sdata, S("UNEXPECTED_ELEMENT"))) { sdata->skipDeep = 1; return TCL_OK; } SetResult ("Unexpected child element \""); if (namespacePtr) { Tcl_AppendResult (interp, namespacePtr, ":", NULL); } Tcl_AppendResult (interp, name, "\" for element \"", NULL); if (se->pattern->namespace) { Tcl_AppendResult (interp, namespace, ":", NULL); } Tcl_AppendResult (interp, name, "\"", NULL); return TCL_ERROR; } } else { if (!pattern) { SetResult ("Unknown element"); if (recover (interp, sdata, S("UNKNOWN_DOKUMENT_ELEMENT"))) { sdata->skipDeep = 1; return TCL_OK; } return TCL_ERROR; } pushToStack (sdata, pattern); sdata->validationState = VALIDATION_STARTED; return TCL_OK; } |
︙ | ︙ | |||
1405 1406 1407 1408 1409 1410 1411 | if (checkElementEnd (interp, sdata)) { mayMiss = 1; } popStack (sdata); break; case SCHEMA_CTYPE_VIRTUAL: | | | | 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 | if (checkElementEnd (interp, sdata)) { mayMiss = 1; } popStack (sdata); break; case SCHEMA_CTYPE_VIRTUAL: Tcl_Panic ("Virtual constrain in MIXED or CHOICE"); } if (mayMiss) break; } if (mayMiss) break; return 0; case SCHEMA_CTYPE_VIRTUAL: |
︙ | ︙ | |||
1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 | startElement( void *userData, const char *name, const char **atts ) { ValidateMethodData *vdata = (ValidateMethodData *) userData; char *namespace; const char *s; int i = 0; DBG(fprintf (stderr, "startElement: '%s'\n", name);) if (Tcl_DStringLength (vdata->cdata) | > > < | | | | 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 | startElement( void *userData, const char *name, const char **atts ) { ValidateMethodData *vdata = (ValidateMethodData *) userData; SchemaData *sdata; char *namespace; const char *s; int i = 0; DBG(fprintf (stderr, "startElement: '%s'\n", name);) sdata = vdata->sdata; if (Tcl_DStringLength (vdata->cdata) || (sdata->stack->pattern->flags & CONSTRAINT_TEXT_CHILD)) { if (probeText (vdata->interp, sdata, Tcl_DStringValue (vdata->cdata)) != TCL_OK) { sdata->validationState = VALIDATION_ERROR; XML_StopParser (vdata->parser, 0); Tcl_DStringSetLength (vdata->cdata, 0); vdata->onlyWhiteSpace = 1; return; } Tcl_DStringSetLength (vdata->cdata, 0); vdata->onlyWhiteSpace = 1; |
︙ | ︙ | |||
1768 1769 1770 1771 1772 1773 1774 | vdata->uri[i] = '\0'; namespace = vdata->uri; } } else { s = name; } | | | > | | | > | > | > | | | | | | | 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 | vdata->uri[i] = '\0'; namespace = vdata->uri; } } else { s = name; } if (probeElement (vdata->interp, sdata, s, namespace) != TCL_OK) { sdata->validationState = VALIDATION_ERROR; XML_StopParser (vdata->parser, 0); return; } if (atts[0] || (sdata->stack && sdata->stack->pattern->attrs)) { if (probeAttributes (vdata->interp, sdata, atts) != TCL_OK) { sdata->validationState = VALIDATION_ERROR; XML_StopParser (vdata->parser, 0); } } } static void endElement ( void *userData, const char *name ) { ValidateMethodData *vdata = (ValidateMethodData *) userData; SchemaData *sdata; DBG(fprintf (stderr, "endElement: '%s'\n", name);) sdata = vdata->sdata; if (sdata->validationState == VALIDATION_ERROR) { return; } if (!sdata->skipDeep && (Tcl_DStringLength (vdata->cdata) || sdata->stack->pattern->flags & CONSTRAINT_TEXT_CHILD)) { if (probeText (vdata->interp, sdata, Tcl_DStringValue (vdata->cdata)) != TCL_OK) { sdata->validationState = VALIDATION_ERROR; XML_StopParser (vdata->parser, 0); Tcl_DStringSetLength (vdata->cdata, 0); vdata->onlyWhiteSpace = 1; return; } Tcl_DStringSetLength (vdata->cdata, 0); vdata->onlyWhiteSpace = 1; } if (probeElementEnd (vdata->interp, sdata) != TCL_OK) { sdata->validationState = VALIDATION_ERROR; XML_StopParser (vdata->parser, 0); } } static void characterDataHandler ( void *userData, |
︙ | ︙ |
Changes to tests/schema.test.
︙ | ︙ | |||
3795 3796 3797 3798 3799 3800 3801 | element a element b element c } } s reportcmd schema-18 set result "" | | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | > | 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 | element a element b element c } } s reportcmd schema-18 set result "" set rc [s validate {<doc><a/><c/></doc>}] lappend result $rc s delete set result } {s MISSING_CP 1} test schema-18.3 {reportcmd} {knownBug} { tdom::schema s s define { defelement doc { element a } } s reportcmd schema-18 set result "" foreach xml { <foo/> <bar>baz</bar> <bar><a><b/></a></bar> } { set rc [s validate $xml] lappend result $rc } s delete set result } {s MISSING_CP 1} proc validatedSAX {g xml {keepEmpties 1}} { set args [list -validateCmd $g] if {!$keepEmpties} { lappend args -ignorewhitespace 1 } xml::parser p {*}$args set rc [catch {p parse $xml} errMsg] p delete return $rc } proc validatedDOM {g xml {keepEmpties 0}} { set args [list -validateCmd $g] if {$keepEmpties} { lappend args -keepEmpties } set rc [catch { set doc [dom parse {*}$args $xml] } errMsg] if {$doc ne ""} { $doc delete } return $rc } proc postValidation {g xml} { set doc [dom parse $xml] set rc [$g domvalidate $doc errMsg] $doc delete return $rc } test schema-18.4 {reportcmd} {knownBug} { tdom::schema s s define { defelement doc { element a } } s reportcmd schema-18 set result "" foreach xml { <foo/> <bar>baz</bar> <bar><a><b/></a></bar> } { set rc [s validate $xml] lappend result $rc } s delete set result } {s MISSING_CP 1} } |