Author: Jan Nijtmans <[email protected]>
State: Draft
Type: Project
Created: 30-Jul-2021
Post-History:
Keywords: Tcl
Tcl-Version: 8.7
Tcl-Branch: tip607-encoding-failindex
This TIP proposes adding a -failindex option to encoding convertto and encoding convertfrom. The implementation brings TIP  to the script level. If the data cannot be converted, the error location and the string converted so far can be retrieved.
The encoding convertfrom and encoding convertto commands are extended by a -failindex option:
encoding convertfrom ?-failindex posvar? ?encoding? data
encoding convertto ?-failindex posvar? ?encoding? data
The distinct behaviour is as follows:
- No conversion error
  - Option -failindex not given: The converted data is returned as the command result.
  - Option -failindex given: Additionally, the value -1 is written to the given variable in the caller's scope.
- Conversion error present
  - Option -failindex not given: In Tcl 8.7, or with the -nocomplain option, the data is converted with replacement characters, as is currently done. Otherwise, the command raises an error (error code: EILSEQ).
  - Option -failindex given: The data converted up to the failed index is returned as the command result. The position of the conversion error in the source string is written to the specified variable in the caller's scope.
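The cases above can be illustrated with a small script. This is only a sketch: it needs a Tcl build that implements this TIP, the helper name decodeWithReport is hypothetical and not part of the proposal, and the encoding name is always passed explicitly. Whether the second call actually reports a failure depends on how the build treats the invalid byte \xFF. Under the rules listed above, the variable should then hold the offset of that byte.

```tcl
# Hypothetical helper (not part of the TIP): decode bytes and report
# whether, and where, the conversion stopped.
proc decodeWithReport {encodingName bytes} {
    # pos is created in this proc's scope, i.e. the caller scope of "encoding".
    set text [encoding convertfrom -failindex pos $encodingName $bytes]
    if {$pos < 0} {
        # No conversion error: -1 was written to pos.
        return [list ok $text]
    }
    # Conversion error: text holds the data converted up to offset pos.
    return [list failed $text $pos]
}

# Clean input: no conversion error is expected.
puts [decodeWithReport utf-8 "plain ASCII"]

# \xC3\xA9 is a valid two-byte UTF-8 sequence; \xFF never occurs in valid UTF-8.
puts [decodeWithReport utf-8 "A\xC3\xA9\xFFB"]
```

Returning the status as the first list element keeps the -1 sentinel out of the calling code, which may be convenient when the result is passed on to other procedures.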
This specification is inspired by the already present option -failindex of the string is command.
The -failindex option may not be used together with the encoding option -nocomplain of TIP 601.
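For comparison, the existing -failindex option of string is, which inspired this specification, can be tried in any current Tcl. The variable names below are chosen for illustration only. One difference is worth noting: string is leaves the variable untouched on success, whereas this TIP writes -1 to it.

```tcl
# "x" at index 2 is not a digit, so the check fails and idx receives 2.
set ok [string is digit -failindex idx "12x4"]
puts "ok=$ok idx=$idx"

# On success the variable is not set at all.
unset -nocomplain idx2
set ok2 [string is digit -failindex idx2 "1234"]
puts "ok2=$ok2 idx2 exists: [info exists idx2]"
```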
Implementation of byte compiled encoding commands
Jan mentioned that the implementation is not trivial, as the encoding ensemble is a partially compiled command.
Nevertheless, an implementation is tried in the branch
The string is command, when used with -failindex var, is likewise not byte compiled.
I plan to prefix the current generic byte compiler with a test for the presence of a -failindex option and to skip byte compilation in that case.
Incomplete multi-byte sequences
Note: there was a side discussion within the thread about whether an incomplete multi-byte sequence is an error or not. Unfortunately, the required detail about how an incomplete multi-byte sequence should be reported was not resolved. It is therefore treated as an error within this alternate solution.
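Because an incomplete sequence is treated like any other conversion error, a script that decodes data chunk by chunk could keep the unconverted tail and retry it once more bytes arrive. The following is a minimal sketch under that assumption. It needs a Tcl build that implements this TIP, and the helper name decodeChunk is hypothetical. Note that the script cannot tell a truncated sequence from a genuinely invalid one, since both are reported through the same index.

```tcl
# Hypothetical helper (not part of the TIP): decode as much of a UTF-8
# chunk as possible and return the decoded text plus the undecoded tail.
proc decodeChunk {bytes} {
    set text [encoding convertfrom -failindex pos utf-8 $bytes]
    if {$pos < 0} {
        # Everything was converted; nothing is left over.
        return [list $text ""]
    }
    # Keep the bytes from the failure position onwards for a later retry.
    return [list $text [string range $bytes $pos end]]
}

# E2 82 AC is the UTF-8 encoding of the euro sign; the last byte is
# withheld here to mimic a chunk boundary inside the sequence.
lassign [decodeChunk "price: \xE2\x82"] text pending

# Prepend the pending bytes to the next chunk before decoding it.
lassign [decodeChunk "${pending}\xAC 5"] text2 pending2
puts "$text$text2"
```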
The proposal was initiated by a post by Andreas Leitgeb on the core list on 2021-05-12.
This document has been placed in the public domain.