Work in progress... Currently focused on incompatibilities. Hints on improving scripts by making use of new facilities may be added at some point.

## Strict handling of ill-formed encoded data

Conceptually, Tcl strings are a sequence of Unicode code points. These strings are converted to and from external encodings via channel I/O or explicit use of the `encoding` command. Tcl 9 changes the behavior when input streams contain ill-formed encoded data or when the encoding of an output stream does not support the Unicode code point being written.

In the case of input channels, Tcl 8 would silently map invalid bytes in the stream to the numerically equivalent Unicode code points, in essence presuming iso8859-1 encoding. On output streams, characters not supported by the output encoding were substituted with an encoding-specific replacement character.

In contrast, channels in Tcl 9 are by default configured with the `strict` encoding profile and will raise an error exception in both of the above cases. If Tcl 8 behavior is desired, applications can use the `-profile` option to `fconfigure` to apply the `tcl8` encoding profile to the channel. This is not recommended as it is not standards-compliant and can be seen as corrupting data. A standards-compliant alternative is the `replace` profile which, like the `tcl8` profile, does not raise errors in the presence of invalidly encoded data.

The same behavioural changes also apply to explicit transforms with the `encoding` command. Ill-formed data or unsupported characters will result in an error exception. The `-profile` option can be used in this case as well to change this behavior.

For more on encoding profiles, refer to the `encoding` manpage.

See [TIP 657](https://core.tcl-lang.org/tips/doc/trunk/tip/657.md).
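As a rough illustration of the `-profile` option with the `encoding` command, the sketch below wraps `encoding convertfrom` so that ill-formed UTF-8 does not raise an error on either Tcl version. The helper name `decodeLenient` is made up for this example and is not part of Tcl. For channels, the corresponding setting would be `fconfigure $chan -profile replace`.

```tcl
# Hypothetical helper (not part of Tcl): decode bytes as UTF-8 without
# raising an error on ill-formed input, on both Tcl 8.6 and Tcl 9.
proc decodeLenient {bytes} {
    if {[package vsatisfies [package provide Tcl] 9.0-]} {
        # Tcl 9: request the replace profile explicitly instead of strict.
        return [encoding convertfrom -profile replace utf-8 $bytes]
    }
    # Tcl 8.6 has no -profile option and does not raise an error here.
    return [encoding convertfrom utf-8 $bytes]
}

# Usage: well-formed input decodes the same way on both versions.
set text [decodeLenient "\xc3\xa9"]
```

Note that the two branches are not equivalent for ill-formed input: Tcl 8.6 maps invalid bytes to the numerically equivalent code points, as described above, while the `replace` profile does not.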
## Tilde substitution

File path arguments in Tcl 8 were subject to tilde substitution, where `~` and `~USER` prefixes in the path were replaced with the home directory of the current or named user respectively. Although convenient for interactive use, this behavior led to both security and robustness issues, as documented in TIP 602. Tcl 9 therefore no longer does this substitution, and scripts that want it must do it explicitly with the new commands `file home` and `file tildeexpand`.

Conversely, commands that had to protect against this treatment of `~` by prefixing with `./` (`glob`, `file split` etc.) will no longer do so. Scripts that had their own workarounds to protect against tilde substitution no longer need them.

As an exception for the convenience of end users, tilde substitution is still done at start-up time on environment values containing library paths (`TCLLIBPATH` and analogous tm path variants).

See [TIP 602](https://core.tcl-lang.org/tips/doc/trunk/tip/602.md).

## Octals

The Tcl 8 interpretation of numeric strings beginning with `0` as octal representation has been done away with. These are now treated as decimal. All such uses within a script should instead use the explicit `0o` prefix for octal notation. Particular care must be taken on input and output when interacting with programs or users that rely on the old interpretation.

See [TIP 114](https://core.tcl-lang.org/tips/doc/trunk/tip/114.md).

## Underscores in integers

Tcl 9 allows the use of the `_` underscore character as a separator in numeric strings, e.g. `1_000_000`, `1_0.0_1`. It is important to note that this applies not just to literals in source code, as in some other languages, but to run-time values as well. For example, if the `string is integer` command is used to validate user input as integers, it will allow underscores to be entered. If this is not desired, such validation needs to be changed to use `scan` or regexps.
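A minimal sketch of regexp-based validation, assuming the application only wants an optional sign followed by plain decimal digits (so hexadecimal and `0o` forms are rejected too). The proc name `isPlainInteger` is hypothetical.

```tcl
# Hypothetical validator: accept only an optional sign followed by decimal
# digits, so "1_000" is rejected on Tcl 9 as well as on Tcl 8.
proc isPlainInteger {value} {
    return [regexp {^[+-]?[0-9]+$} $value]
}

# Usage: validate a value before treating it as a number.
set userInput "1_000"
if {![isPlainInteger $userInput]} {
    puts "not a plain decimal integer: $userInput"
}
```

Because the check is purely textual, it behaves the same on Tcl 8 and Tcl 9. Range checks, if needed, still have to be done separately.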

See [TIP 551](https://core.tcl-lang.org/tips/doc/trunk/tip/551.md).

## Changes in integer classification and handling

The command `string is integer` now accepts integers of any size and is not limited to 32-bit values. Analogously, the `int()` function will no longer truncate integer values to a 32-bit range. The following illustrates the difference.

In Tcl 8,

```
% string is integer 0x100000001
0
% expr {int(0x100000001)}
1
```

In Tcl 9,

```
% string is integer 0x100000001
1
% expr {int(0x100000001)}
4294967297
```

Thus `string is integer` should not be used for range validation; use explicit range checks. Likewise, `int()` cannot be used for truncation; use explicit masking of high-order bits.

Further, the commands `string is entier` and `string is wide` and functions `entier()` and `wide()` are deprecated. Again, use explicit range checks and masking instead.

See [TIP 514](https://core.tcl-lang.org/tips/doc/trunk/tip/514.md).

## Change in variable name resolution

In Tcl 8, variable names that are not absolute are resolved by looking first in the current namespace, and then in the global namespace. In Tcl 9 such variables are always interpreted as relative to the current namespace. This avoids the problem that setting a variable inside a namespace scope overwrites a global variable if that global variable exists and the variable does not exist relative to the namespace. But it also has some consequences that may not be immediately obvious:

* Access to well-known variables such as `env` and `tcl_platform` inside a `namespace eval` block must either be fully qualified, or the variable must be brought into scope using the `namespace upvar` command (`namespace upvar :: env env`).
* Inside procs in a namespace `ns`, it used to be possible to access namespace variables within a global namespace `ns` by referring to them as `ns::var`. That no longer works. Again, the solution is to either fully qualify the variable, or bring it into scope.
  In addition to the `namespace upvar` command, in this case that can also be done using the `global` command (`global ns::var`), as well as the `variable` command (`variable var`, or `variable ::ns::var`).

The following egrep may be useful to detect such cases:

```
egrep '(\S+\s+\$|(set|unset|append|lset|lappend|dict\s+(set|incr|append|lappend|with|update|unset)|upvar|namespace var|parray|info exists|gets\s+\S+|array\s+\S+|vwait)\s+)[a-zA-Z]+::' *.tcl
```

This regexp is not complete (oo, `expr`, `variable` and maybe other commands are not covered)! Note that each occurrence has to be checked, since it may be a legitimate reference relative to the current namespace, i.e. to a child namespace (to be left alone), or a no-longer-valid reference relative to the global namespace (which must be fully qualified).

Doing this check is strongly recommended. tclprodebug had over 500 such references (perfectly legitimate in Tcl 8, not bugs), leading to error exceptions if lucky and to rather mysterious phase-of-the-moon misbehavior otherwise.

Q - has the behavior of `global` changed within namespaces?

The behavior of the `global` command within namespaces has not changed. The command has no effect when used outside the context of a proc or apply body, so using `global` inside a `namespace eval` block does nothing. Variable names passed to the `global` command are still resolved relative to the global namespace, as they have always been. So using `global ns::var` (inside a proc or apply body) will create a local variable `var` that is linked to the namespace variable `::ns::var`.

See [TIP 278](https://core.tcl-lang.org/tips/doc/trunk/tip/278.md).

## Changes in parsing of variable names

When parsing braced variable references of the form `${..}`, in 8.6 the first `}` always terminated the variable name. In 9.0, any nested `{` are counted and the variable name is terminated by the `}` matching the opening `{`. A similar change also applies to parsing of array element names enclosed in `()`.

See [TIP 465](https://core.tcl-lang.org/tips/doc/trunk/tip/465.md).

## Arguments to `load` are Unicode-aware and case sensitive

In Tcl 8, the second argument to the `load` command, which specifies the initialization function, was case-insensitive. This is inconsistent with the rules for package names, which are case-sensitive. In Tcl 9, initialization function names are case sensitive. For example, if the initialization function was called `Sample_Init`, either of the following commands would work in Tcl 8:

```
load sample.dll sample
load sample.dll Sample
```

In Tcl 9, only the latter would be successful. If compatibility with both Tcl 8 and 9 is desired, the following idiom may be used:

```
load sample.dll [string totitle sample]
```

Note that instances of the `load` command that do not have the second argument specified need not be changed.

In addition, Tcl 9 does not restrict the names to ASCII. In practice this should not cause any compatibility issues.

In addition, libraries compiled with Tcl 9 headers get the prefix "tcl9" in their names. The aim is to allow versions for Tcl 8.7 and Tcl 9 to reside in the same folder. The names are as follows:

* tk87.dll -> tcl9tk87.dll (on Windows)
* libtk8.7.so -> libtcl9tk8.7.so (on UNIX)

A typical multi-version pkgIndex.tcl file may look like this:

```
if {![package vsatisfies [package provide Tcl] 8.6-]} {
    return
}
if {[package vsatisfies [package provide Tcl] 9.0-]} {
    package ifneeded tdbc::odbc 1.1.6 \
        "[list load [file join $dir tcl9tdbcodbc116.dll] [string totitle tdbcodbc]]"
} else {
    package ifneeded tdbc::odbc 1.1.6 \
        "[list load [file join $dir tdbcodbc116.dll] [string totitle tdbcodbc]]"
}
```

See [TIP 595](https://core.tcl-lang.org/tips/doc/trunk/tip/595.md).
## Writing version and build-independent pkgIndex.tcl files

When writing `pkgIndex.tcl` files that are compatible with both Tcl 8 and Tcl 9 as well as the autoconf and nmake build systems, the following differences must be accounted for:

* Irrespective of the build system, Tcl 9 extension binaries are named differently (prefixed with `tcl9`) from Tcl 8.
* The initialization function name in Tcl 9 is case sensitive as noted previously.
* In Tcl 8, extension binaries built with the nmake system have a suffix attached. For example, threaded builds have a `t` suffix attached while non-threaded builds do not. The autotools based build system does not distinguish in this manner. This inconsistency is not present in Tcl 9 but needs to be accounted for if the package supports both Tcl versions and build systems.

The simplest way to achieve the above via the TEA build system is to write a `pkgIndex.tcl.in` template for autoconf similar to the following:

```
package ifneeded @PACKAGE_NAME@ @PACKAGE_VERSION@ \
    [list apply [list {dir} {
        set packageInitName [string totitle @PACKAGE_NAME@]
        set path [file join $dir "@PKG_LIB_FILE@"]
        uplevel #0 [list load $path $packageInitName]
    }] $dir]
```

The `configure.ac` in the autotools build should contain the line

```
AC_CONFIG_FILES([Makefile pkgIndex.tcl])
```

The `makefile.vc` should contain the line

```
pkgindex: default-pkgindex-tea
```

This will result in both build systems generating a `pkgIndex.tcl` that works for Tcl 8 as well as Tcl 9.

## Default encoding for scripts is UTF-8

The default encoding used when reading scripts, either with the `source` command without an explicit `-encoding` option, or passed as a command line argument to `tclsh` is utf-8 in Tcl 9 instead of the system encoding as in Tcl 8. No changes are required for scripts that were pure ASCII and used the `\u` or `\U` escapes for non-ASCII characters.
Scripts that were saved using a (non-utf8) system encoding will have to be modified or sourced using an explicit `-encoding` option.

Even in Tcl 8.6, it is good practice to pass the `-encoding` option to the `source` command for portability. A pkgIndex.tcl file may look like this:

```
package ifneeded mypackage 1.0 [list source -encoding cp1252 [file join $dir mypackage.tcl]]
```

This will still work in 9.0 for any supported codepage.

See [TIP 587](https://core.tcl-lang.org/tips/doc/trunk/tip/587.md).

## Encoding profile *strict* for scripts

The strict encoding profile applies to the `source` command, so `source` may throw encoding errors. Scripts with non-ASCII characters, even in comments, may now fail if the encoding is wrong. A typical example is a copyright glyph ("©") in a script file saved in a non-UTF-8 codepage.

In Tcl 8.6, those errors did not arise, as the active tcl8 profile silently changed those characters to iso-8859-1 characters. If the affected characters were part of real code (like text), the script silently changed in functionality.

## System encoding on Windows may change to UTF-8

The Windows executable manifests of tclsh.exe and wish.exe were changed to claim the UTF-8 character set. On my Windows 10 system with German locale, the command *encoding system* returns *utf-8* with Tcl/Tk 9.0 and *cp1252* with Tcl/Tk 8.6. This is due to the manifest change, not any change within Tcl/Tk.

Because of that, the `source` command's change to UTF-8 may not be *reverted* by:

```
source -encoding [encoding system] file.tcl
```

For me, this is in line with where Windows is heading. If I create a file nowadays using the Windows editor application, it is encoded in UTF-8. With older Windows versions, this was cp1252.

One effect of this is that a file written by an 8.x application using the default encoding will not be readable by a 9.x application unless the channel is explicitly configured to use the original encoding. Unfortunately, there is no way to know the original default encoding.
A **guess** may be made by checking the Windows registry; something along the lines of (error checking ignored)

```
package require registry
set winCodePage [registry get {HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage} ACP]
set tclName cp[format %u $winCodePage]
if {$tclName in [encoding names]} {
    fconfigure $chan -encoding $tclName
} else {
    ...try some other method...
}
```

## Full Unicode support

In Tcl 8, characters outside the BMP were stored as a surrogate pair and effectively treated as a string of length 2. This is no longer the case in Tcl 9, which has full support for the entire Unicode range and will (correctly) treat these characters as strings of length 1. Any workarounds to deal with these shortcomings in Tcl 8 need to be removed.

See [TIP 497](https://core.tcl-lang.org/tips/doc/trunk/tip/497.md).

## Traces on upvars linked to arrays

In Tcl 8, traces set on an array would not fire when an element of the array was modified through an `upvar` link. This documented but incongruent behavior is fixed in Tcl 9. Array traces will fire even when the element is accessed through an `upvar` link.

See [TIP 634](https://core.tcl-lang.org/tips/doc/trunk/tip/634.md).

## `glob` no longer raises an error if no files match

In Tcl 9, the `glob` command returns an empty list if no files match the specified pattern instead of raising an error. This is true irrespective of the presence of the `-nocomplain` option, which is still accepted but has no significance. Any scripts that expect an error to be raised in such cases will need to be modified to check for an empty list instead.

See [TIP 637](https://core.tcl-lang.org/tips/doc/trunk/tip/637.md).

## Removal of deprecated commands or arguments

These commands or arguments have been deprecated and have been removed from Tcl 9.
Scripts making use of these should be modified as below:

* Replace `case` with `switch`
* Replace `read nonewline` with `read -nonewline`
* Replace `puts nonewline` with `puts -nonewline`

See [TIP 485](https://core.tcl-lang.org/tips/doc/trunk/tip/485.md).

## Deprecated `trace` subcommands `variable`, `vdelete`, `vinfo` have been removed

These commands have been deprecated since Tcl 8.4 and have been removed from Tcl 9. Scripts making use of these should be modified as below:

* Replace `trace variable` with `trace add variable`
* Replace `trace vdelete` with `trace remove variable`
* Replace `trace vinfo` with `trace info variable`

See [TIP 673](https://core.tcl-lang.org/tips/doc/trunk/tip/673.md).

## Removal of `tcl_precision`

The long deprecated variable `tcl_precision` that controlled conversion of floating point numbers to strings has been removed. Use the `format` command instead to control the number of digits generated.

See [TIP 488](https://core.tcl-lang.org/tips/doc/trunk/tip/488.md).

## Zipfs file system

Tcl 9 introduces the ability to embed scripts as a zip archive bound to an executable or shared library. The default build is configured to make use of this capability and binds the Tcl initialization and support scripts into the shared library (for shared builds) or executable (for static builds).

From a scripting perspective, this has several ramifications in terms of compatibility:

* There is a new file system type `zipfs` with its own rules for file names and directories. In particular, names are case sensitive even on Windows. Other differences include disallowing the creation of new files and directories. Existing files may be modified, but the changes are not persisted to disk and are lost after dismounting the archive. See the `zipfs` manpage for details.
* The feature has the side effect of introducing the concept of volumes on Unix (Windows programmers already had to deal with drives as volumes).
  ```
  % file volumes
  //zipfs:/ /
  ```
* One implication of the above is that searches for files can no longer simply begin at `/` but need to iterate across all volumes.
* The paths within `auto_path` *may* point to locations within the `zipfs` volume. These are not writable so any installers that copy files into that location will fail.

Channel option *-eofchar* on write is not supported any more. The default on MS-Windows for channels is now the empty string (like on all other platforms).

```
% set t [open test.txt w+]
file46a7048
% fconfigure $t -eofchar
% fconfigure $t -eofchar {a b}
bad value for -eofchar: must be non-NUL ASCII character
```

See [TIP 646](https://core.tcl-lang.org/tips/doc/trunk/tip/646.md).

## Threaded builds

Tcl builds are now threaded by default. The `tcl_platform(threaded)` variable is no longer defined. To check for threaded builds in a 8.6/9 compatible way, use

```
tcl::pkgconfig get threaded
```

See [TIP 491](https://core.tcl-lang.org/tips/doc/trunk/tip/491.md).
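For scripts that branch on the threading check described under "Threaded builds", the `tcl::pkgconfig` query can be wrapped as sketched below. The proc name `isThreaded` is hypothetical, and the fallback to `tcl_platform(threaded)` is an assumption for builds where the `threaded` key might be unavailable; it is not something the TIP prescribes.

```tcl
# Hypothetical wrapper: report whether this Tcl build is threaded, as 0 or 1.
proc isThreaded {} {
    if {![catch {tcl::pkgconfig get threaded} result]} {
        # Normalize whatever boolean form is reported to 0 or 1.
        return [expr {$result ? 1 : 0}]
    }
    # Assumed fallback: consult the legacy variable if it is still defined.
    if {[info exists ::tcl_platform(threaded)]} {
        return [expr {$::tcl_platform(threaded) ? 1 : 0}]
    }
    return 0
}

# Usage: branch on the result instead of reading tcl_platform(threaded).
if {[isThreaded]} {
    puts "threaded build"
}
```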