Check-in [7f50ba4aca]


Overview
Comment: New TIP 504
SHA3-256: 7f50ba4aca98e5fb52eb0538eb6230bc26cd11ce3bf5ca8f5e47a521c77a6f79
User & Date: dgp 2018-03-21 19:07:06
Context
2018-03-26 16:01 New TIP 505. check-in: 001cc04be5 user: dgp tags: trunk
2018-03-21 19:07 New TIP 504 check-in: 7f50ba4aca user: dgp tags: trunk
2018-03-21 12:25 Completed TIP 503 vote. YES Porter, Landers, Fellows, Kenny, Kupries Partial YES Nijtmans Partial PRESENT Nijtmans check-in: faf9df40f0 user: dgp tags: trunk
Changes

Changes to index.json.

cannot compute difference between binary files

Changes to index.md.

<th>#</th>
<th>Type</th>
<th>Tcl Version</th>
<th>Status</th>
<th>Title</th>
</tr></thead><tbody>








<tr class='project projectfinal projectfinal87 project87'>
<td valign='top'><a href='./tip/503.md'>503</a></td>
<td valign='top'>Project</td>
<td valign='top'>8.7</td>
<td valign='top'>Final</td>
<td valign='top'># TIP 503: End Tcl 8.3 Source Compatibility Support</td>
</tr>






<th>#</th>
<th>Type</th>
<th>Tcl Version</th>
<th>Status</th>
<th>Title</th>
</tr></thead><tbody>

<tr class='project projectdraft projectdraft87 project87'>
<td valign='top'><a href='./tip/504.md'>504</a></td>
<td valign='top'>Project</td>
<td valign='top'>8.7</td>
<td valign='top'>Draft</td>
<td valign='top'># TIP 504: New subcommand [string insert]</td>
</tr>
<tr class='project projectfinal projectfinal87 project87'>
<td valign='top'><a href='./tip/503.md'>503</a></td>
<td valign='top'>Project</td>
<td valign='top'>8.7</td>
<td valign='top'>Final</td>
<td valign='top'># TIP 503: End Tcl 8.3 Source Compatibility Support</td>
</tr>

Added tip/504.md.

# TIP 504: New subcommand [string insert]
	Author:         Don Porter <[email protected]>
	State:          Draft
	Type:           Project
	Vote:           Pending
	Created:        21-Mar-2018
	Obsoletes:	475
	Post-History:   
	Keywords:	Tcl,string,insert
	Tcl-Version:	8.7
-----

# Abstract

This TIP proposes a [`string insert`] subcommand for inserting a substring at a
given index.  This new [`string insert`] command is to be the string analogue of
[`linsert`].

# History

[[TIP 475]](475.md) proposed the same subcommand, and in addition proposed
a public C routine to access the same functionality. It was rejected, but
only for reasons relating to the C routine. This TIP is a duplication of
the rejected TIP, modified only to remove the C routine from the proposal,
and with appropriate revisions to the sections describing the Reference
Implementation and Future Work.
 
# Rationale

Substring insertion is a basic string operation not directly available in
current Tcl.  Substring insertion can be synthesized from existing string
commands, but the numerous legal forms of indexing yield significant difficulty.
A novice user cannot be expected to know, much less implement, all possible
index formats.  Thus it is reasonable to provide a standard substring insertion
command.

The current design of [`string replace`] expressly (albeit inexplicably)
prevents its use for performing string insertion.  TIP 323 originally proposed
to extend [`string replace`] to allow string insertion, but this aspect of TIP
323 was [withdrawn]
(https://core.tcl.tk/tips/fdiff?v1=6e0ba0ee9838accc&v2=34809f1432fc528f&sbs=1)
for the sake of compatibility.
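
The limitation can be observed directly. The following is a small sketch, assuming the documented behavior of [`string replace`] in current Tcl releases; the variable names are illustrative only.

```tcl
set s "abcdef"

# An empty range (first greater than last) makes [string replace]
# return the string untouched, so nothing is inserted.
set unchanged [string replace $s 3 2 XYZ]

# Insertion therefore has to be synthesized by hand from two ranges.
set inserted [string range $s 0 2]XYZ[string range $s 3 end]

puts $unchanged   ;# abcdef
puts $inserted    ;# abcXYZdef
```

The hand-written form only works here because the index is a plain integer; it would need index arithmetic of its own to accept `end` or `end-1`.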

> *Clarification: TIP 504 proposes no changes to the semantics of [`string
> replace`].*

To mirror the behavior of [`linsert`], [`string insert`] at `end` should append
to the string.  This is in conflict with [`string replace`], were it to be
extended to permit replacing an empty substring.

[`lreplace`] of an empty range at `end` inserts elements immediately before the
final element, whereas appending requires the start index to be `end+1`.  The
hypothetical extended [`string replace`] would mirror [`lreplace`] and therefore
would not have the end-relative indexing semantics needed to implement [`string
insert`].
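
The difference between the two list commands at `end` can be seen in a short sketch; the list contents and variable names are chosen only for illustration.

```tcl
set items {a b c}

# [linsert] at "end" appends to the list.
set appended [linsert $items end X]

# [lreplace] with an empty range (last less than first) starting at
# "end" inserts before the final element instead.
set beforeLast [lreplace $items end end-1 X]

puts $appended     ;# a b c X
puts $beforeLast   ;# a b X c
```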

In conclusion, it is most straightforward to simply provide a [`string insert`]
command with the same semantics as [`linsert`].

# Current Behavior

Currently available methods for string insertion are awkward and do not handle
end-relative and [TIP 176-style](176.md) indexing without extraordinary effort.
To demonstrate the surprising degree of complexity, the following is a pure Tcl
script reference implementation intended to handle all possible index formats
and corner cases:

<a name="ref"></a>

>
    # Pure Tcl implementation of [string insert] command.
    proc ::tcl::string::insert {string index insertString} {
        # Convert end-relative and TIP 176 indexes to simple integers.
        if {[regexp -expanded {
            ^(end(?![\t\n\v\f\r ])      # "end" is never followed by whitespace
            |[\t\n\v\f\r ]*[+-]?\d+)    # m, with optional leading whitespace
            (?:([+-])                   # op, omitted when index is "end"
            ([+-]?\d+))?                # n, omitted when index is "end"
            [\t\n\v\f\r ]*$             # optional whitespace (unless "end")
        } $index _ m op n]} {
            # Convert first index to an integer.
            switch $m {
                end     {set index [string length $string]}
                default {scan $m %d index}
            }
>
            # Add or subtract second index, if provided.
            switch $op {
                + {set index [expr {$index + $n}]}
                - {set index [expr {$index - $n}]}
            }
        } elseif {![string is integer -strict $index]} {
            # Reject invalid indexes.
            return -code error "bad index \"$index\": must be\
                    integer?\[+-\]integer? or end?\[+-\]integer?"
        }
>
        # Concatenate the pre-insert, insertion, and post-insert strings.
        string cat [string range $string 0 [expr {$index - 1}]] $insertString\
                   [string range $string $index end]
    }
>
    # Bind [string insert] to [::tcl::string::insert].
    namespace ensemble configure string -map [dict replace\
            [namespace ensemble configure string -map]\
            insert ::tcl::string::insert]

More sample implementations can be found on the [Additional String Functions]
(http://wiki.tcl.tk/44#pagetoc706ab8bb) page of the Tcler's Wiki, but at the time
of writing, they neither handle end-relative indexing nor can be used to append
to a string.  Since they are implemented in terms of [`string replace`] and do not
perform any index arithmetic of their own, they actually do support TIP 176
indexes.
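
To illustrate the trade-off, here is a sketch of a helper written in that style. It is a hypothetical example and is not copied from the wiki; the name `insertViaReplace` is an assumption made for this illustration. It replaces the character at the index with the inserted text followed by that same character.

```tcl
# Hypothetical helper, not taken from the wiki: insert by replacing the
# character at $index with $insertString followed by that character.
proc insertViaReplace {string index insertString} {
    string replace $string $index $index \
            $insertString[string index $string $index]
}

puts [insertViaReplace abcdef 3 XYZ]     ;# abcXYZdef
puts [insertViaReplace abcdef 1+2 XYZ]   ;# abcXYZdef (TIP 176 index)
puts [insertViaReplace abcdef end XYZ]   ;# abcdeXYZf, not an append
puts [insertViaReplace abcdef 6 XYZ]     ;# abcdef, nothing inserted
```

Because the index is passed through unchanged, TIP 176 arithmetic works, but `end` lands before the last character rather than appending as [`linsert`] would, and an index past the last character inserts nothing.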

# Compatibility Considerations

The existence of a command named [`string insert`] breaks any existing code that
assumes [`string in`] is an unambiguous abbreviation for [`string index`].  Two
options exist:

1. Special-case [`string in`] to mean [`string index`].
2. Take no special action, in which case [`string in`] becomes an error.

This TIP proposes option #2 because abbreviations are not guaranteed to be
stable in the long term.  This TIP targets Tcl 8.7, the first alpha version of
which was recently released.  Thus there are three reasons why compatibility is
deemphasized in this situation.
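
A short sketch of what option #2 means for scripts follows. It is written to run both with and without the new subcommand, so the outcome of the abbreviated call is reported rather than assumed.

```tcl
set s "abc"

# Spelling the subcommand out in full is unaffected by the new subcommand.
set ch [string index $s 1]
puts $ch   ;# b

# The abbreviation depends on which subcommands exist.  Where
# [string insert] is present, "in" no longer identifies a unique
# subcommand and the call raises an error.
if {[catch {string in $s 1} result]} {
    puts "abbreviation rejected: $result"
} else {
    puts "abbreviation accepted: $result"
}
```

Scripts that already spell out `string index` need no change.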

# Specification

Add a new [`string insert`] command:

> **string insert** *string index insertString*

> Returns a copy of *string* with *insertString* inserted at the *index*'th
> character.  *index* may be specified as described in the [**STRING
> INDICES**](https://www.tcl.tk/man/tcl/TclCmd/string.htm#M54) section.

> If *index* is start-relative, the first character inserted in the returned
> string will be at the specified index.  If *index* is end-relative, the last
> character inserted in the returned string will be at the specified index.

> If *index* is at or before the start of *string* (e.g., *index* is **0**),
> *insertString* is prepended to *string*.  If *index* is at or after the end of
> *string* (e.g., *index* is **end**), *insertString* is appended to *string*.
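
The intended results can be sketched with a small model that runs without the new subcommand. This is an assumption-laden illustration, not the proposed implementation: the hypothetical proc `modelStringInsert` treats the string as a list of characters and lets [`linsert`] resolve the index, relying on the stated goal that the two commands share semantics.

```tcl
# Hypothetical model, not the proposed implementation: treat the string
# as a list of characters and let [linsert] resolve the index.
proc modelStringInsert {string index insertString} {
    join [linsert [split $string ""] $index $insertString] ""
}

puts [modelStringInsert abcdef 0 XYZ]       ;# XYZabcdef
puts [modelStringInsert abcdef 2 XYZ]       ;# abXYZcdef
puts [modelStringInsert abcdef end XYZ]     ;# abcdefXYZ
puts [modelStringInsert abcdef end-1 XYZ]   ;# abcdeXYZf
puts [modelStringInsert abcdef -5 XYZ]      ;# XYZabcdef
puts [modelStringInsert abcdef 100 XYZ]     ;# abcdefXYZ
```

With index `2` the first inserted character sits at index 2 of the result, and with `end-1` the last inserted character sits at `end-1` of the result, matching the start-relative and end-relative rules above.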

# Reference Implementation

A pure Tcl reference implementation is given [above](#ref).

The [`dgp-string-insert`]
(http://core.tcl.tk/tcl/timeline?t=dgp-string-insert) branch in the Tcl Fossil
repository provides an implementation of the proposed subcommand, complete
with documentation, bytecode compilation, and a set of test cases.

# Future Work

The direct evaluation of [`string insert`] is routed through a new
internal routine `TclStringReplace`. It is a conventional substring
replacer routine that serves as the inner core of both the
[`string insert`] and [`string replace`] commands with suitable
screening of corner cases by the callers.  There is one routine to
perform this function, so that there is one place to get the 
functionality right (*debugging*), one place to work on performance
and representation efficiency (*optimization*), and one place where
we can experiment with transformation to different data structures.
This is one of a family of `TclStringFoo` routines that are engines
of functionality for other [`string foo`] subcommands.

The existing internal routine `TclStringReplace` does not include
the full collection of optimizations that the prior routine
`Tcl_ReplaceObj` did. Since this routine remains internal, it can
continue to gain these revisions without further TIP examination.
Likewise, alternative bytecode compiler and execution strategies may
also be pursued internally.

These routines may be good candidates to become available to applications
and extensions in the public C API. The current TIP does not propose that;
it is out of scope. A set of questions will need to be addressed when
considering converting these routines into public ones. First, it will
need to be determined what level of robustness to present in a public
interface. Should such a function be permitted to fail or even abort
if given improper arguments? Or should all required argument validation
be built into the routines to detect such things and raise catchable
errors instead? This general question includes the specific question
of whether such routines are permitted to panic when memory allocation
fails. Second, several of these routines require an argument specifying
an index into a string. These arguments have type `int`. It is expected
that string indices limited to `int` will no longer be desirable in Tcl 9,
so it must be decided whether it makes sense to create new routines
for Tcl 8.7 that are destined to be discarded by Tcl 9, or whether a
migration path needs to be put in place from the beginning. Since these
questions are non-trivial, addressing them is saved for a later TIP.

# Copyright

This document has been placed in the public domain.