Tcl Improvement Proposals: Artifact [8d5eccdcfe]

Artifact 8d5eccdcfeae8b9f9daa39a0b6d526b4de984403c02152831d3fd34ab9160ae2:

File tip/280.tip — part of check-in [b0d0a1a2de] at 2006-11-29 17:15:23 on branch trunk — Implemented by AK (user: dkf size: 18747)
TIP:		280
Title:		Add Full Stack Trace Capability With Location Introspection
Version:	$Revision: 1.13 $
Author:		Andreas Kupries <[email protected]>
Author:		Andreas Kupries <[email protected]>
State:		Final
Type:		Project
Vote:		Done
Created:	10-Aug-2004
Post-History:	
Keywords:	Tcl
Obsoletes:	211
Tcl-Version:	8.5

~ Abstract

This TIP proposes adding a new subcommand to the '''info''' command to get a
list of all the frames on the current stack, with additional information about
command location, type of execution, etc., rather than the limited list
returned by '''info level'''. It is also related to Peter MacDonald's [86], or
rather, to the '''info linenum''' feature proposed there. The base feature of
providing location information for a command was extended, allowing the user
to ask for the current location in all stack levels. Enough information is
returned to consider this as an extended [211].

~ Rationale

The motivation for this feature is the debugging of scripts in situations
where an error stack is wanted, to see where the problem occurred, but
aborting the execution is not desired. To enable this a means of providing the
essential information shown in an error stack is needed, which does not
require an abort (and subsequent unwinding of the C stack) to assemble this
information.

An example is the testing of scripts where an unexpected error should not fail
the test case, nor the framework, yet still allow the recording of the
problematic location and how it was reached in some log.

The original motivation for [211] was that there is currently no way to get a
list of all the frames in the current stack managed by
'''Tcl_PushCallFrame()''' and '''Tcl_PopCallFrame()'''. The '''info level'''
command does not contain frames that are callers of '''uplevel''', reporting
only the frames that are accessible via another '''uplevel''' command. There
are times when the lack of information can have a negative impact on code
design.

This motivation asks in essence for an error stack as well, but limited itself
to the returning commands themselves, and not the other information, like the
line the command is on, its context, etc.

Other use cases, also found in [211]:

 1. '''tcltest''', and other testing frameworks.

 > The first case is with the core's Tcltest package, where the
   complete lack of ability to gain access to that information means
   it is impossible to gain information about a test without modifying
   the Tcltest code itself. Being able to find out the caller info
   would be very useful, especially for logging information. Currently,
   there is no way to get the caller's info, due to the fact that the
   code for the test is ''uplevel''ed and, hence, not visible via
   '''info level'''.

 2. '''TestStubs Package'''

 > The TestStubs package provides the ability to temporarily redefine
   commands, in particular for stubbing out or replacing functionality in a
   test case. There is a command in the package called '''chain''', which is
   used within the code replacing a command (or part of a command) to call the
   original definition of the command. For example, one could do:

|stubs::stub ensemble array names {
|    return [lsort [uplevel 1 chain names $args]]
|}

 > However, since the '''chain''' command is (and should be) limited to only
   running from within a stub definition, it needs to call '''info level''' to
   find out if its caller is one of the stubbed commands, and what the name of
   that command is. With '''info level''', it would not have access to the
   level that is running inside the stubbed procedure. Hence, either it cannot
   check this constraint, or stubs cannot be allowed to use '''uplevel''' when
   calling it (which means things like the above either cannot work, or need
   to be rewritten in a considerably less clear manner).

~ Specification of the Proposed Change

~~ Tcl Level API

The builtin command '''info''' is extended to accept a new subcommand,
'''frame'''. When this subcommand is called it returns information about the
current command and its location. This information is available not only for
the current stack level, but also the higher stack levels used to reach the
current location.

The syntax of the new subcommand is

 > '''info frame''' ?''level''?

The new functionality will provide access to all frames on the stack rather
than the current limited subset. This TIP does ''not'' propose to alter
'''uplevel''' or '''upvar''' so that they can see these hidden levels.

If ''level'' is not specified, this command returns a number giving the frame
level of the command. This is 1 if the command is invoked at top-level.

If ''level'' is specified, then the result is a dictionary containing the
location information for the command at the ''level'' on the stack.

If ''level'' is positive (> 0) then it selects a particular stack level (1
refers to the top-most active command, i.e., '''info frame''' itself, 2 to the
command it was called from, and so on); otherwise it gives a level relative to
the current command (0 refers to the current command, i.e., '''info frame'''
itself, -1 to its caller, and so on).

This is similar to how '''info level''' works, except that this subcommand
reports all frames, like '''source''''d scripts, '''eval'''s, '''uplevel'''s,
etc.

Note that for nested commands, like "foo [[bar [[x]]]]" only "x" will be seen
by an '''info frame''' invoked within "x". This is the same as for '''info
level''' and error stack traces.

The result dictionary may contain the keys listed below, with the specified
meanings for their values.

 * '''type'''

 > This entry is always present and describes the nature of the location for
   the command. The recognized values are '''source''', '''proc''',
   '''eval''', and '''precompiled'''.

 > In some circumstances it makes sense to have this information extensible,
   i.e. to allow user-defined type names. For more about this topic see the
   discussion at the end of the document.

 > * '''source''' means that the command is found in a script loaded by the
     '''source''' command.

 > * '''proc''' means that the command is found in dynamically created
     procedure body.

 > * '''eval''' means that the command is executed by '''eval''' or
     '''uplevel'''.

 > * '''precompiled''' means that the command is found in a precompiled script
     (loadable by the package ''tbcload''), and no further information will be
     available.

 * '''line'''

 > This entry provides the number of the line the command is at inside of the
   script it is a part of. This information is not present for type
   '''precompiled'''. For type '''source''' this information is counted
   relative to the beginning of the file, whereas for the last two types the
   line is counted relative to the start of the script.

 * '''file'''

 > This entry is present only for type '''source'''. It provides the
   normalized path of the file the command is in.

 * '''cmd'''

 > This entry provides the string representation of the command. This is
   usually the unsubstituted form, however for commands which are a pure list
   executed by eval it is the substituted form as they have no other string
   representation. Care is taken that the pure-List property of the latter is
   not spoiled.

 * '''proc'''

 > This entry is present only if the command is found in the body of a regular
   Tcl procedure. It then provides the name of that procedure.

 * '''lambda'''

 > This entry is present only if the command is found in the body of an
   anonymous Tcl procedure, i.e. a lambda. It then provides the entire
   definition of the lambda in question.

 * '''level'''

 > This entry is present only if the queried frame has a corresponding frame
   returned by '''info level'''. It provides the index of this frame, relative
   to the current level (0 and negative numbers).

A thing of note is that for procedures statically defined in files the
locations of commands in their bodies will be reported with type '''source'''
and absolute line numbers, and not as type '''proc'''. The same is true for
procedures nested in statically defined procedures, and literal eval scripts
in files or statically defined procedures.

In contrast, for a procedure definition or eval within a dynamically
eval'uated environment count linenumbers relative to the start of their
script, even if they would be able to count relative to the start of the outer
dynamic script. That type of number usually makes more sense.

A different way of describing this behaviour is that we track file based
locations as deeply as we can, and where we cannot the lines are counted based
on the smallest possible eval or procbody scope, as that scope is usually
easier to find than any dynamic outer scope.

The syntactic form '''{expand}''' is handled like '''eval'''. This means that
if it is given a literal list argument the system tracks the line-number
within the list words as well, and otherwise all line-numbers are counted
relative to the start of each word (smallest scope)

The following other builtin commands are changed as well to support the
tracking of line numbers:

 1. catch

 1. dict for

 1. dict with

 1. eval

 1. for

 1. foreach

 1. if

 1. interp eval

 1. namespace eval

 1. proc

 1. source

 1. switch

 1. while

~~ Public C API

No changes are made to the public C API.

~ Examples

Note that this is not a complete set of examples covering all possible cases.
Let us assume that the file is named EX. The "cmd" is always the "info frame
..." command, and "level" is 0 too, always, in this situation. This is left
out of the result dictionaries to keep them small.

| puts [info frame 0]           ;# 01* - output: {cmd {info frame 0} line 1 file EX type source}
|                               ;# 02
| proc foo {} {                 ;# 03  /foo remembers 3
|     puts [info frame 0]       ;# 04*
| }                             ;# 05
| foo                           ;# 06  - output: {cmd {...} line 4 file EX type source}
|                               ;# 07
| set script {                  ;# 08  1
|     puts [info frame 0]       ;# 09  2*
| }                             ;# 10  3
| eval $script                  ;# 11  - output: {cmd {...} line 2 type eval}
|                               ;# 12
| eval {                        ;# 13  1
|     puts [info frame 0]       ;# 14* 2*
| }                             ;# 15  3  - output: {cmd {...} line 14 type source}
|                               ;# 16
| proc fox {} {                 ;# 17  /fox remembers 17
|     eval $::script            ;# 18
| }                             ;# 19
| fox                           ;# 20  - output: {cmd {...} line 2 type eval}
|                               ;# 21
| proc squirrel {} {            ;# 22  /squirrel remembers 22
|     eval {                    ;# 23  1
|         puts [info frame 0]   ;# 24* 2*
|     }                         ;# 25  3
| }                             ;# 26
| squirrel                      ;# 27  - output: {cmd {...} line 24 type source}
|                               ;# 28
| proc fuchs {} {               ;# 29    /fuchs remembers 29
|     proc dog {} {             ;# 30  1 /dog   remembers 30
|         puts [info frame 0]   ;# 31* 2*
|     }                         ;# 32  3
| }                             ;# 33
| fuchs                         ;# 34
| dog                           ;# 35  - output: {cmd {...} line 31 type source}
|                               ;# 36
| eval {                        ;# 37  1
|     proc wolf {               ;# 38  2 /wolf remembers 38
|         puts [info frame 0]   ;# 39* 3*
|     }                         ;# 40  4
| }                             ;# 41  5
| wolf                          ;# 42  - output: {cmd {...} line 39 type source}
|                               ;# 43
| set scripted {                ;# 44  1
|     proc deer {} {            ;# 45  2 /deer remembers 2
|         puts [info frame 0]   ;# 46  3*
|     }                         ;# 47  4
| }                             ;# 48  5
| eval $scripted                ;# 49
| deer                          ;# 50  - output: {cmd {...} line 3 type eval}
|                               ;# 51
| puts [set a b                 ;# 52
|       info frame 0]           ;# 53* - output: {cmd {...} line 53 type source file EX}
|                               ;# 54
| puts [info \
|       frame 0]                ;# 56  - output: {cmd {...} line 55 type source file EX}
|                               ;# 57
| proc salmon {} \
| {                             ;# 59
|     puts [info frame 0]       ;# 60*
| }                             ;# 61
| salmon                        ;# 62  - output: {cmd {...} line 60 type source file EX}
|                               ;# 63
| set method frame              ;# 64
| puts [info $method 0]         ;# 65* - output: {cmd {info $method 0} line 65 type source file EX}
|                               ;# 66
| proc trout {} {               ;# 67
|     puts [info $method 0]     ;# 68*
| }                             ;# 69
| trout                         ;# 70  - output: {cmd {info $method 0} line 68 type source file EX}

Another example showing how to query the whole stack of frames. It assumed
that the file is named EX. The output shown after the example was manually
reformatted to explicitly ordered the keys and the columns vertically aligned,
for readability and better comparability of the lines. The returned level
information was removed, and level numbers for '''info frame''' were added at
the beginning of the lines.

| proc setRes {result} {               ;# 01
|   if {$result == "fail"} {           ;# 02
|      set level [info frame]          ;# 03
|      while {$level} {                ;# 04
|           puts [info frame $level]   ;# 05
|           incr level -1              ;# 06
|       } ;# end of while              ;# 07
|   } ;# end of if                     ;# 08
| } ;# end of proc setRes              ;# 09
| proc runTest {tc} {                  ;# 10
|      # run the testcase              ;# 11
|      uplevel 1  setRes fail          ;# 12
| } ;# end of proc runTest             ;# 13
| runTest TC0001                       ;# 14

'''Output:'''

| 4 {cmd {runTest TC0001}        line 14 type source file EX}
| 3 {cmd {uplevel 1 setRes fail} line 12 type source file EX proc runTest}
| 2 {cmd {setRes fail}           line  1 type eval}
| 1 {cmd {info frame $level}     line  5 type source file EX proc setRes}

To see the connection between the new feature and error stack traces replace
the '''info frame''' command on line 3 of ''setRes'' with "'''return''' -code
error X". This will generate an error trace and with a bit of reformatting the
relationship can be seen easily:

'''Stack trace:'''

|	X
|	    while executing
|	"setRes fail"
|	    ("uplevel" body line 1)
|	    invoked from within
|	"uplevel 1  setRes fail"
|	    (procedure "runTest" line 3)
|	    invoked from within
|	"runTest TC0001"
|	    (file "EX.tcl" line 14)

Reformat the trace, with all output from a single command on one line, and
some noise removed.

|	X
|	"setRes fail" ("uplevel" body line 1)
|	"uplevel 1  setRes fail" (procedure "runTest" line 3)
|	"runTest TC0001" (file "EX.tcl" line 14)

Now revert the order of the lines, and add vertical alignment. Now have in
essence the ouput of the unmodified example, with same difference in the line
numbers. But that is only because the error stack trace always counts lines
relative to the script, and doesn't attempt to determine an absolute location
in a file.

|	"runTest TC0001"         (file "EX.tcl" line 14)
|	"uplevel 1  setRes fail" (procedure "runTest" line 3)
|	"setRes fail"            ("uplevel" body line 1)
|	X

~~ Child Interpreters

The current implementation of '''info level''' only returns levels up to the
top of the stack for the current interpreter. Such an approach puts
limitations on what information can be retrieved, but allows for a certain
level of "''security''" when running code in child interps, especially safe
interps.

Given the security considerations of safe interps, and consistancy with
regards to what information is returned across multiple circumstances, the
stack trace returned will only return information up to the top level of the
current interp, the same limit '''info level''' is bound by.

~ Discussion

A point noted by Lars Hellstrom is that this TIP makes the builtin
'''source''' command special. It is not possible anymore to re-create
'''source''' purely in Tcl, doing so will loose the name of the file. We are
still tracking line numbers, but they are then relative to the start of the
eval'd script, withut a connection to the file.

In a similar way, user-defined control structures lose the ability of the
builtin ''if'', etc. to track line numbers in a literal code block relative to
the enclosing file. The user-defined commands are again reduced to providing
line numbers relative to the start of the code block.

To bring user-defined commands back to the same level as the builtins we need:

 1. An extension of '''info frame''' which delivers the line information for
    all words of the command we are inside of. As each word can start on a
    different line (because of continuation lines and preceding multi-line
    literals).

  > This information is actually already available internally. The current
    implementation needs it for the tracking into literals.

 1. Extensions to the ''uplevel'' and ''eval'' commands which allow a script
    to provide them with location information for the script they are
    evaluating.

For example:

| proc newsource {foo script} {
|   ... preprocess whatever ...
|   set loc [info frame data for word 2]
|   uplevel -location $loc 1 $script
| }

Internally we also need quite a bit more flexible data structures. The type
names for example are defined through a C enum, and can be checked quickly and
efficiently. The file information, i.e. path, is stored and used dependent on
that, only for type '''source'''.

To keep this TIP and its implementation reasonably small the actual detailed
specification and implementation of such extensibility is defered to a future
TIP.

~ Reference Implementation

An implementation patch is available on
SourceForge[http://sourceforge.net/support/tracker.php?aid=1571568].

~ Copyright

This document has been placed in the public domain.

Please note that any correspondence to the author concerning this TIP is
considered in the public domain unless otherwise specifically requested by the
individual(s) authoring said correspondence. This is to allow information
about the TIP to be placed in a public forum for discussion.