Tcl Source Code

View Ticket
Login
Ticket UUID: 29e8848eb9766b81413989ad8125b0c22d1d2c7e
Title: calling an imported alias in a deletion trace on the alias causes Tcl to cycle on routine resolution
Type: Bug Version:
Submitter: pooryorick Created on: 2020-08-10 14:03:11
Subsystem: - New Builtin Commands Assigned To: nobody
Priority: 5 Medium Severity: Minor
Status: Open Last Modified: 2020-08-17 16:39:55
Resolution: Fixed Closed By: nobody
    Closed on:
Description: (text/x-fossil-wiki)
A command is typically available while its deletion is in progress:

<code><verbatim>
namespace eval ns1 {
    proc p1 {oldname newname op} {
        puts [list $oldname signing off!]
    }
    trace add command p1 delete [namespace which p1]
    rename p1 {}
}
</verbatim></code>

Output:

<code><verbatim>
::ns1::p1 signing off!
</verbatim></code>

However, in the case of an imported alias, the result is instead an infinite
loop in command dispatch:

<code><verbatim>
namespace eval ns2 {
    namespace export *

    proc p1 {oldname newname op} {
        puts [list $oldname is being deleted!]
    }
    interp alias {} [namespace current]::p2 {} [namespace which p1]
}


namespace eval ns3 {
    namespace import ::ns2::p2
}


proc ondelete {oldname newname op} {
    catch {
        ::ns3::p2 $oldname $newname $op
    } cres
    puts $cres
}


trace add command ::ns2::p2 delete [namespace which ondelete]
rename ns2::p2 {}
</verbatim></code>

Output:

<code><verbatim>
too many nested evaluations (infinite loop?)
</verbatim></code>

What's happening here is that <code>tclNamesp.c:InvokeImportedNRCmd</code>
extracts the command structure and passes it to
<code>tclBasic.c:TclNREvalObjv</code>, along with the words of the original
command, but then <code>tclBasic.c:EvalObjvCore</code> uses the
<code>CMD_IS_DELETED</code> flag to decide to discard the command structure
that was passed to it, and instead resolves the command name again, which leads
back to <code>tclNamesp.c:InvokeImportedNRCmd</code> where the cyle repeats
itself.

<code>CMD_IS_DELETED</code> seems to be intended to mean "command deletion in
progress", so perhaps the check for it can simply be removed from
<code>tclBasic.c:EvalObjvCore</code>.


A script-level workaround is to lookup up the alias and call it directly:

<code><verbatim>
proc ondelete {oldname newname op} {
    set alias [interp alias {} $oldname]
    {*}$alias $oldname $newname $op
}
</verbatim></code>
User Comments: pooryorick added on 2020-08-12 08:00:13:
To put the difference succinctly, sebres' fix treats a deletion trace on a
routine as a post-deletion action, and my fix treats it as a pre-deletion
action.

sebres added on 2020-08-11 18:33:56: (text/x-fossil-wiki)
Although I've "fixed" this new test in the manner how the branch work - invocation of deleting command in trace is prohibited; test case namespace-57.1 is covering "attempt to invoke a deleted command" now, I'm confused a bit.

Related to Tcl 8.6 (before [7eed2baf73fbb1f9]) the fix that Nathan provided for this is rather a regression, because this test-case works exactly so as in my branch ("attempt to invoke a deleted command").

But related to Tcl 8.5 rather old 8.6th handling (and my branch) would be a regression (there it's possible to invoke the command in trace during a deletion)...

So I'd really like to listen the opinion of TCT about the issue.

pooryorick added on 2020-08-11 17:39:05: (text/x-fossil-wiki)
If the caller of <code>TclNREvalObjv</code> is expected to do its own due
diligence, then you are right and there is no need for
<code>TclNREvalObjv</code> to perform the checks itself.  On the other hand,
your fix incorrectly relates the <code>TCL_EVAL_NORESOLVE</code> and
<code>CMD_IS_DEAD</code> flags.  In [9641d70a9caa2017] I've added a new test to
your branch that illustrates the issue:  Because it happens to take a path
through <code>TclNRInvoke</code>, both the <code>TCL_EVAL_NORESOLVE</code> and
and <code>CMD_IS_DEAD</code> flags are set, causing <code>EvalObjvCore</code>
to reject the command when it should dispatch it.

sebres added on 2020-08-11 10:22:24: (text/x-fossil-wiki)
I'm agree, just I'm trying to follow where is this new flag CMD_DEAD really expected. For instance see my alternative branch (without CMD_DEAD at all).
Everything works, the new test "namespace-57.0" is passed too. 
So either CMD_DEAD is completely superfluous at the moment or the test "namespace-57.0" is missing something yet (so would work within your commit and would fail in my branch bug-29e8848eb976-alt).

At the moment I can still not follow from the test why this CMD_DEAD is expected at all.

pooryorick added on 2020-08-11 09:39:37: (text/x-fossil-wiki)
I think my initial report answers that question:  A command should be available
for calling in any deletion traces that are evaluated during the course of its
deletion.  <code>CMD_IS_DELETED</code> should have been named
<code>CMD_DYING</code>.  Before the fix for this ticket,
<code>EvalObjvCore</code> interpreted it to mean that the command is no longer
available, and that's wrong.  During a deletion trace the command is still
available, even though it is already flagged as <code>CMD_IS_DELETED</code>.
Whem <code>EvalObjvCore</code> makes this incorrect intrepretation, errors like
the one illustrated in the initial report can occur.

sebres added on 2020-08-11 08:01:54: (text/x-fossil-wiki)
<i><blockquote>> I think the new flag is warranted because the Command structure includes a refCount flag, implying that is is meant to be shared and that there should be a way for something holding it to know whether it is still valid.
</blockquote></i>

Sure, just I don't see why in <code>EvalObjvCore</code> it would be important to distinguish between supplied command will be deleted or command is ultimately deleted. As well I don't follow why in case of CMD_IS_DELETED the command should be considered as still valid.

pooryorick added on 2020-08-10 19:56:27: (text/x-fossil-wiki)
Yes, I created the new <code>CMD_DEAD</code> flag and then inadvertently used
the <code>CMD_IS_DELETED</code> instead.  Fixed in [68a1f5598fe5b281].  I think
the new flag is warranted because the <code>Command</code> structure includes a
<code>refCount</code> flag, implying that is is meant to be shared and that
there should be a way for something holding it to know whether it is still
valid.  Maybe instead of the new flag, <code>Tcl_DeleteCommandFromToken </code>
could increment the <code>cmdEpoch</code>.  That might be a little more costly
to evaluate in <code>EvalObjvCore</code> though.

sebres added on 2020-08-10 19:31:09: (text/x-fossil-wiki)
[./vdiff?from=d97891f95235dc77&to=e0d0c4ffb276e6a3|Here is alternative branch] without new (unneeded?) flag CMD_DEAD.

sebres added on 2020-08-10 19:17:53: (text/x-fossil-wiki)
@Nathan: Is the fix [7eed2baf73] really fulfilled? Looks a bit strange.

Because a new flag <code>CMD_DEAD</code> checking in tclBasic (EvalObjvCore) now will be nowhere set :)
Either you wanted set it in 3271 (instead of <code>CMD_IS_DELETED</code>)... Or it is even enough, and a new flag (as well as a check for deleted) is unneeded at all.

In this case alternative diff (doing the same as your code without new flag CMD_DEAD) relative previous version would be something like that:
<pre><code>
<b style="color:blue">@@ -4331,13 +4331,9 @@ EvalObjvCore(</b>
          * Caller gave it to us.
          */
 
<b style="color:red">-	if (!(preCmdPtr->flags & CMD_IS_DELETED)) {
-	    /*
-             * So long as it exists, use it.
-             */
-
-	    cmdPtr = preCmdPtr;
-	} else if (flags & TCL_EVAL_NORESOLVE) {</b>
<b style="color:green">+	if ( (flags & TCL_EVAL_NORESOLVE)
+	  && (preCmdPtr->flags & CMD_IS_DELETED)
+	) {</b>
 	    /*
 	     * When it's been deleted, and we're told not to attempt resolving
 	     * it ourselves, all we can do is raise an error.
<b style="color:blue">@@ -4348,6 +4344,7 @@ EvalObjvCore(</b>
 	    Tcl_SetErrorCode(interp, "TCL", "EVAL", "DELETEDCOMMAND", NULL);
 	    return TCL_ERROR;
 	}
<b style="color:green">+	cmdPtr = preCmdPtr;</b>
     }
     if (cmdPtr == NULL) {
 	cmdPtr = TEOV_LookupCmdFromObj(interp, objv[0], lookupNsPtr);
</code></pre>

Just...<br/>
@all: it looks like [./artifact?name=4cd056636cb5020b&ln=4340-4350|this code] is not covered at all (or I can simply not find the test case covering that).