Tcl Source Code

Ticket Change Details
Overview

Artifact ID: b11ef71db56e7fc8312483930d6986fb4b827f7783fd366c55ef42acd62f72d7
Ticket: de232b49f26da1c18e07513d4c7caa203cd27910
write-only nonblocking refchan and Tcl internal buffers
User & Date: apnadkarni 2024-04-02 07:35:31
Changes

  1. icomment:
    Nathan,
    
    I'll make one last attempt at persuasion that
    
    - from an architectural point of view, generating I/O callback events without
    knowing channel state is not at all an appropriate model for async event driven
    I/O.
    
    - The related changes in trunk vis-a-vis 8.6 are broken in multiple aspects,
    mostly because of the above.
    
    The motivation for async i/o is that applications can do useful work while
    waiting for I/O to complete. Now this by itself does not need select on Unix,
    completion ports on Windows or fileevent in Tcl. One can just call the i/o
    functions and check for "EAGAIN" equivalents. There are multiple problems with
    this as the application has to poll:
    
    - Try too frequently and it's wasting processing cycles handling EAGAINs.
    
    - Try too infrequently and there is an unnecessary delay / latency in servicing
    I/O requests.
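
To make the polling trade-off concrete, here is a minimal sketch, assuming a Tcl 8.6+ `chan pipe` as a stand-in for any non-blocking channel. The `poll` proc, the 10 ms interval and the `polls` counter are illustrative choices, not anything from the core. An empty read with `chan blocked` reporting true plays the role of EAGAIN.

```tcl
# Polling sketch: the application itself decides when to retry.
lassign [chan pipe] rd wr
chan configure $rd -blocking 0
chan configure $wr -buffering none

set polls 0

# Hypothetical helper: try to read, and retry after intervalMs if nothing is there.
proc poll {chan intervalMs} {
    global polls result
    incr polls
    set data [chan read $chan]
    if {$data eq "" && [chan blocked $chan]} {
        # EAGAIN equivalent: no data yet, so schedule another attempt.
        after $intervalMs [list poll $chan $intervalMs]
        return
    }
    set result $data
}

# The data only shows up 50 ms from now; the poller cannot know that.
after 50 [list puts -nonewline $wr hello]
poll $rd 10
vwait result

chan close $rd
chan close $wr
puts "received '$result' after $polls polls"
```

Every poll before the data arrives is wasted work, and once the data is there it still waits for the next poll. Shrinking the interval only trades one cost for the other.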
    
    An event system solves both the above **as long as events are generated based on
    the I/O state.** No time is wasted in unnecessary polls and there is no
    additional delay once the channel is ready for I/O. However, if I/O events are
    generated based on timers **with no knowledge of I/O state** it has exactly the
    same effect as application polling! It is completely pointless: events are
    generated when the I/O state does not reflect channel readiness, and there are
    unnecessary delays after readiness until the timer expires. If you truly believe in this as a
    solution, you should be amenable to completely getting rid of Tcl's I/O related
    event subsystem! The application could just generate write events using `chan
    postevent` on a regular basis with `after`. This is basically what the current
    Tcl 9 implementation does, queueing events on a timer basis.
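
For contrast, a sketch of the state-driven model on the same kind of pipe, again with illustrative names (`onReadable`, `wakeups`). Here the handler runs only when the notifier reports that the channel is actually readable, so there is no interval to tune.

```tcl
# Event-driven sketch: the notifier wakes the handler based on channel state.
lassign [chan pipe] rd wr
chan configure $rd -blocking 0
chan configure $wr -buffering none

set wakeups 0

# Hypothetical handler: invoked by the event loop when $chan is readable.
proc onReadable {chan} {
    global wakeups result
    incr wakeups
    set data [chan read $chan]
    if {$data ne ""} {
        # Got what we wanted; stop watching the channel.
        chan event $chan readable {}
        set result $data
    }
}

chan event $rd readable [list onReadable $rd]

# As before, the data arrives 50 ms from now.
after 50 [list puts -nonewline $wr hello]
vwait result

chan close $rd
chan close $wr
puts "received '$result' after $wakeups wakeup(s)"
```

The application does nothing until the channel becomes readable, and it is woken as soon as it does. A timer that posts events with no knowledge of the channel state gives up both properties.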
    
    The above is the motivation for async/event-driven operation from an
    efficiency and performance perspective. However, there is also the "simple"
    matter of correctness. An event should not be generated prematurely
    reflecting a state that does not actually exist.
    
    In other words, from both the performance and correctness point of view, the
    current timer-based write event generation in ChannelTimerProc **which is not
    based on channel state** is fundamentally flawed. Because it essentially polls,
    it does not fulfil the intended purpose of an event-based i/o system. Moreover,
    it does not meet correctness criteria, as it has no idea of channel state
    and generates events prematurely (14.11 failure).
    
    All the above is a comment on event driven i/o and channels in general. Now as
    far as refchans are concerned, there is a limitation in the refchan framework as
    mentioned in an earlier post which led to the original defect you logged in this
    ticket. To reiterate, there is no script level equivalent of the
    `Tcl_EventSetupProc` and `Tcl_EventCheckProc`. Thus *some*, not all, refchans
    are forced to use a timer-based scheme to generate these events. However, **this
    must be done by the refchan script implementation itself and not the generic
    channel infrastructure** because the former knows the channel state, the latter
    does not. This is still not ideal from the efficiency perspective, but being
    able to check state, it is at least correct.
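
A minimal sketch of that division of labour, assuming a hypothetical write-only sink whose "device" consumes 4 bytes every 10 ms. All names (`sink`, `tick`, `S`, `onWritable`) and numbers are made up for illustration, and nothing here is taken from tcllib or from any attachment to this ticket. To stay short, the sink always accepts writes and posts directly from its timer callback. The point is the state check in `tick`: a write event is posted only when the channel is being watched and the backlog has drained.

```tcl
namespace eval sink {
    variable S
    array set S {backlog {} consumed {} watching 0 timer {} posts 0}

    # Reflected-channel handler, invoked as: handler subcommand channel ?args?
    proc handler {cmd chan args} {
        variable S
        switch -- $cmd {
            initialize {
                set S(timer) [after 10 [list ::sink::tick $chan]]
                return {initialize finalize watch write}
            }
            finalize {
                after cancel $S(timer)
                # Hand whatever is still queued to the simulated device.
                append S(consumed) $S(backlog)
                set S(backlog) {}
                return
            }
            watch {
                # Remember whether the core currently wants write events.
                set S(watching) [expr {"write" in [lindex $args 0]}]
                return
            }
            write {
                set data [lindex $args 0]
                append S(backlog) $data
                return [string length $data]
            }
        }
    }

    # Timer callback owned by the refchan implementation, not by the core.
    proc tick {chan} {
        variable S
        set S(timer) [after 10 [list ::sink::tick $chan]]
        # Simulated device: consume up to 4 bytes per tick.
        append S(consumed) [string range $S(backlog) 0 3]
        set S(backlog) [string range $S(backlog) 4 end]
        # State check: only claim writability when it is true.
        if {$S(watching) && $S(backlog) eq ""} {
            incr S(posts)
            chan postevent $chan write
        }
    }
}

set ch [chan create write ::sink::handler]
chan configure $ch -blocking 0 -translation binary
set queue [list "event-" "driven-" "write"]

# Application side: write one chunk per writable event.
proc onWritable {ch} {
    global queue allWritten
    if {[llength $queue] == 0} {
        chan event $ch writable {}
        set allWritten 1
        return
    }
    set queue [lassign $queue chunk]
    puts -nonewline $ch $chunk
    flush $ch
}

chan event $ch writable [list onWritable $ch]
vwait allWritten
chan close $ch
puts "sink consumed: $::sink::S(consumed)"
```

The timer is still a poll, so this is no more efficient than before. But the poll sits next to the state it needs to inspect, which the generic channel layer cannot see.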
    
    I believe that is how Andreas' virtchannel modules in tcllib work. And as proof
    of concept, following the tcllib model, I've modified your refchan
    implementation of 44.6 and attached as refchan-async-redux.tcl (proof of concept
    only modeling tcllib). This works in Tcl 8.6 as well (which your version did
    not). TL;DR the changes you made to the core in ChannelTimerProc (a) led to at
    least two bugs, the 14.11 failure and event queue starvation, (b) affected channels
    other than refchans, and (c) were unnecessary.
    
    Given that enhancing the refchan framework is (imho) too much of a risk for
    a 9.0 release, I see two possibilities that are acceptable (not perfect, just
    acceptable) for the 9.0 release:
    
    - revert the implementation to what 8.6 does. No need for -buffering none in
    this case but the script level refchan implementation has to generate timer
    events, **check state** and then do an `after idle after 0 chan postevent` from
    the timer callback. See tcllib or attached sample. Alternatively, the refchan
    can do the -buffering none itself and avoid the timer if that suits its purpose.
    
    - If you do not want the channel script implementation to have that responsibility
    (I would like to know why not), then set -buffering none for refchans as
    proposed in my branch.
    
    I prefer the first option.
    
    I am pretty much going to stay silent on this topic now. I cannot provide any
    more clarity on my objections. Finally, some group of people has to decide
    on a course of action. Hopefully, that group is not just you and me.
    
  2. login: "apnadkarni"
  3. mimetype: "text/x-markdown"