Overview
| Artifact ID: | b11ef71db56e7fc8312483930d6986fb4b827f7783fd366c55ef42acd62f72d7 |
|---|---|
| Ticket: | de232b49f26da1c18e07513d4c7caa203cd27910 write-only nonblocking refchan and Tcl internal buffers |
| User & Date: | apnadkarni 2024-04-02 07:35:31 |
Changes
- icomment:
Nathan, I'll make one last attempt at persuasion that

- from an architectural point of view, generating I/O callback events without knowing channel state is not at all an appropriate model for async event-driven I/O.
- the related changes in trunk vis-a-vis 8.6 are broken in multiple respects, mostly because of the above.

The motivation for async I/O is that applications can do useful work while waiting for I/O to complete. Now this by itself does not need select on Unix, completion ports on Windows or fileevent in Tcl. One can just call the I/O functions and check for "EAGAIN" equivalents. There are multiple problems with this, as the application has to poll:

- Try too frequently and it's wasting processing cycles processing EAGAINs.
- Try too infrequently and there is an unnecessary delay / latency in servicing I/O requests.

An event system solves both of the above **as long as events are generated based on the I/O state.** No time is wasted in unnecessary polls and there is no additional delay once the channel is ready for I/O.

However, if I/O events are generated based on timers **with no knowledge of I/O state**, it has exactly the same effect as application polling! It is completely pointless: events are generated when the I/O state does not reflect channel readiness, and there are unnecessary delays after readiness before the timer expires. If you truly believe this is a solution, you should be amenable to completely getting rid of Tcl's I/O related event subsystem! The application could just generate write events using `chan postevent` on a regular basis with `after`. This is basically what the current Tcl 9 implementation does, queueing events on a timer basis.

The above is the motivation for async/event-driven operation from an efficiency and performance perspective. However, there is also the "simple" matter of correctness. An event should not be generated prematurely, reflecting a state that does not actually exist.

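To make the contrast concrete, here is a minimal sketch of the two styles for a nonblocking read. The proc names `pollRead` and `eventRead` and the `onData` callback prefix are hypothetical, chosen only for illustration; the Tcl commands used (`chan read`, `chan eof`, `chan event`, `after`) are standard.

```tcl
# Polling style: retry a nonblocking read on a timer, with no knowledge
# of whether the channel is actually ready.
proc pollRead {chan intervalMs onData} {
    set data [chan read $chan]
    if {$data ne ""} {
        {*}$onData $data
    }
    if {[chan eof $chan]} {
        chan close $chan
        return
    }
    # An empty read without EOF is the EAGAIN case: try again later.
    # Too short an interval wastes cycles, too long adds latency.
    after $intervalMs [list pollRead $chan $intervalMs $onData]
}

# Event-driven style: invoked by the event loop only when the channel
# itself reports that it is readable.
proc eventRead {chan onData} {
    set data [chan read $chan]
    if {$data ne ""} {
        {*}$onData $data
    }
    if {[chan eof $chan]} {
        chan close $chan
    }
}
```

With a nonblocking channel `$chan`, the polling variant would be started with `pollRead $chan 50 handler`, while the event-driven variant is registered once with `chan event $chan readable [list eventRead $chan handler]`.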
In other words, from both the performance and correctness points of view, the current timer-based write event generation in ChannelTimerProc, **which is not based on channel state**, is fundamentally flawed. Because it is essentially polling, it does not fulfil the intended purpose of an event-based I/O system, and moreover it does not meet correctness criteria, as it has no idea of channel state and generates events prematurely (14.11 failure).

All the above is a comment on event-driven I/O and channels in general. Now as far as refchans are concerned, there is a limitation in the refchan framework, as mentioned in an earlier post, which led to the original defect you logged in this ticket. To reiterate, there is no script-level equivalent of `Tcl_EventSetupProc` and `Tcl_EventCheckProc`. Thus *some*, not all, refchans are forced to use a timer-based scheme to generate these events. However, **this must be done by the refchan script implementation itself and not the generic channel infrastructure**, because the former knows the channel state and the latter does not. This is still not ideal from the efficiency perspective but, being able to check state, it is at least correct. I believe that is how Andreas' virtchannel modules in tcllib work. As a proof of concept, following the tcllib model, I've modified your refchan implementation of 44.6 and attached it as refchan-async-redux.tcl (proof of concept only, modeling tcllib). This works in Tcl 8.6 as well (which your version did not).

TL;DR: the changes you made to the core in ChannelTimerProc (a) led to at least two bugs, 11.14 and event queue starvation, (b) affected channels other than refchans, and (c) were unnecessary.

Given that (imho) enhancing the refchan framework is too much of a risk for a 9.0 release, I see two possibilities that are acceptable (not perfect, just acceptable) for the 9.0 release:

1. Revert the implementation to what 8.6 does. There is no need for -buffering none in this case, but the script-level refchan implementation has to generate timer events, **check state**, and then do an `after idle after 0 chan postevent` from the timer callback. See tcllib or the attached sample. Alternatively, the refchan can set -buffering none itself and avoid the timer if that suits its purpose.
2. If you do not want the channel script implementation to have that responsibility (I would like to know why not), then set -buffering none for refchans as proposed in my branch.

I prefer (1).

I am pretty much going to stay silent on this topic now. I cannot provide any more clarity on my objections. Finally, some group of people has to decide on a course of action. Hopefully, that group is not just you and me.
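As an illustration of a refchan that runs its own timer and checks state before posting, here is a minimal sketch. It is not the attached refchan-async-redux.tcl and not tcllib's code: the `sink` namespace, its `ready`, `watching`, `received` and `pollMs` variables, and the `check` proc are all hypothetical, and `ready` merely simulates whatever "can accept data" means for a real channel. For brevity it calls `chan postevent` directly from the timer callback rather than deferring it through `after idle after 0`.

```tcl
namespace eval sink {
    variable ready 0      ;# simulated state: can the sink accept data?
    variable watching 0   ;# does the script side want write events?
    variable timer ""     ;# id of the pending state-check timer, if any
    variable received ""  ;# everything written to the channel so far
    variable pollMs 10    ;# state-check interval (arbitrary choice)

    proc initialize {chan mode} {
        return {initialize finalize watch write blocking}
    }
    proc finalize {chan} {
        variable watching 0
        variable timer
        after cancel $timer
    }
    proc blocking {chan mode} {
        # Nothing to do: this sketch never blocks.
    }
    proc watch {chan events} {
        variable watching
        variable timer
        variable pollMs
        after cancel $timer
        set watching [expr {"write" in $events}]
        if {$watching} {
            set timer [after $pollMs [list [namespace current]::check $chan]]
        }
    }
    # Timer callback: post a write event only if the state allows it.
    proc check {chan} {
        variable ready
        variable watching
        variable timer
        variable pollMs
        if {!$watching} return
        if {$ready} {
            chan postevent $chan write
        }
        # The event handler may have dropped interest or closed the channel.
        if {$watching} {
            set timer [after $pollMs [list [namespace current]::check $chan]]
        }
    }
    proc write {chan data} {
        variable received
        append received $data
        return [string length $data]
    }
    namespace export initialize finalize watch write blocking
    namespace ensemble create
}
```

A channel built on this handler would be created with `chan create write sink`, and a `chan event $ch writable ...` script registered on it is triggered by the handler only once `sink::ready` has been set to true.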
- login: "apnadkarni"
- mimetype: "text/x-markdown"