Tk Source Code

View Ticket
Ticket UUID: 4af5ca1921122de63f37a99beefb2f5dbef72518
Title: XCopyArea is very (very) slow on macOS with version 9
Type: Bug Version: 9.0.1
Submitter: anonymous Created on: 2025-07-23 12:54:55
Subsystem: 64. Graphic Contexts Assigned To: nobody
Priority: 5 Medium Severity: Important
Status: Open Last Modified: 2025-07-26 14:08:03
Resolution: None Closed By: nobody
    Closed on:
Description:

I would like to report what I believe is a bug in the XCopyArea function on macOS with Tcl/Tk 9.0.1. Tcl/Tk 8.6 and Tcl/Tk 9 on Windows are not affected. (I could not test on Linux because I don't have that platform.)

Context :

I interface Nim with Tcl/Tk, and I see a very large difference (x1000) between Tk 8.6 and Tk 9 for this function.

Versions:

Tcl/Tk 8.6.16
Tcl/Tk 9.0.1

I'm not a C programmer, but I tried using this code, which is similar to my Nim code, to show you the difference in time.
In one frame .f, I tried to construct an image.

Note: I couldn't get this code to work with version 8.6.16 on macOS (I don't know why). XCopyArea always returns BadDrawable. In my Nim code, I create an image with [image create myIM]. Perhaps this is necessary for the code below to work, but I do not have the expertise to determine this.

/*
 * xcopyarea.c - XCopyArea with Tcl/Tk (version dylib)
 *
 */

#include <tcl.h>
#include <tk.h>
#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/X.h>
#include <time.h>
#include <sys/time.h>
#include <string.h>
#include <stdlib.h>

/* Variables globales */
static Tk_PhotoHandle photo = NULL;
static int width = 800;
static int height = 400;
static Pixmap pixmap;

double get_time_ms(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

static int TestImageUpdateCmd(ClientData clientData, Tcl_Interp *interp, 
                             int objc, Tcl_Obj *const objv[]) {
    double start, end;
    int iterations = 100;
    int i, x, y;
    double total_time = 0.0;
    Tk_PhotoImageBlock block;
    unsigned char *data;
    
    if (objc > 2) {
        Tcl_WrongNumArgs(interp, 1, objv, "?iterations?");
        return TCL_ERROR;
    }
    
    /* The iteration count is optional; only parse it when it was given. */
    if (objc == 2 && Tcl_GetIntFromObj(interp, objv[1], &iterations) != TCL_OK) {
        return TCL_ERROR;
    }

    Tk_Window tkwin = Tk_MainWindow(interp);
    if (tkwin == NULL) {
        Tcl_SetResult(interp, "No main window", TCL_STATIC);
        return TCL_ERROR;
    }

    Tk_Window targetWindow = Tk_NameToWindow(interp, ".f", tkwin);

    if (targetWindow == NULL) {
        Tcl_SetResult(interp, "No '.f' frame", TCL_STATIC);
        return TCL_ERROR;
    }

    Tk_MakeWindowExist(targetWindow);

    Window window    = Tk_WindowId(targetWindow);
    Display *display = Tk_Display(targetWindow);

    if (window == None) {
        Tcl_SetResult(interp, "Window not yet created", TCL_STATIC);
        return TCL_ERROR;
    }
    
    XGCValues gcValues;
    gcValues.graphics_exposures = False;
    GC gc = Tk_GetGC(targetWindow, GCGraphicsExposures, &gcValues);

    data = (unsigned char *)malloc(width * height * 4);
    if (data == NULL) {
        Tcl_SetResult(interp, "Memory allocation failed", TCL_STATIC);
        return TCL_ERROR;
    }
    
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            int offset = (y * width + x) * 4;
            data[offset+0] = (x & 0xFF);
            data[offset+1] = (y & 0xFF);
            data[offset+2] = ((x + y) & 0xFF);
            data[offset+3] = 255;
        }
    }

    XImage *ximage = XCreateImage(
      display, 
      Tk_Visual(targetWindow),
      Tk_Depth(targetWindow),
      ZPixmap,    
      0,          
      (char*)data,
      width,
      height,
      32,
      width * 4
    );
    
    
    pixmap = Tk_GetPixmap(display, window, width, height, Tk_Depth(targetWindow));

    if (pixmap == None) {
        Tk_FreeGC(display, gc);
        Tcl_SetResult(interp, "Error Tk_GetPixmap", TCL_STATIC);
        return TCL_ERROR;
    }

    int ximage_status = XPutImage(display, pixmap, gc, ximage, 0, 0, 0, 0, width, height);

    if (ximage_status != 0) {
        Tcl_SetResult(interp, "Error XPutImage", TCL_STATIC);
        return TCL_ERROR;
    }

    for (i = 0; i < iterations; i++) {
        start = get_time_ms();
        
        int status = XCopyArea(display, pixmap, window, gc, 0, 0, width, height, 0, 0);
        printf("status: %d\n", status);
        XFlush(display);
        end = get_time_ms();

        double elapsed = end - start;
        total_time += elapsed;

        if (i < 10 || i % 10 == 0) {
            printf("Iteration %d: %.3f ms\n", i, elapsed);
        }
    }
    

    double avg_time = total_time / iterations;
    printf("\nAverage time: %.3f ms\n", avg_time);
    
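    char result[256];
    snprintf(result, sizeof(result), "%.3f", avg_time);
    Tcl_SetResult(interp, result, TCL_VOLATILE);

    /* Release the pixmap and GC allocated above. */
    Tk_FreePixmap(display, pixmap);
    Tk_FreeGC(display, gc);
    
    return TCL_OK;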
}

#ifdef __cplusplus
extern "C" {
#endif

DLLEXPORT int Imagetest_Init(Tcl_Interp *interp) {
    if (Tcl_InitStubs(interp, "8.6-", 0) == NULL) {
        return TCL_ERROR;
    }
    
    if (Tk_InitStubs(interp, "8.6-", 0) == NULL) {
        return TCL_ERROR;
    }

    Tcl_CreateObjCommand(interp, "test_image_update", TestImageUpdateCmd, NULL, NULL);
    
    if (Tcl_PkgProvide(interp, "imagetest", "1.0") != TCL_OK) {
        return TCL_ERROR;
    }
    
    return TCL_OK;
}

#ifdef __cplusplus
}
#endif

And my test code :


if {$tcl_version eq "8.6"} {
    catch {load libimagetest86.dylib}
    catch {load libimagetest86.dll}
} else {
    catch {load libimagetest90.dylib}
    catch {load libimagetest90.dll}
}

wm geometry . 800x400
wm title . "XCopyArea Test"

frame .f -width 800 -height 400
pack .f -fill both -expand true

update

# test
test_image_update 50

tclsh90 testcode.tcl (simulation code C)

Iteration 0: 18.325 ms
Iteration 1: 17.566 ms
Iteration 2: 17.386 ms
Iteration 3: 16.321 ms
Iteration 4: 18.275 ms
Iteration 5: 19.266 ms
Iteration 6: 19.256 ms
Iteration 7: 18.219 ms
Iteration 8: 18.227 ms
Iteration 9: 16.239 ms
Iteration 10: 17.399 ms
Iteration 20: 18.261 ms
Iteration 30: 18.177 ms
Iteration 40: 19.216 ms

tclsh86 testcode.tcl (simulation code nim)

Iteration 0: 0.325 ms
Iteration 1: 0.566 ms
Iteration 2: 0.386 ms
Iteration 3: 0.321 ms
Iteration 4: 0.275 ms
Iteration 5: 0.266 ms
Iteration 6: 0.256 ms
Iteration 7: 0.219 ms
Iteration 8: 0.227 ms
Iteration 9: 0.239 ms
Iteration 10: 1.399 ms
Iteration 20: 0.261 ms
Iteration 30: 0.177 ms
Iteration 40: 0.216 ms

User Comments: marc_culler (claiming to be Marc Culler) added on 2025-07-26 14:08:03:
Here is the thing:  Tk *never* calls XCopyArea (except when scrolling
a Text widget in Tk 9.0 only).

The calls to XCopyArea from generic code are not used by macOS.  They
all look like:
#ifndef TK_NO_DOUBLE_BUFFERING
    [code that uses XCopyArea for linux and windows]
#else
    [macOS code which does not use XCopyArea]
#endif

The only calls to XCopyArea in the 8.6 macOS code are in XCopyPlane
and XCopyPlane is only used for drawing buttons with monochrome bitmap
images.

In 9.0 there is one additional place where XCopyArea is used.  That is for
scrolling a Text widget. In 8.6 a deprecated method [NSView scrollRect: by:]
is used for scrolling; that has been replaced with XCopyArea in 9.0.  XCopyArea
could not be used for this in 8.6 because the 8.6 implementation of XCopyArea
did not do a copy - it redrew part of the window in a pixmap.  (That didn't
matter for Tk since Tk never uses XCopyArea, but it was a potential problem
for extensions.)

So you are measuring times for something that never happens, and trying to
draw a conclusion from it.  The fact that you get short times in 8.6 and
longer times in 9.0 suggests to me that you are only seeing some misleading
artifacts from the way you are trying to measure the time spent in XCopyArea.

anonymous added on 2025-07-26 11:55:52:

> While I still don't understand what "it" refers to

The "it" referred to XPutImage or XCopyArea, but that was an assumption, which, based on your comments, seems to be incorrect.

> I looked at it, but I don't know how to interpret what I am seeing. If you could provide a little explanation and guidance it would be much appreciated.

The first window that appears corresponds to the code launched with Tclsh86.
The second window that appears corresponds to the code launched with Tclsh90.

Here is my Tcl code below :

proc anim {img1 img2} {
    # Do Something...

    # Updates my 2 images
    pix::surfXUpdate $img1
    pix::surfXUpdate $img2

    after 16 [list anim $img1 $img2]
}

package require pix

set w 800
set h 400

# Create my context
set ctx [pix::ctx::new [list $w $h] "rgba(255, 255, 255, 255)"]

# Creating my two images using the Tk_CreateImageType procedure
set img1  [image create pix -data $ctx]
set img2 [image create pix -data $ctx]

# Pack
label .l1  -image $img1 -borderwidth 0 ; pack .l1
label .l2 -image $img2 -borderwidth 0 ; pack .l2

update
# Event loop (anim takes both images as arguments)
after 100 [list anim $img1 $img2]

What you see displayed on the terminal corresponds to the calculation time for the XCopyArea function.

tclsh86 testnim.tcl

XCopyArea: 0.008999999999981245ms
XCopyArea: 0.004000000000004ms
XCopyArea: 0.1049999999999662ms
XCopyArea: 0.06699999999998374ms
XCopyArea: 0.006999999999979245ms
XCopyArea: 0.01199999999995649ms
XCopyArea: 0.1129999999999742ms
XCopyArea: 0.04500000000001725ms
XCopyArea: 0.014000000000014ms
XCopyArea: 0.01000000000001ms
XCopyArea: 0.05800000000000249ms
...
tclsh90 testnim.tcl
XCopyArea: 15.05099999999998ms
XCopyArea: 15.21800000000001ms
XCopyArea: 15.63799999999999ms
XCopyArea: 15.71800000000001ms
XCopyArea: 15.57900000000001ms
XCopyArea: 15.57399999999998ms
XCopyArea: 15.83500000000004ms
XCopyArea: 15.64699999999997ms
XCopyArea: 15.76899999999992ms
XCopyArea: 15.53599999999999ms
XCopyArea: 15.75000000000004ms
XCopyArea: 15.68799999999992ms
...

I don't know if this answers all your questions, but it's all I can provide. (I could perhaps share the Nim code, but I'm not sure that would be of much help.)
Thanks


marc_culler (claiming to be Marc Culler) added on 2025-07-26 02:58:22:
> At first, I thought it might be due to epoll/kqueue **TIP458**

While I still don't understand what "it" refers to, I did look up
TIP #458 and I agree that whether epoll is used in the notifier
is totally unrelated to the time required for updating graphics.
The graphics updates do not involve the notifier at all. No events
are being processed.  No files are being read or written.  And,
even if the notifier were involved, using epoll should only improve
things.

One thing that is happening in CGContextDrawImage, on my laptop anyway,
is that the entire image must be resampled in order to account for
the Retina display.

marc_culler (claiming to be Marc Culler) added on 2025-07-26 01:47:28:
Thank you for the GIF.  I looked at it, but I don't know how to interpret
what I am seeing.  If you could provide a little explanation and guidance
it would be much appreciated.

marc_culler (claiming to be Marc Culler) added on 2025-07-25 22:32:36:
I don't think this is a limitation of Tk 9.  I think it is a limitation
of Apple's CoreGraphics.  Both Tk 8 and Tk 9 use CGContextDrawImage and
that is where the time is being spent in the testimage command.

However there are still many aspects of what you are saying that I do not
understand or that are incomplete.

1.  You have never said what "simulation code nim" means.  So I do not know
what you are comparing against the testimage code.  It appears on the
surface that you may be comparing apples and oranges.

2. The testimage code that I attached will compile as an 8.6 extension,
but it will not update the screen under 8.6.  I know the reason from my
instrumentation: the calls to XPutImage are being made outside of drawRect.
This is a serious limitation of 8.6.  Drawing operations often have to be
repeated because they are attempted outside of drawRect, which (after
macOS 10.14) means that it is impossible to obtain a valid CGContextRef
because the window's contentView is not the focusView.  You see this in
practice as a return value of BadDrawable, as you reported.  The testimage
code has no mechanism for calling XPutImage a second time if the first
attempt fails due to the contentView not being the focusView.  So it
couldn't possibly work with Tk 8.

3. A consequence of (2) is that whatever code you may be using to compare
Tk 8 against Tk 9 (and you still haven't said what that is) it can't
possibly work the same way as the testimage code.  Maybe that just means
that the testimage command is not testing the right thing.

In any case, if there is any way for me to see a comparison of working
Tk 8 code versus working Tk 9 code where the Tk 8 code is significantly
faster, then I would be very interested and it would be very helpful to
me.  But so far I haven't seen anything like that.

anonymous added on 2025-07-25 18:07:36:

Unfortunately for me, even though the focus is on version 9, 8.6 allowed me to compare (I wouldn't have opened a ticket otherwise).

At first, I thought it might be due to epoll/kqueue TIP458 (I don't know if Tk/X11 uses it), but now you are telling me that it's a limitation of Tk on macOS.

Here is a GIF link that shows the difference with my Nim program.

In any case, thank you for considering my request. I consider this to be a limitation of version 9 on macOS.
I would have liked to give a real example in C with the creation of an image, but I am limited in that way too.

Thank you for your detailed feedback.


marc_culler (claiming to be Marc Culler) added on 2025-07-25 04:06:50:
By the way, the actual bottleneck as far as frame rate is concerned is
probably the frequency with which updateLayer is called.  The job of
updateLayer is to copy the window's backing CGImage to the screen.  Drawing
is done in that CGImage, but the drawing does not appear on the screen
until updateLayer is called.

Tk is not allowed to call updateLayer.  All Tk can do is to set a flag
indicating that an update is needed.  The Aqua window manager calls
updateLayer if the flag is set and if the time since the last update is
long enough.  Apple is responsible for deciding what the maximum update
frequency should be in order to provide smooth animations.

When I run the testimage command, updateLayer is not called until close
to the last iteration, even though the flag is set on each iteration.
This would not be reflected in your timing results, since they are
checking how long the drawing operation takes, not how long it takes
until the screen is updated.
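
The gap between drawing and screen refresh described above can be
illustrated with a toy model.  This is only a sketch under stated
assumptions: the names Display_model, mark_dirty, system_tick and simulate
are hypothetical, the minimum interval between refreshes is an assumed
parameter, and no Tk or AppKit API is involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a coalescing display: drawing only sets a flag,
 * and the "window manager" decides when the screen is really refreshed. */
typedef struct {
    bool   needs_display;    /* set by every drawing operation */
    double last_update_ms;   /* time of the most recent screen refresh */
    double min_interval_ms;  /* assumed minimum spacing between refreshes */
    int    updates;          /* number of refreshes actually performed */
} Display_model;

/* Called by the application after drawing: requests a refresh, nothing more. */
static void mark_dirty(Display_model *d) {
    d->needs_display = true;
}

/* Called by the system at time now_ms: refresh only if requested and due. */
static void system_tick(Display_model *d, double now_ms) {
    assert(d->min_interval_ms > 0.0);
    if (d->needs_display && now_ms - d->last_update_ms >= d->min_interval_ms) {
        d->needs_display = false;
        d->last_update_ms = now_ms;
        d->updates++;
    }
}

/* Draw `frames` times, draw_ms apart, and return how many refreshes happened. */
static int simulate(int frames, double draw_ms, double min_interval_ms) {
    Display_model d = { false, 0.0, min_interval_ms, 0 };
    for (int i = 1; i <= frames; i++) {
        mark_dirty(&d);
        system_tick(&d, i * draw_ms);
    }
    return d.updates;
}
```

With these assumptions, simulate(50, 6.0, 16.0) returns 16: fifty drawing
operations spaced 6 ms apart lead to only sixteen refreshes.  The model does
not reproduce the observation that updateLayer was not called until close to
the last iteration; it only shows that timing the drawing call says nothing
about when the screen changes.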

marc_culler (claiming to be Marc Culler) added on 2025-07-25 01:50:18:
I have done a lot of timing tests inside of the Tk drawing routines
and checked times when running my version of your program, which does
one call to XPutImage per iteration. What I learned is that for an 800x400
image like yours a single call to CGContextDrawImage takes between 5 and 7
milliseconds.  That means that almost all of the time spent drawing in one
iteration of the test program is accounted for by the one call to
CGContextDrawImage that happens during each iteration.

There is no alternative to CGContextDrawImage for drawing an image on
the screen.  This has nothing to do with Tk versions or even with Tk.
It is just what Quartz provides for drawing images on the screen.
So I think there is nothing that could be done to speed up the drawing
in Tk, for any version of Tk.  (Other than getting rid of drawRect, which
forces drawing operations to be repeated multiple times -- and that
is what Tk 9 has already done.)

One thing this means is that if you are trying to do an animation by
drawing into an XImage and copying the XImage to the screen, then you are
limited to somewhere between 160 and 200 frames per second.  But smooth
animation does not require anything close to that rate.  So I don't
really think that this presents an obstacle to producing smooth
animations with Tk.  If the other timings that you reported were times
for updating the XImage itself, then you should be in good shape.  The
bottleneck will be CGContextDrawImage and the part that matters,
updating your image, will have plenty of time to do its magic without
lowering the frame rate significantly.

The job of CGContextDrawImage is fairly complicated.  If the image is
not the same size as the rectangle it is being drawn into, the image will
be scaled to fit the target rectangle.  If the screen is a Retina screen
the image will be resampled.  It is not totally surprising that this
takes some time.
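
The frame-rate ceiling mentioned above follows from simple arithmetic.  A
minimal sketch, assuming exactly one CGContextDrawImage call per frame and
ignoring all other per-frame work (the helper name max_fps is hypothetical):

```c
#include <assert.h>

/* Upper bound on frames per second when every frame needs one draw call
 * that takes draw_ms milliseconds and nothing else is counted. */
static double max_fps(double draw_ms) {
    assert(draw_ms > 0.0);
    return 1000.0 / draw_ms;
}
```

Under these assumptions max_fps(5.0) is 200 and max_fps(7.0) is about 143;
the 160 figure corresponds to 6.25 ms per call.  Either way the ceiling is
well above what smooth animation needs.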

marc_culler (claiming to be Marc Culler) added on 2025-07-24 22:13:57:
OK, I have a version of your code working.  It does not use XCopyArea
at all.  For each iteration it calls XPutImage once to copy the static
XImage to the screen.  And I see nice triangles containing gradients
tiling the window.

I am seeing comparable times to what you report.  The average is 6.5 ms.
The factor of 3 is probably CPU speed.  (I am using an M3 MacBook Air.)

So the next question would be "How fast would we expect this operation to
be?"  Your second list of times is for "simulation code nim".  But I don't
know what that means.  What is being measured by the times in your second
table?

marc_culler (claiming to be Marc Culler) added on 2025-07-24 20:46:40:
When I run the code above with Wish9.0 the calls to XCopyArea all fail
with status 9 (i.e. BadDrawable).  You indicated that you only saw that
with 8.6.

Your code should print out the status for each iteration, even if it is 0
since it contains no test for whether the status is 0.  Yet, your printout
does not contain the status lines.

Also when I built the extension there were two unused variables: photo
and block.

So I am guessing that  the posted code is incomplete.  Do you know what
might be missing?

marc_culler (claiming to be Marc Culler) added on 2025-07-24 19:27:20:
OK, that is interesting.  I will see if I can build your extension and
run this myself.

Comparison with 8.6 would not be useful.  There is no way that Tk 9 can
or should go back to using drawRect.  And 8.6 is obsolete, with releases
ending next year.

We need to focus on Tk 9.

anonymous added on 2025-07-24 18:22:06:

What's interesting is that if I only use XPutImage as you suggested in the C code I provided in my ticket, it's now this function that's slow. Unfortunately, I can't get it to work in 8.6 to compare.


anonymous added on 2025-07-24 16:52:26:

> Did you try using XPutImage?

In my implementation, XPutImage and XCopyArea are interconnected, and one cannot function without the other.
Why?

  1. I use XPutImage to fill a pixmap (server-side buffer) with my image data. This pixmap serves as a cache to avoid retransferring data each time it is displayed.
  2. I use XCopyArea to copy from the pixmap to the window. It is this step that makes the image visible to the user.

Why this dependency on each other?

Without XPutImage: The pixmap remains empty, XCopyArea does not copy anything → black screen.
Without XCopyArea: The image is in the pixmap but is never transferred to the window → black screen.

XPutImage is only called when the image is modified.
XCopyArea is fast and called at each redraw/exposure.
I avoid retransferring image data at each refresh.
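
The caching scheme described above can be sketched as follows. This is a minimal illustration, not the actual Nim binding: Cached_image, image_modified and image_redraw are hypothetical names, and the two X11 calls are replaced by counters so that the sketch builds without Tk or Xlib.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an image that caches its pixels in a pixmap. */
typedef struct {
    bool dirty;     /* pixel data changed since the last upload */
    int  uploads;   /* times the XPutImage step would have run */
    int  blits;     /* times the XCopyArea step would have run */
} Cached_image;

/* The application changed the pixels: the cached pixmap is now stale. */
static void image_modified(Cached_image *img) {
    assert(img != NULL);
    img->dirty = true;
}

/* Redraw/expose handler: upload only when stale, then always copy. */
static void image_redraw(Cached_image *img) {
    assert(img != NULL);
    if (img->dirty) {
        img->uploads++;      /* real code: XPutImage into the pixmap */
        img->dirty = false;
    }
    img->blits++;            /* real code: XCopyArea from pixmap to window */
}
```

Starting from a dirty image, three redraws give one upload and three copies; a later call to image_modified followed by a redraw adds one more of each.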

That's how I experienced it; I'm not sure if it's the best way. In 8.6, I can get super smooth image animation, which is not the case in 9, so I'm submitting this ticket.


marc_culler (claiming to be Marc Culler) added on 2025-07-24 16:31:58:
Just to clarify my question ... I was trying to ask whether you
tried using XPutImage (or TkPutImage, which happens to be the same
function on macOS) to copy your image to the screen, as opposed to
copying the image to a pixmap.  TkPutImage accepts a window as its
target drawable.  (I imagine that your pixie drawing code is rendering
to something easily convertible to a pixmap and that your goal is to
display that pixmap on the screen.)

marc_culler (claiming to be Marc Culler) added on 2025-07-24 14:00:07:
It sounds like you have an XImage which you would like to copy into a
Tk Widget.  Did you try using XPutImage?

Your project seems similar to the tkpath package, which uses Cairo to draw
into a Tk widget which behaves like a Tk Canvas.  That package ends up
replicating most of the Tk Canvas code (and in particular does not use
XCopyArea on macOS - instead it draws directly into the graphics context of
the toplevel's NSView).  It might be worth looking at their code.

anonymous added on 2025-07-24 07:09:58:

> I don't fully understand what you are doing here and will need more explanation. But here is some background.

I create my own image with Tk_CreateImageType.

> Can you post a url for your project?

pix

Although the version of my X11 binding is not up to date on GitHub, it gives you an idea of what I want to do.
Here is my link: X11

As I said, I am not able to do this in C, and I tried to create an example in C to get as close as possible to what I do in Nim. The C code example I provided does not work for me on version 8.6.16 on macOS. However, it does work on version 9. I hope you can see that XCopyArea takes time.


marc_culler (claiming to be Marc Culler) added on 2025-07-23 17:34:08:
I don't fully understand what you are doing here and will need more
explanation.  But here is some background.

XCopyArea is heavily used by Tk in the linux and windows ports to copy a
region out of a window, which is then modified and copied back into the
window.  This is possible because on those platforms it is possible to
directly read and write from the window backing store (using the system's
XCopyArea on linux).  However, XCopyArea is never used by the macOS port.
On macOS the modifications are done using drawing functions provided by the
system.  If you look in the code, you will see blocks like
#ifndef TK_NO_DOUBLE_BUFFERING
  ... code which uses XCopyArea ...
#else
  ... macOS specific code
#endif

In Tk 8, which used Apple's [NSView drawRect] function, it was not
possible after macOS 10.14 to directly access the backing store of a
window.  The XCopyArea function did not support copying from a pixmap
(which is why you were getting those BadDrawable errors) and the
implementation of XCopyArea in the case of a window drawable was a hack.
It used a feature which allows the drawRect method to draw to a CGImage
instead of to a window.  The way that XCopyArea "worked" was to create
a CGImage, use drawRect to redraw the window, with drawing clipped to the
source rectangle, into the CGImage and then convert the CGImage into
an XPixmap.  There was no copying.  Everything was redrawn.

In Tk 9, [NSView drawRect] is no longer used.  Instead [NSView updateLayer]
is used.  This means that the backing layer of the window is an actual
CGImage which Tk can modify directly.  When an update of the screen is
needed that CGImage is blitted to the screen.  This allowed an
implementation of XCopyArea which worked with pixmaps as well.  However,
it involves a lot of converting between image formats, not to mention
dealing with retina vs non-retina displays.  I would not expect high
performance with XCopyArea on macOS, and I do not understand why you
would ever need to use it on macOS.

My guess is that you should be working with images of type nsimage but,
as I said, I don't understand yet what you are trying to do.

Can you post a url for your project?

Attachments: