TIP 529: Add metadata dictionary property to tk photo image

Login
Author:         Harald Oehlmann <[email protected]>
State:          Final
Type:           Project
Vote:           Done
Vote-Summary:   Accepted 7/0/1
Votes-For:      AK, FV, JD, JN, KW, MC, SL 
Votes-Against:  None
Votes-Present:  KK
Created:        07-Dec-2018
Keywords:       Tk, image
Tcl-Version:    8.7
Tk-Branch:     tip529-image-metadata-no-match-method

Abstract

An additional property is proposed for photo images. This property shall hold a dictionary with image metadata:

myimage cget -metadata
myimage configure -metadata [dict create DPI 300.0]

The content of the dictionary is initialized on image load and used on image save.

Rationale

Image files may contain a lot of metadata like resolution, comments, GPS location etc. This metadata should be accessible and modifiable for the following aims:

image resolution

This TIP especially targets the resolution (DPI) value of the image.

The image resolution included in an image file is crucial for its usage, as many applications (word & co.) use this field to calculate a default size. One may imagine, that image files used in pdf4tcl are automatically scaled to the correct resolution (e.g. the resolution saved in the image file).

This information is included in .png files (supported by core Tk) and many other image formats included in the Img package.

I authored an extension to the Img package to specify the dpi field of a .bmp file when writing. The syntax was accorded with Jeff Hobbs:

myimage write file.bmp -format [list bmp -resolution 300 i]

This may be expressed (when all packages are adapted) by:

myimage configure -metadata [dict create DPI 300.0]
myimage write file.bmp

Comment data

A comment may be used to save custom data in the image file.

An example is a vision automation project where a test procedure is connected to each image. My solution is to use .gif images and to store the test procedure (a TCL script) in the gif's comment.

Preview extension for new command "image metadata"

The match functions should also be able to return the metadata dictionary. This is due to the plan by Paul Obermeier to make the match function available on the script level by itself. See the discussion section for the message.

Specification

Metadata Dictionary

The property "-metadata" is added to each image. It contains a dictionary, where the keys of the dictionary are specific to each photo image format.

The following default keys are proposed:

Key Description Example image formats
DPI Horizontal Image resolution in DPI (double) png
aspect Aspect ratio horizontal/vertical (double) png,gif
comment Text comment png, gif

Comments on the key choice:

It is valid to set any key within the application. Any unknown key should be ignored by the application and image format drivers.

If a particular image does not specify any keys (whether during creation or otherwise) then the dictionary will be empty.

Each photo image format driver may define additional keys and may decide to use them for input (as a parameter for image read and/or image write), output (as an image read result) or both.

The TIP implementation does not propose to immediately implement all possible keys of all image formats for reading and writing. The set of predefined image keys may grow over time on a per-case basis instead.

Commands

The following commands are extended by a -metadata parameter:

image create photo myimage -metadata $metadict
myimage cget -metadata
myimage configure -metadata $metadict
myimage data -metadata $metadict
myimage put -metadata $metadict
myimage read -metadata $metadict
myimage write -metadata $metadict

Any image format handler may use the content of the metadata dictionary. This may be an ongoing process, especially within the Img package.

Here is an overview, which command reads or sets the metadata dictionary:

Command Reads current image metadata dict Reads command options metadata dict Writes current image metadata dict Driver data merged in
image create no yes yes yes
myimage cget yes no no no
myimage configure yes (1) yes yes yes
myimage put no yes no no
myimage read no yes no no
myimage data yes (1) yes no no
myimage write yes (1) yes no no

Footnotes:

(1) The current metadata is ignored if a metadata dictionary is given as command parameter.

Each command is now discussed within its own section:

image create

The create command will parse the image data and create the metadata dictionary of the image.

As an example, a gif file with a comment would create a comment metadata key within the image:

% image create photo myimage -format GIF -file testwithcomment.gif
% myimage cget -metadata
Comment {This is the image comment}

A metadata dictionary given on the command line will be merged with the parsed metadata dictionary. The meta data read from the file will be given priority. This enables the specification of default values for keys which should be present.

An example with the same image as above:

% image create photo myimage -format GIF -file testwithcomment.gif\
   -metadata [dict create User A Comment "Comment from command line"]
% myimage cget -metadata
User A Comment "This is the image comment"

myimage cget

The metadata dictionary may be retrieved by:

myimage cget -metadata

myimage configure

The metadata dictionary of the image may be overwritten by:

myimage configure -metadata [dict create Comment "Comment from cconfigure"]

The image data is not touched and no image data interpretation is triggered.

The retrieval methods will return the metadata dictionary as for any other option:

% myimage configure -metadata
-metadata {} {} {} {Comment {Current comment}}

Setting one of the -format, -data or -file options to a different value will recreate the image with the new parameters. In this case, an eventually present -metadata option will first replace the present metadata of the image. Then, the image recreation will take place (using an eventually specified metadata dictionary) and may add keys to the image metadata dictionary.

It is not possible to trigger an image recreation by just specifying a metadata dictionary. This is to avoid unneeded image recreation.

Note: parameters to change the rendered image should use the -format option. The metadata may provide additional data.

When the image is rendered again due to a change of the options -file, -data or -format, the following procedure applies:

myimage put

The put command sets (parts of) the image data to the specified new image data.

The -metadata property of the image is not changed. This is consistent of other parameters like -format.

To replace the whole image including metadata, the configure command may be used by setting the -data option.

Example with gif data containing a comment:

% image create myimage -metadata [dict create Comment "Comment from image create"]
% myimage put $GIFWithCommentData
% myimage cget -metadata
Comment {Comment from image create}

A -metadata option may be specified to support the read operation. Nevertheless, this metadata is not included in the metadata property of the image.

Example:

% image create myimage -metadata [dict create Comment "Comment from image create"]
% myimage put $GIFWithCommentData -metadata [dict create Comment "Comment from put command line"]
% myimage cget -metadata
Comment {Comment from image create}

myimage read

The read command sets (parts of) the image data to new image data read from a file. This command acts like the put command, except that the image data comes from a file.

The -metadata property of the image is not changed. This is consistent of other parameters like -format.

To replace the whole image including metadata, the configure command may be used by setting the -file option.

Example with a gif file containing a comment:

% image create myimage -metadata [dict create Comment "Comment from image create"]
% myimage read gifwithcomment.gif
% myimage cget -metadata
Comment {Comment from image create}

A -metadata option may be specified to support the read operation. Nevertheless, this metadata is not included in the metadata property of the image. There is currently no practical application for this, but there might be examples which use that.

Example:

% image create myimage -metadata [dict create Comment "Comment from image create"]
% myimage read test.gif -metadata [dict create Comment "Comment from put command line"]
% myimage cget -metadata
Comment "Comment from image create"

myimage data

The data command writes the image data into a variable.

If the image formats supports a specified metadata key, it is included in the output file.

If a -metadata option is given, the metadata property of the image is ignored. Otherwise, the metadata property of the image is used.

Example to write a comment in gif data included in the image properties:

% image create myimage -file test.png -metadata [dict create Comment "Comment from image create"]
% myimage data -format "GIF"
... GIF data with comment included

Example to specify the comment with the command options:

% image create myimage -file test.png
% myimage data -format "GIF"-metadata [dict create Comment "Comment from data command"]
... GIF data with comment included

myimage write

The write command writes the image data to a file. With respect to metadata, it works the same way as the data command.

Example to write a metadata comment:

% image create myimage -file test.png
% myimage write GifwithComment.gif -format "GIF"-metadata [dict create Comment "Comment from write command"]
... GIF data with comment included

Notes on Options to image and metadata creation

The metadata is not suited to passing processing options to the driver. Such options should be added to the -format option instead.

In contrast, a driver may understand options passed through the -format option to modify its metadata processing.

Lets try the following imaginary example: An image driver contains a full EXIF parser which creates many keys as output. This processing is expensive, in both time and output data creation. As a consequence, the driver creator may decide to only create the EXIF output when an option requesting such is specified:

image create photo photo.jpg -format "jpg -exif 1"

Image format driver interface

The image format driver interface is changed in the following aspects:

Pass metadata dictionary as parameter

Each driver function receives a Tcl object pointer metadataIn as parameter. This parameter serves to pass a metadata dictionary to the driver function. It may be NULL. Doing so indicates that the metadata dictionary is empty.

A typical driver code snippet to check for a metadata key is:

if (NULL != metadataIn) {
    Tcl_Obj *itemData;
    Tcl_DictObjGet(interp, metadataIn, Tcl_NewStringObj("Comment",-1), &itemData));

To strictly fulfill the objective of the TIP it is only necessary for the Write functions of the format driver to receive metadata.

Nevertheless, it is implemented the same way as the -format parameter which is available to all functions. This allows the passing of additional options to the driver which only concern the metadata processing.

Receive a metadata dictionary from the driver (FileRead,StringRead)

The image match and read functions (FileMatch, StringMatch, FileRead, StringRead) may set keys in a prepared metadata dictionary to return them. These functions receive an additional Tcl object pointer as "metadataOut" as parameter.

This parameter may be NULL. This indicates that no metadata can be returned (put, read subcommands).

This parameter is initialized to an empty unshared dictionary object if metadata return is intended (image create command, configure subcommand). The driver may set dictionary keys in this object to return metadata.

A sample driver code snippet is:

if (NULL != metadataOut) {
    Tcl_DictObjPut(NULL, metadataOut, Tcl_NewStringObj("XMP",-1), Tcl_NewStringObj(xmpMetadata);

Image format driver interface

For image format drivers a new registration procedure is proposed. This new procedure includes functions with the new parameters. In addition, the parameters are reordered to always have the order of

The new stubs enabled function is:

void Tk_CreatePhotoImageFormatVersion3(const Tk_PhotoImageFormatVersion3 *formatPtr)

The function parameters of Tk_PhotoImageFormatVersion3 are as follows:

int (Tk_ImageFileMatchProcVersion3) (Tcl_Interp *interp, Tcl_Channel chan,
        const char *fileName, Tcl_Obj *format, Tcl_Obj *metadataIn, int *widthPtr,
        int *heightPtr, Tcl_Obj *metadataOut,);

int (Tk_ImageStringMatchProcVersion3) (Tcl_Interp *interp, Tcl_Obj *dataObj,
        Tcl_Obj *format, Tcl_Obj *metadataIn, int *widthPtr, int *heightPtr,
        Tcl_Obj *metadataOut);

int (Tk_ImageFileReadProcVersion3) (Tcl_Interp *interp, Tcl_Channel chan,
        const char *fileName, Tcl_Obj *format, Tcl_Obj *metadataIn,
        Tk_PhotoHandle imageHandle, int destX, int destY, int width,
        int height, int srcX, int srcY, Tcl_Obj *metadataOut);

int (Tk_ImageStringReadProcVersion3) (Tcl_Interp *interp, Tcl_Obj *dataObj,
        Tcl_Obj *format, Tcl_Obj *metadataIn, Tk_PhotoHandle imageHandle,
        int destX, int destY, int width, int height, int srcX, int srcY,
        Tcl_Obj *metadataOut);

int (Tk_ImageFileWriteProcVersion3) (Tcl_Interp *interp, const char *fileName,
        Tcl_Obj *format, Tcl_Obj *metadataIn, Tk_PhotoImageBlock *blockPtr);

int (Tk_ImageStringWriteProcVersion3) (Tcl_Interp *interp, Tcl_sObj *format,
        Tcl_Obj *metadataIn, Tk_PhotoImageBlock *blockPtr);

Documentation

The script-visible changes are documented in the manpage doc/photo.n. The changes to the C level interface on the other hand are documented in doc/CrtPhImgFmt.3.

Implementation

The implementation is done in branch tip-529-image-metadata-no-match-method of the Tk fossil repository. The following metadata keys are implemented:

With thanks to Paul Obermeier, the Img package has implemented the new interface and uses it currently for setting and reporting of DPI information.

A set of test cases is included in the implementation. In addition, the Img package features additional tests for this change which exercise additional features like the stub table and image parameters.

Note that the Img package currently links against the optional branch (see below), and not against the branch tag tip-529-image-metadata-no-match-method. This was necessary to fix the file tests/earth.gif, as it is incomplete.

Warning: The two following chapters ("Discussion" and "Rejected alternatives") contain a lot of information not relevant for the TIP. A reader only interested in the functionality of the TIP may stop here.

Discussion

image metadata command planned by Paul Obermeier

What about extending and exposing the functionality of the MatchProc function at the Tcl level?

That way it would be possible to implement a command like image metadata <fileName>, where you can retrieve the image size, resolution and additional metadata without explicitely loading the image.

In my image browser (Screenshot) I am currently using a modified version of Tcllib's fileutil::fileType command to extract the image size without creating a photo image. Getting that information (and additional metadata) directly from the C-based image parsers would be faster and there would be no need to code that functionality twice.

This function is implemented in Tk branch tip-529-image-metadata.

Update DPI metadata property on image script

Paul Obermeier has made the following proposal:

How do you want to handle the physical resolution (DPI) in the case of image scaling? Just keep the original DPI value or adjust the DPI values automatically, maybe using an option.

HaO: Currently, this is not thought out and may be implemented by another TIP.

No use of other optional features by Paul Obermeier

I asked Paul, if he sees any use of the optional features below for him or TkImg. The answer was: no, I don't see any use.

Rejected Alternatives

Within the last two years of development, the following additional ideas were implemented as well. They are all implemented in the Tk branch tip-529-image-metadata-optional. They are not included in the TIP and not included in the main implementation. People may speak up to get any feature back into the main feature branch.

The optional features are:

The following section discusses these optional features. The format driver interface with all options is shown in the section after that.

Another rejected alternative using only one metadata pointer in the interface follows as the last section.

XMP data

Photo images may contain an XMP data structure which may hold structured data. The aim is to make this data accessible. The parsing of the XML structure is not part of this TIP and may be done by other packages.

The metadata key is:

Key Description Example image formats
XMP xmp image data gif,png

XMP support is implemented for the gif format. It was removed from the main branch, due to not having a use case at the moment.

SVG optimization by a metadata key holding the preparsed svg blob

Rationale

This is used within the current SVG implementation included in Tk8.7a3.

The routines imported from the nanosvg project split SVG processing into two steps:

  1. Transform the xml data to a binary representation of the splines
  2. Render the splines to an image presentation

When an SVG file is loaded via

image create photo i1 -file test.svg -format {svg -scaletoheight 16}

the file is read, the xml data loaded, and processing steps 1 and 2 are performed.

When the image is later scaled by:

i1 configure -file "" -data {<svg source="metadata" >} -format {svg -scaletoheight 32}

the same steps are performed as on image load, while only step 2 would be necessary. The performance is poor and the file must still be available.

The idea is to store the binary representation of the splines (result of processing step 1) as a key in the -metadata dictionary (say SVGBLOB). This then enables the code to only perform step 2 on scaling.

In addition, an svg image may even be "compiled" to the metadata structure, so the following command may work:

image create i1 -metadata {SVGBLOB ...} -data {<svg source="metadata" >} -format {svg -scaletoheight 32}

While this will only work within the same patchlevel of Tk and on the same architecture (endianess, int size), as the format may change, it may be useful when creating a starkit, for example.

In my talk at ETCL 2019 I showed an Android GUI where buttons could be scaled by a pinch-to-zoom gesture. The current performance is quite poor.

Specification

The SVG driver returns a preparsed image blob in the metadata key "SVGBLOB". This data is used as image data, if the -data parameter contains the string "<svg data="metadata" />".

A sample script is as follows:

image create photo foo -data $svggradient -format svg
foo configure -file "" -data "\<svg data=\\"metadata\\" />" -format "svg -scale 2"

Internally, the driver uses a hidden DString structure to communicate between the match and read functions.

The output of the SVG parser is serialized into an array and put into the memory block. All functions of the rendering functions dealing with the input data are changed to use the array.

Discussion

I see only a small speed-up (around 10%) by this solution. In addition, the version without this option is as fast as the optimized version. So, we have a slowdown for the normal case and no gain for the optimized version.

My test script is as follows:

For me, even removing the file operations and the parsing should be a magnitude faster. But the facts are different. Apparently copying all this data around in addition takes a lot of time. And the svg-nano parser is super fast. Most of the processing time is taken by image rendering.

Match and read function communication memory

The image match and read functions get an additional parameter driverInternalPtr which points to an initialized DString structure. The DString structure is cleared by the framework.

Via this DString the driver match function can pass data to its read function.

The rationale is the current implementation of the SVG driver:

Indicate that the match function does not need the channel any more

The driver file match function may indicate that it does not need the channel any more. This is done by setting the additional output int closeChannel to 1. In this case, a NULL driver is passed to the read driver function.

The rationale is the current implementation of the SVG driver:

Format driver interface with all options

The function parameters in Tk_PhotoImageFormatVersion3 are as follows:

int (Tk_ImageFileMatchProcVersion3) (Tcl_Interp *interp, Tcl_Channel chan,
    const char *fileName, Tcl_Obj *format, Tcl_Obj *metadataIn, int *widthPtr,
    int *heightPtr, Tcl_Obj *metadataOut, int *closeChannelPtr,
    Tcl_DString *driverInternalPtr);

int (Tk_ImageStringMatchProcVersion3) (Tcl_Interp *interp, Tcl_Obj *dataObj,
    Tcl_Obj *format, Tcl_Obj *metadataIn, int *widthPtr, int *heightPtr,
    Tcl_Obj *metadataOut, Tcl_DString *driverInternalPtr);

int (Tk_ImageFileReadProcVersion3) (Tcl_Interp *interp, Tcl_Channel chan,
    const char *fileName, Tcl_Obj *format, Tcl_Obj *metadataIn,
    Tk_PhotoHandle imageHandle,
    int destX, int destY, int width, int height, int srcX, int srcY,
    Tcl_Obj *metadataOut, Tcl_DString *driverInternalPtr);

int (Tk_ImageStringReadProcVersion3) (Tcl_Interp *interp, Tcl_Obj *dataObj,
    Tcl_Obj *format, Tcl_Obj *metadataIn, Tk_PhotoHandle imageHandle,
    int destX, int destY, int width, int height, int srcX, int srcY,
    Tcl_Obj *metadataOut, Tcl_DString *driverInternalPtr);

int (Tk_ImageFileWriteProcVersion3) (Tcl_Interp *interp, const char *fileName,
    Tcl_Obj *format, Tcl_Obj *metadataIn, Tk_PhotoImageBlock *blockPtr);

int (Tk_ImageStringWriteProcVersion3) (Tcl_Interp *interp, Tcl_sObj *format,
    Tcl_Obj *metadataIn, Tk_PhotoImageBlock *blockPtr);

Single metadata parameter for input and output

A first approach was to use one metadata parameter to the format driver functions, combining input and output. The properties are:

This solution was not chosen due to the complicated way for the format driver functions to set a metadata dictionary. In addition, it is seen as valueable, that the information "no metadata output, please" may be transmitted.

The implementation is in branch tip-529-image-metadata-jan.

Copyright

This document has been placed in the public domain.