NAME

uri - URI utilities

SYNOPSIS

package require Tcl 8.2
package require uri ?1.2.7?

uri::setQuirkOption option ?value?
uri::split url ?defaultscheme?
uri::join ?key value?...
uri::resolve base url
uri::isrelative url
uri::geturl url ?options...?
uri::canonicalize uri
uri::register schemeList script

DESCRIPTION

This package does two things.

First, it provides a number of commands for manipulating URLs/URIs and fetching data specified by them. For fetching data this package analyses the requested URL/URI and then dispatches it to the appropriate package (http, ftp, ...) for actual retrieval. Currently these commands are defined for the schemes http, https, ftp, mailto, news, ldap, ldaps and file. The package uri::urn adds scheme urn.

Second, it provides regular expressions for a number of registered URL/URI schemes. Registered schemes are currently ftp, ldap, ldaps, file, http, https, gopher, mailto, news, wais and prospero. The package uri::urn adds scheme urn.

The commands of the package conform to RFC 3986 (https://www.rfc-editor.org/rfc/rfc3986.txt), with the exception of a loophole arising from RFC 1630 and described in RFC 3986 Sections 5.2.2 and 5.4.2. The loophole allows a relative URI to include a scheme if it is the same as the scheme of the base URI against which it is resolved. RFC 3986 recommends avoiding this usage.
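The loophole can be explored with a short script. This is a minimal sketch, assuming the tcllib uri package is installed; the base URI and the references are illustrative values chosen here, not taken from the package documentation. The third reference, http:g, carries the same scheme as the base and so exercises the case that RFC 3986 recommends avoiding. No expected output is given because the result may depend on the package version and on quirk option settings.

```tcl
package require uri

# Illustrative base URI (an assumption of this sketch).
set base http://www.example.com/b/c/d

# A plain relative reference, a dot-segment reference, and a reference
# that repeats the scheme of the base (the RFC 1630 loophole case).
foreach ref {g ../g http:g} {
    # uri::isrelative reports whether the reference is relative;
    # uri::resolve combines the reference with the base.
    puts [format "%-8s relative=%s resolved=%s" \
        $ref [uri::isrelative $ref] [uri::resolve $base $ref]]
}
```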

COMMANDS

SCHEMES

In addition to the commands mentioned above, this package provides regular expressions to recognize URLs for a number of URL schemes.

For each supported scheme a namespace of the same name as the scheme itself is provided inside of the namespace uri containing the variable url whose contents are a regular expression to recognize URLs of that scheme. Additional variables may contain regular expressions for parts of URLs for that scheme.

The variable uri::schemes contains a list of all registered schemes. Currently these are ftp, ldap, ldaps, file, http, https, gopher, mailto, news, wais and prospero.
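The scheme namespaces and the variable uri::schemes can be combined to test a string against every registered pattern. The following is a minimal sketch, assuming the tcllib uri package is installed; the helper name matchingSchemes and the sample text are hypothetical and not part of the package. The patterns are used here for an unanchored search, so a URL embedded in longer text is expected to match.

```tcl
package require uri

# Hypothetical helper: return the registered schemes whose "url"
# pattern matches somewhere in $text.
proc matchingSchemes {text} {
    set found {}
    foreach scheme $::uri::schemes {
        # Each scheme has a namespace ::uri::<scheme> holding "url".
        set var ::uri::${scheme}::url
        if {[info exists $var] && [regexp -- [set $var] $text]} {
            lappend found $scheme
        }
    }
    return $found
}

puts [matchingSchemes "See http://www.example.com/index.html for details"]
```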

EXTENDING

Extending the range of schemes supported by uri::split and uri::join is easy because neither command handles the request itself. Each dispatches it to another command in the uri namespace, using the scheme of the URL as the criterion.

uri::split and uri::join call Split[string totitle <scheme>] and Join[string totitle <scheme>] respectively.

The provision of split and join commands is sufficient to extend the commands uri::canonicalize and uri::geturl (the latter subject to the availability of a suitable package with a geturl command). In contrast, to extend the command uri::resolve to a new scheme, the command itself must be modified.
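As an illustration of the dispatch described above, the following sketch adds split and join commands for a made-up scheme named demo, with URLs of the form demo:<name>. The scheme, the key name and both proc bodies are hypothetical. The sketch also assumes that uri::split passes the URL text to the split command as a single argument and that uri::join passes its key/value arguments through unchanged; this page does not state the calling convention, so check the package source before relying on it.

```tcl
package require uri

# Hypothetical split command for scheme "demo".
# Assumption: called with the URL text as its only argument.
proc ::uri::SplitDemo {url} {
    # Tolerate the URL with or without its "demo:" prefix.
    regsub -- {^demo:} $url {} name
    return [list name $name]
}

# Hypothetical join command for scheme "demo".
# Assumption: called with the key/value pairs given to uri::join.
proc ::uri::JoinDemo {args} {
    array set parts {name {}}
    array set parts $args
    return "demo:$parts(name)"
}

array set result [uri::split demo:hello]
puts $result(name)
puts [uri::join scheme demo name hello]
```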

To extend the range of schemes for which pattern information is available, use the command uri::register.

An example of a package that provides both commands and pattern information for a new scheme is uri::urn, which adds scheme urn.

QUIRK OPTIONS

The value of a "quirk option" is boolean: the value false requests conformance with RFC 3986, while true requests use of the quirk. Use command uri::setQuirkOption to access the values of quirk options.

Quirk options are useful both for allowing backwards compatibility when a command specification changes, and for adding useful features that are not included in RFC specifications. The following quirk options are currently defined:

BACKWARD COMPATIBILITY

To behave as similarly as possible to versions of uri earlier than 1.2.7, set the following quirk options:

In code that can tolerate the return by uri::split of an additional key pbare, set

in order to achieve greater compliance with RFC 3986.

NEW DESIGNS

For new projects, the following settings are recommended:

DEFAULT VALUES

The default values for package uri version 1.2.7 are intended to be a compromise between backwards compatibility and improved features. Different default values may be chosen in future versions of package uri.

EXAMPLES

A Windows® local filename such as "C:\Other Files\startup.txt" is not suitable for use as the path element of a URI in the scheme file.

The Tcl command file normalize will convert the backslashes to forward slashes. To generate a valid path for the scheme file, "/" must be prepended to the normalized filename, and then any characters that do not match the regexp bracket expression

[a-zA-Z0-9$_.+!*'(,)?:@&=-]

must be percent-encoded.

The result in this example is "/C:/Other%20Files/startup.txt" which is a valid value for path.
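The conversion can be sketched in plain Tcl without loading the package. The proc name fileToUriPath is hypothetical. The sketch assumes that "/" is kept as the directory separator and that characters outside the bracket expression are encoded as their UTF-8 bytes; the text above does not prescribe an encoding for non-ASCII characters, so that choice is an assumption.

```tcl
# Hypothetical helper: convert a normalized filename (forward slashes)
# into a percent-encoded path for the scheme "file".
proc fileToUriPath {normalized} {
    # Windows names such as C:/... need a leading "/".
    if {[string index $normalized 0] ne "/"} {
        set normalized /$normalized
    }
    set out ""
    foreach ch [split $normalized ""] {
        if {$ch eq "/" || [regexp {^[a-zA-Z0-9$_.+!*'(,)?:@&=-]$} $ch]} {
            # Directory separators and allowed characters pass through.
            append out $ch
        } else {
            # Assumption: encode every other character as UTF-8 bytes.
            foreach byte [split [encoding convertto utf-8 $ch] ""] {
                append out [format %%%02X [scan $byte %c]]
            }
        }
    }
    return $out
}

puts [fileToUriPath "C:/Other Files/startup.txt"]
# prints /C:/Other%20Files/startup.txt
```

On Windows the argument would typically come from file normalize; a literal string is used here so that the sketch behaves the same on any platform.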

% uri::join path /C:/Other%20Files/startup.txt scheme file

file:///C:/Other%20Files/startup.txt

% uri::split file:///C:/Other%20Files/startup.txt

path /C:/Other%20Files/startup.txt scheme file
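In a script, the key/value list returned by uri::split is convenient to load into an array. The following is a minimal sketch, assuming the tcllib uri package is installed with its default quirk option values; the array name parts is arbitrary. The values noted in the comments are those shown in the transcript above.

```tcl
package require uri

# Load the key/value pairs returned by uri::split into an array.
array set parts [uri::split file:///C:/Other%20Files/startup.txt]

puts $parts(scheme)   ;# file, per the transcript above
puts $parts(path)     ;# /C:/Other%20Files/startup.txt

# Rebuild the URI from the two components.
set rebuilt [uri::join scheme $parts(scheme) path $parts(path)]
puts $rebuilt         ;# file:///C:/Other%20Files/startup.txt
```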

On UNIX® systems filenames begin with "/" which is also used as the directory separator. The only action needed to convert a filename to a valid path is percent-encoding.

CREDITS

Original code (regular expressions) by Andreas Kupries. Modularisation by Steve Ball, also the split/join/resolve functionality. RFC 3986 conformance by Keith Nash.

Bugs, Ideas, Feedback

This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category uri of the Tcllib Trackers. Please also report any ideas for enhancements you may have for either package and/or documentation.

When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

Note further that attachments are strongly preferred over inlined patches. Attachments can be made by going to the Edit form of the ticket immediately after its creation, and then using the left-most button in the secondary navigation bar.

KEYWORDS

fetching information, file, ftp, gopher, http, https, ldap, mailto, news, prospero, rfc 1630, rfc 2255, rfc 2396, rfc 3986, uri, url, wais, www

CATEGORY

Networking