# TIP 406: "C" is for Cookie
State: Final
Type: Project
Tcl-Version: 8.7
Vote: Done
Post-History:
Author: Donal K. Fellows <[email protected]>
Created: 01-Aug-2012
Tcl-Branch: dkf-http-cookies
Vote-Summary: Accepted 5/0/1
Votes-For: DKF, KBK, JN, FV, SL
Votes-Against: none
Votes-Present: BG
-----
# Abstract
The "http" package needs cookie support, especially to support complex modern
web authentication protocols. This TIP defines a pluggable interface and a
TclOO class that implements that interface so that Tcl programmers can control
just what is shared and stored, and where.
# Rationale and Design Constraints
Cookies are a feature of many modern web applications. They're short strings
stored in HTTP clients on behalf of servers, and which are then sent back to
those servers with further requests to those servers. Often \(but not
universally\) these short strings are IDs that are used as database keys to
look up a "session object" in a server-side database that holds relevant
information; such information can include whether the user has logged in, what
color scheme to use in the stylesheet, etc. Of particular relevance to Tcl
programmers is the fact that they are often now associated with login
information; this means that it is not practical to leave Tcl code without
cookie support.
Currently, Tcl programs that handle cookies have to do so manually, examining
the otherwise-unparsed HTTP headers to see if anything relevant has been set.
As the [cookie protocol (RFC 6265)](http://tools.ietf.org/html/rfc6265) is quite complex,
it is highly desirable to have a centralized implementation of at least the
basic parsing and injection code.
The main down-side of adding cookie handling is that definitely increases the
ability of web servers to track clients written in Tcl. In particular, there
is a danger of cross-application tracking. It is therefore necessary to ensure
that the cookie handling mechanism is off by default, and that the locations
used to store the cookies are controllable by the Tcl application.
# Proposed Change to 'http' Package
I propose to add a new configuration option to the _http_ package:
> **-cookiejar** _commandPrefixList_
This option \(configurable via **http::config**\), can either be an empty
string \(the default\), or it can be a list of words that is a command prefix.
An empty string will disable cookie handling: the current _http_ package
behavior in relation to cookies \(i.e., ignoring them\) will prevail. However,
if a non-empty list of words is supplied, they will be used as if they were a
command \(or rather a prefix to be expanded\) in the following ways:
## For Each Cookie Provided by an HTTP Server
When a cookie is supplied by an HTTP server, it will be reported to the cookie
jar command like this:
> \{\*\}_$commandPrefixList_ **storeCookie** _cookieName cookieValue optionDictionary_
The _cookieName_ and _cookieValue_ are relatively self-explanatory; they
represent the name/value pair to store. The _optionDictionary_ contains the
parsed cookie options, being at least these:
**hostonly**: Boolean; whether the cookie should only ever be returned to the
originating host \(if not, it should be sent according to the domain\). This
property is always present.
**httponly**: Boolean; whether the cookie ought to be only used with HTTP
connections. \(NB: This is unlikely to be something we enforce.\) This property
is always present.
**persistent**: Boolean; whether the HTTP server wishes the cookie to persist for
longer than the current "session". This property is always present.
Non-persistent cookies are expected to never be committed to permanent
storage.
**secure**: Boolean; whether the cookie should only be sent on "secure"
connections to the HTTP server. \(This typically means "HTTPS is required",
but the various cookie specifications are less-than-clear in this area.\) This
property is always present.
**expires**: Timestamp; when a persistent cookie will cease to be interesting to
the HTTP server that issued it. \(NB: If this is in the past, any matching
cookie should be deleted.\) This property is only present if the "persistent"
property is true.
**domain**: String; what domains should this cookie be sent to. This property is
always present, but may sometimes be the same as the "origin" property; see
the "hostonly" property for how to treat this.
**origin**: String; what host did we send the request to that caused this cookie
to be generated. This property is always present.
**path**: String; what resource paths within the relevant HTTP servers should a
cookie be sent to. This property is always present.
The result of this command will be ignored \(so long as it is non-exceptional\).
## When Making an HTTP Request
Each time an HTTP request is made, the cookie store is consulted \(prior to the
connection being opened\) to find out what cookies should be sent as part of
the request, like this:
> \{\*\}_$commandPrefixList_ **getCookies** _protocol host path_
The _protocol_ is the name of the protocol scheme that will be used to
contact the HTTP server \(typically **http** or **https**\), the _host_ is
the server's hostname, and the _path_ is the resource path on that server.
The query and fragment parts of the URL are _never_ supplied to the cookie
store as part of this request, nor is the port, nor are any user
identification credentials \(the cookie specification specifically states that
it is a known problem that cookies have _always_ ignored the service port
number for the purposes of whether to send the cookie; we therefore duplicate
this failure\).
The result is treated as a list of keys and values \(i.e., it is expected to be
a list with an even number of items in it, with the first key at index 0 and
the first value at index 1\) and describes the collection of cookies to send.
The _http_ package will manage the formatting of the cookies as part of the
request to send.
# A Cookie Jar Implementation
To go with the above specification, this TIP also describes a TclOO class that
will be provided to implement the cookie store side of the protocol. This
class will be provided in the **cookiejar** package.
The name of the class will be **::http::cookiejar**, and its instances will
be cookie stores that participate in the above protocol. The constructor of
the class will take an optional argument that names an SQLite database that
will be used to store the cookies; if no name is provided, an in-memory
database will be used and all cookies will be treated like pure session
cookies.
The result is that it will be possible to enable cookie handling in a Tcl
script using this:
package require http
package require cookiejar
http::config -cookiejar [http::cookiejar new ~/mycookies.db]
The **::http::cookiejar** class will also allow configuration of its logging
level via the **loglevel** method on the class \(which takes a single
argument, the new logging level, or which returns the current logging level if
called with no arguments\). Permitted log levels are **error**, **warn**,
**info** and **debug**. Log messages will be written by a call to
**::http::Log** \(which does nothing by default anyway\).
An example of setting the logging level to the \(substantially more verbose\)
**debug** level:
http::cookiejar loglevel debug
The instances of the **::http::cookiejar** class will additionally support
the method **forceLoadDomainData** and the method **lookup**.
> _instance_ **forceLoadDomainData**
This instructs the instance to load \(or reload\) its definitions of what
domains may not have cookies set for them. It takes no extra arguments and
produces an empty result \(unless an error occurs\).
> _instance_ **lookup** ?_host_? ?_key_?
This looks up cookies in the store. If neither _host_ nor _key_ are
specified, this returns the list hosts for which cookies are defined. If
_host_ is specified but _key_ is not, this returns the list of cookie keys
for the host \(note that these may be session cookies or durable cookies; this
interface does not distinguish\). If both _host_ and _key_ are specified,
the value for the particular cookie is returned \(with it being an error if no
such cookie is defined\). This method provides no mechanism for setting the
value of a cookie or creating a new one.
## Implementation Notes
It is worth noting that the current cookiejar package will download a list of
"bad" domains \(i.e., domains that correspond to super-registries, such as
**com**, **ac.uk** or **tk**\) when a new database is constructed
\(provided the cookiejar instance is backed by a database file; in-memory
databases never have this part populated by default\). This extensive list of
domains
<http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1>
is a security feature that prevents the setting of cookies for large numbers
of hosts at once, but it is belived sufficiently long that it is excessive for
supplying as part of the package directly.
The degree to which it is necessary to update cookie stores from that list has
not yet been studied.
The underlying SQLite database is forwarded to the cookiejar instance's
interface as the \(default non-exported\) method **Database**; this takes all
the normal subcommands of an SQLite _dbcmd_, as documented in
<http://www.sqlite.org/tclsqlite.html> .
The cookiejar package handles conversion of host names into and out of
punycode so that lookups are always performed on canonical names. This is
important because there is no guarantee that the encoding of host names will
be the same as they are referred to in another context; for example, the list
of forbidden domains above is in UTF-8 and not using the IDNA scheme.
# Privacy
Cookies are often associated in the public mind with problems with privacy.
There are two principal mechanisms provided here to mitigate these \(beyond the
proposed domain restrictions, which follow recommended Best Practice\):
1. No Tcl interpreter will ever have cookie handling enabled by default.
There will always need to be an explicit action taken to turn it on.
2. Applications have to pick the name of their cookie stores when creating
and installing them; there is no default.
# Copyright
This document has been placed in the public domain.