Tk Library Source Code

Ticket UUID: 1611929
Title: Suggestions for improving doctools documentation
Type: Bug
Version: None
Submitter: dnew
Created on: 2006-12-09 01:44:24
Subsystem: doctools
Assigned To: andreas_kupries
Priority: 8
Severity:
Status: Closed
Last Modified: 2007-03-21 03:37:45
Resolution: Fixed
Closed By: andreas_kupries
Closed on: 2007-03-20 20:37:45
Description:
I recently put together a fairly large man page for my Amazon S3 library.
In developing the library, I wrote all the documentation into a plain text file before writing the code. Then it came time to format the manual page in a way that would be easy to incorporate into Tcllib. I had trouble. :-)

I recognise that Tcllib is a free resource maintained by many out of the goodness of their heart. I would like, in that spirit, to describe some of the things that gave me trouble in writing the manual pages, and to suggest some improvements. I do not intend this to be interpreted as demands, criticisms of past work, or ingratitude in any way. I would like this to be interpreted merely as suggestions on how the documentation could be more helpful to those in my situation: people trying to follow existing standards in developing new libraries for Tcl.

Note too that I ask a lot of rhetorical questions. Many of these I still don't know the answer to, but I don't expect any sort of specific reply to any of the comments, complaints, suggestions, or questions. I offer this purely in the spirit of "here is my experience, and if you'd like to improve that for others, I thank you for your time."


I think there are two giant unobviousnesses from reading the doctools pages.

1) What does a manual page look like?

2) How do you process manual pages into readable documentation?

As for the first part, let's look at doctools_fmt. (I haven't tried to do any directory or TOC work.)

The new EBNF grammar is helpful, but confusing at the same time. *My* confusion comes perhaps from using EBNF in RFCs and such (which is where I assume the "EBNF" comes from), and having the EBNF shown here not follow the same rules. There is no literal "]" anywhere in the EBNF, so already it's obvious that something is "wrong" with the EBNF. The "COMMENT" production yields "comment", but neither [comment] nor [comment here comes stuff] is a valid command.

It would be helpful to give a reference to how to interpret the EBNF, just in terms of the syntactic elements such as curly braces. (Since it looks like curly braces are the only "extended" feature, perhaps simply mentioning that it means "zero or more times" would be sufficient.)


The block after the sentence "Then we define rules for all the keywords. Here we introduce our knowledge that some commands allow only whitespace after them." could be completely eliminated simply by using the same spelling for both sets of terms. Why equate EX_BEGIN to "example_begin" when you could just use "EXAMPLE_BEGIN"? It's also incorrect to state that some commands allow only whitespace after them, when what is really meant is they disallow non-whitespace text between the end of the command and the start of the next command. 
It also looks, at first glance, as if the block of EBNF after that already indicates where a "text_chunk" or a "regular_text" can appear. Indeed, the last chunk of EBNF is the only chunk that's notably useful. That the EBNF lacks the information about what arguments the formatting commands take (which could have been in the second block) is also a stumbling block, inducing scrolling up and down looking for the various details needed to type one line of manual.

For example, in the second block you could just have
  BULLET ::= "[" "bullet" "]"
  SECTION ::= "[" "section" ARG "]"
and then ARG could be defined as
  ARG ::= (all text except WHITE or "]") | QUOTE (all text except QUOTE) QUOTE
  QUOTE ::= (a literal " mark)

Alternately, just giving a good example would clarify a lot of things, since the intent is to describe rather than prescribe.

Those who work with other markup languages might find it surprising that when [comment] takes a single argument, [comment here is text] is illegal and must be [comment "here is text"]. That it takes Tcl-style quoting (with " or {} but not ') might be more surprising. (Or not, of course.)

In understanding this stuff, it's often useful to have even a simple example, or a ferinstance, for unobvious manipulations of never-before-seen terms. "Vset varname value" - what's a "document variable"? Is it like a macro? Is it something that the processing engine plug-in uses?  Adding a comment like "Useful in the same way a global constant is in a program" or "helpful when a phrase that might change is repeated in several places in the manual" or some such might help.
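As an illustration of the "global constant" reading of document variables, here is a minimal sketch. It assumes Tcllib's doctools package is installed and that the one-argument form [vset varname] returns the stored value; the package name "docexample" and the variable name VERSION are made up for the example.

```tcl
package require doctools

# A tiny, hypothetical man page. [vset VERSION 1.0] stores a document
# variable and generates no output; [vset VERSION] inserts the value.
set manpage {
[vset VERSION 1.0]
[manpage_begin docexample n [vset VERSION]]
[moddesc {Documentation examples}]
[titledesc {Hypothetical package used to illustrate vset}]
[description]
This page describes version [vset VERSION] of the package.
[manpage_end]
}

# Format the page as plain text ("dt" is an arbitrary object name).
::doctools::new dt -format text
set text [dt format $manpage]
dt destroy

puts $text
```

Changing the single [vset VERSION 1.0] line updates every place the value is used, which is the "phrase that might change is repeated in several places" case described above.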

see_also doesn't describe the format of its arguments. If they're links, it's unobvious what format they need to be in. Similarly for keywords. Funny enough, the "see also" entries on this page are *not* links, while the keywords *are* links, which would seem counterintuitive. The labels to see_also identifying the document don't identify the document; they give the human-readable name of the document, and assume the reader can identify the document from the name. (A URL, for example, would identify the document, or an argument that the processing engine could turn into a URL.) The statement that "each argument is a single keyword" confuses the fact that it's actually (in this document) an entire [uri] production it seems.

It's not obvious what the difference between [arg] and [arg_def] is, given the description; similarly for other pairs of similar terms. For things like [arg] and [cmd], I would use the expression "formats the argument /text/ as the name of a command argument" rather than "declaring" something. Particularly given that [call] actually declares something in the sense that I (at least) think of as a declaration. I think programmers normally think of "declarations" as something you later refer back to, rather than simply formatting commands. The authors of the doctools probably think of it as declaring the text as something to the formatting engine, but I think that's confusing for users of the formatting engine. What does the "declaration" of a namespace imply, beyond simply formatting it a certain color? Does it show up anywhere else? I would think that only [call] and [section] actually declare something, as such. Possibly [keyword] and [see_also] as well, as they seem to move the text around some. Otherwise, [emph] would declare the word to be emphasised.

All the different "declares that the argument /text/ is the name of ..." don't really say anything, in the same sense that {incr x ; # increment x} isn't a useful comment. It might be useful to have a call or two as an example, formatted, and show what the markup is for each element of the call. Have a Tcl proc with positional, optional, and -keyword based arguments, an expr function, a Tk widget creation call, a widget call, and an ITcl/Snit/whatever call, and show the markup for each in the example section. (See the next paragraph for why this would be helpful.)

The documentation for list_begin is confusing, if for no other reason than it is unclear which half of the nested list the "what" refers to. Is it [list_begin arg] or [list_begin arg_def]? An example of each use would probably be helpful. Mentioning that it is the [call] tag that actually places the call in the list at the top of the man page would also be useful. A_LIST is confusing because it tells only half the story, and you have to go read the prose about the "list_begin" command (several screens away) to figure out there's an argument-matching condition. Does the list of options need to have [option] in it? That is, should it be [opt_def -possible] or should it be [opt_def [option -possible]] or should it be [opt_def [opt [option -possible]]]? Or is it just [lst_item [option -possible]]?

What's the difference between [cmd_def] and [call]? Impossible to guess from the documentation. Given the unobvious-to-the-uninitiated behavior of [call], the description of [usage] is even more incomprehensible.

What's the difference between [para] and [nl]? Why is one permitted only inside lists and the other permitted only outside lists? Is there a user-interesting reason for this, or is it simply a restriction of the formatting engines?

The "Example" section is useless. It should at a minimum mention where to get the source of man pages. While almost every possible construct is used here, I suspect they're not used correctly, since there is no actual documentation of (for example) tkoptions, args, etc. I.e., it may be an example of how the formatting looks, but not how it is supposed to be used.

The description of everything is in a really difficult order. It's neither alphabetical, nor top-down, nor bottom-up. 

As for the processing, giving an example of how to invoke dtp.kit would help. I'm not sure where I got mine (it was recently downloaded, tho), but simply double-clicking the .kit under Windows simply says it can't read "C:\Document", indicating it's not handling directory spaces correctly. Figuring out that it needs to be invoked from the command line as 
  tclsh dtp.kit doc html S3.man >S3.html
was unnecessarily difficult. True, it's giving options for setting the processing engine and all, so it's understandable why it's difficult, but the documentation should address this, possibly in a section called "GETTING STARTED", so it's easy to find. Adding a -version flag/option/etc would probably be a good idea.
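For readers who have Tcllib installed but no dtp.kit, the same conversion can be sketched directly against the doctools package. This is a minimal sketch, not a description of what dtp does internally; the proc name convertManpage and its argument names are invented for the example, and the file names are the ones from the command line above.

```tcl
package require doctools

# Convert one doctools man page to the given output format and
# return any warnings the engine produced.
# The proc and argument names are illustrative only.
proc convertManpage {inFile outFile {format html}} {
    set in [open $inFile r]
    set src [read $in]
    close $in

    # -format selects the output engine; -file records the input name.
    # "dt" is an arbitrary object name; an error in [dt format] leaves
    # the object alive, so a retry needs a [dt destroy] first.
    ::doctools::new dt -format $format -file $inFile
    set result   [dt format $src]
    set warnings [dt warnings]
    dt destroy

    set out [open $outFile w]
    puts -nonewline $out $result
    close $out
    return $warnings
}
```

A call such as "convertManpage S3.man S3.html" would then play the role of the dtp invocation, and "convertManpage S3.man S3.txt text" would produce plain text instead.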

I think that any or all of these four overhauls would make the documentation much more useful:

1) Fix the EBNF to actually generate valid documents when expanded. Put in the work to deal with the literal text, the arguments, etc. Either that, or drop the first two blocks, since they're basically uninformative given the third block.

2) Give a pointer to an example manual page that uses all the features in the correct way. Create a namespace called "docexample" and create in it a command with options and optional arguments, tk-style commands, and so on. Give people a skeleton framework from which to work.

3) Give instructions on finding and using the latest version of the processing program, at least to the extent that a user can figure out whether what they've written matches what the program will process. If they can't even experiment, it's difficult to get right. Simply a URL to where the latest release is stored along with instructions on invoking it to generate (say) HTML, would be helpful. 

4) Improve the error messages. I'm personally used to using systems that don't tell you where the error is, requiring adding text a paragraph at a time so you know what you changed since it last worked. Many aren't. I know it's difficult, tho, probably moreso with a plug-in architecture. Perhaps offer hints or, again, a skeleton to start with.

My guess is that 2 and 3 are probably pretty easy for those currently in-the-know about this tool, and that they'd also produce the most bang for the buck.
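To make suggestion 2 concrete, here is a rough sketch of what such a skeleton might look like, wrapped in a small Tcl script so it can be run as is when Tcllib's doctools package is installed. Everything about "docexample" (the namespace, the frob command, the -verbose option) is hypothetical. The list type name "options" is the one introduced by the later documentation update mentioned in the comments below; older releases used different list type names.

```tcl
package require doctools

# Skeleton man page for a hypothetical package "docexample".
set manpage {
[comment {Skeleton man page for the hypothetical package docexample}]
[manpage_begin docexample n 1.0]
[moddesc {Documentation examples}]
[titledesc {Skeleton showing commonly used doctools commands}]
[require Tcl 8.4]
[require docexample [opt 1.0]]
[description]
[para]
The package provides a single command.

[list_begin definitions]
[call [cmd ::docexample::frob] [arg name] [opt [arg value]]]
Stores [arg value] under [arg name]. Without [arg value] the command
returns the current setting.
[list_end]

[section Options]
[list_begin options]
[opt_def -verbose [arg boolean]]
Controls whether progress messages are printed.
[list_end]

[keywords example documentation]
[manpage_end]
}

# Format the skeleton as HTML ("dt" is an arbitrary object name).
::doctools::new dt -format html
set html [dt format $manpage]
dt destroy

puts $html
```

Saving the text between the braces to a file such as docexample.man gives a starting point that can be edited one command at a time and re-run to see which markup the processor accepts.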
User Comments: andreas_kupries added on 2007-03-21 03:37:45:
Logged In: YES 
user_id=75003
Originator: NO

Ok, yesterday I committed the whole set of updated documentation, not only for doctools, but the docidx and doctoc formats as well. FAQs have been added, but are relatively small so far. All manpages now have a section about how and where to provide feedback. This is something I will add over time to all manpages in Tcllib. I made the doctools language nicer too (See changelog for the details. I mainly loosened a number of constraints, and added better names for the various list types and list item commands. The docs show the updated language, old stuff is still accepted, but deprecated and causes warnings).

nobody added on 2007-02-22 06:50:37:
Logged In: NO 

This looks a lot better to me! The intro doc is excellent. The other thing I would suggest is perhaps adding one more document that is simply a collection of examples of each type of command. That is, a web page where one could cut-and-paste the formatting for a widget, for a command that takes -name/value pairs that are optional, etc etc etc. It would be just a matter of digging out representative samples from the existing package documentation and putting them in an easy-to-find place. Even just a list of "to see a sample mega-widget documentation, check package Yadda. To see a sample OO class document, see package Boogle" would do the trick, I think. Just some place that makes sense of the different formatting commands at a detailed level.

andreas_kupries added on 2007-02-17 06:36:47:

File Added - 216376: doc.tgz

Logged In: YES 
user_id=75003
Originator: NO

Now attached a first go at a set of new man pages for the doctools markup language. Please review and respond.

Looking at this again, (4), ... The error messages do tell the location of the problem, line number and column.
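One way to look at the message for a given mistake is to feed a deliberately broken page to the package and print whatever comes back. The following is only a sketch, assuming Tcllib's doctools package is installed; the page content is made up, and the exact wording of the message depends on the installed version.

```tcl
package require doctools

# Deliberately broken input: the definitions list is never closed.
set broken {
[manpage_begin docexample n 1.0]
[description]
[list_begin definitions]
[call [cmd ::docexample::frob] [arg name]]
Does something.
[manpage_end]
}

# The "null" engine only checks the input and produces no document.
::doctools::new dt -format null -file broken.man
if {[catch {dt format $broken} message]} {
    puts stderr "formatting failed: $message"
} else {
    puts "no error reported"
}
dt destroy
```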

Of the 4 documents the command reference likely needs most of the expansion (i.e. more argument descriptions, add examples for each command).


lvirden added on 2006-12-11 20:44:47:
Logged In: YES 
user_id=15949
Originator: NO

One thing that needs to be carefully kept in mind as doc for this package is written is that at least two types of documentation are needed.

One type of doc is for someone who is going to be working on the code itself. That doc would focus on design decisions, what the plugin architecture is for and how to use it, etc.

The other type of doc is for someone who wants to write documentation in doctools format. For that person, the emphasis needs to be on SEMANTICAL use of the format. One of the reasons for doctools is semantical markup of text. In a semantical view, one doesn't care whether the resulting text is bold faced, underlined, etc. What matters is that people CORRECTLY mark the various parts of text. If one marks the text correctly, then the doctools tools should be able to generate things in a format that is useful later. And viewing the doc from that perspective is invaluable.

I wonder if some sort of tool would be useful to have that would somehow display the documentation visually emphasizing the semantical markup, so that one could easily look at text and identify where something had a mistaken markup?  I'm not certain about a good way to do this - something to think about.

dnew added on 2006-12-11 10:05:56:
Logged In: YES 
user_id=37425
Originator: YES

Actually, the Wiki seems to address a majority of the insufficiencies in the documentation. Removing the "incorrect" information from the manual page (like the EBNF that's wrong) and either incorporating the examples and specifics from the Wiki or including a prominent pointer to the Wiki in the man page would probably be sufficient to answer most questions. I guess I'm just old-fashioned and haven't gotten into the habit of checking the wiki before the man pages. :-)

mic42 added on 2006-12-10 19:25:37:
Logged In: YES 
user_id=302287
Originator: NO

For reference, the wiki pages, which should be included (at least partially) into the doctools docs:
http://wiki.tcl.tk/doctools
http://wiki.tcl.tk/dtp

These document at least the basics. Would they be enough to satisfy your third point?

mic42 added on 2006-12-10 19:20:31:
Logged In: YES 
user_id=302287
Originator: NO

I fully agree that the doctools docs need improvements.
Most Tcllib authors probably use sak.tcl to generate the docs from their doctools formatted man pages, where it simply is: sak.tcl doc html <packagename> IIRC, but that kind of magic should be exported into a general use package for non-tcllib packages.
So I agree that point 3 should be easy to do and give a big bang for the buck.

Attachments: