[Mirrored from:http://www.inf.tu-dresden.de/~jw6/doc/sdc/intro-en.html]

Typeset, a short introduction

This is a short introduction to sdc.

sdc is a formatter for SGML documents.

This document describes the use of the document types which are currently handled by sdc. It does not describe the adaptation of sdc to other document types or target formats.

1 Overview

Typeset is an extensible formatter for documents. It transforms documents using SGML markup into various target formats.

Typeset comes with a couple of document type definitions (DTDs).

The DTDs feature the reuse of text, minimization of markup and readability of the SGML source. They share their elements as much as possible.

The formatting differs according to the features available in the target format and the rules common for the type of document. This includes the automated rearrangement of text and the insertion of standard parts such as contents sections, a sorted index and a bibliography. The latter, for instance, is composed from the items of a database which are referenced in the document. For some formats the output may be spread over a couple of files. See the target type documentation for details.

In keeping with the goal of text reuse and the aim to support many target formats, these DTDs don't attempt to cover each and every possible case. Instead, they try to provide all elements necessary for daily use and leave the implementation of special features to extensions.

It is also possible to have parts of the documents use other notations, e.g., pictures drawn with tgif, xfig, the @Fig package of Lout, or Encapsulated PostScript.

It is fairly easy to coerce sdc to parse documents with other DTDs. But this requires you either to write formatting rules for the desired target format(s), or to fit in another parsing stage which changes the document into a form as if it had been marked up according to a supported DTD.

The transformation (formatting) is described by files of Scheme code related to both the document type and the target format. Only combinations of common value are supported by default. (For instance, for letters only PostScript output is defined.)

Currently there are these DTDs:

document: Simple ``plain'' documents.
report: Technical reports, documentation etc.
book: Books (longer documentations).
bibdata: Bibliography database.
manpage: Pages for the Unix(TM) man command.
brief: A letter according to DIN.

Currently the following target formats are supported:

PostScript (for English and German text)
LaTeX
HTML (Hyper Text Markup Language)
Info (to be used in the on-line help of emacs)
man, suitable for roff -man
ASCII
source code (literate programming)
slide, to extract slides from a document
limited support for RTF

Future output formats will include: roff -ms (or -mm), RTF.

2 What is SGML

This section may be translated some time. It's present in the German version and is intended to explain the advantages of text processing over word processing and the advantages of generalized markup over target-dependent markup.

If you retrieved the package you are probably convinced anyway, so it's obsolete.

For a description of SGML, refer to [1].

3 Invocation

First of all, set your environment variable DOCPATH. It should point to the top directory of the tree where all your files containing SGML data are stored.

Setting this variable could be done like this (for (t)csh and compatible shells):

 setenv DOCPATH $HOME

Invocation syntax for sdc:

sdc [options|filename.sgml]*

Options:

-o filename

Set the name for the output file. If omitted or set to -, the output goes to the standard output. (This can cause problems with some target formats if they split the document.)

-O type

Set the target format. type can be:

ps: create a PostScript document.
latex: create a LaTeX file.
html: create an HTML page.
info: create an Info file.
man: create a man page.
literate: create source files (literate programming).
rtf: (only partially supported) create an RTF file.
slide: create a PostScript file holding the slides from the document.

If the -O switch is omitted, a guess is made from the extension of the output file name. If neither gives a target type, this is an error.
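The selection rule can be pictured with a small Python sketch. It is not sdc's code: the function name and the extension table are assumptions made for illustration (only the type names and the `.ps` example come from this manual), and the real table inside sdc may differ.

```python
from pathlib import Path

# Assumption: an illustrative extension table, not the one sdc really uses.
EXTENSION_TO_TYPE = {
    ".ps": "ps",
    ".tex": "latex",
    ".html": "html",
    ".info": "info",
    ".man": "man",
    ".rtf": "rtf",
}


def target_type(o_type, output_name):
    """Pick the target format the way the text describes it.

    o_type      -- value of the -O switch, or None if it was omitted
    output_name -- value of the -o switch, or None / "-" for standard output
    """
    if o_type is not None:
        return o_type  # an explicit -O always decides
    if output_name not in (None, "-"):
        guess = EXTENSION_TO_TYPE.get(Path(output_name).suffix.lower())
        if guess is not None:
            return guess  # guessed from the file name extension
    # Neither the switch nor the extension gave a type: this is an error.
    raise ValueError("no target type: use -O or a known output extension")
```

With this sketch, `target_type(None, "text.ps")` selects `ps`, matching the typical call shown further down.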

-D directory

Add directory in front of the path searched for entities (files) of the documents. Each option can add only one directory. Multiple options are processed left to right, i.e., the last directory on the command line is searched first.
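A minimal Python sketch of the resulting search order, assuming the behaviour described here and in the Environment section (sdc appends its own directories at the end of DOCPATH). The function and parameter names are hypothetical and not part of sdc.

```python
def entity_search_path(docpath, d_options, library_dirs):
    """Return the directories in the order they would be searched.

    docpath      -- value of DOCPATH, a colon separated list
    d_options    -- directories given with -D, in command line order
    library_dirs -- directories sdc adds on its own (hypothetical parameter)
    """
    path = [d for d in docpath.split(":") if d]
    for directory in d_options:      # processed left to right
        path.insert(0, directory)    # each one is put in front
    return path + list(library_dirs)  # sdc's own files come last
```

For `-D a -D b` the list starts with `b`, then `a`, then the DOCPATH entries.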

-i entityname

Ensure that a definition like

<!ENTITY % entityname "INCLUDE" >

precedes the processing of the documents. This is useful for optionally including marked sections. Refer to the manual sgmls(1) for a detailed description. This option is passed to sgmls.

-m file

Extend the list of catalog files to search for some SGML entities. Refer to the manual sgmls(1) for a detailed description. This option is passed to sgmls.

-L dirname

Set the name of the directory to use as the library of files to search for target format descriptions.

-R file

Set a startup file to load after the default ~/.typesetrc. Multiple -R options are allowed and processed in the given order. The file argument is treated as either a path name to the file or a name relative to the rc directory of a directory in the library (see -L).

Startup files can have their own arguments. If the argument given with a -R option contains a colon, only the part up to that colon gives the name of the file to be loaded. The rest of the argument (without the colon) is assigned to the variable *-R-option-argument* while the specified file is loaded. If there was no colon in the argument, #f is assigned.
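The splitting rule can be sketched in Python as follows. This is only an illustration: the function name is made up, Python's `None` stands in for Scheme's `#f`, and splitting at the first colon is an assumption (the text does not say what happens with several colons).

```python
def split_r_argument(argument):
    """Split a -R argument into (file name, startup file argument).

    The second element is None (standing in for #f) if there is no colon.
    """
    file_name, colon, rest = argument.partition(":")  # first colon only
    return file_name, (rest if colon else None)
```

So `split_r_argument("nidx")` yields no argument at all, while a hypothetical `myrc:draft` would load `myrc` with `draft` as its argument.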

-V level

Be verbose and don't delete temporary files (for debugging). level must be a number. The default for level is 1. This will give only warnings and (for a historical reason) a message upon success. Higher values give more messages.

With the -R option there are additional (long) options available to change the overall behavior. These are used by supplying one or more of the following file names to the -R option.

nidx: Pretend to have the NIDX token in the face attribute.
1c, 2c, 1s, 2s: Similar to nidx; modify the value in effect of the face attribute in the top-level document.
no-margin: No page margin in the (ASCII) output. Only implemented for Lout processing at the moment.
HTML2: Don't use HTML-3 features in formatting.

Attention! Be careful to supply the exact name to the -R option. The same policy as for dot files applies to those files: if they don't exist they are silently not loaded! There is no warning message.

A typical call would be:

sdc -o text.ps source.sgml

3.1 Environment

Typeset recognizes the following environment variables:

DOCPATH: This path is used to find the entities of the document. It gets extended (at the end) by sdc to include the files of its own. Also, directories given by a -D option are prepended.
Usually a good value for DOCPATH is something like $HOME or $HOME/text:$HOME/doc.
SGML_CATALOG_FILES: The files mentioned by this variable are consulted by the underlying parser to find some SGML entities. For a detailed description refer to the manual sgmls(1). This variable gets extended by sdc to include one file of its own, the first file named CATALOG found in the library. As with sgmls, the value can be extended by the -m option, which is simply passed to sgmls.
Usually it's good to leave this variable alone.
TYPESETLIB: This variable is used by sdc to find the directories to search for formatting translation files and the DTDs and CATALOG files for the underlying SGML parser. It may point to one directory or a list of directories separated by colons. This value can be overridden by the -L option.
Usually it's good to leave this variable alone, except if you want to override some but not all files of the library.

3.2 Files

personal.data: is used by the DTDs which come with sdc to find definitions for the SGML entities related to the author. These two are myself and my-Inst. The file may define some more, but these two are used to insert default values for the name and the institution of the author. Therefore it's a good idea to set the environment variable DOCPATH so that sdc will find this file. An example of how to set up the content of this file comes with sdc.
~/.typesetrc: if present, is loaded after startup and command line evaluation. It may contain any Scheme code.

sdc uses the files and directory structure in its library to parse the document and determine the formatting. For a description of this refer to the developer's documentation.

4 Document types

4.1 Document type `document`

We start with an example:

<!doctype document public "-//JFW//DTD Document//EN" >
<document>The Title
<sect>Intro
Here goes the introductory text.
<sect>We continue
This is the text body of the first section. It's going to be a little
bit longer to show that the formatting of the source file really
doesn't matter for the output.

We start a new paragraph simply by inserting a newline.

A document starts, as every SGML document does, with the document type declaration. It's opened with the document tag.

There are the following attributes available to a document:

date: The date of the document.
author: The author.
inst: The institution.
lang: The lang attribute for the document is obsolete. It has the same effect as changing the public document language in the document declaration.
face: This attribute is intended to affect the representation of the output. E.g., a value of 1c should cause printing in one column and a value of 2c should result in two columns. A token nidx will suppress the generation of an index even if index tags are used. Multiple values can be assigned to face (in quotes). See the documentation for the target formats for the treatment of this attribute.

Then some paragraphs may follow. After these paragraphs, either no sections or more than one section can follow. Finally, an appendix can finish the document.

4.2 Document type `report`

A technical report consists of an abstract followed by sections and possibly appendices. Appendices themselves are divided into sections.

Again an example:

<!doctype report public "-//JFW//DTD Report//EN">
<report date="Today">Report title
<abstract>
This is the abstract.
</abstract>
Here comes some introductory text. For some targets (e.g., PostScript
because Lout doesn't allow text at this point) this text is taken to
be a section named "Introduction" (in the document language).
<sect>first section
The text.
<sect>second section
More text.
<appendix>
<sect>Appendix
Text of the appendix.

The part from and including the appendix tag is optional.

The sections may be divided by sect1 tags, and these in turn by sect2. There is no division of sect2.

4.3 Document type `book`

Books are written using a document type declaration like:

<!doctype book public "-//JFW//DTD Book//EN">
<book date="Today">Title of the Book

The <book> tag has the same attributes as the document or report tag. This is for consistency but questionable. The inst attribute would better be called publisher. Future versions may rename this attribute.

In principle a book is divided into chapters (at least two) with the <chapt> tag, just as a simple document is divided into sections. Chapters themselves consist of sections. Prior to the first chapter there may be two special sections named <preface> and <intro>.

Future versions will support grouping of chapters into <part>'s.

4.4 Document type `manpage`

The document type manpage is intended to produce pages for the Unix man command. It restricts the set of available elements to what roff -man can handle. In particular, figures are not valid elements. Furthermore this document type enforces a ``good style'' for the man page. The possible sections are predefined. (Well, there is a backdoor to use self-defined sections, see doc/badman.sgml.)

A manpage begins like this:

<!doctype manpage public "-//JFW//DTD Manpage//EN">
<manpage date="today">TYPESET
<short>an extensible SGML formatter
<synopsis> <code/typeset/ <var/options/ <var/files/ ...
<descript>

Up to this point the elements are enforced. That is, you can't write a man page without a title (between the <manpage> and the <short> tag), a short description and a synopsis. Up to the synopsis these elements can't even span multiple lines, which couldn't be formatted to be a man page.

The description section can be preceded by a config section. But while config can be omitted, <descript> cannot.

The description section plays a special role in another way: it's the only one which may be divided into subsections using <sect1>.

The following sections are also valid for a man page. They must appear in this order:

options, return, errors, example, env, files, conform, notes, diag, restrict, history, see.

Here is what goes into the sections:

4.4.1 Synopsis

Tag: <synopsis>.

The command or what to write to call the function. For example:

<code/typeset/ <var/options/ <var/files/ ...

or for C functions and system calls:

#include <something.h>
int foo(int bar);
int foo2(int bar);

4.4.2 Config

Tag: <config>.

This section explains how a device is configured: major/minor numbers,their meanings and the meaning of the device name.

4.4.3 Description

Tag: <descript>.

A long, drawn-out discussion of the program. It's a good idea to break this up into subsections using <sect1>.

There is no BUGS section; instead discuss them here.

4.4.4 Options

Tag: <options>.

Some people make this separate from the description.

It's intended to hold one <desc> element (5.3). Don't forget to use the elements <code> and <var> here. Example:

<desc><dt><code/-option/ <var/file/
    Text describing the option....

4.4.5 Return Value

Tag: <return>.

What the program or function returns if successful.

4.4.6 Errors

Tag: <errors>.

Return codes, either exit status or errno settings.

4.4.7 Examples

Tag: <example>.

Give some example uses of the program.

4.4.8 Environment

Tag: <env>.

Environment variables this program cares about.

4.4.9 Files

Tag: <files>.

All files used by the program. Typical usage is a <desc> again (5.3).

4.4.10 Conforming To

Tag: <conform>.

SVID [EXT], AT&T, POSIX, X/OPEN, BSD 4.3

4.4.11 Notes

Tag: <notes>.

Miscellaneous commentary.

4.4.12 Diagnostics

Tag: <diag>.

All the possible error messages the program can print out, and what they mean.

4.4.13 Restrictions

Tag: <restrict>.

Bugs you don't plan to fix :-)

4.4.14 History

Tag: <history>.

Programs derived from other sources sometimes have this.

4.4.15 See Also

Tag: <see>.

Other man pages to check out, like:

<ref t=m id="man(1)"//, <ref t=m id="man(7)"//, <ref t=m id="makewhatis(8)"//.

Refer to (5.5) for a detailed description of <ref>.

4.5 Document type `brief`

brief is a DTD for letters according to the German DIN standard. Example:

<!doctype brief public "-//JFW//DTD Brief//DE">
<brief fenster=ja>
<von>&my-adr;
<an><adr
NAME="Mustermann"
VORNAME="Erwin"
ORT=Musterhausen
STRASSE="Musterstra&ss;e 7m"
PLZ="01000">
<datum>1. Januar 1995
<betr>Musterschreiben
<anrede>Lieber Erwin,
<text>
Heute reden wir &ue;ber Musterbriefe.
Wie gef&ae;llt Dir das?
<gruss>Ciao
<anlage><pkt>1 Musterschreiben

This gives a letter according to the DIN standard. The sender address appears a second time in the window of the envelope. Fold marks are printed as well. fenster=nein will suppress this.

The tags anrede, gruss and anlage may be omitted. In this case standard text is inserted for the opening and closing, and the list of enclosures is omitted.

The form of the address looks a little complicated. This is because it's intended to come from a database. If you set one up, its use will look like the from address (<von>...).

The entity my-adr is defined in the file holding the personal data described in section (10).

5 Elements of the document

The different document types share the same elements (except for ``brief'').

5.1 Paragraphs

Paragraphs are separated with a paragraph tag.

To reduce markup, an empty line counts as such a tag. Sequences of those tags are reduced to just one.
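The effect of this rule on plain text can be sketched in Python. This is only a model of the behaviour described above, not how sdc or the SGML parser implement it, and the function name is made up.

```python
import re


def split_paragraphs(text):
    """Split text at runs of empty lines; any run counts as one break."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # Line breaks inside a paragraph do not matter for the output.
    return [" ".join(block.split()) for block in blocks if block.strip()]
```

Four empty lines in a row therefore start exactly one new paragraph, just like a single one.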

5.2 Enumerations

There are two kinds of lists, ordered and unordered.

<enum>: opens an enumerated list.
<list>: opens an unnumbered list.
<item>: starts a new item.
<o>: same as <item> (short form).

Both <list> and <enum> have to be closed (by </list> or </enum>).

Example:

<list>
<item> Language to describe the logical structure of text.
<o> a tool and library to format SGML text into
<enum>
<o>PostScript
<item>HTML
</enum>
<item>One more point
</list>

formats to:

* Language to describe the logical structure of text.
* a tool and library to format SGML text into
  1. PostScript
  2. HTML
* One more point

5.3 Glossaries

Glossaries are declared by the <desc> tag.

Again an example:

<desc<dt/<desc>/ opens a description
<dt/<dt>/ encloses the described topic
</desc>

And the corresponding output.

<desc>: opens a description
<dt>: encloses the described topic

You don't want to put newlines (starting paragraphs) between the <desc> and the <dt>. If you want, you can (as I do) omit the closing > from the <desc> (say <desc).

5.4 Pictures

We don't have a chance to describe pictures in terms of SGML, but we can tell where the formatting application should put them.

To include pictures one can use:

Entity definitions and references. The reference is either a less controlled entity reference like &name; or it's done by using the foreign tag. The latter should be preferred, because it could offer more control, but in fact there is not much of a difference.
This is the preferred method.
Inlined code, see (5.13).
For some cases, like including GIF pictures in HTML documents, it's necessary to compromise the portability and target independence of the document.
Those cases need processing instructions. For the mentioned case one needs the following:
1. An SDATA entity to get a literal > into the final output. This could be achieved by having this definition in the ``header'':
```
<!ENTITY lit-gt SDATA ">" >
```
2. A processing instruction at the place where the picture (here picture.gif) goes:
```
 <? <IMG SRC=picture.gif> &lit-gt;
```
 (We need the lit-gt because SGML has no way to escape a > character within a processing instruction.)

In all these cases there are two ways to include the picture: either directly between the lines of the text or as a floating part of the document. The latter form allows you to put a caption line on it and an identifier to refer to from any part of the document.

Having pictures between the lines usually doesn't look very professional. Because of the effort word processors require to handle floats, it has become common to do so anyway. You should think twice whether this form is appropriate in your case.

If you use entities and notations for your pictures, you need to declare both in the header of your document. See (B) for a more detailed description and refer to [1] for full details.

The local installation (C) may provide some predefined notations. A full installation supports at least eps, fig, lfig, roff, latex, tgif.

To include a picture entity (with entity name ``name'') defined using notations between the lines of text, just write the entity reference like this: &name;.

To include the same picture as a floating object use the <figure> tag:

<figure id=refname><foreign file=name><caption/The caption line/

The id attribute gives the name to be used by cross references (5.5) for this figure.

5.5 Cross References

Cross references are introduced by the ref tag. It has two attributes:

id

The ID it refers to. The interpretation of the id depends on the value of the t attribute.

t

The type of reference. There are 4 possible values:

X: (the default) The ID refers to some id in the same document.
Note that this means a document as SGML understands it. If you are inside something which is a full document by itself, but is included as a SUBDOC entity into some other document, you can refer to any id within this document, but not to any in the outer document!
B: Bibliography reference. The ID denotes the Tag attribute value in the database (7).
M: The ID refers to a manpage.
U: A URL. The ID holds the complete URL.

5.6 Emphasize

To emphasize long parts of text there is the <quote> tag.

Example:

<quote/Important result: long citations, definitions and other material which should be emphasized is enclosed by quote-tags./

And the result:

Important result: long citations, definitions and other material which should be emphasized is enclosed by quote-tags.

The <quote> tag has a style attribute accepting the values default (which is the same as if nothing is given) and center. These give the recommended style. Default is to narrow the text a little, while center narrows and centers the quoted material.

5.7 Footnotes

Footnotes are enclosed by <footnote> tags.

<footnote/You can't have figures in footnotes./

Will format as (1).

5.8 Notes

Sometimes you may like to have longer side notes. These are enclosed by <note>. This looks like this:

This text has been enclosed by `<note>` and `</note>`.

5.9 Verbatim copied text

For excerpts of source code there is the <verb> tag. The examples in this text are mostly made with it. Code inside a verb-marked region is not processed for SGML references at all. Therefore no references to any entity are possible inside (like &lt;).

There is also a variant of <verb> called <rverb>. This means ``replaceable'' verbatim. The content of an <rverb> region is parsed for entity references, thus allowing references to external entities or SGML end tags inside. The examples which have end tags inside are written using this element.

5.10 Emphasizing words

To emphasize short phrases or words you can choose from the preferred tag and these:

: This produces a different kind of emphasis depending on the level of use (slanted, bold, ...).
: Heavy emphasis.
<bf>: Bold face
<it>: Italic
<tt>: Teletype (typewriter) face

5.11 Linguistic Markup

For documentation purposes it is common to distinguish between literals (code), variables and meta characters. Markup exists for this. These tags don't nest; that is, each of them ends any other of this group.

For literals use <code>, for variables <var> and for meta characters <meta>.

Please don't look for their use in this manual; they were introduced late.

5.12 Explicit line and page breaks

In rare cases you might need to insert unconditional newlines. There is the general entity nl for this purpose.

To enforce a page break at a certain position use the <newpage> element.

Don't use it too much. It might look strange in some output formats. You'll need it most with the slide target, so it is best to always enclose it in a marked section like:

<![ %Slide [ <newpage> ]]>

5.13 Inline Code

Using the <inline> element you can include code using other notations (see (B)) as available locally at your site (see (C)).

The <inline> element takes one attribute n, which must be assigned a valid notation. This way you can achieve special effects or write tables and equations.

Inside the <inline> element you can't write other SGML markup, but you can refer to predefined entities (using the &name; notation). This raises the question of how to include a & followed by a letter. For this purpose you need to write a decimal character reference, i.e., &#38;.

Example:

<inline n=lout>45d @Rotate @ShadowBox 2f @Font {that's a funny cyan @Color "&" }</inline>

will give you the text rotated by 45 degrees inside a shadow box (picture not reproduced here).

If you use the `<inline>` feature, be careful about the file names you use: sdc may (depending on the needs of the target format) create files matching the pattern:
basename-of-output`-`number`.`extension.

For admins: The example above might not work ``out of the box'' at your site since it uses a notation which needs to be set up. See (C) and the documentation about target formats for what is necessary.
If you're going to install sdc, see the file `doc/notations.sgml` of the installation and adapt it to your needs.

5.14 Tables

In response to various requests for tables, one or more of the following ``syntaxes'', or to put it better the following ideas, will be implemented. I strongly encourage everybody to mail me comments about this.

Currently there is one SGML construct to describe tables within sdc. If someone comes up with a better solution (it shouldn't look too strange in the source -- sdc is supposed to remain a ``don't worry'' application) it will be incorporated.

A table is enclosed with the <table> tag. It can contain one of the implemented kinds of table (currently only <tbl>).

5.14.1 `tbl`

A tbl table consists of patterns and rows. Each row must name the pattern it is formed after; otherwise it uses the most recently used or defined pattern. A row itself is a sequence of cells. Cells begin either with a <c> tag or simply a |.

A pattern consists of tags describing the alignment. There are left, right, center, decimal and block. Between these, <sep> tags are allowed to request vertical lines in the table.

The formatting of the cells within a row is described by the associated tag in the pattern used.

Probably an example is the better way to explain.

The implementation of tables is not finished. At the moment there are no tables at all for LaTeX.
The syntax defined here a) does not have all the features of all the backends, so you have to drop to plain backend (e.g., inline) code if you really need those features, and b) has more features than all backends have in common, so depending on the backend some features are silently dropped.
For HTML, HTML-3 tables are implemented at the moment. As there is virtually no client to display these, one can change to fixed-font preformatted tables instead. (If you have a client please check if the code works: I can't.) Change `html-tbl-writer-function` in the file `include/layout.scm` to something other than `'html3-write-tbl` or use `-R HTML2` on the command line.

<figure id=tblexam>
<table
<tbl>
<pattern<left<center<sep double<decimal align=",">
<r><bf/Name/ | <bf/Group/ | <bf/V/
<sep>
<r> Fred | M | 3,4
<r> Sian | F | 5,78
<r> Tiger| F | 100
<pattern/<right<center<right>/
<sep double>
<r>A | &lt;=> | B
</tbl
</table
<caption/Example table/
</figure>

Will give you what figure (1) shows.

 Name    Group ||   V
 Fred      M   ||   3,4
 Sian      F   ||   5,78
 Tiger     F   || 100
    A     <=>       B

Example table

Questions:

Is it better to have the freedom to mix the patterns with the rows (as it is now) or to group all patterns before all the rows?
Is the short reference character ``|'' a good choice for the column separator?
How many requests will come for row-spanning cells? (They eat more steam in implementation, thus it will take some more time.)

5.14.2 Next

How about an HTML-3-like syntax? This is even harder to implement on the existing targets (while it's obviously easier on viewers -- but these are internals). Open questions:

I personally dislike the very verbose syntax of HTML-3 tables. What is your opinion? Is it better to have <r> and <c> tags (and ``|'' as short reference for the latter) to separate rows and columns, or the long syntax of HTML-3? The latter will mess up your source code, but has the advantage of being the same as within HTML. (On the other hand, why should we be compatible with the stuff we compile out of the source?)

If the above mentioned ways to create tables don't suit your needs you can use notations, see (B). And you might consider using the <inline> element (see (5.13)) to write the table using other notations as available locally at your site (see (C)).

5.15 Equations

There is limited support for equations at the moment. Features will be added as they are needed. To give an example of what's supported at the moment:

<quote><bf/Lemma:/ <math/&alpha; &isin; <set/M<sup/200/<sub/4// &cap;<set/N//</quote>

Will yield:

Lemma: [alpha] [isin]M^200_4 [cap]N

Future versions will probably support the same features as HTML-3. (As this is in fact simply an SGMLish translation of LaTeX's idea of equations, and the latter is commonly considered to be the best way to write equations.) The question remains: is this the best way or does anybody have a better idea?

As mentioned, the support for equations is limited at the moment and intended for simple formulas. The translation into formatting instructions is implemented in a straightforward way and is not always reliable. If you intend to use LaTeX as backend you may want to use <inline latex> (see (5.13)) and native LaTeX coding instead.

It's not a big task to extend the formatting rules to filter things like that through the PostScript backend and convert the result into, say, GIF for HTML. But nobody has come around to programming it yet.

5.16 Foreign Languages

sdc will insert standard text phrases where appropriate. These depend on the language of the document.

In the simple case you can choose the language for the whole document.This is done by the document declaration. That is:

<!doctype document public "-//JFW//DTD Document//DE" >

will produce a German document and:

<!doctype document public "-//JFW//DTD Document//EN" >

an English one.

Furthermore the tags report, document, chapt, sect, sect1, sect2, lang have an attribute lang which will temporarily change the language in the enclosed part.

Footnote

1): You can't have figures in footnotes.

6 Creating Indexes

sdc creates an index for a document if at least one <index> tag was used in the text.

An index tag can appear at every place where text is allowed exceptfor headlines, which are terminated by.

For the index tag the following attributes are defined:

id: The topic which will appear in the index section. This attribute is required.
sub: An optional subtopic.

Creating useful indexes is an art in itself. Therefore you should choose the attribute values carefully. The most common (recommended) way to use indexes is like this:

<sect id=Indexes>Creating Indexes<index id=Element sub=index<index id=Indexes>

Please note: Because of an unresolved problem it's better to use the short notation as above. That is: don't close the <index tags (omit the >). Otherwise the PostScript output will contain an empty paragraph.

As mentioned, the <index> tag is not restricted to be used immediately after the section start.

7 Bibliography database

sdc can use one or more databases for the bibliography. If items in them are referenced (by a <ref>, see (5.5)), a section is automatically appended to the document and the referenced items are listed.

To use a database you need to include the database file, for instance like this:

<!doctype report public "-//JFW//DTD Report//DE" [
<!entity bib system "intro-bib.sgml" subdoc>
]>

And you have to reference the entity later on somewhere in the document like this: &bib;. But be sure to do so at some point where data is allowed, e.g., where the first paragraph begins. (Otherwise you'll violate SGML rules.)

The database has the following structure (with repeated BIBL's):

<!doctype bibdata public "-//JFW//DTD Bibliography//EN" []>
<BIBL
     Tag="SGMLGuide"
     Author="Martin Bryan"
     Title="SGML an authors guide"
     Publ="Addison Wesley Publishing Company"
     Year="1993"
     exc= "not avail"
     ISBN="...."
     COM="a comment on the book">

Don't write data in this file, only start tags as the one above.

This database will be extended (some day) to support at least all features a bibtex database supports.

Please note: If you get warnings about items not in the database, then the numbers within the references may be wrong. Those numbers are only correct if all references could be resolved.

8 Conditional Inclusion

You can always include text depending on the definition of some parameter entities, as you can with sgmls. That is, you write a marked section like this:

<![ %Name [conditional included Text]]>

Where Name is a previously defined parameter entity. Refer to section (14) for predefined entities and conventions with sdc. To exclude the text by default you need to have an entity definition like the following in the document type definition:

<!ENTITY % Name "IGNORE">

Now you're able to include the marked section with the command line switch -i Name. This will pretend that an entity definition

<!ENTITY % Name "INCLUDE">

is seen by the parser, which will override the other one.

Depending on the target format (the -O switch) and the public text language some parameter entities are predefined. See (14) for details.

9 Slides

Slides are extracted from ordinary documents, or more precisely only from documents of the types document, report and book.

Everything enclosed by a <slide> tag goes onto one slide. (Or more than one slide, we can't control what happens if a slide gets overfilled.) Everything outside of those tags is simply dropped.

Slide tags might appear wherever a tag is allowed.

If you have more material in the document than you want to have on the slide (as is usually the case) you can exclude parts by marked sections, see (8).

The title used for the slide is that given to the last division (like chapt, sect, sect1) seen. The slides are grouped by the sect's of the document. So if all you want is a document full of slides, just write one and put in <slide>'s instead of paragraphs.

As there is only a ``Slide'' entity defined when you generate your slides, but the usual case is to exclude material from them, put three lines like:
<!ENTITY % Slide "IGNORE">
<![ %Slide [ <!ENTITY % noSlide "IGNORE"> ]]>
<!ENTITY % noSlide "INCLUDE" >
This way noSlide is defined if Slide is not. Why? SGML says the first definition wins. If you define Slide via the target, noSlide is defined to ignore, otherwise to include.

10 Personal Data

At some points sdc will insert person-dependent text, such as the default for the author's name and institution.

These are defined in a document called personal.data. Usually it is found in the directory DOCPATH points to (see (3)).

There is a file called example-personal.data in the standard library directory, which can be copied and adapted.

The following general entities must be defined in this file. (They serve as defaults for the corresponding attributes.)

myself: The name of the author.
my-Inst: The institution / organisation.
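
How a file such as personal.data might be located can be sketched as follows. This is only an illustration in Python, not sdc's actual lookup code. It assumes what the changes section says about DOCPATH (a simple colon separated list of directories) and about -D (a directory put in front of that path); the function name and the exists parameter are invented for this example.

```python
import os


def find_in_docpath(filename, docpath, front_dirs=(), exists=os.path.isfile):
    """Return the first existing path for filename, or None.

    docpath is a colon separated list of directories. front_dirs models
    directories placed in front of the path; they are searched first.
    exists is the check used for each candidate, replaceable for testing.
    """
    directories = list(front_dirs) + [d for d in docpath.split(":") if d]
    for directory in directories:
        candidate = os.path.join(directory, filename)
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Looks for personal.data along the DOCPATH of the current environment.
    print(find_in_docpath("personal.data", os.environ.get("DOCPATH", "")))
```

The first directory that contains the file wins, so a private copy in an earlier directory shadows the one in a shared directory.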

11 Appendices

Appendices are all the sections which follow the <appendix> tag.

12 Large documents

Large documents can be spread over individual files. Together these files form the document. In principle there are two ways to declare them.

As a General Entity (see (A)).
As a SUBDOC General Entity.

The first form inserts the whole document in place of the entity reference. This means that its markup becomes part of the document the entity is included in. This restricts what you can put into the included document, but it has the advantage that cross references work both from outside in and from inside out.

Example: within the <doctype ... declaration one can have an entity declaration like:

    <doctype ... [
    ...
    <!ENTITY t.desc system "descript.text" >
    ...
    ]>

and later on, at the position in the document where the contents of the file are to appear, an ordinary entity reference like &t.desc;. (Note that you cannot cross reference that piece of the document as a whole, only the id's defined within it.)

sdc won't do anything interesting with the first form. It's already handled by the SGML parser.

The second form, subdoc entities, makes the part to be included a document of its own. That is, it can (could) be used without being included. Hence you can't put references from inside out. SGML also disallows references from outside into it.

Subdoc entities are restructured to form a division at the place where the reference occurs, e.g., if you are within a section and reference a subdoc entity, you get a subsection holding the text. sdc achieves this by preprocessing the subdoc entity: it re-tags the document as if it had been tagged according to the document DTD.

As explained, you can't do any cross referencing between the included document and the outer document. But you can reference the subdocument as a whole by using its entity name.

Example: the entity declaration changes from the above form to something like:

    <doctype ... [
    ...
    <!ENTITY bib SYSTEM "intro-bib.sgml" SUBDOC>
    ...
    ]>

Note that the declaration is extended by the SUBDOC keyword. And remember that documents included this way must start with a <doctype ... declaration, while those suitable for the first form must not.

Subdoc entities are included the same way, that is, in the example, by the entity reference &bib;. In this case the entity name (bib) becomes available for cross referencing (you can do <ref id=bib// in this example).

13 Literate Programming

sdc supports so-called ``literate programming'', in which the source code and its documentation are mixed. To get the plain code, the documentation first has to be stripped away.

With sdc the source that goes into one file can be spread over a set of files (i.e., over one document), and one document can also contain the source of a set of files.

To do literate programming, the code is surrounded by the <literate> tag. The literate tag has one attribute called file, the name of the file where the code goes. This file is opened at its first occurrence and closed when sdc finishes its work. If no file is given, the last one used is implied; if none at all has been specified before, standard output is used.
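
The rule for choosing the output file can be made concrete with a small model. The Python sketch below is not how sdc is implemented (sdc is written in Scheme); it only restates the rule from the paragraph above. The function name and the representation of the chunks as pairs are assumptions made for this example.

```python
def route_literate_chunks(chunks):
    """Group code chunks by the file they are written to.

    chunks is a list of (file_or_None, code) pairs in document order,
    where None means the <literate> tag carried no file attribute.
    The result maps each destination to its concatenated code; the
    key None stands for standard output.
    """
    outputs = {}
    current = None  # no file named so far: standard output
    for file_name, code in chunks:
        if file_name is not None:
            current = file_name  # an explicit file attribute switches the target
        # A file stays open until the end, so later chunks are appended.
        outputs.setdefault(current, []).append(code)
    return {dest: "".join(parts) for dest, parts in outputs.items()}


if __name__ == "__main__":
    example = [
        (None, "# preamble\n"),
        ("a.c", "int x;\n"),
        (None, "int y;\n"),
        ("b.c", "int z;\n"),
        ("a.c", "int w;\n"),
    ]
    for destination, text in route_literate_chunks(example).items():
        print(destination, repr(text))
```

In the example the chunk without a file attribute that follows the a.c chunk ends up in a.c, and the later a.c chunk is appended to the same file rather than overwriting it.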

Within the literate tagged area there may be anything that can appear in normal text. But no formatting is applied to the output if the target literate is chosen. Its typical (intended) contents are verb and rverb elements and perhaps plain text paragraphs. Because of the formatting applied for other target formats, plain text paragraphs are only a good idea if the code (or comment) is still readable after reformatting.

With the features of replacement and conditional inclusion, this form of literate programming might be useful for simple preprocessing.

14 Parameter Entities

SGML knows two kinds of entities: general entities and parameter entities. While general entities can be referenced anywhere, parameter entities are restricted to the meta level where the document structure is defined. This is mostly the document type definition, but marked sections can also use parameter entities for conditional inclusion (see (8)).

To avoid conflicts between user defined and predefined parameter entities there is one convention: all parameter entities defined by sdc begin with an upper case letter. User defined parameter entities should therefore begin with a lower case letter. This is by no means enforced, but it is a recommended convention.

Depending on the target format the following parameter entities are predefined. Only the one associated with the target format is defined as INCLUDE; all others yield IGNORE.

LaTeX: for LaTeX processed documents.
Lout: for PostScript output through Lout
HTML: for HTML pages
ASCII: for ASCII output
Info: for output in Info format
Literat: for literate programming
RTF: for RTF output
Slide: together with Lout for extracting slides.
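
The list above can be read as a set of declarations that sdc supplies before the document is parsed. The following Python sketch only illustrates that reading and is not taken from sdc. How the values of the -O switch map to these entity names is not described here, so the function simply takes the active entity names as input; its name is invented for this example.

```python
TARGET_ENTITIES = ["LaTeX", "Lout", "HTML", "ASCII", "Info", "Literat", "RTF", "Slide"]


def predefined_declarations(active):
    """Return one parameter entity declaration per target entity.

    active is the set of entity names belonging to the chosen target,
    e.g. {"HTML"}, or {"Lout", "Slide"} when slides are extracted.
    Every other entity is declared as IGNORE.
    """
    unknown = set(active) - set(TARGET_ENTITIES)
    if unknown:
        raise ValueError("unknown target entities: %s" % sorted(unknown))
    return [
        '<!ENTITY %% %s "%s">' % (name, "INCLUDE" if name in active else "IGNORE")
        for name in TARGET_ENTITIES
    ]


if __name__ == "__main__":
    for line in predefined_declarations({"Lout", "Slide"}):
        print(line)
```

A marked section guarded by one of these names is then kept for the matching target and dropped for all others.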

Depending on the public text language, the parameter entity named after the corresponding ISO 639 language code is defined as INCLUDE, the others again as IGNORE.

Currently only the definition as INCLUDE is handled. It's up to you to define the others as IGNORE in your doctype directive. Also, only English and German are supported at the moment.

Here are the codes reserved for this purpose:

EN: English, DE: German, FR: French, GR: Greek, IT: Italian, NL: Dutch, ES: Spanish, PT: Portuguese, AR: Arabic, HE: Hebrew, RU: Russian, CH: Chinese, JA: Japanese, HI: Hindi, UR: Urdu, SA: Sanskrit.

15 Changing the Layout

Formatting instructions and the like are deliberately not allowed within the document, so the layout cannot be changed from there.

To change the formatting, use the -R option to load a Scheme file. If you do so, be sure you know what you are doing! You might want to have a look at the file include/layout.scm, as the definitions there are the ones most likely to be changed according to taste. Here is an example of such a file, showing how to change the initial font for Lout.

    (doc-preprocess-hook 'add (lambda ()
       (if (equal? doc-output "lout")
           (eval (set! lout-initial-font "Times Base 12p")))))

With this in a file, say TB12.scm, call sdc like

sdc -R TB12.scm -Ops -o output.ps text.sgml

A General Entities

Currently there are predefined general entities for setting mathematical symbols. Their names are those defined by Addison Wesley.

The set of general entities for Greek characters is also defined as by Addison Wesley.

To see these sets of entities and their representation, format a document like the following into the target format you need.

    <!doctype document public "-//JFW//DTD Document//EN" [
    <!ENTITY f.AWm SYSTEM "AWmaths.text" >
    <!ENTITY f.AWg SYSTEM "AWgreek.text" >
    ]>
    <document face="2c 2s nidx">General Entities
    <sect>AWmaths
    &f.AWm
    <sect>AWgreek
    &f.AWg

Prior versions had a problem with the &lt; entity. There were two flavors, &lt; and &ltc;. The problem is solved, see the changes section.

B Notations

SGML provides so-called notations to define entities which use ``foreign'' descriptions of their content. That is, the entity might be a file containing a figure drawn with, say, tgif. The interpretation of system notations is implementation dependent.

You may define notations, and some entities using them. sdc provides an interpretation for notations which are declared to be SYSTEM.

The system identifier is used to start external commands. Prior to the command execution the following macros are expanded:

%s: the system identifier of the entity
%f: the complete filename of the entity
%%: a single %
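
The macro expansion listed above can be sketched in a few lines. This Python model is an illustration only, not sdc's code. It assumes that the command is scanned once from left to right, so that the text produced by %% is not expanded again, and that any other %-sequence is left untouched. The function name is invented for this example.

```python
import re


def expand_macros(command, system_id, filename):
    """Expand %s, %f and %% in a notation's command line.

    system_id is the system identifier of the entity and filename its
    complete file name. The command is scanned once, so "%%f" yields
    the literal text "%f" rather than the file name.
    """
    replacements = {"%s": system_id, "%f": filename, "%%": "%"}
    # A function is used as the replacement so that backslashes in file
    # names are inserted literally.
    return re.sub(r"%[sf%]", lambda match: replacements[match.group(0)], command)


if __name__ == "__main__":
    print(expand_macros("cat %f", "file1", "/docs/file1"))
```

For the notation cat used in the example of this section, the command line "cat %f" would thus become cat followed by the full file name of the referenced entity.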

If the notation is to be applied to inlined code, a temporary file is used.

Example:

    <!DOCTYPE document public "-//JFW//DTD Document//DE"[
    <!NOTATION cat SYSTEM "cat %f" >
    <!ENTITY f1 SYSTEM "file1" NDATA cat>
    <!ENTITY f2 SYSTEM "something" NDATA cat>
    ]>
    <document face="2s 2c">Notation Demo
    Some text goes here. &f1;
    And now call &f2;.

For an extensive example see doc/nottest.sgml in the sdc distribution. This shows the ``real world'' notations provided by default by sdc.

The above example declares a notation named ``cat'' which starts the external program cat with one argument, the full filename of the entity the notation is applied to.

Then two entities (f1 and f2) are declared. Both use the notation cat. Their system identifiers are ``file1'' and ``something'' (i.e., their content is stored in these files), and their content is declared to be in the notation cat. (That is done by the NDATA keyword, read ``notation data'', followed by the name of the notation, i.e., cat in the example.)

When the entities are referenced (&f1; and &f2;), the command line associated with the notation (i.e., cat %f) is executed (after the expansion of macros as explained above). The output of this command is fed into the backend/output of sdc.

Make sure that the output generated by the external program is suitable for the target format. If needed, use conditional definitions for the notations depending on the target format. (See section (8) for parameter entities defined depending on the target.) Needless to say, the gain in comfort and features is paid for by a loss in document portability.

A possible application could be to have a picture stored in some format (like that of xfig). If there is a program fig2dev which translates this format into encapsulated PostScript, you may wrap it in a shell script like this:

    #!/bin/sh
    # $1 is the file name of the entity (a .fig file)
    SRC=`basename $1`
    TARGET=`basename $1 .fig`.eps
    # convert the figure to PostScript
    fig2dev -L ps $SRC > $TARGET
    # print the Lout command which includes the converted picture
    echo "@IncludeGraphic{" \"$TARGET\" "}"

If you use this notation with the Lout backend to produce PostScript, this will insert the picture at every reference to the corresponding entity.

C Local Features

The following notations are defined by default:

ignore: Just prints a note that it has been used. It's intended to get rid of the functionality implied by the others.
eps: This is Encapsulated PostScript.
fig: This is to be used with figures drawn by xfig. It will automatically retrieve the referenced file and transform it into the proper format for the target.
lfig: Figures drawn using the @Fig package of Lout. See the Lout user manual [2] for a description.
roff: Preprocesses tbl and other roff code. Be sure not to exceed one page with the image you get from this notation.
latex: Pieces of LaTeX code (primarily intended for formulas).
tgif: This is for figures drawn by tgif.

To use them you need to extend the head of the document like this:

    <!doctype document public "-//JFW//DTD Document//DE"[
    <!ENTITY fig1 SYSTEM "figure1.fig" NDATA fig >
    <!ENTITY tbl1 SYSTEM "table1.eps" NDATA eps >
    ]>

Then you can use them as described in section (5.4).

D Installation

See the file INSTALL.

E Changes

(26th June 96) arguments to rc files
(19th June 96) Version 0.7: Introduced reparsing of the document. Implementation based on streams and a recursive descent parser.
Because the old code is left in, some files are hardly clean or easy to read.
SUBDOC entities supported. You can include complete documents within others and get them restructured as divisions of the outer document.
DTD's changed to use a mixed content model. No explicit tags necessary anymore.
<newpage> supported within list items.
Tgif supported as notation.
Most of the (computationally intensive) notation handling is based on Makefiles.
<quote> element has got a style attribute.
-D option added. It adds a directory at the front of the path given by the environment variable DOCPATH.
The &ltc; is not needed anymore. (But still allowed.)
(25th Oct. 95) SGML_PATH is no longer used (but overwritten!). The new name for the path pointing to the document tree is DOCPATH. It is a simple colon separated path, pointing to the directories to be searched for document entities.
(25th Oct. 95) Proposed tables work a little with Lout (they work as long as there are no cells spanning rows or columns). Unfortunately the implementation revealed limits of Lout. There is obviously no way to implement the full semantics without changes to Lout, and those are not within reach.
rc files added
library is now a path

F Problems

If you run into any problems, don't hesitate to mail me:

<joerg.wittenberger@inf.tu-dresden.de>

Known Problems:

Ghostscript version 3.3 has a problem writing ppm files. If sdc refuses to generate gif files, try Ghostscript version 2.6.1.
LaTeX output discards leading whitespace in <verb> and <rverb> elements. I don't know how to coerce LaTeX to keep it. There is a macro \psedoverb in the converter file for LaTeX; if you have an idea how to correct this macro, please drop me a note.
Quote characters are internally translated into markup. They can't span paragraph boundaries.

H Bibliography

[1] Martin Bryan: SGML an authors guide; Addison Wesley Publishing Company, 1993.
[2] Jeffrey H. Kingston: A User's Guide to the Lout Document Formatting System (Version 3); Basser Department of Computer Science, 1994.

I Index

Appendix
Bibliography
Cross Reference
- footnote
Customization
DTD
- book
- brief
- document
- manpage
- report
Element
- appendix
- code, var, meta
- desc
- enum
- figure
- footnote
- index
- inline
- item
- lang
- list
- note
- o
- p
- paragraph
Emphasize
- documenting code
- long Text
- words and phrases
Entities
- general
- parameter
- predefined
- referencing
Environment
- DOCPATH
Equation
Files
- personal.data
Footnotes
Greek characters
Indexes
Invocation
- options
Language
Notation
- application
- application of
- handling
- inlined
- local predefined
Pictures
Re-Tagging
SUBDOC
Slides
Tables
conditional inclusion
element
- rverb
- verb
foreign data
layout
local
marked section
newline
newpage
target formats

Jörg Wittenberger

