TEI to TeX Conversion

The traditional method for presenting text-critical comments in critical text editions is a footnote apparatus that is keyed to the line numbering of the main text. We have developed a method to prepare our XML documents in such a way that this traditional presentation can be generated automatically from the XML source by means of a Perl conversion script. The output of our script can be processed immediately by the TeX typesetting system, a versatile, programmable system that is mostly used in scientific typesetting. In our XML documents we embed processing instructions that do not belong to the structural markup of the documents. These instructions are not relevant for the on-line presentation, but they are processed by the TEI-to-TeX conversion script.

Contents

1. Processing Instructions

2. Supplying Context

3. Grouping of Tags

4. Nested Processing Instructions

5. Ellipsis

6. Hiding

7. Commentary Notes and Anchors

8. Reference Chart

1. Processing Instructions

Processing instructions are always used in pairs: <?B0?> ... <?E0?>. These processing instructions indicate that the enclosed fragment must be referred to in the transcription apparatus. After processing the XML documents, the main text is printed in normalized fashion, i.e. deletions are not printed and additions are printed as regular text. The notes, on the other hand, contain all information about the additions, deletions and other markup dealing with the transcription. As a result, if the editor does not place any processing instructions, the text is converted to a clean normalized edition without any notes in the transcription apparatus.

Here is a sample of this procedure. The following text passage contains a supralinear addition that the editor wants to mention in the apparatus.

For the Logick which tooke place <add>though it</add> might
doe well enough in civil affairs ...

If the text is left as it is, the TEI-to-TeX script will produce only the main text ‘For the Logick which tooke place though it might doe well enough in civil affairs’. In order to generate a note, processing instructions must be placed around the addition.

For the Logick which tooke place <?B0?><add>though it</add><?E0?> might
doe well enough in civil affairs ...

Now the phrase <add>though it</add> is not only present in the main text, but it is also transferred to the critical apparatus, where it will appear in a form more or less similar to ‘10] `though it´’, in which 10 is a reference to the line number of the phrase in the main text while the quote-like symbols indicate an addition. Note that the graphical presentation for additions, deletions, etc. is not built rigidly into the conversion script; it is laid down in TeX definitions that can be adjusted easily, depending on the typographical requirements for the edition.
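The behaviour described above can be illustrated with a minimal Python sketch. It is not the project's Perl script; the function names (`main_text`, `apparatus`, `squeeze`) are hypothetical, only level-0 pairs and the <add> tag are handled, and line references are left out because they are calculated by TeX.

```python
import re

# Processing instructions of any level: <?B0?>, <?E0?>, <?B1?>, ...
PI = re.compile(r"<\?[BE]\d+\?>")
# A level-0 pair and the fragment it encloses.
NOTE_SPAN = re.compile(r"<\?B0\?>(.*?)<\?E0\?>", re.S)
# Only <add> is handled in this sketch.
ADD = re.compile(r"<add>(.*?)</add>", re.S)


def squeeze(text):
    """Collapse line breaks and runs of spaces into single spaces."""
    return " ".join(text.split())


def main_text(src):
    """Normalized reading: instructions vanish, additions become plain text."""
    return squeeze(ADD.sub(r"\1", PI.sub("", src)))


def apparatus(src):
    """One entry per <?B0?> ... <?E0?> pair, additions shown as `...´."""
    return [squeeze(ADD.sub(r"`\1´", m.group(1))) for m in NOTE_SPAN.finditer(src)]


plain = ("For the Logick which tooke place <add>though it</add> might\n"
         "doe well enough in civil affairs")
marked = ("For the Logick which tooke place <?B0?><add>though it</add><?E0?> might\n"
          "doe well enough in civil affairs")
```

With these two inputs the main text is identical, while only `marked` yields an apparatus entry.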

2. Supplying Context

Often a note in the critical apparatus is only clear to the reader if some amount of context is supplied. This is the case for all deletions, as they do not appear in the normalized text. Here is an example.

voluntary action upon some precedent knowledg <?B0?>or
<del>reas<corr>on</corr></del> appearance<?E0?> of knowledg

Now the main text will read ‘voluntary action upon some precedent knowledg or appearance of knowledg’ and the note ‘10] or [reas<on>] appearance’. Without the first and last word of context (‘or’ and ‘appearance’) the reader would not be able to see where the deletion in line 10 was situated.
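As a rough illustration of how one fragment can yield both readings, the following Python sketch walks a parsed fragment twice. It is an assumption-laden sketch, not the actual conversion: the wrapper element <frag>, the `NOTE_MARKS` table and the function names are hypothetical, and only <del> and <corr> are given a notation here.

```python
import xml.etree.ElementTree as ET

# Apparatus notation assumed for this sketch: [deleted], <corrected>.
NOTE_MARKS = {"del": ("[", "]"), "corr": ("<", ">")}


def render(elem, mode):
    """Return the text of elem for mode 'main' (normalized) or 'note' (apparatus)."""
    inner = elem.text or ""
    for child in elem:
        inner += render(child, mode) + (child.tail or "")
    if mode == "main":
        # Deletions are dropped from the normalized main text.
        return "" if elem.tag == "del" else inner
    left, right = NOTE_MARKS.get(elem.tag, ("", ""))
    return left + inner + right


def squeeze(text):
    """Collapse line breaks and runs of spaces into single spaces."""
    return " ".join(text.split())


# The fragment enclosed by <?B0?> ... <?E0?> in the example, wrapped in a
# hypothetical <frag> element so that it parses as one XML tree.
fragment = ET.fromstring("<frag>or\n<del>reas<corr>on</corr></del> appearance</frag>")
main_reading = squeeze(render(fragment, "main"))
note_reading = squeeze(render(fragment, "note"))
```

The context words ‘or’ and ‘appearance’ survive in both readings, which is what lets the reader locate the deletion.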

3. Grouping of Tags

Sometimes the editor might want to group several transcription tags into one single critical note. This occurs for example when the tags are related, such as a deletion that is followed by an addition. Grouping of tags can simply be achieved by choosing the appropriate context, as can be seen in the following example.

as to an agent yet the truth <?B0?>is the <del>minde</del>
<del><unclear>or</unclear></del> <add>man which is the
agent</add> determins <del>it self</del> him self<?E0?> to this
or that voluntary action

Here the main text will read ‘as to an agent yet the truth is the man which is the agent determins him self to this or that voluntary action’. To this text the following note is attached: ‘10] is the [minde] [or] `man which is the agent´ determins [it self] him self’, which shows three deletions, one addition and one unclear phrase in one critical note.

4. Nested Processing Instructions

Sometimes the editor may want to create several critical notes dealing with the same passage. For this reason the level of processing instructions can be modified: <?B0?> ... <?E0?>, <?B1?> ... <?E1?>, <?B2?> ... <?E2?>, etc. The rule for determining the level of the processing instruction is straightforward: the level of a new pair of processing instructions is the lowest number that is not currently open. Here are some examples of possible sequences.

<?B0?> ... <?B1?> ... <?E1?> ... <?E0?>
<?B0?> ... <?B1?> ... <?E0?> ... <?E1?>
<?B0?> ... <?B1?> ... <?E0?> ... <?B0?> ... <?E0?> ... <?E1?>
<?B0?> ... <?B1?> ... <?E0?> ... <?B0?> ... <?B2?> ... <?E0?> ... <?E2?> ... <?E1?>
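The numbering rule can be written down as a short Python sketch. The event format (a list of begin and end events with arbitrary note keys such as "a" and "b") and the function name `assign_levels` are assumptions made for illustration; editors assign the levels by hand when encoding.

```python
def assign_levels(events):
    """Map ('B', key) / ('E', key) events to processing instruction names.

    A new note receives the lowest level that is not currently open.
    """
    open_levels = {}  # note key -> level currently in use
    result = []
    for kind, key in events:
        if kind == "B":
            level = 0
            while level in open_levels.values():
                level += 1
            open_levels[key] = level
            result.append(f"<?B{level}?>")
        else:
            result.append(f"<?E{open_levels.pop(key)}?>")
    return result


# The last sequence listed above: four notes a, b, c, d that partly overlap.
events = [("B", "a"), ("B", "b"), ("E", "a"), ("B", "c"),
          ("B", "d"), ("E", "c"), ("E", "d"), ("E", "b")]
sequence = " ... ".join(assign_levels(events))
```

Note "c" reuses level 0 because note "a" has already been closed, whereas note "d" must take level 2 because levels 0 and 1 are both open at that point.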

Here is a real-world example of overlapping processing instructions taken from our encoding of Locke’s Of the Conduct of the Understanding. Note that for the sake of readability indenting has been applied (see the Line breaking and indentation conventions section, which also explains the use of the &sp; entity). The processing instructions <?Bhide1?> ... <?Ehide1?> will be introduced later.

rational creatures and Christians&sp;
<?B0?>
  (for <del>I can not thinke them</del> <add>they can hardly be&sp;
  <?B1?>
    thought<?Bhide1?></add><?Ehide1?>
  <?E0?>&sp;
  <del>them</del> realy
<?E1?>&sp;
to be soe&sp;

The resulting main text of this passage will be ‘rational creatures and Christians (for they can hardly be thought realy to be soe’. To this passage two critical notes will be attached: ‘10] (for [I can not thinke them] `they can hardly be thought´’ and ‘10-11] thought [them] realy’. Of course, here the line numbers ‘10]’ and ‘10-11]’ are only provided as dummy data; these numbers will be calculated by TeX when processing the TeX documents.
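How overlapping pairs lead to separate notes can be sketched in Python as follows. This is an illustrative sketch only: `collect_notes` is a hypothetical name, the input is a schematic string rather than real TEI, and tags inside the fragments are not interpreted.

```python
import re

PI = re.compile(r"<\?([BE])(\d+)\?>")


def collect_notes(src):
    """Return (level, raw fragment) for every note, in order of closing."""
    finished = []
    open_notes = {}  # level -> list of text chunks seen since <?Bn?>
    pos = 0
    for m in PI.finditer(src):
        chunk = src[pos:m.start()]
        for chunks in open_notes.values():
            chunks.append(chunk)  # text belongs to every note open right now
        kind, level = m.group(1), int(m.group(2))
        if kind == "B":
            if level in open_notes:
                raise ValueError(f"level {level} is already open")
            open_notes[level] = []
        else:
            if level not in open_notes:
                raise ValueError(f"level {level} was never opened")
            text = " ".join("".join(open_notes.pop(level)).split())
            finished.append((level, text))
        pos = m.end()
    if open_notes:
        raise ValueError("unclosed processing instruction")
    return finished


# Schematic overlap of the form <?B0?> ... <?B1?> ... <?E0?> ... <?E1?>.
overlapping = "x <?B0?>a <?B1?>b<?E0?> c<?E1?> y"
```

The word ‘b’ ends up in both notes because both levels are open when it is read.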

Note that overlapping notes most often occur when the editor attaches a note to a long passage (using <?Bellips0?> ... <?Eellips0?>; see the next section) that contains tags which in turn carry notes on smaller phrases. In this case, the outer note is <?B0?> ... <?E0?>, containing several <?B1?> ... <?E1?> notes.

5. Ellipsis

Sometimes it is not necessary to repeat the entire text enclosed between <?B0?> ... <?E0?> in the critical apparatus, as the phrase is already present in the main text. This is particularly true for long additions. The content of a note can be abbreviated by means of the processing instructions <?Bellips0?> ... <?Eellips0?>, which indicate that the enclosed phrase is replaced by an ellipsis (...) in the note text. The ‘0’ of <?Bellips0?> corresponds to the ‘0’ of <?B0?>, which indicates that the ellipsis is applied in the note produced by <?B0?> ... <?E0?>.

All <?B0?>these <del>admird and unimitable by the unpractised</del>
<add place="p67">admired <?Bellips0?>motions beyond the reach and almost the
conception of unpractised<?Eellips0?> spectators</add><?E0?> are noething
but the mere effects

Now the main text will be ‘All these admired motions beyond the reach and almost the conception of unpractised spectators are noething but the mere effects’ with attached note ‘10-12] these [admird and unimitable by the unpractised] `admired ... spectators´ (add. p. 67)’. This example also shows that the contents of the place="p67" attribute of the <add> tag are translated into a remark between parentheses.
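The effect of the ellipsis instructions on a note can be sketched in Python. The function name `apply_ellipsis` is hypothetical and the sketch operates on the raw text of an addition only; the real script also handles the surrounding tags and the place attribute.

```python
import re


def apply_ellipsis(note_src, level=0):
    """Replace each <?BellipsN?> ... <?EellipsN?> stretch of level N by '...'."""
    pattern = re.compile(rf"<\?Bellips{level}\?>.*?<\?Eellips{level}\?>", re.S)
    return pattern.sub("...", note_src)


# Text of the <add> element from the example, as it would enter note 0.
addition = ("admired <?Bellips0?>motions beyond the reach and almost the\n"
            "conception of unpractised<?Eellips0?> spectators")
shortened = apply_ellipsis(addition, level=0)
untouched = apply_ellipsis(addition, level=1)
```

Because the level is part of the instruction name, asking for level 1 leaves the level-0 markers and their text in place.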

6. Hiding

Instead of replacing a phrase with an ellipsis by means of <?Bellips0?> ... <?Eellips0?>, it is also possible to suppress a phrase entirely by means of the <?Bhide0?> ... <?Ehide0?> processing instructions. Hiding phrases may be necessary on various occasions. One example was presented earlier.

rational creatures and Christians&sp;
<?B0?>
  (for <del>I can not thinke them</del> <add>they can hardly be&sp;
  <?B1?>
    thought<?Bhide1?></add><?Ehide1?>
  <?E0?>&sp;
  <del>them</del> realy
<?E1?>&sp;
to be soe&sp;

Here the closing tag </add> was hidden for the <?B1?> ... <?E1?> note. This was necessary because of an important rule: the tags enclosed within any pair of <?B0?> ... <?E0?> processing instructions (of whatever level) must be balanced. As the opening <add> lies outside the <?B1?> ... <?E1?> note, the closing tag must be suppressed in that note.
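The balancing rule and the effect of hiding can be checked with a small Python sketch. The helper names `hide` and `is_balanced` are hypothetical, and the tag pattern is a simplification that is only meant to cope with markup of the kind shown in these examples.

```python
import re

# Matches <tag ...>, </tag> and <tag .../>, but not <?...?> instructions.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>")


def hide(note_src, level):
    """Drop everything between <?BhideN?> and <?EhideN?> for the given level."""
    return re.sub(rf"<\?Bhide{level}\?>.*?<\?Ehide{level}\?>", "", note_src, flags=re.S)


def is_balanced(note_src):
    """True if every opening tag in the fragment has its closing tag, properly nested."""
    stack = []
    for closing, name, self_closing in TAG.findall(note_src):
        if self_closing:
            continue  # empty elements such as <pb .../> need no partner
        if closing:
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    return not stack


# Raw content between <?B1?> and <?E1?> in the example (indentation removed).
note1 = "thought<?Bhide1?></add><?Ehide1?><?E0?>&sp;<del>them</del> realy"
```

Without hiding, the stray </add> makes the fragment unbalanced; after `hide(note1, 1)` only the balanced <del> ... </del> pair remains.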

Hiding can also be desirable to avoid reduplication of information in the notes. For example, a marginal note (<note place="margin">) that is already shown in the main text does not need to be repeated in a note. Here is another example.

parts (for these&sp;
<?B0?>
  few <del><gap extent="2"/></del>&sp;
  <?Bhide0?>
  <?B1?>
    <pb n="78" rend="nocatch"/>
  <?E1?>
  <?Ehide0?>&sp;
  whose
<?E0?>&sp;
case that is are

The resulting main text reads ‘parts (for these few | whose case that is are’, where the folio break is accompanied by ‘78’ in the margin. Two notes are attached: ‘10] few [..] whose’ and ‘10] no catchword’. If no hiding were used, the first note would read something like ‘10] few [..] | (no catchword) whose’. However, here the editor prefers to present information about the missing catchword in a separate note because he has presented catchword information in separate notes elsewhere.

7. Commentary Notes and Anchors

Sometimes the editor may want to make an annotation on the transcription that cannot be captured by standard TEI tags. The tag <note resp="ed" n="trans"> allows the editor to place notes with any desired content.

severall abilitys in the same kinde.
<?B0?>
<note resp="ed" n="trans">
  <hi rend="italic">End of paragraph marked by vertical line.</hi>
</note>
<?E0?>

In this example the resulting note in the critical apparatus will read ‘10] End of paragraph marked by vertical line.’; the line reference is connected to the end of the phrase ‘severall abilitys in the same kinde.’ in the main text. Note that the processing instructions <?B0?> ... <?E0?> are still required; without them the contents of the <note resp="ed" n="trans"> would not show up in the apparatus. This tag can also be combined with other tags, as is shown in the next example.

and parts doe not arise soe much from&sp;
<?B0?>
  their <del>nall</del> <add place="p69">naturall</add>
  <note resp="ed" n="trans">
    <hi rend="italic">abbreviation expanded for copyist</hi>
  </note>
<?E0?>&sp;
faculties as acquired habits.

Now the main text will read ‘and parts doe not arise soe much from their naturall faculties as acquired habits.’ with attached note ‘their [nall] `naturall´ (add. p. 69; abbreviation expanded for copyist)’. Note that the place="p69" attribute of the <add> tag is grouped together with the contents of the <note resp="ed" n="trans"> into one set of parentheses by the conversion script.
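The grouping of remarks into one set of parentheses might be sketched in Python as follows. Both function names are hypothetical, and the handling of the place attribute is an assumption: only values of the form ‘p’ followed by a page number, as in the examples on this page, are covered.

```python
import re


def place_remark(place):
    """Turn a place value such as 'p69' into the remark 'add. p. 69'."""
    m = re.fullmatch(r"p(\d+)", place)
    if m is None:
        # Other place values are outside the scope of this sketch.
        raise ValueError(f"unsupported place value: {place!r}")
    return f"add. p. {m.group(1)}"


def parenthetical(remarks):
    """Join all remarks of one note into a single parenthesized group."""
    remarks = [r for r in remarks if r]
    return f"({'; '.join(remarks)})" if remarks else ""


grouped = parenthetical([place_remark("p69"), "abbreviation expanded for copyist"])
single = parenthetical([place_remark("p67")])
```

Collecting the remarks first and joining them once is what keeps the place information and the editorial comment inside a single pair of parentheses.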

The last two examples showed notes that were attached to a single point in the main text. However, in TEI the editor can also attach a note to a phrase instead of a point by using the <note resp="ed" n="trans" targetEnd="number"> tag accompanied by an <anchor id="number"> tag with a corresponding number. This type of note can be processed by the TEI-to-TeX conversion script as long as the processing instructions <?B0?> ... <?E0?> are still placed around the <note> ... </note>; the important difference is that the position of the <anchor> tag will be used when calculating the line references. To enhance readability, &sp; entities have been removed in the following example.

case they have been convinced
<?B0?>
  <note resp="ed" n="trans" targetEnd="E1">
    there is none <hi rend="italic">at first ended this paragraph, followed
    by a new paragraph starting with</hi> <del>Men must have something to
    rely on</del> <hi rend="italic">however, these words were deleted and
    replaced by the following add. that did not start a new paragraph, but
    continued the original paragraph</hi> <add>but men would be intolerable
    to them</add>
  </note>
<?E0?>
there is none&supdot;<del>Men must have something to rely on</del>
<add>but men would be intolerable to them</add>
<anchor id="E1"/>

Now the main text will read ‘case they have been convinced there is none. but men would be intolerable to them’ and the note text ‘10-13] there is none at first ended this paragraph, followed by a new paragraph starting with [Men must have something to rely on] however, these words were deleted and replaced by the following add. that did not start a new paragraph, but continued the original paragraph `but men would be intolerable to them´’. Note that within the text of a <note>, tags like <add> and <del> can still be used. Note further that the phrase between <?E0?> and <anchor id="E1"/> does not show up in the apparatus criticus, because it is not enclosed between <?B0?> ... <?E0?> processing instructions. While irrelevant to the TEI-to-TeX script, the tagging here is still relevant for the on-line presentation.
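How a note might be paired with its anchor can be sketched in Python. This is only an illustration under simplifying assumptions: `anchored_span` is a hypothetical name, the sample string is a shortened stand-in for the passage above, and regular expressions are used in place of a real XML parser.

```python
import re


def anchored_span(src, target):
    """Return the source text between a targetEnd note and its matching anchor."""
    ident = re.escape(target)
    note = re.search(rf'<note\b[^>]*\btargetEnd="{ident}"[^>]*>.*?</note>', src, re.S)
    anchor = re.search(rf'<anchor\b[^>]*\bid="{ident}"[^>]*/>', src)
    if note is None or anchor is None or anchor.start() < note.end():
        raise ValueError(f"no note/anchor pair found for {target!r}")
    return src[note.end():anchor.start()]


# Shortened, hypothetical version of the passage; 'remark' stands for the note text.
sample = ('convinced <?B0?><note resp="ed" n="trans" targetEnd="E1">remark</note>'
          '<?E0?> there is none<anchor id="E1"/> more')
span = anchored_span(sample, "E1")
```

The returned stretch is the part of the main text whose first and last lines would supply a reference such as ‘10-13]’; its content is not copied into the note.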

8. Reference Chart

In the following chart the XML source, the web presentation and the typeset presentation of the main tags and combinations of tags are laid out. It is assumed that all XML samples in the first column are surrounded by <?B0?> ... <?E0?> processing instructions, unless the processing instructions are indicated explicitly. The line references in the last column have been left out. In addition, due to the limitations of the character set available to web browsers, some symbols in the typeset examples are only an approximation of the symbols that will actually be typeset. Comments not pertaining to the samples are typeset in blue.