One More Voice


Lost Voices from the British Empire's Archives

Coding Guidelines

The One More Voice project has developed a set of lightweight TEI coding guidelines in keeping with the minimal computing approach of the project as a whole. The guidelines are based on a simplified version of the coding guidelines previously created for Livingstone Online. The latter were developed over a ten-year period through a series of Livingstone Online initiatives. For the One More Voice guidelines, the project prioritized key structural and content TEI tags, attributes, and values with two primary goals in mind: 1) Providing a basic TEI scaffolding to support further encoding work by the project contributors or by others; and 2) Enabling edited and encoded TEI items to be transformed (via XSLT) into facsimile HTML versions for online publication.
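The second goal can be illustrated with a minimal XSLT 1.0 sketch. This is not the project's actual stylesheet: the choice of templates, the HTML elements, and the class name "unclear" are assumptions made purely for illustration, and only the standard TEI namespace is taken as given.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only; not the One More Voice stylesheet. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Render only the transcription; the TEI header is skipped. -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//tei:text"/>
      </body>
    </html>
  </xsl:template>

  <!-- Paragraphs map to HTML paragraphs. -->
  <xsl:template match="tei:p">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Each coded line break becomes an HTML line break,
       preserving the lineation of the original. -->
  <xsl:template match="tei:lb">
    <br/>
  </xsl:template>

  <!-- Deletions and additions map to the corresponding HTML elements. -->
  <xsl:template match="tei:del">
    <del><xsl:apply-templates/></del>
  </xsl:template>

  <xsl:template match="tei:add">
    <ins><xsl:apply-templates/></ins>
  </xsl:template>

  <!-- Hypothetical class name for partly legible text. -->
  <xsl:template match="tei:unclear">
    <span class="unclear"><xsl:apply-templates/></span>
  </xsl:template>

</xsl:stylesheet>
```

Elements without a template of their own fall through to XSLT's built-in rules, which output their text content unchanged, so a sketch like this degrades gracefully as more tags are encoded.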

The list below includes all tags and attributes selected for these purposes. It omits tags and attributes exclusive to the TEI header, since those do not vary from document to document and remain unchanged from Livingstone Online practices. Each tag is followed by a brief gloss and the number of the corresponding section or sections of the Livingstone Online coding guidelines, which provide further information about practices related to the tag and any relevant attributes and values. Where relevant, an entry concludes with a note on significant deviations from the Livingstone Online guidelines.

<ab> - text with no clear relationship to the surrounding text - 11.12; @rend - 11.12

<add> - additions to the original text - 11.13; @place - 11.13; @rend - 11.13

<bloc> - multinational entities, usually continents - 15

<cb> - column breaks (placed wherever relevant in the text) - 11.11

<choice> - segments of text for which the editors have provided alternatives (also see <sic> and <corr>, and <orig> and <reg>) - 12.2, 12.7

<corr> - suggested editorial corrections of the original text (also see <sic>) - 12.2

<country> - countries - 15

<date> - dates - 9; @from - 9; @to - 9; @when - 9

<del> - deletions from the original text - 11.13; @rend - 11.13

<div> - divisions of the original text - 4; @rend - 4

<figDesc> - editorial descriptions of figures, drawings, and calculations - 21

<figure> - figures, drawings, and calculations - 21; @rend - 21

<foreign> - foreign words and phrases - 19

<gap> - missing portions of text or text that is completely illegible (also see <unclear>) - 12.5; @extent - 12.5; @unit - 12.5

<geogName> - formally named geographical entities - 15

<hi> - formatting in the original text such as underlining - 11.3; @rend - 11.3

<lb> - line breaks (placed at the beginning of lines in coded text) - 7; @break - 11.5

<metamark> - textual marks used to indicate original authorial or editorial emendations - 11.3; @place - 11.3; @rend - 11.3

<milestone> - lines in the original text used to separate different sections - 11.9; @rend - 11.9; @unit - 11.9

<note> - authorial notes within the original text - 11.17

<orgName> - formally named groups and organizations not covered by <term type="collective"> - 17-17.1

<orig> - nineteenth-century variants of words for which the editors have provided contemporary standardized versions (also see <reg>) - 12.7

<p> - paragraphs - 8; @rend - 8

<pb> - page breaks - 6; @facs - 6; @n - 6

<persName> - formally named individuals - 14

<placeName> - formally named places - 15 - now used only as a fallback option in cases where <bloc>, <country>, <region>, <settlement>, and <geogName> are not appropriate

<reg> - contemporary standardized versions of words that appear in nineteenth-century variants in the original text (also see <orig>) - 12.7

<region> - formally named regions; used when neither <country> nor <bloc> is appropriate - 15

<seg> - segments of text somehow unusual or noteworthy - 11.4 and elsewhere; @rend - 11.4 and elsewhere; @type - 11.4 and elsewhere; also @type="metamark" now used to tag metamarks (e.g., <seg type="metamark">^</seg>) and @type="hyphen" now used to tag hyphens in words that break over two lines (e.g., <w>Livings<seg type="hyphen">-</seg><lb break="no"/>tone</w>)

<settlement> - settlements such as states, cities, towns, and villages - 15

<sic> - original text deemed erroneous by the editors (also see <corr>) - 12.2

<space> - unusual spaces in the original text - 11.6-11.8; @dim - 11.6-11.8; @extent - 11.6-11.8 - use value of "char" for one to four spaces; use "word" for five spaces; @unit - 11.6-11.8

<supplied> - text supplied by the editors - 12.6

<term> - significant individual and plural entities - 16, all subsections; @type - 16, all subsections - only use values of "collective" (which replaces "nationality", "ethnic group", "tribe", "people", "person", "religion", and "faith"), "animal", "insect", "foodstuff", or "plant"

<unclear> - text that is unclear but partly legible (also see <gap>) - 12.3

<w> - words that break over two lines of text or are otherwise somehow broken up - 11.5
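To show how several of the tags above might combine in practice, the fragment below offers a short sketch. The sample text is invented for illustration and is not a transcription of any archival item. The @facs filename, the @place value "above", the @unit value "words", and the date are assumptions rather than values confirmed by the guidelines; @rend is omitted throughout because its permitted values are defined in the Livingstone Online sections cited above. The broken-word pattern follows the <seg type="hyphen"> example given in the <seg> entry.

```xml
<!-- Invented sample text; attribute values marked below are assumptions. -->
<div>
  <pb n="0001" facs="example-0001.jpg"/> <!-- hypothetical filename -->
  <p>
    <lb/>Left <settlement>Zanzibar</settlement> on
    <date when="1866-03-19">19 March 1866</date> with a letter for Dr.
    <lb/><persName><w>Livings<seg type="hyphen">-</seg><lb break="no"/>tone</w></persName>.
    The <term type="collective">Swahili</term> traders
    <lb/><choice><sic>recieved</sic><corr>received</corr></choice> us
    <del>kindly</del> <add place="above">warmly</add> <!-- @place value assumed -->
    and <unclear>offered</unclear>
    <gap extent="2" unit="words"/> <!-- @unit value assumed -->
    <lb/>which they call <foreign>chakula</foreign>.
  </p>
</div>
```

In a full document this fragment would sit inside the body of a TEI file that declares the TEI namespace. The tags inside <w> are kept on one physical line so that no stray whitespace ends up inside the broken word.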