This document provides a formal definition of I5, a derivation of the IDS/XCES vocabulary
as a customization of TEI P5. It contains the various specGrp elements needed to
specify a customization of TEI, together with accompanying prose explaining the logic of the
customization.
The primary goal is to provide a definition of the IDS/XCES vocabulary on the basis of TEI
P5, and not (via XCES and CES) on the basis of TEI P3. TEI
P3 customization involved the preparation of DTD files in tightly prescribed forms
containing declarations which overrode the default declarations for the entities, elements,
and attributes concerned. TEI P5 customization involves the preparation of an ODD (for
one document does it all) document which describes changes to the base TEI
vocabulary using a specialized vocabulary defined in chapter 22 of TEI P5.
A secondary goal is to document the structure of the customization, specifying what is
included without change from TEI P5, what is excluded, and what is changed. Some differences
between TEI P3 and IDS/XCES originated with CES or XCES and others were introduced when
IDS/XCES was adapted from XCES; since those have different significance for further
development and maintenance of the vocabulary, those two sets of differences are
distinguished here. Another secondary goal is to provide at least rudimentary documentation
for all elements in the vocabulary.
The brief descriptions of elements in the TEI and CES/XCES vocabularies are taken from the
documentation for those encoding schemes; thanks are due to the authors and publishers of
that documentation. Descriptions are included in the appendix for elements suppressed from
modules which are otherwise included, in order to simplify review of the design and
consideration of possible changes.
Required TEI Modules
This section of this ODD file includes a number of TEI P5 modules; eventually it will
also describe differences between the P5 version of the elements involved and the older
IDS/XCES versions.
A note on notation: this document is not a description of the ODD file which generates
the I5 version of the TEI P5 vocabulary; it is the ODD document. Blocks
labeled Spec fragment, like the one just shown, are used to specify selections from
and modifications to the TEI P5 vocabulary. As may be seen, such spec fragments may
include cross references to other spec fragments elsewhere in the document, which are
included by reference in the set of modifications; ODD documents are thus a specialized
form of literate programming as defined by the computer scientist
Donald Knuth and used in the publication of his TeX and MetaFont programs. The
literate-programming structure allows the formal specification of changes to be embedded
in prose documentation intended to explain what is happening.
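The mechanism described in this note can be sketched as follows. This is a minimal illustration, not the actual I5 source: the identifiers i5_sketch and core_sketch are hypothetical, and the wrapping div is there only to keep the fragment well-formed.

```xml
<div xmlns="http://www.tei-c.org/ns/1.0">
  <!-- The schema specification pulls in a spec fragment by reference. -->
  <schemaSpec ident="i5_sketch">
    <specGrpRef target="#core_sketch"/>
  </schemaSpec>
  <!-- A spec fragment; it may sit anywhere in the prose of the ODD document. -->
  <specGrp xml:id="core_sketch">
    <moduleRef key="core"/>
  </specGrp>
</div>
```

Because the reference is resolved by identifier, the fragment can be placed next to the prose that explains it rather than inside the schema specification itself.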
The tei module
The tei module is required for any TEI profile. The following
specification fragment includes the tei module and makes appropriate modifications to
it.
Redefine one macro.
redefine macro.limitedContent.
The core module
The tei and core modules are required for any TEI profile.
The following specification fragment includes the core module and makes appropriate
modifications to it.
Delete unneeded elements.
Rename some elements.
Redefine some elements.
Numerous elements in the TEI core module are suppressed; for short
descriptions of these elements see [the appendix](#core_suppressed).
Suppression of unused elements in core module.
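In ODD, an element is suppressed with an element specification in delete mode. The sketch below shows the general shape only; the two elements named are an illustrative assumption, not the actual list of suppressed elements, and the identifier core_suppressed_sketch is hypothetical.

```xml
<specGrp xmlns="http://www.tei-c.org/ns/1.0" xml:id="core_suppressed_sketch">
  <!-- One empty elementSpec per element to be removed from the schema. -->
  <!-- The choice of binaryObject and cb is illustrative only. -->
  <elementSpec ident="binaryObject" module="core" mode="delete"/>
  <elementSpec ident="cb" module="core" mode="delete"/>
</specGrp>
```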
Some elements in the core module are renamed in obvious ways. The teiCorpus
element is renamed idsCorpus, and its content model is adjusted:
idsCorpus contains a sequence of idsDoc elements, not
idsText (~ TEI) elements, so the default content model is not
appropriate.
Renaming elements in core module:
idsCorpus
teiCorpus.2
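A renaming of this kind is typically expressed with altIdent inside an element specification in change mode. The following is a sketch under assumptions: the content model shows only the sequence of idsDoc elements mentioned above, and any further children (such as a header) are omitted because the prose here does not specify them.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             xmlns:rng="http://relaxng.org/ns/structure/1.0"
             ident="teiCorpus" module="core" mode="change">
  <!-- The element keeps its TEI identity but appears as idsCorpus in the schema. -->
  <altIdent>idsCorpus</altIdent>
  <!-- Simplified content model: one or more idsDoc elements (assumed shape). -->
  <content>
    <rng:oneOrMore>
      <rng:ref name="idsDoc"/>
    </rng:oneOrMore>
  </content>
</elementSpec>
```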
The remainder of the elements in the core module are included in the IDS/XCES
vocabulary; that remainder includes the elements listed below. (For the most part, these
elements are also included in CES and XCES, but editor, gloss,
lb, orig, and pb are not in XCES but are added back into
the vocabulary by IDS/XCES.)
Some elements are included without change from TEI, at least in the sense that the
same parameter entity or pattern names are used in the declarations. (The extension of
some element classes in IDS/XCES does of course mean the effective content model is not
actually the same. But we do not need to supply a different model in this ODD file.)
For other elements, IDS/XCES declares a content model which is a restriction of the
content model in TEI P5. One simple way to move I5 closer to TEI P5 would be to drop
these restrictions and use the TEI P5 declarations for these elements unchanged.
Finally, for some elements IDS/XCES declares a content model which extends or modifies
the declarations in TEI P5. Sometimes the change consists merely in the addition of one
or more attributes, or adding the element as a member of this or that class. In other
cases the content model is rewritten.
The following specification fragment indicates which elements are changed from the TEI
P5 declarations.
Redefinition of elements in core module.
Where possible, the list below notes the nature of the changes made.
-
abbr (abbreviation): contains an abbreviation of any sort.
gives an expansion of the abbreviation.
Examples: DVP, USPD, DDP, ABF, DAK, DAW, DPZI, DSV.
-
address: contains a postal address, for example of a publisher, an
organization, or an individual. (Not present in samples.)
-
analytic (analytic level): contains bibliographic elements describing an item
(e.g. an article or poem) published within a monograph or journal and not as an
independent publication. IDS/XCES modifies the content model to fit the three-level
structure of IDS corpora.
CES restricts the TEI content model and renames some elements; IDS/XCES
extends the CES definition. The model given here is the same as for monogr.
-
author: in a bibliographic reference, contains the name(s) of the author(s),
personal or corporate, of a work; for example in the same form as that provided by a
recognized bibliographic name authority.
-
bibl (bibliographic citation): contains a loosely-structured bibliographic
citation of which the sub-components may or may not be explicitly tagged.
-
biblScope (scope of citation): defines the scope of a bibliographic
reference, for example as a list of page numbers, or a named subdivision of a larger
work.
-
biblStruct (structured bibliographic citation): contains a structured
bibliographic citation, in which only bibliographic sub-elements appear and in a
specified
order.
-
corr (correction): contains the correct form of a passage apparently
erroneous in the copy text. IDS-XCES adds the attribute @sic (which gives the original
form) as e.g. occurring in hi1bb.xces.
From the CES documentation: "gives the original form"
ToDo
-
date: contains a date in any format. CES declares this element as a member of
the token class.
-
item: adding att.typed for wiki talk HLU 2020-01-24 : Note that in the
current TEI, desc has already @type
-
distinct: identifies any word or phrase which is regarded as linguistically
distinct, for example as archaic, technical, dialectal, non-preferred, etc., or as
forming part of a sublanguage.
-
editor: secondary statement of responsibility for a bibliographic item, for
example the name of an individual, institution or organization, (or of several such)
acting as editor, compiler, translator, etc.
-
item: adding att.typed for wiki talk HLU 2020-01-24
-
foreign: identifies a word or phrase as belonging to some language other than
that of the surrounding text. IDS-XCES allows it to contain q, e.g. in
loz-div-pub.xces
-
gap: indicates a point where material has been omitted in a transcription,
whether for editorial reasons described in the TEI header, as part of sampling
practice, or because the material is illegible, invisible, or inaudible. (In TEI P5,
the description of the gap has moved from a desc attribute to a
desc child; we revert this change for compatibility with existing data.)
gives a description of the omitted material.
die anspruchsvolle Alternative
für Leichtraucher.
Astor mild im Rauch
nikotinarm.
zum Anbieten und Verschenken
Astor mild-Kassette 48 Cigaretten DM 6,-
20 Astor mild DM 2,50.
Weit treffender haben aber jene
(die Griechen) die unteilbare Subsistenz
einer vernünftigen Natur mit dem Wort
'o'benannt«
(Boethius, zitiert nach
Brasser 1999: 52).
-
gloss: identifies a phrase or word used to provide a gloss or definition for
some other word or phrase.
the target of the pointer
The original TEI target attribute of the class att.pointing comes out as
of type CDATA from the ODD2DTD. (According to the TEI guidelines the type
is 'datapointer' which stands for a single URI, which btw. correctly
caused it to come out as xs:anyURI in the generated i5.xsd). But in the
ids-xces.dtd it was IDREF for target at gloss. Hence I include it here
with an explicit specification of the IDREF type.
-
head (heading): contains any type of heading, for example the title of a
section, or the heading of a list, glossary, manuscript description, etc.
Changed 2020-01-21 so that it can contain signed, for
wiki talk.
-
hi (highlighted): marks a word or phrase as graphically distinct from the
surrounding text, for reasons concerning which no claim is made.
-
imprint: groups information relating to the publication or distribution of a
bibliographic item. CES redefines this to use the pubDate element instead of
date.
-
item: contains one component of a list. Redefined to
contain signed as well for wiki talk HLU 2020-01-02.
-
l (verse line): contains a single, possibly incomplete, line of verse. CES
redefined the meaning and values of the part attribute.
part: indicates whether the verse line is metrically complete.
y: the line is metrically complete.
n: the line is metrically incomplete.
u: metricality is not known or inapplicable.
Given the attribute name part, the value y might seem intuitively to mean "Yes, this is a
partial line, not a full line," but the CES documentation glosses y and n as shown.
The attribute appears not to be actively used in any IDS samples, in any
case; all values given are the default u.
-
label: contains the label associated with an item in a list; in glossaries,
marks the term being defined.
-
lb (line break): marks the start of a new (typographic) line in some edition
or version of a text. IDS-XCES adds @TEIform to the attribute list as used in fsp.xces
and gr1.xces.
TEIform
pb
-
lg (line group): contains a group of verse lines functioning as a formal
unit, e.g. a stanza, refrain, verse paragraph, etc.
indicates whether the verse line group is metrically complete.
the line is metrically complete.
the line is metrically incomplete.
metricality is not known or inapplicable.
The attribute appears not to be actively used in any IDS samples; all
values given are the default u.
-
list: contains any sequence of items organized as a list. IDS/XCES allows
milestones and xptr elements among the children.
-
measure: contains a word or phrase referring to some quantity of an object or
commodity, usually comprising a number, a unit, and a commodity name.
-
mentioned: marks words or phrases mentioned, not used. (Not present in
samples.)
-
monogr (monographic level): contains bibliographic elements describing an
item (e.g. a book or journal) published as an independent item (i.e. as a separate
physical object). CES restricts the TEI content model and renames some elements;
IDS/XCES extends the CES definition. The model given here is the same as for
analytic.
-
name (name, proper noun): contains a proper noun or noun phrase.
-
note: contains a note or annotation.
-
num (number): contains a number, written in any form.
-
orig (original form): contains a reading which is marked as following the
original, rather than being normalized or corrected. The reg attribute was
dropped in TEI P5 and must be restored. CES also adds a regalt attribute
which must be defined.
gives a regularized (normalized) form of the text.
gives an alternate form of the regularized (normalized) text.
der warnt -
wider die eiserne Regel des
Wahlk(r)ampfes,
einen Gegner durch Nichtnennung zu strafen -
davor, PDS zu wählen.
-
p (paragraph): marks paragraphs in prose. In the case of CMC documents, notably
Wiki talk pages, it is necessary that signed may also appear inside
paragraphs. In a Wiki talk page, users insert their signature as part of the
paragraph. The only change to the original content model of p is that
signed is additionally allowed inside p.
Usenet news message
Wer die ruhrtour Mailingliste noch nicht kennt, der schaut bitte weiter
unten nach!
-
pb (page break): marks the boundary between one page of a text and the next
in a standard reference system.
pb
-
ptr (pointer): defines a pointer to another location. IDS/XCES adds this
element to the ids.milestones class.
the target of the pointer
The original TEI target attribute of the class att.pointing comes out as
of type CDATA from the ODD2DTD. (According to the TEI guidelines the type
is 'datapointer' which stands for a single URI, which btw. correctly
caused it to come out as xs:anyURI in the generated i5.xsd). But in the
ids-xces.dtd it was IDREFS. Hence I include it here with an explicit
specification of the IDREFS type.
-
pubPlace (publication place): contains the name of the place where a
bibliographic item was published.
-
publisher: provides the name of the organization responsible for the
publication or distribution of a bibliographic item.
-
q (separated from the surrounding text with quotation marks): contains
material which is marked as (ostensibly) being somehow different than the surrounding
text, for any one of a variety of reasons including, but not limited to: direct speech
or thought, technical terms or jargon, authorial distance, quotations from elsewhere,
and passages that are mentioned but not used. CES adds several attributes.
points to the next element of a virtual aggregate of which the current
element is part. Specifically, for q elements, gives the ID of a
subsequent q element which contains a continuation of the same
quotation.
In TEI P5, this attribute is URI-valued; in P3 (and IDS/XCES), it is
IDREF-valued.
points to the previous element of a virtual aggregate of which the
current element is part. Specifically, for q elements, gives the ID
of a preceding q element which contains the immediately preceding
portion of the same quotation.
In TEI P5, this attribute is URI-valued; in P3 (and IDS/XCES), it is
IDREF-valued.
may be used to indicate whether the quoted matter is regarded as direct
or indirect speech.
speech or thought is represented directly.
speech or thought is represented indirectly, e.g. by use of a
marked verbal aspect.
no claim is made.
indicates whether this quotation or piece of dialog is broken between
two or more q elements (linked using the next and
prev attributes).
quotation is broken across two or more elements.
quotation is not broken across multiple elements.
no claim is made.
-
quote (quotation): contains a phrase or passage attributed by the narrator or
author to some agency external to the text.
-
ref (reference): defines a reference to another location, possibly modified
by additional text or comment. IDS-XCES adds @orig.
-
reg (regularization): contains a reading which has been regularized or
normalized in some sense.
for the original
ToDo
-
sp (speech): An individual speech in a performance text, or a passage
presented as such in a prose or verse text. CES restricts this content model severely;
IDS/XCES brings it back closer to the TEI form, and adds the class of IDS milestones
to the legal content.
Role in plenary debate, such as presidency or ordinary MP
Name of speaker
Parliamentary group (German "Fraktion") of speaker
Party of speaker
Sie irren sich,
erwiderte Korczak,
nicht jeder ist ein Schuft,
und er schlug die Waggontür
hinter sich zu.
-
speaker: A specialized form of heading or label, giving the name of one or
more speakers in a dramatic text or fragment.
-
stage (stage direction): contains any kind of stage direction within a
dramatic text or fragment.
-
term: contains a single-word, multi-word, or symbolic designation which is
regarded as a technical term.
-
time: contains a phrase defining a time of day in any format.
-
title: contains a title for any kind of work.
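Two of the attribute changes noted in the list above, the IDREF-typed target on gloss and the part attribute on l, might be written along the following lines. This is a sketch, not the actual I5 source: the identifier core_attributes_sketch is hypothetical, usage="opt" is an assumption, the RELAX NG datatype notation is assumed from ODD practice of the period, and the values y, n, and u are assumed to carry the three glosses in the order in which they are given.

```xml
<specGrp xmlns="http://www.tei-c.org/ns/1.0"
         xmlns:rng="http://relaxng.org/ns/structure/1.0"
         xml:id="core_attributes_sketch">
  <elementSpec ident="gloss" module="core" mode="change">
    <attList>
      <!-- Replace the class-supplied target with an explicitly IDREF-typed one. -->
      <attDef ident="target" mode="replace" usage="opt">
        <desc>the target of the pointer</desc>
        <datatype><rng:data type="IDREF"/></datatype>
      </attDef>
    </attList>
  </elementSpec>
  <elementSpec ident="l" module="core" mode="change">
    <attList>
      <!-- Closed value list following the CES glosses; u is the default. -->
      <attDef ident="part" mode="replace" usage="opt">
        <desc>indicates whether the verse line is metrically complete.</desc>
        <defaultVal>u</defaultVal>
        <valList type="closed">
          <valItem ident="y"><desc>the line is metrically complete.</desc></valItem>
          <valItem ident="n"><desc>the line is metrically incomplete.</desc></valItem>
          <valItem ident="u"><desc>metricality is not known or inapplicable.</desc></valItem>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</specGrp>
```

For ptr and catRef the same pattern would apply with IDREFS in place of IDREF.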
The header module
The header module is also essential. We include it here:
The teiHeader element is renamed to idsHeader, and we add some
attributes (the status attribute was in TEI P3 but seems to have disappeared
from P5):
Renaming teiHeader as idsHeader ...
idsHeader
new
teiHeader
Some elements in the TEI header module are suppressed; for short
descriptions see [the appendix](#header_suppressed).
Deleting unused elements in header module ...
The IDS/XCES vocabulary includes the elements listed below from the TEI
header module, sometimes with content models which extend or otherwise
modify the content models of TEI P5 in such a way that instances of the revised element
type are not valid against the unmodified TEI P5 schema, and sometimes without change.
In several cases, CES changes a content model from requiring a sequence of paragraphs to
allowing just character data. In other cases, specialized child elements are added to
the content model.
Modifying some elements and classes in header module ...
In the attribute class declarable, CES renames one attribute from
default to Default, and changes its values from yes and no to y and n.
n
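A change of this kind to an attribute class could be sketched as follows. The class identifier att.declarable is the TEI P5 name for the declarable class; the description text, usage="opt", and the default value n are assumptions made for illustration.

```xml
<classSpec xmlns="http://www.tei-c.org/ns/1.0"
           ident="att.declarable" type="atts" mode="change">
  <attList>
    <!-- Drop the P5 attribute and add the CES-style replacement. -->
    <attDef ident="default" mode="delete"/>
    <attDef ident="Default" mode="add" usage="opt">
      <desc>indicates whether the element is selected by default (assumed wording).</desc>
      <defaultVal>n</defaultVal>
      <valList type="closed">
        <valItem ident="y"/>
        <valItem ident="n"/>
      </valList>
    </attDef>
  </attList>
</classSpec>
```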
The elements included from the header module are these.
-
availability: supplies information about the availability of a text, for example any
restrictions on its use or distribution, its copyright status, etc.
world
The IDS/XCES vocabulary includes the following elements from the TEI
header module with content models which restrict the content models of
TEI P5.
-
biblFull (fully-structured bibliographic citation): contains a
fully-structured bibliographic citation, in which all components of the TEI file
description are present.
-
catDesc (category description): describes some category within a taxonomy
or text typology, either in the form of a brief prose description or in terms of the
situational parameters used by the TEI formal textDesc.
-
catRef (category reference): specifies one or more defined categories
within some taxonomy or text typology.
the target of the pointer
The original TEI target attribute of the class att.pointing comes out
as of type CDATA from the ODD2DTD. (According to the TEI guidelines the
type is 'datapointer' which stands for a single URI, which btw.
correctly caused it to come out as xs:anyURI in the generated i5.xsd).
But in the ids-xces.dtd it was IDREFS. Hence I include it here with an
explicit specification of the IDREFS type.
-
category: contains an individual descriptive category, possibly nested
within a superordinate category, within a user-defined taxonomy.
-
change: summarizes a particular change or correction made to a particular
version of an electronic text which is shared between several researchers.
-
classCode
-
classDecl (classification declarations): contains one or more taxonomies
defining any classificatory codes used elsewhere in the text.
-
correction (correction principles): states how and under what circumstances
corrections have been made in the text.
-
creation: contains information about the creation of a text.
-
distributor: supplies the name of a person or other agency responsible for
the distribution of a text.
-
edition: describes the particularities of one edition of a text.
-
editionStmt (edition statement): groups information relating to one edition
of a text.
-
editorialDecl (editorial practice declaration): provides details of
editorial principles and practices applied during the encoding of a text. CES
changes the set of children for this element from P3 (suppressing
interpretation and stdVals and adding transduction and
conformance); IDS/XCES adds pagination to the children.
-
encodingDesc (encoding description): documents the relationship between an
electronic text and the source or sources from which it was derived. IDS-XCES
additionally allows an empty encodingDesc as used in dck.xces.
-
extent: describes the approximate size of a text as stored on some carrier
medium, whether digital or non-digital, specified in any convenient units.
-
fileDesc (file description): contains a full bibliographic description of
an electronic file.
-
hyphenation: summarizes the way in which hyphenation in a source text has
been treated in an encoded version of it.
-
idno (identifier): supplies any form of identifier used to identify some
object, such as a bibliographic item, a person, a title, an organization, etc. in a
standardized way.
-
keywords: contains a list of keywords or phrases identifying the topic or
nature of a text.
-
langUsage (language usage): describes the languages, sublanguages,
registers, dialects, etc. represented within a text.
-
language: characterizes a single language or sublanguage used within a
text. TEI P5 uses an ident attribute, not id, to give the
language code; IDS/XCES follows P3 in this.
Supplies a language code constructed as defined in BCP 47 which is
used to identify the language documented by this element, and which is
referenced by the global attributes lang and
xml:lang.
<language id="de" usage="100">Deutsch</language>
Note that for technical reasons it is not possible to assign the type ID
to both the id and xml:id attributes. This version of I5 assigns the ID
type to attribute id.
-
normalization: indicates the extent of normalization or regularization of
the original source carried out in converting it to electronic form.
-
profileDesc (text-profile description): provides a detailed description of
non-bibliographic aspects of a text, specifically the languages and sublanguages
used, the situation in which it was produced, the participants and their setting.
-
projectDesc (project description): describes in detail the aim or purpose
for which an electronic file was encoded, together with any other relevant
information concerning the process by which it was assembled or collected.
-
publicationStmt (publication statement): groups information concerning the
publication or distribution of an electronic or other text.
An example (actually, all of the publicationStmt elements in the
available samples look like this, or else have no contents in any of their
children):
Institut für Deutsche Sprache
Postfach 10 16 21, D-68016 Mannheim
+49 (0)621 1581 0
-
quotation: specifies editorial practice adopted with respect to quotation
marks in the original.
-
refsDecl (references declaration): specifies how canonical references are
constructed for this text.
-
revisionDesc (revision description): summarizes the revision history for a
file.
-
samplingDecl (sampling declaration): contains a prose description of the
rationale and methods used in sampling texts in the creation of a corpus or
collection.
-
segmentation: describes the principles according to which the text has been
segmented, for example into sentences, tone-units, graphemic strata, etc.
-
sourceDesc (source description): describes the source from which an
electronic text was derived or generated, typically a bibliographic description in
the case of a digitized text, or a phrase such as "born digital" for a text which
has no previous existence.
-
tagUsage: supplies information about the usage of a specific element within
a text.
-
tagsDecl (tagging declaration): provides detailed information about the
tagging applied to a document. TEI P5 requires that the tagUsage elements
in the tagsDecl element be wrapped in a namespace element;
IDS/XCES follows P3.
-
taxonomy: defines a typology used to classify texts either implicitly, by
means of a bibliographic citation, or explicitly by a structured taxonomy.
-
textClass (text classification): groups information which describes the
nature or topic of a text in terms of a standard classification scheme, thesaurus,
etc.
-
titleStmt (title statement): groups information about the title of a work
and those responsible for its intellectual content. IDS/XCES adjusts the content
model here to use the specialized title elements it defines for the different corpus
levels. An I5 corpus level-specific {c|d|t}.title element is obligatory; in
addition, original TEI title elements with suitable attributes may be
specified, e.g. for subtitles.
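As an illustration of the titleStmt entry above, a text-level title statement might look roughly like this. The element name t.title is read off the {c|d|t}.title pattern; the ordering of the children, the type value sub, and the title strings are assumptions, not taken from the samples.

```xml
<!-- Text-level header fragment; all titles are invented placeholders. -->
<titleStmt>
  <t.title>Example text title</t.title>
  <title type="sub">Example subtitle</title>
</titleStmt>
```

At corpus or document level, c.title or d.title would presumably take the place of t.title.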
The textstructure module
The TEI textstructure module is also included:
The TEI element is renamed idsText:
-
TEI (TEI document): contains a single TEI-conformant document, comprising a
TEI header and a text, either in isolation or as part of a teiCorpus element. This corresponds in
essential ways to the IDS idsText element.
Renaming ...
idsText
Several elements in this module are suppressed. Descriptions of these elements are
given in [the appendix](#textstructure_suppressed).
Suppressing unused elements ...
The IDS/XCES vocabulary includes the elements listed below from the TEI
textstructure module. A few of these are included in CES and XCES, with
declarations which extend or otherwise modify those in the TEI. Others are omitted from
CES and XCES and have been added back into the vocabulary by IDS. In some cases, the
IDS/XCES declaration extends that of TEI.
-
back (back matter): contains any appendixes, etc. following the main part
of a text.
-
body (text body): contains the whole body of a single unitary text,
excluding any front or back matter.
-
byline: contains the primary statement of responsibility given for a work
on its title page or at the head or end of the work.
-
closer: groups together salutations, datelines, and similar phrases
appearing as a final group at the end of a division, especially of a letter.
-
dateline: contains a brief description of the place, date, time, etc. of
production of a letter, newspaper story, or other work, prefixed or suffixed to it
as a kind of heading or trailer.
-
div (text division): contains a subdivision of the front, body, or back of
a text. IDS/XCES eliminates the internal structure of the declarations in TEI and
CES and allows a mixture of children in any order. We additionally allow the element
posting from the DeRik TEI-proposal for CMC.
indicates whether the section is complete or a sample.
y
The text section is complete.
The text section is incomplete (typically because it's a
sample).
the type of the text section.
The most frequent values include: section, Zeitung, book, Enzyklopädie-Artikel,
Agenturmeldungen, figures, marginnotes, Rede, Zeitschrift, footnotes, Roman, content,
preface.
Other values include: abstract, Anmerkung, Ansprache, appendix, Aufruf, Aufsatz,
Ausgabenvermerk, Beschluss, bibliography, Brief, captions, dedication, endnotes,
Erklärung, Erzählung, Erzählungen, Fabeln, Forderung, Geschichte, glossary, Handzettel,
Information, Interview, Kolumnen, Kriminalroman, Kurzgeschichten, Merkblatt, Nachwort,
Novelle, postface, Predigt, Protokoll, Referat, Sachbuch, "Sachbuch, Ratgeber",
Schilderung, Sprechchöre und Transparente, Vorlesung, Vortrag, Wissenschaftszeitung,
and Zeitungsartikel.
Subject of section in plenary debate according to GermaParlTEI
Description of section in plenary debate according to GermaParlTEI
-
docAuthor (document author): contains the name of the author of the
document, as given on the title page (often but not always contained in a byline).
-
docEdition (document edition): contains an edition statement as presented
on a title page of a document.
-
docImprint (document imprint): contains the imprint statement (place and
date of publication, publisher name), as given (usually) at the foot of a title
page.
-
docTitle (document title): contains the title of a document, including all
its constituents, as given on a title page.
-
epigraph: contains a quotation, anonymous or attributed, appearing at the
start of a section or chapter, or on a title page.
-
front (front matter): contains any prefatory matter (headers, title page,
prefaces, dedications, etc.) found at the start of a document, before the main body.
-
opener: groups together dateline, byline, salutation, and similar phrases
appearing as a preliminary group at the start of a division, especially of a letter.
IDS-XCES adds gap to the content model.
indicates the type of opener.
unspecified
-
salute (salutation): contains a salutation or greeting prefixed to a
foreword, dedicatory epistle, or other division of a text, or the salutation in the
closing of a letter, preface, etc.
-
signed (signature): contains the closing salutation, etc., appended to a
foreword, dedicatory epistle, or other division of a text, or appearing freely
within paragraphs, sentences, quotations or the post as a whole, especially of
an email, or of a user contribution on a Wikipedia talk page.
indicates that the corresponding posting was explicitly signed by
a registered user using user signature markup (e.g. ~~~~).
indicates that the corresponding posting was marked by either a
registered or unregistered user using the Unsigned or Help
template.
"user_contribution" indicates that the corresponding posting was
marked using a [[Special:Contributions/IP]] link (e.g. by an
unregistered user)
added 2019-06-14
This is actually the same as "user_contribution"
"special_contribution" indicates that the corresponding posting
was marked using a [[Special:Contributions/IP]] link (e.g. by an
unregistered user)
-
text: contains a single text of any kind, whether unitary or composite, for
example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus
sample.
-
titlePage (title page): contains the title page of a text, appearing within
the front or back matter.
-
titlePart: contains a subsection or division of the title of a work, as
indicated on a title page.
Optional TEI modules
This section lists the optional TEI modules incorporated in whole or part into the
IDS/XCES vocabulary.
The analysis module
The IDS/XCES vocabulary includes the following elements from the TEI
analysis module:
-
s (s-unit): contains a sentence-like division of a text. CES and the
existing IDS/XCES DTD define this using the parameter entity
phrase.seq, but this relies on a different meaning for the
phrase class than is present in TEI P5.
indicates whether this sentence is broken between two or more
s elements (linked using the next and
prev attributes).
sentence is represented by multiple s elements.
sentence is represented by a single s element.
no claim is made.
This attribute appears not to be in use in the IDS samples.
-
w (word): represents a grammatical (not necessarily orthographic) word.
contains a POS value
(original) gives the original string or is the empty string when the
element does not appear in the source text.
When present, it provides information on whether the token in question
is adjacent to another, and if so, on which side. The definition of this
attribute is adapted from ISO MAF (Morpho-syntactic Annotation Framework),
ISO 24611:2012.
The token is not adjacent to another
There is no whitespace on the left side of the
token
There is no whitespace on the right side of the
token
There is no whitespace on either side of the
token
The token overlaps with another; other devices
(specifying the extent and the area of overlap) are needed to more
precisely locate this token in the character stream.
The remaining elements in this module are suppressed:
See [the appendix for brief
descriptions.](#optional_suppressed)
The corpus module
The IDS/XCES vocabulary includes the following elements from the TEI
corpus module:
-
particDesc (participation description): describes the identifiable
speakers, voices, or other participants in any kind of text.
textDesc (text description): provides a description of a text in terms
of its situational parameters.
The remaining elements in the module are suppressed.
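In ODD terms, a selection of this kind can be written in more than one way. The fragment below is a minimal sketch, not the actual I5 declaration: it assumes the include attribute of moduleRef, and then shows the alternative of loading the whole module and deleting an unwanted element (settingDesc is used purely as an example).

```xml
<!-- Variant 1 (sketch): take only the two listed elements from the module. -->
<moduleRef key="corpus" include="particDesc textDesc"/>

<!-- Variant 2 (sketch): load the module, then delete unwanted elements one by one. -->
<moduleRef key="corpus"/>
<elementSpec ident="settingDesc" module="corpus" mode="delete"/>
```

Only one of the two variants would be used in a given customization. The second is more verbose but keeps each suppression visible as a separate declaration.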
The figures module
The IDS/XCES vocabulary includes the following elements from the TEI figures module:
-
cell: contains one cell of a table.
-
figDesc (description of figure): contains a brief prose description of the
appearance or content of a graphic figure, for use when documenting an image without
displaying it.
-
row: contains one row of a table.
-
table: contains text displayed in tabular form, in rows and columns.
The following elements in the TEI figures module are redefined:
-
figure: groups elements representing or containing graphic information such
as an illustration or figure. IDS-XCES adds ptr to the content model.
ToDo
Figure is redefined.
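A redefinition of this kind could be sketched as follows. The fragment is illustrative only: it assumes the content model of figure found in recent TEI P5 releases and uses the pure ODD content syntax (alternate, classRef, elementRef), which postdates the original customization; the actual I5 declaration may express the same change differently, for example in RELAX NG.

```xml
<!-- Sketch: change the content model of figure so that ptr is also allowed. -->
<elementSpec ident="figure" module="figures" mode="change">
  <content>
    <alternate minOccurs="0" maxOccurs="unbounded">
      <classRef key="model.headLike"/>
      <classRef key="model.common"/>
      <elementRef key="figDesc"/>
      <classRef key="model.graphicLike"/>
      <classRef key="model.global"/>
      <classRef key="model.divBottom"/>
      <!-- the addition described above -->
      <elementRef key="ptr"/>
    </alternate>
  </content>
</elementSpec>
```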
The namesdates module
As a conservative extension, the I5 vocabulary now includes 11 elements from the
namesdates module. The remaining elements of the module could equally well be included,
but they are suppressed because they do not seem to be needed for now.
The following specification fragment includes the namesdates module and makes
appropriate modifications to it.
The following elements are deleted from the module namesdates:
The linking module
In DeReKo, the elements ref, ptr, and xptr are used for
linking. ref is already included in I5 through the core
module.
The elements xref and xptr were declared in the linking module in TEI
P3 and P4, but they are no longer part of TEI P5. From the TEI P5 linking module, only
the elements seg, timeline, and when are taken. They are needed for the
encoding of CMC documents.
The following specification fragment includes the linking module and deletes
all elements except seg, timeline, and when.
The following elements are deleted from the module linking:
Choosing the linking module automatically includes the linking attributes
corresp, synch, sameAs, copyOf,
next, prev, exclude, and select. All
linking attributes are also members of att.global and can thus appear almost anywhere.
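The deletions described for the linking module might look roughly like this in ODD. It is a hedged sketch: the element names are members of the TEI P5 linking module, but the list is not claimed to be complete for any particular TEI release, and the actual I5 fragment may differ.

```xml
<!-- Sketch: load the linking module, keep seg, timeline, and when, delete the rest. -->
<moduleRef key="linking"/>
<elementSpec ident="ab" module="linking" mode="delete"/>
<elementSpec ident="alt" module="linking" mode="delete"/>
<elementSpec ident="altGrp" module="linking" mode="delete"/>
<elementSpec ident="anchor" module="linking" mode="delete"/>
<elementSpec ident="join" module="linking" mode="delete"/>
<elementSpec ident="joinGrp" module="linking" mode="delete"/>
<elementSpec ident="link" module="linking" mode="delete"/>
<elementSpec ident="linkGrp" module="linking" mode="delete"/>
```

Because the module itself is loaded, the linking attributes mentioned above remain available even though most of the module's elements are removed.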
TEI modules not included
The following optional TEI modules are not included in this customization:
- The certainty module (for recording points of uncertainty and dispute).
- The dictionaries module (for print or electronic dictionaries).
- The drama module. (N.B. the caption element of TEI P5 which is included in this module
has nothing to do with the caption element introduced by CES as an extension of TEI, and
retained by IDS/XCES.)
- The gaiji module (for extending the Unicode / ISO 10646 universal character set).
- The fs module (for the representation of feature structures for linguistic or other
analysis).
- The msdescription module (for description of manuscript materials).
- The nets module (for representation of graphs, networks, and trees).
- The spoken module (for transcription of spoken materials).
- The tagdocs module (for documentation of XML vocabularies).
- The textcrit module (for the representation of text-critical apparatus as used in
scholarly editions).
- The transcr module (for markup of transcriptions of original source material).
- The verse module (for markup of metrical phenomena in verse).
Elements added by IDS
The elements described in this section are not direct equivalents of any individual TEI
element. They fall into several categories, which are different primarily for purposes of
vocabulary maintenance:
- elements taken over without change from CES and XCES
- elements taken over from TEI P3 which are no longer present in P5
- elements added by IDS
- elements defined by DERIK
- elements defined by TEI Correspondence SIG
The following specification fragment includes each of these groups in turn:
Elements taken over without change from CES and XCES
A number of elements in IDS/XCES are taken over from the CES and XCES vocabularies.
- annotation: provides information about one external annotation document
associated with the text. (Not present in samples.)
provides information about one external annotation document associated
with the text.
indicates the type of annotation.
annotation file contains segmentation into words and
sentences.
annotation file contains morpho-syntactic category information for
the words in the text.
annotation file contains alignment links to a parallel
translation.
provides information (path/file name, URL, etc.) about the location of
the annotation file.
for annotation file containing alignment information, provides
information (path/file name, URL, etc.) about the location of the file
containing the aligned text.
- annotations (in file ids.xheader.elt): child of profileDesc. Groups annotation
elements. (Not present in samples.)
groups information about annotation documents associated with the
text.
- biblNote: a descriptive note supplying additional information of any
kind relating to a bibliographic item described within a corpus or text header.
child of analytic and monogr. #PCDATA, but otherwise
roughly equivalent to TEI note.
Die Datengrundlage
der Tagebücher selbst (17. Juni bis 31. Dezember 1945)
bildet: Klemperer, Victor: So sitze ich denn zwischen
allen Stühlen, Bd. 1, Tagebücher 1945-1949, Hrsg.:
Nowojski, Walter; unter Mitarbeit von Christian Löser. -
Berlin: Aufbau-Verlag
ID:5FDC73,
2007.01.01 10:11
- byteCount: contains the count of bytes in the file containing the text
together with its markup. (Not present in samples.)
child of extent; #PCDATA.
kb
- changeDate: gives the date of a change (as child of change).
(Not present in samples.)
child of change; #PCDATA; context-dependent specialization of
TEI date.
- conformance: provides the CES level of conformance for the text or
corpus. (Not present in samples.)
child of editorialDecl; #PCDATA plus level
attribute.
0
- eAddress: gives an electronic address of the person or institution who
distributes the text or corpus. Note that more than one occurrence of this tag can
appear, so that multiple addresses (possibly of different types) can be included.
(Not present in samples.)
child of publicationStmt, provides electronic address of
distributor; #PCDATA.
email
- extNote: a descriptive note supplying additional information of any
kind relating to an extent information provided within a corpus or text header. (Not
present in samples.)
child of extent; provides additional information about extent of
document. #PCDATA.
- fax: gives the fax number of the person or institution who distributes
the text or corpus, in format conformant to ITU-T/CCITT Recommendation E.123. (Not
present in samples.)
child of publicationStmt, provides fax number of distributor in
CCITT E.123 form; #PCDATA.
- h.author in a bibliographic reference, contains the name of an author
(personal or corporate) of a work; a context-specific renaming of TEI
author element. CES specifies that names should be given in a canonical
form, with surnames preceding forenames, but IDS practice is not consistent in this
regard.
child of analytic and monogr; context-specific renaming
of TEI author element; #PCDATA.
Matthias Kunert
Fronk, Eleonore;
Andreas, Werner
- h.bibl: character data only, suitable for very simple citations.
child of taxonomy; #PCDATA (sic).
Thementaxonomie (siehe
http://www.ids-mannheim.de/kl/projekte/methoden/te.html)
Fiktion
Fiktion:Vermischtes
...
- h.item: (as child of change element) specifies the nature of
the change(s). One or more occurrences of this element may appear within each
change element. Context-dependent renaming of standard TEI item.
(Not present in samples.)
child of change; context-dependent renaming of standard TEI
item.
- h.keywords: contains a list of keywords or phrases identifying the
topic or nature of a text, each of which is tagged as a term. (Renaming of TEI
keywords, plus modified content model.)
(in file ids.xheader.elt): child of textClass.
Contains a list of keywords or phrases identifying the topic or nature of a
text, each of which is tagged as a term. (Renaming of TEI keywords,
plus modified content model.)
Bau/Leiharbeit
- h.title: the title of the electronic file, including alternative titles
or subtitles. Context-specific renaming of TEI title.
child of analytic and monogr; context-specific renaming
of TEI title; #PCDATA.
main
IKB verschiebt
Halbjahresbericht erneut
- keyTerm (in file ids.xheader.elt): child of h.keywords, encloses one keyword term
describing the text. Context-specific renaming of standard TEI element term.
child of h.keywords, encloses one keyword term describing the
text. Context-specific renaming of standard TEI element term.
indicates the type of keyTerm (person, country)
indicates the subtype of keyTerm
- pubAddress (in file ids.xheader.elt): child of publicationStmt, provides address of
distributor. Context-specific specialization of TEI address element; #PCDATA.
child of publicationStmt, provides address of distributor.
Context-specific specialization of TEI address element;
#PCDATA.
- pubDate (in file ids.xheader.elt): child of publicationStmt, provides date of
publication. Context-specific specialization of TEI date element; #PCDATA.
child of publicationStmt, provides date of publication.
Context-specific specialization of TEI date element; #PCDATA.
- respName (in file ids.xheader.elt): child of respStmt (where it is a context-dependent
renaming of name) and change. Contains #PCDATA only.
child of respStmt (where it is a context-dependent renaming of
name) and change. Contains #PCDATA only.
- respType (in file ids.xheader.elt): child of respStmt; context-specific renaming of
standard TEI resp.
child of respStmt; context-specific renaming of standard TEI
resp.
- telephone (in file ids.xheader.elt): child of publicationStmt, provides telephone number
of distributor in CCITT E.123 form; #PCDATA.
child of publicationStmt, provides telephone number of
distributor in CCITT E.123 form; #PCDATA.
- transduction: (as child of editorialDecl) describes the
principles according to which the text has been transduced, either in transcribing
it from audio tape to written form, or in converting from an electronic original.
child of editorialDecl; #PCDATA plus level attribute.
- translation (in file ids.xheader.elt): child of translations. Gives information about
one translation of the text.
child of translations. Gives information about one translation
of the text.
- translations (in file ids.xheader.elt): child of profileDesc; groups translation
elements.
child of profileDesc; groups translation
elements.
- translator (in file ids.xheader.elt): identifies the translator responsible for one
translation.
identifies the translator responsible for one translation.
- wordCount: (as child of extent) contains the count of words in
the text. (Not present in samples.)
child of extent; #PCDATA.
- writingSystem (in file ids.xheader.elt): child of wsdUsage; describes one character set
used in the document; can point to an external writing system declaration. (This element
appears to be a survival from the SGML version of CES; in XML, character set issues are
typically handled at a different level.) (Not present in samples.)
child of wsdUsage; describes one character set used in the
document; can point to an external writing system declaration.
- wsdUsage: groups information describing the character set(s) used
within a text. (Not present in samples.)
child of profileDesc; groups writingSystem
elements.
CES also defined some new classes for attribute inheritance and content models.
identifies a group of low-level tokens and similar elements
CES moves the rend attribute from the list of global attributes to the text class; for
now, we follow TEI here.
The following specification fragment incorporates all the descriptions for the
elements just mentioned:
Elements defined by (X)CES and modified by IDS/XCES
-
caption: (1) a heading, title etc. attached to a picture or diagram (2) a
"pull quote" or other text about or extracted from a text and superimposed upon it to
draw attention to it. This element was added by CES; it conflicts (possibly
unintentionally) with a different element also named caption defined by TEI
P3 for text displayed in a film (or text in a screenplay intended for such display).
IDS/XCES modifies the CES version by allowing IDS milestones in the content.
(1) a heading, title etc. attached to a picture or diagram; (2) a
pull quote or other text about or extracted from a text
and superimposed upon it to draw attention to it. IDS-XCES adds ptr to the
content model.
categorizes the caption
unspec
caption containing authorship of an article
extra-textual caption (displayed box, etc.)
caption describing a figure, photograph, etc.
not specified or unknown
"die Käufer werden dem Champagner
treu bleiben, auch wenn er wieder
teurer wird"
Claude Taittinger
Generaldirektor von Champagne
Taittinger
- poem (in file ids.xesdoc.dtd): contains a poem, or an extract from a poem, appearing
within or between paragraphs; an inter-level element. IDS changes CES's definition to
allow milestone elements among the children.
contains a poem appearing within or between paragraphs; an
inter-level element.
...
Denn auch für diesen hilflosen,
aber unbestechlichen Chronisten der
dunkelsten deutschen Jahre erweisen
die Verse Brechts ihre Gültigkeit:
[...]
Ihr, die ihr auftauchen werdet
aus der Flut
In der wir untergegangen sind
...
...
IDS/XCES also redefines the base.seq parameter entity in such a way that it becomes not a
sequence but an element class. To try to avoid confusion, it is here renamed basic.
identifies a group of basic phrase-level elements allowed in some cases where
other phrase-level elements are not allowed.
identifies a set of elements used by IDS/XCES as milestones, distinct from the
built-in milestone classes of TEI P5.
Renamings of TEI elements
Several IDS/XCES elements are essentially renamings (or context-dependent renamings)
of TEI elements.
- idsCorpus (in file ids.xesdoc.dtd): renaming of teiCorpus, with slight difference in
content model
- idsHeader (in file ids.xheader.elt): renaming of teiHeader
- idsDoc (in file ids.xesdoc.dtd): intermediate level between idsText and idsCorpus;
conceptually similar to the TEI element, and structurally similar to teiCorpus but (just
for that reason) cannot be declared as a renaming of either. HLU: For the time being,
idsDoc is not put in the namespace http://www.ids-mannheim.de/i5, so that in an I5
document it will work like idsCorpus, idsText, and idsHeader. The latter are technical
renamings of original TEI elements (using altIdent) and therefore cannot be put in a
namespace other than the TEI namespace.
contains a single document within an IDS corpus; may contain one or several
texts.
text
TEI.2
- idsText (in file ids.xesdoc.dtd): renaming of the TEI element
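The renamings via altIdent mentioned for idsCorpus, idsHeader, and idsText could be declared along the following lines. This is a minimal sketch: it shows only the renaming and omits any content-model changes (such as the slight difference noted for idsCorpus), and it is not claimed to reproduce the actual I5 declarations.

```xml
<!-- Sketch: the element stays in the TEI namespace; only its generic identifier changes. -->
<elementSpec ident="teiCorpus" module="core" mode="change">
  <altIdent>idsCorpus</altIdent>
</elementSpec>
<elementSpec ident="teiHeader" module="header" mode="change">
  <altIdent>idsHeader</altIdent>
</elementSpec>
<elementSpec ident="TEI" module="textstructure" mode="change">
  <altIdent>idsText</altIdent>
</elementSpec>
```

Because altIdent changes only the name under which an existing TEI element appears in the generated schema, such elements remain bound to the TEI namespace, which is the constraint described in the idsDoc entry.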
TEI P3 elements no longer in P5
Some elements and attributes used by IDS/XCES were taken over from TEI P3, but are no
longer present in TEI P5: xptr, xref, dateRange, and
timeRange.
Of these, only xptr appears in the samples, so for now that's the only one we
define.
external pointer
defines a pointer to a location outside the current document.
ptr
<xptr targType = "pb" targOrder = "u" doc = "korpref.bio" from =
"TK1.00018-5-PB5" to = "DITTO" TEIform = "xptr"/>
TEI P5 has dropped the id attribute from the global class, and the targOrder attribute
from the att.pointing class; these must be restored.
id
where more than one identifier is supplied as the value of the
target attribute, this attribute specifies whether the order in
which they are supplied is significant.
Yes: the order in which IDREF
values are specified as
the value of a target attribute should be followed when
combining the targeted elements.
No: the order in which IDREF
values are specified as the
value of a target attribute has no significance when combining
the targeted elements.
Unspecified: the order in which IDREF
values are
specified as the value of a target attribute may or may not be
significant.
specifies the kinds of elements to which this pointer may point.
If this attribute is supplied, every element specified as a target must be
of one or other of the types specified. An application may choose whether or
not to report failures to satisfy this constraint as errors, but may not
access an element of the right identifier but the wrong type.
defines a set of attributes used by all those elements which use the TEI P3
extended pointer mechanism to point at locations which have no XML ID.
specifies the document within which the desired location is to be
found.
In principle, the value of this attribute is supposed by TEI P3 to be the
name of an external entity declared in the DTD (often in the internal DTD
subset); in practice, in IDS documents it appears to be a relative reference
to a file, in the form of a filename.
specifies the start of the destination of the pointer.
In principle, the value of this attribute is supposed by TEI P3 to be a TEI
extended pointer; in practice, in IDS documents it appears to be an ID in
the document indicated by the doc attribute.
specifies the end of the destination of the pointer.
In principle, the value of this attribute is supposed by TEI P3 to be a TEI
extended pointer; in practice, in IDS documents it appears always to be the
default value, DITTO.
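An attribute class for the three extended-pointer attributes described above might be sketched as follows. The class name att.xPointer is hypothetical (the actual identifier is not visible in this rendering), datatypes are omitted, and only the default value stated in the text is included.

```xml
<!-- Sketch only: "att.xPointer" is a hypothetical class name. -->
<classSpec ident="att.xPointer" type="atts" mode="add">
  <desc>defines a set of attributes used by all those elements which use the TEI P3
    extended pointer mechanism to point at locations which have no XML ID.</desc>
  <attList>
    <attDef ident="doc" usage="opt">
      <desc>specifies the document within which the desired location is to be found.</desc>
    </attDef>
    <attDef ident="from" usage="opt">
      <desc>specifies the start of the destination of the pointer.</desc>
    </attDef>
    <attDef ident="to" usage="opt">
      <desc>specifies the end of the destination of the pointer.</desc>
      <defaultVal>DITTO</defaultVal>
    </attDef>
  </attList>
</classSpec>
```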
Elements and attributes added by IDS
The following elements are not present in the TEI or in CES/XCES, but have been added
by IDS. All but one are intended to appear in the header.
- appearance: physical appearance of the source
(BOT+e)
A child of edition.
- c.title: corpus title; a context-specific specialization of TEI
title.
child of titleStmt. #PCDATA only, otherwise a context-specific
specialization of TEI title.
- column: original label of newspaper column> section as in the source
(BOT+ress)
child of textDesc. #PCDATA only.
Bericht
TB-KLN2 (Abk.)
- creatDate: time of creation.
child of creation (in profileDesc). #PCDATA
only.
2001.01.13
- creatRef: reference to (creation of text and) first edition.
child of creation (in profileDesc). #PCDATA
only.
1998
(Erstveröffentlichung:
Oberhausen, 1998)
(Erstv. 1998)
- creatRefShort: short version of reference to (creation of text and)
first edition.
child of creation (in profileDesc). #PCDATA
only.
1959
(Erstveröffentlichung:
Frankfurt a.M., 1959)
(Erstv. 1959)
- d.title: document title; a context-specific specialization of TEI
title.
child of titleStmt. #PCDATA only, otherwise a context-specific
specialization of TEI title.
- dokumentSigle: document ID (formerly BOTD).
child of titleStmt. #PCDATA only.
A01/AUG
St. Galler Tagblatt, August 2001
- further: further edition of the same source with year (BOT+gg)
child of edition. #PCDATA only.
5. Auflage 1998 (1. Auflage 1997)
- kind: kind of edition of the source (BOT+g)
child of edition. #PCDATA only.
Im Gegenteil
Kolumnen 1986-1990
Bichsel: Im Gegenteil
Bichsel, Peter
suhrkamp taschenbuch
...
- korpusSigle: corpus ID (formerly BOTC).
child of titleStmt. #PCDATA only.
A01
St. Galler Tagblatt 2001
- numRange (in file ids.xesdoc.dtd): member of the token class, modeled on timeRange and
dateRange. (Not present in samples.)
a range of numbers.
- pagination: whether page numbering is present or not (processing info;
formerly BOTP).
indicates whether page numbering is present or not.
- reference: bibliographic reference string.
a child of sourceDesc.
<reference type="complete" assemblage = "regular">A01/JAN.02562 St.
Galler Tagblatt, [Tageszeitung], 13.01.2001, Jg. 57. - Originalressort:
TB-KLN2 (Abk.), [Bericht]</reference>
- t.title: text title. A context-specific specialization of TEI
title.
child of titleStmt. #PCDATA only, otherwise a context-specific
specialization of TEI title.
- textDomain: subject area of the text (BOT+r)
child of textDesc. #PCDATA only.
Wissenschaft
Wissenschaft
x
- textSigle: text ID (formerly BOTT).
child of titleStmt. #PCDATA only.
A01/JAN.02562
A01/JAN.02562 St. Galler Tagblatt,
13.01.2001,
Ressort: TB-KLN2 (Abk.)
- textType: text type according to type inventory (BOT+x)
child of textDesc. #PCDATA only.
Zeitung: Tageszeitung
Ausgabenvermerk
Anmerkung
- textTypeArt: text type of a specific article (BOT+xa).
child of textDesc. #PCDATA only.
Bericht
TB-KLN2 (Abk.)
- textTypeRef: text type as it should appear in bibliographic string
(BOT+X).
child of textDesc. #PCDATA only.
Zeitschrift: Wochenzeitschrift
Wochenzeitschrift
Interview
Gesellschaft
Zeitung: Tageszeitung
Tageszeitung
- x.title (in file ids.xheader.elt): child of titleStmt; title of some object which is not
a corpus (which would use c.title), not a document in the IDS-specific sense (which would
use d.title), and not a text in the IDS sense (which would use t.title). Contains #PCDATA.
child of titleStmt; title of some object which is not a corpus
(which would use c.title), not a document in the IDS-specific sense
(which would use d.title), and not a text in the IDS sense (which
would use t.title). Contains #PCDATA.
Module, classes and elements from the TEI CMC SIG proposals
["Beißwenger/Ermakova/Geyken/Lemnitzer/Storrer (2013): An XML Schema for the Representation of CMC Genres in TEI"](http://www.empirikom.net/bin/view/Themen/CmcTEI)
The following elements are not present in the TEI or in CES/XCES, but have been
proposed as part of a TEI extension for computer-mediated communication by BBAW and
Dortmund Technical University. The part adopted here is the one that declares the
posting structure.
- Posting:
describes a stretch of text that an individual user has produced in
private and then passed on to the server through performing a "posting" action
(usually by hitting the [ENTER] key on the keyboard or by clicking on a [SEND]
or [SUBMIT] button on the screen). Postings are the largest structural units
in CMC documents that can be assigned to one author and one point in time.
Their function is to make a (written) contribution to the ongoing
dialogue.
marks the (relative) level of indentation of the respective posting
(as defined by its author and in relation to the standard level of
indentation which is described as „0“).
- autoSignature:
is an empty element used for representing the position of the user
signature in a posting.
indicates that the corresponding posting was explicitly signed by
a registered user using user signature markup (e.g. ~~~~).
indicates that the corresponding posting was marked by either a
registered or unregistered user using the Unsigned or Help
template.
"user_contribution" indicates that the corresponding posting was
marked using a [[Special:Contributions/IP]] link (e.g. by an
unregistered user)
added 2019-06-14
This is actually the same as "user_contribution"
"special_contribution" indicates that the corresponding posting
was marked using a [[Special:Contributions/IP]] link (e.g. by an
unregistered user)
-
signatureContent:
is used to describe an individual user's signature in the header of the user
profile. [Comment for I5.odd by HLU 2013-09-05: this element will not be
available as long as there is no element using the model.persStateLike, like
listPerson.]
-
emoticon:
describes an interaction sign which is an iconic unit that has been
created with the keyboard and which typically serves as an emotion or irony
marker or as a responsive.
describes the native region of an emoticon.
describes the general, context-independent function of the
emoticon
describes the function of the respective instance of the emoticon in
its given context.
the position of the emoticon relative to the text to which it
belongs.
-
interactionWord:
describes an interaction sign which is a symbolic linguistic unit whose
morphologic construction is based on a word or a phrase and describes
expressions, gestures, bodily actions, or virtual events; cf. the units sing, g
(< grins, “grin”), fg (< fat grin), s (< smile), wildsei (“being
wild”).
is used to describe morphological properties of the interaction
word.
describes the general, context-independent function of the interaction
word.
describes the function of the respective instance of the interaction
word in its given context.
is used to describe the semiotic mode that forms the basis for an
interaction word.
the position of the interaction word relative to the text to which it
belongs.
-
interactionTerm:
describes instances of one or several interaction signs (i.e., of
emoticons, interaction words, interaction templates, and/or addressing
terms).
- timestamp:
is an empty element used for representing the timestamp in a posting,
which was automatically inserted when the user pressed a button. This element
is an addition by IDS, i.e. not from the Derik ODD.
Elements from the TEI Correspondence SIG proposal
Copied from the [github site of the TEI Correspondence SIG, specifically from the file proposal.xml as
of 2015-01-08](https://github.com/TEI-Correspondence-SIG/correspDesc)
LICENSE for the correspondence Elements
Copyright (c) 2013, TEI-Correspondence-SIG
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are
permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
DAMAGE.
The following was copied from the file proposal.xml:
Module for correspondence, including
letters, telegrams, postcards, e-mail etc.
groups together metadata elements for
describing correspondence
groups elements which may appear as
part of the correspContext element
groups elements which define the parts
(usually names, dates and places) of one action related to the
correspondence.
correspondence description
a wrapper element for metadata
pertaining to correspondence
Adelbert von Chamisso
Vertus
29 January 1807
Louis de La Foye
Caen
[Previous letter of Adelbert von Chamisso to Louis de La Foye: 16 January
1807](http://tei.ibi.hu-berlin.de/berliner-intellektuelle/manuscript?Brief023ChamissoandeLaFoye#1)
[Next letter of Adelbert von Chamisso to Louis de La Foye: 07 May
1810](http://tei.ibi.hu-berlin.de/berliner-intellektuelle/manuscript?Brief025ChamissoandeLaFoye#1)
contains a structured description of
the place, the name of a person/organization and the date related to the
sending/receiving of a message or any other action related to the
correspondence
identifies a/the sending action
of the message
identifies a/the receiving
action of the message
identifies a/the transmitting
action of the message
identifies a/the redirecting
action of the message
identifies a/the forwarding
action of the message
Adelbert von Chamisso
Vertus
29 January 1807
correspondence context
Korrespondenzstelle
provides references to preceding or
following correspondence related to this piece of correspondence
[Previous letter of Carl
Maria von Weber to Caroline Brandt: December 30, 1816](http://weber-gesamtausgabe.de/A040962)
[Next letter of Carl Maria
von Weber to Caroline Brandt: January 5, 1817](http://weber-gesamtausgabe.de/A041003)
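The markup of the correspondence examples is not preserved in this rendering. As a hedged reconstruction, the Chamisso example given for the correspondence description could be encoded roughly as follows. The element names correspAction and correspContext are those described above; the type values (sent, received, prev, next), the use of persName and placeName, and the when attribute are assumptions and may not match the proposal file exactly.

```xml
<!-- Sketch only: attribute values and child element choices are assumptions. -->
<correspDesc>
  <correspAction type="sent">
    <persName>Adelbert von Chamisso</persName>
    <placeName>Vertus</placeName>
    <date when="1807-01-29">29 January 1807</date>
  </correspAction>
  <correspAction type="received">
    <persName>Louis de La Foye</persName>
    <placeName>Caen</placeName>
  </correspAction>
  <correspContext>
    <ref type="prev"
      target="http://tei.ibi.hu-berlin.de/berliner-intellektuelle/manuscript?Brief023ChamissoandeLaFoye#1">Previous
      letter of Adelbert von Chamisso to Louis de La Foye: 16 January 1807</ref>
    <ref type="next"
      target="http://tei.ibi.hu-berlin.de/berliner-intellektuelle/manuscript?Brief025ChamissoandeLaFoye#1">Next
      letter of Adelbert von Chamisso to Louis de La Foye: 07 May 1810</ref>
  </correspContext>
</correspDesc>
```

Each correspAction groups the person, place, and date of one action, while correspContext carries the references to the preceding and following letters.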
References
Association for Computers and the Humanities,
Association for Computational Linguistics, and Association for Literary and Linguistic
Computing. 1994. Guidelines for Electronic Text Encoding and
Interchange (TEI P3). Ed. C. M. Sperberg-McQueen and Lou Burnard. Chicago,
Oxford: Text Encoding Initiative, 1994.
Ide, Nancy. 1998. Corpus Encoding
Standard: SGML Guidelines for Encoding Linguistic Corpora.
Proceedings of the First International Language Resources and
Evaluation Conference, 463–470. Granada, Spain.
Ide, Nancy, Patrice Bonhomme, and
Laurent Romary. 2000. XCES: An XML-based Standard for Linguistic
Corpora.
Proceedings of the Second Language Resources and Evaluation Conference
(LREC), 825–830. Athens, Greece.
Institut für deutsche Sprache. [IDS/XCES DTD]. Mannheim: IDS, 2006. On the Web at [http://corpora.ids-mannheim.de/idsxces1/DTD/](http://corpora.ids-mannheim.de/idsxces1/DTD/)
[Kupietz, Marc.] IDS-Textmodell: Unterschiede gegenüber XCES. Mannheim: IDS, n.d. On the Web
at [http://www.ids-mannheim.de/kl/projekte/korpora/idsxces.html](http://www.ids-mannheim.de/kl/projekte/korpora/idsxces.html)
TEI Consortium [together with] The Association for
Computers and the Humanities, The Association for Computational Linguistics, and The
Association for Literary and Linguistic Computing. 2001. TEI P4:
Guidelines for Electronic Text Encoding and Interchange. Ed. C. M.
Sperberg-McQueen and Lou Burnard. XML conversion by Syd Bauman, Lou Burnard, Steven
DeRose, and Sebastian Rahtz. Oxford, Providence, Charlottesville, Bergen: The TEI
Consortium, 2001, rpt. 2002.
TEI Consortium. 2007. TEI P5: Guidelines
for Electronic Text Encoding and Interchange. Ed. Lou Burnard and Syd Bauman.
Oxford, Providence, Charlottesville, Nancy: The TEI Consortium, 2007, rev. 2010.
TEI elements suppressed
This appendix lists elements which are present in TEI P5 (in modules included by this ODD
file) but are suppressed.
Elements suppressed from the Core module
The following elements in the TEI core module are suppressed. Almost all
of these were explicitly suppressed by CES and XCES, but some (binaryObject,
choice, desc, graphic, measureGrp, and
said) were not present in TEI P3 or TEI P4; they were added to TEI in TEI P5.
-
add (addition): contains letters, words, or phrases inserted in the text by
an author, scribe, annotator, or corrector.
-
binaryObject: provides encoded binary data representing an inline graphic
or other object.
-
cb (column break): marks the boundary between one column of a text and the
next in a standard reference system.
-
choice: groups a number of alternative encodings for the same point in a
text.
-
del (deletion): contains a letter, word, or passage deleted, marked as
deleted, or otherwise indicated as superfluous or spurious in the copy text by an
author, scribe, annotator, or corrector.
-
desc (description): contains a brief description of the object documented
by its parent element, including its intended usage, purpose, or application where
this is appropriate.
-
divGen (automatically generated text division): indicates the location at
which a textual division generated automatically by a text-processing application is
to appear.
-
expan (expansion): contains the expansion of an abbreviation.
-
headItem (heading for list items): contains the heading for the item or
gloss column in a glossary list or similar structured list.
-
headLabel (heading for list labels): contains the heading for the label or
term column in a glossary list or similar structured list.
-
index (index entry): marks a location to be indexed for whatever purpose.
-
listBibl (citation list): contains a list of bibliographic citations of any
kind.
-
measureGrp (measure group): contains a group of dimensional specifications
which relate to the same object, for example the height and width of a manuscript
page.
-
meeting: contains the formalized descriptive title for a meeting or
conference, for use in a bibliographic description for an item derived from such a
meeting, or as a heading or preamble to publications emanating from it.
-
milestone: marks a boundary point separating any kind of section of a text,
typically but not necessarily indicating a point at which some part of a standard
reference system changes, where the change is not represented by a structural
element.
-
postBox (postal box or post office box): contains a number or other
identifier for some postal delivery point other than a street address.
-
postCode (postal code): contains a numerical or alphanumeric code used as
part of a postal address to simplify sorting or delivery of mail.
-
resp (responsibility): contains a phrase describing the nature of a
person's intellectual responsibility.
-
rs (referencing string): contains a general purpose name or referring
string.
-
said (speech or thought): indicates passages thought or spoken aloud,
whether explicitly indicated in the source or not, whether directly or indirectly
reported, whether by real people or fictional characters.
-
series (series information): contains information about the series in which
a book or other bibliographic item has appeared.
-
sic (Latin for thus or so): contains text reproduced although apparently
incorrect or inaccurate.
-
soCalled: contains a word or phrase for which the author or narrator
indicates a disclaiming of responsibility, for example by the use of scare quotes or
italics.
-
street: a full street address including any name or number identifying a
building as well as the name of the street or route on which it is located.
-
teiCorpus: contains the whole of a TEI encoded corpus, comprising a single
corpus header and one or more TEI elements, each containing a single text header and
a text.
-
unclear: contains a word, phrase, or passage which cannot be transcribed
with certainty because it is illegible or inaudible in the source.
Elements suppressed from the TEI header module
The following elements in the TEI header
module are suppressed:
-
appInfo (application information): records information about an application
which has edited the TEI file.
-
application: provides information about an application which has acted upon
the document.
-
authority (release authority): supplies the name of a person or other
agency responsible for making an electronic file available, other than a publisher
or distributor.
-
cRefPattern (canonical reference pattern): specifies an expression and
replacement pattern for transforming a canonical reference into a URI.
-
geoDecl (geographic coordinates declaration): documents the notation and
the datum used for geographic coordinates expressed as content of the geo element elsewhere within the document.
-
handNote (note on hand): describes a particular style or hand distinguished
within a manuscript.
-
interpretation: describes the scope of any analytic or interpretive
information added to the text in addition to the transcription.
-
namespace: supplies the formal name of the namespace to which the elements
documented by its children belong.
-
notesStmt (notes statement): collects together any notes providing
information about a text additional to that recorded in other parts of the
bibliographic description.
-
principal (principal researcher): supplies the name of the principal
researcher responsible for the creation of an electronic text.
-
refState (reference state): specifies one component of a canonical
reference defined by the milestone method.
-
rendition: supplies information about the rendition or appearance of one or
more elements in the source text.
-
scriptNote: describes a particular script distinguished within the
description of a manuscript or similar resource.
-
seriesStmt (series statement): groups information about the series, if any,
to which a publication belongs.
-
sponsor: specifies the name of a sponsoring organization or institution.
-
stdVals (standard values): specifies the format used when standardized date
or number values are supplied.
-
teiHeader (TEI Header): supplies the descriptive and declarative
information making up an electronic title page prefixed to every TEI-conformant
text.
-
typeNote: describes a particular font or other significant typographic
feature distinguished within the description of a printed resource.
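In an ODD, a suppression of this kind is commonly expressed with an elementSpec carrying mode="delete", placed alongside the reference to the module concerned. The fragment below is a minimal sketch for three of the header elements listed above. It illustrates the general TEI P5 mechanism and is not quoted from the I5 specification; the xml:id value is hypothetical.

```xml
<specGrp xmlns="http://www.tei-c.org/ns/1.0"
         xml:id="sketch-header-suppressions">
  <!-- Pull in the TEI header module as a whole ... -->
  <moduleRef key="header"/>
  <!-- ... then remove individual elements from the resulting schema. -->
  <elementSpec ident="appInfo" module="header" mode="delete"/>
  <elementSpec ident="application" module="header" mode="delete"/>
  <elementSpec ident="sponsor" module="header" mode="delete"/>
</specGrp>
```

One elementSpec is needed per suppressed element, so a complete specification for this module would repeat the pattern for every entry in the list.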
Elements suppressed from the TEI text structure module
The following elements from the TEI text structure module are suppressed:
-
argument: a formal list or prose description of the topics addressed by a
subdivision of a text.
-
div1 (level-1 text division): contains a first-level subdivision of the
front, body, or back of a text.
-
div2 (level-2 text division): contains a second-level subdivision of the
front, body, or back of a text.
-
div3 (level-3 text division): contains a third-level subdivision of the
front, body, or back of a text.
-
div4 (level-4 text division): contains a fourth-level subdivision of the
front, body, or back of a text.
-
div5 (level-5 text division): contains a fifth-level subdivision of the
front, body, or back of a text.
-
div6 (level-6 text division): contains a sixth-level subdivision of the
front, body, or back of a text.
-
div7 (level-7 text division): contains the smallest possible subdivision of
the front, body or back of a text, larger than a paragraph.
-
docDate (document date): contains the date of a document, as given
(usually) on a title page.
-
floatingText: contains a single text of any kind, whether unitary or
composite, which interrupts the text containing it at any point and after which the
surrounding text resumes.
-
group: contains the body of a composite text, grouping together a sequence
of distinct texts (or groups of such texts) which are regarded as a unit for some
purpose, for example the collected works of an author, a sequence of prose essays,
etc.
-
imprimatur: contains a formal statement authorizing the publication of a
work, sometimes required to appear on a title page or its verso.
Elements suppressed from optional modules
The following elements in the TEI analysis
module are suppressed:
- c (character): represents a character.
- cl (clause): represents a grammatical clause.
-
interp (interpretation): summarizes a specific interpretative annotation
which can be linked to a span of text.
-
interpGrp (interpretation group): collects together a set of related
interpretations which share responsibility or type.
- m (morpheme): represents a grammatical morpheme.
-
pc (punctuation character): a character or string of characters regarded as
constituting a single punctuation mark.
- phr (phrase): represents a grammatical phrase.
-
span: associates an interpretative annotation directly with a span of text.
-
spanGrp (span group): collects together span tags.
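One compact way to express suppressions like these, available in later releases of TEI P5, is the except attribute of moduleRef, which names the elements to be left out when the module is included. The following is a sketch under that assumption; it is not necessarily how the I5 ODD itself encodes the suppression, and the xml:id value is hypothetical.

```xml
<specGrp xmlns="http://www.tei-c.org/ns/1.0"
         xml:id="sketch-analysis-suppressions">
  <!-- Include the analysis module, omitting the nine elements listed above.
       Any element of the module not named here is retained. -->
  <moduleRef key="analysis"
             except="c cl interp interpGrp m pc phr span spanGrp"/>
</specGrp>
```

The complementary include attribute lists the elements to keep instead. The two attributes are alternatives and should not be combined on the same moduleRef.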
The following elements in the TEI corpus
module are suppressed:
-
activity: contains a brief informal description of what a participant in a
language interaction is doing other than speaking, if anything.
-
channel (primary channel): describes the medium or channel by which a text
is delivered or experienced. For a written text, this might be print, manuscript,
e-mail, etc.; for a spoken one, radio, telephone, face-to-face, etc.
-
constitution: describes the internal composition of a text or text sample,
for example as fragmentary, complete, etc.
-
derivation: describes the nature and extent of originality of this text.
-
domain (domain of use): describes the most important social context in
which the text was realized or for which it is intended, for example private vs.
public, education, religion, etc.
-
factuality: describes the extent to which the text may be regarded as
imaginative or non-imaginative, that is, as describing a fictional or a
non-fictional world.
-
interaction: describes the extent, cardinality and nature of any
interaction among those producing and experiencing the text, for example in the form
of response or interjection, commentary, etc.
-
locale: contains a brief informal description of the kind of place
concerned, for example: a room, a restaurant, a park bench, etc.
-
preparedness: describes the extent to which a text may be regarded as
prepared or spontaneous.
-
purpose: characterizes a single purpose or communicative function of the
text.
-
setting: describes one particular setting in which a language interaction
takes place.
-
settingDesc (setting description): describes the setting or settings within
which a language interaction takes place, either as a prose description or as a
series of setting elements.