Programmbereich Korpuslinguistik - Projekt Korpusausbau
I5: The IDS text model
Derivation from TEI P5 via ODD file
This document provides a formal definition of I5, a derivation of the IDS/XCES vocabulary as a customization of TEI P5. It contains the various specGrp elements needed to specify a customization of TEI, together with accompanying prose explaining the logic of the customization.
IDS/XCES [IDS 2006] is a DTD for corpus materials developed at the Institut für deutsche Sprache in Mannheim. It is based on XCES, an XML version of the Corpus Encoding Standard (CES) [Ide 1998], [Ide/Bonhomme/Romary 2000], which in turn was based on version TEI P3 of the Text Encoding Initiative Guidelines [ACH/ACL/ALLC 1994].
The primary goal is to provide a definition of the IDS/XCES vocabulary on the basis of TEI P5 [TEI 2007], and not (via XCES and CES) on the basis of TEI P3. TEI P3 customization involved the preparation of DTD files in tightly prescribed forms containing declarations which overrode the default declarations for the entities, elements, and attributes concerned. TEI P5 customization involves the preparation of an ODD (for ‘one document does it all’) document which describes changes to the base TEI vocabulary using a specialized vocabulary defined in chapter 22 of TEI P5.
A secondary goal is to document the structure of the customization, specifying what is included without change from TEI P5, what is excluded, and what is changed. Some differences between TEI P3 and IDS/XCES originated with CES or XCES and others were introduced when IDS/XCES was adapted from XCES; since those have different significance for further development and maintenance of the vocabulary, those two sets of differences are distinguished here. Another secondary goal is to provide at least rudimentary documentation for all elements in the vocabulary.
The first section below describes the vocabulary's use of the required TEI modules; the next section describes use of optional modules. There follows a section describing elements added by IDS/XCES, and a driver section which gathers together all the ODD fragments included earlier in the document. A final section describes some conformance and design issues which may need attention.
The brief descriptions of elements in the TEI and CES/XCES vocabularies are taken from the documentation for those encoding schemes; thanks are due to the authors and publishers of that documentation. Descriptions are included in the appendix for elements suppressed from modules which are otherwise included, in order to simplify review of the design and consideration of possible changes.
1 Required TEI Modules
This section of this ODD file includes a number of TEI P5 modules; eventually it will also describe differences between the P5 version of the elements involved and the older IDS/XCES versions.
This spec fragment is referred to by top-level schema fragment ids_v2a.
A note on notation: this document is not a description of the ODD file which generates the I5 version of the TEI P5 vocabulary; it is the ODD document. Blocks labeled “Spec fragment”, like the one just shown, are used to specify selections from and modifications to the TEI P5 vocabulary. As may be seen, such spec fragments may include cross references to other spec fragments elsewhere in the document, which are included by reference in the set of modifications; ODD documents are thus a specialized form of ‘literate programming’ as defined by the computer scientist Donald Knuth and used in the publication of his TeX and MetaFont programs. The literate-programming structure allows the formal specification of changes to be embedded in prose documentation intended to explain what is happening.
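As a rough illustration of how such cross references work, the sketch below shows one spec group including another by reference. The identifiers required-modules and specgroup-tei are taken from this document; the exact contents of the real fragments are not shown in this rendering, so what each group holds here is an assumption.

```xml
<!-- Sketch only: two spec groups, one included in the other by reference. -->
<div xmlns="http://www.tei-c.org/ns/1.0">
  <specGrp xml:id="required-modules">
    <!-- Pull in a TEI P5 module by name -->
    <moduleRef key="tei"/>
    <!-- Include, by reference, the modifications collected elsewhere -->
    <specGrpRef target="#specgroup-tei"/>
  </specGrp>

  <specGrp xml:id="specgroup-tei">
    <!-- Modifications to the tei module would be listed here -->
  </specGrp>
</div>
```

An ODD processor resolves each specGrpRef before generating a schema, so the prose between fragments has no effect on the result.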
1.1 The tei module
The tei module is required for any TEI profile. The following specification fragment includes the tei module and makes appropriate modifications to it.
Redefine one macro: macro.limitedContent.
This spec fragment is referred to by specgroup-tei.
This spec fragment is referred to by required-modules.
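A macro redefinition of this kind is normally written as a macroSpec in change mode. The following is a minimal sketch under that assumption; the replacement content model is not reproduced in this rendering and is therefore left open.

```xml
<!-- Sketch only: changing an existing TEI macro inside a spec group. -->
<specGrp xml:id="specgroup-tei" xmlns="http://www.tei-c.org/ns/1.0">
  <!-- mode="change" modifies the existing declaration instead of adding a new one -->
  <macroSpec ident="macro.limitedContent" module="tei" mode="change">
    <content>
      <!-- The replacement content model goes here; it is not shown in
           this rendering, so it is deliberately left unspecified. -->
    </content>
  </macroSpec>
</specGrp>
```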
1.2 The core module
The tei and core modules are required for any TEI profile. The following specification fragment includes the core module and makes appropriate modifications to it.
Delete unneeded elements.
Rename some elements.
Redefine some elements.
This spec fragment is referred to by required-modules.
The following elements in the core module are suppressed; for short descriptions of these elements see the appendix.
Suppression of unused elements in core module: add, binaryObject, cb, choice, del, divGen, expan, headItem, headLabel, index, listBibl, measureGrp, meeting, milestone, postBox, postCode, rs, said, series, sic, soCalled, street, unclear.
This spec fragment is referred to by specgroup-core.
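In ODD, an element is typically suppressed with an elementSpec in delete mode. The sketch below shows the pattern for the first three elements of the list above; the remaining elements would follow the same form. The grouping into a single specGrp is an assumption.

```xml
<!-- Sketch only: suppressing elements of the core module. -->
<specGrp xmlns="http://www.tei-c.org/ns/1.0">
  <elementSpec ident="add" module="core" mode="delete"/>
  <elementSpec ident="binaryObject" module="core" mode="delete"/>
  <elementSpec ident="cb" module="core" mode="delete"/>
  <!-- ... one elementSpec per suppressed element ... -->
</specGrp>
```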
The teiCorpus element is renamed idsCorpus, and its content model is adjusted: idsCorpus contains a sequence of idsDoc elements, not idsText (~ TEI) elements, so the default content model is not appropriate.
Renaming elements in core module:
teiCorpus: rename as “idsCorpus”.
string
value
string
value
string
value
This spec fragment is referred to by specgroup-core.
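Renaming in ODD is usually expressed with altIdent inside an elementSpec in change mode. The sketch below illustrates the rename only; the adjusted content model and any attribute changes are not reproduced in this rendering and are left out rather than guessed.

```xml
<!-- Sketch only: renaming teiCorpus to idsCorpus. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="teiCorpus" module="core" mode="change">
  <!-- altIdent gives the name under which the element appears in the generated schema -->
  <altIdent>idsCorpus</altIdent>
  <!-- A <content> child would replace the default model here
       (a sequence of idsDoc elements, per the prose above). -->
</elementSpec>
```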
The remainder of the elements in the core module are included in the IDS/XCES vocabulary; that remainder includes the elements listed below. (For the most part, these elements are also included in CES and XCES; the exceptions are editor, gloss, lb, orig, and pb, which are not in XCES and are added back into the vocabulary by IDS/XCES.)
Some elements are included without change from TEI, at least in the sense that the same parameter entity or pattern names are used in the declarations. (The extension of some element classes in IDS/XCES does of course mean the effective content model is not actually the same. But we do not need to supply a different model in this ODD file.)
For other elements, IDS/XCES declares a content model which is a restriction of the content model in TEI P5. One simple way to move I5 closer to TEI P5 would be to drop these restrictions and use the TEI P5 declarations for these elements unchanged.
Finally, for some elements IDS/XCES declares a content model which extends or modifies the declarations in TEI P5. Sometimes the change consists merely in the addition of one or more attributes, or adding the element as a member of this or that class. In other cases the content model is rewritten.
Redefinition of elements in core module.
This spec fragment is referred to by specgroup-core.
-
abbr (abbreviation): contains an abbreviation of any sort.
Spec fragment add_abbr_to_token_class
Change element abbr:
Classes: (add) model.token, (add) model.basic
Attributes:
expan (add this attribute): gives an expansion of the abbreviation.
Example: <abbr expan="Deutsche Volkspartei (1918-1933)">DVP</abbr>
Example: <abbr expan="Unabhängige Sozialdemokratische Partei">USPD</abbr>
Example: <abbr expan="Deutsche Demokratische Partei (1918-1930)">DDP</abbr>
Example: <abbr expan="Arbeiter-und-Bauern-Fakultät">ABF</abbr>
Example: <abbr expan="Deutsche Akademie der Künste (Berlin/Ost)">DAK</abbr>
Example: <abbr expan="Deutsche Akademie der Wissenschaften (Berlin/Ost)">DAW</abbr>
Example: <abbr expan="Deutsches Pädagogisches Zentralinstitut">DPZI</abbr>
Example: <abbr expan="Deutscher Schriftstellerverband">DSV</abbr>
This spec fragment is referred to by specgroup-core-redefinitions.
-
address: contains a postal address, for example of a publisher, an organization, or an individual. (Not present in samples.)
-
analytic (analytic level): contains bibliographic elements describing an item (e.g. an article or poem) published within a monograph or journal and not as an independent publication. IDS/XCES modifies the content model to fit the three-level structure of IDS corpora.
Spec fragment redefine_analytic
Change element analytic:
Change contents to (h.title+, (h.author | editor)*, (biblScope | biblNote)*, (edition, respStmt?)*, imprint+, idno*, (biblNote | biblScope)*)
This spec fragment is referred to by specgroup-core-redefinitions.
monogr
. -
author: in a bibliographic reference, contains the name(s) of the author(s), personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority.
-
bibl (bibliographic citation): contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.
-
biblScope (scope of citation): defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work.
-
biblStruct (structured bibliographic citation): contains a structured bibliographic citation, in which only bibliographic sub-elements appear and in a specified order.
-
corr (correction): contains the correct form of a passage apparently erroneous in the copy text. IDS/XCES adds the attribute sic (which gives the original form), as occurring e.g. in hi1bb.xces.
Spec fragment redefine_corr
Change element corr:
Attributes:
sic (add this attribute): from the CES documentation: "gives the original form"
Example: ToDo
This spec fragment is referred to by specgroup-core-redefinitions.
-
date: contains a date in any format. CES declares this element as a member of the token class.
Spec fragment add_date_to_token_class
Change element date:
Classes: (add) model.token
This spec fragment is referred to by specgroup-core-redefinitions.
-
desc: adding att.typed for wiki talk (HLU 2020-01-24). Note that in the current TEI, desc already has @type.
Spec fragment redefine_desc
Change element desc:
Classes: (add) att.typed, (add) model.descLike
This spec fragment is referred to by specgroup-core-redefinitions.
-
distinct: identifies any word or phrase which is regarded as linguistically distinct, for example as archaic, technical, dialectal, non-preferred, etc., or as forming part of a sublanguage.
-
editor: secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization (or of several such) acting as editor, compiler, translator, etc.
-
emph: adding att.typed for wiki talk (HLU 2020-01-24).
Spec fragment redefine_emph
Change element emph:
Classes: (add) att.typed
This spec fragment is referred to by specgroup-core-redefinitions.
-
foreign: identifies a word or phrase as belonging to some language other than that of the surrounding text. IDS/XCES allows it to contain q, e.g. in loz-div-pub.xces.
Spec fragment redefine_foreign
Change element foreign:
Change contents to (#PCDATA | %model.phrase; | %model.global; | q)*
This spec fragment is referred to by specgroup-core-redefinitions.
-
gap: indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible. (In TEI P5, the description of the gap has moved from a desc attribute to a desc child; we revert this change for compatibility with existing data.)
Spec fragment redefine_gap
Change element gap:
Change contents to Empty.
Attributes:
desc (add this attribute): gives a description of the omitted material.
Example: <p> <s type="manual">die anspruchsvolle Alternative für Leichtraucher.</s> <s type="manual">Astor mild im Rauch nikotinarm<gap desc="Zigarettenschachtel" reason="omitted"/>.</s> <s type="manual">zum Anbieten und Verschenken Astor mild-Kassette 48 Cigaretten DM 6,- 20 Astor mild DM 2,50.</s> <!--* ... *--> </p>
Example: <s>Weit treffender haben aber jene (die Griechen) die unteilbare Subsistenz einer vernünftigen Natur mit dem Wort 'o'<gap desc="GREEKSMALLLETTERSTIGMA" reason="omitted"/>benannt« (Boethius, zitiert nach Brasser 1999: 52).</s>
This spec fragment is referred to by specgroup-core-redefinitions.
-
gloss: identifies a phrase or word used to provide a gloss or definition for some other word or phrase.
Spec fragment redefine_gloss
Change element gloss:
Attributes:
target (change this attribute): the target of the pointer. Type: an XSD IDREF value.
The original TEI target attribute of the class att.pointing comes out as type CDATA from ODD2DTD. (According to the TEI Guidelines the type is 'datapointer', which stands for a single URI and which, incidentally, correctly came out as xs:anyURI in the generated i5.xsd.) In ids-xces.dtd, however, target on gloss was of type IDREF. It is therefore included here with an explicit specification of the IDREF type.
This spec fragment is referred to by specgroup-core-redefinitions.
-
head (heading): contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc. Changed 2020-01-21 such that it can contain signed, for wiki talk.
Spec fragment redefine_head
Change element head:
Change contents to (#PCDATA | %model.gLike; | %model.phrase; | %model.inter; | %model.global; | lg | %model.lLike; | %model.floatP.cmc;)*
This spec fragment is referred to by specgroup-core-redefinitions.
-
hi (highlighted): marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made.
Spec fragment add_hi_to_basic_class
Change element hi:
Classes: (add) model.basic
Change contents to (#PCDATA | %model.gLike; | %model.phrase; | %model.inter; | %model.global; | lg | %model.lLike; | %model.floatP.cmc;)*
This spec fragment is referred to by specgroup-core-redefinitions.
-
imprint: groups information relating to the publication or distribution of a bibliographic item. CES redefines this to use the pubDate element instead of date.
Spec fragment redefine_imprint
Change element imprint:
Change contents to (pubPlace | publisher | pubDate)*
This spec fragment is referred to by specgroup-core-redefinitions.
-
item: contains one component of a list. Redefined to contain signed as well, for wiki talk (HLU 2020-01-02).
Spec fragment redefine_item
Change element item:
Change contents to (#PCDATA | %model.gLike; | %model.phrase; | %model.inter; | %model.divPart; | %model.global; | %model.floatP.cmc;)*
This spec fragment is referred to by specgroup-core-redefinitions.
-
l (verse line): contains a single, possibly incomplete, line of verse. CES redefined the meaning and values of the part attribute.
Spec fragment redefine_l_part_attribute
Change element l:
Attributes:
part (redefine this attribute): indicates whether the verse line is metrically complete.
Values:
y: the line is metrically complete.
n: the line is metrically incomplete.
u: metricality is not known or inapplicable.
Given the attribute name part, the value y might seem intuitively to mean “Yes, this is a partial line, not a full line,” but the CES documentation glosses y and n as shown.
The attribute appears not to be actively used in any IDS samples, in any case; all values given are the default u.
This spec fragment is referred to by specgroup-core-redefinitions.
-
label: contains the label associated with an item in a list; in glossaries, marks the term being defined.
-
lb (line break): marks the start of a new (typographic) line in some edition or version of a text. IDS/XCES adds @TEIform to the attribute list, as used in fsp.xces and gr1.xces.
Spec fragment add_lb_to_ids.milestones
Change element lb:
Classes: (add) model.ids.milestones
Attributes:
TEIform (add this attribute): TEIform. Default value: pb
This spec fragment is referred to by specgroup-core-redefinitions.
-
lg (line group): contains a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.
Spec fragment redefine_lg_part_attribute
Change element lg:
Attributes:
part (redefine this attribute): indicates whether the verse line group is metrically complete.
Values:
y: the line is metrically complete.
n: the line is metrically incomplete.
u: metricality is not known or inapplicable.
The attribute appears not to be actively used in any IDS samples; all values given are the default u.
This spec fragment is referred to by specgroup-core-redefinitions.
-
list: contains any sequence of items organized as a list. IDS/XCES allows milestones and xptr elements among the children.
Spec fragment redefine_list
Change element list:
Change contents to (head?, (item | (label, %model.ids.milestones;*, item) | %model.ids.milestones;)*)
This spec fragment is referred to by specgroup-core-redefinitions.
-
measure: contains a word or phrase referring to some quantity of an object or commodity, usually comprising a number, a unit, and a commodity name.
-
mentioned: marks words or phrases mentioned, not used. (Not present in samples.)
-
monogr (monographic level): contains bibliographic elements describing an item (e.g. a book or journal) published as an independent item (i.e. as a separate physical object). CES restricts the TEI content model and renames some elements; IDS/XCES extends the CES definition. The model given here is the same as for analytic.
Spec fragment redefine_monogr
Change element monogr:
Change contents to (h.title+, (h.author | editor)*, (biblScope | biblNote)*, (edition, respStmt?)*, imprint+, idno*, (biblNote | biblScope)*)
This spec fragment is referred to by specgroup-core-redefinitions.
-
name (name, proper noun): contains a proper noun or noun phrase.
Spec fragment add_name_to_token_class
Change element name:
Classes: (add) model.token
This spec fragment is referred to by specgroup-core-redefinitions.
-
note: contains a note or annotation.
-
num (number): contains a number, written in any form.
Spec fragment add_num_to_token_class
Change element num:
Classes: (add) model.token, (add) model.basic
This spec fragment is referred to by specgroup-core-redefinitions.
-
orig (original form): contains a reading which is marked as following the original, rather than being normalized or corrected. The reg attribute was dropped in TEI P5 and must be restored. CES also adds a regalt attribute which must be defined.
Spec fragment add_attributes_to_orig
Change element orig:
Attributes:
reg (add this attribute): gives a regularized (normalized) form of the text.
regalt (add this attribute): gives an alternate form of the regularized (normalized) text.
Example: <s type="manual">der warnt - wider die eiserne Regel des <orig reg="Wahlkampfes" regalt="Wahlkrampfes">Wahlk(r)ampfes</orig>, einen Gegner durch Nichtnennung zu strafen - davor, PDS zu wählen.</s>
This spec fragment is referred to by specgroup-core-redefinitions.
-
p (paragraph): marks paragraphs in prose.
Spec fragment redefine_p
Change element p:
(paragraph) marks paragraphs in prose. In CMC documents, notably Wiki talk pages, signed must also be allowed inside paragraphs, because users insert their signature as part of the paragraph. The only change to the original content model of p is that signed is additionally allowed inside p.
Change contents to (#PCDATA | %model.gLike; | %model.phrase; | %model.inter; | %model.global; | lg | %model.lLike; | %model.floatP.cmc;)*
Example: Usenet news message
<egXML><p>Wer die ruhrtour Mailingliste noch nicht kennt, der schaut bitte weiter unten nach!</p></egXML>
This spec fragment is referred to by specgroup-core-redefinitions.
-
pb (page break): marks the boundary between one page of a text and the next in a standard reference system.
Spec fragment add_TEIform_to_pb
Change element pb:
Classes: (add) model.ids.milestones
Attributes:
TEIform (add this attribute). Default value: pb
This spec fragment is referred to by specgroup-core-redefinitions.
-
ptr (pointer): defines a pointer to another location. IDS/XCES adds this element to the ids.milestones class.
Spec fragment ptr_as_milestone
Change element ptr:
Classes: (add) att.text, (drop) att.global, (add) model.ids.milestones
Attributes:
target (change this attribute): the target of the pointer. Type: an XSD IDREFS value.
The original TEI target attribute of the class att.pointing comes out as type CDATA from ODD2DTD. (According to the TEI Guidelines the type is 'datapointer', which stands for a single URI and which, incidentally, correctly came out as xs:anyURI in the generated i5.xsd.) In ids-xces.dtd, however, it was IDREFS. It is therefore included here with an explicit specification of the IDREFS type.
This spec fragment is referred to by specgroup-core-redefinitions.
-
pubPlace (publication place): contains the name of the place where a bibliographic item was published.
-
publisher: provides the name of the organization responsible for the publication or distribution of a bibliographic item.
-
q (separated from the surrounding text with quotation marks): contains material which is marked as (ostensibly) being somehow different than the surrounding text, for any one of a variety of reasons including, but not limited to: direct speech or thought, technical terms or jargon, authorial distance, quotations from elsewhere, and passages that are mentioned but not used. CES adds several attributes.
Spec fragment add_attributes_to_q
Change element q:
Attributes:
next (change this attribute): points to the next element of a virtual aggregate of which the current element is part. Specifically, for q elements, gives the ID of a subsequent q element which contains a continuation of the same quotation. Type: an XSD IDREF value. In TEI P5, this attribute is URI-valued; in P3 (and IDS/XCES), it is IDREF-valued.
prev (change this attribute): points to the previous element of a virtual aggregate of which the current element is part. Specifically, for q elements, gives the ID of a preceding q element which contains the immediately preceding portion of the same quotation. Type: an XSD IDREF value. In TEI P5, this attribute is URI-valued; in P3 (and IDS/XCES), it is IDREF-valued.
direct: may be used to indicate whether the quoted matter is regarded as direct or indirect speech.
Values:
y: speech or thought is represented directly.
n: speech or thought is represented indirectly, e.g. by use of a marked verbal aspect.
unspecified: no claim is made.
broken: indicates whether this quotation or piece of dialog is broken between two or more q elements (linked using the next and prev attributes).
Values:
y: quotation is broken across two or more elements.
n: quotation is not broken across multiple elements.
unspecified: no claim is made.
This spec fragment is referred to by specgroup-core-redefinitions.
-
quote (quotation): contains a phrase or passage attributed by the narrator or author to some agency external to the text.
-
ref (reference): defines a reference to another location, possibly modified by additional text or comment. IDS/XCES adds @orig.
-
reg (regularization): contains a reading which has been regularized or normalized in some sense.
Spec fragment redefine_reg
Change element reg:
Attributes:
orig (add this attribute): for the original
Example: ToDo
This spec fragment is referred to by specgroup-core-redefinitions.
-
respStmt (statement of responsibility): supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply.
Spec fragment redefine_respStmt
Change element respStmt:
Change contents to (resp*, %model.nameLike.agent;+)
This spec fragment is referred to by specgroup-core-redefinitions.
-
sp (speech): An individual speech in a performance text, or a passage presented as such in a prose or verse text. CES restricts this content model severely; IDS/XCES brings it back closer to the TEI form, and adds the class of IDS milestones to the legal content.
Spec fragment redefine_sp
Change element sp:
Change contents to (speaker | p | quote | poem | stage | %model.ids.milestones;)*
Attributes:
role (add this attribute): Role in plenary debate, such as presidency or ordinary MP. Type: #PCDATA
name (add this attribute): Name of speaker. Type: #PCDATA
parliamentary_group (add this attribute): Parliamentary group (German "Fraktion") of speaker. Type: #PCDATA
party (add this attribute): Party of speaker. Type: #PCDATA
Example: <sp who="Korzcak"> <speaker/> <p> <s>Sie irren sich, <stage>erwiderte Korczak,</stage> nicht jeder ist ein Schuft<stage>, und er schlug die Waggontür hinter sich zu</stage>.</s> </p> </sp>
This spec fragment is referred to by specgroup-core-redefinitions.
-
speaker: A specialized form of heading or label, giving the name of one or more speakers in a dramatic text or fragment.
-
stage (stage direction): contains any kind of stage direction within a dramatic text or fragment.
-
term: contains a single-word, multi-word, or symbolic designation which is regarded as a technical term.
Spec fragment add_term_to_token_class
Change element term:
Classes: (add) model.token
This spec fragment is referred to by specgroup-core-redefinitions.
-
time: contains a phrase defining a time of day in any format.
Spec fragment add_time_to_token_class
Change element time:
Classes: (add) model.token
This spec fragment is referred to by specgroup-core-redefinitions.
-
title: contains a title for any kind of work.
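Most redefinitions in the list above follow one of three patterns: adding an element to a class, adding an attribute, or replacing the content model. The sketch below shows how the abbr and gap changes might be written in ODD. It is an illustration under assumptions, not the original source: attribute datatypes are omitted because this rendering does not show them, and the RELAX NG form of the empty content model is assumed.

```xml
<!-- Sketch only: typical redefinition patterns for core elements. -->
<specGrp xmlns="http://www.tei-c.org/ns/1.0"
         xmlns:rng="http://relaxng.org/ns/structure/1.0">

  <!-- Pattern 1 and 2: add class memberships and a new attribute -->
  <elementSpec ident="abbr" module="core" mode="change">
    <classes mode="change">
      <memberOf key="model.token" mode="add"/>
      <memberOf key="model.basic" mode="add"/>
    </classes>
    <attList>
      <attDef ident="expan" mode="add">
        <desc>gives an expansion of the abbreviation.</desc>
      </attDef>
    </attList>
  </elementSpec>

  <!-- Pattern 3: replace the content model (here: empty) and restore desc -->
  <elementSpec ident="gap" module="core" mode="change">
    <content>
      <rng:empty/>
    </content>
    <attList>
      <attDef ident="desc" mode="add">
        <desc>gives a description of the omitted material.</desc>
      </attDef>
    </attList>
  </elementSpec>
</specGrp>
```

With these changes, an instance such as `<gap desc="Zigarettenschachtel" reason="omitted"/>` from the examples above would validate, whereas a desc child element would not.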
1.3 The header module
The header module is also essential. We include it here:
This spec fragment is referred to by required-modules.
The teiHeader element is renamed to idsHeader, and we add some attributes (the status attribute was in TEI P3 but seems to have disappeared from P5):
Renaming teiHeader as idsHeader ...
teiHeader: rename as “idsHeader”.
new
update
This spec fragment is referred to by specgroup-header.
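A hedged sketch of how this rename and the restored status attribute could be expressed follows. It assumes that the values new and update listed above belong to status and form a closed list; other added attributes are not shown in this rendering and are omitted.

```xml
<!-- Sketch only: renaming teiHeader and restoring the P3 status attribute. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="teiHeader" module="header" mode="change">
  <altIdent>idsHeader</altIdent>
  <attList>
    <attDef ident="status" mode="add">
      <!-- Assumption: new and update are the permitted values of status -->
      <valList type="closed">
        <valItem ident="new"/>
        <valItem ident="update"/>
      </valList>
    </attDef>
  </attList>
</elementSpec>
```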
The following elements in the header module are suppressed; for short descriptions see the appendix.
Deleting unused elements in header module: appInfo, application, authority, cRefPattern, geoDecl, handNote, interpretation, namespace, notesStmt, principal, refState, rendition, scriptNote, seriesStmt, sponsor, stdVals, typeNote.
This spec fragment is referred to by specgroup-header.
Other elements are retained from the header module, sometimes with content models which extend or otherwise modify the content models of TEI P5 in such a way that instances of the revised element type are not valid against the unmodified TEI P5 schema, and sometimes without change. In several cases, CES changes a content model from requiring a sequence of paragraphs to allowing just character data. In other cases, specialized child elements are added to the content model.
Modifying some elements and classes in header module ...
This spec fragment is referred to by specgroup-header.
For the class declarable, CES renames one attribute from default to Default, and changes its values from yes and no to y and n.
att.declarable
.
y
n
This spec fragment is referred to by specgroup-header-redefinitions.
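An attribute-class change of this kind might be sketched as below. The use of altIdent for the attribute rename and of a closed value list is an assumption about how the change is expressed, not a copy of the original fragment.

```xml
<!-- Sketch only: renaming default to Default and replacing its values. -->
<classSpec xmlns="http://www.tei-c.org/ns/1.0"
           ident="att.declarable" type="atts" mode="change">
  <attList>
    <attDef ident="default" mode="change">
      <!-- Name under which the attribute appears in the generated schema -->
      <altIdent>Default</altIdent>
      <!-- Replace the yes/no values with the CES values y/n -->
      <valList type="closed" mode="replace">
        <valItem ident="y"/>
        <valItem ident="n"/>
      </valList>
    </attDef>
  </attList>
</classSpec>
```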
header
module are these.
-
availability: supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, etc.
Spec fragment redefine_availability
Change element availability:
Change contents to ((#PCDATA | %model.availabilityPart; | %model.pLike;)*)
Attributes:
region (add this attribute). Default value: world
label (add this attribute). Values (closed list): QAO-NC, QAO-NC-LOC:ids, QAO-NC-LOC:ids-NU:1, ACA-NC, ACA-NC-LC, CC-BY-SA
Example: <availability region="world" status="unknown"/>
This spec fragment is referred to by specgroup-header-redefinitions.
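The two added attributes illustrate a default value and a closed value list. A minimal sketch of how they might be declared follows; the content-model change is omitted, and the structure is an assumption based on the attribute listing above.

```xml
<!-- Sketch only: adding region (with default) and label (closed list). -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="availability" module="header" mode="change">
  <attList>
    <attDef ident="region" mode="add">
      <defaultVal>world</defaultVal>
    </attDef>
    <attDef ident="label" mode="add">
      <valList type="closed">
        <valItem ident="QAO-NC"/>
        <valItem ident="QAO-NC-LOC:ids"/>
        <valItem ident="QAO-NC-LOC:ids-NU:1"/>
        <valItem ident="ACA-NC"/>
        <valItem ident="ACA-NC-LC"/>
        <valItem ident="CC-BY-SA"/>
      </valList>
    </attDef>
  </attList>
</elementSpec>
```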
header
module with content models which restrict the content models of
TEI P5.
-
biblFull (fully-structured bibliographic citation): contains a fully-structured bibliographic citation, in which all components of the TEI file description are present.
-
catDesc (category description): describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal textDesc.
-
catRef (category reference): specifies one or more defined categories within some taxonomy or text typology.
Spec fragment redefine_catRef
Change element catRef:
Attributes:
target (change this attribute): the target of the pointer. Type: an XSD IDREFS value.
The original TEI target attribute of the class att.pointing comes out as type CDATA from ODD2DTD. (According to the TEI Guidelines the type is 'datapointer', which stands for a single URI and which, incidentally, correctly came out as xs:anyURI in the generated i5.xsd.) In ids-xces.dtd, however, it was IDREFS. It is therefore included here with an explicit specification of the IDREFS type.
This spec fragment is referred to by specgroup-header-redefinitions.
-
category: contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy.
-
change: summarizes a particular change or correction made to a particular version of an electronic text which is shared between several researchers.
-
classCode
-
classDecl (classification declarations): contains one or more taxonomies defining any classificatory codes used elsewhere in the text.
-
correction (correction principles): states how and under what circumstances corrections have been made in the text.
Spec fragment redefine_correction_as_PCDATA
Change element correction:
Change contents to character data
This spec fragment is referred to by specgroup-header-redefinitions.
-
creation: contains information about the creation of a text.
Spec fragment redefine_creation
Change element creation:
Change contents to (creatDate, creatRef?, creatRefShort?)
This spec fragment is referred to by specgroup-header-redefinitions.
-
distributor: supplies the name of a person or other agency responsible for the distribution of a text.
-
edition: describes the particularities of one edition of a text.
Spec fragment redefine_edition
Change element edition:
Change contents to (further, kind, appearance)
This spec fragment is referred to by specgroup-header-redefinitions.
-
editionStmt (edition statement): groups information relating to one edition of a text.
Spec fragment redefine_editionStmt_as_PCDATA
Change element editionStmt:
Change contents to character data
Attributes: version
This spec fragment is referred to by specgroup-header-redefinitions.
-
editorialDecl (editorial practice declaration): provides details of editorial principles and practices applied during the encoding of a text. CES changes the set of children for this element from P3 (suppressing interpretation and stdVals and adding transduction and conformance); IDS/XCES adds pagination to the children.
Spec fragment redefine_editorialDecl
Change element editorialDecl:
Change contents to (pagination | correction | quotation | hyphenation | segmentation | transduction | normalization | conformance)+
Attributes: version
This spec fragment is referred to by specgroup-header-redefinitions.
-
encodingDesc (encoding description): documents the relationship between an electronic text and the source or sources from which it was derived. IDS/XCES additionally allows an empty encodingDesc, as used in dck.xces.
Spec fragment redefine_encodingDesc
Change element encodingDesc:
Change contents to (%model.encodingDescPart; | %model.pLike;)*
This spec fragment is referred to by specgroup-header-redefinitions.
-
extent
: describes the approximate size of a text as stored on some carrier medium, whether digital or non-digital, specified in any convenient units. -
fileDesc
(file description): contains a full bibliographic description of an electronic file. -
hyphenation
: summarizes the way in which hyphenation in a source text has been treated in an encoded version of it. -
idno
(identifier): supplies any form of identifier used to identify some object, such as a bibliographic item, a person, a title, an organization, etc. in a standardized way. -
keywords
: contains a list of keywords or phrases identifying the topic or nature of a text. -
langUsage
(language usage): describes the languages, sublanguages, registers, dialects, etc. represented within a text. -
language: characterizes a single language or sublanguage used within a text. TEI P5 uses an ident attribute, not id, to give the language code; IDS/XCES follows P3 in using id.
Spec fragment redefine_language
Change element language:
Attributes:
  ident (delete this attribute)
  id (change this attribute): supplies a language code constructed as defined in BCP 47 which is used to identify the language documented by this element, and which is referenced by the global attributes lang and xml:lang. Type: an XSD ID value.
Example: <language id="de" usage="100">Deutsch</language>
Note that for technical reasons it is not possible to assign the type ID to both the id and xml:id attributes. This version of I5 assigns the ID type to attribute id.
This spec fragment is referred to by specgroup-header-redefinitions.
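The attribute changes described for language could be expressed in ODD roughly as follows. This is only a sketch: the description text is abbreviated, and the way the datatype is declared (here with dataRef) is an assumption, so the actual I5 source may be written differently.

```xml
<!-- Sketch only: deleting ident and declaring id as an XSD ID on language. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="language" module="header" mode="change">
  <attList>
    <!-- remove the TEI P5 attribute -->
    <attDef ident="ident" mode="delete"/>
    <!-- keep the P3-style attribute; the datatype declaration is an assumption -->
    <attDef ident="id" mode="change">
      <desc>Supplies a language code constructed as defined in BCP 47.</desc>
      <datatype>
        <dataRef name="ID"/>
      </datatype>
    </attDef>
  </attList>
</elementSpec>
```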
-
normalization: indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form.
Spec fragment redefine_normalization_as_PCDATA
Change element normalization: change contents to character data
This spec fragment is referred to by specgroup-header-redefinitions.
-
profileDesc (text-profile description): provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting.
-
projectDesc (project description): describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.
Spec fragment redefine_projectDesc_as_PCDATA_or_p
Change element projectDesc: change contents to (#PCDATA | %model.pLike;)*
This spec fragment is referred to by specgroup-header-redefinitions.
-
publicationStmt (publication statement): groups information concerning the publication or distribution of an electronic or other text.
Spec fragment redefine_publicationStmt
Change element publicationStmt: change contents to ((distributor, pubAddress, telephone*, fax*, eAddress*, idno*, availability, pubDate, pubPlace*) | p+)
Example (all of the publicationStmt elements in the available samples look like this, or else have no contents in any of their children):
<publicationStmt>
  <distributor>Institut für Deutsche Sprache</distributor>
  <pubAddress>Postfach 10 16 21, D-68016 Mannheim</pubAddress>
  <telephone>+49 (0)621 1581 0</telephone>
  <availability region="world" status="unknown"/>
  <pubDate/>
</publicationStmt>
This spec fragment is referred to by specgroup-header-redefinitions.
-
quotation: specifies editorial practice adopted with respect to quotation marks in the original.
Spec fragment redefine_quotation_as_PCDATA
Change element quotation: change contents to character data
This spec fragment is referred to by specgroup-header-redefinitions.
-
refsDecl (references declaration): specifies how canonical references are constructed for this text.
-
revisionDesc (revision description): summarizes the revision history for a file.
-
samplingDecl (sampling declaration): contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection.
Spec fragment redefine_samplingDecl_as_PCDATA
Change element samplingDecl: change contents to (#PCDATA | p)*
This spec fragment is referred to by specgroup-header-redefinitions.
-
segmentation: describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc.
Spec fragment redefine_segmentation_as_PCDATA
Change element segmentation: change contents to character data
This spec fragment is referred to by specgroup-header-redefinitions.
-
sourceDesc (source description): describes the source from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as "born digital" for a text which has no previous existence.
Spec fragment redefine_sourceDesc
Change element sourceDesc: change contents to (%model.pLike;*, (biblFull | biblStruct)+, reference*)
This spec fragment is referred to by specgroup-header-redefinitions.
-
tagUsage: supplies information about the usage of a specific element within a text.
-
tagsDecl (tagging declaration): provides detailed information about the tagging applied to a document. TEI P5 requires that the tagUsage elements in the tagsDecl element be wrapped in a namespace element; IDS/XCES follows P3.
-
taxonomy: defines a typology used to classify texts either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy.
Spec fragment redefine_taxonomy
Change element taxonomy: change contents to (category+ | ((h.bibl | biblStruct), category*))
This spec fragment is referred to by specgroup-header-redefinitions.
-
textClass (text classification): groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.
Spec fragment redefine_textClass
Change element textClass: change contents to (catRef | classCode | h.keywords)*
This spec fragment is referred to by specgroup-header-redefinitions.
-
titleStmt (title statement): groups information about the title of a work and those responsible for its intellectual content. IDS/XCES adjusts the content model here to use the specialized title elements it defines for the different corpus levels. An I5 corpus-level-specific {c|d|t}.title element is obligatory; in addition, original TEI title elements with suitable attributes may be specified, e.g. for subtitles.
Spec fragment redefine_titleStmt
Change element titleStmt: change contents to (((korpusSigle, c.title) | (dokumentSigle, d.title) | (textSigle, t.title) | (x.title)), %model.respLike;*)
This spec fragment is referred to by specgroup-header-redefinitions.
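As an illustration of the titleStmt content model, a text-level header might contain something like the following. The sigle and title values are invented, and the sketch assumes that textSigle and t.title hold plain character data.

```xml
<!-- Hypothetical text-level title statement: textSigle followed by t.title. -->
<titleStmt>
  <textSigle>ABC/JAN.00001</textSigle>
  <t.title>Hypothetical title of a single text</t.title>
</titleStmt>
```

At document or corpus level the same pattern would use the pairs dokumentSigle with d.title, or korpusSigle with c.title.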
-
xenoData
Spec fragment define_xenoData
<xenoData> element (new)
Classes: (add) model.teiHeaderPart
Contains: meta+
This spec fragment is referred to by specgroup-header-redefinitions.
-
meta
Spec fragment define_meta
<meta> element (new)
Contains: character data
Attributes:
  name (add this attribute): name of feature. Type: #PCDATA
  project (add this attribute): name of project or collection. Type: #PCDATA
  type (add this attribute): string or text or keyword or integer. Values (closed list):
    string: string
    text: text
    keyword: keyword
    integer: integer
    uri: uri
    attachment: only retrievable
    date: date
  desc (add this attribute): description. Type: #PCDATA
This spec fragment is referred to by specgroup-header-redefinitions.
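A xenoData block built from the declarations of xenoData and meta might look like the sketch below. The feature names, project name, and values are invented for illustration; only the element names, attribute names, and the type values come from the specification.

```xml
<!-- Hypothetical metadata: two meta entries with different type values. -->
<xenoData>
  <meta name="tokenCount" project="exampleProject" type="integer"
        desc="number of tokens in the text">1234</meta>
  <meta name="sourceUrl" project="exampleProject" type="uri">http://example.org/text/1</meta>
</xenoData>
```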
1.4 The textstructure module
The textstructure module is also included:
This spec fragment is referred to by required-modules.
The TEI element is renamed idsText:
-
TEI (TEI document): contains a single TEI-conformant document, comprising a TEI header and a text, either in isolation or as part of a teiCorpus element. This corresponds in essential ways to the IDS idsText element.
Renaming ...
TEI: rename as “idsText”.
This spec fragment is referred to by specgroup-textstructure.
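In ODD, a renaming of this kind is normally expressed with altIdent inside an elementSpec in change mode. The following is a sketch of that mechanism, not a copy of the I5 source.

```xml
<!-- Sketch: rename the TEI root element to idsText in the generated schema. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="TEI" module="textstructure" mode="change">
  <altIdent>idsText</altIdent>
</elementSpec>
```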
Suppressing unused elements ...
argument, div1, div2, div3, div4, div5, div6, div7, docDate, floatingText, group, imprimatur.
This spec fragment is referred to by specgroup-textstructure.
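Suppressing an element in ODD typically takes one elementSpec in delete mode per element. The sketch below shows three of the suppressed elements; the specGrp identifier is hypothetical, and the actual I5 source may organize these declarations differently.

```xml
<!-- Sketch: delete unused textstructure elements (xml:id is a made-up name). -->
<specGrp xmlns="http://www.tei-c.org/ns/1.0" xml:id="example-suppress-textstructure">
  <elementSpec ident="argument" module="textstructure" mode="delete"/>
  <elementSpec ident="div1" module="textstructure" mode="delete"/>
  <elementSpec ident="imprimatur" module="textstructure" mode="delete"/>
</specGrp>
```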
The following elements are taken from the textstructure module. A few of these are included in CES and XCES, with declarations which extend or otherwise modify those in the TEI. Others are omitted from CES and XCES and have been added back into the vocabulary by IDS. In some cases, the IDS/XCES declaration extends that of TEI.
This spec fragment is referred to by specgroup-textstructure.
-
back (back matter): contains any appendixes, etc. following the main part of a text.
-
body (text body): contains the whole body of a single unitary text, excluding any front or back matter.
-
byline: contains the primary statement of responsibility given for a work on its title page or at the head or end of the work.
-
closer: groups together salutations, datelines, and similar phrases appearing as a final group at the end of a division, especially of a letter.
-
dateline: contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer.
-
div (text division): contains a subdivision of the front, body, or back of a text. IDS/XCES eliminates the internal structure of the declarations in TEI and CES and allows a mixture of children in any order. We additionally allow the element posting from the DeRik TEI proposal for CMC.
Spec fragment modify_attributes_for_div
Change element div: change contents to (opener | head | byline | p | sp | u | %model.inter; | caption | figure | note | %model.divLike; | closer | %model.ids.milestones; | dateline)*
Attributes:
  complete (add this attribute): indicates whether the section is complete or a sample. Type: an XSD NMTOKEN value. Default value: y. Values (closed list):
    y: The text section is complete.
    n: The text section is incomplete (typically because it's a sample).
  type (redefine this attribute): the type of the text section. Type: #PCDATA
    The most frequent values include: “section”, “Zeitung”, “book”, “Enzyklopädie-Artikel”, “Agenturmeldungen”, “figures”, “marginnotes”, “Rede”, “Zeitschrift”, “footnotes”, “Roman”, “content”, “preface”.
    Other values include: “abstract”, “Anmerkung”, “Ansprache”, “appendix”, “Aufruf”, “Aufsatz”, “Ausgabenvermerk”, “Beschluss”, “bibliography”, “Brief”, “captions”, “dedication”, “endnotes”, “Erklärung”, “Erzählung”, “Erzählungen”, “Fabeln”, “Forderung”, “Geschichte”, “glossary”, “Handzettel”, “Information”, “Interview”, “Kolumnen”, “Kriminalroman”, “Kurzgeschichten”, “Merkblatt”, “Nachwort”, “Novelle”, “postface”, “Predigt”, “Protokoll”, “Referat”, “Sachbuch”, “Sachbuch, Ratgeber”, “Schilderung”, “Sprechchöre und Transparente”, “Vorlesung”, “Vortrag”, “Wissenschaftszeitung”, and “Zeitungsartikel”.
  what (add this attribute): subject of section in plenary debate according to GermaParlTEI. Type: #PCDATA
  desc (add this attribute): description of section in plenary debate according to GermaParlTEI. Type: #PCDATA
This spec fragment is referred to by specgroup-text-structure-changes.
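A div using the added complete attribute and one of the frequent type values could look like this. The heading and paragraph text are invented.

```xml
<!-- Hypothetical sampled section: complete="n" marks it as incomplete. -->
<div type="section" complete="n">
  <head>Hypothetical heading</head>
  <p>First paragraph of a section that is only partly included in the corpus.</p>
</div>
```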
-
docAuthor (document author): contains the name of the author of the document, as given on the title page (often but not always contained in a byline).
-
docEdition (document edition): contains an edition statement as presented on a title page of a document.
-
docImprint (document imprint): contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page.
-
docTitle (document title): contains the title of a document, including all its constituents, as given on a title page.
-
epigraph: contains a quotation, anonymous or attributed, appearing at the start of a section or chapter, or on a title page.
-
front (front matter): contains any prefatory matter (headers, title page, prefaces, dedications, etc.) found at the start of a document, before the main body.
-
opener: groups together dateline, byline, salutation, and similar phrases appearing as a preliminary group at the start of a division, especially of a letter. IDS/XCES adds gap to the content model.
Spec fragment redefine_opener
Change element opener: change contents to (#PCDATA | %model.phrase; | dateline | keywords | salute | list | %model.global.edit; | pb | lb)*
Attributes:
  type (add this attribute): indicates the type of opener. Type: an XSD NMTOKEN value. Default value: unspecified. Values (closed list): lead, unspecified
This spec fragment is referred to by specgroup-text-structure-changes.
-
salute (salutation): contains a salutation or greeting prefixed to a foreword, dedicatory epistle, or other division of a text, or the salutation in the closing of a letter, preface, etc.
-
signed (signature): contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text.
Spec fragment redefine_signed
Change element signed: (signature) contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text, or appearing freely within paragraphs, sentences, quotations or the post as a whole, especially of an email or of a user contribution on a Wikipedia talk page.
Classes: (add) model.floatP.cmc
Attributes:
  type (add this attribute). Values (closed list):
    signed: indicates that the corresponding posting was explicitly signed by a registered user using user signature markup (e.g. ~~~~).
    unsigned: indicates that the corresponding posting was marked by either a registered or unregistered user using the Unsigned or Help template.
    user_contribution: indicates that the corresponding posting was marked using a [[Special:Contributions/IP]] link (e.g. by an unregistered user).
    special_contribution (added 2019-06-14; actually the same as "user_contribution"): indicates that the corresponding posting was marked using a [[Special:Contributions/IP]] link (e.g. by an unregistered user).
This spec fragment is referred to by specgroup-text-structure-changes.
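Because signed joins model.floatP.cmc, it can appear inside a sentence of a talk-page posting. A minimal sketch, with an invented user name and timestamp:

```xml
<!-- Hypothetical Wikipedia talk contribution with an inline user signature. -->
<s>Ich stimme dem Vorschlag zu. <signed type="signed">Benutzer:Beispiel 12:34, 1. Jan. 2010 (CET)</signed></s>
```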
-
text: contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus sample.
-
titlePage (title page): contains the title page of a text, appearing within the front or back matter.
-
titlePart: contains a subsection or division of the title of a work, as indicated on a title page.
2 Optional TEI modules
This section lists the optional TEI modules incorporated in whole or in part into the IDS/XCES vocabulary.
This spec fragment is referred to by top-level schema fragment ids_v2a.
2.1 The analysis module
Elements from the analysis module:
-
s (s-unit): contains a sentence-like division of a text. CES and the existing IDS/XCES DTD define this using the parameter entity phrase.seq, but this relies on a different meaning for the phrase class than is present in TEI P5.
Spec fragment redefine_s
Change element s: change contents to (#PCDATA | %model.phrase; | %model.global; | q | list | stage | %model.floatP.cmc; | quote | poem)*
Attributes:
  broken: indicates whether this sentence is broken between two or more s elements (linked using the next and prev attributes). Values:
    y: sentence is represented by multiple s elements.
    n: sentence is represented by a single s element.
    unspecified: no claim is made.
  This attribute appears not to be in use in the IDS samples.
This spec fragment is referred to by specgroup-analysis.
-
w (word): represents a grammatical (not necessarily orthographic) word.
Spec fragment add_w_to_token_class
Change element w:
Classes: (add) model.token
Attributes:
  pos: contains a POS value. Type: #PCDATA
  orig (original): gives the original string or is the empty string when the element does not appear in the source text. Type: #PCDATA
  head (add this attribute): head indicator as in CoNLL-U. Type: teidata.count
  deprel (add this attribute). Type: #PCDATA
  msd (add this attribute). Type: #PCDATA
  join: when present, it provides information on whether the token in question is adjacent to another, and if so, on which side. The definition of this attribute is adapted from ISO MAF (Morpho-syntactic Annotation Framework), ISO 24611:2012. Values (closed list):
    no: The token is not adjacent to another.
    left: There is no whitespace on the left side of the token.
    right: There is no whitespace on the right side of the token.
    both: There is no whitespace on either side of the token.
    overlap: The token overlaps with another; other devices (specifying the extent and the area of overlap) are needed to more precisely locate this token in the character stream.
This spec fragment is referred to by specgroup-analysis.
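A short token sequence using these attributes might be encoded as follows. The STTS-style pos values and the dependency labels are illustrative assumptions; the join values record that the word and the punctuation mark are adjacent with no whitespace between them.

```xml
<!-- Hypothetical tokenized sentence "Hallo!" with CoNLL-U-style head indices. -->
<s><w pos="ITJ" head="0" deprel="root" join="right">Hallo</w><w pos="$." head="1" deprel="punct" join="left">!</w></s>
```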
The following elements are deleted from the module analysis:
c, cl, interp, interpGrp, m, pc, phr, span, spanGrp.
This spec fragment is referred to by optional-modules.
2.2 The corpus module
Elements from the corpus module:
-
particDesc (participation description): describes the identifiable speakers, voices, or other participants in any kind of text.
-
textDesc (text description): provides a description of a text in terms of its situational parameters.
Spec fragment redefine_textDesc
Change element textDesc: change contents to (textType?, textTypeRef?, textTypeArt*, textDomain?, column?)
This spec fragment is referred to by specgroup-corpus.
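An instance following the redefined textDesc content model could look like the sketch below. The values are invented, and the sketch assumes that the child elements hold plain character data; all children are optional, but their order is fixed by the content model.

```xml
<!-- Hypothetical text description; child order follows the content model. -->
<textDesc>
  <textType>Zeitung</textType>
  <textDomain>Politik</textDomain>
  <column>Leitartikel</column>
</textDesc>
```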
The following elements are deleted from the module corpus:
channel, constitution, derivation, domain, factuality, interaction, locale, preparedness, purpose.
This spec fragment is referred to by optional-modules.
2.3 The figures module
Elements from the figures module:
-
cell: contains one cell of a table.
-
figDesc (description of figure): contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it.
-
row: contains one row of a table.
-
table: contains text displayed in tabular form, in rows and columns.
Elements of the figures module that are redefined:
-
figure: groups elements representing or containing graphic information such as an illustration or figure. IDS/XCES adds ptr to the content model.
Spec fragment redefine_figure
Change element figure: change contents to (%model.headLike; | ptr | %model.common; | figDesc | %model.graphicLike; | %model.global; | %model.divBottomPart; | %model.divWrapper; | %model.descLike;)*
Example: ToDo
This spec fragment is referred to by specgroup-figures.
Figure is redefined.
This spec fragment is referred to by optional-modules.
2.4 The spoken module
As a conservative extension, the I5 vocabulary now includes the element u from the spoken module, so that transcripts of spoken language can be included in DeReKo.
The following elements are deleted from the module spoken:
annotationBlock, broadcast, equipment, incident, kinesic, pause, recording, recordingStmt, scriptStmt, shift, transcriptionDesc, vocal, writing.
This spec fragment is referred to by optional-modules.
2.5 The namesdates module
As a conservative extension, the I5 vocabulary now includes 11 elements from the namesdates module. The remaining elements of the module could be included as well, but are suppressed because they do not seem to be needed for now.
The following elements are deleted from the module namesdates:
addName, affiliation, age, birth, bloc, climate, death, district, education, event, faith, floruit, genName, geo, geogFeat, langKnowledge, langKnown, listEvent, listNym, listPlace, listRelation, location, nameLink, nationality, nym, occupation, offset, org, personGrp, place, population, relation, relationGrp, residence, roleName, settlement, sex, socecStatus, state, terrain, trait.
This spec fragment is referred to by optional-modules.
2.6 The linking module
In DeReKo, the elements ref, ptr, and xptr are used for linking. ref is already included in I5 through the core module.
The elements xref and xptr were declared in the linking module in TEI P3 and P4, but they are no longer part of TEI P5. From the TEI P5 linking module, only the elements timeline (with when) and seg are taken. They are needed for the encoding of CMC documents.
The following specification fragment includes the linking module and deletes all elements except seg, timeline, and when.
The following elements are deleted from the module linking:
ab, anchor, alt, altGrp, join, joinGrp, link, linkGrp.
Choosing the linking module automatically includes the linking attributes corresp, synch, sameAs, copyOf, next, prev, exclude, and select. All linking attributes are also part of att.global, and thus can appear almost anywhere.
This spec fragment is referred to by optional-modules.
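One compact way to express such a selection in ODD is the include attribute of moduleRef, shown in the sketch below. Whether the I5 source uses this form or deletes each unwanted element with its own elementSpec is not stated here, so treat the form as an assumption.

```xml
<!-- Sketch: load the linking module but keep only seg, timeline, and when. -->
<moduleRef xmlns="http://www.tei-c.org/ns/1.0"
           key="linking" include="seg timeline when"/>
```

The linking attribute class is supplied by the module itself, so the attributes remain available even though most of the module's elements are dropped.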
2.7 TEI modules not included
-
The certainty module (for recording points of uncertainty and dispute).
-
The dictionaries module (for print or electronic dictionaries).
-
The drama module. (N.B. the caption element of TEI P5 which is included in this module has nothing to do with the caption element introduced by CES as an extension of TEI, and retained by IDS/XCES.)
-
The gaiji module (for extending the Unicode / ISO 10646 universal character set).
-
The fs module (for the representation of feature structures for linguistic or other analysis).
-
The msdescription module (for description of manuscript materials).
-
The nets module (for representation of graphs, networks, and trees).
-
The tagdocs module (for documentation of XML vocabularies).
-
The textcrit module (for the representation of text-critical apparatus as used in scholarly editions).
-
The transcr module (for markup of transcriptions of original source material).
-
The verse module (for markup of metrical phenomena in verse).
3 Models from cmc-core
The module cmc, the classes model.floatP.cmc and att.global.cmc
model.floatP.cmc.
signed is added to this class (see the definition of signed) to allow it to occur freely and multiply within the elements that have model.floatP.cmc as part of their content model. In the original TEI, signed is p-like, i.e. restricted to occur between paragraphs only, reflecting the more rigid structure of written letters. This extension is needed to mark user signatures in Wikipedia talk pages, which occur within p, s, head, hi, and item.
att.global.cmc.
Attribute: creation.
This class provides the additional global attribute creation for associating information about how the element content was created in a CMC environment.
Values: human, template, system, bot, unspecified.
automatic system message in chat: user moves on to another chatroom
automatic system message in chat: user enters a chatroom
automatic system message in chat: user changes his font color
An automatic signature of user including an automatic timestamp (Wikipedia discussion, anonymized). The specification of creation at the inner element signed is meant to override the specification at the outer element post. This is generally possible when the outer creation value is "human".
Usenet news message: a client-generated line that introduces a quotation from a previous message (similar to email):
Wikipedia talk page, user signature
This spec fragment is referred to by top-level schema fragment ids_v2a.
4 Elements added by IDS
- elements taken over without change from CES and XCES
- elements taken over from TEI P3 which are no longer present in P5
- elements added by IDS
- elements defined by DERIK
- elements defined by TEI Correspondence SIG
This spec fragment is referred to by top-level schema fragment ids_v2a.
4.1 Elements taken over without change from CES and XCES
annotation: provides information about one external annotation document associated with the text. (Not present in samples.)
Spec fragment ids-header-annotation
<annotation> element (new): provides information about one external annotation document associated with the text.
Classes: att.global
Contains: character data
Attributes:
  type: indicates the type of annotation. Type: #PCDATA. Values (open list):
    segment: annotation file contains segmentation into words and sentences.
    gram: annotation file contains morpho-syntactic category information for the words in the text.
    align: annotation file contains alignment links to a parallel translation.
  ann.loc: provides information (path/file name, URL, etc.) about the location of the annotation file. Type: #PCDATA
  trans.loc: for annotation file containing alignment information, provides information (path/file name, URL, etc.) about the location of the file containing the aligned text. Type: #PCDATA
This spec fragment is referred to by specgroup-XCES-unchanged.
annotations (in file ids.xheader.elt): child of profileDesc. Groups annotation elements. (Not present in samples.)
Spec fragment ids-header-annotations
<annotations> element (new): groups information about annotation documents associated with the text.
Classes: att.global
Contains: annotation+
This spec fragment is referred to by specgroup-XCES-unchanged.
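Since these elements do not occur in the samples, the following sketch is constructed purely from the declarations for annotations and annotation. The file paths and descriptive text are hypothetical.

```xml
<!-- Hypothetical annotation inventory; file locations are made up. -->
<annotations>
  <annotation type="gram" ann.loc="annotations/example-gram.xml">Morpho-syntactic annotation</annotation>
  <annotation type="align" ann.loc="annotations/example-align.xml"
              trans.loc="translations/example-en.xml">Alignment with an English translation</annotation>
</annotations>
```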
biblNote: a descriptive note supplying additional information of any kind relating to a bibliographic item described within a corpus or text header.
Spec fragment ids-header-biblNote
<biblNote> element (new): child of analytic and monogr. #PCDATA, but otherwise roughly equivalent to TEI note.
Classes: att.global
Contains: character data
Example: <biblNote n="1">Die Datengrundlage der Tagebücher selbst (17. Juni bis 31. Dezember 1945) bildet: Klemperer, Victor: So sitze ich denn zwischen allen Stühlen, Bd. 1, Tagebücher 1945-1949, Hrsg.: Nowojski, Walter; unter Mitarbeit von Christian Löser. - Berlin: Aufbau-Verlag</biblNote>
Example: <biblNote n="1">ID:5FDC73, 2007.01.01 10:11</biblNote>
This spec fragment is referred to by specgroup-XCES-unchanged.
byteCount: contains the count of bytes in the file containing the text together with its markup. (Not present in samples.)
Spec fragment ids-header-byteCount
<byteCount> element (new): child of extent; #PCDATA.
Classes: att.global
Contains: character data
Attributes:
  units. Default value: kb. Values: bytes, kb, mb, gb
This spec fragment is referred to by specgroup-XCES-unchanged.
changeDate: gives the date of a change (as child of change). (Not present in samples.)
Spec fragment ids-header-changeDate
<changeDate> element (new): child of change; #PCDATA; context-dependent specialization of TEI date.
Classes: att.global
Contains: character data
Attributes:
  value. Type: an XSD date value
This spec fragment is referred to by specgroup-XCES-unchanged.
conformance: provides the CES level of conformance for the text or corpus. (Not present in samples.)
Spec fragment ids-header-conformance
<conformance> element (new): child of editorialDecl; #PCDATA plus level attribute.
Classes: att.global
Contains: character data
Attributes:
  level. Default value: 0. Values: 0, 1, 2, 3
This spec fragment is referred to by specgroup-XCES-unchanged.
eAddress: gives an electronic address of the person or institution who distributes the text or corpus. Note that more than one occurrence of this tag can appear, so that multiple addresses (possibly of different types) can be included. (Not present in samples.)
Spec fragment ids-header-eAddress
<eAddress> element (new): child of publicationStmt, provides electronic address of distributor; #PCDATA.
Classes: att.global
Contains: character data
Attributes:
  type. Type: an XSD string value. Default value: email
This spec fragment is referred to by specgroup-XCES-unchanged.
extNote: a descriptive note supplying additional information of any kind relating to the extent information provided within a corpus or text header. (Not present in samples.)
Spec fragment ids-header-extNote
<extNote> element (new): child of extent; provides additional information about extent of document. #PCDATA.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
fax: gives the fax number of the person or institution who distributes the text or corpus, in a format conformant to ITU-T/CCITT Recommendation E.123. (Not present in samples.)
Spec fragment ids-header-fax
<fax> element (new): child of publicationStmt, provides fax number of distributor in CCITT E.123 form; #PCDATA.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
h.author: in a bibliographic reference, contains the name of an author (personal or corporate) of a work; a context-specific renaming of the TEI author element. CES specifies that names should be given in a canonical form, with surnames preceding forenames, but IDS practice is not consistent in this regard.
h.bibl: character data only, suitable for very simple citations.
Spec fragment ids-header-h.bibl
<h.bibl> element (new): child of taxonomy; #PCDATA (sic).
Classes: att.global
Contains: character data
Example:
<taxonomy id="topic">
  <h.bibl>Thementaxonomie (siehe http://www.ids-mannheim.de/kl/projekte/methoden/te.html)</h.bibl>
  <category id="topic.fiktion">
    <catDesc>Fiktion</catDesc>
    <category id="topic.fiktion.vermischtes">
      <catDesc>Fiktion:Vermischtes</catDesc>
    </category>
  </category>
  ...
</taxonomy>
This spec fragment is referred to by specgroup-XCES-unchanged.
h.item: (as child of change element) specifies the nature of the change(s). One or more occurrences of this element may appear within each change element. Context-dependent renaming of standard TEI item. (Not present in samples.)
Spec fragment ids-header-h.item
<h.item> element (new): child of change; context-dependent renaming of standard TEI item.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
h.keywords: contains a list of keywords or phrases identifying the topic or nature of a text, each of which is tagged as a term. (Renaming of TEI keywords, plus modified content model.)
Spec fragment ids-header-h.keywords
<h.keywords> element (new): (in file ids.xheader.elt): child of textClass. Contains a list of keywords or phrases identifying the topic or nature of a text, each of which is tagged as a term. (Renaming of TEI keywords, plus modified content model.)
Classes: att.global
Contains: keyTerm+
Example: <h.keywords> <keyTerm>Bau/Leiharbeit</keyTerm> </h.keywords>
This spec fragment is referred to by specgroup-XCES-unchanged.
h.title: the title of the electronic file, including alternative titles or subtitles. Context-specific renaming of TEI title.
Spec fragment ids-header-h.title
<h.title> element (new): child of analytic and monogr; context-specific renaming of TEI title; #PCDATA.
Classes: att.global
Contains: character data
Attributes:
  type. Type: an XSD NMTOKEN value. Default value: main. Values: main, sub, abbr
  level. Type: an XSD NMTOKEN value. Values: m, a
Example: <h.title type="main">IKB verschiebt Halbjahresbericht erneut</h.title>
This spec fragment is referred to by specgroup-XCES-unchanged.
keyTerm (in file ids.xheader.elt): child of h.keywords, encloses one keyword term describing the text. Context-specific renaming of standard TEI element term.
Spec fragment ids-header-keyTerm
<keyTerm> element (new): child of h.keywords, encloses one keyword term describing the text. Context-specific renaming of standard TEI element term.
Classes: att.global
Contains: character data
Attributes:
  type: indicates the type of keyTerm (person, country). Type: #PCDATA
  subtype: indicates the subtype of keyTerm. Type: #PCDATA
This spec fragment is referred to by specgroup-XCES-unchanged.
pubAddress (in file ids.xheader.elt): child of publicationStmt, provides address of distributor. Context-specific specialization of TEI address element; #PCDATA.
Spec fragment ids-header-pubAddress
<pubAddress> element (new): child of publicationStmt, provides address of distributor. Context-specific specialization of TEI address element; #PCDATA.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
pubDate (in file ids.xheader.elt): child of publicationStmt, provides date of publication. Context-specific specialization of TEI date element; #PCDATA.
Spec fragment ids-header-pubDate
<pubDate> element (new): child of publicationStmt, provides date of publication. Context-specific specialization of TEI date element; #PCDATA.
Classes: att.global, att.datable.iso
Contains: character data
Attributes:
  type. Type: an XSD NMTOKEN value. Values: year, month, day, time
This spec fragment is referred to by specgroup-XCES-unchanged.
respName (in file ids.xheader.elt): child of respStmt (where it is a context-dependent renaming of name) and change. Contains #PCDATA only.
Spec fragment ids-header-respName
<respName> element (new): child of respStmt (where it is a context-dependent renaming of name) and change. Contains #PCDATA only.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
respType (in file ids.xheader.elt): child of respStmt; context-specific renaming of standard TEI resp.
Spec fragment ids-header-respType
<respType> element (new): child of respStmt; context-specific renaming of standard TEI resp.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
telephone (in file ids.xheader.elt): child of publicationStmt, provides telephone number of distributor in CCITT E.123 form; #PCDATA.
Spec fragment ids-header-telephone
<telephone> element (new): child of publicationStmt, provides telephone number of distributor in CCITT E.123 form; #PCDATA.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
transduction: (as child of editorialDecl) describes the principles according to which the text has been transduced, either in transcribing it from audio tape to written form, or in converting from an electronic original.
Spec fragment ids-header-transduction
<transduction> element (new): child of editorialDecl; #PCDATA plus level attribute.
Classes: att.header, att.declarable
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
translation (in file ids.xheader.elt): child of translations. Gives information about one translation of the text.
Spec fragment ids-header-translation
<translation> element (new): child of translations. Gives information about one translation of the text.
Classes: att.global
Contains: character data
Attributes:
  trans.loc. Type: an XSD string value
This spec fragment is referred to by specgroup-XCES-unchanged.
translations (in file ids.xheader.elt): child of profileDesc; groups translation elements.
Spec fragment ids-header-translations
<translations> element (new): child of profileDesc; groups translation elements.
Classes: att.global
Contains: (translation, translator?)+
This spec fragment is referred to by specgroup-XCES-unchanged.
translator (in file ids.xheader.elt): identifies the translator responsible for one translation.
Spec fragment ids-header-translator
<translator> element (new): identifies the translator responsible for one translation.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
wordCount: (as child of extent) contains the count of words in the text. (Not present in samples.)
Spec fragment ids-header-wordCount
<wordCount> element (new): child of extent; #PCDATA.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
writingSystem (in file ids.xheader.elt): child of wsdUsage; describes one character set used in the document; can point to an external writing system declaration. (This element appears to be a survival from the SGML version of CES; in XML, character set issues are typically handled at a different level.) (Not present in samples.)
Spec fragment ids-header-writingSystem
<writingSystem> element (new): child of wsdUsage; describes one character set used in the document; can point to an external writing system declaration.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-XCES-unchanged.
wsdUsage: groups information describing the character set(s) used within a text. (Not present in samples.)
Spec fragment ids-header-wsdUsage
<wsdUsage> element (new): child of profileDesc; groups writingSystem elements.
Classes: att.global
Contains: (translation, translator?)+
This spec fragment is referred to by specgroup-XCES-unchanged.
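As an illustration of the content model given above for translations, (translation, translator?)+, the following fragment is a minimal sketch. The descriptions and the translator name are invented for illustration only, and the trans.loc attribute is omitted because this section specifies only its datatype, not its intended values.

```xml
<!-- Hypothetical instance: each translation may be followed by one translator -->
<translations>
  <translation>English translation (invented example)</translation>
  <translator>Jane Doe</translator>
  <!-- a translation without a translator is also permitted by the model -->
  <translation>French translation (invented example)</translation>
</translations>
```

Note that a translator element can only appear directly after a translation, so the pairing of translator and translation is expressed purely by document order.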
att.header
.
att.text
.
string
value
model.token
.
This spec fragment is referred to by specgroup-XCES-unchanged.
CES moves the rend attribute from the list of global attributes to the text class; for now, we follow TEI here.
This spec fragment is referred to by IDS-additions.
4.2 Elements defined by (X)CES and modified by IDS/XCES
caption: (1) a heading, title etc. attached to a picture or diagram; (2) a "pull quote" or other text about or extracted from a text and superimposed upon it to draw attention to it. This element was added by CES; it conflicts (possibly unintentionally) with a different element also named caption, defined by TEI P3 for text displayed in a film (or text in a screenplay intended for such display). IDS/XCES modifies the CES version by allowing IDS milestones in the content.
poem (in file ids.xesdoc.dtd): contains a poem, or an extract from a poem, appearing within or between paragraphs; an inter-level element. IDS changes CES's definition to allow milestone elements among the children.
Spec fragment ids-doc-poem
<poem> element (new): contains a poem appearing within or between paragraphs; an inter-level element.
Classes: att.text, model.inter
Contains: (head?, (lg | l | %model.ids.milestones;)+)
Example:
<p> ...
  <s>Denn auch für diesen hilflosen, aber unbestechlichen Chronisten der dunkelsten deutschen Jahre erweisen die Verse Brechts ihre Gültigkeit:</s>
</p>
<poem>
  <lg part="u">
    <l part="u"> <s>[...]</s> </l>
  </lg>
  <lg part="u">
    <l part="u"> <s>Ihr, die ihr auftauchen werdet aus der Flut</s> </l>
    <l part="u"> <s>In der wir untergegangen sind</s> </l>
    ...
  </lg>
  ...
</poem>
This spec fragment is referred to by specgroup-XCES-changed.
base.seq parameter entity in such a way that it becomes not a sequence but an element class. To try to avoid confusion, it is here renamed basic.
model.basic.
model.ids.milestones.
This spec fragment is referred to by specgroup-XCES-changed.
This spec fragment is referred to by IDS-additions.
4.3 Renamings of TEI elements
Several IDS/XCES elements are essentially renamings (or context-dependent renamings) of TEI elements.
idsCorpus (in file ids.xesdoc.dtd): renaming of teiCorpus, with a slight difference in content model.
idsHeader (in file ids.xheader.elt): renaming of teiHeader.
idsDoc (in file ids.xesdoc.dtd): intermediate level between idsText and idsCorpus; conceptually similar to the TEI element, and structurally similar to teiCorpus, but (just for that reason) cannot be declared as a renaming of either. HLU: For the time being, idsDoc is not put in the namespace http://www.ids-mannheim.de/i5, so that in an i5 document it will work like idsCorpus, idsText and idsHeader. (The latter are technical renamings of original TEI elements, using altIdent, and therefore cannot be put in a namespace other than the TEI namespace.)
Spec fragment ids-doc-idsDoc
<idsDoc> element (new): contains a single document within an IDS corpus; may contain one or several texts.
Classes: att.global
Contains: (idsHeader, idsText+)
Attributes:
type: an XSD string value; default value: text
version: #PCDATA
TEIform: an XSD NMTOKEN value; default value: TEI.2
This spec fragment is referred to by specgroup-renamings.
idsText (in file ids.xesdoc.dtd): renaming of the TEI element.
This spec fragment is referred to by IDS-additions.
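The nesting implied by these renamings can be sketched as follows. The idsDoc content model (idsHeader, idsText+) and the attribute values shown are taken from this document; the arrangement inside idsCorpus and idsText is an assumption made by analogy with teiCorpus and TEI, and has not been checked against the generated schema.

```xml
<!-- Skeleton only: header and text content are elided as comments -->
<idsCorpus>
  <idsHeader><!-- corpus-level header (korpusSigle, c.title, ...) --></idsHeader>
  <idsDoc type="text" version="1.0" TEIform="TEI.2">
    <idsHeader><!-- document-level header (dokumentSigle, d.title, ...) --></idsHeader>
    <idsText>
      <idsHeader><!-- text-level header (textSigle, t.title, ...) --></idsHeader>
      <text><!-- front, body, back --></text>
    </idsText>
    <!-- further idsText elements may follow -->
  </idsDoc>
  <!-- further idsDoc elements may follow (assumed) -->
</idsCorpus>
```

The three header levels correspond to the three kinds of sigle and title elements (corpus, document, text) described in section 4.5.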
4.4 TEI P3 elements no longer in P5
Some elements and attributes used by IDS/XCES were taken over from TEI P3, but are no longer present in TEI P5: xptr, xref, dateRange, and timeRange. xptr appears in the samples, so for now that is the only one we define.
<xptr targType = "pb" targOrder = "u" doc = "korpref.bio" from = "TK1.00018-5-PB5" to = "DITTO" TEIform = "xptr"/>
This spec fragment is referred to by specgroup-TEI-P3.
id attribute from the global class, and the targOrder attribute from the att.pointing class; these must be restored.
att.global.
att.pointing.
targOrder: where more than one identifier is supplied as the value of the target attribute, this attribute specifies whether the order in which they are supplied is significant. Values:
y: the order in which IDREF values are specified as the value of a target attribute should be followed when combining the targeted elements.
n: the order in which IDREF values are specified as the value of a target attribute has no significance when combining the targeted elements.
u: the order in which IDREF values are specified as the value of a target attribute may or may not be significant.
targType: if this attribute is supplied, every element specified as a target must be of one or other of the types specified. An application may choose whether or not to report failures to satisfy this constraint as errors, but may not access an element of the right identifier but the wrong type.
att.xPointer.
doc (ENTITY value): in principle, the value of this attribute is supposed by TEI P3 to be the name of an external entity declared in the DTD (often in the internal DTD subset); in practice, in IDS documents it appears to be a relative reference to a file, in the form of a filename.
from (string value): in principle, the value of this attribute is supposed by TEI P3 to be a TEI extended pointer; in practice, in IDS documents it appears to be an ID in the document indicated by the doc attribute.
to (string value): in principle, the value of this attribute is supposed by TEI P3 to be a TEI extended pointer; in practice, in IDS documents it appears always to be the default value, DITTO.
This spec fragment is referred to by IDS-additions.
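Under the practical reading described above (doc as a filename, from as an ID inside that file, targType as the expected element type), the xptr shown in the sample would resolve roughly as sketched below. The content of korpref.bio is hypothetical: only the file name, the ID, and the target type pb come from the sample.

```xml
<!-- In the pointing document (taken from the sample above) -->
<xptr targType="pb" targOrder="u" doc="korpref.bio"
      from="TK1.00018-5-PB5" to="DITTO" TEIform="xptr"/>

<!-- In the file korpref.bio (hypothetical content): the element whose id
     matches the from attribute, of the type named by targType -->
<pb id="TK1.00018-5-PB5"/>
```

Because to carries only the default value DITTO, the pointer addresses a single location rather than a span.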
4.5 Elements and attributes added by IDS
appearance: ‘physical appearance’ of the source (BOT+e).
Spec fragment ids-header-appearance
<appearance> element (new): a child of edition.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-IDS-specific.
c.title: corpus title; a context-specific specialization of TEI title.
Spec fragment ids-header-c.title
<c.title> element (new): child of titleStmt. #PCDATA only, otherwise a context-specific specialization of TEI title.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-IDS-specific.
column: original label of the newspaper column/section as in the source (BOT+ress).
Spec fragment ids-header-column
<column> element (new): child of textDesc. #PCDATA only.
Classes: att.global
Contains: character data
Example:
<textDesc>
  <textTypeArt>Bericht</textTypeArt>
  <textDomain/>
  <column>TB-KLN2 (Abk.)</column>
</textDesc>
This spec fragment is referred to by specgroup-IDS-specific.
creatDate: time of creation.
Spec fragment ids-header-creatDate
<creatDate> element (new): child of creation (in profileDesc). #PCDATA only.
Classes: att.global, att.responsibility
Contains: character data
Example:
<creation>
  <creatDate>2001.01.13</creatDate>
</creation>
This spec fragment is referred to by specgroup-IDS-specific.
creatRef: reference to (creation of text and) first edition.
Spec fragment ids-header-creatRef
<creatRef> element (new): child of creation (in profileDesc). #PCDATA only.
Classes: att.global, att.responsibility
Contains: character data
Example:
<creation>
  <creatDate>1998</creatDate>
  <creatRef>(Erstveröffentlichung: Oberhausen, 1998)</creatRef>
  <creatRefShort>(Erstv. 1998)</creatRefShort>
</creation>
This spec fragment is referred to by specgroup-IDS-specific.
creatRefShort: short version of reference to (creation of text and) first edition.
Spec fragment ids-header-creatRefShort
<creatRefShort> element (new): child of creation (in profileDesc). #PCDATA only.
Classes: att.global, att.responsibility
Contains: character data
Example:
<creation>
  <creatDate>1959</creatDate>
  <creatRef>(Erstveröffentlichung: Frankfurt a.M., 1959)</creatRef>
  <creatRefShort>(Erstv. 1959)</creatRefShort>
</creation>
This spec fragment is referred to by specgroup-IDS-specific.
d.title: document title; a context-specific specialization of TEI title.
Spec fragment ids-header-d.title
<d.title> element (new): child of titleStmt. #PCDATA only, otherwise a context-specific specialization of TEI title.
Classes: att.global, att.responsibility
Contains: character data
This spec fragment is referred to by specgroup-IDS-specific.
dokumentSigle: document ID (formerly BOTD).
Spec fragment ids-header-dokumentSigle
<dokumentSigle> element (new): child of titleStmt. #PCDATA only.
Classes: att.global
Contains: character data
Example:
<titleStmt>
  <dokumentSigle>A01/AUG</dokumentSigle>
  <d.title>St. Galler Tagblatt, August 2001</d.title>
</titleStmt>
This spec fragment is referred to by specgroup-IDS-specific.
further: further edition of the same source with year (BOT+gg).
Spec fragment ids-header-further
<further> element (new): child of edition. #PCDATA only.
Classes: att.global, att.responsibility
Contains: character data
Example:
<edition>
  <further>5. Auflage 1998 (1. Auflage 1997)</further>
  <kind/>
  <appearance/>
</edition>
This spec fragment is referred to by specgroup-IDS-specific.
kind: kind of edition of the source (BOT+g).
Spec fragment ids-header-kind
<kind> element (new): child of edition. #PCDATA only.
Classes: att.global
Contains: character data
Example:
<monogr>
  <h.title type="main">Im Gegenteil</h.title>
  <h.title type="sub">Kolumnen 1986-1990</h.title>
  <h.title type="abbr" level="m">Bichsel: Im Gegenteil</h.title>
  <h.author>Bichsel, Peter</h.author>
  <editor/>
  <edition>
    <further/>
    <kind>suhrkamp taschenbuch</kind>
    <appearance/>
  </edition>
  ...
</monogr>
This spec fragment is referred to by specgroup-IDS-specific.
korpusSigle: corpus ID (formerly BOTC).
Spec fragment ids-header-korpusSigle
<korpusSigle> element (new): child of titleStmt. #PCDATA only.
Classes: att.global
Contains: character data
Example:
<titleStmt>
  <korpusSigle>A01</korpusSigle>
  <c.title>St. Galler Tagblatt 2001</c.title>
</titleStmt>
This spec fragment is referred to by specgroup-IDS-specific.
numRange (in file ids.xesdoc.dtd): member of the token class, modeled on timeRange and dateRange. (Not present in samples.)
Spec fragment ids-doc-numRange
<numRange> element (new): a range of numbers.
Classes: att.text, model.token, model.basic
Contains: %model.basic;*
Attributes:
from: an XSD string value; values: yes, no
to: an XSD string value; values: yes, no
type: an XSD string value
This spec fragment is referred to by specgroup-IDS-specific.
pagination: whether page numbering is present or not (processing info; formerly BOTP).
Spec fragment ids-header-pagination
<pagination> element (new): indicates whether page numbering is present or not.
Classes: att.global
Contains: character data
Attributes:
type: an XSD NMTOKEN value; values: yes, no
Example:
<editorialDecl Default="n">
  <pagination type="yes"/>
</editorialDecl>
This spec fragment is referred to by specgroup-IDS-specific.
reference: bibliographic reference string.
Spec fragment ids-header-reference
<reference> element (new): a child of sourceDesc.
Classes: att.header, att.responsibility
Contains: character data
Attributes:
type: an XSD NMTOKEN value; values: complete, super, short, former
assemblage: an XSD NMTOKEN value; values: external, regular, non-automatic
existence: an XSD NMTOKEN value; values: no, yes
origin: an XSD NMTOKEN value; values: BOTfile, notBOTfile
Example:
<reference type="complete" assemblage="regular">A01/JAN.02562 St. Galler Tagblatt, [Tageszeitung], 13.01.2001, Jg. 57. - Originalressort: TB-KLN2 (Abk.), [Bericht]</reference>
This spec fragment is referred to by specgroup-IDS-specific.
t.title: text title. A context-specific specialization of TEI title.
Spec fragment ids-header-t.title
<t.title> element (new): child of titleStmt. #PCDATA only, otherwise a context-specific specialization of TEI title.
Classes: att.global, att.responsibility
Contains: character data
Attributes:
assemblage: an XSD NMTOKEN value; values: external, regular, non-automatic
This spec fragment is referred to by specgroup-IDS-specific.
textDomain: subject area of the text (BOT+r).
Spec fragment ids-header-textDomain
<textDomain> element (new): child of textDesc. #PCDATA only.
Classes: att.global
Contains: character data
Example:
<textDesc>
  <textTypeArt/>
  <textDomain>Wissenschaft</textDomain>
  <column>Wissenschaft</column>
</textDesc>
This spec fragment is referred to by specgroup-IDS-specific.
textSigle: text ID (formerly BOTT).
Spec fragment ids-header-textSigle
<textSigle> element (new): child of titleStmt. #PCDATA only.
Classes: att.global
Contains: character data
Example:
<titleStmt>
  <textSigle>A01/JAN.02562</textSigle>
  <t.title assemblage="regular">A01/JAN.02562 St. Galler Tagblatt, 13.01.2001, Ressort: TB-KLN2 (Abk.)</t.title>
</titleStmt>
This spec fragment is referred to by specgroup-IDS-specific.
textType: text type according to type inventory (BOT+x).
Spec fragment ids-header-textType
<textType> element (new): child of textDesc. #PCDATA only.
Classes: att.header
Contains: character data
Example:
<textType>Zeitung: Tageszeitung</textType>
<textType>Ausgabenvermerk</textType>
<textType>Anmerkung</textType>
This spec fragment is referred to by specgroup-IDS-specific.
textTypeArt: text type of a specific article (BOT+xa).
Spec fragment ids-header-textTypeArt
<textTypeArt> element (new): child of textDesc. #PCDATA only.
Classes: att.global
Contains: character data
Example:
<textDesc>
  <textTypeArt>Bericht</textTypeArt>
  <textDomain/>
  <column>TB-KLN2 (Abk.)</column>
</textDesc>
This spec fragment is referred to by specgroup-IDS-specific.
textTypeRef: text type as it should appear in the bibliographic string (BOT+X).
Spec fragment ids-header-textTypeRef
<textTypeRef> element (new): child of textDesc. #PCDATA only.
Classes: att.global
Contains: character data
Example:
<textDesc>
  <textType>Zeitschrift: Wochenzeitschrift</textType>
  <textTypeRef>Wochenzeitschrift</textTypeRef>
  <textTypeArt>Interview</textTypeArt>
  <textDomain>Gesellschaft</textDomain>
  <column/>
</textDesc>
Example:
<textDesc>
  <textType>Zeitung: Tageszeitung</textType>
  <textTypeRef>Tageszeitung</textTypeRef>
</textDesc>
This spec fragment is referred to by specgroup-IDS-specific.
x.title (in file ids.xheader.elt): child of titleStmt; title of some object which is not a corpus (which would use c.title), not a document in the IDS-specific sense (which would use d.title), and not a text in the IDS sense (which would use t.title). Contains #PCDATA.
Spec fragment ids-header-x.title
<x.title> element (new): child of titleStmt; title of some object which is not a corpus (which would use c.title), not a document in the IDS-specific sense (which would use d.title), and not a text in the IDS sense (which would use t.title). Contains #PCDATA.
Classes: att.global
Contains: character data
This spec fragment is referred to by specgroup-IDS-specific.
This spec fragment is referred to by IDS-additions.
4.6 Module, classes and elements from the TEI CMC SIG proposals "Beißwenger/Ermakova/Geyken/Lemnitzer/Storrer (2013): An XML Schema for the Representation of CMC Genres in TEI"
posting:
Spec fragment posting
<posting> element (new): describes a stretch of text that an individual user has produced in private and then passed on to the server through performing a "posting" action (usually by hitting the [ENTER] key on the keyboard or by clicking on a [SEND] or [SUBMIT] button on the screen). Postings are the largest structural units in CMC documents that can be assigned to one author and one point in time. Their function is to make a (written) contribution to the ongoing dialogue.
Classes: model.divLike, att.datable, att.global, att.typed, att.ascribed
Contains: (#PCDATA | %model.headLike; | opener | %model.pLike; | %model.gLike; | %model.phrase; | %model.inter; | %model.global; | lg | %model.lLike; | %model.divBottom;)*
Attributes:
indentLevel (add this attribute): marks the (relative) level of indentation of the respective posting (as defined by its author and in relation to the standard level of indentation, which is described as "0"). Type: data.count
This spec fragment is referred to by specgroup-DERIK.
autoSignature:
Spec fragment autoSignature
<autoSignature> element (new): is an empty element used for representing the position of the user signature in a posting.
Classes: model.pPart.edit, att.pointing, att.global
Contains: (#PCDATA | s | timestamp)*
Attributes:
type, values (closed list):
signed: indicates that the corresponding posting was explicitly signed by a registered user using a user signature markup (e.g. ~~~~).
unsigned: indicates that the corresponding posting was marked by either a registered or unregistered user using the Unsigned or Help template.
user_contribution: indicates that the corresponding posting was marked using a [[Special:Contributions/IP]] link (e.g. by an unregistered user).
special_contribution (added 2019-06-14; this is actually the same as "user_contribution"): indicates that the corresponding posting was marked using a [[Special:Contributions/IP]] link (e.g. by an unregistered user).
This spec fragment is referred to by specgroup-DERIK.
signatureContent:
Spec fragment signatureContent
<signatureContent> element (new): is used to describe an individual user's signature in the header of the user profile. [Comment for I5.odd by HLU 2013-09-05: this element will not be available as long as there is no element using the model.persStateLike, like listPerson.]
Classes: model.persStateLike, att.global
Contains: (#PCDATA | ref | %model.hiLike; | %model.milestoneLike; | figure)*
This spec fragment is referred to by specgroup-DERIK.
emoticon:
Spec fragment emoticon
<emoticon> element (new): describes an interaction sign which is an iconic unit that has been created with the keyboard and which typically serves as an emotion or irony marker or as a responsive.
Classes: model.gLike, att.global
Contains: character data
Attributes:
style: describes the native region of an emoticon. Values (closed list): Western, Japanese, Korean, other
systemicFunction: describes the general, context-independent function of the emoticon. Values (closed list): emotionMarker:positive, emotionMarker:negative, emotionMarker:neutral, emotionMarker:unspec, virtualEvent, illocutionMarker, ironyMarker, responsive
contextFunction: describes the function of the respective instance of the emoticon in its given context. Values (closed list): emotionMarker:positive, emotionMarker:negative, emotionMarker:neutral, emotionMarker:unspec, virtualEvent, illocutionMarker, ironyMarker, responsive
topology: the position of the emoticon relative to the text to which it belongs. Values (closed list): front_position, back_position, intermediate_position, standalone
This spec fragment is referred to by specgroup-DERIK.
interactionWord:
Spec fragment interactionWord
<interactionWord> element (new): describes an interaction sign which is a symbolic linguistic unit whose morphologic construction is based on a word or a phrase and describes expressions, gestures, bodily actions, or virtual events; cf. the units sing, g (< grins, “grin”), fg (< fat grin), s (< smile), wildsei (“being wild”).
Classes: model.global.spoken, att.global
Contains: character data
Attributes:
formType: is used to describe morphological properties of the interaction word. Values (closed list): simple, complex, abbreviated
systemicFunction: describes the general, context-independent function of the interaction word. Values (closed list): emotionMarker:positive, emotionMarker:negative, emotionMarker:neutral, emotionMarker:unspec, virtualEvent, illocutionMarker, ironyMarker, responsive
contextFunction: describes the function of the respective instance of the interaction word in its given context. Values (closed list): emotionMarker:positive, emotionMarker:negative, emotionMarker:neutral, emotionMarker:unspec, virtualEvent, illocutionMarker, ironyMarker, responsive
semioticSource: is used to describe the semiotic mode that forms the basis for an interaction word. Values: mimic, gesture, bodyReaction, sound, action, process, emotion, sentiment
topology: the position of the interaction word relative to the text to which it belongs. Values (closed list): front_position, back_position, intermediate_position, standalone, openingTag, closingTag
This spec fragment is referred to by specgroup-DERIK.
interactionTerm:
Spec fragment interactionTerm
<interactionTerm> element (new): describes instances of one or several interaction signs (i.e., of emoticons, interaction words, interaction templates, and/or addressing terms).
Classes: model.phrase, att.global
Contains: (emoticon | interactionWord)*
This spec fragment is referred to by specgroup-DERIK.
timestamp:
Spec fragment timestamp
<timestamp> element (new): is an empty element used for representing the timestamp in a posting, which was automatically inserted when the user pressed a button. This element is an addition by IDS, i.e. not from the Derik ODD.
Classes: model.pPart.edit, att.pointing, att.global
Contains: character data
This spec fragment is referred to by specgroup-DERIK.
This spec fragment is referred to by IDS-additions.
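To show how the CMC elements of this section might combine, the following fragment is a sketch built only from the content models and closed value lists given above. The text, the who pointer, the timestamp string, and the choice of attribute values are invented for illustration; whether a given schema build accepts exactly this nesting inside p has not been verified here.

```xml
<!-- Hypothetical posting, indented one level relative to the standard level 0 -->
<posting indentLevel="1" who="#user1">
  <p>Das stimmt
    <interactionTerm>
      <!-- attribute values taken from the closed lists documented above -->
      <emoticon style="Western"
                systemicFunction="emotionMarker:positive"
                contextFunction="ironyMarker"
                topology="back_position">;-)</emoticon>
      <interactionWord formType="abbreviated"
                       semioticSource="mimic"
                       topology="back_position">fg</interactionWord>
    </interactionTerm>
    <!-- signature position, with the automatically inserted timestamp -->
    <autoSignature type="signed">
      <timestamp>10:15, 5. Sep. 2013 (CEST)</timestamp>
    </autoSignature>
  </p>
</posting>
```

The who attribute is assumed to be available through att.ascribed, and indentLevel takes a non-negative count as its data.count type suggests.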
4.7 Elements from the TEI Correspondence SIG proposal
Copied from the github site of the TEI Correspondence SIG, specifically from the file proposal.xml as of 2015-01-08
4.7.1 LICENSE for the correspondence Elements
Copyright (c) 2013, TEI-Correspondence-SIG
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
4.7.2 The following was copied from the file proposal.xml:
model.correspDescPart.
model.correspContextPart.
model.correspActionPart.
sending
receiving
transmitting
redirecting
forwarding
model.nameLike.
model.dateLike.
model.ptrLike.
model.pLike.
model.addressLike.
note:
This spec fragment is referred to by IDS-additions.
5 Top-level driver
6 Conformance and design issues
This section of the document records some conformance and design issues which may need attention.
- The current IDS/XCES DTD, and the DeReKo documents which conform to it, use no namespaces. Conforming customizations of TEI P5, however, are required to use the TEI namespace http://www.tei-c.org/ns/1.0 for TEI elements, and to put extensions to the vocabulary in a different namespace.
That is, if the goal of the I5 project is to create a customization of TEI P5 which (1) accepts existing DeReKo documents as valid and (2) conforms to TEI P5, then the two goals are not simultaneously satisfiable.
- One of the goals for this ODD file is to produce a schema which more or less matches the existing IDS/XCES DTD in the effective document grammar. Another is to ensure that the DeReKo documents provided as samples are valid against the document grammar defined here.
These goals prove to be incompatible: some documents in the samples are not valid against the DTD. Specifically:
  - Document DeReKo-Sample-08/mld.sample.xces contains a number of idsDoc elements with numeric IDs (for example <idsDoc id="951" type="text" version="1.0" TEIform="TEI.2">); these numeric strings are not type-valid against the ID type. This ODD file declares the id attribute as having type ID, which means that the documents just mentioned remain invalid. (The alternative of declaring it as having type NCName would render other documents invalid.)
  - Document
- One of the goals for defining I5 is to align IDS's XML vocabulary better with TEI P5. In cases where the existing IDS/XCES vocabulary deviates from TEI P5, this goal raises the question: should I5 follow TEI P5 or the existing IDS/XCES DTD?
This ODD document follows the existing DTD in all cases where the change made in the DTD is necessary to make DeReKo data (as represented by the samples available to the author) valid. In other cases, however, this ODD document follows TEI P5. In particular, if all instances of an element in the available samples are valid against the TEI P5 declaration of that element type, no change was made to the declaration. This means that, as defined here, I5 follows TEI P5 and not the existing IDS/XCES DTD in all cases where the IDS/XCES DTD restricts elements to a subset of what is allowed by the TEI's definition.
In some cases, this may make the schema defined here looser than desired.
- In the same spirit, element types taken over by the IDS/XCES DTD from TEI P3 that are not present in TEI P5 have been declared here if and only if instances of the element types are present in the samples. So xptr has been declared here, but xref, dateRange, and timeRange have not been declared.
It is easy to add declarations for these element types if they prove to be needed.
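As a rough indication of what such an added declaration could look like, here is a sketch of an ODD elementSpec for dateRange. Everything in it beyond the element name is an assumption: the description, the class memberships, the plain-text content model, and the from and to attributes (chosen by analogy with numRange, which this document says is modeled on dateRange and timeRange). It is not the declaration used by TEI P3 or by this ODD file.

```xml
<!-- Hypothetical declaration; would also need to be placed in a specGrp
     such as the one holding the xptr declaration -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             xmlns:rng="http://relaxng.org/ns/structure/1.0"
             ident="dateRange" mode="add">
  <desc>a range of dates (element taken over from TEI P3).</desc>
  <classes>
    <memberOf key="att.global"/>
    <memberOf key="model.token"/>
  </classes>
  <content>
    <!-- assumed: character data only -->
    <rng:text/>
  </content>
  <attList>
    <attDef ident="from" usage="opt">
      <desc>start of the range (assumed attribute).</desc>
      <datatype><rng:data type="string"/></datatype>
    </attDef>
    <attDef ident="to" usage="opt">
      <desc>end of the range (assumed attribute).</desc>
      <datatype><rng:data type="string"/></datatype>
    </attDef>
  </attList>
</elementSpec>
```

No ns attribute is given here, which would leave the element in the TEI namespace; the namespace issue raised in the first item of this section applies to any such addition.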
A References
B TEI elements suppressed
This appendix lists elements present in TEI P5 (and present in modules included by this ODD file) which are suppressed.
B.1 Elements suppressed from the Core module
The following elements of the core module are suppressed. Almost all of these were explicitly suppressed by CES and XCES, but some (binaryObject, choice, desc, graphic, measureGrp, and said) were not present in TEI P3 or TEI P4; they were added to TEI in TEI P5.
- add (addition): contains letters, words, or phrases inserted in the text by an author, scribe, annotator, or corrector.
- binaryObject: provides encoded binary data representing an inline graphic or other object.
- cb (column break): marks the boundary between one column of a text and the next in a standard reference system.
- choice: groups a number of alternative encodings for the same point in a text.
- del (deletion): contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector.
- desc (description): contains a brief description of the object documented by its parent element, including its intended usage, purpose, or application where this is appropriate.
- divGen (automatically generated text division): indicates the location at which a textual division generated automatically by a text-processing application is to appear.
- expan (expansion): contains the expansion of an abbreviation.
- headItem (heading for list items): contains the heading for the item or gloss column in a glossary list or similar structured list.
- headLabel (heading for list labels): contains the heading for the label or term column in a glossary list or similar structured list.
- index (index entry): marks a location to be indexed for whatever purpose.
- listBibl (citation list): contains a list of bibliographic citations of any kind.
- measureGrp (measure group): contains a group of dimensional specifications which relate to the same object, for example the height and width of a manuscript page.
- meeting: contains the formalized descriptive title for a meeting or conference, for use in a bibliographic description for an item derived from such a meeting, or as a heading or preamble to publications emanating from it.
- milestone: marks a boundary point separating any kind of section of a text, typically but not necessarily indicating a point at which some part of a standard reference system changes, where the change is not represented by a structural element.
- postBox (postal box or post office box): contains a number or other identifier for some postal delivery point other than a street address.
- postCode (postal code): contains a numerical or alphanumeric code used as part of a postal address to simplify sorting or delivery of mail.
- resp (responsibility): contains a phrase describing the nature of a person's intellectual responsibility.
- rs (referencing string): contains a general purpose name or referring string.
- said (speech or thought): indicates passages thought or spoken aloud, whether explicitly indicated in the source or not, whether directly or indirectly reported, whether by real people or fictional characters.
- series (series information): contains information about the series in which a book or other bibliographic item has appeared.
- sic (Latin for thus or so): contains text reproduced although apparently incorrect or inaccurate.
- soCalled: contains a word or phrase for which the author or narrator indicates a disclaiming of responsibility, for example by the use of scare quotes or italics.
- street: a full street address including any name or number identifying a building as well as the name of the street or route on which it is located.
- teiCorpus: contains the whole of a TEI encoded corpus, comprising a single corpus header and one or more TEI elements, each containing a single text header and a text.
- unclear: contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source.
B.2 Elements suppressed from the TEI header module
The following elements of the header module are suppressed:
-
appInfo
(application information): records information about an application which has edited the TEI file. -
application
: provides information about an application which has acted upon the document. -
authority
(release authority): supplies the name of a person or other agency responsible for making an electronic file available, other than a publisher or distributor. -
cRefPattern
(canonical reference pattern): specifies an expression and replacement pattern for transforming a canonical reference into a URI. -
geoDecl
(geographic coordinates declaration): documents the notation and the datum used for geographic coordinates expressed as content of thegeo
element elsewhere within the document. -
handNote
(note on hand): describes a particular style or hand distinguished within a manuscript. -
interpretation
: describes the scope of any analytic or interpretive information added to the text in addition to the transcription. -
namespace
: supplies the formal name of the namespace to which the elements documented by its children belong. -
notesStmt
(notes statement): collects together any notes providing information about a text additional to that recorded in other parts of the bibliographic description. -
principal
(principal researcher): supplies the name of the principal researcher responsible for the creation of an electronic text. -
refState
(reference state): specifies one component of a canonical reference defined by the milestone method. -
rendition
: supplies information about the rendition or appearance of one or more elements in the source text. -
scriptNote
: describes a particular script distinguished within the description of a manuscript or similar resource. -
seriesStmt
(series statement): groups information about the series, if any, to which a publication belongs. -
sponsor
: specifies the name of a sponsoring organization or institution. -
stdVals
(standard values): specifies the format used when standardized date or number values are supplied. -
teiHeader
(TEI Header): supplies the descriptive and declarative information making up an electronic title page prefixed to every TEI-conformant text. -
typeNote
: describes a particular font or other significant typographic feature distinguished within the description of a printed resource.
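In ODD, a suppression of this kind can be expressed by an elementSpec in delete mode for each element concerned. The fragment below is a minimal sketch of that pattern for three of the elements listed above. The xml:id value is hypothetical, the TEI namespace is assumed to be declared on an enclosing element, and the actual specGrp in the I5 ODD may be organized differently.

```xml
<!-- Sketch only: the xml:id is hypothetical and not taken from the I5 ODD. -->
<specGrp xml:id="sketch-header-deletions">
  <!-- Each elementSpec names the element, its source module, and the change mode. -->
  <elementSpec ident="appInfo" module="header" mode="delete"/>
  <elementSpec ident="application" module="header" mode="delete"/>
  <elementSpec ident="notesStmt" module="header" mode="delete"/>
</specGrp>
```

The remaining elements in the list would follow the same pattern, one elementSpec per suppressed element.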
B.3 Elements suppressed from TEI text structure module
-
argument
: a formal list or prose description of the topics addressed by a subdivision of a text. -
div1
(level-1 text division): contains a first-level subdivision of the front, body, or back of a text. -
div2
(level-2 text division): contains a second-level subdivision of the front, body, or back of a text. -
div3
(level-3 text division): contains a third-level subdivision of the front, body, or back of a text. -
div4
(level-4 text division): contains a fourth-level subdivision of the front, body, or back of a text. -
div5
(level-5 text division): contains a fifth-level subdivision of the front, body, or back of a text. -
div6
(level-6 text division): contains a sixth-level subdivision of the front, body, or back of a text. -
div7
(level-7 text division): contains the smallest possible subdivision of the front, body or back of a text, larger than a paragraph. -
docDate
(document date): contains the date of a document, as given (usually) on a title page. -
floatingText
: contains a single text of any kind, whether unitary or composite, which interrupts the text containing it at any point and after which the surrounding text resumes. -
group
: contains the body of a composite text, grouping together a sequence of distinct texts (or groups of such texts) which are regarded as a unit for some purpose, for example the collected works of an author, a sequence of prose essays, etc. -
imprimatur
: contains a formal statement authorizing the publication of a work, sometimes required to appear on a title page or its verso.
B.4 Elements suppressed from optional modules
The following elements of the analysis module are suppressed:
-
c
(character): represents a character. -
cl
(clause): represents a grammatical clause. -
interp
(interpretation): summarizes a specific interpretative annotation which can be linked to a span of text. -
interpGrp
(interpretation group): collects together a set of related interpretations which share responsibility or type. -
m
(morpheme): represents a grammatical morpheme. -
pc
(punctuation character): a character or string of characters regarded as constituting a single punctuation mark. -
phr
(phrase): represents a grammatical phrase. -
span
: associates an interpretative annotation directly with a span of text. -
spanGrp
(span group): collects together span tags.
The following elements of the corpus module are suppressed:
-
activity
: contains a brief informal description of what a participant in a language interaction is doing other than speaking, if anything. -
channel
(primary channel): describes the medium or channel by which a text is delivered or experienced. For a written text, this might be print, manuscript, e-mail, etc.; for a spoken one, radio, telephone, face-to-face, etc. -
constitution
: describes the internal composition of a text or text sample, for example as fragmentary, complete, etc. -
derivation
: describes the nature and extent of originality of this text. -
domain
(domain of use): describes the most important social context in which the text was realized or for which it is intended, for example private vs. public, education, religion, etc. -
factuality
: describes the extent to which the text may be regarded as imaginative or non-imaginative, that is, as describing a fictional or a non-fictional world. -
interaction
: describes the extent, cardinality and nature of any interaction among those producing and experiencing the text, for example in the form of response or interjection, commentary, etc. -
locale
: contains a brief informal description of the kind of place concerned, for example: a room, a restaurant, a park bench, etc. -
preparedness
: describes the extent to which a text may be regarded as prepared or spontaneous. -
purpose
: characterizes a single purpose or communicative function of the text. -
setting
: describes one particular setting in which a language interaction takes place. -
settingDesc
(setting description): describes the setting or settings within which a language interaction takes place, either as a prose description or as a series of setting elements.
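Where only part of an optional module is wanted, later releases of TEI P5 also allow the exclusions to be stated directly on the module reference, using the except attribute of moduleRef. The fragment below is a sketch of the corpus-module suppressions listed above in that form. It assumes an ODD processor that supports except on moduleRef, and it is not necessarily how the I5 ODD itself states these deletions.

```xml
<!-- Sketch only: assumes a TEI P5 release in which moduleRef carries
     the except attribute. The element names are those listed above. -->
<moduleRef key="corpus"
           except="activity channel constitution derivation domain
                   factuality interaction locale preparedness purpose
                   setting settingDesc"/>
```

Any element of the corpus module that is not named in except stays available in the resulting schema.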