Kue Numero Mancha Iin Sientu
Los Números del Uno al Cien
Gisela Beckmann, María M. Nieves
Consultants: Gisela Beckmann, María M. Nieves
TEI Encoding: Jack T. Bowers
This is an XML-TEI markup of the original publications by SIL Mexico, prepared for archival purposes and for the creation of a body of Mixtepec-Mixtec language resources (LR) (ISO: mix).
The encoded documents are used with the permission of the original authors.
Kue numero mancha iin sientu. Beckmann, Gisela (consultant); Nieves, María M. (consultant). 2008 (2nd ed.). Tlalpan, D.F., México: Instituto Lingüístico de Verano, A.C.
Obtained from:
@mode: 'w' means that this document is primarily used by reading its written form; the document is labeled "original" because the XML-TEI markup of its contents has not altered the original content.
not applicable
Because the content in this referential document covers cardinal numbers, which are inherently an ordered list, the contents are divided up and presented in list format.
The digital form of each number is represented with the TEI 'num' element; the attribute @value stores the numeric value in digital form.
The MIX-language orthographic form of that number is represented with the 'seg' element, and these corresponding forms of the same semantic/ontological entity are linked by the
attribute @corresp, whose value is the xml:id of the 'seg' element containing the written form.
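The num/seg linkage described above can be illustrated with a minimal sketch. Several details here are assumptions rather than the project's confirmed markup: the enclosing 'list'/'item' structure, the xml:id value "n100", the "#" prefix on the @corresp pointer, and the helper name number_forms. The form "iin sientu" is taken from this document's title, and its pairing with 100 is inferred from the Spanish title.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Illustrative fragment only; ids and list structure are assumed.
SAMPLE = """
<list xmlns="http://www.tei-c.org/ns/1.0">
  <item>
    <num value="100" corresp="#n100">100</num>
    <seg xml:id="n100" xml:lang="mix">iin sientu</seg>
  </item>
</list>
"""

def number_forms(xml_text):
    """Map each num/@value to the text of the seg its @corresp points at."""
    root = ET.fromstring(xml_text)
    # Index every seg by its xml:id so pointers can be resolved.
    segs = {seg.get(XML_ID): seg for seg in root.iter(f"{{{TEI_NS}}}seg")}
    pairs = {}
    for num in root.iter(f"{{{TEI_NS}}}num"):
        # Assumed pointer syntax: "#" followed by the target xml:id.
        target = num.get("corresp", "").lstrip("#")
        seg = segs.get(target)
        if seg is not None:
            pairs[int(num.get("value"))] = "".join(seg.itertext()).strip()
    return pairs

print(number_forms(SAMPLE))
```

Keeping the numeric value in @value and the written form in a separate, addressable 'seg' lets either representation be queried without parsing the other.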
In this document (and in all others in the SIL collection) the 'seg' element is used to encapsulate:
i) the entire sentences from the original document (first level);
ii) the individual lexical items contained within the sentences (second level).
Within the sentence-level "seg" (segment) elements, all punctuation characters (".", ";", ":", "¿", "?", "!", etc.) are encapsulated in the "c" (character) element.
In future stages of the project, each of these punctuation characters will be formally linked with the linguistic categories with which it is associated: a sentence that begins with "¿" (and ends with "?") can be immediately recognized as sentence type "question/interrogative".
Page divisions in the original documents are represented with "div" (division) elements; this separates the linguistic and knowledge data from formatting.
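As a rough sketch of how the two 'seg' levels and the "c" elements might be read back out of a page-level "div", the example below classifies a sentence as interrogative when its first and last punctuation marks are "¿" and "?". The fragment is hypothetical: WORD1, WORD2 and WORD3 are placeholders, not attested Mixtec text, and the @n attribute, the function names and the fallback label "unclassified" are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical page division: sentence-level seg elements are direct
# children of div; word-level seg and c elements sit inside them.
SAMPLE = """
<div xmlns="http://www.tei-c.org/ns/1.0" n="1">
  <seg><c>¿</c><seg>WORD1</seg> <seg>WORD2</seg><c>?</c></seg>
  <seg><seg>WORD3</seg><c>.</c></seg>
</div>
"""

def sentence_type(sentence):
    """Classify a sentence-level seg from its punctuation (c) children."""
    marks = [c.text for c in sentence.findall(TEI + "c")]
    if marks and marks[0] == "¿" and marks[-1] == "?":
        return "interrogative"
    return "unclassified"

def classify(xml_text):
    """Return (lexical items, sentence type) for each sentence in the div."""
    root = ET.fromstring(xml_text)
    result = []
    for sentence in root.findall(TEI + "seg"):  # first level
        words = [w.text for w in sentence.findall(TEI + "seg")]  # second level
        result.append((words, sentence_type(sentence)))
    return result

print(classify(SAMPLE))
```

Because the punctuation lives in its own elements, the lexical items can be extracted cleanly while the sentence type is still recoverable from the markup.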
This portion of the lexicography project deals with the only known body of literature in the Mixtepec-Mixtec language, which comprises a series of 34* orthographic children's texts published by SIL Mexico (Summer Institute of Linguistics). These documents are aimed primarily at young MIX speakers and are designed for use as classroom handouts and/or lessons at the primary/elementary school level. Their topical content includes both culturally specific and non-culturally specific subject matter, supplemented by illustrations. Topics covered include vocabulary, mathematics, telling time, geography, seasons, weather, local agricultural practices, fables, and local Mixtec legends, among others. These publications make up the second most important collection in this project, and they remain the only source of reference for examples of how the working orthography of the Mixtec language is applied to the Mixtepec variety.
The encoding of this portion of the MIX language resources can be divided into the following components:
i) TEI-XML markup of documents that supports reuse and extension within this project and/or by other interested parties;
ii) Annotation and glossing of lexical information and structures in documents; tasks include:
    glossing of orthographical lexical items;
    assignment of lexical properties to sentences, phrases and lexical items (morpho-/syntactic; pragmatic; semantic etc.);
    linking corresponding bilingual items (Spanish-Mixtec) within each document;
    entry of analyzed, glossed and categorized lexical information into the TEI-Dictionary;
    linking orthographical forms of lexical items with phonetic transcriptions of spoken forms;
iii) Classification of Document Types and Ontology Linking:
    replacement of the current SIL document classification model with an established and expansible document system which can be used to classify any new document or literature produced in the language in the future;
    as per the goal of the overall project, replacement of language-specific glosses (including 'Interlingua') by grounding the semantic profile of each lexical item in references to ontological entities (from existing ontologies such as DBpedia, SKOS, etc.).