Cross-linguistic Reference Grammar project

The Munich Cross-linguistic Reference Grammar (CRG) is a framework for the unified description of natural languages of any kind: spoken, written, and signed. The leading idea behind this database application is to support both the descriptive and the theoretical linguist: the former by providing grammar authoring tools, the latter by making available a wide variety of queries, not only on individual languages but also, and especially, across languages. It is based on the project AVG 2.0 'Allgemein-Vergleichende Grammatik 2.0' ('General Comparative Grammar 2.0'), supported by grants Za 111/7-5 and Za 111/7-6 from the DFG (German Research Foundation) to Dietmar Zaefferer, which are hereby gratefully acknowledged.

This page gives a short overview.

 


Project Aim

The aim of the project is to provide a general format for reference grammars that
(a) guarantees an adequate and comprehensive description of the language under consideration, and
(b) ensures that the description is organized along the same lines for every language, thereby allowing systematic cross-linguistic comparison.


Background

The basic idea goes back to the questionnaire published in Lingua (Comrie and Smith 1977), on which the volumes of the Descriptive Grammars series (now published by Routledge) are based.

 

It was further elaborated during the work on the hypertext system "A Framework for Descriptive Grammars" (FDG), developed in a project funded by the National Science Foundation from 1990 to 1993 with William Croft and Bernard Comrie as principal investigators, and in the course of the German project AVG 'Allgemein-Vergleichende Grammatik' ('General Comparative Grammar'), supported from 1992 to 1995 by grant Le 358/8-1,2 from the DFG (German Research Foundation) with Christian Lehmann and Dietmar Zaefferer as principal investigators.

 

Former project members include

 

Timothy Clausner

Eva Schultze-Bernd

Elisabeth Verhoeven

Elena Lenk

Jürgen Bohnemeyer

Vladimir Tourovski

Roman Pichler

Ellen Brandner

Milly Brunello

Christian Strömsdörfer

Tsuyoshi Takizawa

Gerhard Hingerl

Felix Weigel

Eleni Kriempardis

Matthias Nickles

Irfan Bilgili

John Peterson

Almudena Bada Morellón

Mónica Valero Valientes

 

The main differences between AVG 2.0 and CRG on the one hand and these earlier approaches on the other, besides the implementation as a platform-independent database, are the following:


Structure of the database

The database structure has the form of a tree with three kinds of edges:

• BE-edges for the taxonomic relations (conceptual subordination: every negative clause is a clause),

• HAVE-edges for the meronomic relations (conceptual part-of relations: a negative clause must have a negation marker; the latter is by definition part of the former), and

• optional edges for the rest (e.g. other part-of relations: a negative clause may have a secondary negation marker).
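The three kinds of edges can be illustrated with a minimal Python sketch. It is not the CRG implementation: the names EdgeKind, Node, add and labels_by_kind are hypothetical and chosen only for this illustration, and the sample data repeat the negative-clause example given above.

```python
from dataclasses import dataclass, field
from enum import Enum


class EdgeKind(Enum):
    """The three kinds of edges described above (names are assumptions)."""
    BE = "be"              # taxonomic: the child is a subtype of the parent
    HAVE = "have"          # meronomic: the child is an obligatory part
    OPTIONAL = "optional"  # the rest, e.g. a part that may or may not occur


@dataclass
class Node:
    label: str
    # Each entry pairs an edge kind with the child node the edge leads to.
    children: list[tuple[EdgeKind, "Node"]] = field(default_factory=list)

    def add(self, kind: EdgeKind, label: str) -> "Node":
        """Attach a new child below this node and return it."""
        child = Node(label)
        self.children.append((kind, child))
        return child

    def labels_by_kind(self, kind: EdgeKind) -> list[str]:
        """Labels of the direct children reached via edges of the given kind."""
        return [child.label for k, child in self.children if k is kind]


# The example from the text: every negative clause is a clause, must have
# a negation marker, and may have a secondary negation marker.
clause = Node("clause")
negative = clause.add(EdgeKind.BE, "negative clause")
negative.add(EdgeKind.HAVE, "negation marker")
negative.add(EdgeKind.OPTIONAL, "secondary negation marker")

print(negative.labels_by_kind(EdgeKind.HAVE))      # ['negation marker']
print(negative.labels_by_kind(EdgeKind.OPTIONAL))  # ['secondary negation marker']
```

Storing the edge kind on the edge rather than on the node keeps the structure a plain tree while still distinguishing obligatory from optional parts.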

 

The CRG database application provides two front ends. The first is the front end for the reader of grammars. To start a query, the user selects a subtree, which corresponds to an incomplete statement. The system then provides all the completions that are stored in the database, i.e. all the complete statements that include the incomplete statement that defines the query. This makes it easy to discover correlations and to test hypotheses. At a later stage it will be possible to look at visual representations of these correlations.
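The query principle can be sketched as follows, under the simplifying assumption that a statement is a set of edges and that a complete statement "includes" a query when it contains all of the query's edges. The names Statement and completions, as well as the placeholder languages "Language A" and "Language B", are hypothetical and do not come from the CRG system.

```python
from typing import NamedTuple

# An edge is (parent concept, edge kind, child concept).
Edge = tuple[str, str, str]


class Statement(NamedTuple):
    language: str            # placeholder language name
    edges: frozenset[Edge]   # the complete statement as a set of edges


def completions(query: set[Edge], stored: list[Statement]) -> list[Statement]:
    """Return every stored statement that contains all edges of the query."""
    return [s for s in stored if query <= s.edges]


MARKER = ("negative clause", "HAVE", "negation marker")
SECONDARY = ("negative clause", "OPTIONAL", "secondary negation marker")

# Invented sample data for two placeholder languages.
stored = [
    Statement("Language A", frozenset({MARKER, SECONDARY})),
    Statement("Language B", frozenset({MARKER})),
]

hits = completions({SECONDARY}, stored)
print([s.language for s in hits])  # ['Language A']
```

Running the same query against the descriptions of many languages at once is what would make correlations visible: the result lists exactly those languages for which the selected subtree has been described.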

The second front end is a grammar authoring system. It facilitates language documentation and description, e.g. by providing an interlinear representation format for language data of any kind (spoken, written and signed; see below for further details). The urgent need for such a system should be obvious, given that many languages are threatened by extinction in the near future.


Contents

The complete system will include the following components:

Grammar

The database includes descriptive categories for all relevant aspects of human languages, including phonetics and phonology, morphology, syntax, and semanto-pragmatic aspects. To guarantee cross-linguistic comparability, corresponding phenomena found in different languages must be described with the same terms. An important prerequisite for the usefulness of the database is therefore the coherent and systematic definition of its descriptive terms. Preference was given to terms that are in use across theoretical frameworks, but there are new terms as well, and a describer may also define new terms if she regards this as unavoidable and if the database administration agrees. The glossary is still under construction. Its further development will proceed in agreement with the GOLD project (Farrar and Langendoen 2003), which develops a General Ontology for Linguistic Description as part of an upper ontology called SUMO.

Lexicon

The lexicon format allows detailed lexical information to be stored. Lexical entries contain not only the meaning of the word (or its translation into English) but all the relevant phonetic-phonological, grammatical, and semantic-pragmatic information.
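As an illustration only, a lexical entry of this kind could be modelled as a record with one section per kind of information. The class LexicalEntry, its field names and the method empty_sections are assumptions made for this sketch and do not reflect the actual CRG lexicon format.

```python
from dataclasses import dataclass, field, fields


@dataclass
class LexicalEntry:
    form: str          # citation form of the word
    translation: str   # meaning or translation into English
    # One open-ended section per kind of information named in the text.
    phonology: dict[str, str] = field(default_factory=dict)
    grammar: dict[str, str] = field(default_factory=dict)
    semantics_pragmatics: dict[str, str] = field(default_factory=dict)

    def empty_sections(self) -> list[str]:
        """Names of the fields that have not been filled in yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# A deliberately simple English example entry.
entry = LexicalEntry(
    form="dog",
    translation="dog",
    phonology={"transcription": "/dɒɡ/"},
    grammar={"part_of_speech": "noun"},
)

print(entry.empty_sections())  # ['semantics_pragmatics']
```

A helper such as empty_sections would let an authoring tool point out which parts of an entry still need attention.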

Tools for the Field Linguist

These are still to be developed or adapted from other tools. Among them are templates that allow data to be entered quickly and easily. In addition, it is planned to offer tests and diagnostics to make sure that the descriptive terms are in fact used consistently.
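One very simple diagnostic of this kind would be to check every descriptive term used in a grammar against the glossary. The sketch below assumes a glossary stored as a mapping from term to definition; the function name unknown_terms and the sample terms are hypothetical, and the planned CRG tests may work quite differently.

```python
def unknown_terms(used: list[str], glossary: dict[str, str]) -> list[str]:
    """Return the terms in `used` that the glossary does not define.

    The comparison ignores case; each unknown term is reported once,
    in order of first appearance.
    """
    known = {term.lower() for term in glossary}
    seen: set[str] = set()
    result: list[str] = []
    for term in used:
        key = term.lower()
        if key not in known and key not in seen:
            seen.add(key)
            result.append(term)
    return result


# Placeholder glossary; the definition texts are intentionally left open.
glossary = {
    "negative clause": "(definition text)",
    "negation marker": "(definition text)",
}

used = ["Negative clause", "negation particle", "negation marker", "negation particle"]
print(unknown_terms(used, glossary))  # ['negation particle']
```

A term flagged in this way would either be replaced by an existing glossary term or proposed as a new one.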

Online Glossary

An easily accessible online glossary will inform grammar authors and database users about the way descriptive terms are used in the system.


Cooperation

The cooperation with the following people and organizations is hereby gratefully acknowledged:


Current development

Currently, thanks to the support of the central administration of the Ludwig-Maximilians-Universität (president Bernd Huber) and the Faculty for Languages and Literatures (dean Georg Jäger), database maintenance and development are handled by Sarah Bluhme (sarah.bluhme@itg.uni-muenchen.de) of the IT group humanities (Christian Riepl).


References

Ameka, Felix, Alan Dench, and Nicholas Evans (eds.) (forthcoming): Catching Language. Issues in Grammar Writing. Berlin: Mouton de Gruyter.

Bluhme, Sarah, Matthias Nickles and Dietmar Zaefferer (2003): Cross-linguistic reference grammar: An XML-based internet database for general comparative linguistics. Presented at DGfS-CL-2003, 27 February 2003, Munich. dgfs-poster-session.ppt

Comrie, Bernard, and Norval Smith (1977): Lingua Descriptive Studies: questionnaire. - In: Lingua 42, 1-72.

Comrie, Bernard, William Croft, Christian Lehmann, and Dietmar Zaefferer (1993): "A Framework for Descriptive Grammars". - In: André Crochetière, Jean-Claude Boulanger et Conrad Ouellon (eds.), Proceedings of the XVth International Congress of Linguists, vol. I, Quebec City, 159-170.

Comrie, Bernard (1998): Ein Strukturrahmen für deskriptive Grammatiken: Allgemeine Bemerkungen. - In: Dietmar Zaefferer (ed.), Deskriptive Grammatik und allgemeiner Sprachvergleich, 7-16.

Farrar, Scott, and D. Terence Langendoen (2003): Markup and the GOLD ontology. [http://emeld.org/workshop/2003/papers03.html]

Nickles, Matthias (2001): Systematics - Ein XML-basiertes Internet-Datenbanksystem für klassifikationsgestützte Sprachbeschreibungen. Universität München: Centrum für Informations- und Sprachverarbeitung CIS-Bericht-01-129. [http://www.cis.uni-muenchen.de/CISPublikationen.html]

Peterson, John (2002): AVG 2.0. Cross-linguistic Reference Grammar. Final Report. Universität München: Centrum für Informations- und Sprachverarbeitung CIS-Bericht-02-130. [http://www.cis.uni-muenchen.de/CISPublikationen.html]

Zaefferer, Dietmar (1997): "Neue Technologien in der Sprachbeschreibung. Der Paradigmenwechsel von linearen P-Grammatiken zu vernetzten E-Grammatiken". - In: Zeitschrift für Literaturwissenschaft und Linguistik 106, 76-82.

Zaefferer, Dietmar (ed.) (1998): Deskriptive Grammatik und allgemeiner Sprachvergleich. Tübingen: Niemeyer.

Zaefferer, Dietmar (2001): Modale Kategorien. - In: Martin Haspelmath, Ekkehart König, Wulf Oesterreicher, Wolfgang Raible (Hgg.), Sprachtypologie und sprachliche Universalien (HSK 20.1). Berlin: Mouton de Gruyter, 784-816.

Zaefferer, Dietmar (2003): A unified representation format for spoken and sign language texts. [Downloadable from http://emeld.org/workshop/2003/papers03.html]

Zaefferer, Dietmar (forthcoming): Realizing Humboldt's dream: Cross-linguistic grammatography as database creation. - In: Ameka et al. (eds.)

 


 

Institut für Theoretische Linguistik, LMU

14.08.2003 sarah.bluhme at itg.uni-muenchen.de