By Bengt Altenberg, Karin Aijmer

A publication on language

Advances in Corpus Linguistics: Papers from the 23rd International Conference on English Language Research on Computerized Corpora (ICAME 23) Göteborg 22-26 May 2002 (Language and Computers 49)

Similar language & grammar books

How Texts Work (Routledge a Level English Guides)

How Texts Work: explores the ways that we categorize texts; identifies the limitations of some of the polarisations we use to categorize texts; analyzes a wide range of texts from a variety of genres and categories, from Ibsen's A Doll's House to an 18-30s brochure, internet chatrooms and George Bush's September 11 speech; offers a step-by-step guide to approaching texts and structuring a response; and can be used as both a course stimulus and a revision tool.

History of Linguistics 2005: Selected Papers from the Tenth International Conference on the History of the Language Sciences (ICHOLS X), 1-5 September ... in the History of the Language Sciences)

As each period in the history of the language sciences has chosen to focus on different key questions, the study of that history promises to open our eyes to the range of interesting questions that can be asked, and answered, once we take off the blinders of contemporary preoccupations. On September 1–5, 2005, linguists from twenty-five countries gathered at the University of Illinois at Urbana-Champaign to share their passion for the history of their discipline.

Control as Movement

The Movement Theory of Control (MTC) makes one major claim: that control relations in sentences like 'John wants to leave' are grammatically mediated by movement. This goes against the traditional view that such sentences involve not movement, but binding, and analogizes control to raising, albeit with one important distinction: whereas the target of movement in control structures is a theta position, in raising it is a non-theta position; yet the grammatical procedures underlying the two structures are the same.

Vowels and consonants

This popular and accessible introduction to phonetics has been fully updated for its third edition, and now includes an accompanying website with sound files, and expanded coverage of topics such as speech technology. It describes how languages use a variety of different sounds, many of them quite unlike any that occur in well-known languages.

Additional resources for Advances in Corpus Linguistics: Papers from the 23rd International Conference on English Language Research on Computerized Corpora (ICAME 23) Göteborg 22-26 May 2002 (Language and Computers 49)

Sample text

Annotation as an exploitation of the mark-up facility is typical of the kind of tool that emerged in the early days of computing – simple, extremely flexible and useful. The other side of the coin is that it can be uncontrolled, invasive and overwhelming; I believe that most of the research projects in corpus linguistics that are in progress at the present time are not examining their languages at all, but are examining the tags. The particular choices of word combinations that corpora uniquely offer us are impossible to retrieve using tags.

The information captured in mark-up is valuable and worth preserving. Mark-up is not the only way of preserving this information. Mark-up is now obsolete as a way of storing text. Marked-up text can be prepared by merging plain text and tags. Corpus material should always be kept in plain text format. With this in mind, we can now turn to annotation. Annotation uses the same conventions as mark-up, but is not restricted to features of the original text or recording. The classic annotation is “POS-tagging”, which means inserting after each word in a corpus a code denoting its part of speech, but there are now many others, some quite unusual and informal, and many corpora are very heavily annotated.
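
The idea that marked-up text can be prepared by merging plain text and tags, with POS-tagging inserting a code after each word, might be sketched as follows. This is only an illustration under stated assumptions: the function names `merge_tags` and `strip_tags`, the underscore separator, and the tag labels are all hypothetical and are not taken from the book or from any real tagset.

```python
# Minimal sketch: the corpus stays in plain text, the tags are stored separately,
# and an annotated view is produced on demand by merging the two.
# All names, the "_" separator, and the tag labels are illustrative assumptions.

def merge_tags(plain_text, tags, sep="_"):
    """Attach one tag to each whitespace-separated word, e.g. 'corpus_NOUN'."""
    words = plain_text.split()
    if len(words) != len(tags):
        raise ValueError("need exactly one tag per word")
    return " ".join(f"{word}{sep}{tag}" for word, tag in zip(words, tags))


def strip_tags(annotated_text, sep="_"):
    """Recover the plain text by dropping everything after the last separator."""
    return " ".join(token.rsplit(sep, 1)[0] for token in annotated_text.split())


plain = "the corpus is kept in plain text"
tags = ["DET", "NOUN", "VERB", "VERB", "PREP", "ADJ", "NOUN"]  # hypothetical labels

annotated = merge_tags(plain, tags)
recovered = strip_tags(annotated)
```

Because the tags live apart from the text, the word combinations in `plain` remain directly searchable, and the annotated form can be regenerated or discarded without touching the corpus itself.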
