English-Corpora.org

Monday, 04 October 2021

Corpora for English and American Studies

 

English-Corpora.org is one of the most widely used collections of texts in the English language. As an indispensable tool for linguists and researchers from other areas, it receives 120,000 visitors every month. Its 19 corpora offer a solid basis of data for empirical investigations into linguistic questions.
Using the corpora, characteristics of a language (lexical, syntactic, phonetic, etc.) can be made visible and quantifiable.

A special feature of English-Corpora.org is the high-quality compilation of its data. In contrast to most other corpora, this makes it possible to carry out detailed searches to analyse language change or regional, historical, sociological or geographical variants.

The Guided Tour of the corpora is a comprehensive tutorial on efficient searching, with graphics and screenshots. In addition to the tour, there is context-sensitive help available and compact information with lots of practical examples for each of the corpora.

To illustrate some of the many functions of the corpora, here are a few examples from the Corpus of Contemporary American English (COCA):
COCA contains more than a billion words of written and spoken language from blogs, websites, film and TV subtitles, radio programmes, fiction, magazines, newspapers and journals.

The "genres" function can be used to examine the varying frequency of use of a word or phrase in formal or informal speech, or semantic and syntactic developments over the course of time.

Searching for an individual word brings back information about its meaning, part of speech, pronunciation, use and related terms. Texts and phrases can also be analysed down to the level of individual words.

Each of the "top" 60,000 words has a comprehensive entry that not only shows the frequency of use by medium, but also includes audio examples of correct pronunciation, translations, definitions, synonyms, closely and more loosely related terms, etymological information, collocations and "clusters" (phrases of 2 to 4 words).
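
To give a rough idea of what a "cluster" is, the following minimal Python sketch counts every sequence of 2 to 4 consecutive words in a short text. It is only an illustration under simple assumptions (lower-casing and splitting on whitespace); the function name `clusters` and the sample sentence are invented for this example and do not reflect how English-Corpora.org computes its own results.

```python
from collections import Counter


def clusters(text, min_n=2, max_n=4):
    """Count every run of min_n to max_n consecutive words in text.

    Hypothetical helper for illustration only: words are lower-cased
    and split on whitespace, with no further linguistic processing.
    """
    words = text.lower().split()
    counts = Counter()
    for n in range(min_n, max_n + 1):
        # Slide a window of n words across the text.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


# Invented sample sentence, not corpus data.
sample = "on the other hand the data on the other side differ"
for phrase, freq in clusters(sample).most_common(3):
    print(freq, phrase)
```

On a real corpus the same idea is applied to millions of words, and the most frequent clusters show which words habitually occur together.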

The University of Graz holds a licence for English-Corpora.org, which can be accessed on campus via unikat or DBIS. You will need to register the first time you use the website. Please use your institutional email address and a password of your choice. Members of the university can also access the licensed content from off campus via VPN.

Please make use of our online tutorials on unikat, literature research and reference management, as well as the courses that take place during the semester. For more information, please see the library course catalogue.

If you have any questions, please contact ub.zeitschriften(at)uni-graz.at.
