ICAME Collection of English Language Corpora


A CD-ROM collection of 20 English language corpora for linguistic research.
Eight of the corpora are also available online; see Online Corpora below.

The CD-ROM package includes WordSmith, WordCruncher and other linguistic software programs, which can be used both on the corpora supplied and on other text files.

List of corpora, retrieval programs and other software on the CD-ROM with instructions for installation and online manuals.

ICAME homepage with links to bibliographies and the ICAME journal


The collection contains 20 corpora, comprising 7.9 million words of modern written text, 2.5 million words of transcribed speech and 6.6 million words of historical text. 3.5 million words have been tagged for part of speech.


Online Corpora
Online search in ICAME corpora
The following corpora are also available online:
FLOB (untagged), FROWN (untagged), Australian (untagged), London-Lund (untagged), LOB (tagged), Brown (tagged), COLT (untagged), and the Wellington Corpus of Written New Zealand English (untagged). Various combinations of these corpora can be searched together.
Manuals for all the ICAME corpora are available on this website.

The Brown Corpus is also available online at the Linguistic Data Consortium website.

