CoLIWeb
This action may take several minutes for large corpora; please wait.

Word list options

Corpus:
Subcorpus: create new
Search attribute:
Value of n: from to
Filter options:
Filter word list by:
Regular expression:
Minimum frequency:
Maximum frequency: (0 = no maximum frequency)
Whitelist:
Blacklist: format
Word list whitelists and blacklists must be plain-text files (.txt), encoded in UTF-8, with one item per line. The items must correspond to the selected attribute: for example, if 'lemma' is selected from the attribute menu, the list should be a list of lemmas. File input is matched exactly, not as regular expressions.
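The filter options above can be illustrated with a minimal Python sketch. It is not CoLIWeb's implementation: the function names, the `freqs` dictionary, and the sample words are hypothetical, and two behaviours are assumptions made for illustration, namely that the regular expression must match the whole item and that exact matching is case-sensitive. The sketch reads a list file in the required format (UTF-8, one item per line) and applies the regular-expression, frequency, whitelist, and blacklist filters, treating a maximum frequency of 0 as "no maximum".

```python
import re
from pathlib import Path


def load_item_list(path):
    """Read a whitelist/blacklist file: UTF-8 plain text, one item per line."""
    # "utf-8-sig" reads ordinary UTF-8 and also tolerates a leading BOM.
    text = Path(path).read_text(encoding="utf-8-sig")
    # Surrounding whitespace is stripped and blank lines are skipped.
    return {line.strip() for line in text.splitlines() if line.strip()}


def filter_word_list(freqs, pattern=None, min_freq=0, max_freq=0,
                     whitelist=None, blacklist=None):
    """Filter a {item: frequency} mapping (hypothetical helper).

    max_freq == 0 means "no maximum frequency", as on the form.
    Whitelist and blacklist use exact string matching, not regex matching.
    """
    # Assumption: the pattern has to match the whole item.
    regex = re.compile(pattern) if pattern else None
    kept = {}
    for item, freq in freqs.items():
        if regex is not None and not regex.fullmatch(item):
            continue
        if freq < min_freq:
            continue
        if max_freq and freq > max_freq:
            continue
        if whitelist is not None and item not in whitelist:
            continue
        if blacklist is not None and item in blacklist:
            continue
        kept[item] = freq
    return kept


# Hypothetical lemma frequencies, for illustration only.
sample = {"run": 120, "running": 45, "ran": 30, "the": 5000}
result = filter_word_list(sample, pattern=r"r.*", min_freq=40,
                          blacklist={"running"})
```

Because list items are compared by plain string equality, a blacklist entry such as `run.*` would only exclude an item spelled literally `run.*`; pattern-based filtering belongs in the regular-expression field.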
Output options:
Frequency figures:
Output type:
Reference (sub)corpus
Prefer: rare words
common words

You can select one or more output attributes. Please note that this option can be time-consuming.