escort(1WN) is a window-based browser used to search the semantic concordances for instances of semantically tagged word forms. It can be used to find semantic tags that point to one or more senses of a word and, optionally, to co-occurring senses.
Concordance | Contents | What's Tagged |
---|---|---|
brown1 | 103 Brown Corpus files | All open class words |
brown2 | 83 Brown Corpus files | All open class words |
brownv | 166 Brown Corpus files | Verbs |
Each directory contains the files cntlist, taglist, and statistics, and a tagfiles directory. See cntlist(5WN) and taglist(5WN) for information about these files. See STATISTICS for information about the contents of the statistics file.
The tagfiles directory contains the semantically tagged files. Each file is named using the following convention:
br-article_code
where article_code is a letter followed by a two-digit number, denoting the section and article number from which the text was derived. No file is in more than one semantic concordance.
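The naming convention above can be checked mechanically. The following is a minimal sketch, assuming the letter may be upper or lower case and that the two digits are read as a decimal article number; the function name and the example file name `br-a01` are illustrative and not taken from the distribution.

```python
import re

# "br-" followed by one letter (section) and exactly two digits (article).
# Accepting either letter case is an assumption of this sketch.
TAGFILE_NAME = re.compile(r"^br-([A-Za-z])(\d{2})$")

def parse_tagfile_name(name):
    """Split a tagged file name into (section letter, article number)."""
    match = TAGFILE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a br-article_code name: {name!r}")
    return match.group(1), int(match.group(2))

# Illustrative name only.
section, article = parse_tagfile_name("br-a01")
print(section, article)
```

A name with a one-digit article number, or without the `br-` prefix, raises `ValueError` rather than being silently accepted.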
A semantic tag associated with a word form indicates one or more senses in the WordNet database that are appropriate for that word form in the textual context. Semantic tags are represented as SGML attribute/value pairs, and are described in detail in cxtfile(5WN).
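As a rough illustration of reading attribute/value pairs, the sketch below pulls `name=value` pairs out of a single `<wf>` element. It assumes unquoted values with no embedded whitespace and a closing `</wf>` on the same line; the sample line is made up for this example and uses only attribute names that appear in this page (`cmd`, `pn`, `wnsn`). The authoritative description of the format is in cxtfile(5WN).

```python
import re

# Opening <wf ...> tag, its attribute string, and the enclosed word form.
WF_ELEMENT = re.compile(r"<wf\s+([^>]*)>([^<]*)</wf>")
# Unquoted name=value pairs separated by whitespace (an assumption).
ATTRIBUTE = re.compile(r"(\w+)=(\S+)")

def parse_wf(line):
    """Return (attributes dict, word form), or None if no <wf> element is found."""
    match = WF_ELEMENT.search(line)
    if match is None:
        return None
    attributes = dict(ATTRIBUTE.findall(match.group(1)))
    return attributes, match.group(2)

# Hypothetical line, not copied from a tagged file.
sample = "<wf cmd=done pn=person wnsn=1>Smith</wf>"
attributes, word_form = parse_wf(sample)
print(attributes, word_form)
```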
Only nouns, verbs, adjectives, and adverbs (open class words) can be semantically tagged, as these are the only classes of words represented in WordNet. Proper nouns are generally not in WordNet, but are labeled in the semantically tagged files with one of four categories and assigned semantic tags to predetermined WordNet senses for these categories.
Attribute/Value Pair | WordNet Sense | Sense Key |
---|---|---|
pn=person | 1 | person%1:03:00:: |
pn=location | 1 | location%1:03:00:: |
pn=group | 1 | group%1:03:00:: |
pn=other | n/a | n/a |
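The table above amounts to a small lookup from proper noun category to sense key. A minimal sketch follows, with `None` standing in for the "n/a" entry of pn=other; the function names are hypothetical.

```python
# Sense keys copied from the table above; pn=other has no WordNet sense.
PN_SENSE_KEYS = {
    "person": "person%1:03:00::",
    "location": "location%1:03:00::",
    "group": "group%1:03:00::",
    "other": None,
}

def sense_key_for(pn_value):
    """Return the sense key for a pn= category (None for pn=other)."""
    return PN_SENSE_KEYS[pn_value]

def lemma_of(sense_key):
    """Return the part of a sense key before the '%' separator."""
    return sense_key.split("%", 1)[0]

print(sense_key_for("group"), lemma_of(sense_key_for("group")))
```

An unknown category raises `KeyError`, which keeps a mistyped value from being treated as pn=other.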
Strings of several words that form a collocation or phrase found in WordNet are joined into one word form in a semantically tagged file and tagged as a single unit. In the case of discontinuous constituents (a collocation in which the words are not adjacent, such as "look up" in the phrase "look the number up"), the first word of the collocation is "redefined" as the entire collocation and is tagged to the appropriate WordNet sense. The remaining words are marked with a special attribute/value pair that indicates which word form contains the semantic tag.
Category | brown1 | brown2 | brownv | Total |
---|---|---|---|---|
Miscellaneous | ||||
total word forms (<wf>) | 198796 | 160936 | 316814 | 676546 |
word forms with cmd=done including ot= | 122724 | 98235 | 53421 | 274380 |
word forms with cmd=done excluding ot=notag | 107118 | 86255 | 41607 | 234980 |
word forms with semantic pointers (wnsn=) | 106639 | 86000 | 41497 | 234136 |
word forms tagged to multiple senses | 115 | 551 | 37 | 703 |
total semantic pointers (including multiple senses) | 106725 | 86414 | 41525 | 234664 |
untagged word forms (cmd=ignore + ot=) | 92154 | 74936 | 135684 | 302774 |
Number of Semantic Pointers | ||||
semantic pointers to nouns | 48835 | 39477 | 0 | 88312 |
semantic pointers to verbs | 26686 | 21804 | 41525 | 90015 |
semantic pointers to adjectives | 9886 | 7539 | 0 | 17425 |
semantic pointers to adverbs | 11347 | 9245 | 0 | 20592 |
semantic pointers to adjective satellites | 9970 | 8347 | 0 | 18317 |
Total semantic pointers | 106724 | 86412 | 41525 | 234661 |
Pointers to Proper Nouns | ||||
pointers to pn=person | 3815 | 2783 | 0 | 6598 |
pointers to pn=location | 600 | 363 | 0 | 963 |
pointers to pn=group | 740 | 440 | 0 | 1180 |
pointers to pn=other | 447 | 489 | 7 | 943 |
Total pointers to proper nouns | 5602 | 4075 | 7 | 9684 |
Unique WordNet Senses | ||||
senses pointed to by nouns | 11399 | 9546 | 0 | 20945 |
senses pointed to by verbs | 5334 | 4790 | 6520 | 16644 |
senses pointed to by adjectives | 1754 | 1463 | 0 | 3217 |
senses pointed to by adverbs | 1455 | 1377 | 0 | 2832 |
senses pointed to by adjective satellites | 3451 | 3051 | 0 | 6502 |
Total senses | 23393 | 20227 | 6520 | 50140 |
The previous table was compiled from the data in the statistics file in each concordance directory.
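The totals in the "Number of Semantic Pointers" block can be cross-checked by arithmetic. The sketch below copies those rows from the table and verifies that each row adds up to its Total column and that each column adds up to the "Total semantic pointers" row; the variable and function names are illustrative.

```python
# (brown1, brown2, brownv, Total) for each part of speech, from the table.
POINTERS = {
    "nouns": (48835, 39477, 0, 88312),
    "verbs": (26686, 21804, 41525, 90015),
    "adjectives": (9886, 7539, 0, 17425),
    "adverbs": (11347, 9245, 0, 20592),
    "adjective satellites": (9970, 8347, 0, 18317),
}
TOTAL_ROW = (106724, 86412, 41525, 234661)

def rows_consistent(rows):
    """Each row's three concordance counts should add up to its Total."""
    return all(b1 + b2 + bv == total for b1, b2, bv, total in rows.values())

def columns_consistent(rows, total_row):
    """Each column should add up to the corresponding total-row entry."""
    column_sums = tuple(sum(column) for column in zip(*rows.values()))
    return column_sums == total_row

print(rows_consistent(POINTERS), columns_consistent(POINTERS, TOTAL_ROW))
```

The same two checks can be applied to the proper noun and unique sense blocks by substituting their rows.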
Note that there are 7 attribute/value pairs that assign proper nouns to the category "other" in the concordance brownv. These proper nouns were incorrectly identified as verbs by the syntactic tagger. See cxtfile(5WN) for a detailed discussion of the attribute/value pairs.
For a description of the Brown Corpus:
Francis, W. N., and Kucera, H. (1982). "Frequency Analysis of English Usage: Lexicon and Grammar", Houghton Mifflin Company, Boston.
For more information on semantic concordances:
Miller, G. A., Leacock, C., Tengi, R., and Bunker, R. T. (1993). A Semantic Concordance. "Proceedings of the ARPA Workshop on Human Language Technology", Morgan Kaufmann, San Francisco.
Landes, S., Leacock, C., and Tengi, R. (1998). Building Semantic Concordances. "WordNet: An Electronic Lexical Database", MIT Press, Cambridge, MA.