POS (Part of Speech) Reference Guide
This document provides comprehensive descriptions for all Part of Speech (POS) tags found in the Wiktionary dataset.
Common POS Tags
abbrev
Full Name: Abbreviation
Description: A shortened form of a word or phrase, such as "Dr." for "Doctor" or "etc." for "et cetera". Abbreviations are used to represent longer terms in a condensed form.
adj
Full Name: Adjective
Description: A word that describes or modifies a noun or pronoun. Adjectives provide additional information about qualities, states, or characteristics, such as "beautiful", "large", "red", or "happy".
adj_noun
Full Name: Adjective-Noun Compound
Description: A compound word that functions as both an adjective and a noun, or a word that can serve either role depending on context.
adj_phrase
Full Name: Adjectival Phrase
Description: A group of words that functions as an adjective, modifying a noun or noun phrase. Examples include "very tall", "extremely happy", or "made of wood".
adnominal
Full Name: Adnominal
Description: A word or phrase that modifies a noun, typically preceding it. Similar to an adjective, but the category can include other elements that fill noun-modifying roles.
adv
Full Name: Adverb
Description: A word that modifies a verb, an adjective, another adverb, a clause, or a whole sentence. Adverbs often indicate manner, place, time, degree, or frequency, such as "quickly", "very", "here", or "often".
adv_phrase
Full Name: Adverbial Phrase
Description: A group of words that functions as an adverb, modifying verbs, adjectives, or other adverbs. Examples include "very quickly", "in the morning", or "with great care".
affix
Full Name: Affix
Description: A morpheme that is attached to a word stem to form a new word or word form. This includes prefixes, suffixes, infixes, and circumfixes.
ambiposition
Full Name: Ambiposition
Description: A word that can function as both a preposition and a postposition depending on its position relative to the noun phrase it modifies.
article
Full Name: Article
Description: A determiner that precedes a noun and indicates whether the noun is specific or general. In English, this includes "a", "an", and "the".
character
Full Name: Character
Description: A single letter, number, or symbol used in writing. In linguistic contexts, this often refers to individual graphemes, logograms, or writing system characters, particularly in non-alphabetic scripts.
circumfix
Full Name: Circumfix
Description: An affix that has two parts, one placed at the beginning of a word and the other at the end. Common in languages like German (e.g., "ge-...-t" for past participles).
circumpos
Full Name: Circumposition
Description: A word or set of words that surrounds a noun phrase, functioning similarly to a preposition or postposition but with elements on both sides.
classifier
Full Name: Classifier
Description: A word or morpheme used in some languages to categorize the noun it accompanies, often based on semantic properties like shape, animacy, or function. Common in East Asian languages.
clause
Full Name: Clause
Description: A grammatical unit that contains a subject and a predicate. Can be independent (main clause) or dependent (subordinate clause).
combining_form
Full Name: Combining Form
Description: A linguistic element that appears only in combination with other elements to form words, often derived from Greek or Latin roots (e.g., "bio-", "photo-", "-graphy").
component
Full Name: Component
Description: A linguistic element that forms part of a larger word or construction, typically without independent meaning.
conj
Full Name: Conjunction
Description: A word that connects words, phrases, clauses, or sentences. Coordinating conjunctions (and, but, or) join equal elements, while subordinating conjunctions (because, although, if) create dependent relationships.
contraction
Full Name: Contraction
Description: A shortened form of a word or group of words, often with an apostrophe replacing omitted letters. Examples include "don't" (do not), "can't" (cannot), or "I'm" (I am).
converb
Full Name: Converb
Description: A non-finite verb form that functions as an adverbial, expressing temporal, causal, conditional, or other relationships between clauses. Found in many Turkic and other languages.
counter
Full Name: Counter
Description: A word used in some languages to count specific types of nouns, similar to classifiers but often with numerical functions. Common in Japanese and other East Asian languages.
det
Full Name: Determiner
Description: A word or affix that precedes a noun or noun phrase and expresses its reference or quantity. Includes articles, demonstratives, possessives, and quantifiers.
gerund
Full Name: Gerund
Description: A verb form that ends in "-ing" (in English) and functions as a noun. Examples include "swimming" in "Swimming is fun" and "reading" in "I enjoy reading".
hard-redirect
Full Name: Hard Redirect
Description: A Wiktionary entry that automatically redirects to another entry, typically for spelling variations or alternative forms.
infix
Full Name: Infix
Description: An affix inserted into the middle of a word, rather than at the beginning or end. Common in Austronesian and other language families.
interfix
Full Name: Interfix
Description: A connecting element, often without independent meaning, used to join two morphemes or words in compounds. Examples include "-s-" in "statesman" or "-o-" in "speedometer".
interj
Full Name: Interjection
Description: A word or phrase that expresses emotion, exclamation, or sudden feeling. Examples include "Oh!", "Wow!", "Ouch!", or "Alas!".
intj
Full Name: Interjection (Alternative tag)
Description: Same as interj: a word or phrase expressing emotion or exclamation.
name
Full Name: Name/Proper Noun
Description: A proper noun that refers to a specific person, place, organization, or other unique entity. Examples include "John", "London", "Microsoft", or "Mount Everest".
noun
Full Name: Noun
Description: A word that represents a person, place, thing, idea, or concept. Nouns function as subjects, objects, or complements in sentences.
num
Full Name: Numeral/Number
Description: A word or symbol that represents a numerical quantity or position. Includes cardinal numbers (one, two, three) and ordinal numbers (first, second, third).
onomatopoeia
Full Name: Onomatopoeia
Description: A word that phonetically imitates the sound it describes. Examples include "buzz", "meow", "bang", "splash", or "tick-tock".
onomatopeia
Full Name: Onomatopoeia (Alternative spelling)
Description: Same as onomatopoeia: a word that imitates the sound it represents.
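Because intj and onomatopeia duplicate interj and onomatopoeia, it can help to fold the variant tags into their canonical forms before counting or filtering. The following is a minimal sketch of that idea; the names `POS_ALIASES` and `normalize_pos` are hypothetical and are not part of the dataset or of any Wiktionary tooling.

```python
# Hypothetical alias table: variant tag -> canonical tag, as described in this guide.
POS_ALIASES = {
    "intj": "interj",
    "onomatopeia": "onomatopoeia",
}


def normalize_pos(tag: str) -> str:
    """Return the canonical tag for a known variant; other tags pass through unchanged."""
    return POS_ALIASES.get(tag, tag)


print(normalize_pos("intj"))         # prints: interj
print(normalize_pos("onomatopeia"))  # prints: onomatopoeia
print(normalize_pos("noun"))         # prints: noun
```

Keeping the aliases in a single table means a newly discovered variant only needs one extra entry.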
participle
Full Name: Participle
Description: A non-finite verb form that can function as an adjective or be used in compound tenses. In English, includes present participles (-ing) and past participles (-ed, -en).
particle
Full Name: Particle
Description: A word that does not fit into the major word classes but has grammatical function. Includes discourse markers, focus particles, and other function words.
phrase
Full Name: Phrase
Description: A group of words that functions as a single unit in a sentence but does not contain both a subject and a finite verb. Can be noun phrases, verb phrases, prepositional phrases, etc.
postp
Full Name: Postposition
Description: A function word that follows its object, similar to a preposition but placed after the noun phrase. Common in languages like Japanese, Korean, and Finnish.
prefix
Full Name: Prefix
Description: An affix added to the beginning of a word to modify its meaning or create a new word. Examples include "un-", "re-", "pre-", "mis-".
prep
Full Name: Preposition
Description: A function word that typically precedes a noun phrase and shows the relationship between its object and another element in the sentence. Examples include "in", "on", "at", "by", "for".
prep_phrase
Full Name: Prepositional Phrase
Description: A phrase that begins with a preposition and ends with a noun or pronoun (the object of the preposition). Functions as an adjective or adverb in sentences.
preverb
Full Name: Preverb
Description: A prefix or separate word that modifies the meaning of a verb, often indicating direction, aspect, or other semantic features. Found in Algonquian languages, Hungarian, and Georgian, among others.
pron
Full Name: Pronoun
Description: A word that replaces a noun or noun phrase. Includes personal pronouns (I, you, he), demonstrative pronouns (this, that), and relative pronouns (who, which).
proverb
Full Name: Proverb
Description: A short, traditional saying that expresses a perceived truth, piece of advice, or common observation. Examples include "A stitch in time saves nine" or "Actions speak louder than words".
punct
Full Name: Punctuation
Description: Symbols used in writing to separate sentences, clauses, and elements within sentences. Includes periods, commas, semicolons, question marks, etc.
quantifier
Full Name: Quantifier
Description: A word or phrase that indicates quantity or amount. Examples include "some", "many", "few", "all", "several", "much".
romanization
Full Name: Romanization
Description: The representation of text from a non-Latin writing system in Latin script. Used for transliteration of languages like Chinese, Japanese, Arabic, etc.
root
Full Name: Root
Description: The core morpheme of a word that carries the primary meaning, to which affixes can be attached.
soft-redirect
Full Name: Soft Redirect
Description: A Wiktionary entry that provides a link to another entry but may include additional information or context before the redirect.
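Redirect entries point to other entries rather than describing a word class, so they are often excluded before linguistic analysis. The sketch below shows one way to do that, assuming each record is a dictionary with a "pos" key; that layout, the "word" key, the sample records, and the name `drop_redirects` are illustrative assumptions rather than a documented schema.

```python
# The two redirect tags described in this guide.
REDIRECT_TAGS = {"hard-redirect", "soft-redirect"}


def drop_redirects(entries):
    """Keep only entries whose "pos" value is not a redirect tag."""
    return [entry for entry in entries if entry.get("pos") not in REDIRECT_TAGS]


# Made-up sample records; the "word" and "pos" keys are assumed, not guaranteed.
sample = [
    {"word": "colour", "pos": "noun"},
    {"word": "old-spelling", "pos": "hard-redirect"},
    {"word": "run", "pos": "verb"},
    {"word": "see-other", "pos": "soft-redirect"},
]

kept = drop_redirects(sample)
print([entry["word"] for entry in kept])  # prints: ['colour', 'run']
```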
stem
Full Name: Stem
Description: The part of a word to which inflectional affixes are attached. The stem may include the root plus derivational affixes.
suffix
Full Name: Suffix
Description: An affix added to the end of a word to modify its meaning or create a new word. Examples include "-ing", "-ed", "-ly", "-tion".
syllable
Full Name: Syllable
Description: A unit of pronunciation having one vowel sound, with or without surrounding consonants, forming the whole or a part of a word.
symbol
Full Name: Symbol
Description: A character or mark that represents something else, such as mathematical symbols (+, -, ×), currency symbols ($, €, £), or other special characters.
typographic variant
Full Name: Typographic Variant
Description: An alternative form of a word or character that differs in typography but represents the same linguistic item, such as "œ" vs "oe" or different ligatures.
unknown
Full Name: Unknown
Description: A part of speech that could not be determined or classified during the extraction process.
verb
Full Name: Verb
Description: A word that expresses an action, state, or occurrence. Verbs function as the main element of predicates and can be conjugated for tense, mood, aspect, and voice.
Summary
This guide describes 55 POS tags, ranging from common categories such as noun, verb, and adjective to specialized linguistic terms such as circumfix, converb, and classifier. This diversity reflects Wiktionary's coverage of many languages and writing systems.
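To use the tags in this guide programmatically, a small lookup table from tag to full name is usually enough. The sketch below covers only a handful of tags and tallies a list of tags by full name; `POS_FULL_NAMES`, `full_name`, and `count_by_full_name` are hypothetical names, and the input list is made up for illustration.

```python
from collections import Counter

# Subset of the tags in this guide, mapped to their full names.
POS_FULL_NAMES = {
    "adj": "Adjective",
    "adv": "Adverb",
    "conj": "Conjunction",
    "det": "Determiner",
    "noun": "Noun",
    "prep": "Preposition",
    "pron": "Pronoun",
    "verb": "Verb",
}


def full_name(tag: str) -> str:
    """Look up a tag's full name, falling back to "Unknown" for unlisted tags."""
    return POS_FULL_NAMES.get(tag, "Unknown")


def count_by_full_name(tags):
    """Tally an iterable of POS tags by full name."""
    return Counter(full_name(tag) for tag in tags)


counts = count_by_full_name(["noun", "verb", "noun", "xyz"])
print(counts["Noun"], counts["Verb"], counts["Unknown"])  # prints: 2 1 1
```

Falling back to "Unknown" mirrors the unknown tag above, so unexpected tags are counted rather than raising an error.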