Linguistic Category and Subcategory Learning
The organization of words into linguistic categories, and the generalization from seen word combinations to novel ones, account for important aspects of how linguistic knowledge expands in the early stages of language acquisition. But this process is far from trivial: the available cues to grammatical categories are often abstract and sparse. Learners never see the entire input corpus during natural language acquisition, so they must infer the proper contexts for new words while keeping in mind that some words carry lexically specific restrictions (such as give vs. donate: despite their similar meanings, Joe can give David a book, but Joe cannot *donate David a book). Yet in spite of the complexity of grammatical categories, there is little evidence that children miscategorize words based on non-syntactic information (like semantics; Maratsos & Chalkley, 1980). A crucial question, then, is what type of information learners can use to extract the linguistically relevant categories from the input and achieve an adult-like level of language ability. Though there are multiple hypotheses about how humans acquire grammatical categories, computational analyses of natural language corpora have demonstrated the utility of distributional information for cuing syntactic categories (e.g., Mintz, Newport & Bever, 2002). In collaboration with Elissa Newport and Dick Aslin, I have studied the circumstances under which learners use distributional information in the acquisition of grammatical categories.
Across a series of artificial grammar learning experiments, we have identified a number of distributional variables bearing on category structure that can shift learners from maintaining lexical specificity to collapsing words into a single category. Our results demonstrate that learners can and do exploit the distributional information in their input in a principled way to distinguish accidental omissions in the input (i.e., combinations they have not heard yet) from systematic gaps that arise from rules or lexical idiosyncrasies (like the difference between give and donate). Interestingly, this process occurs even without negative evidence or correlated cues to category structure: our results show evidence of robust category (and subcategory) learning based solely on the distributional patterning of words and their surrounding contexts. The full body of this work is currently being written up; one paper is under review, and early versions of these studies have been presented at annual conferences of the Cognitive Science Society [CogSci 2009; CogSci 2010].