On the Boundaries of Phonology and Phonetics



4.3. Syllabic structure


Phonotactic constraints might hint at how the stream of phonemes is organized in the language processing system. The familiar units of phoneme, syllable and word may not be the only ones we use for lexical access and production. There are suggestions that some sub-syllabic elements are involved in those processes as well, that is, that syllables might not have a linear structure but more complex internal representations (Kessler & Treiman, 1997). For that purpose, we will analyze how the phoneme prediction error at a threshold of 0.016 - where the network achieved its best word recognition - is distributed within words with respect to the following sub-syllabic elements: onset, nucleus and coda. The particular hypothesis to be tested is whether Dutch monosyllables follow the structure below, which was also found in English (Kessler & Treiman, 1997).
(Onset - Rhyme (Nucleus - Coda))
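For concreteness, the following minimal Python sketch shows how such a decomposition of a monosyllable into onset, nucleus and coda could be computed; the one-character phoneme coding and the vowel inventory are illustrative assumptions, not the transcription used in our experiments.

VOWELS = set("aeiouyAEIOUY@")  # hypothetical one-character phoneme codes, not the actual Dutch inventory

def split_syllable(phonemes):
    """Split a monosyllabic phoneme string into (onset, nucleus, coda)."""
    vowel_positions = [i for i, p in enumerate(phonemes) if p in VOWELS]
    first, last = vowel_positions[0], vowel_positions[-1]
    onset = phonemes[:first]
    nucleus = phonemes[first:last + 1]
    coda = phonemes[last + 1:]
    return onset, nucleus, coda

print(split_syllable("strand"))  # ('str', 'a', 'nd')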
The distribution of phoneme error within words (Table 4a) shows that the network makes more mistakes at the beginning than at the end of words, where the SRN becomes more confident in its decisions. This can be explained by the accumulating contextual information, which increasingly restricts the possible phonemic combinations. A more detailed analysis of the error position within the onset, the nucleus and the coda reveals further interesting phenomena (Table 4b).

Table 4. Distribution of phoneme prediction error at a threshold of 0.016 by (a) phoneme position within words and (b) phoneme position within sub-syllables. Word and onset positions start from 2, because the prediction starts after the first phoneme.

(a)
Word Position      2      3      4      5      6      7      8
Error (%)          4.3    1.7    1.4    0.6    0.3    0.3    0.0

(b)
Sub-syllable       Onset         Nucleus    Coda
Relative Position  2      3      1          1      2      3      4
Error (%)          2.6    0.0    4.5        1.0    1.5    2.0    2.6
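As an aside, the percentages in Table 4 can be obtained by straightforward counting. The Python sketch below shows one way to tabulate them, assuming that for every prediction we have recorded its (2-based) position and whether it exceeded the error threshold; how the original counts were normalised is our assumption, not something stated above.

from collections import Counter

def error_by_position(error_positions, prediction_positions):
    """Percentage of threshold-exceeding predictions at each position."""
    errors = Counter(error_positions)            # positions where an error occurred
    predictions = Counter(prediction_positions)  # positions where any prediction was made
    return {pos: 100.0 * errors[pos] / total
            for pos, total in sorted(predictions.items())}

# hypothetical toy input: five predictions, one error at position 2
print(error_by_position([2], [2, 3, 4, 2, 3]))   # {2: 50.0, 3: 0.0, 4: 0.0}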

First, the error within the coda increases toward the coda's end. We attribute this to error accumulating toward the end of words, as predicted earlier. The mean entropy in the coda (1.32; σ = 0.87) is smaller than the mean entropy in the onset (1.53; σ = 0.78), where we do not observe such an effect, so looser constraints are not the reason for the relatively greater error in the coda. Next, the error at the onset-nucleus transition is much higher than the error at the surrounding positions, which means that the break between the onset and the rhyme (the nucleus-coda conjunction) is significant. This distribution is also consistent with the statistical finding that the entropy is larger in the body (at the onset-nucleus transition point) (3.45; σ = 0.39) than in the rhyme (1.94; σ = 1.21). All these data support the hypothesis that onset and rhyme play significant roles in lexical access and that the syllabic structure confirmed for English by Kessler & Treiman (1997) is valid for Dutch as well.
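The entropy values quoted above can be read as the Shannon entropy (in bits) of the distribution of phonemes occurring at a given sub-syllabic position in the lexicon. The sketch below illustrates the measure on a toy sample; the data are invented for illustration only.

import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of a symbol sample."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# e.g. entropy of the first coda phoneme over a small, invented set of rhymes
first_coda_phonemes = ["n", "t", "n", "k", "s", "t"]
print(round(entropy(first_coda_phonemes), 2))    # 1.92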


5. Conclusions


Phonotactic constraints restrict the way phonemes combine to form words. These constraints are empirical and can be abstracted from the lexicon, either by extracting rules directly or via models of that lexicon. Existing language models are usually based on abstract symbolic methods, which provide good tools for studying such knowledge. But linguistic research from a connectionist perspective can offer a fresh view of language, because the brain and artificial neural networks share principles of computation and data representation.

Connectionist language modeling, however, is a challenging task. Neural networks use distributed processing and continuous computations, while languages have a discrete, symbolic nature. This means that some special tools are necessary if one is to model linguistic problems with connectionist models. The research reported in this paper attempted to answer two basic questions: first, whether phonotactic learning is possible at all in connectionist systems, which had been doubted earlier (Tjong Kim Sang, 1995; Tjong Kim Sang, 1998). Given a positive answer, the second question is how NN performance compares to human ability. To draw this comparison, we needed to extract the phonotactic knowledge from a network that has learned the sequential structure, and we proposed several ways of doing this.

Section 3 studied the first question. Even though there are theoretical results demonstrating that NNs have the finite-state capacity needed for phonotactic processing, there are practical limitations, so experimental support was needed to demonstrate that SRNs can learn phonotactics in practice. A key to solving the problems of earlier investigators was to focus on finding a threshold that optimally discriminates the continuous neuron activations with respect to phoneme acceptance and rejection simultaneously. The threshold range at which the network achieves good discrimination is very small (see Figure 2), which illustrates how critical the exact setting of the threshold is. We also suggested that this threshold might be computed interactively, after processing each symbol, which is cognitively plausible, but we postpone a demonstration of this to another paper.
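The threshold search itself amounts to a simple sweep. The Python sketch below illustrates the idea under assumed interfaces: accept(string, t) is a hypothetical callback that runs the trained SRN over a phoneme string and reports whether every successive phoneme's activation exceeds t; it is not part of the implementation described above.

def best_threshold(words, random_strings, accept, candidates):
    """Pick the threshold that best discriminates words from random strings."""
    best_t, best_score = None, -1.0
    for t in candidates:
        acceptance = sum(accept(w, t) for w in words) / len(words)
        rejection = sum(not accept(r, t) for r in random_strings) / len(random_strings)
        score = min(acceptance, rejection)   # require both rates to be high at once
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score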

The network's performance on word recognition - a word acceptance rate of 95% and a random string rejection rate of 95% at a threshold of 0.016 - competes with the scores of symbolic techniques such as Inductive Logic Programming and Hidden Markov Models (Tjong Kim Sang, 1998), both of which reflect low-level human processing architecture with less fidelity.

Section 4 addressed the second question of how other linguistic knowledge encoded in the network can be extracted. Two approaches were used. Section 4.1 clustered the weights of the network, revealing that the network had independently become sensitive to established phonetic categories.
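A weight-clustering analysis of this kind can be reproduced, for example, with standard hierarchical clustering; the sketch below is only an illustration under assumed inputs (a matrix with one input-to-hidden weight vector per phoneme), not the exact procedure of Section 4.1.

from scipy.cluster.hierarchy import linkage, dendrogram

def cluster_phoneme_weights(weights, phoneme_labels):
    """weights: (n_phonemes, n_hidden) array, one row per input phoneme."""
    tree = linkage(weights, method="ward")   # agglomerative clustering of weight vectors
    # return the phoneme labels in dendrogram (leaf) order for inspection
    return dendrogram(tree, labels=phoneme_labels, no_plot=True)["ivl"]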

We went on to analyze how various factors that have been shown to play a role in human performance find their counterparts in the network's performance. Psycholinguistics has shown, for example, that the ease and speed with which spoken words are recognized are monotonically related to the frequency of those words in language experience (Luce et al., 1990). The model likewise reflected the importance of neighborhood density in facilitating word recognition, which we speculated stems from the supportive evidence that similar patterns lend to the words in their neighborhood. Whenever the network and human subjects exhibit a similar sensitivity to well-established parameters, we see a confirmation of the plausibility of the chosen architecture.
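Neighborhood density is commonly operationalised as the number of lexical items that differ from a target word by a single phoneme substitution, deletion or insertion. The sketch below computes that count over a lexicon of phoneme strings; this is the standard definition, and we assume, rather than know, that it matches the exact measure used by Luce et al. (1990).

def neighborhood_density(word, lexicon):
    """Count lexicon items one phoneme edit away from the target word."""
    def one_edit_apart(a, b):
        if abs(len(a) - len(b)) > 1:
            return False
        if len(a) == len(b):                       # single substitution
            return sum(x != y for x, y in zip(a, b)) == 1
        short, long_ = (a, b) if len(a) < len(b) else (b, a)
        # single insertion/deletion: the longer string minus one symbol equals the shorter
        return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))
    return sum(one_edit_apart(word, w) for w in lexicon if w != word)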

Finally, the distribution of the errors within words revealed another linguistically interesting result. In particular, the network tended to err more often at the onset-nucleus transition, which is also typical of transitions between adjacent words in the speech stream and is exploited for speech segmentation. By analogy, we can conclude that the nucleus-coda unit - the rhyme - is a significant linguistic unit for Dutch, a result suggested earlier for English (Kessler & Treiman, 1997).

We wind up this conclusion with one disclaimer and a repetition of the central claim. We have not claimed that SRNs are the only (connectionist) model capable of dynamic processing, nor that they are the most biologically plausible neural network. Our central claim is to have demonstrated that relatively simple connectionist mechanisms have the capacity to model and learn phonotactic structure.



