
On the Boundaries of Phonology and Phonetics


6. Discussion and Conclusion

In section 4, we presented our phonological account of the restructuring within the framework of Optimality Theory (OT). Our main conclusion is that phonetic compression cannot be the sole explanation of the different rhythm patterns. Although the results do not fully confirm our hypothesis that different rates of speaking correspond to different grammars, i.e. constraint rankings, the rhythm patterns do appear to be related to speech rate. The fast speakers display different grammars, i.e. constraint rankings, for different rates of speaking: in their andante tempo, correspondence constraints prevail, whereas in allegro tempo markedness constraints dominate the correspondence constraints. These preferences resemble those of andante and allegro music. In both disciplines, clashes are avoided in allegro tempo by enlarging the distances between beats.

In section 5, we attempted to confirm our phonological account with a phonetic analysis. Unfortunately, the phonetic correlates of stress (duration, pitch, intensity and spectral balance) do not show the expected and perceived differences in rhythm patterns in all pairs. Sluijter (1995) found that duration is the main correlate of primary stress, with spectral balance as an important second characteristic. In our analysis, however, neither differences in duration nor differences in spectral balance could identify secondary stress. We therefore have to conclude that our analysis supports earlier work by Shattuck-Hufnagel et al. (1994), Cooper and Eady (1986), Huss (1978) and Grabe and Warren (1995), who all claim that unambiguous acoustic evidence for secondary stress cannot be found. Although we did find some differences in duration, spectral balance or pitch, these differences were not found systematically in all pairs in which we perceived rhythmic variability. Finally, we discussed rhythmic timing as a cue for variable patterns. However, the hypothesis that the duration between prominent syllables is approximately equal in both andante and allegro speech was not confirmed by the auditory analysis of the data. It seems that rhythmic restructuring is more a matter of perception than of production. At this point, the question remains: are we fooled by our brains, so that there is no phonetic correlate of the perceived phonological stress shifts in the acoustic signal, or do we have to conclude that the real phonetic correlate of secondary stress has yet to be found?

References



Boersma, Paul, and David Weenink (1992-2002). PRAAT, phonetics by computer. Available at, University of Amsterdam.

Burzio, Luigi (1998). Multiple Correspondence. Lingua, 104: 79-109.

Cooper, W., and J. Eady (1986). Metrical phonology in speech production. Journal of Memory and Language, 25: 369-384.

Couper-Kuhlen, Elizabeth (1993). English speech rhythm: form and function in everyday verbal interaction. Benjamins, Amsterdam.

Cummins, Fred, and Robert Port (1998). Rhythmic constraints on stress timing in English. Journal of Phonetics, 26(2): 145-171.

Eefting, Wieke, and Toni Rietveld (1989). Just noticeable differences of articulation rate at sentence level. Speech Communication, 8: 355-361.

Gilbers, Dicky, and Wouter Jansen (1996). Klemtoon en ritme in Optimality Theory, deel 1: hoofd-, neven-, samenstellings- en woordgroepsklemtoon in het Nederlands [Stress and rhythm in Optimality Theory, part 1: primary stress, secondary stress, compound stress and phrasal stress in Dutch]. TABU, 26(2): 53-101.

Gilbers, Dicky, and Maartje Schreuder (to appear). Language and Music in Optimality Theory. Proceedings of the 7th International Congress on Musical Signification 2001, Imatra, Finland. Extended manuscript available as ROA-571.

Grabe, Esther, and Paul Warren (1995). Stress shift: do speakers do it or do listeners hear it? In: Connell, Bruce and Amalia Arvaniti (eds.). Phonology and phonetic evidence. Papers in Laboratory Phonology IV.

Hart, Johan, René Collier, and Antonie Cohen (1990). A perceptual study of intonation. An experimental-phonetic approach to speech melody. Cambridge University Press, Cambridge.

Huss, V. (1978). English word stress in the postnuclear position. Phonetica, 35: 86-105.

Kager, René (1994). Ternary rhythm in alignment theory. ROA-35.

Legendre, Geraldine, Yoshiro Miyata, and Paul Smolensky (1990). Harmonic Grammar - A formal multi-level connectionist theory of linguistic well-formedness: An application. In: Proceedings of the Twelfth Annual Meeting of the Cognitive Science Society, 884-891.

Lerdahl, Fred, and Ray Jackendoff (1983). A Generative Theory of Tonal Music. The MIT Press, Cambridge, Massachusetts, London, England.

Liberman, Mark (1975). The Intonational System of English. Garland, New York and London.

McCarthy, John J. (1986). OCP Effects: Gemination and antigemination. Linguistic Inquiry, 17: 207-263.

Neijt, Anneke, and Wim Zonneveld (1982). Metrische fonologie - De representatie van klemtoon in Nederlandse monomorfematische woorden. [Metrical phonology – The representation of stress in Dutch monomorphemic words] De nieuwe Taalgids, 75: 527-547.

Prince, Alan, and Paul Smolensky (1993). Optimality Theory: constraint interaction in generative grammar. Ms., ROA-537.

Quené, Hugo, and Robert F. Port (2002). Rhythmical factors in stress shift. Paper presented at the 38th Meeting of the Chicago Linguistic Society, Chicago.

Rietveld, Toni, and Vincent van Heuven (1997). Algemene Fonetiek. [General Phonetics]. Dick Coutinho, Bussum.

Schreuder, Maartje, and Dicky Gilbers (submitted). Restructuring the melodic content of feet. In: Proceedings of the 9th International Phonology Meeting 2002, Vienna, Austria.

Shattuck-Hufnagel, Stephanie, Mari Ostendorf, and Ken Ross (1994). Stress shift and early pitch accent placement in lexical items in American English. Journal of Phonetics, 22: 357-388.

Sluijter, Agaath (1995). Phonetic Correlates of Stress and Accent. HIL dissertations 15, Leiden University.

Sluijter, Agaath, and Vincent van Heuven (1996). Spectral balance as an acoustic correlate of linguistic stress. Journal of the Acoustical Society of America, 100(4): 2471-2485.

List of Addresses

Drs. Markus Bergmann

University of Groningen, Faculty of Arts, Department of Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635982

Drs. Tamás Bíró

University of Groningen, Faculty of Arts, Department of Computational Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3636852

Dr. Dicky Gilbers

University of Groningen, Faculty of Arts, Department of Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635983

Dr. Charlotte Gooskens

University of Groningen, Faculty of Arts, Department of Scandinavian Languages and Cultures

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635827

Dr. Dr. Tjeerd de Graaf and Drs. Nynke de Graaf

University of Groningen, Faculty of Arts, Department of Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635982

Drs. Angela Grimm

University of Groningen, Faculty of Arts, Department of Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635920

Dr. Ing. Wilbert Heeringa

University of Groningen, Faculty of Arts, Department of Computational Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635970

Prof. Dr. Vincent J. van Heuven

University of Leiden, Faculty of Arts, Department of Linguistics

Van Wijkplaats 4, 2311 BX Leiden, The Netherlands

+31 71 5272105

Nienke Knevel

p/a University of Groningen, Faculty of Arts, Department of Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635983

Dr. Jurjen van der Kooi

University of Groningen, Faculty of Arts, Department of Frisian

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635966

Prof. Dr. Ir. John Nerbonne

University of Groningen, Faculty of Arts, Department of Computational Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635815

Drs. Maartje Schreuder

University of Groningen, Faculty of Arts, Department of Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635920

Drs. Hidetoshi Shiraishi

University of Groningen, Faculty of Arts, Department of Linguistics

Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

+31 50 3635982

Dr. Ivilin Stoianov

University of Padova, Department of General Psychology

Via Venezia 8, 35100 AS Padova, Italy

+39 049 8276676

1 The experiments reported in this chapter were run by Susanne Strik and Josien Klink in partial fulfillment of the course requirements for the Experimental Phonetics Seminar taught by the Linguistics Programme at the University of Leiden.

2 This commutation procedure is best viewed as a mental experiment; when the exchange is implemented through actual digital tape splicing, the result is more often than not an uninterpretable stream of sound.

3 The nature of the distinction between intonational categories is problematic for a further reason: inter-listener agreement on the identity of intonational events is low (Pitrelli et al., 1994), particularly in comparison with the self-evident consensus on segmental distinctions. This lack of consistency has led Taylor (1998) to reject a basic principle of (intonational) phonology, namely its categorical nature. With respect to methodology, researchers tend to act as expert listeners, linking contours that sound distinct to pragmatic meaning in an intuitive fashion. Accordingly, inter-researcher agreement may be low, too (e.g. Caspers, 1998).

4 Nevertheless, large between-listener variability has been reported, for instance, in the cuing of the voiced/voiceless contrast by the duration of the pre-burst silent interval: the boundary was at 70 ms for subject #1 and over 100 ms for subject #7 (Slis & Cohen, 1969). These results are commented on by Nooteboom & Cohen (1976: 84) as follows: ‘Although the cross-over from /d/ to /t/ proceeds rather gradually when averaged over all listeners, the boundary is quite sharply defined for individual listeners’ (my translation, VH).

5 The ‘%’ sign following the tone letter (as in ‘L%’, ‘H%’) denotes a domain-final boundary; domain-initial boundaries are coded by the ‘%’ sign preceding a tone letter (as in ‘%L’, ‘%H’). A ‘%’ sign unaccompanied by a tone letter may only occur in domain-final positions, where it is phonetically coded by a physical pause and/or pre-boundary lengthening only.

6 It has been argued by structuralists at least as far back as Merckens (1960) that V1 (‘verb first’) is directly opposed to V2 (‘verb second’) in signaling, for example, ‘non-assertion’ rather than ‘assertion’, since neither a command nor a question nor a condition expresses an ongoing state of affairs.

7 A sequence like Neemt u de trein naar Wageningen might in addition be interpretable as a topic-drop sentence (e.g. [Dan/Daar] neemt u de trein naar Wageningen ‘[Then/There] you take the train to Wageningen’, analogous to Doen we! ‘We'll do [it]’ or Weet ik! ‘[That] I know’). Although this added interpretation (with a ‘deleted’ element) is theoretically possible, we believe that it was highly unlikely under the controlled conditions of the experiment. Furthermore, none of the experimental subjects volunteered the information that we had forgotten such an extra interpretation.

8 This position does not exclude the possibility that statement and imperative are subtly different in their paralinguistic use of prosody. For instance, the overall pitch of the imperative may be lower, and it may be said with greater loudness and larger/higher pitch excursions on the accented syllables. This does not invalidate our claim that both statements and imperatives are coded by the ‘L%’ terminal boundary.

9 The ERB scale (Equivalent Rectangular Bandwidth) is currently held to be the most satisfactory psychophysical conversion for pitch intervals in human speech (Hermes & van Gestel, 1991; Ladd & Terken, 1995). The conversion from Hertz (f) to ERB (E) is achieved by a simple formula: E = 16.6 * log (1 + f / 165.4).
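The conversion in note 9 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the chapter: it assumes that "log" in the formula means the base-10 logarithm, and the function names `hz_to_erb`, `erb_to_hz` and `interval_in_erb` are hypothetical names introduced here. The constants 16.6 and 165.4 are taken directly from the note.

```python
import math

# Constants as given in note 9; the base-10 logarithm is an assumption.
ERB_SCALE = 16.6
ERB_CORNER_HZ = 165.4


def hz_to_erb(f_hz: float) -> float:
    """Convert a frequency in Hertz to ERB: E = 16.6 * log10(1 + f / 165.4)."""
    if f_hz < 0:
        raise ValueError("frequency must be non-negative")
    return ERB_SCALE * math.log10(1.0 + f_hz / ERB_CORNER_HZ)


def erb_to_hz(e_erb: float) -> float:
    """Invert the formula above: f = 165.4 * (10 ** (E / 16.6) - 1)."""
    return ERB_CORNER_HZ * (10.0 ** (e_erb / ERB_SCALE) - 1.0)


def interval_in_erb(f_low_hz: float, f_high_hz: float) -> float:
    """Size of a pitch interval, expressed as a difference on the ERB scale."""
    return hz_to_erb(f_high_hz) - hz_to_erb(f_low_hz)


if __name__ == "__main__":
    # Hypothetical pitch movement from 100 Hz to 150 Hz.
    print(round(interval_in_erb(100.0, 150.0), 3))
```

Because the scale is logarithmic with an additive offset, the same rise in Hertz yields a smaller ERB interval at higher starting frequencies, which is the property that makes the scale suitable for comparing pitch excursions.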

10 Dr. Tjeerd de Graaf, the central figure in this volume, was born in Leeuwarden, the capital of Friesland. Leeuwarden is one of the places where Town Frisian is spoken. Tjeerd de Graaf is a native speaker of this dialect, but later on he also learned (standard) Frisian. The Leeuwarden speaker in the present investigation was Tjeerd de Graaf (see Section 3.1).

11 Most of this section is based on König and Van der Auwera (1994).

12 The Lillehammer recording can be found at together with 52 recordings of other Norwegian dialects.

13 Since our material included two toneme languages, Swedish and Norwegian, the two tonemes I and II were also transcribed. For the other varieties primary stress was noted. Stress and tonemes were, however, not included in the calculation of linguistic distances.

14 The program PRAAT is a free public-domain program developed by Paul Boersma and David Weenink at the Institute of Phonetic Sciences of the University of Amsterdam and available at

15 The data is taken from the Linguistic Atlas of the Middle and South Atlantic States (LAMSAS) and available via:

16 The example should not be interpreted as a historical reconstruction of the way in which one pronunciation changed into another. From that point of view it may be more obvious to show how [] changed into [tnn]. We just show that the distance between two arbitrary pronunciations is found on the basis of the least costly set of operations mapping one pronunciation into another.
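The idea in note 16, that the distance between two pronunciations is the cost of the least costly set of operations mapping one onto the other, can be illustrated with a standard Levenshtein (edit) distance. The sketch below is an assumption-laden illustration rather than the chapter's actual procedure: it uses unit costs for insertion, deletion and substitution, whereas the original study may weight operations differently, and the function name `levenshtein` and its parameters are hypothetical.

```python
from typing import Sequence


def levenshtein(source: Sequence, target: Sequence,
                ins_cost: float = 1, del_cost: float = 1,
                sub_cost: float = 1) -> float:
    """Cost of the cheapest sequence of insertions, deletions and
    substitutions that maps `source` onto `target`.

    Unit costs are an assumption made for illustration only.
    """
    # prev[j] holds the cost of turning the empty prefix of source
    # into the first j symbols of target (j insertions).
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        # Turning the first i source symbols into nothing takes i deletions.
        curr = [i * del_cost]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + del_cost,                       # delete s
                curr[j - 1] + ins_cost,                   # insert t
                prev[j - 1] + (0 if s == t else sub_cost) # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Orthographic stand-ins; real input would be sequences of phonetic symbols.
    print(levenshtein("kitten", "sitting"))
```

The function accepts any sequences, so transcriptions can be passed as lists of phonetic symbols instead of plain strings when a single sound is written with more than one character.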

17 Tjeerd de Graaf has never taken such an extreme position. Possibly speakers of Town Frisian have a more moderate opinion towards this issue since Town Frisian is more closely related to standard Dutch, as appeared in Figure 5 and Table 3.

18 The authors are particularly pleased to offer this piece to a Festschrift honoring Dr. Dr. h.c. Tjeerd de Graaf, who graciously agreed to cooperate in the supervision of Stoianov's Ph.D. project 1997-2001 at the University of Groningen. Even if Tjeerd is best known for his more recent work on descriptive linguistics, minority languages and language documentation, his early training in physics and earlier research on acoustic phonetics made him one of the best-suited supervisors for projects such as the one reported on here involving advanced learning algorithms. Tjeerd's sympathy with Eastern European languages and cultures is visceral and might have led him to agree in any case, but we particularly appreciated his phonetic acumen.

19 The distance is related to Euclidean, but more exactly the distance between the two n-dimensional vectors is


20 The cluster analysis in Figure 3 was produced by programs written by Peter Kleiweg, available at

21 The mass of the electron neutrino (νe) is reported to be less than 2.2 eV, the mass of the muon neutrino (νμ) does not exceed 170 keV, while the mass of the tau neutrino (ντ) is reported to be below 15.5 MeV. For the sake of comparison, the mass of an electron is 511 keV, while the mass of a proton is almost 940 MeV.

22 Physical phenomena are thought to be reducible to four fundamental forces. These are gravity, electromagnetism, weak interaction and strong interaction. The last two play a role in sub-atomic physics.

23 The photons (particles of light) are the exchange particles for the electromagnetic interaction; the hypothetical gravitons should transmit gravitation; in the case of the weak interaction, the W+, W− and Z vector bosons play that role; whereas the strong interaction is mediated by pions.

24 Targumim (plural of targum) are the Jewish Aramaic versions of the Hebrew Bible from late antiquity, which include many commentaries besides the pure translation. In the same way as the Jews of late antiquity created a commented translation of the Holy Scriptures in their native tongue and according to their way of thinking, Moses Mendelssohn expected his version of the Bible to fit the modern way of thinking and the “correct language” of its future readers. Obviously, the Biur would first have to fulfil its previous task, namely to teach the modern way of thinking and the “correct tongue” to the first generation of its readers. Interestingly enough, script was not such a major issue for Mendelssohn as “language purity”; thus he wrote Hochdeutsch in Hebrew characters, in order to better disseminate his work among the Jewish population.

25 I assume that the formative phase of modern Dutch society and culture in the 17th and 18th century is comparable to that of 19th century Hungary; even more so is the role of Jewry in both countries, as a group which was simultaneously integrating into the new society and also forming it. In both cases, the presence of the continuous spectrum from the pre-Haskala Yid to the self-modernizing Israelite led to a gradual, though determined, giving up of the Yiddish language. This socio-historical parallelism could partially explain why phenomena of Yiddish influence on Dutch are often similar to those on Hungarian.

Concerning Dutch-Jewish linguistic interactions, readers interested in Jewish aspects of Papiamentu, a creole language spoken in the Netherlands Antilles, are referred to Richard E. Wood’s article in Jewish Language Review 3 (1983):15-18.

26 The etymology of the Yiddish word itself is also interesting. The origin is the late Latin or Old French root [] ‘to read’ (cf. Latin lego, legere, modern French je lis, lire), which was borrowed by the Jews living in early medieval Western Europe. The latter would then change their language to Old High German, the ancestor of Yiddish. At some point, the meaning of the Old French word was restricted to the public reading of the Torah-scroll in the synagogue.

27 Compare sí ‘ski’ > síel ‘to ski’, tűz ‘fire’ > tüzel ‘to fire’; also: printel ‘to print with a computer printer’. It is extremely surprising that the word lejnol does not follow vowel harmony; one would expect *lejnel. Even though the [] sound can be transparent for vowel harmony, this fact is not enough to explain the word lejnol. Probably the dialectal Yiddish laynen was originally borrowed, and this form served as the base for word formation, before the official Yiddish form leynen influenced the Hungarian word. Some people still say lájnol.

28 When being called to the Torah during the public reading, one recites a blessing, the text of which says: “He Who blessed our forefathers Abraham, Isaac and Jacob, may He bless [the name of the person] because he has come up to the Torah / who has promised to contribute to charity on behalf of… etc.” The part of the text ‘who has promised’ sounds in the Ashkenazi pronunciation []. This is most probably the source of the word snóder, after the vowel in the unstressed last syllable has become a schwa, a process that is crucial for understanding the Yiddishization of Hebrew words. The exciting part of the story is that the proclitic [] (‘that’) was kept together with the following finite verbal form ([] ‘he promised’), and they were reanalysed as one word.

29 When I asked people about the meaning of unberufn on the mailing list, somebody reported that her non-Jewish grandmother also used to say unberufn with a similar meaning.

30 Other Hungarian words of Hebrew origin do not come from Yiddish, as shown by their non-Ashkenazi pronunciation: Tóra ([] ‘Torah’, as opposed to its Yiddish counterpart Toyre) or rabbi (and not rov or rebe). Words like behemót (‘big hulking fellow’), originally from Biblical Hebrew behema (‘cattle’, plural: behemot; appearing also as a proper name both in Jewish and in Christian mythology) should rather be traced back to Christian Biblical tradition.

31 Note that the word has kept its original word-initial [f], without transforming it into [p], which would have been predicted by Hebrew phonology. Although this is a remarkable fact for Netzer, it turns out that almost no word borrowed by Modern Hebrew changes its initial [f] to [p], not even verbs that have had to undergo morpho-phonological processes (e.g. fibrek from English to fabricate). The only exception I have found in dictionaries is the colloquial form pilosofiya for filosofiya ‘philosophy’, as well as the verb formed from it, pilsef ‘to philosophise’. Furthermore, it can be argued that pilosofiya is not even a modern borrowing. The only reason why one would still expect firgen ]. On the other hand, one may claim that /f/ and /p/ should be considered as distinct phonemes in Modern Hebrew, even if no proposed minimal pair that I know of is really convincing.

32 “…identity effects will come into play only to the extent that the immediate constituents composing the complex structure constitute independently occurring outputs…(Kenstowicz 1996: 373)”, “The base of an OO-correspondence relation is a licit output word, which is both morphologically and phonologically well-formed (Benua 1997a: 29)”, “The bound form of a stem is segmentally identical with its corresponding free form (Ito and Mester 1997: 431)”.

33 See for more information.

34 The rhotic r of Nivkh is classified here and elsewhere in the literature (e.g. Trubetzkoj 1939) as a voiced fricative since it patterns as such in the CA system. Its voiceless counterpart r is an apical trill containing portions without vocal cord vibration (Ladefoged and Maddieson 1996: 236).

35 Regarding this nature of CA, one may postulate a single laryngeal feature (rather than two) for both plosives and fricatives, e.g. [+spread glottis] for both aspirated plosives and voiceless fricatives. Such an analysis is proposed by Jakobson (1957) and Blevins (1993). See also section 3 below.

36 Segments that underwent CA are put in square brackets. Abbreviations are: ALL = allative, asp = aspiration, I = Intonational phrase, INF = infinitive, NP = noun phrase, PL = plural, VP = verb phrase, XP = maximal projection.

37 The alternation (r >) t > d is due to post-nasal voicing.

38 CA exhibits aspects of prosodic phonology (I am using this term to contrast with lexical phonology); it is sensitive to pause insertions and to speech rate. I would classify it as a P-structure rule in the terminology of Selkirk (1986). P-structure rules exhibit phonological properties of prosodic phonology, yet they are sensitive to syntactic bracketing (Selkirk 1986).

39 This line of analysis has antecedents. Amongst them are: Kenstowicz and Kisseberth (1979), Rushchakov (1981), Kaisse (1985), and Blevins (1993). Interestingly, Lev Shternberg, the pioneer of Nivkh study, assumed plosive-initial forms to be the input to transitive structures, as well (Shternberg 1908).

40 Spirantization and hardening are not ordered relative to each other in the tableau below.

41 The post-nasal context requires a different markedness constraint, but I omit it from the discussion below. See Shiraishi (2000) for details.

42 Following a velar or a uvular plosive, the initial velar of a suffix appears as [x], spirantizing the former at the same time: tx-xu <tk+PL ‘fathers’.

43 On the other hand, OO-constraints are known to be able to make reference to non-contrastive features. See Benua (1997b) and Steriade (2000) for such cases.

44 This paper is an extension of our paper "Restructuring the melodic content of feet", which has been submitted to the proceedings of the 9th International Phonology Meeting: Structure and melody, Vienna 2002. We wish to thank Grzegorz Dogil, Hidetoshi Shiraishi, the participants of the 9th International Phonology Meeting, Vienna 2002, and the participants of the 11th Manchester Phonology Meeting, Manchester 2003, for their useful comments. We are also grateful to Sible Andringa, Nynke van den Bergh, Gerlof Bouma, John Hoeks, Jack Hoeksema, Wander Lowie, Dirk-Bart den Ouden, Joanneke Prenger, Ingeborg Prinsen and Femke Wester for participating in our experiment. We especially thank Wilbert Heeringa and Hugo Quené for supplying us with the PRAAT scripts that we could use for our spectral balance and rhythmic timing analyses.

45 For reasons of clarity, we abstract from constraints such as FootBinarity (FtBin) and Weight-to-stress principle in Table 2. Although these constraints play an important role in the Dutch stress system (cf. Gilbers & Jansen, 1996), the conflict between output-output correspondence and foot repulsion is essential for our present analysis.

46 With respect to the phonological analysis of the data, we suggest a random ranking of weighted correspondence and markedness constraints. By weighting constraints we adopt an OT variant that more or less resembles the analyses in OT’s predecessor Harmonic Grammar (cf. Legendre, Miyata & Smolensky, 1990). Note that we do not opt for a co-phonology for allegro-style speech in our analysis. In a co-phonology, the output of the andante-style ranking is the input or base for the allegro-style ranking. We opt for a random ranking with different preferences for allegro and andante speech, because our data show variable rhythmic structures at both rates. Both rankings evaluate the same input form.

47 The perceived loudness depends on the frequency of the tone. The phon unit is defined using a 1 kHz tone and the decibel scale. A pure sine tone at any frequency with a loudness of 100 phon is as loud as a pure tone of 100 dB at 1 kHz (Rietveld and Van Heuven, 1997: 199). We are most sensitive to frequencies around 3 kHz. The hearing threshold rises rapidly around the lower and upper frequency limits, which are about 20 Hz and 16 kHz respectively.
