1. Introduction
This paper examines speaker tolerance for word-initial phonological consonant alternations in Iwaidja (Pym & Larrimore, 1979; ISO 639–3 code ibd) and Mawng (Capell & Hinch, 1970; ISO 639–3 code mph), two Indigenous Australian languages of the Iwaidjan family (Mailhammer & Harvey, 2018). Both Iwaidja and Mawng have been described as having historical, now lexicalised, grammatically conditioned initial mutations in verb and noun roots (Evans, 2000, p. 102), as exemplified in (1.1) in bold. However, Mawng is also described as having synchronic external sandhi processes affecting initial segments at word boundaries (Capell & Hinch, 1970, p. 44), as exemplified in (1.2), also in bold. This situation provides an excellent opportunity to investigate, as we do here, the effect of initial segmental phonetic and phonological faithfulness on word recognition in Mawng and Iwaidja, and the potential differences between phonological mutation vis à vis phonological or phonetic allophony.
- (1.1)
- a.
- a-mawur
- 3pl-arm
- ‘their arms’
- b.
- bawur
- 3.arm
- ‘his arm’
- (1.2)
- a.
- ɡe
- it.goes
- [ɡ ~ ɣ]apala
- boat
- ‘the boat goes’
- b.
- ɡaɾɡbin
- big
- [ɡ]apala /*ɣabala
- boat
- ‘big boat’
- c.
- mada
- veg.the
- [ɡ ~ w]ubuɲ
- canoe
- ‘the canoe’
- d.
- mariɡ
- NEG
- [ɡ]ubuɲ/*wubuɲ
- canoe
- ‘not a canoe’
In what follows (Section 1.1), we discuss the reliance of current models of continuous parsing on input which is faithful to the underlying form of any given lexeme, as well as the importance of the word-initial segment in lexical activation. We review some cases of initial mutation or alternation (lenition; fortition) that challenge these models. Following from that (Section 1.2), we describe the patterns of mutation and lenition in Mawng and Iwaidja before we present a Two-Alternative Forced Choice (2AFC) study of speaker tolerance for word-initial segmental variation (stops-to-continuants and continuants-to-stops). The results (Section 3) indicate that language-specific phonological, phonetic, and distributional characteristics influence participant tolerance for word-initial alternations and suggest that models of continuous parsing and word recognition must account for the effects of phonology and segmental phonetics as well as phonotactics/distributional information to adequately model processes of word-recognition and parsing.
1.1 Unreliable beginnings are problematic
Despite differences in what is assumed to be the nature of the input (e.g., acoustic-phonetic material, articulatory gestures or constellations, abstract phonemes, or distributional information from sequences of phones), most models of word recognition, like the Cohort Model (Marslen-Wilson & Welsh, 1978), TRACE (McClelland & Elman, 1986), and Shortlist B (Norris & McQueen, 2008) assume that word recognition begins with the word-initial segment. Under most models, this first segment and the following segments feed two concurrent processes: On one hand, they activate possible lexical competitors in the listener’s vocabulary—a top-down process—and on the other, they progressively eliminate competitors that do not match the presented material—a bottom-up input-driven process.
Models like the Cohort Model, TRACE, and Shortlist B invoke connectionist, associative and/or probabilistic mechanisms and assume that the process of word recognition is linear and progressive. For instance, in Shortlist B (Norris & McQueen, 2008; see Figure 1 below), processes of word recognition begin immediately—and the first segment or two may even provide a match to a word in the lexicon—and then continue to use incoming segmental information to activate or eliminate lexical candidates. Eventually, the competition will be won by the highly activated sequence of lexical candidates that successfully accounts for the entire string without any leftover elements (Cutler, 2008; 1996; Norris & McQueen, 2008; Norris, McQueen, & Cutler, 2000).
Models like Shortlist B (Norris & McQueen, 2008) attempt to also account for observations from the predictive parsing literature that many well-established factors play a role in word recognition, including sub-phonemic transitional information (Marslen-Wilson & Warren, 1994), phoneme and lexeme frequency, lexical neighbourhood density (Marslen-Wilson & Welsh, 1978; Metsala, 1997), and adherence to or violation of semantic and syntactic rules (Spoehr, 1980; Miller & Isard, 1963), such that a particular construction may favour an activity verb, a countable noun, or human referent (Smirnova et al., 2019). Research also shows that intonation, as well as word stress and tone, likewise provide cues about upcoming syntactic and lexical structures (Kjelgaard & Speer, 1999; Roll et al., 2010; 2011; Hirose & Mazuka, 2015; Söderström et al., 2017). And it is well-established that not only are whole words activated or eliminated in the lexicon, though Figure 1 might appear to suggest that this is the case, but words that are near-matches to a presented word are also activated (Davis & Taft, 2005; Taft & Forster, 1976). Additionally, word-internal morphological complexity—and speaker morphological knowledge—is also important to word recognition/processing (Bundgaard-Nielsen & Baker, 2020) and morphological competitors have been shown to play a role in the identification of complex words (see e.g., Balling & Baayen, 2012) as listeners may make decisions about the identity of some morphemes before the end of a complex word is reached.
Importantly, though often implicitly, models of continuous parsing and research into human word recognition mechanisms also assume that the input, whatever it is assumed to be in each model, is faithful to the lexical specifications of the entries in the lexicon. To quote Cutler (2008, p. 1602): “Psycholinguists largely take the ‘front end’—the initial processing applied to raw acoustic input—for granted, assuming that it will deliver a representation of the input that is in a form suitable for accessing stored lexical entities” [emphasis added]. And often, this might seem a safe and reasonable assumption to make. Indeed, in the Shortlist B example in Figure 1 (Norris & McQueen, 2008), the British English phrase [ðəkætəlɒɡɪnəlaɪbrɪ] ‘the catalogue in a library’, few would argue with the fact that [kʰæt] ‘cat’ is stored in the minds of English speakers as /kæt/ and will not necessarily be well-recognised if produced by speakers as [ɡæt] (a change in the voicing specifications of the initial segment that crosses the phonetic VOT boundary), or [pæt] (a change in place of articulation), or [ɣæt] or [ɰæt] (a change in manner of articulation from a stop to a fricative [ɣ] or approximant [ɰ] at the velar place of articulation). Changes of this sort potentially lead to activation of the wrong set of lexical competitors, where the phonetic realisation results in a phonological substitution (/kæt/ ‘cat’ to /pæt/ ‘pat’ or /ɡæt/ ‘gat’), or to no activation at all. These ‘wrong paths’ would require reinterpretation or repair of the input at some later point, in a scenario that is not possible under Shortlist B assumptions of linear progression from the root node (R) to a terminal node (T).
This example is extremely important because it demonstrates that psycholinguistic models of word recognition tend to be poorly equipped to deal with variability in the input, even models such as Shortlist B that rely on a combination of path probabilities (i.e., predictions about ‘what comes next’) and information about phonetic confusability between phonemes. We expect this to be particularly problematic in cases of word-initial mutations as found in Welsh, and for the complex (even multidirectional) phone-to-phoneme correspondences explored in Mawng and Iwaidja in the present study.
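The ‘wrong path’ problem in the ‘cat’ example can be illustrated with a toy cohort-style filter. The sketch below is a deliberate simplification and not an implementation of the Cohort Model, TRACE, or Shortlist B: the four-word lexicon, the function name `cohort`, and the strict left-to-right prefix matching are all hypothetical choices made purely for illustration.

```python
# Toy illustration only: a strict prefix filter over a hypothetical mini-lexicon.
# Real models (e.g. Shortlist B) are probabilistic; this sketch is not.
LEXICON = {
    "kæt": "cat",
    "kætəlɒɡ": "catalogue",
    "pæt": "pat",
    "ɡæt": "gat",
}


def cohort(heard: str) -> list:
    """Return the surviving candidate glosses after each incoming segment."""
    stages = []
    candidates = list(LEXICON)
    for i in range(1, len(heard) + 1):
        prefix = heard[:i]
        # Keep only candidates that still match everything heard so far
        candidates = [form for form in candidates if form.startswith(prefix)]
        stages.append(sorted(LEXICON[form] for form in candidates))
    return stages


faithful = cohort("kæt")     # canonical onset
substituted = cohort("pæt")  # place change: a different word is activated
lenited = cohort("ɣæt")      # manner change: nothing in the toy lexicon matches
```

In this toy setting the faithful input keeps ‘cat’ and ‘catalogue’ active throughout, the place substitution activates only ‘pat’, and the lenited onset empties the candidate set at the first segment, with no later opportunity for repair.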
The (insular) Celtic language family, to which Welsh belongs, is well-known for its richness in word-initial consonant mutations. For instance, the word /pont/ ‘bridge’ may appear also as /bont/ or /font/ in Welsh (Ball & Müller, 1992; Fife & King, 1998) depending on grammatical environment. Initial mutations of the kind found in Welsh are often considered morphological rather than phonological (Ussishkin et al., 2017) and raise questions about what is stored in the lexicon (is it one form, or many? And is the lexicon multi-tiered?). Welsh-style alternations in surface forms also raise questions of whether selection of correct forms is the result of rule-application (syntactic, morphological, and/or phonological), as is assumed under generativist/Chomskyan models of language processing (e.g., Chomsky & Halle, 1968; Gaskell & Marslen-Wilson, 1996) or associative patterns that can be accounted for by connectionist, relational or associative morphological processes/schemas and extensive listings in the lexicon (e.g., Rumelhart & McClelland, 1986; Walsh et al., 2010; Jackendoff & Audring, 2020), as well as in TRACE (McClelland & Elman, 1986) and Shortlist B (Norris & McQueen, 2008).
Changes to the surface form of words, and even to the initial segments of words, have also been described for many other languages, including Spanish. In Spanish, voiced stop consonants /b d ɡ/ have two regular and predictable allophones both word-initially and within words, such that stops /b ɡ/ are realised as approximants [β̞ ɣ̞] everywhere but after nasals (where they are realised as stops), and /d/ is realised as approximant [ð̞] everywhere but after nasals and laterals. The Spanish allophonic realisations, however, are unlikely to induce competition in lexical activation, as the lenited forms are not canonical realisations of other Spanish phonemes, and thus do not cause perceptual confusion or neutralisation, or ‘pernicious homophony’ (Campbell, 1996; Blevins & Wedel, 2009). That is, the changes involved are allophonic and ‘non-structure-preserving’, in Kiparsky’s (1982) sense, because the lenited allophones are not segments in the inventory, unlike the case of Welsh and other languages with initial mutation, which do involve substitutions between phonemes, i.e., neutralisation.
Within phonological theory, it has been observed that lenition and neutralisation are far less common in initial position than in later positions in the word (Houlihan, 1975; Beckman, 1998). An explanation for this fact is offered by Smith (2004, p. 1456), who notes that “Nooteboom (1981) and Taft (1984) have argued that speech processing is more efficient when positions that are especially important in (early-stage) word recognition are given as large a number as possible of phonological contrasts to draw from,” and thus word-initial and root-initial syllables are particularly prone to maximise phonological contrasts and resist neutralisation (see also Wedel et al., 2019).
Australian Indigenous languages buck these trends, however. As first pointed out by Dixon (1972), the apical contrast (between apico-alveolar and apico-retroflex) is typically neutralised or may be absent altogether in word-initial position. Australian languages also display patterns of initial phonetic/phonological alternations (Blevins, 2001; Baker, 2014). Initial alternation phenomena among Australian languages are typically phonologically conditioned and involve alternations between phonemes in the inventory, rather than allophones (Baker, 2014). That is, the changes are structure-preserving, as in Welsh, but phonologically-conditioned, as in Spanish, and present a case which is distinct from both. For example, in Wubuy (Heath, 1984), every morpheme which begins in either a continuant consonant or a vowel also has a stop-initial form found after non-continuants (nasals and stops) within words. In (1.3), for example, the verb for ‘race’ begins with an underlying /w/ and surfaces in that form because the preceding segment is a vowel. But in (1.4), the verb begins with the stop /p/ because the preceding segment is a nasal. Heath (1984) calls this alternation ‘hardening’. ‘Hardening’ is completely productive and exceptionless in Wubuy.1
- (1.3)
- ni-wajamaŋi
- 3masc-race.pcon
- (Heath, 1980, Text 27.4)
- ‘he raced’
- (1.4)
- wuru-manpa-man-pajamaŋi
- 3pl-rdp-group-race.pcon
- (Heath, 1980, Text 27.3)
- ‘they raced as a pack’
Few experimental studies of initial alternations in Australian languages have been undertaken, but there is one processing study of phonological hardening in Wubuy (Bundgaard-Nielsen, Baker & Wang, 2023). Using a cross-splicing preference task like the one employed in the present study, this work suggests that phonological hardening of the kind found in Wubuy may be better accounted for in terms of application of a phonological rule modifying an underlying lexeme than by selection between competing lexemes. This contrasts with what has been reported for Scots Gaelic (Ussishkin et al., 2017) and Welsh (Boyce et al., 1987), where the phenomenon is accounted for as a morphological process.
1.2 Word-initial mutations and alternations in Iwaidja and Mawng
Mawng and Iwaidja are non-Pama-Nyungan languages spoken on Croker Island, an island in the Arafura Sea off the coast of the Northern Territory, Australia, approximately 250 km northeast of Darwin.2 Croker Island was the site of the Croker Island Mission (1940–1968). Currently, the only large settlement is the Aboriginal community of Minjilang. According to the Australian Bureau of Statistics 2021 census, 234 people live in Minjilang, of which approximately 90% are Indigenous Australians. Of these, 36% identify as speakers of Iwaidja, 30% as Kunwinjku, and 10% as Mawng (Mawng is primarily spoken in the community of Warruwi on Goulburn Island). Just over 4% of Minjilang’s population identify as speakers of Kuninjku (as with Kunwinjku, a variety of Bininj Gun-wok: Evans, 2003), and another 4% as Burarra speakers. Multilingualism is the norm in the community, and all participants in the present study report speaking at least three languages (Iwaidja, Mawng, and English). Language input domains and usage patterns, however, vary between individuals on Croker Island, depending on family histories and ties. Many speak one or two Indigenous languages as their first language, but there are also individuals who use English very frequently or exclusively, whilst others use English only in domains like school or social or health services. English may also be used as a lingua franca in the absence of a common Indigenous language. Indigenous languages occupy almost all community domains except school and government-related communication. For official community announcements, the default language is usually Iwaidja, and there is no translation into other languages provided (Mailhammer, 2021). This is the only sizable population of speakers of Iwaidja anywhere. Shaw et al. (2020, p. 579) provide a “generous estimate” of around 50 speakers of Iwaidja. Mawng is a somewhat larger language: Singer et al. (2021) estimate around 400 speakers.
Both Iwaidja and Mawng have been described as having historical and grammatically-conditioned initial mutations in verb and noun roots (Evans, 1998; 2000) that differ from Wubuy in being no longer phonologically-conditioned. Examples of the grammatically-conditioned initial mutations in Iwaidja are illustrated in (1.1) and in (1.5)–(1.6), adapted from Mailhammer and Harvey (2018, p. 334). In these examples, there is an alternation between initial /w/ and initial /b/ in the verb stem, but the alternation cannot be predicted on the basis of the phonological environment: In (1.5), /w/ appears after a vowel-final prefix, while we get /b/ in this environment in (1.6). In contrast, /b/ occurs word-initially in (1.5), but following a vowel in (1.6). Therefore, the alternation can only be described in terms of the morphological environment and, in particular, the person/number/gender features of the arguments. Evans (1998) proposes that the alternations have a basis in the historical phonological environment but evidently, they are opaque synchronically and could potentially be accounted for by an approach like that taken for Celtic languages (see, for instance, Ussishkin et al., 2017).
- (1.5)
- a.
- ŋa-wani-ŋan
- 1sg-sit-pst
- ‘I was sitting’
- b.
- Ø-bani-ŋan
- 3sg-sit-pst
- ‘He/She/It was sitting’.
- (1.6)
- a.
- a-bu-ŋ
- 1sg/3sg-hit-pst
- ‘I hit him/her/it’
- b.
- ɻi-wu-ŋ
- 3sg.ma/3sg-hit-pst
- ‘He hit him/her/it’
Our focus here, however, is on the fact that Mawng has additionally been described as having synchronic, phonologically-conditioned external sandhi processes affecting initial /ɡ/ at word boundaries (Capell & Hinch, 1970). The presence of widespread mutations in the grammar of both languages makes the question of the importance of the identity of the word-initial segment a potentially even more interesting one for these languages. Indeed, speakers of Mawng and Iwaidja have been reported to behave linguistically in a way that makes it difficult to align their phonological systems and speech recognition processes with what is typically assumed of speakers of other languages. This makes their linguistic behaviour (here, word recognition with variable input) extremely important from both a theoretical and a descriptive point of view.
The additional synchronic external sandhi processes described for Mawng affect initial segments at word boundaries. Both the grammars of Mawng (Capell & Hinch, 1970) and Iwaidja (Pym & Larrimore, 1979) include a velar approximant or fricative in the inventory, which can only occur phonemically in word-medial positions in either language. However, word-initial velar stop /ɡ/ in Mawng—but not Iwaidja—is reported to lenite to [w], [ɣ] or [ɰ] following vowels, liquids and glides in the preceding word (Capell & Hinch, 1970, p. 44), as in examples (1.2a) and (1.2c), whereas only the stop is found following nasals and stops as in examples (1.2b) and (1.2d).3 There has been no empirical instrumental study of sandhi in Mawng to date. We note, however, that a recent acoustic and ultrasound study with speakers of Iwaidja (Shaw et al., 2020) reports considerable inter- and intra-speaker variation in the realisation of Iwaidjan words involving /ɡ/ and /ɰ/. They conclude that there is no evidence from production for a phonemic contrast between /ɡ/ and /ɰ/ despite evidence for a bimodal distribution of [ɰ] and [ɡ]: The data suggests that some participants use one form, and others typically another (Shaw et al., 2020, p. 608). The study considered only a subset of the environments in which the velar approximant has been described as phonemic, which further qualifies the conclusions. The fact that Mawng displays synchronic lenition and Iwaidja does not is the critical difference explored in the present studies.
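The conditioning environment reported by Capell and Hinch (1970) can be stated as a small decision rule. The following is a minimal sketch that encodes only the description summarised above; the segment classes are partial, illustrative sets rather than the full Mawng inventory, and the function name `initial_g_variants` is hypothetical.

```python
# Sketch of the reported conditioning of word-initial /ɡ/ in Mawng.
# The segment sets are illustrative assumptions, not a complete inventory.
VOWELS = set("aeiou")
LIQUIDS = set("lɭrɾɻ")
GLIDES = set("wj")
NASALS = set("mnɲŋɳ")
STOPS = set("bdɖɟɡg")  # includes both IPA "ɡ" and ASCII "g" for robustness

LENITING = VOWELS | LIQUIDS | GLIDES
BLOCKING = NASALS | STOPS


def initial_g_variants(preceding_word: str) -> str:
    """Describe the reported realisations of a following word-initial /ɡ/."""
    final = preceding_word[-1]
    if final in LENITING:
        return "stop or lenited"   # [ɡ] alternates with [w], [ɣ] or [ɰ]
    if final in BLOCKING:
        return "stop only"         # only [ɡ] is reported
    raise ValueError(f"unclassified final segment: {final!r}")


# Preceding words from examples (1.2a-d)
for word in ["ɡe", "ɡaɾɡbin", "mada", "mariɡ"]:
    print(word, "->", initial_g_variants(word))
```

The rule looks only at the final segment of the preceding word, which is the simplest reading of the description; it says nothing about how often lenition actually occurs in the permitting environments.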
The present studies test the effect of word-initial segmental variability, presumably arising from the phonetically-conditioned synchronic lenition processes exemplified in (1.2a) and (1.2c), on word acceptability in Mawng and Iwaidja. We do so by testing speakers’ tolerance of initial segmental alternations of two kinds, onset lenition (stops to approximants) and onset fortition (unattested change of approximants to stops), in a 2AFC experiment with bilingual Mawng and Iwaidja speakers from Croker Island in the Northern Territory of Australia. The 2AFC paradigm implemented here allows us to examine the way in which surface phonetic alternations influence word activation in speakers. It does so because participant word preference decisions require successful word recognition, and the only thing that differs between the competing selections is the phonetic characteristics of the first segment (if the phonetic difference between forms presented does not matter in terms of word acceptability, neither option will be favoured). The results thus provide novel insights into the importance of first-segment phonetic-phonological fidelity in word recognition (no preference without recognition). The results additionally provide new data on some of the factors that potentially constrain listener tolerance for phonetic variability in word-initial phonological segments.
2. Method
We discuss the stimulus recording and selection in Section 2.1 and the experimental design of the word preference study in Section 2.2. Information about the participants and participant selection is in Section 2.3. Finally, predictions are presented in Section 2.4.
2.1 Stimuli
We recruited one female speaker of Iwaidja and one female speaker of Mawng to produce the stimulus material for the two 2AFC experiments; both speakers were bilingual in Mawng and Iwaidja. The Iwaidja speaker was aged 50 at the time of the recording and reported speaking Iwaidja, Mawng, English, Kunwinjku, and Kunbarlang. She had some literacy in English, Iwaidja, Mawng and Kunwinjku, and was very familiar with a range of language research activities, including recording of narratives and elicitation of material for acoustic analysis, sociolinguistic interviews, and ultrasound for language research purposes.
The Mawng speaker was 64 years old at the time of the recording. She reported speaking Mawng, Iwaidja, and English, and had some literacy in all three languages. Like the speaker of Iwaidja, she was familiar with a wide range of language research activities, including recording of narratives and elicitation of material for acoustic analysis, sociolinguistic interviews, and ultrasound for language research purposes. A researcher trained in phonetics recorded the speakers in a quiet home in Minjilang on Croker Island. The recording devices used were a Panasonic HC-VX-1 HD camera with Countryman omnidirectional lapel microphones with Sennheiser ew100G3 wireless transmission. All recordings had a 16-bit sampling depth with a sampling rate of 44.1 kHz. Each speaker was compensated for her time with a payment of $50.
The speakers each produced five iterations of two carrier frames (see Table 1). One frame was structured so that the target words followed a vowel (/a/); in the other, the target word followed a stop (/d/ in Iwaidja and /b/ in Mawng). The carrier frames had the same semantic content in both target languages. The speakers also produced five iterations of each depictable noun target in its canonical form in each frame, as well as five lenited iterations of each depictable stop-initial noun in each frame, and five hardened iterations of each continuant-initial noun target: Lenition targets beginning in /b/ were elicited with [w]-onsets; lenition targets beginning in /ɡ/ were elicited with [ɣ~ɰ] onsets (see discussion of elicitation and transcription procedure below and Figures 2a and 2b); and lenition targets beginning in /ɟ/ were elicited with onset [j]. Fortited targets beginning with /w/ were elicited with onset [b], and fortited targets beginning with /j/ were elicited with [ɟ] onsets. The resulting word pairs (a canonical form plus either a lenited or fortited form) are presented in Appendix 1.
Frame | Iwaidja | Mawng | Eng. Gloss
Stop final | /aɖajan waɻad…/ | /ŋejan jaɻaɡab…/ | I can see one…
Vowel final | /aɖajan ɻuɡa…/ | /ŋejan nuɡa…/4 | I can see this…
The mutated forms of the Iwaidja and Mawng target words were elicited by explaining to the speakers that the different ways of pronouncing the same word were required so that the correct pronunciation of the Iwaidja/Mawng word could be better understood. To support the speakers, a phonetically trained researcher modelled the lenited and hardened forms of the initial segments only (not the target words), and the speakers were then asked to reproduce each target word with the modified onset. No corrections or adjustments were made by the researcher to the pronunciation variants produced by the speakers in either language, and the variants produced were all within the described range of variants for each language respectively.
We assume that the lenited realisations typically transcribed as [ɣ] and [ɰ] are phonetic gradients (more versus less constriction: see also Shaw et al., 2020), and that transcription discrepancies in the published literature on these languages reflect the practical need to select an IPA symbol despite realisations falling on a continuum of greater to lesser constriction. For convenience, we use the approximant symbol [ɰ] throughout, without making assumptions about the degree of constriction of the lenited form. We note, however, that existing research, such as Shaw et al. (2020) on Iwaidja and Ennever et al. (2017) on Gurindji, has found very few, if any, fricative productions of lenited stops. The four spectrograms in Figure 2 (2a, b, c, and d) show example recordings of the Iwaidja and Mawng cognate target word /ɡabal/ ‘flood plain’ in the stop-initial canonical/underlying form and in the lenited word form [ɰabal], elicited from stop-final and vowel-final frames, respectively.
From the recorded nouns, a subset of 14 target nouns for each language was selected for inclusion in the four experiments, as were two sets of minimal pairs as Control Trials, giving a total of 16 word-pairings for each language (see Appendix 1). The Control Trials were both /b/-/w/ minimal pairs to avoid issues associated with variability in the realisation of /ɡ/ in Mawng and Iwaidja. To generate the stimuli for the experiments, we selected one token of each of the two frames in both Iwaidja and Mawng from iterations 2, 3, or 4 (avoiding list-initial and list-final intonation) as well as one iteration of each target noun (canonical or modified), also from iterations 2, 3, or 4, based on similarity in intonation structure, intensity, and speaking rate. These frames and individual words were then normalised for intensity, and the experimental stimuli were generated by cross-splicing each target noun (canonical or modified) into each carrier frame in both Iwaidja and Mawng.
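The normalisation and cross-splicing steps can be sketched in a few lines. This is an illustration under stated assumptions only: the text does not say which software or intensity measure was used, so root-mean-square (RMS) scaling, the target level of 0.1, and the helper names `rms`, `normalise` and `cross_splice` are all hypothetical, and audio is represented here as plain lists of float samples.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalise(samples, target_rms=0.1):
    """Scale samples so that their RMS equals target_rms (assumed measure)."""
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]


def cross_splice(frame, target):
    """Append a normalised target noun to a normalised carrier frame."""
    return normalise(frame) + normalise(target)


# Dummy signals standing in for one carrier frame and one target noun
frame = [0.5 * math.sin(0.01 * n) for n in range(2000)]
target = [0.2 * math.sin(0.03 * n) for n in range(1000)]
stimulus = cross_splice(frame, target)
```

Normalising the frame and the target separately before joining them means that neither member of a canonical/modified pair differs from the other in overall level, which is the property the preference task depends on.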
2.2 Experimental design
The recorded materials were used to design stimuli for two parallel 2AFC experiments, one in Mawng and one in Iwaidja, testing participant preference for canonical versus mutated noun forms in the two carrier frames (vowel-final versus stop-final) in two prosodic juncture conditions (no pause versus pause inserted). The 2AFC paradigm used for the present study constitutes a preference task—participants are required to pick a preferred stimulus out of two items differing only in the onset characteristics of the target noun: Critical trials differ only in presenting the noun with the underlying/canonical onset versus either lenited or hardened onset, and control trials also differ only in having one continuant and one stop initial target word (same place of articulation: POA), constituting a minimal pair.
Systematic preference for either stop or continuant-initial target words relies on two things: ability to discriminate between stops and continuants with the same POA, and knowledge about the phonological specification of each target word (what is the canonical pronunciation, for instance, of the Iwaidja and Mawng /ɡabal/ ‘flood plain’?). Inability to discriminate between stop and continuant-initial targets, or uncertainty about the phonological specifications of the target words, would lead participants to perform at chance level. Chance level preference patterns, in turn, would also be expected from the minimal-pair target words (i.e., the control trials), if they indeed activate separate lexical entries, because selection of either alternative constitutes a valid choice.
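The chance-level reasoning above corresponds to a binomial model in which each trial is a coin flip between the two alternatives. As a sketch only (the statistical analysis actually used is not described in this section, and the counts below are invented for illustration), an exact two-sided binomial test against 0.5 can be computed as follows; `binom_two_sided` is a hypothetical helper name.

```python
from math import comb


def binom_two_sided(k: int, n: int) -> float:
    """Exact two-sided binomial p-value for k successes in n trials, p = 0.5."""
    pmf = [comb(n, i) * 0.5 ** n for i in range(n + 1)]
    # Sum the probability of every outcome at least as unlikely as the observed one
    return min(1.0, sum(p for p in pmf if p <= pmf[k] * (1 + 1e-9)))


# Invented example counts: canonical form chosen in k of n trials
p_at_chance = binom_two_sided(16, 32)     # evenly split choices
p_consistent = binom_two_sided(32, 32)    # canonical form always chosen
```

An even split gives a p-value of 1, so it is indistinguishable from guessing, whereas a participant who always selects the canonical form produces a vanishingly small p-value.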
Each experiment (Mawng versus Iwaidja) was presented in two forms, differing only in whether a silent pause had been inserted before the target noun, presented in counterbalanced order. Each experiment (natural versus pause-insertion) consisted of 16 target words in two carrier frames (see Table 1) in two presentation orders, resulting in 64 individual trials. The order of presentation of the 64 trials was pseudo-randomised, ensuring that consecutive trials did not contain the same target nouns. The pause-insertion condition was created by insertion of 500 ms of silence before the onset of the target nouns. This pause insertion temporally distances the potential phonological conditioning environment from the target word onsets (canonical, lenited or hardened), because external sandhi phenomena are known to be sensitive to prosodic juncture (Kilbourn-Ceron et al., 2020).
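The trial arithmetic and the ordering constraint can be made concrete with a short sketch. The word labels, frame and order labels, seed, and the rejection-sampling strategy below are assumptions for illustration; the text specifies only the counts, the constraint on consecutive trials, and the 500 ms pause. The sample count additionally assumes that the stimuli kept the 44.1 kHz rate of the recordings.

```python
import random
from itertools import product

WORD_PAIRS = [f"pair{i:02d}" for i in range(1, 17)]   # 16 placeholder word pairs
FRAMES = ["stop-final", "vowel-final"]                # two carrier frames
ORDERS = ["canonical-first", "modified-first"]        # two orders (labels assumed)

trials = list(product(WORD_PAIRS, FRAMES, ORDERS))    # 16 * 2 * 2 = 64 trials


def pseudo_randomise(items, seed=1, max_tries=100_000):
    """Reshuffle until no two consecutive trials share a target word pair."""
    rng = random.Random(seed)
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(a[0] != b[0] for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid order found; increase max_tries")


sequence = pseudo_randomise(trials)

# 500 ms of silence at an assumed 44.1 kHz sampling rate
PAUSE_SAMPLES = round(0.500 * 44_100)
silence = [0.0] * PAUSE_SAMPLES
```

Rejection sampling is used here only because it is easy to verify; any procedure that satisfies the same adjacency constraint would serve equally well.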
Each trial belonged to one of three types (see Appendix 2 for all pairings):
Lenition Trials in which the target word was presented in its canonical form and in a lenited continuant-initial form e.g., [baŋɡa] ‘forked stick’ in both Iwaidja and Mawng versus *[waŋɡa], where /waŋɡa/ is not a canonical word in either language;
Fortition Trials in which the target word was presented in its canonical form and in an unattested stop-initial (hardened) form, e.g., [wamba] ‘shark’ in both Iwaidja and Mawng versus *[bamba]; and,
Control Trials which consisted of minimal pairs of words that differed only in the initial segment: one had an approximant onset, the other a stop onset at the same POA, in a manner similar to the Lenition and Fortition Trials, e.g., [baɡaj] ‘spike/harpoon’ versus [waɡaj] ‘sugar glider’ in Iwaidja and Mawng. In these trials, each realisation arguably activates different entries in the lexicon.
Some of the included target nouns were cognate between Iwaidja and Mawng, for optimal comparison, while others were not. By ‘cognate’ we mean that the forms are identical in each language (whether for reasons of genetic inheritance or borrowing). One Fortition Trial was cognate (/wamba/ ‘shark’), three Lenition Trials were cognate (/baŋɡa/ ‘forked stick’; /ɟaɾaŋ/ ‘horse’; and /ɡabal/ ‘flood plain’), one Control Trial was cognate (/baɡaj/ ‘spike/harpoon’ versus /waɡaj/ ‘sugar glider’), and one was a partial cognate (/baɳɖi/ ‘armband’ versus /waɳɖi/ ‘one who is hanging’ [in Iwaidja]5; see discussion in Results).
The distribution of Lenition and Fortition Trials also differed slightly between the two languages (see Table 3): The Iwaidja participants were presented with 12 Fortition Trials (six continuant-initial words plus their non-attested fortited forms), 16 Lenition Trials (eight stop-initial words plus their lenited continuant-initial forms), and 4 Control Trials (two minimal pairs). The Mawng participants were presented with 10 Fortition Trials (five continuant-initial words plus their non-attested hardened forms), 18 Lenition Trials (nine stop-initial words plus their lenited continuant-initial forms), and 4 Control Trials (two minimal pairs).
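The design figures can be cross-checked arithmetically. The tally below is a sketch based on one reading of the counts above, namely that each reported figure is twice the number of word pairs in that trial type; the dictionary layout and variable names are hypothetical.

```python
# Word pairs per trial type, taken from the parenthesised figures in the text
PAIRS = {
    "Iwaidja": {"fortition": 6, "lenition": 8, "control": 2},
    "Mawng": {"fortition": 5, "lenition": 9, "control": 2},
}
# Figures reported per trial type in the text
REPORTED = {
    "Iwaidja": {"fortition": 12, "lenition": 16, "control": 4},
    "Mawng": {"fortition": 10, "lenition": 18, "control": 4},
}
FRAMES, ORDERS = 2, 2  # carrier frames and presentation orders

summary = {}
for language, pairs in PAIRS.items():
    total_pairs = sum(pairs.values())
    summary[language] = {
        "pairs": total_pairs,
        "trials": total_pairs * FRAMES * ORDERS,
        "reported_is_double": all(
            REPORTED[language][kind] == 2 * n for kind, n in pairs.items()
        ),
    }
```

On this reading both languages come to 16 word pairs and 64 trials per experiment, matching the totals given in the design description.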
Each study (Mawng versus Iwaidja, each in the unmodified and the pause-inserted conditions) was delivered to the participants in the form of a PowerPoint presentation. During each study, the participants heard each pair of utterances through headphones from a laptop and at the same time were presented with two line-drawings of faces on the computer screen. For each pair of utterances, the listeners were encouraged to think of the experiment as two Iwaidja- (or respectively, Mawng-) speaking people trying to tell them something, and they were instructed to indicate which of the utterances they preferred. Previous research has demonstrated that framing experimental psycholinguistic research in a socially meaningful narrative improves participant experience and ensures the widest possible range of participants (Bundgaard-Nielsen et al., 2015; Bundgaard-Nielsen, Baker, & Wang, 2023; Bundgaard-Nielsen & Baker, 2020). Limited literacy and computer skills, as well as the need for the stimuli to be socially meaningful, limit the suitability of more traditional experimental approaches, including the use of gating tasks. Participants were allowed to listen to each pair as many times as they liked, and to take as long as they liked before making their decision. Participants were allowed breaks at any point in time.
Our experimental protocol diverges from standard laboratory speech science paradigms in word recognition studies (such as gating tasks, which typically rely on participants having strong independent test-taking and literacy skills: see e.g., Grosjean, 1980) but was implemented to accommodate a wide range of speakers (i.e., different ages, educational backgrounds, familiarity with electronics) as well as to accommodate a testing situation rich in interruptions and social interactions, and to provide a culturally safe environment (Smye et al., 2010). The approach taken here, along with other low-tech and low-formality approaches, has been successfully used with both adults and children in other Indigenous speech communities in Australia for psycholinguistic research (Bundgaard-Nielsen et al., 2015; Bundgaard-Nielsen & Baker, 2020; Bundgaard-Nielsen, Baker, Bell, & Wang, 2023; Bundgaard-Nielsen, Baker, & Wang, 2023). These alternative methods are particularly important to consider in communities with a low number of speakers, to ensure that as many speakers of a given language as possible can participate, even when they may face increased barriers to participation, including limited literacy and limited familiarity with electronic equipment, experimental protocols, and Western research practices. They are also more accommodating for participants who wish to adhere to local cultural practices that favour interaction in smaller groups over one-to-one interactions with (unfamiliar) outsider researchers.
2.3. Study participants
We recruited 11 L1 speakers of Iwaidja (8 Male; 3 Female), ranging in age from 26 to 88 years (M = 49). We also recruited 11 L1 speakers of Mawng (8 Male; 3 Female), ranging in age from 22 to 88 years (M = 44). All participants completed a language background questionnaire. All participants, irrespective of which experiment they participated in (Iwaidja or Mawng), reported speaking both Iwaidja and Mawng fluently, as well as Kunwinjku and English to varying degrees of fluency. One participant additionally reported having some competence in Kunbarlang, and another reported some competence in Amurdak, Marrku, and Garig. All but one of the participants also reported regular exposure to Kunbarlang and/or Yolngu Matha. All but one of the participants were literate in English to some extent, and most reported having very basic Iwaidja and Mawng literacy as well (one participant reported having excellent literacy in both Iwaidja and Mawng). Four participants took part in both the Iwaidja and the Mawng experiments on separate days.
All participants were recruited by word of mouth on Croker Island, and all testing took place in a quiet home on Croker Island. Many of the participants had previously participated in language research. All were compensated for their time and effort by a payment of $50, which is standard in remote communities where living costs are very high and opportunities for paid employment are very limited. The study was approved by the Western Sydney University Human Research Ethics Committee, approval number H14890.
2.4. Predictions
Fortition Trials. We would expect both Iwaidja and Mawng participants to reject all unattested hardened target nouns. We would not expect an effect of phonological frame (preceding stop versus preceding vowel), nor an effect of presence or absence of a pause. We include fortition trials to differentiate between tolerance for lenition and inability to perceptually discriminate between stops and approximants at the same place of articulation (POA): Consistent rejection of fortited approximants (even if lenited stops are acceptable) indicates that participants can perceive the difference between [j] and [ɟ], and [w] and [b]. This also gives us a baseline rejection rate.
Lenition Trials. We would expect Iwaidja speakers to be intolerant of segmental alternations of the kind presented in the Lenition Trials in the same manner as those presented in the Fortition Trials, and we would not expect an effect of phonological frame, nor an effect of presence or absence of a pause. Based on reports of synchronic sandhi processes affecting onset /ɡ/ in Mawng, we would expect Mawng speakers to be more tolerant of lenited onset targets than of unattested fortition alternations and more tolerant than the participants in the Iwaidja experiment. We would also expect an effect of phonological frame, such that lenition would be more acceptable in the vowel-final frame than in the consonant-final frame (because this matches the phonological conditioning of lenition described in the literature).
Control trials. We would expect the participants to be at chance (50% preference) for 2AFC trials consisting of minimal pairs, as each target word should activate different, equally acceptable, lexical competitors, even though the phonetic/phonological difference between the two words is of the same kind as the pairs that participants are exposed to in the Fortition and the Lenition trials.
Prosodic Juncture Condition. We would expect Mawng speakers to be more likely to accept lenited wordforms in the unmodified condition than in the pause-insertion condition. If lenition is a bottom-up, phonetically conditioned sandhi phenomenon, we would expect an effect of pause insertion, such that pause insertion might reduce the acceptability of lenited onsets, as the temporal distance to the conditioning environment is extended (Kilbourn-Ceron et al., 2020).
3. Results
The average response rates by Iwaidja and Mawng speakers are summarised in Figure 3. The averages represent the proportion of acceptance of an utterance containing the unmodified, i.e., canonical, form of a target noun. To analyse the preference pattern across the four experiments (Iwaidja versus Mawng, with and without pause insertion), we fitted a series of generalised linear mixed-effects models (GLMMs, binomial family); see Table 2. At the first step, Model 1 included language group (Iwaidja vs. Mawng) and comparison conditions (fortition vs. lenition) as fixed effects, while participant and word item were assigned as random intercepts. Model 2 further included phonological frame (vowel-final vs. consonant-final), pause manipulation (no pause vs. with a pause), and cognate status (cognates vs. non-cognates). Finally, Model 3 further included interaction terms between language group and comparison conditions. For model evaluation, we calculated the Akaike Information Criterion (AIC) score and the model deviance as indicators of goodness-of-fit, and these results suggested that Model 3 achieved the best performance (AIC = 2459, deviance = 2437). We therefore chose Model 3 for data interpretation.
GLMM term | Model 1 | Model 2 | Model 3 |
Fixed effects | | | |
Intercept | 0.515 | 0.486 | 0.267 |
Comparison: Fortition | 1.740*** | 1.772*** | 1.618*** |
Comparison: Lenition | 1.918*** | 1.933*** | 2.500*** |
Language: Mawng | –0.797* | –0.798* | –0.373 |
Frame: Vowel-final | – | –0.043 | –0.043 |
Pause manipulation: With pause | – | 0.011 | 0.011 |
Cognate status: Cognate | – | 0.135 | 0.138 |
Interaction: Lenition × Mawng | – | – | –1.041** |
Interaction: Fortition × Mawng | – | – | 0.378 |
Random effects | | | |
Participant | 0.475 | 0.475 | 0.482 |
Word item | 0.519 | 0.515 | 0.319 |
Model evaluation | | | |
AIC | 2474 | 2468 | 2459 |
Deviance | 2450 | 2450 | 2437 |
Note: * = p < .05, ** = p < .01, *** = p < .001.
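The AIC-based model comparison in Table 2 can be made concrete with a small calculation. The sketch below uses only the three reported AIC values; the ΔAIC differences and Akaike weights it derives are a standard way of reading AIC scores and are offered here as an illustration, not as part of the reported analysis. The function name `aic_summary` is hypothetical.

```python
import math

# AIC values as reported in Table 2.
aic = {"Model 1": 2474, "Model 2": 2468, "Model 3": 2459}


def aic_summary(aic_by_model):
    """Return {model: (delta_aic, akaike_weight)} for a set of candidate models.

    delta_aic is the difference from the lowest AIC in the set; the Akaike
    weight is exp(-delta/2), normalised so that the weights sum to 1.
    """
    best = min(aic_by_model.values())
    delta = {m: v - best for m, v in aic_by_model.items()}
    rel = {m: math.exp(-d / 2) for m, d in delta.items()}
    total = sum(rel.values())
    return {m: (delta[m], rel[m] / total) for m in aic_by_model}


summary = aic_summary(aic)
for model, (d, w) in summary.items():
    print(f"{model}: delta AIC = {d}, Akaike weight = {w:.3f}")
```

On these values, Model 3 has the lowest AIC, with Models 2 and 1 trailing by 9 and 15 points, respectively, which agrees with the choice of Model 3 for interpretation.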
According to Model 3, both Iwaidja and Mawng speakers showed a preference for canonical target nouns in the fortition condition (β = 1.618, p < .001) and the lenition condition (β = 2.500, p < .001). Neither group displayed a preference for one word form over the other in the minimal pair control condition. This was consistent with our predictions, as both choices were canonical word forms with independent lexical meanings, though we acknowledge that one of the control pairs (/baɳɖi/ vs. /waɳɖi/) was technically only a partial cognate: /waɳɖi/ only means ‘one that is hanging’ in Iwaidja, not in Mawng. However, given that all participants were bilingual in Iwaidja and Mawng, we decided to include this item, and indeed the results indicate that the bilingual participants may have recognised the Iwaidja target also in the Mawng experiment, perhaps as a codeswitched item. The participants in the Mawng experiments, however, differed from the participants in the Iwaidja experiments in the lenition condition, where they showed significantly lower preference rates for canonical noun targets, i.e., higher tolerance for lenited forms (Lenition × Mawng interaction: β = –1.041, p = .007); no corresponding group difference emerged in the fortition condition (Fortition × Mawng interaction: β = 0.378, p = .359). Both results are consistent with our predictions. The effects of phonological frame and pause manipulation (Prediction 4) were not significant (all p > .05). This is not consistent with predictions based on Mawng lenition as an exclusively bottom-up phonetic context-conditioned sandhi phenomenon. We found no effect of cognate status (p > .05).
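To give a sense of scale for the Model 3 estimates, the fixed effects can be converted from log-odds to predicted preference rates for the canonical form. The sketch below rests on assumptions that are not stated in the text: that the coefficients are on the logit scale, that the model uses treatment coding with the Control condition and Iwaidja as reference levels, and that frame, pause, and cognate status are held at their reference levels. The resulting values are conditional on random effects of zero, so they are not expected to match the observed averages exactly. The helper names are hypothetical.

```python
import math

# Fixed-effect estimates from Model 3 (Table 2), read as log-odds.
# ASSUMPTION: treatment coding, with Control and Iwaidja as reference levels;
# frame, pause, and cognate status are left at their reference levels.
coef = {
    "intercept": 0.267,
    "fortition": 1.618,
    "lenition": 2.500,
    "mawng": -0.373,
    "lenition:mawng": -1.041,
    "fortition:mawng": 0.378,
}


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_preference(condition, language):
    """Predicted probability of choosing the canonical form (random effects = 0)."""
    eta = coef["intercept"]
    if condition != "control":
        eta += coef[condition]
    if language == "mawng":
        eta += coef["mawng"]
        if condition != "control":
            eta += coef[f"{condition}:mawng"]
    return inv_logit(eta)


predictions = {
    (lang, cond): predicted_preference(cond, lang)
    for lang in ("iwaidja", "mawng")
    for cond in ("control", "fortition", "lenition")
}
for (lang, cond), p in predictions.items():
    print(f"{lang:8s} {cond:10s} {p:.3f}")
```

Under these assumptions, the predicted preference for canonical forms in the lenition condition is about .94 for the Iwaidja experiments and about .79 for the Mawng experiments, while both groups sit near .5 in the control condition and near .87 in the fortition condition.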
Since the lenition condition reached a significant interaction effect, suggesting a language-specific pattern, we further explored the response rates in the lenition condition for each word item; see Table 3. As can be seen, Iwaidja speakers showed very high preference rates for the canonical production in all tested word items, ranging from 86% for /ɡabal/ to 97% for /ɟaɾaŋ/, /ɡaɳɡuɾɡ/ and /ɡaɾuŋ/, while Mawng speakers showed a wide range of preference rates, ranging from 57% for /ɡaɾɡaɲ/ to 94% for /ɟaɾaŋ/. More specifically, the descriptive results suggest that Mawng speakers typically showed lower preference rates (therefore higher tolerance) in stimulus words that start with the velar stop /ɡ/ (from 57% to 73%), while they showed higher preference rates (therefore lower tolerance) in the words with an initial /b/ or /ɟ/ (from 75% to 94%). Therefore, we grouped the word items based on the onset consonant and calculated the mean preference rates for onset consonants with different places of articulation (POAs); see Figure 4.
Word (in IPA) | Mean % (Iwaidja) | SD (Iwaidja) | Word (in IPA) | Mean % (Mawng) | SD (Mawng) |
Fortition Trials | | | | | |
jabiɾɡ (white egret) | 76 | 26 | jaɭɡaɟ (shell fish) | 76 | 18 |
jaɭɾi (scorpion) | 73 | 29 | jaɾi (striped fish) | 88 | 18 |
jaɽa (eye) | 88 | 15 | waɡiɟ (fishing line) | 96 | 10 |
wamba (shark) | 91 | 14 | wamba (shark) | 89 | 22 |
waɽjad (rock) | 76 | 24 | waɾɡa (flower) | 85 | 17 |
waɾɡaɾɡ (goanna) | 91 | 16 | | | |
Lenition Trials | | | | | |
baŋɡa (forked stick) | 89 | 18 | baŋɡa (forked stick) | 80 | 18 |
ɟala (throwing net) | 94 | 10 | banaŋ (headband) | 86 | 18 |
ɟambaŋ (tamarind tree) | 91 | 19 | balaɟi (bag) | 90 | 12 |
ɟaŋaɲ (stingray) | 91 | 16 | ɟalaɟ (dingo) | 75 | 26 |
ɟaɾaŋ (horse) | 97 | 8 | ɟaɾaŋ (horse) | 94 | 9 |
ɡabal (flood plain) | 86 | 7 | ɡabal (flood plain) | 63 | 22 |
ɡaɳɡuɾɡ (sandhill) | 97 | 8 | ɡaɭaɾaɡ (green parrot) | 73 | 18 |
ɡaɾuŋ (sack bag) | 97 | 8 | ɡaɾaɡ (black cockatoo) | 59 | 26 |
 | | | ɡaɾɡaɲ (chicken hawk) | 57 | 30 |
Control Trials | | | | | |
baɡaj (harpoon) vs waɡaj (sugar glider) | 55 | 33 | baɡaj (harpoon) vs waɡaj (sugar glider) | 53 | 21 |
baɳɖi (armband) vs waɳɖi (one who is hanging) | 53 | 31 | baɳɖi (armband) vs waɳɖi* | 44 | 20 |
*Does not mean anything in Mawng, but see discussion in Results.
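The grouping of lenition items by onset consonant can be illustrated directly from the per-item means in Table 3. The sketch below takes an unweighted average of the item means within each onset; this is an assumption made for illustration, and it need not coincide with the values plotted in Figure 4 if those were computed from trial-level responses. The names `lenition_items` and `onset_means` are hypothetical.

```python
from statistics import mean

# Per-item mean preference (%) for the canonical form, Lenition Trials, Table 3.
lenition_items = {
    "Iwaidja": {
        "baŋɡa": 89, "ɟala": 94, "ɟambaŋ": 91, "ɟaŋaɲ": 91,
        "ɟaɾaŋ": 97, "ɡabal": 86, "ɡaɳɡuɾɡ": 97, "ɡaɾuŋ": 97,
    },
    "Mawng": {
        "baŋɡa": 80, "banaŋ": 86, "balaɟi": 90, "ɟalaɟ": 75,
        "ɟaɾaŋ": 94, "ɡabal": 63, "ɡaɭaɾaɡ": 73, "ɡaɾaɡ": 59,
        "ɡaɾɡaɲ": 57,
    },
}


def onset_means(items):
    """Average the item means within each onset consonant (first symbol of the word)."""
    groups = {}
    for word, pct in items.items():
        groups.setdefault(word[0], []).append(pct)
    return {onset: mean(values) for onset, values in groups.items()}


by_onset = {lang: onset_means(items) for lang, items in lenition_items.items()}
for lang, result in by_onset.items():
    for onset, value in result.items():
        print(f"{lang:8s} /{onset}/ {value:.1f}")
```

On this unweighted reading, the Mawng means for /b/- and /ɟ/-initial items are both around 85%, while the mean for /ɡ/-initial items is 63%; the Iwaidja means are 89% or above for all three onsets.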
To verify the observed place of articulation effect in the lenition condition, we fitted a GLMM for a confirmatory analysis using the following formula: Response ~ Language Group * Onset Consonant + (1 | Participant) + (1 | Word Item); see Table 4. The model returned a significant intercept (β = 2.714, p < .001), and a significant interaction effect between Mawng speakers and target words beginning with the velar stop /ɡ/ (β = –1.408, p = .013). These results showed that while the lenition condition led to above-chance preference rates for canonical forms (as shown in the significant intercept), Mawng speakers typically showed a higher level of tolerance for lenition of word items with an initial velar stop /ɡ/, indicating that POA might have a language-specific role in the acceptance of phonological alternations. This is also interesting in light of the fact that /b/ and /w/ as well as /j/ and /ɟ/ alternate in the shared (lexicalised) historical patterns illustrated in (1.3) and (1.4). We return to this issue in the Discussion.
GLMM term | Estimate | SE | z value | p value |
Intercept | 2.714*** | 0.536 | 5.066 | <.001 |
Language: Mawng | –0.831 | 0.561 | –1.483 | .138 |
Onset consonant: Palatal /ɟ/ | 0.249 | 0.547 | 0.455 | .649 |
Onset consonant: Velar /ɡ/ | 0.252 | 0.567 | 0.444 | .657 |
Interaction: Mawng × Palatal /ɟ/ | –0.285 | 0.604 | –0.471 | .638 |
Interaction: Mawng × Velar /ɡ/ | –1.408* | 0.570 | –2.472 | .013 |
Note: * = p < .05, *** = p < .001.
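The z and p columns in Table 4 can be related to the estimates and standard errors with a short calculation. The sketch below assumes that the reported p values come from two-sided Wald z-tests (z = estimate / SE, referred to a standard normal distribution), which is common for GLMM summaries but is not stated in the text. Small discrepancies from the table are to be expected, since the published estimates and standard errors are rounded. The function name `wald` is hypothetical.

```python
import math

# (estimate, standard error) pairs as reported in Table 4.
rows = {
    "Intercept": (2.714, 0.536),
    "Language: Mawng": (-0.831, 0.561),
    "Onset consonant: Palatal": (0.249, 0.547),
    "Onset consonant: Velar": (0.252, 0.567),
    "Interaction: Mawng x Palatal": (-0.285, 0.604),
    "Interaction: Mawng x Velar": (-1.408, 0.570),
}


def wald(estimate, se):
    """Return the Wald z statistic and its two-sided p value.

    ASSUMPTION: the test statistic is referred to a standard normal
    distribution, so p = erfc(|z| / sqrt(2)).
    """
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


results = {term: wald(est, se) for term, (est, se) in rows.items()}
for term, (z, p) in results.items():
    print(f"{term:30s} z = {z:6.3f}, p = {p:.3f}")
```

Under this assumption, the recomputed values agree with the table to within rounding: only the intercept and the Mawng × Velar interaction fall below the .05 threshold.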
4. Discussion
The study presented here investigated the effect of word-initial segmental alternations on word recognition and preference in two Indigenous Australian languages, Iwaidja and Mawng. Both Iwaidja and Mawng have been described as having historical grammatically conditioned initial mutations in verb and noun roots (Evans, 2000), but Mawng has additionally been described as having a synchronic external sandhi process affecting initial /ɡ/ at word boundaries (Capell & Hinch, 1970). Together, the two languages provide an excellent opportunity to investigate the effect of initial segmental alternation on word recognition.
The present studies used a 2AFC paradigm to test speaker preferences for lexical items presented with or without initial segmental alternation (lenition and unattested fortition) in one of two frames that either reportedly induce lenition (vowel-final frame) or do not reportedly induce lenition (stop-final frame). The studies also examined the role of prosodic juncture in conditioning lenition in Mawng and Iwaidja by examining speaker preferences for lenited or hardened versus canonical forms in natural frames versus frames where the target is preceded by an inserted pause.
The results are largely consistent with the predictions in Section 2.4. They show that Iwaidja and Mawng participants all reject lexical items that have been subject to phonological alternation involving unattested fortition of continuants /w/ and /j/ to stops [b] and [ɟ], respectively, as well as lexical items subject to alternation involving ‘lenition’ of stops /b/ and /ɟ/ to approximants [w] and [j], respectively. Speakers of Iwaidja are similarly intolerant of lenition of initial velar stop /ɡ/ to approximant [ɰ], but importantly Mawng speakers are willing to accept [ɰ]-initial target words some of the time. All participants perform at chance in the control condition consisting of minimal pairs differing in onsets (/b/ versus /w/), indicating confusion about which lexical target to select when each member of the minimal pair is a lexical item.
The results of the present studies also show that tolerance for lenition in Mawng and Iwaidja is not affected by phonological frame, which is at odds with the description of external sandhi in Capell and Hinch (1970). This suggests that the reported synchronic lenition of /ɡ/-initial target words in Mawng does not reflect coarticulatory factors like potential target undershoot of the underlying phonological stop. Similarly, there is no effect of pause insertion on speaker tolerance for lenition. This latter finding contrasts with the assumptions of many current models of intervocalic stop lenition that lenition derives from speech speed and/or temporal affinity: Faster utterances generally produce greater approximation, or target undershoot, which has been argued to lead to lenition (Ennever et al., 2017; Cohen Priva & Gleason, 2020; Katz, 2021). This is also reminiscent of Morrison’s (2021) paper on Scots Gaelic, which argues that initial mutations show no phonetic evidence of incomplete neutralisation of mutated forms that could be argued to arise from transparent phonetic processes (in the present study: vowel-final conditioning frame and presence of a preceding pause).
Together, these two results invite discussion about what might then underpin synchronic tolerance for word-initial lenition in Mawng, and also, if [ɰ] is regarded as a reasonable version of /ɡ/, then why are [w] and [j] not regarded as reasonable versions of /b/ and /ɟ/? Given that context and the presence of a pause prior to the target noun appear to be irrelevant, one plausible explanation may be found in differences in the mappings between phonetic variants and phonological categories in the two languages, as well as differences in the distributions of lenited forms. We can make a number of phonological inferences from our word-acceptability study, as the combination of trial types (lenition; fortition) doubles as a set of implicit discrimination tasks: rejection of hardened (underlyingly /w/- and /j/-initial) word forms implies ability to discriminate between continuants and stops at the same place of articulation when the lexical item is specified for a continuant, and rejection of lenited (underlyingly /b/- and /ɟ/-initial) forms implies ability to discriminate between stops and continuants at the same place of articulation when the lexical item is specified for a stop. The fact that participants can do this in both directions (stop-to-continuant and continuant-to-stop) indicates that the relationship between the phonetic realisation of each category and the phonological representations or categories is symmetrical and non-overlapping.
Key information about the interplay between the phonetic and phonological systems and word-acceptability is also supplied by the participants’ performance on the crucial lenition trials involving the lenition of velar /ɡ/ to [ɰ]. As outlined in Section 2, all participants in the present studies were bilingual in Mawng and Iwaidja, and the differential performance of the participants in the Mawng versus the Iwaidja study (despite ~40% participant overlap between the two studies) reveals that the phonological system, together with the specific phonetic realisations of each phoneme in different positions in the word within each language system, significantly shapes participant behaviour. Indeed, the results would suggest, as illustrated in Figure 5, that word-initial [ɰ] in Mawng is perceptually categorised by speakers as a (legitimate) realisation of /ɡ/, activating a shared underlying phoneme, in turn activating the intended underlying /ɡ/-initial lexeme. In Iwaidja, participant rejection of all lenited underlyingly /ɡ/-initial words likely indicates either that the lenited onset [ɰ] activates two competing phonological categories asymmetrically ([ɰ] as a canonical realisation of /ɰ/, which does not occur word-initially, as well as [ɰ] as an acoustically and articulatorily deviant realisation of /ɡ/), or that the word-initial position activates only the /ɰ/ phoneme, which is outright rejected, as lenition of /ɡ/ to [ɰ] is only acceptable in word-medial contexts, not word-initially.
Understanding the potential phone-to-phoneme mappings in Mawng and Iwaidja finally allows us to turn to the question of what the present studies contribute to our understanding of continuous parsing, and what the results indicate with respect to the nature of the input (e.g., whether it be, for instance, acoustic-phonetic material, articulatory gestures or gestural constellations, abstract phonemes, distributional information from sequences of phones, or information about phone confusability). Firstly, the studies indicate that surface (phonetic) variability is not a problem for continuous parsing per se but that models of continuous parsing need to carefully consider how to distinguish between phonological and phonetic variability in the input (i.e., the question about confusability of phones incorporated into Shortlist B, for instance), as well as to consider language specific phonotactic constraints and distributional and transitional properties (i.e., the question of predictability of ‘what comes next’).
Phonological substitutions, including the ones implemented in the studies reported on here, cause problems for word-recognition, as we would expect, given that whatever the input is assumed to be in each model, it must be faithful to the lexical specifications of the entries in the lexicon. This is, of course, also in line with what we would expect given what we know about categorical perception of segments: Discrimination across a phoneme boundary tends to be highly automatic and accurate for L1 speakers (Harnad, 1987; Best, 1994; 1995). The potentially disruptive effect of phonetic variation on word-recognition is more complex to assess because all speech contains degrees of acoustic/phonetic variation, some of which is linguistically relevant and some of which is not (but provides important information about the speaker’s gender, age, social, and regional origin, health, emotional state, and unique identity, including cultural aspects: Benzeghiba et al., 2007; Garvin & Ladefoged, 1963; Nolan, 1983).
In terms of variation relevant to speech perception, the present studies show that the importance placed on phone confusability in Shortlist B (Norris & McQueen, 2008) is well placed: One-to-many mappings of a phone to two or more phoneme categories, as in the case of [ɰ] mapping to both /ɰ/ and /ɡ/ medially, can be problematic from a word recognition perspective. Speaker-listeners, of course, must make decisions about what they are hearing, and as indicated by the results presented here, when in doubt about which phoneme is being produced because the input phone (Iwaidja [ɰ]) could be either a canonical instance of one phoneme (Iwaidja /ɰ/, which does not occur phonemically in word-initial position) or a realisation of a contrasting phoneme (Iwaidja /ɡ/), participants in the present study prefer the interpretation that the phone is a canonical form. They do so even when that means that the sequence in which it occurs fails to correspond to an entry in the speakers’ lexicon; that is, [ɰ] is mapped to /ɰ/ word-initially even though /ɰ/ cannot occur word-initially (in either language). In Mawng, the possible perceptual mappings differ: phonetic [ɰ] occurs as a lenited form of /ɡ/ initially but as a canonical form of /ɰ/ medially, and the (Mawng and Iwaidja bilingual) participants are more willing to accept word-forms with a deviant (lenited) onset, even if they do prefer canonical realisations. A near-parallel may be found in varieties of English with ‘th-fronting’ where, for instance, the words thing and thin are produced as [fiŋ] and [fin], respectively. Even speakers of varieties of English without th-fronting can learn to map [f] to /θ/, but, we speculate, they would likely choose a canonical interpretation of fin over thin if given a word recognition task including a stimulus [fin] (see, for instance, Norris et al., 2003).
These results further suggest that while some researchers have found evidence for a role for position-independent phonemes in spoken word recognition (Dufour & Grainger, 2020), position independence would not appear to apply to the phonetic allophones of /ɡ/. Mawng speakers’ tolerance for lenition of /ɡ/ is, however, not mirrored in tolerance for lenition for /b/ and /ɟ/, where the lenited realisations are canonical realisations of phonemes /w/ and /j/. Indeed, there is a big difference between /w, j/ on the one hand, and /ɰ/, on the other, which is that /w/ and /j/ are structure-preserving and hence neutralising alternations, while the velar approximant is non-structure-preserving and hence non-neutralising in initial position where it cannot be phonemically contrastive. This suggests that information about the distributional properties or likelihoods of encountering a particular phonetic shape must be taken into consideration by models of parsing.
We do not wish to imply that speakers of Mawng are unable to discriminate between [ɡ] and [ɰ], despite their willingness to map both [ɰ] and [ɡ] to the same phonological category /ɡ/, as their preference pattern exceeds chance level and favours [ɡ]-initial target nouns. And we highlight that it is well-established that within-category discrimination can range from ‘chance’ for perceptually highly similar phones to ‘very good’ for perceptually dissimilar phones (see discussion in Best, 1994; 1995), and that speakers of many languages have been demonstrated to successfully discriminate between phones that differ only in their underlying voicing specifications in incomplete neutralisation patterns (see e.g., Port & O’Dell, 1985, for German; Matsui, 2011, for Russian).
Finally, the results reported here seem to indicate that the assumed input to models of continuous parsing must include some level of phonotactic or transitional information, as for instance is done in Shortlist B (Norris & McQueen, 2008). We argue this because—despite evidence (see Shaw et al., 2020) of frequent intervocalic and word-medial lenition of /ɡ/ to [ɰ] in Iwaidja, and thus familiarity with this variant realisation of /ɡ/—speakers of Iwaidja do not accept this phonetic variant in word-initial position, even where it is intervocalic (in the vowel-final frame). In contrast, lenited variants of /ɡ/ could potentially be accepted by listeners as instances of /ɡ/ in initial and medial position in Mawng.
In conclusion, the results of the studies presented here contribute to our understanding of initial segmental alternations on word recognition. The results show that cross-phoneme boundary alternations disrupt word-recognition, though this should not happen with lexemes for which both alternants are in the lexicon, such as i-mawurr/a-bawurr ‘arm of (entity indexed by prefix)’ in (1.1). The results also show that tolerance for word-initial phonetic variation depends not only on the phonological inventory of the language in question, but also on the degrees of overlap in the phonetic realisation of contrasting phonemes and the phonotactic or distributional properties of the language. The results are consistent with the reported synchronic lenition processes described for Mawng by Capell and Hinch (1970) and are consistent with the proposed phonological inventories of Mawng and Iwaidja.
The results also contribute to a better understanding of the phonetics of both languages, but further instrumental phonetic investigations with particular focus on lenition phenomena are needed: There are currently no comprehensive acoustic analyses available of initial stop production in Mawng or Iwaidja, nor do we know how frequently initial lenition might occur in either language at the present time, and we would welcome such work. The results of this study, however, are not consistent with theoretical accounts of lenition that argue that synchronic lenition is a context-dependent phenomenon, nor are they strongly consistent with arguments that lenition is an artefact of speaking rate (Ennever et al., 2017; Cohen Priva & Gleason, 2020; Katz, 2021). We acknowledge that our pause manipulation was not strictly a manipulation of speaking rate. Nevertheless, to the extent that a strong juncture such as a pause is inconsistent with fast speech phenomena, lenition under these circumstances is unlikely to be conditioned by fast speech.
The studies also add to a growing but still vastly insufficient number of studies on languages outside of Europe and North America (where research has been and remains dominated by studies of English, German, French, and a handful of other large national varieties) and Asia (Mandarin, Cantonese, Korean, and Japanese, in particular). It is our hope that the results show the value of undertaking research with speakers of the many understudied languages of the world, and that a focus on typologically diverse languages is important for theory assessment and theory building. Finally, the studies demonstrate that simple adaptations of traditional laboratory psycholinguistic methods to take into consideration participants’ characteristics, as well as their social and cultural practices, can make participation in psycholinguistic research possible and acceptable for wide sections of non-WEIRD (Western, Educated, Industrialised, Rich, Democratic: Henrich et al., 2010) populations.
Appendix 1
The stop and approximant inventories of Iwaidja and Mawng are presented in Tables A and B. Note that while we have used the voiced symbols throughout the text, there is no voicing distinction in Mawng or Iwaidja, and researchers differ in their preference for using the voiced or voiceless series in transcription, to reflect the choices of the references.
Iwaidja | Bilab. | Alv. | Retrofl. | Postalv. | Velar |
Stops | b | d | ɖ | ɟ | ɡ [ɡ ɣ ɰ w] |
Approximants | w | | ɻ | j | ɰ |
Mawng | Bilab. | Alv. | Retrofl. | Postalv. | Velar |
Stops | b | d | ɖ | ɟ | ɡ [ɡ ɣ ɰ w] |
Approximants | w | | ɻ | j | ɰ |
Appendix 2
List of each word pair by trial type (fortition; lenition; control) in Iwaidja and Mawng. Items in italics are cognate between the two languages.
Iwaidja target word pairs | Mawng target word pairs |
Fortition Trials | |
[jabiɾɡ] ‘white egret’ vs [ɟabiɾɡ] | [jaɭɡaɟ] ‘shell fish’ vs [ɟaɭɡaɟ] |
[jaɭɾi] ‘scorpion’ vs [ɟaɭɾi] | [jaɾi] ‘striped fish’ vs [ɟaɾi] |
[jaɽa] ‘eye’ vs [ɟaɽa] | [waɡiɟ] ‘fishing line’ vs [baɡiɟ] |
[wamba] ‘shark’ vs [bamba] | [wamba] ‘shark’ vs [bamba] |
[waɽjad] ‘rock’ vs [baɽjad] | [waɾɡa] ‘flower’ vs [baɾɡa] |
[waɾɡaɾɡ] ‘goanna’ vs [baɾɡaɾɡ] | |
Lenition Trials | |
[baŋɡa] ‘forked stick’ vs [waŋɡa] | [baŋɡa] ‘forked stick’ vs [waŋɡa] |
[ɟala] ‘throwing net’ vs [jala] | [banaŋ] ‘headband’ vs [wanaŋ] |
[ɟambaŋ] ‘tamarind tree’ vs [jambaŋ] | [balaɟi] ‘bag’ vs [walaɟi] |
[ɟaŋaɲ] ‘stingray’ vs [jaŋaɲ] | [ɟalaɟ] ‘dingo’ vs [jalaɟ] |
[ɟaɾaŋ] ‘horse’ vs [jaɾaŋ] | [ɟaɾaŋ] ‘horse’ vs [jaɾaŋ] |
[ɡabal] ‘flood plain’ vs [ɰabal] | [ɡabal] ‘flood plain’ vs [ɰabal] |
[ɡaɳɡuɾɡ] ‘sandhill’ vs [ɰaɳɡuɾɡ] | [ɡaɭaɾaɡ] ‘green parrot’ vs [ɰaɭaɾaɡ] |
[ɡaɾuŋ] ‘sack bag’ vs [ɰaɾuŋ] | [ɡaɾaɡ] ‘black cockatoo’ vs [ɰaɾaɡ] |
 | [ɡaɾɡaɲ] ‘chicken hawk’ vs [ɰaɾɡaɲ] |
Control Trials | |
[baɡaj] ‘harpoon’ vs [waɡaj] ‘sugar glider’ | [baɡaj] ‘harpoon’ vs [waɡaj] ‘sugar glider’ |
[baɳɖi] ‘armband’ vs [waɳɖi] ‘one who is hanging’ | [baɳɖi] ‘armband’ vs [waɳɖi]* |
*Does not mean anything in Mawng, but see discussion in Results.
Acknowledgements
We thank the Mawng and Iwaidja speakers who provided the stimuli for the study that we report on. We also thank the Mawng and Iwaidja participants in the study. We thank the Australian Research Council for funding the research through DP190100646, awarded to the first, second, third, and fifth authors. We thank the audience at ALS 2023 in Sydney for valuable feedback. We also thank Prof. Anne Cutler for inspiring and supporting the initial stages of this project.
Author Statement
Rikke Bundgaard-Nielsen (RBN) conceptualised the studies. RBN, Brett Baker (BB), Robert Mailhammer (RM), and Mark Harvey (MH) co-designed the study. RM recruited the speakers and recorded the stimulus materials. RBN designed the experimental protocols, and Chloe Turner (CT) undertook the acoustic segmentation and manipulations and created the two experiments in collaboration with RBN. RM recruited the Mawng and Iwaidja participants and collected the experimental data. RBN processed the data. RBN and Yizhou Wang (YW) jointly decided on the statistical analyses, and YW conducted the analyses. Interpretation of the statistical results was undertaken by RBN, YW, and BB. RBN drafted the manuscript. RM, BB, YW, MH, and CT provided comments and suggestions.
Competing Interests
The authors have no competing interests to declare.
Notes
- We use IPA representations throughout, rather than orthography. None of the Australian languages mentioned has a voicing contrast in stops, and voiced symbols are used here for Iwaidja and Mawng, although phonetic voicing varies. [^]
- These language names have been represented variously in the literature as ‘Iwaija’, ‘Yiwayja’, ‘Maung’, and ‘Mawung’, among many others. Iwaidja and Mawng are the current community-preferred representations and the ones approved by the Australian Institute of Aboriginal and Torres Strait Islander Studies. [^]
- We note that Capell and Hinch’s (1970, p. 44) sandhi rules appear to suggest that the choice of approximant is conditioned by the following vowel: /w/ appears before /u/, while before other vowels either /ɣ/ appears (nouns) or there is variation between /ɡ/, /w/, and /ɣ/ (verbs). In the present study, we restrict our target nouns to the /a/ environment (see Section 2).
- Mawng has five noun classes, which are marked by agreement on determiners, adjectives, and verbs. The phrase we used shows masculine agreement expressed by the 1sg>3sg.masc verb form of ‘see’ (ŋejan) and the masculine determiner nuga. We took advice from expert Mawng speakers about the use of these forms rather than forms that would change depending on the noun class of the head noun. A small number of participants pointed out mismatches of agreement morphology in our stimuli but did not object to their use.
- The form /waɳɖi/ only means ‘one that is hanging’ in Iwaidja, not in Mawng, but given that all participants were bilingual in Iwaidja and Mawng, we decided to include this item. We discuss this in Section 3.
References
Baker, B. (2014). Word structure in Australian languages. In H. Koch & R. Nordlinger (Eds.), World of linguistics: Australia (pp. 137–211). Mouton.
Ball, M. J., & Müller, N. (1992). Mutation in Welsh (1st ed.). Routledge. http://doi.org/10.4324/9780203192764
Balling, L. W., & Baayen, R. H. (2012). Probability and surprisal in auditory comprehension of morphologically complex words. Cognition, 125(1), 80–106. http://doi.org/10.1016/j.cognition.2012.06.003
Beckman, J. N. (1998). Positional faithfulness. [Doctoral dissertation, University of Massachusetts, Amherst].
Benzeghiba, M., de Mori, R., Deroo, O., Dupont, S., Erbes, T., Jouvet, D., Fissore, L., Laface, P., Mertins, A., Ris, R., Rose, R., Tyagi, V., & Wellekens, C. (2007). Automatic speech recognition and speech variability: A review. Speech Communication, 49(10–11), 763–786. http://doi.org/10.1016/j.specom.2007.02.006
Best, C. T. (1994). The emergence of native-language phonological influences in infants: A perceptual assimilation model. In J. C. Goodman & H. Nusbaum (Eds.), The development of speech perception: The transition from speech sounds to spoken words (pp. 167–224). MIT Press.
Best, C. T. (1995). A direct-realist view of cross-language speech perception. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 171–204). York Press.
Blevins, J. (2001). Where have all the onsets gone? Initial consonant loss in Australian Aboriginal languages. In J. Simpson, D. Nash, M. Laughren, P. Austin, & B. Alpher (Eds.), Forty years on: Ken Hale and Australian languages (pp. 481–492). Pacific Linguistics.
Blevins, J., & Wedel, A. (2009). Inhibited sound change: An evolutionary approach to lexical competition. Diachronica, 26(2), 143–183.
Boyce, S., Browman, C. P., & Goldstein, L. (1987). Lexical organization and Welsh consonant mutations. Journal of Memory and Language, 26, 419–452.
Bundgaard-Nielsen, R. L., & Baker, B. J. (2020). Pause acceptability indicates word-internal structure in Wubuy. Cognition, 198, 104167. http://doi.org/10.1016/j.cognition.2019.104167
Bundgaard-Nielsen, R. L., Baker, B. J., Bell, E. A., & Wang, Y. (2023). Stop contrast acquisition in child Kriol: Evidence of stable transmission of phonology post Creole formation. Journal of Child Language, 1–37. http://doi.org/10.1017/S0305000923000430
Bundgaard-Nielsen, R. L., Baker, B. J., Kroos, C., Harvey, M., & Best, C. T. (2015). Discrimination of multiple coronal stop contrasts in Wubuy (Australia): A natural referent consonant account. PloS ONE, 10(12), e0142054. http://doi.org/10.1371/journal.pone.0142054
Bundgaard-Nielsen, R. L., Baker, B. J., & Wang, Y. (2023). Words or rules: Phonological mutations in Wubuy. In R. Skarnitzl & J. Volín (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences (pp. 4007–4011). Guarant International.
Campbell, L. (1996). On sound change and challenges to regularity. In M. Durie & M. Ross (Eds.), The Comparative Method reviewed: Regularity and irregularity in language change (pp. 72–89). Oxford University Press.
Capell, A., & Hinch, H. E. (1970). Maung grammar: Texts and vocabulary (Vol. 98). Mouton.
Chomsky, N., & Halle, M. (1968). The sound pattern of English. Harper & Row.
Cohen Priva, U., & Gleason, E. (2020). The causal structure of lenition: A case for the causal precedence of durational shortening. Language, 96(2), 413–448. http://doi.org/10.31234/osf.io/awvzd
Cutler, A. (1996). Prosody and the word boundary problem. In J. L. Morgan & K. Demuth (Eds.), Signal to syntax: Bootstrapping from speech to grammar in early acquisition (pp. 87–99). Erlbaum.
Cutler, A. (2008). The 34th Sir Frederick Bartlett Lecture: The abstract representations in speech processing. Quarterly Journal of Experimental Psychology, 61(11), 1601–1619. http://doi.org/10.1080/13803390802218542
Davis, C. J., & Taft, M. (2005). More words in the neighborhood: Interference in lexical decision due to deletion neighbors. Psychonomic Bulletin & Review, 12, 904–910. http://doi.org/10.3758/BF03196784
Dixon, R. M. W. (1972). The Dyirbal language of North Queensland. Cambridge University Press.
Dufour, S., & Grainger, J. (2020). The influence of word frequency on the transposed-phoneme priming effect. Attention, Perception, & Psychophysics, 82, 2785–2792. http://doi.org/10.3758/s13414-020-02060-9
Ennever, T., Meakins, F., & Round, E. (2017). A replicable acoustic measure of lenition and the nature of variability in Gurindji stops. Laboratory Phonology, 8(1), 20. http://doi.org/10.5334/labphon.18
Evans, N. (1998). Iwaidja mutation and its origins. In A. Siewierska & J. J. Song (Eds.), Case, typology and grammar: In honor of Barry J. Blake (pp. 115–150). John Benjamins.
Evans, N. (2000). Iwaidjan, a very un-Australian language family. Linguistic Typology, 4(2), 91–142.
Evans, N. (2003). Bininj Gun-Wok: A pan-dialectal grammar of Mayali, Kunwinjku and Kune. Pacific Linguistics.
Fife, J., & King, K. (1998). Celtic (Indo-European). In A. Spencer & A. M. Zwicky (Eds.), The handbook of morphology (pp. 477–499). Blackwell.
Garvin, P. L., & Ladefoged, P. (1963). Speaker identification and message identification in speech recognition. Phonetica, 9, 193–199.
Gaskell, M. G., & Marslen-Wilson, W. D. (1996). Phonological variation and inference in lexical access. Journal of Experimental Psychology: Human Perception and Performance, 22(1), 144–158. http://doi.org/10.1037/0096-1523.22.1.144
Grosjean, F. (1980). Spoken word recognition processes and the gating paradigm. Perception & Psychophysics, 28(4), 267–283. http://doi.org/10.3758/BF03204386
Harnad, S. (1987). Categorical perception: The groundwork of cognition. Cambridge University Press.
Heath, J. (1980). Nunggubuyu myths and ethnographic texts. Australian Institute of Aboriginal Studies.
Heath, J. (1984). Functional grammar of Nunggubuyu. Australian Institute of Aboriginal Studies.
Henrich, J., Heine, S., & Norenzayan, A. (2010). Most people are not WEIRD. Nature, 466, 29. http://doi.org/10.1038/466029a
Hirose, Y., & Mazuka, R. (2015). Predictive processing of novel compounds: Evidence from Japanese. Cognition, 136, 350–358. http://doi.org/10.1016/j.cognition.2014.11.033
Houlihan, K. (1975). The role of word boundary in phonological processes. [Doctoral dissertation, University of Texas, Austin].
Jackendoff, R., & Audring, J. (2020). Relational morphology: A cousin of construction grammar. Frontiers in Psychology, 11. http://doi.org/10.3389/fpsyg.2020.02241
Katz, J. (2021). Intervocalic lenition is not phonological: Evidence from Campidanese Sardinian. Phonology, 38(4), 651–692. http://doi.org/10.1017/S095267572100035X
Kilbourn-Ceron, O., Clayards, M., & Wagner, M. (2020). Predictability modulates pronunciation variants through speech planning effects: A case study on coronal stop realizations. Laboratory Phonology, 11(1), 5: 1–28.
Kiparsky, P. (1982). From cyclic phonology to lexical phonology. In H. van der Hulst & N. Smith (Eds.), The structure of phonological representations (Part I) (pp. 131–176). Foris Publications.
Kjelgaard, M. M., & Speer, S. R. (1999). Prosodic facilitation and interference in the resolution of temporary syntactic closure ambiguity. Journal of Memory and Language, 40(2), 153–194. http://doi.org/10.1006/jmla.1998.2620
Mailhammer, R. (2021). English on Croker Island: The synchronic and diachronic dynamics of contact and variation. De Gruyter Mouton. http://doi.org/10.1515/9783110707854
Mailhammer, R., & Harvey, M. (2018). A reconstruction of the Proto-Iwaidjan phoneme system. Australian Journal of Linguistics, 38(3), 329–359. http://doi.org/10.1080/07268602.2018.1470455
Marslen-Wilson, W., & Warren, P. (1994). Levels of perceptual representation and process in lexical access: Words, phonemes, and features. Psychological Review, 101(4), 653–675. http://doi.org/10.1037/0033-295X.101.4.653
Marslen-Wilson, W. D., & Welsh, A. (1978). Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology, 10(1), 29–63. http://doi.org/10.1016/0010-0285(78)90018-X
Matsui, M. (2011). The identifiability and discriminability between incompletely neutralized sounds: Evidence from Russian. In W.-S. Lee & E. Zee (Eds.), Proceedings of the 17th International Congress of Phonetic Sciences (pp. 1342–1345). Hong Kong: City University of Hong Kong.
Metsala, J. L. (1997). An examination of word frequency and neighborhood density in the development of spoken-word recognition. Memory & Cognition, 25(1), 47–56. http://doi.org/10.3758/bf03197284
Miller, G., & Isard, S. (1963). Some perceptual consequences of linguistic rules. Journal of Verbal Learning and Verbal Behavior, 2, 217–228.
Morrison, D. A. (2021). Vowel nasalisation in Scottish Gaelic: No evidence for incomplete neutralisation in initial mutation. Morphology, 31, 121–146. http://doi.org/10.1007/s11525-020-09347-5
Nolan, F. (1983). The phonetic bases of speaker recognition. Cambridge University Press.
Nooteboom, S. G. (1981). Lexical retrieval from fragments of spoken words: Beginnings vs endings. Journal of Phonetics, 9(4), 407–424.
Norris, D., & McQueen, J. M. (2008). Shortlist B: A Bayesian model of continuous speech recognition. Psychological Review, 115(2), 357–395. http://doi.org/10.1037/0033-295X.115.2.357
Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23(3), 299–325. http://doi.org/10.1017/S0140525X00003241
Norris, D., McQueen, J. M., & Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology, 47(2), 204–238. http://doi.org/10.1016/S0010-0285(03)00006-9
Port, R., & O’Dell, M. (1985). Neutralization of syllable-final voicing in German. Journal of Phonetics, 13, 455–471. http://doi.org/10.1016/S0095-4470(19)30797-1
Pym, N., & Larrimore, B. (1979). Papers on Iwaidja phonology and grammar. (Work Papers of SIL-AAB, Series A, Vol. 2.) Darwin: Summer Institute of Linguistics Australian Aborigines Branch.
Roll, M., Horne, M., & Lindgren, M. (2010). Word accents and morphology–ERPs of Swedish word processing. Brain Research, 1330, 114–123. http://doi.org/10.1016/j.brainres.2010.03.020
Roll, M., Horne, M., & Lindgren, M. (2011). Activating without inhibiting: Left-edge boundary tones and syntactic processing. Journal of Cognitive Neuroscience, 23, 1170–1179. http://doi.org/10.1162/jocn.2010.21430
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English verbs. Psycholinguistics: Critical Concepts in Psychology, 4, 216–271.
Shaw, J. A., Carignan, C., Agostini, T. G., Mailhammer, R., Harvey, M., & Derrick, D. (2020). Phonological contrast and phonetic variation: The case of velars in Iwaidja. Language, 96(3), 578–617. http://doi.org/10.1353/lan.2020.0042
Singer, R., Garidjalalug, N., Urabadi, R., Hewett, H., Mirwuma, P., Ambidjambidj, P., & Fabricius, A. (2021). Mawng dictionary. Aboriginal Studies Press.
Smirnova, E., Mailhammer, R., & Flach, S. (2019). The role of atypical constellations in the grammaticalization of German and English passives. Diachronica, 36(3), 384–416. http://doi.org/10.1075/dia.16033.smi
Smith, J. L. (2004). Making constraints positional: Toward a compositional model of CON. Lingua, 114, 1433–1464.
Smye, V., Josewski, V., & Kendall, E. (2010). Cultural safety: An overview. First Nations, Inuit and Métis Advisory Committee, 1, 28.
Spoehr, K. T. (1980). Word recognition in speech and reading: Toward a single theory of language processing. In P. Eimas & J. Miller (Eds.), Perspectives on the study of speech. Erlbaum.
Söderström, P., Horne, M., & Roll, M. (2017). Stem tones pre-activate suffixes in the brain. Journal of Psycholinguistic Research, 46, 271–280. http://doi.org/10.1007/s10936-016-9434-2
Taft, L. (1984). Prosodic constraints and lexical parsing strategies. [Doctoral dissertation, University of Massachusetts, Amherst].
Taft, M., & Forster, K. I. (1976). Lexical storage and retrieval of polymorphemic and polysyllabic words. Journal of Verbal Learning and Verbal Behavior, 15(6), 607–620.
Ussishkin, A., Warner, N., Clayton, I., Brenner, D., Carnie, A., Hammond, M., & Fisher, M. (2017). Lexical representation and processing of word-initial morphological alternations: Scottish Gaelic mutation. Laboratory Phonology, 8(1), 8. http://doi.org/10.5334/labphon.22
Walsh, M., Möbius, B., Wade, T., & Schütze, H. (2010). Multilevel exemplar theory. Cognitive Science, 34, 537–582. http://doi.org/10.1111/j.1551-6709.2010.01099.x
Wedel, A., Ussishkin, A., & King, A. (2019). Crosslinguistic evidence for a strong statistical universal: Phonological neutralization targets word-ends over beginnings. Language, 95(4), 428–446.