1. Introduction

1.1. American English coda glottalization

American English voiceless stops in syllable codas are sometimes pronounced with audible glottal constriction (Bellavance, 2017; Cohn, 1993; Eddington & Channer, 2010; Eddington & Taylor, 2009; Huffman, 2005; Kahn, 1976; Kaźmierski, 2018, 2020; Kaźmierski, Wojtkowiak, & Baumann, 2016; Kilbourn-Ceron, 2017; Kilbourn-Ceron, Clayards, & Wagner, 2020; Levon, 2006; Pierrehumbert, 1994, 1995; Redi & Shattuck-Hufnagel, 2001; Roberts, 2006). For example, the word ‘bat’ may be pronounced as [bæt], [bæt], [bæʔt], or [bæʔ]. Voiceless stop glottalization is common across languages (Harris, 2001; Kohler, 1994; Michaud, 2004), though in English it is better documented outside North America, where it has more widespread socio-indexical meaning (Ashby & Przedlacka, 2014; Clark & Watson, 2016; Docherty & Foulkes, 1999a; Docherty, Hay, & Walker, 2006; Fabricius, 2002; Gordeeva & Scobbie, 2013; Henton & Bladon, 1988; Higginbottom, 1964; Holmes, 1995; Johnston, 2007; Kerswill, 2007; Mathisen, 1999; Mees & Collins, 1999; Milroy, Milroy, Hartley, & Walshaw, 1994; Newbrook, 1999; Penney, Cox, Miles, & Palethorpe, 2018; Penney, Cox, & Szakay, 2019; Ramisch, 2007; Roach, 1973, 1979; Stuart-Smith, 1999; Tollfree, 1999, 2001; Watt & Milroy, 1999; Williams & Kerswill, 1999).

The accompanying audio files (a–c) exemplify three acoustic types of American English coda glottalization. These are visualized as waveforms and spectrograms in Figure 1. The first type (1a) shows aperiodicity in the waveform and visibly irregular glottal pulses preceding a typical [t] closure. Voicing irregularity is characteristic of glottal constriction, although one does not necessarily entail the other. The second type (1b) shows irregular glottal striations that appear throughout the short syllable coda, and no silent closure. In this type, it is difficult to determine whether or not an alveolar constriction exists based on acoustic or auditory evidence. In the third type (1c), well-defined glottal irregularity precedes a brief silence. Here, the stable formants and unidentifiable stop release (though see Bellavance, 2017) suggest that the silence is a sustained glottal closure rather than an oral one, and the accompanying audio file lacks a perceptible alveolar consonant transition.

Figure 1

Waveform and spectrogram representations of three utterances of the phrase not very produced by the second author. Each panel illustrates a different type of coda glottalization associated with the coda of not, which is indicated by the box in each panel. Spectrograms were calculated with a 10 ms Gaussian window and 0.7 ms step length.

These various types are sometimes distinguished with terms such as preglottalization, glottal reinforcement, glottalization, glottaling, glottal replacement, and full glottalization (e.g., Esling, Fraser, & Harris, 2005; Higginbottom, 1964; Milroy et al., 1994). In the current study, we refer to all three types with the cover term coda glottalization. Glottalized codas all share the primary feature of audible glottal constriction, despite its variable alignment with the nucleus vowel, and with an oral stop closure that may be missing or incomplete. We refer to types (1b) and (1c) as glottal stops, using the broad symbol [ʔ] to transcribe both types. These are defined by glottal constriction in the absence of an audible oral closure. While type (1b) does not have a sustained closure that results in silence, it involves the same basic articulation as (1c), and glottal stop production is inherently noisy and often incomplete (Garellek, 2013). For transcription purposes, we include type (1a) under the broad symbol [t]: Although it exhibits coda glottalization, it also has a definite alveolar closure.

1.2. Phonological distribution of coda glottalization

Coda glottalization in American English is attested primarily for /t/ and sometimes /p/ (Bellavance, 2017; Cohn, 1993; Eddington & Channer, 2010; Eddington & Taylor, 2009; Huffman, 2005; Kahn, 1976; Kaźmierski, 2018, 2020; Kaźmierski et al., 2016; Kilbourn-Ceron, 2017; Kilbourn-Ceron et al., 2020; Levon, 2006; Pierrehumbert, 1994, 1995; Roberts, 2006). In mainstream American English, irregular voicing accompanying an oral closure (as in 1a) is attested for both sounds, but irregular voicing or glottal closure without an identifiable oral closure (as in 1b–c) is attested only for coda /t/. Across other English varieties, glottalization is most often reported with /t/ and less often with /p/ and /k/, but coda glottalization is attested for all three voiceless stops /p, t, k/ in some varieties (Docherty & Foulkes, 1999a; Docherty, Foulkes, Milroy, Milroy, & Walshaw, 1997; Johnston, 2007; Jones, Kalbfeld, Hancock, & Clark, 2019; Milroy et al., 1994; Penney et al., 2019; Roach, 1973, 1979; Stoddart, Upton, & Widdowson, 1999; Stuart-Smith, 1999; Tollfree, 1999, 2001; Trudgill, 1999).

Coda glottalization occurs both word-internally (as in litmus pronounced [liʔ.məs]) and word-finally (as in eight people pronounced [eɪʔ.pʰi.pl]). It is especially common at phrasal junctures (Huffman, 2005). While it is reportedly variable, coda glottalization is more likely when the following onset consonant is a sonorant (Cohn, 1993; Davidson, Orosco, & Wang, under review; Huffman, 2005; Pierrehumbert, 1994, 1995; Roberts, 2006), though Huffman (2005) argues that this is true only phrase-medially.

1.3. The glottis and American English voicing contrasts

It has been proposed that coda glottalization is related to the stop voicing contrast in American English (Huffman, 2005; Keyser & Stevens, 2006; Pierrehumbert, 1994, 1995). The vocal folds normally vibrate when air flows through the glottis, producing voicing. To produce a voiceless pulmonic sound, the glottis can be configured to inhibit vocal fold vibration, such as through either glottal spreading or constriction (Garellek, 2019). If glottal spreading is used, a possible side-effect is breathy voice or aspiration during the gestural transition (Löfqvist & McGowan, 1992; Seyfarth & Garellek, 2018). If glottal constriction is used, the side-effect would instead be audible glottalization or creaky voice (Ashby & Przedlacka, 2014; Garellek, 2012).

1.3.1. Glottal spreading in voiceless fricatives and voiceless onset stops

Glottal spreading is easy to identify in some American English voiceless sounds. For example, voiceless fricatives are often surrounded by audible pre- and post-aspiration, which indicates that the glottis must be spread during the fricative (Clayards & Knowles, 2015; Klatt, Stevens, & Mead, 1968; Löfqvist & McGarr, 1987; Löfqvist & McGowan, 1992; Munhall & Löfqvist, 1992). Although glottal constriction is a possible alternative means of achieving voicelessness, voiceless fricatives require high airflow, which is less compatible with a constricted glottis than a spread one. This makes glottal spreading aerodynamically preferable to constriction for inhibiting voicing in fricatives.

On the other hand, there is no aerodynamic preference between glottal spreading and constriction for American English voiceless stops. Nevertheless, spreading appears to be used for at least voiceless onset stops in foot-initial position, as evidenced by the voiceless post-aspiration that occurs in this position because the glottis is still spread once the oral closure is released (Cooper, 1991; Löfqvist & McGowan, 1992; Löfqvist & Yoshioka, 1984).

1.3.2. Glottal constriction in voiceless coda stops

The perceptible glottalization associated with voiceless stops in syllable coda position (Section 1.2) implies that American English speakers may use glottal constriction rather than spreading in order to inhibit voicing in this position. The acoustic and auditory evidence for glottal constriction in voiceless stops is supported by measurements of transglottal pressure (Westbury & Niimi, 1979) and laryngoscopic imaging (Fujimura & Sawashima, 1971). If this voicelessness gesture is what causes coda glottalization, why is perceptible glottalization systematically variable? In particular, why should glottalization be more likely before sonorants, and why is it better attested for /t/?

1.3.3. Previous accounts for voiceless stop coda glottalization

Pierrehumbert (1994, 1995) argues that glottal constriction in voiceless stops might be preferred to inhibit voicing before nasals and /l/ for perceptual reasons, and that glottal constriction before these sounds has been phonologically generalized to also occur before other sonorants.2 To account for the high rate of /t/ glottalization relative to /p, k/, Keyser and Stevens (2006) hypothesize that constriction may not be used with /p, k/ codas because the tongue can be stiffened during /p, k/ to increase intraoral pressure, which reduces airflow and adequately inhibits voicing. However, the front of the tongue must remain flexible to produce a /t/, and so they propose that /t/ requires a different strategy—glottal constriction—to inhibit voicing. Yet Huffman (2005) argues that the phonological distribution of coda glottalization does not fully conform to Pierrehumbert’s predictions, and American English coda glottalization is attested for /p/ as well as /t/ (Huffman, 2005; Pierrehumbert, 1994, 1995), which is not expected under Keyser and Stevens’s proposal.

To explain the variability of coda glottalization, Huffman (2005) suggests that glottal constriction is generally optional for voiceless stops (Cohn, 1993), and that coda glottalization appears more often before sonorants because of anticipatory coarticulation. If the velum is lowered early before a nasal onset, the acoustic side-branch increases the supraglottal volume and lowers the pressure, which facilitates transglottal airflow (Cohn, 1993). Similarly, anticipation of /l/ might involve early opening of the lateral side-channels, which could have a similar effect (Moll & Daniloff, 1971). When air flows through the constricted glottis in response to the pressure difference, the result is audible constricted voicing. Coda glottalization thus seems to be most common before sonorants because it is most likely to be audible in this environment. This implies that coda glottalization should be more common before /n, m, l/ onsets, but not before the other American English sonorants /ɹ, w, j/, which lack acoustic side-branches.

1.4. The current proposal

Our proposal is based on the assumption that voicelessness for American English coda stops is typically produced through glottal constriction (Fujimura & Sawashima, 1971; Kahn, 1976; Westbury & Niimi, 1979), as opposed to the glottal spreading used for voiceless onsets. If the oral constriction for a coda stop is not audible, not produced, or not aligned with this glottal constriction gesture, the result is perceptible coda glottalization (Selkirk, 1972, p. 194; Browman & Goldstein, 1992; Cohn, 1993; Manuel & Vatikiotis-Bateson, 1988; Parrell & Narayanan, 2018). In the present study, we explore and attempt to explain the variability of perceptible coda glottalization. Our data are drawn from audio recordings, and are thus not suited to evaluate the presence or absence of a physical glottal constriction gesture. Whether this physical gesture is indeed always present (Kahn, 1976), is an optional variant of voiceless codas (Huffman, 2005), or is itself conditioned, is a question for future research (see discussion in Section 5.6.1).

We propose that perceptible coda glottalization is variable for two reasons (following Browman & Goldstein, 1990). First, the alignment and completeness of the oral and glottal constriction gestures vary in connected speech. Glottal constriction that begins earlier than the oral constriction may appear as pre-glottalization (similar to 1a), and a complete early glottal closure (as in 1c) can silence an oral one. If the oral constriction is incomplete or not produced (as in 1b–c), then audible glottalization may be the only sign of the expected coda stop.

Oral /p, t, k/ constrictions are reduced at different rates and in different ways, which accounts for the variability in perceptible coda glottalization among these sounds. In mainstream American English, /t/ lacks an oral constriction much more often than /p/ and /k/, which is likely due to the low information content of /t/ in American English (Cohen Priva, 2008, 2012, 2015, 2017). A labial /p/ constriction is usually not reduced in English (Jun, 1996), and an incomplete /k/ more typically results in spirantization (Riebold, 2011), which is both highly audible and likely to conceal the acoustic and visual signs of glottal constriction. When it is not omitted, a coronal /t/ constriction is shorter and has less alveolar contact when it is followed immediately by a non-coronal consonantal constriction, especially at faster speech rates (Barry, 1991; Browman & Goldstein, 1995; Byrd & Tan, 1996; Kühnert & Hoole, 2004; Sung & Kochetov, 2018).3 Given these patterns, glottalization should be attested most often for /t/, especially when followed by a labial or velar onset consonant, and especially at faster speech rates.

The second source of variability in coda glottalization is that some phonological contexts facilitate perceptible irregular voicing more than others (Huffman, 2005). At phrase junctures, irregular voicing is common (Redi & Shattuck-Hufnagel, 2001), where it may be used independently to signal a boundary (Kilbourn-Ceron, 2017; Kreiman, 1982; Slifka, 2007; Umeda, 1978). When a voiceless stop precedes a boundary, irregular voicing at the boundary may reinforce the perceptibility of the stop’s glottal constriction.

In this context, we also test the hypothesis in Huffman (2005) that coda glottalization is more audible before /n, m, l/ due to an anticipatory increase in supraglottal volume, which facilitates transglottal airflow and thus makes glottalization more audible (Cohn, 1993; Moll & Daniloff, 1971; see Section 1.3). To preview our results, we find that coda glottalization before sonorants is near-categorical and not limited to /n, m, l/, which is inconsistent with this final hypothesis. Elsewhere, however, the distribution of coda glottalization is consistent with the other physical causes proposed above. For this reason, we ultimately argue that perceptible coda /t/ glottalization is planned before sonorants, whereas elsewhere it is primarily the consequence of unplanned physical and mechanical variation.

2. Data collection and annotation

2.1. Overview

To support this proposal, we examine the distribution of perceptible glottal stops in a conversational corpus of American English speech. The focus of the analyses is the phonetic and phonological context surrounding the variable coda glottalization: the following onset consonant, the position of the coda in a phrase, stress before and after each coda, and the talker’s rate of speech.

The two analyses in this study are exploratory. In the first analysis, we model the rate of hand-annotated coda glottalization in word-final position before each individual onset consonant. This model estimates and adjusts for other factors, such as speech rate, but these other factors are treated as independent predictors. The second analysis uses a model tree to automatically search for significant contrasts and interactions within the data. In both cases, we provide an account for the phonological distribution indicated by the models, and argue that this distribution is most consistent with our proposal that (except before sonorants) variation in perceptible American English coda glottalization derives from phonetic reduction of the oral constriction and reinforcement of the glottal one. Our annotated dataset is available at https://doi.org/10.5281/zenodo.3332888.

2.2. The Buckeye Corpus

The data from this study come from the Buckeye Corpus of Conversational Speech (Pitt et al., 2007). The Buckeye Corpus contains interviews with forty white middle-class native English speakers who had grown up in central Ohio. Twenty speakers were under thirty years old and twenty were over forty; twenty were female and twenty were male. Although central Ohio straddles several dialect regions for white American English speakers, the demographic limitations of the corpus should be noted. The corpus comprises about 300,000 words of speech by the interviewees, and each interview is up to an hour long. Interviewees were asked about their background, and were asked to discuss their opinions about everyday topics.

The audio files in the Buckeye Corpus were recorded at a 16 kHz sampling rate with 16-bit depth. The recordings were transcribed orthographically, then automatically force-aligned based on a phonetic dictionary. The dictionary transcriptions and segment time-stamps were then hand-corrected by trained annotators to create a close phonetic transcription for each word token. Among the other close transcription conventions (described in the corpus manual), the close transcriptions include [ʔ] in place of some dictionary /t/ segments. Longer portions of irregular voicing were noted by annotators in a separate transcription file.

2.3. Data used in analysis

For this study, we identified a subset of the recorded words containing a singleton /t, p/ coda. We annotated both /t/ and /p/ codas in the corpus with the intent of analyzing both segments in this study. However, we found that /p/ codas meeting the criteria were too rare (947 included tokens) and acoustically heterogeneous in the natural speech of the Buckeye Corpus to draw useful conclusions. The data that we used in our present analyses therefore include only /t/ codas. Our complete annotated dataset, including /p/ codas with acoustic measurements, is made available for other researchers (see the accompanying Data Accessibility statement and Seyfarth & Garellek, 2015 for a preliminary analysis).

The tokens included in this dataset are all those in the corpus that met these criteria:

  1. The target segment was in a singleton syllable coda in the dictionary transcription of the word. Syllabification uses the procedure from Gorman (2013, Appendix B), with two modifications. First, /pj/ and /tw/ sequences were considered to be permissible word-medial onsets (as in popular, between). Second, postvocalic /ɹ/ was never included in a syllable nucleus.

  2. If the target segment was /t/ in the dictionary transcription of the word, annotators had given it a close transcription of [t, ʔ, d, ɾ, t͡ʃ], or [s]. If the target segment was dictionary /p/, annotators had also transcribed it as [p]. Over 95% of codas were word-final, which made it straightforward to check the correspondence between the dictionary and close transcriptions. For the remaining word-medial codas, we identified a possible matching segment in the close transcription that was preceded by any vowel and followed by any consonant, and then hand-checked the set of matches.4 This criterion excluded 1,693 /t/ codas and 81 /p/ codas, which were primarily auditorily-deleted codas.

  3. If the target /t, p/ coda was word-final, it was not followed by a vowel in either the dictionary or close transcription of the following word. Although /t/ glottalization is also attested word-finally before vowels (Eddington & Channer, 2010; Eddington & Taylor, 2009; Kaźmierski, 2018; Kaźmierski et al., 2016; Kilbourn-Ceron, 2017; Roberts, 2006; Umeda, 1978), an analysis of coda glottalization in this environment is more complex because it is confounded by resyllabification and glottal stop insertion in onsetless syllables (Garellek, 2013). We discuss this environment further in Section 5.4.

  4. The vowel preceding the target segment was at least 50 milliseconds (ms) long.

  5. The word containing the target segment did not contain a speech error, disfluency, or other interruption; it did not overlap with the interviewer’s speech or non-speech recording noise; and its close transcription matched the segment timestamps provided in the corpus.
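The five criteria above can be read as a conjunctive filter over candidate tokens. The following Python sketch illustrates that reading only; it is not the procedure used to build the dataset. The `Token` fields and the `is_included` function are hypothetical names introduced for illustration, and the syllabification step in criterion 1 is assumed to have been carried out already.

```python
from dataclasses import dataclass

# Close transcriptions accepted for each dictionary segment (criterion 2).
ALLOWED_CLOSE = {
    "t": {"t", "ʔ", "d", "ɾ", "t͡ʃ", "s"},
    "p": {"p"},
}
MIN_VOWEL_MS = 50.0  # criterion 4


@dataclass
class Token:
    # All field names are hypothetical; they are not Buckeye Corpus fields.
    dict_segment: str             # "t" or "p" in the dictionary transcription
    close_segment: str            # segment in the close transcription
    singleton_coda: bool          # criterion 1, assumed precomputed
    word_final: bool
    next_starts_with_vowel: bool  # in the dictionary or close transcription
    preceding_vowel_ms: float
    clean: bool                   # criterion 5: no error, overlap, or mismatch


def is_included(tok: Token) -> bool:
    """Return True only if the token satisfies all five criteria."""
    if not tok.singleton_coda:
        return False
    if tok.close_segment not in ALLOWED_CLOSE.get(tok.dict_segment, set()):
        return False
    if tok.word_final and tok.next_starts_with_vowel:
        return False  # criterion 3: exclude word-final prevocalic codas
    if tok.preceding_vowel_ms < MIN_VOWEL_MS:
        return False
    return tok.clean


example = Token("t", "ʔ", True, True, False, 80.0, True)
print(is_included(example))
```

Writing the criteria as early returns keeps each exclusion reason separate, which would make it straightforward to count how many tokens each criterion removes.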

2.4. Annotation procedure

2.4.1. Manual annotation

The close phonetic transcriptions of the tokens used in this study were further reviewed by five trained annotators, then verified by the second author (or else were reviewed by the authors directly). Annotators viewed the transcriptions alongside waveform and spectrogram representations of the associated audio in Praat (Boersma & Weenink, 2019) using the default settings and a spectrogram frequency range of 0–8 kHz.

The annotators listened to the surrounding audio and judged whether each token was followed by a phrase boundary. A phrase boundary was marked if the target word was perceptibly lengthened, or if it was followed by a pause, breath, or pitch reset. Separately, the presence or absence of phrasal creak was annotated based on the corpus transcription logs followed by hand-inspection. Phrasal creak was defined as a portion of irregular voicing that lasted for at least twice the duration of the vowel preceding the target segment. Thus, a syllable rime with creaky voice localized only to the target vowel would not be considered to occur in phrasal creak.5

The close transcription for each target syllable coda was checked and corrected if necessary, using a set of categories that was expanded beyond the original corpus transcriptions. The following categories were used:

  • VOICELESS LABIAL OR ALVEOLAR STOP [p, t]: Nucleus vowel followed by near-absence of acoustic energy (corresponding to stop closure) and/or the presence of a transient (corresponding to the stop formation); presence of a transient following the closure (corresponding to the stop release).6 While this category often had some perceptible creaky voice (Figure 1a), as would be expected if voiceless stop codas canonically have glottal constriction (Section 1.3), these were not annotated as glottal stops due to a defined oral closure.

  • VOICED LABIAL STOP OR ALVEOLAR STOP/TAP [b, d, ɾ]: Same as for [p, t], except with voicing (as seen by the voice bar in the spectrogram and/or periodic oscillations in the waveform) lasting at least half the closure duration.

  • AFFRICATED VOICELESS ALVEOLAR STOP [t͡s]: [t] followed by at least 20 ms of high-frequency frication noise.

  • FRICATIVE [ɸ, β, s, z]: Nucleus vowel followed immediately by frication noise (no identifiable stop closure).

  • GLOTTAL STOP [ʔ]: Presence of a strong glottal pulse with complete damping at the vowel’s offset followed by silence (corresponding to a sustained glottal stop) or presence of irregular voicing with no oral stop closure or release burst (Figure 1b–c).

  • DELETED [∅]: Nucleus vowel followed by the following word’s initial segment with no evidence of /t, p/ closure or glottalization.7

Annotators excluded additional tokens that contained or were immediately followed by an abrupt cut-off, restart, or prolongation; tokens in which the vowel preceding the target segment was voiceless; tokens where another talker was speaking simultaneously; and tokens that were mislabeled in the corpus transcriptions.

2.4.2. Automatic annotation

In addition to manual annotation of each coda’s close transcription, phrase position, and phrasal creak, the data were also automatically annotated for the following variables:

  • WORD POSITION: Whether the target coda was word-medial or word-final.

  • VOWEL QUALITY: Dictionary transcription of the nucleus vowel preceding the target coda.

  • STRESS: Whether the syllable containing the target coda was stressed or unstressed. A syllable was considered to be stressed if it had primary or secondary stress in the CMU Pronouncing Dictionary, and unstressed otherwise. Monosyllabic function words (at, that, but, etc.) were always considered to be unstressed. Words that were not in the CMU Pronouncing Dictionary were hand-annotated by the first author.

  • FOLLOWING STRESS: Whether the syllable following the target coda (typically, the first syllable of the following word) was stressed or not, using the same criteria. Pre-pausal codas were marked as being followed by a syllable without stress.

  • SPEECH RATE: The number of syllables per second in the surrounding utterance. An utterance in the Buckeye Corpus was defined as a stretch of speech delimited by pauses of 500 ms or greater.

Finally, we noted the age (under 30 or over 40) and gender (female or male) of each speaker, based on the corpus manual.8
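As an illustration of the speech rate measure, the sketch below segments a word sequence into utterances at pauses of 500 ms or greater and computes syllables per second for each utterance. The input format (tuples of start time, end time, and syllable count) and the function names are assumptions made for this example. The sketch also assumes that utterance duration runs from the onset of the first word to the offset of the last, so that pauses shorter than 500 ms count toward the duration; the text above does not specify this detail.

```python
MIN_PAUSE_S = 0.5  # utterances are delimited by pauses of 500 ms or greater


def split_utterances(words, min_pause=MIN_PAUSE_S):
    """Group (start_s, end_s, n_syllables) tuples, sorted by time, into utterances."""
    utterances = []
    current = []
    for word in words:
        # Start a new utterance when the gap since the previous word is long enough.
        if current and word[0] - current[-1][1] >= min_pause:
            utterances.append(current)
            current = []
        current.append(word)
    if current:
        utterances.append(current)
    return utterances


def speech_rate(utterance):
    """Syllables per second from first word onset to last word offset."""
    duration = utterance[-1][1] - utterance[0][0]
    syllables = sum(n for _, _, n in utterance)
    return syllables / duration


# Hypothetical timings: a 0.7 s pause separates two utterances.
words = [(0.0, 0.3, 1), (0.3, 0.8, 2), (1.5, 2.0, 2), (2.0, 2.5, 1)]
rates = [speech_rate(u) for u in split_utterances(words)]
print(rates)
```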

2.5. Data summary

In total, the data include 12,451 singleton coda /t, p/ segments (11,504 /t/ and 947 /p/) including 477 unique words produced by 40 speakers. Nearly all (12,170 tokens; 98%) were word-final codas, and slightly over half (6,931 tokens; 56%) were phrase-medial. The four most common word types (that, it, but, not) comprise 50% of the data, and the fifteen most common comprise 85%, with a long tail. An earlier version of this dataset, including some acoustic measures and incomplete phrase annotations, was described and analyzed in Seyfarth and Garellek (2015).

The codas were followed by one of 21 different consonants or by a pause. Pre-pausal tokens were always utterance-final, and not closely followed by another speech sound. The most common environment was pre-pausal (4,056 tokens; 33%); codas were also common before /ð/ (1,556 tokens; 12.5%) or /w/ (1,037 tokens; 8.3%). The data included between 128 and 651 tokens before every other segment, except for uncommon /v/ (17 tokens; 0.1%) and /t͡ʃ/ (41 tokens; 0.3%).

The transcription of the consonant following the target codas was based on the dictionary transcription of the following word. We chose to use dictionary transcriptions of the following consonant, rather than close transcriptions, because we expected that the pronunciation of the following consonant might be equally influenced by the target coda (e.g., see Kaźmierski et al., 2016, on /j/ onsets).

2.5.1. Variable realizations of coda /t/

Figure 2 shows the empirical distribution of coda /t/ realizations before each type of following consonant. This figure reflects our corrected transcriptions, except for the deleted codas, which were not manually reviewed and are not included in our subsequent analyses.9 The overall average rate of coda /t/ deletion is lower than reported in previous work (Bybee, 2002; Guy, 1991), which is most likely because our dataset includes only singleton codas produced by white speakers.

Figure 2

Observed distributions of coda /t/ realizations before each type of following consonant. Deleted tokens were excluded from the analysis, but all other tokens were included.

Other than [t, ʔ] realizations, it is relatively common for coda /t/ to be voiced before /h/. Under the assumption that coda /t/ normally involves glottal constriction to inhibit voicing (Section 1.4), this makes sense: The glottal constriction during a voiceless /t/ will be reduced when it is immediately followed by a voiceless /h/, which is defined by a glottal spreading gesture. This leaves the glottis in a neutral position that is conducive to coarticulatory voicing between voiceless coda /t/ and voiceless onset /h/. On the other hand, this result would not make sense if coda /t/ (like onset /t/) normally involved glottal spreading to inhibit voicing: A subsequent /h/ would at most enhance a hypothetical /t/ glottal spreading gesture. This would continue to inhibit voicing, and thus coarticulatory voicing at a /t.h/ juncture should be rare, but this is not what we find.

It is also relatively common for coda /t/ to be voiced before other voiced coronals, affricated before /j/ (see also Kaźmierski et al., 2016), and spirantized before voiceless fricatives, especially /s, ʃ/. All of these realizations are phonetically natural in their respective environments.

2.5.2. Summary of corrections

We corrected a minority of the glottal stop transcriptions in the corpus. Of the 5,651 segments in our data that were originally transcribed as a glottal stop in the corpus, we hand-corrected only 421 (7%) to a [t], 54 (1%) to [d], and four others to [s, t͡s]. Of the 5,399 segments in our data that were originally transcribed as a [t], we hand-corrected 1,177 (22%) to a glottal stop [ʔ], as well as 465 (9%) to [d] and 141 (3%) to [s, t͡s]. Of the 454 segments that were originally transcribed as [d, ɾ, t͡ʃ, s], we hand-corrected 3 (1%) to [t] and 140 (31%) to [ʔ].
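The counts in the preceding paragraph can be checked with a short tabulation. The sketch below uses only the figures reported above; the variable and function names are ours. It confirms that the three original label groups sum to the 11,504 /t/ codas reported in Section 2.5, and that the corrections added more glottal stop labels than they removed.

```python
# Counts copied from the paragraph above:
# original corpus label -> (tokens with that label, {corrected label: count}).
corrections = {
    "ʔ": (5651, {"t": 421, "d": 54, "s, t͡s": 4}),
    "t": (5399, {"ʔ": 1177, "d": 465, "s, t͡s": 141}),
    "d, ɾ, t͡ʃ, s": (454, {"t": 3, "ʔ": 140}),
}

# Total number of /t/ codas across the three original label groups.
total_tokens = sum(total for total, _ in corrections.values())

# Tokens relabeled away from [ʔ], and tokens relabeled to [ʔ].
removed = sum(corrections["ʔ"][1].values())
added = sum(
    changed.get("ʔ", 0)
    for label, (_, changed) in corrections.items()
    if label != "ʔ"
)
net_change = added - removed


def percent(n, total):
    """Percentage rounded to the nearest whole number, as reported in the text."""
    return round(100 * n / total)


print(total_tokens, removed, added, net_change)
print(percent(1177, 5399), percent(140, 454))
```

The net change is positive (1,317 labels added against 479 removed).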

In general, then, our criteria for annotating glottal stops—obvious coda glottalization with no evidence for an oral constriction—resulted in annotating relatively more glottal stops than the original Buckeye Corpus transcriptions. Among the corrections to [p], we also identified two tokens originally transcribed as [p] that we hand-corrected to [ʔ], although these were not included in the following analysis of glottal stop rates.

Although the final category in the list above (deleted tokens) was defined in our annotation criteria, the annotators ultimately identified no tokens in this category that seemed to be deleted but had been mislabeled in the corpus transcriptions. This outcome further suggests that our criteria for glottal stops (as defined above and in Section 1.1) were more inclusive than those used by the original annotators, who preferred to label /t/ codas as [t] or deleted.

3. Estimation of glottal stop rates

In the first analysis, we modeled the rate at which the coda /t/ tokens in the dataset were pronounced as perceptible glottal stops in each phonetic environment. Because there were very few word-medial codas (2%), we excluded word-medial codas from the analysis in this section only, rather than pooling them with word-final codas or modeling interactions with word position. The distribution of glottal stops in word-medial codas is discussed in Section 3.6.

3.1. Model procedure

The probability of glottalization for the word-final /t/ codas was modeled with a multilevel logistic regression using the brms package for R (Bürkner, 2018; R Core Team, 2018). The dependent variable was whether each token was annotated as a glottal stop ([ʔ]) or not ([t, d, ɾ, t͡s, t͡ʃ, s]). The model included an overall intercept and parameters for the following onset segment (21 possible consonants, or a pause) and all interactions between following segment and phrase position (medial or final). Including interaction parameters between following segment and phrase position allows us to model how the effects of the following segment might change when a phrase juncture intervenes between a /t/ coda and that segment.

Also included were parameters for the preceding nucleus vowel (13 possible vowels), the presence of stress on the target and following syllable, the presence of phrasal creak, the talker’s speech rate in syllables per second, and the speaker’s age (old or young) and gender (female or male), and an interaction between age and gender. All categorical predictors were sum-coded with values of –0.5 or 0.5 for each parameter.
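To make the coding scheme concrete, the following sketch shows how a two-level predictor coded as –0.5 or 0.5 enters a logistic model. It is a simplified illustration in Python, not the brms model itself: it omits the group-level terms and the multi-level predictors, and the coefficient values and function names are hypothetical.

```python
import math


def code_binary(value, level_coded_positive):
    """Code a two-level predictor as +0.5 or -0.5."""
    return 0.5 if value == level_coded_positive else -0.5


def inv_logit(eta):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def glottal_stop_probability(intercept, slopes, coded):
    """Probability from a linear predictor on the log-odds scale."""
    eta = intercept + sum(slopes[name] * coded[name] for name in slopes)
    return inv_logit(eta)


# Hypothetical coefficients, chosen only to make the arithmetic visible.
intercept = 0.0
slopes = {"phrase_position": 1.2}

p_final = glottal_stop_probability(
    intercept, slopes, {"phrase_position": code_binary("final", "final")}
)
p_medial = glottal_stop_probability(
    intercept, slopes, {"phrase_position": code_binary("medial", "final")}
)
print(p_final, p_medial)
```

With this coding, the intercept is the log-odds midway between the two levels, and the slope is the full log-odds difference between them, since the coded values are one unit apart.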

In order to facilitate generalizations across speakers and word types, the model included group-level intercepts for each speaker, each word, and each following word. For each speaker, the model included group-level slopes for the following segment and all interactions with phrase position, and for speech rate. For each word type, the model included group-level slopes for phrase position. There were no group-level slopes for the following word type.

All parameters were estimated with the default priors provided by brms, except the population-level slopes, which were estimated with Gaussian priors with μ = 0 and σ = 2. The model was fit via Markov chain Monte Carlo with Stan using the default sampler (Carpenter et al., 2017; Stan Development Team, 2018) with four chains and 2,000 samples per chain, discarding the first 1,000 samples per chain as warm-up. Convergence and model fit were assessed via the potential scale reduction statistic (all <1.004) and visual inspection of the posterior predictive density (Gabry, Simpson, Vehtari, Betancourt, & Gelman, 2019).
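For readers unfamiliar with the potential scale reduction statistic, the sketch below computes a basic split version for one parameter from several chains of post-warm-up draws. This is an illustration under stated assumptions: it is a plain, non-rank-normalized variant written in Python, the simulated draws are not from our model, and the value reported by Stan or brms may be computed differently depending on the software version.

```python
import random
import statistics


def split_rhat(chains):
    """Basic split potential scale reduction statistic for one parameter."""
    half = min(len(chain) for chain in chains) // 2
    # Split each chain into two halves so that within-chain drift is detected.
    splits = []
    for chain in chains:
        splits.append(chain[:half])
        splits.append(chain[half:2 * half])
    n = half
    split_means = [statistics.fmean(s) for s in splits]
    # W: mean within-split variance; B: n times the variance of split means.
    within = statistics.fmean([statistics.variance(s) for s in splits])
    between = n * statistics.variance(split_means)
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


random.seed(0)
# Four simulated chains of 1,000 draws each from the same distribution.
mixed = [[random.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four simulated chains whose locations disagree (poor mixing).
stuck = [[random.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 0.0, 5.0, 5.0)]

print(split_rhat(mixed), split_rhat(stuck))
```

Values close to 1 indicate that the chains agree with one another, whereas chains that sample different regions yield values well above 1.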

3.2. Effects of the following onset consonant

Figure 3 shows the rate of perceptible glottal stops in place of /t/ before each onset consonant type (or pause), as estimated from the model posterior. The upper panel shows the estimated rates in phrase-medial position, and the lower panel shows the estimated rates in phrase-final position. The onset consonants are ordered from left to right by the rate of glottal stop production, with the highest rates of glottalization on the left, so the x-axis differs between the two panels.

Figure 3

Model estimates for the probability of a coda /t/ being realized as a glottal stop (y-axis) when a coda appears before each type of onset consonant (x-axis), ordered by rate of glottal stop production. Top panel shows glottal stop probabilities in phrase-medial position; bottom panel shows when a phrase juncture intervenes between the coda and the following onset. Note that the x-axis order differs between panels. Bar heights are median posterior estimates, and error bars are one standard error above and below the median. Model estimates for onset consonants with zero observations in the data are not included in the figure. In this and all subsequent plots, estimates are given with other predictors held at their mean value (for continuous predictors) or averaged over all levels of other predictors (for categorical predictors).

By referring to the model estimates rather than the raw (empirical) frequencies in the corpus, we are able to adjust for idiosyncratic effects associated with individual speakers and lexical contexts, as well as other variables that may affect glottalization rates but are not evenly distributed across contexts. The Bayesian model estimates also allow us to take into account imbalances in the frequencies of different contexts: The estimated rates for contexts that are rarely observed are drawn towards the conditional means (e.g., the estimated rate before rare /v/ is drawn toward the average rate before all other segments).
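The pull of sparse contexts toward the overall mean can be illustrated with a deliberately simplified analogy: the posterior mean of a normal mean with known noise variance and a normal prior. This is not the multilevel logistic model itself, and every number below is hypothetical.

```python
def shrink(sample_mean, n, noise_var, prior_mean, prior_var):
    """Posterior mean of a normal mean with known noise variance and a normal prior."""
    data_precision = n / noise_var
    prior_precision = 1.0 / prior_var
    weighted = data_precision * sample_mean + prior_precision * prior_mean
    return weighted / (data_precision + prior_precision)

# Two hypothetical contexts with the same observed effect (log-odds scale)
# but very different numbers of observations.
observed = 2.0
rare = shrink(observed, n=3, noise_var=4.0, prior_mean=0.0, prior_var=4.0)
common = shrink(observed, n=300, noise_var=4.0, prior_mean=0.0, prior_var=4.0)
```

The rarely observed context ends up noticeably closer to the prior mean than the frequently observed one, although both started from the same raw value.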

3.2.1. Sonorant effect

In both phrase-medial and phrase-final positions, glottal stops are attested almost categorically in place of coda /t/ before sonorants (88%+ of realizations, other than deleted /t/), while glottal stops are relatively less likely before obstruents. This effect involves all of the sonorants, and is not limited to /n, m, l/. If the sonorant effect were caused by anticipatory coarticulation (as proposed in Huffman, 2005, and Section 1.4), it should be limited to nasals and /l/. Because glottalization occurs so often before sonorants, including /ɹ, w, j/, and because there is no obvious coarticulatory source, we believe that coda /t/ glottalization may be a conditioned allophonic variant of /t/ before sonorants.

3.2.2. Coronal reduction

The second major pattern that can be observed is that phrase-medially (upper panel), glottal stops are annotated more frequently before labial and velar obstruents than before coronal obstruents and /h/. However, this pattern disappears phrase-finally (lower panel).

This is consistent with our claim that coda glottalization is more identifiable when the oral /t/ constriction is reduced (Section 1.4). A /t/ constriction is reduced in magnitude when it is immediately followed by a non-coronal consonant (Barry, 1991; Browman & Goldstein, 1995; Kühnert & Hoole, 2004; Sung & Kochetov, 2018, see also Browman & Goldstein, 1990). Reduction of the oral /t/ constriction allows the simultaneous glottal constriction to be more audible. When a phrase juncture intervenes between the two consonants, however, the first consonant is both lengthened (e.g., Wightman, Shattuck-Hufnagel, Ostendorf, & Price, 1992) and more articulatorily separated (Byrd, Kaun, Narayanan, & Saltzman, 2000) from the second consonant, which should inhibit this pattern. This accounts for the elevated rate of coda /t/ glottalization before labial and velar obstruents, but only phrase-medially.

3.3. Phrase-final position

Besides the effect of the following onset consonant, the rate of perceptible coda glottalization increases overall in phrase-final position compared to phrase-medial position. Phrase-final position is associated with 0.39 greater log-odds of glottalization compared to phrase-medial position (95% posterior density interval: 0.01 to 0.79). This is consistent with our proposal that phrase-final irregular voicing increases the audibility (and perhaps the extent) of the glottal constriction associated with /t/, making it more likely to completely obscure or replace the oral /t/ constriction.
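To give a sense of scale for an effect stated in log-odds, the sketch below converts the median estimate of 0.39 into an odds ratio and into a probability change. The 50% baseline is a hypothetical value chosen only for illustration; the size of the probability change depends on the baseline.

```python
import math

def inv_logit(x):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

effect = 0.39                        # median log-odds difference (Section 3.3)
odds_ratio = math.exp(effect)        # multiplicative change in the odds

baseline_p = 0.50                    # hypothetical phrase-medial probability
baseline_logit = math.log(baseline_p / (1.0 - baseline_p))
p_final = inv_logit(baseline_logit + effect)
```

Under these assumptions the odds of glottalization are multiplied by roughly 1.48, which moves a 50% baseline to just under 60%.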

The 95% interval for the overall effect of phrase position is large. This is likely because the effect of phrase-final position is not uniform with respect to the different (phrase-initial) onset consonants. As shown in Figure 3, coda glottalization rates before obstruents are generally higher when a phrase juncture intervenes (lower panel) than when it does not (upper panel). However, an intervening phrase juncture instead very slightly lowers coda glottalization rates before sonorants (see also Seyfarth & Garellek, 2015).

If the sonorant effect is considered to be an allophonic alternation, this decrease is consistent with the hypothesis that phonological alternations that depend on upcoming sounds are sensitive to the availability of those sounds during speech production (Côté, 2013; Kilbourn-Ceron, 2017; Kilbourn-Ceron et al., 2020; Kilbourn-Ceron & Sonderegger, 2018; Kilbourn-Ceron, Wagner, & Clayards, 2016; Tanner, Sonderegger, & Wagner, 2017; Wagner, 2012). Because a sonorant is arguably less accessible during planning when it occurs in a new phrase than when it occurs during the existing one, a phonological (planned) sonorant effect should apply slightly less often across phrase junctures than within the same phrase.

3.4. Other effects of phonetic context

3.4.1. Speech rate

Marginal predictions across increasing speech rate are visualized in Figure 4. A one-syllable-per-second increase in speech rate is associated with a median increase of 0.11 log-odds of glottalization (95% posterior density interval: 0.05 to 0.17). Deletion of oral /t/ is more likely at faster speech rates (Kul, 2015; Tanner et al., 2017). Thus, this result is consistent with our proposal that coda glottalization may be more perceptible when the oral constriction is incomplete or not produced, which is more likely at faster speech rates (Barry, 1991; Byrd & Tan, 1996; Kühnert & Hoole, 2004; Kul, 2015; Parrell & Narayanan, 2018; Sung & Kochetov, 2018).

Figure 4

Model estimates for the effect of speech rate (x-axis) on the probability of coda glottalization (y-axis). The line shows the median posterior estimate for the probability of coda glottalization at different values of speech rate. The line shading shows one standard error above and below the median. The histogram shows the distribution of speech rate in the data.

3.4.2. Stress

If the following syllable is stressed, the log-odds of coda glottalization increase by 0.49 (median estimate; 95% posterior interval: 0.27 to 0.72). Eddington and Channer (2010) also report that prevocalic word-final /t/ glottalization is more likely when the following syllable is stressed. Stress on the syllable containing the /t/ coda itself did not have a reliable effect on whether the coda was glottalized or not (median estimate = 0.10 decrease in log-odds of glottalization; 95% posterior density interval: 0.47 decrease to 0.28 increase). Marginal median predictions from the model are visualized in Figure 5.

Figure 5

Model estimates for the probability of coda glottalization (y-axis), depending on stress in the syllable containing the coda and in the following syllable (x-axis). Bar heights are median posterior estimates, and error bars are one standard error above and below the median.

We are unsure whether this effect can be described by our proposal. It is possible that stress on the following onset may be associated with reduction of the preceding consonantal coda gesture. Alternatively, phrasal stress (pitch and phrase accents) is also associated with laryngeal constriction (Bird & Garellek, 2019; Campbell & Beckman, 1997; Dilley, Shattuck-Hufnagel, & Ostendorf, 1996), perhaps as a means of achieving a prominent pitch peak. If the constriction is anticipated in the previous syllable, this would enhance the glottal constriction associated with the preceding coda and lead to more audible glottalization. However, this explanation would also seem to predict that accented (or stressed) syllables themselves have more coda glottalization, which we did not find to be the case.

On the other hand, the apparent effect of stress in the following syllable may also be related to the unbalanced distribution of words in the corpus. For example, we observed impressionistically that our data contained many two-word constructions in which the second word had initial stress (e.g., right nów, not réally), but few in which the second word was unstressed (under our criteria for stress; Section 2.5). That is, most such unstressed bigrams were sequences like that they, at the, etc., which are not meaningful constructions. When two syllables comprise a more holistic construction (see Bybee, 2001), a coda /t/ within a cluster between the two syllables may be more reduced (Hay, 2003), permitting more audible glottalization. Thus, the apparent effect of stress may actually result from the fact that there are more two-word constructions that have stress on the second syllable.

3.4.3. Phrasal creak

The presence of phrasal creak, which was defined as irregular voicing lasting more than twice the length of the nucleus vowel, did not have a reliable effect on coda glottalization (median estimate = 0.06 increase in log-odds of glottalization; 95% interval: –0.10 to 0.23). This is not fully consistent with our proposal, but the majority of the posterior density is in the predicted direction. If phrase-final creak reinforces coda glottalization (Section 1.4), then phrasal creak should also reinforce coda glottalization. Phrasal and phrase-final creak are distinct from coda glottalization in that they may derive from lower subglottal pressure in addition to or in the absence of glottal constriction (Bird & Garellek, 2019; Slifka, 2006). Both sources, however, should facilitate irregular voicing (both duration and magnitude) and thus make coda glottalization more salient.

3.4.4. Vowel quality

Figure 6 shows the rate of coda glottal stops in syllables with different nucleus vowels, as estimated from the model. There is a general trend for less coda glottalization in syllables with high vowels, though the differences between vowels are very small and unreliable compared to the effects of the following onset consonant (cf. Figure 3).

Figure 6

Model estimates for the probability of coda glottalization (y-axis) when a coda appears after each type of nucleus vowel (x-axis). Bar heights are median posterior estimates, and error bars are one standard error above and below the median.

Laryngeal and epilaryngeal constriction naturally co-occur with tongue lowering and retraction, and glottal constriction is a consequence of laryngeal constriction (Moisik, Czaykowska-Higgins, & Esling, 2019). Coda glottalization may therefore be more audible after low and retracted vowels due to this enhancement of the glottal constriction gesture. Reduction of the oral constriction may also be involved: Lowering the jaw to produce a low vowel increases the distance that oral articulators need to move in order to produce a constriction, and so an oral constriction is more likely to be incomplete after low vowels (see Brown, 2004; Brown & Raymond, 2012; Raymond & Brown, 2012, on the phonetic conditioning of historical Spanish /f/ > /h/). We emphasize, however, that the differences in glottalization rates among most of the vowel types are very small.

3.5. Age and gender of the speaker

Figure 7 shows marginal median predictions for coda /t/ glottalization by age and gender. Younger speakers had median 0.70 increased log-odds of coda glottalization compared to older speakers overall (95% interval: 0.18 to 1.21 log-odds). The overall difference between female and male speakers was smaller and less reliable (0.41 greater log-odds for female speakers; 95% interval: –0.09 to 0.90), though the effect of gender may be larger within younger speakers (0.39 further increase in log-odds for female speakers who are younger; 95% interval: –0.59 to 1.39).

Figure 7

Model estimates for the probability of coda glottalization (y-axis), depending on the age and gender of the speaker (x-axis). Bar heights are median posterior estimates, and error bars are one standard error above and below the median.

In several varieties of American English, coda glottalization is reportedly more common for younger speakers (Eddington & Channer, 2010; Eddington & Taylor, 2009; Roberts, 2006), and Kaźmierski (2020) finds the same qualitative pattern for prevocalic word-final /t/ glottalization in the Buckeye Corpus. In some English varieties outside the United States, younger speakers also have higher rates of glottal stop production (Holmes, 1995; Mathisen, 1999; Penney et al., 2019; Smith & Holmes-Elliott, 2018; Stoddart et al., 1999; Watt & Milroy, 1999).

3.6. Word-medial codas

There were 166 tokens of word-medial coda /t/ in our data. The majority (103 tokens; 62%) were transcribed as glottal stops. As in word-final position, glottal stops occurred at the highest rates before approximants (85%) and nasals (70%), and at lower rates before obstruents (38%).

4. Phonological subgroups

While we interpreted the estimates in Figure 3 as evidence that /t/ glottalizes before sonorants at higher rates than before other consonants, it is desirable to evaluate whether this phonological generalization reflects a robust separation in the data, or whether another kind of generalization involving the particular approximants and nasals in English is more appropriate. In the second analysis, we use an automatic procedure to identify important separations within the data based on features of the phonological context. As before, though, this analysis is exploratory, and our interpretations of the results are post-hoc. In Section 5.2.3, we discuss directions for confirmatory research that might be used to falsify our account.

4.1. Model procedure

We fit a multilevel logistic model tree to all of the dictionary word-final and word-medial /t/ codas in our dataset using the glmertree, partykit, and lme4 packages for R (Bates, Mächler, Bolker, & Walker, 2015; Fokkema, Smits, Zeileis, Hothorn, & Kelderman, 2018; Hothorn & Zeileis, 2015; Zeileis, Hothorn, & Hornik, 2008). The procedure uses the model-based recursive partitioning strategy described in Fokkema et al. (2018) to separate the data into subgroups.

In our analysis, the dependent measure is whether a /t/ coda is realized as [ʔ] or not, as defined in Section 2.4. The model-fitting procedure partitions the data into subgroups that have different rates of glottalization. For example, the procedure might choose to partition the data on the basis of phrase position, which might happen if phrase-final and phrase-medial /t/ codas are systematically associated with different rates of coda glottalization. At each step of the procedure, the best two-way partition is selected based on a parameter instability test (Zeileis et al., 2008). Each partition creates two subgroups, and additional partitions within each subgroup are recursively created until there are no more possible divisions that reflect systematic differences in the data, until the subgroups reach a minimum number of observations, or until the tree reaches a maximum depth.
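The general shape of recursive partitioning can be sketched with a toy example. This sketch is an assumption-laden stand-in: it chooses splits greedily by gain in binomial log-likelihood and stops at an arbitrary gain threshold, whereas the procedure described above selects partitions with a parameter instability test and includes group-level effects. The data, feature names, and thresholds are hypothetical.

```python
import math

def log_lik(successes, n):
    """Binomial log-likelihood at the maximum-likelihood rate (0 * log 0 counts as 0)."""
    total = 0.0
    for k in (successes, n - successes):
        if k > 0:
            total += k * math.log(k / n)
    return total

def count(rows):
    return sum(r["glottal"] for r in rows)

def best_split(rows, features, min_size):
    """Return (gain, feature, value) for the best one-level-vs-rest split, or None."""
    base = log_lik(count(rows), len(rows))
    best = None
    for f in features:
        for value in sorted({r[f] for r in rows}):
            yes = [r for r in rows if r[f] == value]
            no = [r for r in rows if r[f] != value]
            if len(yes) < min_size or len(no) < min_size:
                continue
            gain = log_lik(count(yes), len(yes)) + log_lik(count(no), len(no)) - base
            if best is None or gain > best[0]:
                best = (gain, f, value)
    return best

def grow(rows, features, min_size=5, min_gain=2.0, depth=0, max_depth=3):
    """Recursively partition `rows`; return a nested dict describing the tree."""
    split = best_split(rows, features, min_size) if depth < max_depth else None
    if split is None or split[0] < min_gain:
        return {"n": len(rows), "rate": count(rows) / len(rows)}
    _, f, value = split
    yes = [r for r in rows if r[f] == value]
    no = [r for r in rows if r[f] != value]
    return {
        "split": (f, value),
        "yes": grow(yes, features, min_size, min_gain, depth + 1, max_depth),
        "no": grow(no, features, min_size, min_gain, depth + 1, max_depth),
    }

def make(manner, position, n, k):
    """Build n hypothetical tokens, k of which are glottalized."""
    return [{"manner": manner, "position": position, "glottal": int(i < k)}
            for i in range(n)]

rows = (make("sonorant", "medial", 20, 18) + make("sonorant", "final", 20, 18)
        + make("obstruent", "medial", 20, 4) + make("obstruent", "final", 20, 12))
tree = grow(rows, ["manner", "position"])
```

In this toy dataset the first split separates obstruents from sonorants, and phrase position then divides only the obstruent branch, because the sonorant tokens have the same rate in both positions.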

In our model, we included the following phonological features as candidates that could be used to partition the data: the voicing, place, and manner of the following onset consonant (all annotated as ‘none’ for pre-pausal codas); the height and backness of the nucleus vowel; whether the coda was phrase-medial or phrase-final; whether the coda was word-medial or word-final; stress in the target syllable and in the following syllable; as well as articulatory speech rate and phrasal creak. Parameter instability tests for each variable were conducted with Bonferroni-corrected α = 0.05 and post-pruning with the Bayesian Information Criterion. Thus, all partitions represent statistically significant contrasts in the data.
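As a rough illustration of what a Bonferroni correction does, the sketch below divides the nominal α by the number of candidate partitioning variables. The count of eleven follows the list above; treating each variable as one test, and the exact form of the adjustment applied by the software, are assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a classic Bonferroni correction."""
    return alpha / n_tests

# Candidate partitioning variables as listed in the text.
candidates = [
    "onset voicing", "onset place", "onset manner",
    "vowel height", "vowel backness",
    "phrase position", "word position",
    "target stress", "following stress",
    "speech rate", "phrasal creak",
]
threshold = bonferroni_threshold(0.05, len(candidates))
```

Each individual test must therefore clear a much stricter threshold than 0.05 before a partition is accepted.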

A final logistic regression is then fit to the data within each subgroup, which predicts the rate of glottalization for tokens within that subgroup. For these final regression models, we used only the age and gender of the speaker to predict glottalization rates, since we found impressionistically that these did not interact with the partitioning variables.¹⁰ A multilevel (mixed-effects) model tree (Fokkema et al., 2018) also takes into account group-specific effects (i.e., random effects), such as speaker-specific idiosyncrasies, which are estimated across the entire dataset rather than separately within each subgroup. In our model, we included group-level intercepts for the speaker, word type, and the following consonant type.

4.2. Results and discussion

The partitioning procedure found seven partitions, which are shown in Figure 8. In the tree, the root variable (manner of the following onset consonant) is the most important division within the data, while lower-level divisions are less important. The terminal nodes show the estimated probability of coda glottalization within the final subgroups. For example, the estimated probability of coda glottalization is 63% (third subgroup) when the following consonant is a plosive, is bilabial, and is voiced (following the tree branching from the top).¹¹

Figure 8

Partitions and final subgroups for the logistic model tree. Branches show two-way divisions within a predictor variable that involve significantly different rates of coda glottalization. Terminal nodes show the estimated probability of coda glottalization in each of the seven subgroups, including the confidence interval (CI) and number of observations (N) for each subgroup, averaging over the levels of age and gender. Bar plots below each terminal node show the model estimates for the probability of coda glottalization (y-axis), depending on the age and gender of the speaker (x-axis).

Sonority of the following onset consonant. In predicting the rate of perceptible coda /t/ glottalization, the most important two-way division was whether the following consonant was a sonorant (approximant or nasal) or an obstruent (all other manners). Coda /t/ before sonorants is associated with perceptible glottalization at the highest rate, relative to all other contexts.

Within the sonorants, codas that are followed by palatal /j/ have a lower rate of perceptible glottalization than codas that are followed by any of the other sonorants. This is likely because a /t.j/ sequence is often also pronounced [t͡ʃ], even across a word boundary (Kaźmierski et al., 2016, and see Figure 2). For this reason, the absolute proportion of glottalization before /j/ onsets is lower than before other sonorant onsets. Additionally, young and old speakers seem to have larger differences in this environment than in any other environment, based on the model estimates using age and gender of the speaker (shown in the bar plots below each terminal node).

Place of the following onset obstruent. Within the coda /t/s that were followed by an obstruent, glottalization is perceptible less often before coronal obstruents (or glottal /h/) than before the other obstruents. We argue that this is because the oral constriction of /t/ is reduced in magnitude when it is immediately followed by a second consonant, except when the second consonant is also coronal (Browman & Goldstein, 1990; Sung & Kochetov, 2018) or is /h/ (Kühnert & Hoole, 2004). Thus, due to this reduction of the /t/ oral constriction, the simultaneous glottal constriction is more perceptible before non-coronal obstruents. Additionally, if the following consonant is /h/, the glottal constriction gesture itself may be reduced due to blending with the glottal spreading that is necessary to produce the adjacent /h/ (see also Section 2.5.1). This makes glottal constriction less perceptible before /h/.

Phrase position. Within the coda /t/s that were followed by a coronal consonant (or /h/), glottalization was identified more often phrase-finally than phrase-medially. We argue that frequent irregular voicing in phrase-final syllables (Redi & Shattuck-Hufnagel, 2001) as well as in pre-pausal position (Slifka, 2006, 2007) would enhance the irregular voicing associated with glottal constriction, leading to an increase in the observed rate of coda glottalization. In the model tree, the pre-pausal codas (fourth subgroup) are separated from the phrase-final (pre-coronal) codas (fifth subgroup). This is likely an artifact of the partitioning procedure, because the estimated rate of coda glottalization before long pauses (52%) is nearly the same as the rate for the pre-coronal phrase-final codas (48%). We claim that the crucial conditioning context for coda glottalization here is simply phrase-final position, regardless of a pause.

If phrase-final position is important, why does phrase position only significantly divide the data when the following consonant is coronal (rightmost three subgroups in Figure 8)? We believe that phrase-final position also conditions glottalization when the following consonant is labial or velar, but the increase is not apparent in this environment because of a second phrase-related effect. As we argued above, a labial or velar consonant should reduce a preceding oral /t/ constriction and make glottalization more perceptible. However, the oral gesture should be reduced only when the labial or velar consonant is in the same phrase. The net result is that coda glottalization should be more perceptible before labials and velars regardless of a phrase boundary, because (i) phrase-finally, prosodically-conditioned irregular voicing enhances coda glottalization, and (ii) phrase-medially, a labial or velar onset consonant reduces the oral constriction of the coda. Thus, it appears that glottalization is not affected by phrase position in this branch of the model tree.

Voicing of the following consonant. Additionally, glottalization is estimated to be slightly more common before voiced labial and velar obstruents (63%) compared to voiceless ones (52%). Voiced obstruents typically involve supraglottal maneuvers to facilitate phonation, such as an increase in supraglottal volume, even when the phonation itself is absent (Ahn, 2018; Netsell, 1969; Westbury, 1983). If these maneuvers begin early, the consequent increase in transglottal airflow would facilitate irregular voicing when the glottis is constricted (Huffman, 2005).

Moreover, English voiceless obstruents in onset position typically involve glottal spreading. As with /h/ onsets, anticipation of glottal spreading in a voiceless onset should reduce the magnitude of glottal constriction in a preceding voiceless coda stop. This would further reduce the rate of coda glottalization before voiceless compared to voiced labial and velar onsets.

Stress after the coda. As in Section 3.4.2 and Eddington and Channer (2010), our analysis found that coda glottalization is more likely when the following syllable is stressed.

In all other environments, our account still allows for coda glottalization to be present at some low base rate due to occasional misalignment or omission of the oral /t/ closure in conversational speech.

5. General discussion

5.1. Summary of empirical findings

We explored the distribution of perceptible coda /t/ glottalization in a conversational speech corpus of American English. We found that singleton coda /t/ is pronounced as a glottal stop almost categorically before all sonorant onsets when not deleted, although somewhat less often before /j/, where coda /t/ can alternately be resyllabified into a [t͡ʃ] onset (Kaźmierski et al., 2016). Coda glottalization is common at phrase junctures, though it is even more likely when a sonorant onset follows the juncture. Additionally, glottalization often occurs phrase-medially before labial and velar obstruents, and slightly more often before voiced ones. Glottalization is more likely at higher speech rates across phonological contexts, as well as when the following syllable is stressed.

5.2. Physical and phonological causes of coda glottalization

Our account is based on the assumption that American English voiceless stops in coda position include a glottal constriction gesture that is used to inhibit voicing (Fujimura & Sawashima, 1971; Huffman, 2005; Kahn, 1976; Westbury & Niimi, 1979). We claim that much of the distribution of perceptible coda glottalization can be explained by physical variation in the production of simultaneous oral and glottal constriction gestures. In phonetic environments which favor a reduced oral constriction, or those which favor irregular voicing, the inherent glottal constriction gesture is more likely to cause perceptible coda glottalization. However, we do not believe that coda /t/ glottalization before sonorants is caused by a favorable phonetic environment (contra Huffman, 2005, Section 1.4). In this environment, we claim that reduction or deletion of the alveolar closure to produce a glottal stop is phonologically planned.

5.2.1. Physical variation and coda glottalization

Among the voiceless stops, coda /t/ is by far the most likely to have a reduced or omitted oral closure (Browman & Goldstein, 1995; Cohen Priva, 2008, 2012, 2015; Parrell & Narayanan, 2018), which accounts for why coda glottalization is most readily identifiable for /t/. When the oral closure is reduced or omitted, glottal constriction is the audible remnant of the voiceless coda. This is consistent with our finding that pre-consonant coda glottalization appears more often at higher speech rates, as well as more often before phrase-medial labial and velar obstruents, where a coronal closure is reduced in magnitude (Barry, 1991; Browman & Goldstein, 1995; Byrd & Tan, 1996; Kühnert & Hoole, 2004; Sung & Kochetov, 2018). At phrase junctures, coda glottalization is reinforced by independent irregular voicing (Redi & Shattuck-Hufnagel, 2001; Slifka, 2006, 2007). Before strongly voiced sounds, anticipatory supraglottal expansion (Ahn, 2018; Netsell, 1969; Westbury, 1983) facilitates constricted voicing. These account for our findings that coda /t/ glottalization is more perceptible at phrase junctures and before voiced onsets.

5.2.2. Planned coda /t/ glottalization

Before sonorants, we claim that coda /t/ glottalization is planned rather than the consequence of physical variation. In this environment, glottal stops are pronounced in place of coda /t/ in about 90% of the tokens which have any perceptible trace of /t/. Huffman (2005) proposed a mechanical account for this sonorant effect, namely that it derives from an anticipatory increase in supraglottal volume before nasals and /l/ (Section 1.3). While this account correctly predicts glottalization before /n, m, l/, it does not predict the equally-high rates of glottalization before the other sonorants /ɹ, w, j/, which would not involve side-branches that increase supraglottal volume. Given the near-categorical rate of glottal stop pronunciation before sonorants and the absence of an alternative mechanical explanation, we propose that pre-sonorant glottal stops are a planned allophonic variant of coda /t/.

5.2.3. Confirmatory research to test our account

Our analysis of coda glottalization is exploratory, and further confirmatory research is needed to investigate our account. At least two possible empirical findings might demonstrate that our account is incorrect. First, if future research finds that American English speakers glottalize coda /p/, not just /t/, at high rates before labial and velar onsets, that would be inconsistent with our proposal. A subsequent labial or velar consonant might condition reduction of a /t/ closure, but should not do so for /p/ (Jun, 1996). On the other hand, phrase-final irregular voicing should favor perceptible glottalization associated with /p/ and /t/ equally. Thus, if future research fails to find increased glottalization of coda /p/ in phrase-final compared to phrase-medial position (contra Huffman, 2005), that would also be inconsistent with our proposal.

Our proposal that pre-sonorant coda /t/ glottalization is an allophonic variant can also be tested with confirmatory research. We follow Wagner (2012) (among others; see Section 3.3) in assuming that phonological alternations which involve multiple words (and perhaps other kinds of alternations) must be planned during speech production. As a consequence, /t/ glottalization should be more likely when an upcoming sonorant onset is more accessible during planning, such as when an upcoming sonorant-initial word is highly predictable, when it occurs in the same prosodic constituent (Section 3.3), and when the speaker has additional time to plan (Kilbourn-Ceron et al., 2020, and ongoing work). If pre-sonorant coda /t/ glottalization is instead found to be less likely under these conditions, that would be inconsistent with our proposal.

5.3. Dialectal variation in coda glottalization

Coda glottalization varies across English dialects in both acoustics and phonological distribution. While the proposal here focuses on coarticulatory mechanics to account for coda glottalization in mainstream American English, a mechanics-based account does not entail that the distribution of coda glottalization should be identical across dialects and speakers. In particular, glottalization of some stops in some phonological environments is commonly associated with social meaning in other dialects (see references in Section 1.1), which largely has not been reported in white mainstream American English (but see Roberts, 2006 on rural Vermont English; Levon, 2006 on Reform American Jewish English; and Farrington, 2018; Fasold, 1981 on African American English). Inasmuch as particular varieties of coda glottalization carry socio-indexical meaning, we expect their production to be increasingly less predictable from mechanical factors. For example, Tyneside English has much higher rates of voiceless stop glottalization than American English (especially for /p/) which almost certainly results from the complex relationship between glottalization and social groups in that variety (Milroy et al., 1994).

Moreover, coarticulation itself is variable, and it depends on other articulatory and phonological patterns in a particular language (e.g., Cohn, 1993). Other varieties of English may reduce oral stop closures at different rates due to different usage patterns (Cohen Priva, 2017; Hay & Foulkes, 2016; Sóskuthy & Hay, 2017), or may specify a different alignment of the oral and glottal constriction gestures (Docherty & Foulkes, 1999a). Both of these will lead to different patterns of perceptible coda glottalization. Some speakers or dialects may also use alternate strategies to inhibit stop voicing. For instance, stop voicelessness can be achieved through glottal spreading rather than constriction (Section 1.3). If a language variety uses glottal spreading with coda /t/, the account discussed here would predict that a misaligned, incomplete, or omitted oral closure could produce audible coda aspiration and pre-aspiration, rather than glottalization (Parrell & Narayanan, 2018). Such coda aspiration patterns are attested in Liverpool English (Clark & Watson, 2016; Watson, 2002) and in Scottish and Welsh English (Gordeeva & Scobbie, 2013; Morris & Hejná, 2019, respectively). Other English varieties glottalize voiceless stops (and voiced stops: Farrington, 2018) in other phonological environments, which may interact with or supersede coda glottalization.

5.4. Word-final prevocalic glottalization

American English coda glottalization is also attested in word-final position before vowels, though at much lower rates than before consonants (Eddington & Channer, 2010; Eddington & Taylor, 2009; Kaźmierski, 2018, 2020; Kaźmierski et al., 2016; Kilbourn-Ceron, 2017; Kilbourn-Ceron et al., 2020; Roberts, 2006; Umeda, 1978). While prevocalic position does not necessarily favor /t/ reduction, glottal stops are often produced before vowel-initial words, especially at phrase junctures and in stressed syllables (Dilley et al., 1996; Garellek, 2013; Pierrehumbert & Talkin, 1991; Umeda, 1978). This glottalization pattern could reinforce coda /t/ glottalization, leading to increased rates of coda glottalization in word-final prevocalic position. Prevocalic coda glottalization should thus be more perceptible at phrase junctures and before stressed syllables.

This prediction is supported by the findings in Kilbourn-Ceron et al. (2020): Prevocalic coda /t/ glottalization is more likely when the word is lengthened (cf. speech rate in Kaźmierski, 2020), when it is followed by a short pause, or when the following word is relatively unpredictable. All of these are strong correlates of phrase junctures (e.g., Turk, 2010, Section 2.4.1), and Eddington and Channer (2010) further find that prevocalic coda /t/ glottalization is more likely before stressed syllables.

Eddington and Channer (2010) and Kaźmierski (2020) also propose that coda glottalization may be phonologically generalized from pre-consonantal environments (e.g., pre-sonorant) to prevocalic ones, as word-final /t/ occurs most often before consonants.

5.5. Other accounts for coda glottalization

5.5.1. Coarticulation and the sonorant effect

Huffman (2005) argues that if the sonorant effect is caused by coarticulation, it should be weaker phrase-finally, because coarticulatory anticipation should be attenuated across phrase boundaries. Our data show that the difference between sonorants and obstruents is indeed smaller in phrase-final position (Figure 3, lower panel). However, this is mostly due to a general increase in coda glottalization before obstruents at phrase boundaries, and glottalization occurs before sonorants much more often than before obstruents in both phrase positions. This increase before obstruents is consistent with the data reported in Huffman (2005; compare Figures 4 and 8 of that paper). Because the differences between obstruents and sonorants are smaller phrase-finally, the lack of a sonorant effect observed at a phrase boundary in that study might have been due to its smaller dataset.
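The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts are invented for the example and are not values from our corpus or from Huffman (2005), and the names `glottalization_rates`, `sonorant_effect`, and `make_tokens` are hypothetical helpers rather than part of our analysis code. The sketch shows how the sonorant-obstruent gap can shrink phrase-finally solely because the rate before obstruents rises, while the rate before sonorants stays higher in both positions.

```python
from collections import defaultdict


def glottalization_rates(tokens):
    """Return {(phrase_position, onset_class): proportion glottalized}."""
    counts = defaultdict(lambda: [0, 0])  # [glottalized count, total count]
    for position, onset, glottalized in tokens:
        cell = counts[(position, onset)]
        cell[0] += int(glottalized)
        cell[1] += 1
    return {key: g / n for key, (g, n) in counts.items()}


def sonorant_effect(rates, position):
    """Difference in glottalization rate: before sonorants minus before obstruents."""
    return rates[(position, "sonorant")] - rates[(position, "obstruent")]


def make_tokens(position, onset, n_glottalized, n_total):
    """Build toy token records; the first n_glottalized are marked glottalized."""
    return [(position, onset, i < n_glottalized) for i in range(n_total)]


# Invented counts, for illustration only (NOT corpus data).
tokens = (
    make_tokens("medial", "sonorant", 8, 10)
    + make_tokens("medial", "obstruent", 2, 10)
    + make_tokens("final", "sonorant", 8, 10)
    + make_tokens("final", "obstruent", 5, 10)
)

rates = glottalization_rates(tokens)
medial_gap = sonorant_effect(rates, "medial")
final_gap = sonorant_effect(rates, "final")

print(f"phrase-medial gap: {medial_gap:.2f}")
print(f"phrase-final gap:  {final_gap:.2f}")
```

In this toy case the gap is smaller phrase-finally even though the rate before sonorants is unchanged, which is why a smaller gap alone does not show that the sonorant effect is attenuated at phrase boundaries.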

The finding that glottalization rates are elevated before sonorants even across phrase boundaries indicates that the sonorant effect probably does not derive from coarticulatory mechanics. We suggested (Section 5.2) that the sonorant effect might be a planned allophonic alternation instead (see also Cohn, 1993; Pierrehumbert, 1994). Pierrehumbert (1994) proposes that the sonorant effect is due to a phonological prototype in which /t/ glottalizes before /n/, and sufficiently similar phonological contexts (such as other nasals or other coronals) may also participate in the same alternation. In our data, however, coronal obstruents condition coda glottalization at the lowest rates, and /n/ does not condition coda glottalization substantially more than the other sonorants. Thus, if a prototype effect were the original source of the phonological alternation, it seems unlikely that it is currently active for our speakers.

5.5.2. Perception-based accounts

One type of account for coda glottalization refers to enhancement of the phonological features of /t/ (Davidson et al., under review; Keyser & Stevens, 2006; Pierrehumbert, 1994; Seyfarth & Garellek, 2015; Stevens & Keyser, 1989, 2010). For example, Pierrehumbert (1994, 1995) proposes that glottal constriction is used to inhibit voicing in coda /t/ before /n, m, l/ because glottal spreading has acoustic consequences that might be misattributed to the onset /n, m, l/, rather than being perceived as intended voicelessness. It has also been proposed that a glottal closure is used to replace an oral /t/ closure when /t/ might otherwise be masked by the following sound (Keyser & Stevens, 2006; Stevens & Keyser, 2010; see also Kohler, 1994; Slifka, 2007). Coda glottalization as an enhancement strategy could be planned or controlled by the speaker on some cognitive level (Buz, Tanenhaus, & Jaeger, 2016; Clayards & Knowles, 2015; Schertz, 2013; Seyfarth, Buz, & Jaeger, 2016), though online control is not necessarily required by these accounts.

These accounts are implicitly based on the perceptual needs of a listener. However, they assume that the listener’s goal is to identify phonological forms (segments, gestures, or features) rather than meaning (social, lexical, propositional, or other). Phonological information such as stop voicing or closure is useful to the listener, but only insofar as those things participate in the communication of meaning (Hall, Hume, Jaeger, & Wedel, 2016, 2018). A word-final /t/ carries very little information about lexical meaning in American English (Cohen Priva, 2008, 2012, 2015). In a communicative framework, then, American English word-final /t/ is a good candidate for reduction rather than enhancement in most contexts. Indeed, such a framework might even predict that the phonetic cues to English final /t/ are likely to be masked by the following onset when auditory enhancement is called for. This would extend the duration of the phonetic cues associated with the following onset, and a non-coronal word onset (especially a stressed one) can be an important cue to word identity (Turnbull, Seyfarth, Hume, & Jaeger, 2018).

Any such predictions, however, crucially depend on the function of particular phonetic cues in a particular language variety, and will differ in other varieties (Cohen Priva, 2017; Docherty & Foulkes, 1999b). Before making any perception-based predictions, it is important to determine how a specific group of listeners uses coda glottalization and other laryngeal articulations (Chong & Garellek, 2018; Penney, Cox, Miles, & Palethorpe, 2018; Penney, Cox, & Szakay, 2020; Sanker, 2019) and what expectations talkers have about their listeners’ perception.

5.6. Conclusions and future work

We hand-annotated and analyzed over ten thousand pre-consonantal singleton /t/ codas in a corpus of conversational English speech. Based on the distribution of glottal stops in the corpus, we argued that perceptible coda glottalization in this variety of American English can be understood primarily as a consequence of conditioned variability in the alignment and magnitude of simultaneous oral and glottal constriction gestures. In addition, coda glottalization before sonorant onsets may be a conditioned allophonic variant, as we found no plausible co-articulatory or mechanical motivation for the high rate of glottalization in this environment.

5.6.1. Articulatory research

As we noted in Section 1.4, our empirical findings concern the distribution of perceptible coda glottalization, and audio data are not suited to evaluate the physical configuration of the articulators. One direction for future work is to determine what phonetic and phonological factors condition physical glottal constriction in /t/ and other voiceless codas, and how the coordination between the glottal and oral constriction gestures is associated with audible coda glottalization (e.g., Davidson, Lang, Paterson, Abdullah, & Marantz, 2020). Previous articulatory measurements (Fujimura & Sawashima, 1971; Westbury & Niimi, 1979) indicate that glottal constriction generally occurs with English voiceless stop codas. However, these studies used highly invasive instruments, which might have encouraged unnatural or emphatic speech, and glottal constriction sometimes occurs anyway along with emphatic speaking styles. It will be important to use less invasive instruments such as modern external photoglottography (Bouvet, 2017; Suthau, Birkholz, Mainka, & Simpson, 2016) to measure glottal area during natural spontaneous speech, especially if this can be done simultaneously with instrumental measurement of oral constriction (Kim, Maeda, Honda, & Crevier-Buchman, 2018).

If we are correct in assuming that a glottal constriction gesture is associated with voiceless codas, then we would expect photoglottography to show decreased glottal opening leading into the coda stop. That gesture would presumably also be modulated by prosodic factors like prominence, with increased prominence associated with a stronger and longer constriction gesture, similar to what is found for word-initial glottalization (Dilley et al., 1996; Garellek, 2014; Pierrehumbert & Talkin, 1991, Section 5.4).

5.6.2. Social and historical research

Given that coda glottalization may be planned in at least one environment, the alternation potentially carries socio-indexical meaning. In other dialects of English, coda glottalization (and voiceless stop glottalization more generally) is more clearly associated with particular social groups. The age and gender differences found in the corpus (Eddington & Channer, 2010; Eddington & Taylor, 2009; Kaźmierski, 2020; Roberts, 2006; and Section 3.5), as well as two proposals that voiceless coda glottalization may reflect a mainstream American identity (Levon, 2006; Roberts, 2006), point towards future work on the sociolinguistic use of voiceless coda glottalization in American English.

A common historical sound change involves voiceless oral stops changing to glottal ones (Lass, 1976; Michaud, 2004; O’Brien, 2012). This change can be understood phonetically as the deletion of an oral constriction (Garrett & Johnson, 2013), leaving behind the glottal gesture that was used to inhibit voicing. In our study, we have argued that this pattern can be observed synchronically in environments that favor reduction or misalignment of the oral constriction (as proposed in Browman & Goldstein, 1992; Cohn, 1993; Manuel & Vatikiotis-Bateson, 1988; Parrell & Narayanan, 2018; Selkirk, 1972). Future work might explore the time-course of a glottalization sound change to evaluate whether it begins with such environments.

Data Accessibility Statement

The data are available at https://doi.org/10.5281/zenodo.3332888.

Additional file

The additional file for this article can be found as follows:

Appendix

Types of coda glottalization. Audio files illustrating three types of coda glottalization in the phrase ‘not very’. DOI: https://doi.org/10.5334/labphon.213.s1

Notes

  1. See also Figures 1, 2, 3 of Huffman (2005) for similar illustrations, and see Redi and Shattuck-Hufnagel (2001) and Keating, Garellek, and Kreiman (2015) for further discussion of variation in English creaky and glottalized voice qualities.
  2. Specifically, Pierrehumbert (1994, 1995) argues that voiceless stops should have an acoustic cue that can be exclusively attributed to phonological voicelessness. Before nasals and /l/ onsets, speakers should prefer glottal constriction to spreading, because the acoustic correlates of glottal spreading are easily misattributed to those particular sounds (see e.g., Garellek, Ritchart, & Kuang, 2016; Berkson, 2013, Section 8.3).
  3. A coronal /t/ gesture is also more likely to be masked by a subsequent oral constriction than /p, k/ (Browman & Goldstein, 1990, 1992; Byrd, 1992, 1994, 1996; Hardcastle & Roach, 1979; Jun, 2004). In principle, whether or not the oral gesture is masked or assimilated is independent of the audibility of the glottal constriction. Stevens and Keyser (2010) propose that coda /t/ glottalization can also serve to enhance an oral closure that could be masked by the following sound (see also Kohler, 1994, on glottalization in German); see Section 5.5.2 for further discussion.
  4. Cohen Priva (2015) provides a more complete procedure for aligning the dictionary and close transcriptions in the Buckeye Corpus.
  5. A comparison of the acoustics of glottalization and phrasal creak in a subset of these data appears in Garellek and Seyfarth (2016). In that paper, we found that glottal stops are best identified by a rapid increase in the relative amount of spectral noise, regardless of any phrasal creak.
  6. For this category, annotators also marked the presence of a perceptible stop release, which is not discussed in this exploratory study.
  7. Though we defined deleted tokens as a category for our annotators to consider, ultimately their review did not identify any additional tokens that should have been marked as deleted beyond the original transcriptions (see Section 2.5.2).
  8. The corpus manual lists speakers as old or young. Old speakers are those over 40. The age range for younger speakers is listed as under 40 in Section 1 of the manual but as under 30 in Section 2.1.
  9. We excluded deleted tokens because we predict that many of the same factors which should make glottalization more perceptible, such as reduction of an oral /t/ closure, are also associated with the perception of segmental deletion. Because our analyses involve a two-way categorical distinction (whether or not /t/ is pronounced as a glottal stop), excluding deleted tokens is necessary in order to infer whether these predictors are independently associated with perceptible glottalization.
  10. In this exploratory analysis, we experimented with a range of variables, procedures, and criteria for the model tree. The tree presented here is conservative in that it only involves splits that seemed to be robust across different analyses. Our data are available for further exploration at https://doi.org/10.5281/zenodo.3332888.
  11. Note that the estimates of glottal stop probability differ slightly from Figure 3; this is partially because the regression estimates in Section 3.1 are adjusted for all of the covariates, while the model tree provides unadjusted predictions within a subgroup of the covariate values.

Acknowledgements

We are grateful to Julia Dinwiddie, Hilda Parra, Alex Wang, Meagan Rose Baron, and Yushi Zhang for help annotating the data. We also thank the phonetics and phonology group at UC San Diego, as well as audiences at ICPhS 2015 in Glasgow, ASA 2016 in Salt Lake City, and Interspeech 2016 in San Francisco. An earlier version of this work appears in the Proceedings of ICPhS 2015 (Seyfarth & Garellek, 2015), and an analysis of phrasal creak acoustics using the same dataset appears in the Proceedings of Interspeech 2016 (Garellek & Seyfarth, 2016).

Funding Information

This research was supported by a National Science Foundation Graduate Research Fellowship (Division of Graduate Education) to the first author under grant number DGE-1144086. Any opinion, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Competing Interests

The authors have no competing interests to declare.

References

Ahn, S. (2018). The role of tongue position in laryngeal contrasts: An ultrasound study of English and Brazilian Portuguese. Journal of Phonetics, 71, 451–467. DOI:  http://doi.org/10.1016/j.wocn.2018.10.003

Ashby, M., & Przedlacka, J. (2014). Measuring incompleteness: Acoustic correlates of glottal articulations. Journal of the International Phonetic Association, 44(03), 283–296. DOI:  http://doi.org/10.1017/S002510031400019X

Barry, M. (1991). Temporal modelling of gestures in articulatory assimilation. In Proceedings of the 12th International Congress of Phonetic Sciences, 4, 14–17.

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. DOI:  http://doi.org/10.18637/jss.v067.i01

Bellavance, S. (2017). Co-occurrence of /t/ Variants in Young Vermont Speakers (Unpublished doctoral dissertation).

Berkson, K. (2013). Phonation types in Marathi: An acoustic investigation (Unpublished doctoral dissertation). University of Kansas.

Bird, E., & Garellek, M. (2019). Dynamics of voice quality over the course of the English utterance. In Proceedings of the 19th International Congress of Phonetic Sciences (pp. 2406–2410).

Boersma, P., & Weenink, D. (2019). Praat: Doing phonetics by computer [Computer program].

Bouvet, A. (2017). Calibration of external lighting and sensing photoglottograph. 10th international workshop Models and Analysis of the Vocal Emissions for Biomedical Applications (MAVEBA).

Browman, C. P., & Goldstein, L. (1990). Tiers in articulatory phonology, with some implications for casual speech. Papers in laboratory phonology I: Between the grammar and physics of speech, 341–376. DOI:  http://doi.org/10.1017/CBO9780511627736.019

Browman, C. P., & Goldstein, L. (1992). Articulatory phonology: An overview. Phonetica, 49(3–4), 155–180. DOI:  http://doi.org/10.1159/000261913

Browman, C. P., & Goldstein, L. (1995). Gestural syllable position effects in American English. In F. Bell-Berti & L. J. Raphael (Eds.), Producing Speech: Contemporary Issues. For Katherine Safford Harris. Woodbury, NY: AIP Press.

Brown, E. L. (2004). The reduction of syllable-initial /s/ in the Spanish of New Mexico and southern Colorado: A usage-based approach (Unpublished doctoral dissertation). Albuquerque, NM: University of New Mexico.

Brown, E. L., & Raymond, W. D. (2012). How discourse context shapes the lexicon: Explaining the distribution of Spanish f-/h words. Diachronica, 29(2), 139–161. DOI:  http://doi.org/10.1075/dia.29.2.02bro

Bürkner, P.-C. (2018). Advanced Bayesian Multilevel Modeling with the R Package brms. The R Journal, 10(1), 395–411. DOI:  http://doi.org/10.32614/RJ-2018-017

Buz, E., Tanenhaus, M. K., & Jaeger, T. F. (2016). Dynamically adapted context-specific hyper-articulation: Feedback from interlocutors affects speakers’ subsequent productions. Journal of Memory and Language. DOI:  http://doi.org/10.1016/j.jml.2015.12.009

Bybee, J. (2001). Phonology and language use. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511612886

Bybee, J. (2002). Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change, 14(3), 261–290. DOI:  http://doi.org/10.1017/S0954394502143018

Byrd, D. (1992). Perception of assimilation in consonant clusters: A gestural model. Phonetica, 49, 1–24. DOI:  http://doi.org/10.1159/000261900

Byrd, D. (1994). Articulatory Timing in English Consonant Sequences. UCLA Working Papers in Phonetics, 86.

Byrd, D. (1996). Influences on articulatory timing in consonant sequences. Journal of Phonetics, 24(2), 209–244. DOI:  http://doi.org/10.1006/jpho.1996.0012

Byrd, D., Kaun, A., Narayanan, S., & Saltzman, E. (2000). Phrasal signatures in articulation. In M. Broe & J. Pierrehumbert (Eds.), Papers in Laboratory Phonology V (pp. 70–87). Cambridge: Cambridge University Press.

Byrd, D., & Tan, C. C. (1996). Saying consonant clusters quickly. Journal of Phonetics, 24(2), 263–282. DOI:  http://doi.org/10.1006/jpho.1996.0014

Campbell, N., & Beckman, M. (1997). Stress, prominence, and spectral tilt. In Intonation: Theory, Models, and Applications (pp. 67–70). Athens, Greece.

Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., … Riddell, A. (2017). Stan: A Probabilistic Programming Language. Journal of Statistical Software, 76(1), 1–32. DOI:  http://doi.org/10.18637/jss.v076.i01

Chong, A., & Garellek, M. (2018). Online perception of glottalized coda stops in American English. Laboratory Phonology. DOI:  http://doi.org/10.5334/labphon.70

Clark, L., & Watson, K. (2016). Phonological leveling, diffusion, and divergence: /t/ lenition in Liverpool and its hinterland. Language Variation and Change, 28(01), 31–62. DOI:  http://doi.org/10.1017/S0954394515000204

Clayards, M., & Knowles, T. (2015). Prominence enhances voicelessness and not place distinction in English voiceless sibilants. Proceedings of ICPhS 2015.

Cohen Priva, U. (2008). Using Information Content to Predict Phone Deletion. In N. Abner & J. Bishop (Eds.), Proceedings of the 27th West Coast Conference on Formal Linguistics (pp. 90–98).

Cohen Priva, U. (2012). Sign and signal: Deriving linguistic generalizations from information utility (Unpublished doctoral dissertation). Stanford University.

Cohen Priva, U. (2015). Informativity affects consonant duration and deletion rates. Laboratory Phonology, 6(2). DOI:  http://doi.org/10.1515/lp-2015-0008

Cohen Priva, U. (2017). Informativity and the actuation of lenition. Language, 93(3), 569–597. DOI:  http://doi.org/10.1353/lan.2017.0037

Cohn, A. C. (1993). Nasalisation in English: Phonology or phonetics. Phonology, 10(01), 43–81. DOI:  http://doi.org/10.1017/S0952675700001731

Cooper, A. M. (1991). An articulatory account of aspiration in English (Unpublished doctoral dissertation). Yale University.

Côté, M.-H. (2013). Understanding cohesion in French liaison. Language Sciences, 39, 156–166. DOI:  http://doi.org/10.1016/j.langsci.2013.02.013

Davidson, L., Lang, B., Paterson, H., Abdullah, O., & Marantz, A. (2020). Covert contrast in the articulatory implementation of glottal variants of coda /t/ in American English. (Poster presented at the Linguistic Society of America Annual Meeting, New Orleans, January 2–6, 2020).

Davidson, L., Orosco, S., & Wang, S.-F. (under review). The link between syllabic nasals and glottal stops in American English.

Dilley, L., Shattuck-Hufnagel, S., & Ostendorf, M. (1996). Glottalization of word-initial vowels as a function of prosodic structure. Journal of Phonetics, 24(4), 423–444. DOI:  http://doi.org/10.1006/jpho.1996.0023

Docherty, G., & Foulkes, P. (1999a). Derby and Newcastle: Instrumental phonetics and variationist studies. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 105–142). New York, NY: Routledge.

Docherty, G., & Foulkes, P. (1999b). Sociophonetic variation in ‘glottals’ in Newcastle English. In Proceedings of the 14th International Congress of Phonetic Sciences (pp. 1037–1040).

Docherty, G., Foulkes, P., Milroy, J., Milroy, L., & Walshaw, D. (1997). Descriptive adequacy in phonology: A variationist perspective. Journal of Linguistics, 33(2), 275–310. DOI:  http://doi.org/10.1017/S002222679700649X

Docherty, G., Hay, J., & Walker, A. (2006). Sociophonetic patterning of phrase-final /t/ in New Zealand English. In P. Warren & C. I. Watson (Eds.), Proceedings of the 11th Australian International Conference on Speech Science & Technology (pp. 6). New Zealand: University of Auckland.

Eddington, D., & Channer, C. (2010). American English has got a lot of glottal stops: Social diffusion and linguistic motivation. American Speech, 85(3), 338–351. DOI:  http://doi.org/10.1215/00031283-2010-019

Eddington, D., & Taylor, M. (2009). T-glottalization in American English. American Speech, 84(3), 298–314. DOI:  http://doi.org/10.1215/00031283-2009-023

Esling, J. H., Fraser, K. E., & Harris, J. G. (2005). Glottal stop, glottalized resonants, and pharyngeals: A reinterpretation with evidence from a laryngoscopic study of Nuuchahnulth (Nootka). Journal of Phonetics, 33(4), 383–410. DOI:  http://doi.org/10.1016/j.wocn.2005.01.003

Fabricius, A. (2002). Ongoing change in modern RP: Evidence for the disappearing stigma of t-glottalling. English World-Wide, 23, 115–136. DOI:  http://doi.org/10.1075/eww.23.1.06fab

Farrington, C. (2018). Incomplete neutralization in African American English: The case of final consonant voicing. Language Variation and Change, 30(03), 361–383. DOI:  http://doi.org/10.1017/S0954394518000145

Fasold, R. W. (1981). The Relation between Black and White Speech in the South. American Speech, 56(3), 163. DOI:  http://doi.org/10.2307/454432

Fokkema, M., Smits, N., Zeileis, A., Hothorn, T., & Kelderman, H. (2018). Detecting treatment-subgroup interactions in clustered data with generalized linear mixed-effects model trees. Behavior Research Methods, 50(5), 2016–2034. DOI:  http://doi.org/10.3758/s13428-017-0971-x

Fujimura, O., & Sawashima, M. (1971). Consonant sequences and laryngeal control. Annual Bulletin of Research Institute of Logopedics and Phoniatrics, University of Tokyo, 5, 1–13.

Gabry, J., Simpson, D., Vehtari, A., Betancourt, M., & Gelman, A. (2019). Visualization in Bayesian workflow. Journal of the Royal Statistical Society: Series A (Statistics in Society), 182(2), 389–402. DOI:  http://doi.org/10.1111/rssa.12378

Garellek, M. (2012). The timing and sequencing of coarticulated non-modal phonation in English and White Hmong. Journal of Phonetics, 40(1), 152–161. DOI:  http://doi.org/10.1016/j.wocn.2011.10.003

Garellek, M. (2013). Production and perception of glottal stops (Unpublished doctoral dissertation). UCLA.

Garellek, M. (2014). Voice quality strengthening and glottalization. Journal of Phonetics, 45, 106–113. DOI:  http://doi.org/10.1016/j.wocn.2014.04.001

Garellek, M. (2019). The phonetics of voice. In W. F. Katz & P. F. Assmann (Eds.), The Routledge Handbook of Phonetics (pp. 75–106). Abingdon, Oxon; New York, NY: Routledge. DOI:  http://doi.org/10.4324/9780429056253-5

Garellek, M., Ritchart, A., & Kuang, J. (2016). Breathy voice during nasality: A cross-linguistic study. Journal of Phonetics, 59, 110–121. DOI:  http://doi.org/10.1016/j.wocn.2016.09.001

Garellek, M., & Seyfarth, S. (2016). Acoustic differences between English /t/ glottalization and phrasal creak. Interspeech 2016, 1054–1058. DOI:  http://doi.org/10.21437/Interspeech.2016-1472

Garrett, A., & Johnson, K. (2013). Phonetic bias in sound change. In A. C. Yu (Ed.), Origins of Sound Change: Approaches to Phonologization. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780199573745.003.0003

Gordeeva, O. B., & Scobbie, J. M. (2013). A phonetically versatile contrast: Pulmonic and glottalic voicelessness in Scottish English obstruents and voice quality. Journal of the International Phonetic Association, 43(03), 249–271. DOI:  http://doi.org/10.1017/S0025100313000200

Gorman, K. (2013). Generative phonotactics (Unpublished doctoral dissertation). University of Pennsylvania.

Guy, G. R. (1991). Explanation in variable phonology: An exponential model of morphological constraints. Language Variation and Change, 3(1), 1–22. DOI:  http://doi.org/10.1017/S0954394500000429

Hall, K. C., Hume, E., Jaeger, T. F., & Wedel, A. (2016). The Message Shapes Phonology. DOI:  http://doi.org/10.31234/osf.io/sbyqk

Hall, K. C., Hume, E., Jaeger, T. F., & Wedel, A. (2018). The role of predictability in shaping phonological patterns. Linguistics Vanguard, 4(s2). DOI:  http://doi.org/10.1515/lingvan-2017-0027

Hardcastle, W. J., & Roach, P. J. (1979). An instrumental investigation of coarticulation in stop consonant sequences. In H. Hollien & P. Hollien (Eds.), Proceedings of the IPS-77 Conference. John Benjamins. DOI:  http://doi.org/10.1075/cilt.9.56har

Harris, J. G. (2001). States of the glottis of Thai voiceless stops and affricates. Essays in Tai linguistics (pp. 3–12).

Hay, J. (2003). Causes and consequences of word structure. Routledge. DOI:  http://doi.org/10.4324/9780203495131

Hay, J., & Foulkes, P. (2016). The evolution of remembered /t/ over real and remembered time. Language, 92(2), 298–330. DOI:  http://doi.org/10.1353/lan.2016.0036

Henton, C. G., & Bladon, A. (1988). Creak as a sociophonetic marker. In L. M. Hyman & C. N. Lee (Eds.), Language, speech and mind: Studies in honor of Victoria A. Fromkin. Routledge.

Higginbottom, E. (1964). Glottal reinforcement in English. Transactions of the Philological Society, 63(1), 129–142. DOI:  http://doi.org/10.1111/j.1467-968X.1964.tb01010.x

Holmes, J. (1995). Glottal stops in New Zealand English: An analysis of variants of word-final /t/. Linguistics, 33(3). DOI:  http://doi.org/10.1515/ling.1995.33.3.433

Hothorn, T., & Zeileis, A. (2015). Partykit: A Modular Toolkit for Recursive Partytioning in R. Journal of Machine Learning Research, 16, 3905–3909.

Huffman, M. K. (2005). Segmental and prosodic effects on coda glottalization. Journal of Phonetics, 33(3), 335–362. DOI:  http://doi.org/10.1016/j.wocn.2005.02.004

Johnston, P. (2007). Scottish English and Scots. In D. Britain (Ed.), Language in the British Isles (pp. 105–121). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511620782.007

Jones, T., Kalbfeld, J. R., Hancock, R., & Clark, R. (2019). Testifying while black: An experimental study of court reporter accuracy in transcription of African American English. Language. DOI:  http://doi.org/10.1353/lan.0.0235

Jun, J. (1996). Place assimilation is not the result of gestural overlap: Evidence from Korean and English. Phonology, 13, 377–407. DOI:  http://doi.org/10.1017/S0952675700002682

Jun, J. (2004). Place assimilation. In B. Hayes, R. Kirchner, & D. Steriade (Eds.), Phonetically based phonology (pp. 58–86). Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511486401.003

Kahn, D. (1976). Syllable-based generalizations in English phonology (Unpublished doctoral dissertation). Massachusetts Institute of Technology.

Kaźmierski, K. (2018). Word-boundary intervocalic t-glottaling in Midland American English. Unpublished. DOI:  http://doi.org/10.13140/rg.2.2.22855.75688

Kaźmierski, K. (2020). Prevocalic t-glottaling across word boundaries in Midland American English. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 11(1), 13. DOI:  http://doi.org/10.5334/labphon.271

Kaźmierski, K., Wojtkowiak, E., & Baumann, A. (2016). Coalescent assimilation across word boundaries in American English and in Polish English. Research in Language, 14(3), 235–262. DOI:  http://doi.org/10.1515/rela-2016-0012

Keating, P., Garellek, M., & Kreiman, J. (2015). Acoustic properties of different kinds of creaky voice. In Proceedings of the 18th International Congress of Phonetic Sciences.

Kerswill, P. (2007). Standard and non-standard English. In D. Britain (Ed.), Language in the British Isles (pp. 34–51). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511620782.004

Keyser, S. J., & Stevens, K. N. (2006). Enhancement and Overlap in the Speech Chain. Language, 82(1), 33–63. DOI:  http://doi.org/10.1353/lan.2006.0051

Kilbourn-Ceron, O. (2017). Speech production planning affects variation in external sandhi (Unpublished doctoral dissertation).

Kilbourn-Ceron, O., Clayards, M., & Wagner, M. (2020). Predictability modulates pronunciation variants through speech planning effects: A case study on coronal stop realizations. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 11(1), 5. DOI:  http://doi.org/10.5334/labphon.168

Kilbourn-Ceron, O., & Sonderegger, M. (2018). Boundary phenomena and variability in Japanese high vowel devoicing. Natural Language & Linguistic Theory, 36(1), 175–217. DOI:  http://doi.org/10.1007/s11049-017-9368-x

Kilbourn-Ceron, O., Wagner, M., & Clayards, M. (2016). The effect of production planning locality on external sandhi: A study in /t/. In Proceedings of the 52nd Meeting of the Chicago Linguistic Society. Chicago, IL.

Kim, H., Maeda, S., Honda, K., & Crevier-Buchman, L. (2018). The Mechanism and Representation of Korean Three-Way Phonation Contrast: External Photoglottography, Intra-Oral Air Pressure, Airflow, and Acoustic Data. Phonetica, 75(1), 57–84. DOI:  http://doi.org/10.1159/000479589

Klatt, D. H., Stevens, K. N., & Mead, J. (1968). Studies of articulatory activity and airflow during speech. Annals of the New York Academy of Sciences, 155(1), 42–55. DOI:  http://doi.org/10.1111/j.1749-6632.1968.tb56748.x

Kohler, K. J. (1994). Glottal stops and glottalization in German. Data and theory of connected speech processes. Phonetica, 51, 38–51. DOI:  http://doi.org/10.1159/000261957

Kreiman, J. (1982). Perception of sentence and paragraph boundaries in natural conversation. Journal of Phonetics, 10, 163–175. DOI:  http://doi.org/10.1016/S0095-4470(19)30955-6

Kühnert, B., & Hoole, P. (2004). Speaker-specific kinematic properties of alveolar reductions in English and German. Clinical Linguistics & Phonetics, 18(6–8), 559–575. DOI:  http://doi.org/10.1080/02699200420002268853

Kul, M. (2015). Speech rate plays marginal role in processes of connected speech. In Proceedings of the 18th International Congress of Phonetic Sciences.

Lass, R. (1976). English phonology and phonological theory. Cambridge: Cambridge University Press.

Levon, E. (2006). Mosaic identity and style: Phonological variation among Reform American Jews. Journal of Sociolinguistics, 10(2), 181–204. DOI:  http://doi.org/10.1111/j.1360-6441.2006.00324.x

Löfqvist, A., & McGarr, N. (1987). Laryngeal dynamics in voiceless consonant production. In T. Baer, C. Sasaki & K. S. Harris (Eds.), Laryngeal Function in Phonation and Respiration. Boston, MA: Little, Brown, and Company Inc.

Löfqvist, A., & McGowan, R. (1992). Influence of consonantal environment on voice source aerodynamics. Journal of Phonetics, 20, 93–110. DOI:  http://doi.org/10.1016/S0095-4470(19)30256-6

Löfqvist, A., & Yoshioka, H. (1984). Intrasegmental timing: Laryngeal-oral coordination in voiceless consonant production. Speech Communication, 3(4), 279–289. DOI:  http://doi.org/10.1016/0167-6393(84)90024-4

Manuel, S. Y., & Vatikiotis-Bateson, E. (1988). Oral and glottal gestures and acoustics of underlying /t/ in English. The Journal of the Acoustical Society of America, 84(S1), S84–S84. DOI:  http://doi.org/10.1121/1.2026518

Mathisen, A. G. (1999). Sandwell, West Midlands: Ambiguous perspectives on gender patterns and models of change. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 210–240). New York, NY: Routledge.

Mees, I. M., & Collins, B. (1999). Cardiff: A real-time study of glottalisation. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 355–389). New York, NY: Routledge.

Michaud, A. (2004). Final Consonants and Glottalization: New Perspectives from Hanoi Vietnamese. Phonetica, 61(2–3), 119–146. DOI:  http://doi.org/10.1159/000082560

Milroy, J., Milroy, L., Hartley, S., & Walshaw, D. (1994). Glottal stops and Tyneside glottalization: Competing patterns of variation and change in British English. Language Variation and Change, 6(03), 327. DOI:  http://doi.org/10.1017/S095439450000171X

Moisik, S. R., Czaykowska-Higgins, E., & Esling, J. H. (2019). Phonological potentials and the lower vocal tract. Journal of the International Phonetic Association, 1–35. DOI:  http://doi.org/10.1017/S0025100318000403

Moll, K. L., & Daniloff, R. G. (1971). Investigation of the timing of velar movements during speech. The Journal of the Acoustical Society of America, 50(2B), 678–684. DOI:  http://doi.org/10.1121/1.1912683

Morris, J., & Hejná, M. (2019). Pre-aspiration in Bethesda Welsh: A sociophonetic analysis. Journal of the International Phonetic Association, 1–25. DOI:  http://doi.org/10.1017/S0025100318000221

Munhall, K., & Löfqvist, A. (1992). Gestural aggregation in speech: Laryngeal gestures. Journal of Phonetics, 20, 111–126. DOI:  http://doi.org/10.1016/S0095-4470(19)30242-6

Netsell, R. (1969). Subglottal and intraoral air pressures during the intervocalic contrast of /t/ and /d/. Phonetica, 20(2–4), 68–73. DOI:  http://doi.org/10.1159/000259275

Newbrook, M. (1999). West Wirral: Norms, self reports and usage. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 177–209). New York, NY: Routledge.

O’Brien, J. (2012). An experimental approach to debuccalization and supplementary gestures (Unpublished doctoral dissertation). Santa Cruz: University of California.

Parrell, B., & Narayanan, S. (2018). Explaining Coronal Reduction: Prosodic Structure and Articulatory Posture. Phonetica, 75(2), 151–181. DOI:  http://doi.org/10.1159/000481099

Penney, J., Cox, F., Miles, K., & Palethorpe, S. (2018). Glottalisation as a cue to coda consonant voicing in Australian English. Journal of Phonetics, 66, 161–184. DOI:  http://doi.org/10.1016/j.wocn.2017.10.001

Penney, J., Cox, F., & Szakay, A. (2019). Glottalisation of word-final stops in Australian English unstressed syllables. Journal of the International Phonetic Association, 1–32. DOI:  http://doi.org/10.1017/S0025100319000045

Penney, J., Cox, F., & Szakay, A. (2020). Effects of Glottalisation, Preceding Vowel Duration, and Coda Closure Duration on the Perception of Coda Stop Voicing. Phonetica, 1–29. DOI:  http://doi.org/10.1159/000508752

Pierrehumbert, J. (1994). Knowledge of variation. In Papers from the 30th Regional Meeting of the Chicago Linguistic Society. Chicago: University of Chicago.

Pierrehumbert, J. (1995). Prosodic effects on glottal allophones. In O. Fujimura (Ed.), Vocal Fold Physiology, 8, 39–60. San Diego: Singular Publishing Group.

Pierrehumbert, J., & Talkin, D. (1991). Lenition of /h/ and glottal stop. In Papers in Laboratory Phonology II (pp. 90–117). Cambridge, UK: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511519918.005

Pitt, M., Dilley, L., Johnson, K., Kiesling, S., Raymond, W., Hume, E., & Fosler-Lussier, E. (2007). Buckeye Corpus of Conversational Speech (2nd release) (Tech. Rep.). Columbus, OH.

R Core Team. (2018). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Ramisch, H. (2007). English in the Channel Islands. In D. Britain (Ed.), Language in the British Isles (pp. 176–182). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511620782.012

Raymond, W. D., & Brown, E. L. (2012). Are effects of word frequency effects of context of use? An analysis of initial fricative reduction in Spanish. Frequency effects in language, 2, 35–52. DOI:  http://doi.org/10.1515/9783110274059.35

Redi, L., & Shattuck-Hufnagel, S. (2001). Variation in the realization of glottalization in normal speakers. Journal of Phonetics, 29(4), 407–429. DOI:  http://doi.org/10.1006/jpho.2001.0145

Riebold, J. M. (2011). Time to pull out the stops: Spirantization in Pacific Northwestern English. The Journal of the Acoustical Society of America, 129(4), 2453. DOI:  http://doi.org/10.1121/1.3588057

Roach, P. J. (1973). Glottalization of English /p/, /t/, /k/ and /tʃ/ – a re-examination. Journal of the International Phonetic Association, 3(1), 10–21. DOI:  http://doi.org/10.1017/S0025100300000633

Roach, P. J. (1979). Laryngeal-oral coarticulation in glottalized English plosives. Journal of the International Phonetic Association, 9(01), 2–6. DOI:  http://doi.org/10.1017/S0025100300001857

Roberts, J. (2006). As old becomes new: Glottalization in Vermont. American Speech, 81(3), 227–249. DOI:  http://doi.org/10.1215/00031283-2006-016

Sanker, C. (2019). Influence of coda stop features on perceived vowel duration. Journal of Phonetics, 75, 43–56. DOI:  http://doi.org/10.1016/j.wocn.2019.04.003

Schertz, J. (2013). Exaggeration of featural contrasts in clarifications of misheard speech in English. Journal of Phonetics, 41(3–4), 249–263. DOI:  http://doi.org/10.1016/j.wocn.2013.03.007

Selkirk, E. (1972). The phrase phonology of English and French (Unpublished doctoral dissertation). Massachusetts Institute of Technology.

Seyfarth, S., Buz, E., & Jaeger, T. F. (2016). Dynamic hyperarticulation of coda voicing contrasts. Journal of the Acoustical Society of America, 139(2), EL31–EL37. DOI:  http://doi.org/10.1121/1.4942544

Seyfarth, S., & Garellek, M. (2015). Coda glottalization in American English. In Proceedings of the 18th International Congress of Phonetic Sciences.

Seyfarth, S., & Garellek, M. (2018). Plosive voicing acoustics and voice quality in Yerevan Armenian. Journal of Phonetics, 71, 425–450. DOI:  http://doi.org/10.1016/j.wocn.2018.09.001

Slifka, J. (2006). Some Physiological Correlates to Regular and Irregular Phonation at the End of an Utterance. Journal of Voice, 20(2), 171–186. DOI:  http://doi.org/10.1016/j.jvoice.2005.04.002

Slifka, J. (2007). Irregular phonation and its preferred role as cue to silence in phonological systems. In Proceedings of the 16th International Congress of Phonetic Sciences (pp. 229–232).

Smith, J., & Holmes-Elliott, S. (2018). The unstoppable glottal: Tracking rapid change in an iconic British variable. English Language and Linguistics, 22(03), 323–355. DOI:  http://doi.org/10.1017/S1360674316000459

Sóskuthy, M., & Hay, J. (2017). Changing word usage predicts changing word durations in New Zealand English. Cognition, 166, 298–313. DOI:  http://doi.org/10.1016/j.cognition.2017.05.032

Stan Development Team. (2018). RStan: The R interface to Stan.

Stevens, K. N., & Keyser, S. J. (1989). Primary features and their enhancement in consonants. Language, 65(1), 86–106. DOI:  http://doi.org/10.2307/414843

Stevens, K. N., & Keyser, S. J. (2010). Quantal theory, enhancement and overlap. Journal of Phonetics, 38(1), 10–19. DOI:  http://doi.org/10.1016/j.wocn.2008.10.004

Stoddart, J., Upton, C., & Widdowson, J. (1999). Sheffield dialect in the 1990s: Revisiting the concept of NORMs. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 143–176). New York, NY: Routledge.

Stuart-Smith, J. (1999). Glasgow: Accent and voice quality. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 390–427). New York, NY: Routledge.

Sung, K., & Kochetov, A. (2018). Allophonic variation in English coronal stops: An EPG corpus study. Toronto Working Papers in Linguistics.

Suthau, E., Birkholz, P., Mainka, A., & Simpson, A. P. (2016). Non-invasive photoglottography for use in the lab and the field. Speech Communication, 5.

Tanner, J., Sonderegger, M., & Wagner, M. (2017). Production planning and coronal stop deletion in spontaneous speech. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 8(1). DOI:  http://doi.org/10.5334/labphon.96

Tollfree, L. (1999). South East London English: Discrete versus continuous modelling of consonantal reduction. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 314–354). New York, NY: Routledge.

Tollfree, L. (2001). Variation and change in Australian English consonants: Reduction of /t/. In D. Blair & P. Collins (Eds.), English in Australia. Amsterdam; Philadelphia: John Benjamins Pub. Co. DOI:  http://doi.org/10.1075/veaw.g26.06tol

Trudgill, P. (1999). Norwich: Endogenous and exogenous linguistic change. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 241–273). New York, NY: Routledge.

Turk, A. (2010). Does prosodic constituency signal relative predictability? A Smooth Signal Redundancy hypothesis. Laboratory Phonology, 1(2). DOI:  http://doi.org/10.1515/labphon.2010.012

Turnbull, R., Seyfarth, S., Hume, E., & Jaeger, T. F. (2018). Nasal place assimilation trades off inferrability of both target and trigger words. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 9(1). DOI:  http://doi.org/10.5334/labphon.119

Umeda, N. (1978). Occurrence of glottal stops in fluent speech. The Journal of the Acoustical Society of America, 64(1), 88–94. DOI:  http://doi.org/10.1121/1.381959

Wagner, M. (2012). Locality in Phonology and Production Planning. McGill Working Papers in Linguistics, 22(1).

Watson, K. (2002). The realization of final /t/ in Liverpool English. Durham Working Papers in Linguistics, 8, 195–205.

Watt, D., & Milroy, L. (1999). Patterns of variation and change in three Newcastle vowels: Is this dialect levelling? In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 67–104). New York, NY: Routledge.

Westbury, J. R. (1983). Enlargement of the supraglottal cavity and its relation to stop consonant voicing. Journal of the Acoustical Society of America, 73(4), 1322–1336. DOI:  http://doi.org/10.1121/1.389236

Westbury, J. R., & Niimi, S. (1979). An effect of phonetic environment on voicing control mechanisms during stop consonants. Journal of the Acoustical Society of America, 65, S23. DOI:  http://doi.org/10.1121/1.2017165

Wightman, C. W., Shattuck-Hufnagel, S., Ostendorf, M., & Price, P. J. (1992). Segmental durations in the vicinity of prosodic phrase boundaries. Journal of the Acoustical Society of America, 91(3), 1707–1717. DOI:  http://doi.org/10.1121/1.402450

Williams, A., & Kerswill, P. (1999). Dialect levelling: Change and continuity in Milton Keynes, Reading and Hull. In P. Foulkes & G. Docherty (Eds.), Urban Voices: Accent studies in the British Isles (pp. 274–313). New York, NY: Routledge.

Zeileis, A., Hothorn, T., & Hornik, K. (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514. DOI:  http://doi.org/10.1198/106186008X319331