This is a tidyverse-compatible function that normalizes formant data using the method described in the Atlas of North American English (Labov, Ash, & Boberg 2006).

norm_anae(df, hz_cols, token_id, speaker_id, g = "telsur")

Arguments

df

The dataframe containing the formant measurements you want to normalize.

hz_cols

A list of columns (unquoted) containing the formant measurements themselves, e.g. c(F1, F2).

token_id

The name of the column containing unique identifiers per vowel token. If your data is set up so that there is one row per token, you can put row.names(.) here instead.

speaker_id

The name of the column containing unique identifiers per speaker (usually the column containing the speaker name).

g

By default, "telsur", which will use the Telsur G value (6.896874) listed in the ANAE. If set to "calculate", it will calculate the G value based on the dataset. It can also be set to any arbitrary number, such as 0.

Value

The same dataframe, but with new column(s), suffixed with "_anae", that contain the normalized data.

Details

The data must be grouped by speaker prior to running the function.

The function works best when only F1 and F2 data are included. F3 can be included but the results may not be comparable with other studies.

By default, the function uses the Telsur G value listed in the ANAE (6.896874), which makes the results most compatible with the ANAE and other studies that use the same normalization procedure. The function can instead calculate a G value from the dataset provided when g is set to "calculate". Alternatively, g can be set to an arbitrary number, such as zero.
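To make the role of G more concrete, here is a minimal base-R sketch of a log-mean scaling of the kind the ANAE describes. It is not the source of norm_anae; it assumes that each speaker gets a value S (the mean of the natural-log F1 and F2 values pooled together) and that every measurement is multiplied by exp(G - S). The toy data and the names toy, log_mean, and the "_sketch" columns are hypothetical.

```r
# Hypothetical toy data: two speakers, three tokens each (values invented).
toy <- data.frame(
  speaker = rep(c("A", "B"), each = 3),
  F1 = c(700, 500, 300, 800, 600, 350),
  F2 = c(1200, 1800, 2300, 1400, 2000, 2600),
  stringsAsFactors = FALSE
)

telsur_g <- 6.896874  # Telsur G value quoted in the documentation above

# Assumed definition of S: per-speaker mean of log formants, F1 and F2 pooled.
log_mean <- function(d) mean(log(c(d$F1, d$F2)))
s_by_speaker <- sapply(split(toy, toy$speaker), log_mean)

# Assumed scaling factor per speaker: exp(G - S).
scale_by_speaker <- exp(telsur_g - s_by_speaker)

# Normalized value = Hz value * that speaker's scaling factor.
toy$F1_sketch <- toy$F1 * scale_by_speaker[toy$speaker]
toy$F2_sketch <- toy$F2 * scale_by_speaker[toy$speaker]
```

Under these assumptions, every speaker's log-mean equals G after scaling. Setting G to 0 would divide each measurement by exp(S), which would explain why the g = 0 example in the Examples section returns values near 1 rather than values in Hz.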

It is unclear how the ANAE method should work with trajectory data. This function pools all data and normalizes it together, so one small modification was required to calculate the G value when the Telsur G is not used: the average number of time points per vowel token is added to the denominator. I'm not sure that's how it should be done, but it makes sense to me and returns sensible results.
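For the simple case of one measurement per token, a dataset-based G might be computed as below. This is a sketch under an assumption, not the function's actual code: it takes G to be the sum of all log formant values divided by (number of formants × number of tokens), pooled across speakers. The toy data and the names m, n, and g_calc are hypothetical.

```r
# Hypothetical toy data, one row per token (values invented for illustration).
toy <- data.frame(
  speaker = rep(c("A", "B"), each = 3),
  F1 = c(700, 500, 300, 800, 600, 350),
  F2 = c(1200, 1800, 2300, 1400, 2000, 2600),
  stringsAsFactors = FALSE
)

m <- 2          # number of formants pooled (F1 and F2)
n <- nrow(toy)  # total number of tokens across all speakers

# Assumed definition: G = sum of all log formant values / (m * n).
g_calc <- sum(log(toy$F1), log(toy$F2)) / (m * n)
```

With one row per token, this is the same as the plain mean of all log formant values. The trajectory adjustment described above, which puts the average number of time points per token in the denominator as well, is not shown here.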

References

Labov, William, Sharon Ash, and Charles Boberg. The Atlas of North American English: Phonetics, Phonology and Sound Change. Berlin: Walter de Gruyter, 2006.

Examples

library(tidyverse)
df <- joeysvowels::idahoans

df %>%
   group_by(speaker) %>%
   norm_anae(hz_cols = c(F1, F2), speaker_id = speaker) %>%
   ungroup() %>%
   select(F1, F2, F1_anae, F2_anae) # <- just the relevant columns
#> # A tibble: 1,100 × 4
#>       F1    F2 F1_anae F2_anae
#>    <dbl> <dbl>   <dbl>   <dbl>
#>  1  699. 1655.    714.   1690.
#>  2  685. 1360.    700.   1388.
#>  3  713. 1507.    728.   1539.
#>  4  801. 1143.    818.   1167.
#>  5  757. 1258.    772.   1284.
#>  6  804. 1403.    821.   1432.
#>  7  664. 1279.    678.   1306.
#>  8  757. 1325.    773.   1353.
#>  9  730. 1578.    746.   1611.
#> 10  700. 1546.    715.   1578.
#> # … with 1,090 more rows

# Slightly different if G is calculated internally.
df %>%
   group_by(speaker) %>%
   norm_anae(hz_cols = c(F1, F2), speaker_id = speaker, g = "calculate") %>%
   ungroup() %>%
   select(F1, F2, F1_anae, F2_anae) # <- just the relevant columns
#> # A tibble: 1,100 × 4
#>       F1    F2 F1_anae F2_anae
#>    <dbl> <dbl>   <dbl>   <dbl>
#>  1  699. 1655.    660.   1562.
#>  2  685. 1360.    646.   1283.
#>  3  713. 1507.    673.   1422.
#>  4  801. 1143.    756.   1078.
#>  5  757. 1258.    714.   1187.
#>  6  804. 1403.    759.   1323.
#>  7  664. 1279.    626.   1207.
#>  8  757. 1325.    715.   1250.
#>  9  730. 1578.    689.   1489.
#> 10  700. 1546.    660.   1459.
#> # … with 1,090 more rows

# G can be set to an arbitrary value.
df %>%
   group_by(speaker) %>%
   norm_anae(hz_cols = c(F1, F2), speaker_id = speaker, g = 0) %>%
   ungroup() %>%
   select(F1, F2, F1_anae, F2_anae) # <- just the relevant columns
#> # A tibble: 1,100 × 4
#>       F1    F2 F1_anae F2_anae
#>    <dbl> <dbl>   <dbl>   <dbl>
#>  1  699. 1655.   0.722    1.71
#>  2  685. 1360.   0.707    1.40
#>  3  713. 1507.   0.736    1.56
#>  4  801. 1143.   0.827    1.18
#>  5  757. 1258.   0.781    1.30
#>  6  804. 1403.   0.830    1.45
#>  7  664. 1279.   0.685    1.32
#>  8  757. 1325.   0.782    1.37
#>  9  730. 1578.   0.754    1.63
#> 10  700. 1546.   0.722    1.60
#> # … with 1,090 more rows