Normalize vowel formant measurements with ∆F (see Johnson 2020). This function is intended to be used within a tidyverse pipeline.

norm_deltaF(df, .F1, .F2, .F3, .F4, suffix = "_deltaF", return = "formants")

Arguments

df

The data frame containing the formant measurements you want to normalize.

.F1, .F2, .F3, .F4

The (unquoted) names of the columns containing the F1, F2, F3, and F4 measurements. The first three are required; F4 is optional. Including F4 is recommended when it is available and reliable, since it produces more accurate results.

suffix

A string. The suffix appended to the original column names to name the new normalized columns. The default is "_deltaF", so if your original F1 column is called F1, the normalized one will be F1_deltaF.

return

A string. The default, "formants", returns the normalized formant values. To see the ∆F values themselves, use "deltaF" instead.

Value

By default, the original data frame with new columns containing the normalized measurements. If return = "deltaF", a data frame with one ∆F value per group instead.

Details

∆F normalization is a technique based on a single, interpretable parameter for each speaker. The parameter is called ∆F and is "an estimate of formant spacing in a vocal tract with no constrictions" (Johnson 2020:10).
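To make the parameter concrete, here is a minimal base R sketch of the calculation as described in Johnson (2020): each formant F_i is divided by (i - 0.5), ∆F is the mean of those scaled values over all of a speaker's tokens, and the normalized formants are the raw values divided by ∆F. The toy data and the object names (tokens, divisors, delta_f, normalized) are made up for illustration, and this is not necessarily how norm_deltaF() is implemented internally.

```r
# Hypothetical toy data: two tokens from a single speaker, formants in Hz.
tokens <- data.frame(
  F1 = c(500, 700),
  F2 = c(1500, 1200),
  F3 = c(2500, 2600),
  F4 = c(3500, 3700)
)

# Scale each formant by its position in an unconstricted vocal tract:
# F1 / 0.5, F2 / 1.5, F3 / 2.5, F4 / 3.5.
divisors <- seq_len(ncol(tokens)) - 0.5
scaled <- sweep(as.matrix(tokens), 2, divisors, "/")

# deltaF is the mean of all scaled values for the speaker.
delta_f <- mean(scaled)

# Normalized formants are the raw values divided by deltaF.
normalized <- tokens / delta_f
```

Because the normalized values are ratios to ∆F, they are unitless, which matches the scale of the normalized columns in the example output on this page.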

To normalize by speaker, group the data by speaker with group_by() before applying this function.
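The effect of grouping can be illustrated with a small dplyr sketch. It assumes that ∆F is estimated per group in the same way summarise() works on grouped data. The toy data and the helper delta_f() are hypothetical and only mimic the calculation from Johnson (2020) with three formants.

```r
library(dplyr)

# Hypothetical toy data: two speakers, two tokens each, formants in Hz.
toy <- data.frame(
  speaker = c("A", "A", "B", "B"),
  F1 = c(500, 700, 400, 600),
  F2 = c(1500, 1200, 1300, 1000),
  F3 = c(2500, 2600, 2200, 2300)
)

# Hypothetical helper, not part of any package:
# the mean of F_i / (i - 0.5) over all tokens and formants supplied.
delta_f <- function(f1, f2, f3) mean(c(f1 / 0.5, f2 / 1.5, f3 / 2.5))

# Grouped: one deltaF estimate per speaker.
by_speaker <- toy %>%
  group_by(speaker) %>%
  summarise(deltaF = delta_f(F1, F2, F3))

# Ungrouped: a single deltaF pooled across both speakers.
pooled <- toy %>%
  summarise(deltaF = delta_f(F1, F2, F3))
```

Without grouping, every token would be scaled by the same pooled value, so differences in vocal tract length between speakers would remain in the data.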

The formant columns must be numeric and cannot contain any NAs. So, if you're using data extracted from Praat, you may have to filter out bad F3 and F4 measurements and then convert the columns to numeric.
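One way to prepare such data is sketched below with dplyr. It assumes the formants were read in as character columns because failed measurements were written as the string "--undefined--"; check your own file, since the placeholder depends on how the data was exported. The data frame raw and its values are made up for illustration.

```r
library(dplyr)

# Hypothetical raw export: formant columns are character because some
# measurements are recorded as "--undefined--" (an assumption here).
raw <- data.frame(
  speaker = c("01", "01", "01"),
  F1 = c("699", "685", "713"),
  F2 = c("1655", "1360", "1507"),
  F3 = c("2019", "--undefined--", "2460"),
  F4 = c("3801", "4257", "--undefined--")
)

clean <- raw %>%
  # Turn the placeholder into NA, then convert each column to numeric.
  mutate(across(F1:F4, ~ as.numeric(na_if(.x, "--undefined--")))) %>%
  # Keep only tokens where all four formants were measured.
  filter(if_all(F1:F4, ~ !is.na(.x)))
```

The resulting clean data frame has numeric formant columns with no NAs, so it can be grouped by speaker and passed on to norm_deltaF().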

Note that this is a new function and has not yet been tested extensively.

References

Johnson, Keith. 2020. The ΔF Method of Vocal Tract Length Normalization for Vowels. Laboratory Phonology: Journal of the Association for Laboratory Phonology 11(1). https://doi.org/10.5334/labphon.196.

Examples

library(tidyverse)
df <- joeysvowels::idahoans

# Basic usage
df %>%
   group_by(speaker) %>%
   norm_deltaF(F1, F2, F3, F4)
#> # A tibble: 1,100 × 11
#> # Groups:   speaker [10]
#>    speaker sex    vowel    F1    F2    F3    F4 F1_del…¹ F2_de…² F3_de…³ F4_de…⁴
#>    <fct>   <chr>  <chr> <dbl> <dbl> <dbl> <dbl>    <dbl>   <dbl>   <dbl>   <dbl>
#>  1 01      female AA     699. 1655. 2019. 3801.    0.637    1.51    1.84    3.46
#>  2 01      female AA     685. 1360. 1914. 4257.    0.624    1.24    1.74    3.88
#>  3 01      female AA     713. 1507. 2460. 3617.    0.650    1.37    2.24    3.29
#>  4 01      female AA     801. 1143. 1868. 2908.    0.730    1.04    1.70    2.65
#>  5 01      female AA     757. 1258. 1772. 2778.    0.689    1.15    1.61    2.53
#>  6 01      female AA     804. 1403. 2339. 4299.    0.733    1.28    2.13    3.92
#>  7 01      female AA     664. 1279. 1714. 2103.    0.605    1.17    1.56    1.92
#>  8 01      female AA     757. 1325. 1929. 2660.    0.690    1.21    1.76    2.42
#>  9 01      female AA     730. 1578. 2297. 2963.    0.665    1.44    2.09    2.70
#> 10 01      female AA     700. 1546. 2109. 3432.    0.638    1.41    1.92    3.13
#> # … with 1,090 more rows, and abbreviated variable names ¹​F1_deltaF,
#> #   ²​F2_deltaF, ³​F3_deltaF, ⁴​F4_deltaF

# F4 is not required
df %>%
   group_by(speaker) %>%
   norm_deltaF(F1, F2, F3)
#> # A tibble: 1,100 × 10
#> # Groups:   speaker [10]
#>    speaker sex    vowel    F1    F2    F3 F1_deltaF F2_deltaF F3_deltaF    F4
#>    <fct>   <chr>  <chr> <dbl> <dbl> <dbl>     <dbl>     <dbl>     <dbl> <dbl>
#>  1 01      female AA     699. 1655. 2019.     0.626      1.48      1.81 3801.
#>  2 01      female AA     685. 1360. 1914.     0.614      1.22      1.71 4257.
#>  3 01      female AA     713. 1507. 2460.     0.639      1.35      2.20 3617.
#>  4 01      female AA     801. 1143. 1868.     0.718      1.02      1.67 2908.
#>  5 01      female AA     757. 1258. 1772.     0.677      1.13      1.59 2778.
#>  6 01      female AA     804. 1403. 2339.     0.720      1.26      2.09 4299.
#>  7 01      female AA     664. 1279. 1714.     0.595      1.15      1.53 2103.
#>  8 01      female AA     757. 1325. 1929.     0.678      1.19      1.73 2660.
#>  9 01      female AA     730. 1578. 2297.     0.654      1.41      2.06 2963.
#> 10 01      female AA     700. 1546. 2109.     0.627      1.38      1.89 3432.
#> # … with 1,090 more rows

# Change the new columns' suffix
df %>%
   group_by(speaker) %>%
   norm_deltaF(F1, F2, F3, suffix = "_norm")
#> # A tibble: 1,100 × 10
#> # Groups:   speaker [10]
#>    speaker sex    vowel    F1    F2    F3 F1_norm F2_norm F3_norm    F4
#>    <fct>   <chr>  <chr> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl> <dbl>
#>  1 01      female AA     699. 1655. 2019.   0.626    1.48    1.81 3801.
#>  2 01      female AA     685. 1360. 1914.   0.614    1.22    1.71 4257.
#>  3 01      female AA     713. 1507. 2460.   0.639    1.35    2.20 3617.
#>  4 01      female AA     801. 1143. 1868.   0.718    1.02    1.67 2908.
#>  5 01      female AA     757. 1258. 1772.   0.677    1.13    1.59 2778.
#>  6 01      female AA     804. 1403. 2339.   0.720    1.26    2.09 4299.
#>  7 01      female AA     664. 1279. 1714.   0.595    1.15    1.53 2103.
#>  8 01      female AA     757. 1325. 1929.   0.678    1.19    1.73 2660.
#>  9 01      female AA     730. 1578. 2297.   0.654    1.41    2.06 2963.
#> 10 01      female AA     700. 1546. 2109.   0.627    1.38    1.89 3432.
#> # … with 1,090 more rows

# Return ∆F instead
df %>%
   group_by(speaker) %>%
   norm_deltaF(F1, F2, F3, F4, return = "deltaF")
#> # A tibble: 10 × 2
#> # Groups:   speaker [10]
#>    speaker deltaF
#>    <fct>    <dbl>
#>  1 01       1098.
#>  2 02        940.
#>  3 03       1156.
#>  4 04       1115.
#>  5 05       1067.
#>  6 06        935.
#>  7 07       1058.
#>  8 08       1151.
#>  9 09       1135.
#> 10 10       1045.