Normalize vowel formant measurements with ∆F (see Johnson 2020). This function is intended to be used within a tidyverse pipeline.
norm_deltaF(df, .F1, .F2, .F3, .F4, suffix = "_deltaF", return = "formants")
df: The data frame containing the formant measurements you want to normalize.
.F1, .F2, .F3, .F4: The (unquoted) names of the columns containing the F1, F2, F3, and F4 measurements. The first three are required, but you may leave off F4. Including F4 is recommended if the data is available and reliable, since it produces more accurate results.
suffix: A string. The suffix to append to the names of the new normalized columns. By default, it's "_deltaF", so if your original F1 column was called F1, the normalized one will be F1_deltaF.
return: A string. By default, it's "formants", so the function returns the normalized values. If you'd like to see the actual ΔF values instead, use "deltaF".
The original data frame with new columns containing the normalized measurements or, if return = "deltaF", a data frame with one ∆F value per group.
The ∆F method is a normalization technique based on a single, interpretable parameter for each speaker. The parameter is called ∆F and is "an estimate of formant spacing in a vocal tract with no constrictions" (Johnson 2020:10).
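As a rough illustration of the arithmetic, and not necessarily how norm_deltaF() computes it internally: Johnson (2020) estimates a speaker's ∆F as the mean of F_i / (i − 0.5) over all formants i and all vowel tokens, and each normalized value is the raw formant divided by that ∆F. The helper name delta_f_sketch and the toy measurements below are hypothetical and use only base R.

```r
# Hypothetical helper, not part of the package: estimate ∆F for one speaker
# from a numeric matrix with one row per token and one column per formant
# (column 1 = F1, column 2 = F2, and so on).
delta_f_sketch <- function(formants) {
  formants <- as.matrix(formants)
  # Each formant i is divided by (i - 0.5)
  divisors <- seq_len(ncol(formants)) - 0.5
  scaled <- sweep(formants, 2, divisors, "/")
  # Average over all tokens and all formants
  mean(scaled)
}

# Toy data (made up): two tokens from a single speaker, columns F1 to F4
toy <- rbind(
  c(500, 1500, 2500, 3500),
  c(600, 1800, 3000, 4200)
)

dF <- delta_f_sketch(toy)
dF
#> [1] 1100

# Normalized formants are the raw values divided by the speaker's ∆F
normalized <- toy / dF
```

Because ∆F is a per-speaker quantity, a calculation like this would be run once per speaker, which is why the data is grouped before calling norm_deltaF().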
You will need to group the data by speaker with group_by() before applying this function if you want to normalize the data by speaker.
The data must be numeric, and there cannot be any NAs. So, if you're using data extracted from Praat, you may have to filter out bad F3 and F4 data and then convert the column to numeric.
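One way that cleanup might look, as a minimal sketch in base R: it assumes a Praat export where missing measurements were written as "--undefined--", which causes the affected column to be read in as character. The data frame praat and its values are made up for illustration.

```r
# Hypothetical Praat export: F4 is character because one token
# has "--undefined--" instead of a number
praat <- data.frame(
  speaker = c("s1", "s1", "s1"),
  F1 = c(500, 600, 550),
  F2 = c(1500, 1800, 1650),
  F3 = c(2500, 3000, 2750),
  F4 = c("3500", "--undefined--", "3850"),
  stringsAsFactors = FALSE
)

# Drop the tokens without a usable F4 measurement...
clean <- praat[praat$F4 != "--undefined--", ]

# ...and then convert the column from character to numeric
clean$F4 <- as.numeric(clean$F4)

# Check that no NAs remain in the formant columns
anyNA(clean[, c("F1", "F2", "F3", "F4")])
#> [1] FALSE
```

Filtering before converting avoids the coercion warning that as.numeric() would otherwise raise for the non-numeric entries.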
Note that this is a new function and has not been tested very robustly yet.
Johnson, Keith. 2020. The ΔF Method of Vocal Tract Length Normalization for Vowels. Laboratory Phonology: Journal of the Association for Laboratory Phonology 11(1). https://doi.org/10.5334/labphon.196.
library(tidyverse)
df <- joeysvowels::idahoans
# Basic usage
df %>%
  group_by(speaker) %>%
  norm_deltaF(F1, F2, F3, F4)
#> # A tibble: 1,100 × 11
#> # Groups: speaker [10]
#> speaker sex vowel F1 F2 F3 F4 F1_del…¹ F2_de…² F3_de…³ F4_de…⁴
#> <fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 01 female AA 699. 1655. 2019. 3801. 0.637 1.51 1.84 3.46
#> 2 01 female AA 685. 1360. 1914. 4257. 0.624 1.24 1.74 3.88
#> 3 01 female AA 713. 1507. 2460. 3617. 0.650 1.37 2.24 3.29
#> 4 01 female AA 801. 1143. 1868. 2908. 0.730 1.04 1.70 2.65
#> 5 01 female AA 757. 1258. 1772. 2778. 0.689 1.15 1.61 2.53
#> 6 01 female AA 804. 1403. 2339. 4299. 0.733 1.28 2.13 3.92
#> 7 01 female AA 664. 1279. 1714. 2103. 0.605 1.17 1.56 1.92
#> 8 01 female AA 757. 1325. 1929. 2660. 0.690 1.21 1.76 2.42
#> 9 01 female AA 730. 1578. 2297. 2963. 0.665 1.44 2.09 2.70
#> 10 01 female AA 700. 1546. 2109. 3432. 0.638 1.41 1.92 3.13
#> # … with 1,090 more rows, and abbreviated variable names ¹F1_deltaF,
#> # ²F2_deltaF, ³F3_deltaF, ⁴F4_deltaF
# F4 is not required
df %>%
  group_by(speaker) %>%
  norm_deltaF(F1, F2, F3)
#> # A tibble: 1,100 × 10
#> # Groups: speaker [10]
#> speaker sex vowel F1 F2 F3 F1_deltaF F2_deltaF F3_deltaF F4
#> <fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 01 female AA 699. 1655. 2019. 0.626 1.48 1.81 3801.
#> 2 01 female AA 685. 1360. 1914. 0.614 1.22 1.71 4257.
#> 3 01 female AA 713. 1507. 2460. 0.639 1.35 2.20 3617.
#> 4 01 female AA 801. 1143. 1868. 0.718 1.02 1.67 2908.
#> 5 01 female AA 757. 1258. 1772. 0.677 1.13 1.59 2778.
#> 6 01 female AA 804. 1403. 2339. 0.720 1.26 2.09 4299.
#> 7 01 female AA 664. 1279. 1714. 0.595 1.15 1.53 2103.
#> 8 01 female AA 757. 1325. 1929. 0.678 1.19 1.73 2660.
#> 9 01 female AA 730. 1578. 2297. 0.654 1.41 2.06 2963.
#> 10 01 female AA 700. 1546. 2109. 0.627 1.38 1.89 3432.
#> # … with 1,090 more rows
# Change the new columns' suffix
df %>%
  group_by(speaker) %>%
  norm_deltaF(F1, F2, F3, suffix = "_norm")
#> # A tibble: 1,100 × 10
#> # Groups: speaker [10]
#> speaker sex vowel F1 F2 F3 F1_norm F2_norm F3_norm F4
#> <fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 01 female AA 699. 1655. 2019. 0.626 1.48 1.81 3801.
#> 2 01 female AA 685. 1360. 1914. 0.614 1.22 1.71 4257.
#> 3 01 female AA 713. 1507. 2460. 0.639 1.35 2.20 3617.
#> 4 01 female AA 801. 1143. 1868. 0.718 1.02 1.67 2908.
#> 5 01 female AA 757. 1258. 1772. 0.677 1.13 1.59 2778.
#> 6 01 female AA 804. 1403. 2339. 0.720 1.26 2.09 4299.
#> 7 01 female AA 664. 1279. 1714. 0.595 1.15 1.53 2103.
#> 8 01 female AA 757. 1325. 1929. 0.678 1.19 1.73 2660.
#> 9 01 female AA 730. 1578. 2297. 0.654 1.41 2.06 2963.
#> 10 01 female AA 700. 1546. 2109. 0.627 1.38 1.89 3432.
#> # … with 1,090 more rows
# Return ∆F instead
df %>%
  group_by(speaker) %>%
  norm_deltaF(F1, F2, F3, F4, return = "deltaF")
#> # A tibble: 10 × 2
#> # Groups: speaker [10]
#> speaker deltaF
#> <fct> <dbl>
#> 1 01 1098.
#> 2 02 940.
#> 3 03 1156.
#> 4 04 1115.
#> 5 05 1067.
#> 6 06 935.
#> 7 07 1058.
#> 8 08 1151.
#> 9 09 1135.
#> 10 10 1045.