Now /wɛr/ Were We…


I’m happy to have finally gotten a semi-working web interface for the generated minimal pairs. The data behind it is still not great (only 360 videos’ worth of audio), but I’m hoping to feed it a bunch more data (approx. 4x more) within the next month or so.

Since last time, I updated it to look only for vowel minimal pairs, since this seems like the most useful thing to do. This has decreased the number of buttons on the screen from around 800 to only 60 or so. I’ve also added example word pairs to each button, so the minimal pair is illustrated at a glance.

I’ve also added some visual feedback while the audio clips play: the word currently being spoken grows during playback, so it’s more obvious which word is being said. As mentioned, the data is still not great, so some of the words would be a bit difficult to make out without the visual scaffolding.
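The highlight logic can be sketched roughly like this (the function and data names here are hypothetical, not the actual implementation): each clip records where each of the two words starts and ends, and during playback whichever word contains the current timestamp gets scaled up.

```javascript
// Return the index of the word being spoken at `time`, or -1 if neither.
// wordSpans: [{ start, end }, { start, end }] in seconds.
function activeWordIndex(time, wordSpans) {
  return wordSpans.findIndex(({ start, end }) => time >= start && time < end);
}

// In the browser this would drive a CSS transform on each word's label, e.g.:
// audio.addEventListener("timeupdate", () => {
//   const i = activeWordIndex(audio.currentTime, spans);
//   labels.forEach((el, j) => {
//     el.style.transform = j === i ? "scale(1.3)" : "scale(1)";
//   });
// });
```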


I still need to do a bit of tweaking to the data collection in the automated generator: it needs to check that both parts of a minimal pair meet a minimum length (currently set to 0.5 seconds) instead of checking only one. Though with enough data, maybe this wouldn’t matter as much.
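The fix amounts to a symmetric filter, sketched here with assumed data shapes (the generator’s actual structures may differ): a recording pair is kept only if both clips clear the minimum duration.

```javascript
const MIN_DURATION = 0.5; // seconds, matching the current threshold

// pair: { a: { word, duration }, b: { word, duration } }
// Keep the pair only if BOTH clips are long enough, not just one.
function bothLongEnough(pair, minDuration = MIN_DURATION) {
  return pair.a.duration >= minDuration && pair.b.duration >= minDuration;
}
```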

This lack of a double check is why, for example, a pair like Cool/Call might have 16 of the voices saying Cool but only 2 saying Call. Ideally, each word in a vowel pair would have an equal number of repetitions.
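One simple way to even things out, if the length check alone doesn’t fix the imbalance, would be to cap both sides at the smaller side’s recording count (a sketch with hypothetical names, not the generator’s actual code):

```javascript
// Truncate both clip lists to the length of the shorter one, so each
// word in the pair is heard an equal number of times.
function balancePair(clipsA, clipsB) {
  const n = Math.min(clipsA.length, clipsB.length);
  return { a: clipsA.slice(0, n), b: clipsB.slice(0, n) };
}
```

The downside is that it throws recordings away, so it only makes sense once there is enough data that dropping the surplus doesn’t hurt.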

Lastly, the current implementation is in very approximate jQuery, so it could use a refactor, but it’s currently more or less fast enough, so I probably won’t bother.

Future work

To be actually useful, this needs a search function that jumps straight to a given minimal pair instead of making the user scroll all the way down. Learners will presumably want to focus on particular minimal pairs according to their needs, so this is a must-have for a working web app.
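The search itself could be a simple client-side filter over the pair buttons. A minimal sketch, assuming each pair carries a vowel-pair label and its example words (these data shapes are assumptions, not the app’s actual structures):

```javascript
// Return the pairs matching the query: a pair matches if its vowel-pair
// label or either example word contains the query (case-insensitive).
function searchPairs(pairs, query) {
  const q = query.trim().toLowerCase();
  if (!q) return pairs; // empty query shows everything
  return pairs.filter(
    (p) =>
      p.label.toLowerCase().includes(q) ||
      p.examples.some((w) => w.toLowerCase().includes(q))
  );
}
```

With only ~60 buttons, filtering on every keystroke should be plenty fast without any indexing.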

In terms of automated assessment (at least in a very basic form), it also needs quiz functionality to give learners immediate feedback on their auditory discrimination abilities with authentic data.

Most likely this will involve cutting out the entire sentence that contains the minimal-pair word and asking the learner to select which word was spoken in the context of the sentence. This shouldn’t be too hard (the very easy way would be to just grab a roughly 5-second window around the word), though I’m not sure how much the context would affect performance on this task.
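The “easy way” is just interval arithmetic: centre a 5-second window on the target word and clamp it to the clip boundaries. A sketch under those assumptions (names hypothetical):

```javascript
// Compute the [start, end] of a ~windowSize-second excerpt centred on the
// word at [wordStart, wordEnd], shifted inward if it would run past either
// edge of the clip. All times are in seconds.
function contextWindow(wordStart, wordEnd, clipDuration, windowSize = 5) {
  const wordMid = (wordStart + wordEnd) / 2;
  let start = wordMid - windowSize / 2;
  let end = wordMid + windowSize / 2;
  if (start < 0) {
    end -= start; // shift the window right by the overshoot
    start = 0;
  }
  if (end > clipDuration) {
    start -= end - clipDuration; // shift the window left by the overshoot
    end = clipDuration;
  }
  return { start: Math.max(0, start), end: Math.min(clipDuration, end) };
}
```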

…and that means I need to rewrite it in React, which I’m now halfway proficient in, so hopefully I can get to that before too long.

For anyone interested, the source code is here.

