A man who can talk backwards (2017)

Just a few days ago I saw an amazing video about Kurt Quinn, a man who can talk backwards. When I say backwards, I don’t mean that he reverses the order of the words in a sentence, but he actually reverses the order of the sounds. He’s got a YouTube channel where you can see this in action, and I would recommend this video by Smarter Every Day which tests the limits of Quinn’s skill.

The Smarter Every Day (SED) video not only demonstrates Quinn’s skill, but also offers up a little bit of phonetic science to explain how it works. While I deeply appreciate SED’s attempt to describe phonetics, there are a few things missing from the explanation that I want to go over here. This should not be taken as a criticism of SED at all. What Quinn can do is highly unusual, and highly interesting, and I want to explore it further.

At one point in the video, SED mentions that it’s important to realize Quinn is not reading the letters of a word backwards. It’s common for people to confuse the notions of “letters” and “sounds”, so I’m glad that SED brought this up.

Instead, SED explains that Quinn has to “phonetically re-create the sounds those letters make, backwards”. This leads to the discussion of phonetics in the video, where SED makes an interesting observation: you can’t physically make certain sounds backwards. SED gives the example of plosive sounds. These are sounds produced by blocking airflow in your mouth and nose, building up air pressure behind the closure, then releasing the closure. The English plosives are [b,p,d,t,k,ɡ]. Those are phonetic symbols, but they represent pretty much the sounds you would expect.

SED claims that you can’t reverse a plosive because you can’t physically reverse this process of air pressure build-up and release. Therefore, Quinn must have “tricks” for making us think that he is making a plosive in reverse.

I actually don’t think this is correct. What Quinn is doing is not a matter of phonetic reversal. It’s something else. I’ll get to that in a bit, but first let’s consider the claim that certain speech sounds can’t be made in reverse.

Can you reverse speech sounds?

All English speech sounds are known as pulmonic egressive sounds in phonetic terminology. The word “pulmonic” refers to the lungs, and “egressive” is fancy science-talk for “outward”. Therefore a pulmonic egressive sound is one where air flows from the lungs out of the body (through the mouth, the nose, or both).

SED mentions plosive sounds, which are obviously egressive given the burst of air that occurs. But let’s think about another type of speech sound, called a fricative. These sounds are made by forcing air through a narrow constriction at a high speed. The sounds [s] and [z] are fricatives, as are the sounds represented by “sh” and “th” in writing. Unlike a plosive, you can easily “reverse” the fricative by simply inhaling air through a narrow constriction.

Try this: make an [s], then hold your tongue position where it is and inhale. The sound this makes is quite different from an [s], and it’s not a sound you will hear Quinn make. Now for something harder: make a [z], hold the tongue position, then inhale to make a reversed [z]. You’ll find it’s very difficult, if not impossible. It’s also not something Quinn is doing.

Another category of speech sounds is the nasal, which includes the likes of [m] and [n]. They are egressive and they involve air flow out through the nose (but not the mouth). To truly “reverse” them, you’d need to inhale air through your nose, and that’s also not what Quinn is doing. Just like with fricatives, you’ll find that inhaled nasals sound completely different from the normal egressive nasals. Listen to Quinn doing the reversed talk, and he never inhales through his nose.

As a quick aside, egressives are the most common kind of sound across languages, and most languages have nothing but egressives. However, non-egressive sounds exist too. One kind is called an implosive. This is similar to a plosive, in that air is trapped behind a closure and pressure builds up. For an implosive, the larynx is lowered and then the closure is released. The result is a sound that is often described as a “swallowed” plosive in non-technical terms.

There are also sounds called “velaric”, more commonly known as “clicks”. These sounds do not involve airflow from the lungs at all. Clicks are extremely rare, and occur mainly in the Khoi-San languages spoken in southern Africa.

Anyway, returning to the guy who talks backwards: Quinn is not reversing his phonetic productions of words. He’s producing each sound in a more-or-less “normal” way. What is his trick then?

Two kinds of language

This brings up a fundamental concept in linguistics: the distinction between underlying forms of language and surface forms of language. Surface language refers to language actually produced by someone. Spoken, signed, and written language would all qualify as surface forms. Underlying language refers to language as it exists in your brain.

What Quinn has learned to do is to reverse the order of underlying forms of language. More specifically, Quinn knows how to reverse phonemes, which are the underlying sound categories.

To understand phonemes, you need to understand one important thing about speech: it is highly variable. The pronunciation of a sound is heavily influenced by its context. For instance, vowels in English are considerably more nasal when they are pronounced before a nasal consonant than when they are pronounced elsewhere. The vowel in “tan” is quite nasal, while the vowels in “tap”, “tack”, or even “nap” are much less so.

Another example is the way that plosives are pronounced. When they occur at the beginning of a word, followed by a vowel, there is an extra burst of aspiration when they are released, compared to when they are pronounced elsewhere. Here’s a really low-budget phonetics experiment you can do: hold your palm up just in front of your mouth, and say these sentences: “I had a nap” and “I had a pin”. You should feel an extra burst of air on your palm when you say “pin”, because the plosive [p] is at the start of the word. In “nap” the plosive is at the end of the word, and the end of the sentence, and the burst is much weaker (in fact, you might not release the [p] at all).

This variation is only at the surface level of the language. The vowels we produce and perceive as English speakers are both nasal and non-nasal. The plosives are both aspirated and unaspirated. But what does this mean for the underlying forms of language? How does this affect the way that we store and process language in our brain? And what does this tell us about Quinn’s ability to talk backwards?

Standard linguistic theory holds that we actually learn to ignore some surface variation as children, when we are acquiring language. Instead of learning that there are two different p-sounds in English, children learn that there is a single underlying /p/ sound, but that you have to pronounce it differently under different circumstances. To put it another way, our ears hear the difference between an aspirated plosive and an unaspirated plosive, but our brain learns to treat them both as members of the same category. In technical terms, the underlying category is called a phoneme, and the surface pronunciations are known as allophones of the phoneme.

The collection of words we keep in our head is called our mental lexicon. The words in the lexicon are stored as strings of phoneme categories. When we produce speech we select a word from the lexicon, then transform its phonemes into surface forms. When we listen to speech, we map surface sounds to underlying phonemes so we can look up the word we’re hearing in our mental lexicon.
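The production step described above can be sketched in a few lines of Python. This is only a toy illustration: the function name `to_surface`, the symbol sets, and the restriction of the aspiration rule to /p t k/ are assumptions made for the sketch, not a full account of English.

```python
# Toy sketch (assumptions): only voiceless plosives get aspirated, and only
# word-initially before a vowel, as in the "pin" vs. "nap" experiment.
VOICELESS_PLOSIVES = {"p", "t", "k"}
VOWELS = {"ɪ", "æ", "i", "o"}

def to_surface(phonemes):
    """Turn an underlying phoneme list into a surface (allophone) string."""
    surface = []
    for i, p in enumerate(phonemes):
        word_initial = i == 0
        before_vowel = i + 1 < len(phonemes) and phonemes[i + 1] in VOWELS
        if p in VOICELESS_PLOSIVES and word_initial and before_vowel:
            surface.append(p + "ʰ")  # aspirated allophone
        else:
            surface.append(p)        # plain allophone
    return "[" + "".join(surface) + "]"

print(to_surface(["p", "ɪ", "n"]))  # [pʰɪn]
print(to_surface(["n", "æ", "p"]))  # [næp]
```

The same underlying /p/ comes out as two different surface sounds depending on where it sits in the word, which is the phoneme/allophone relationship in miniature.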

Now, you might wonder why children would learn to map two p-sounds to a single phoneme, but not do this with other sounds. For instance, why not assume that [f] and [p] are actually just variants of a single category?

The short answer is that [f] and [p] are contrastive, which means that there are words which are identical except for [f] and [p], e.g. fin/pin or spear/sphere. Learners need to assign [p] and [f] to different phoneme categories in order to distinguish these words. If a learner thought that [f] and [p] were part of the same phoneme category, then the surface forms [fɪn] and [pɪn] would both correspond to the same word in the mental lexicon, and a learner wouldn’t be able to distinguish between them.

On the other hand, aspirated [pʰ] and unaspirated [p] are not contrastive. There are no two words in English that differ only by aspiration. Both [pʰ] and [p] can correspond to a single /p/ phoneme without any chance of confusing words.
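The idea of a contrast can be made concrete with a small check for minimal pairs. The helper below is a hypothetical sketch that compares two words given as lists of phoneme symbols; it is not a standard tool.

```python
def is_minimal_pair(a, b):
    """True if two equal-length phoneme lists differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

fin = ["f", "ɪ", "n"]
pin = ["p", "ɪ", "n"]
nap = ["n", "æ", "p"]

print(is_minimal_pair(fin, pin))  # True
print(is_minimal_pair(pin, nap))  # False
```

Because pairs like fin/pin exist, /f/ and /p/ have to be separate categories. No comparable pair of English words turns up for aspirated versus unaspirated [p].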

Quinn’s “trick”

The point of all this is that Quinn isn’t producing speech sounds backwards. His trick is to look up the phoneme string in his mental lexicon, reverse that string, then turn the resulting string into a surface form.

For example, in a word like “fish”, which contains the sequence of phonemes /fɪʃ/, Quinn simply reverses the sequence and says [ʃɪf], which would be spelled “shif” if it were a word. The fricatives aren’t reversed in their articulation, just in their relative ordering.
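The difference between reversing letters and reversing phonemes can be sketched in Python. The tiny `LEXICON` below is a hypothetical stand-in for a mental lexicon, with each word stored as a list of phoneme symbols; the function names are made up for illustration.

```python
# Hypothetical mini-lexicon: spelling -> list of phonemes (one item per phoneme).
LEXICON = {
    "fish": ["f", "ɪ", "ʃ"],
    "pin": ["p", "ɪ", "n"],
}

def reverse_letters(word):
    """Reverse the spelling, which is NOT what Quinn does."""
    return word[::-1]

def reverse_phonemes(word):
    """Look up the phoneme string, then reverse the order of the phonemes."""
    return list(reversed(LEXICON[word]))

print(reverse_letters("fish"))            # hsif
print("".join(reverse_phonemes("fish")))  # ʃɪf
```

Note that the two-letter spelling “sh” survives as one unit in the phoneme version, because the reversal operates on sounds rather than on letters.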

One thing that really caught my attention is when Quinn talked about the “ch” sound. The sound that he’s referring to is phonetically transcribed as [tʃ], and it’s a type of sound known as an affricate. This is a sound that begins as a plosive, but instead of releasing the built-up air into a burst, it is released into a fricative sound. There is another affricate in English [dʒ], which is represented by “j” in “jam”.

Affricates are interesting because they involve the articulation of two consecutive surface sounds which correspond to a single underlying phoneme. In theory, the phoneme is a unanalyzable unit. If you ask a random English speaker to reverse the word “peach”, they will probably say [tʃip] (sounds like “cheap”). That is, they will reverse the order of the phonemes in the word, but they won’t reverse the internal ordering of the affricate. Quinn, however, has the insight that to truly speak backwards, you do need to reverse the affricate ordering. He would almost certainly say the word “peach” backwards as [ʃtip], spelled “shtip” if it were a word.
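The two ways of reversing “peach” can be compared with a short sketch. It assumes, purely for illustration, that affricates are stored as single units that can optionally be split into a plosive plus a fricative; `AFFRICATE_PARTS` and `reverse_word` are hypothetical names.

```python
# Assumption for illustration: each affricate is one unit that can be
# split into its plosive and fricative parts.
AFFRICATE_PARTS = {"tʃ": ["t", "ʃ"], "dʒ": ["d", "ʒ"]}

def reverse_word(phonemes, split_affricates):
    """Reverse a phoneme list, optionally reversing inside affricates too."""
    out = []
    for p in reversed(phonemes):
        if split_affricates and p in AFFRICATE_PARTS:
            out.extend(reversed(AFFRICATE_PARTS[p]))
        else:
            out.append(p)
    return "".join(out)

peach = ["p", "i", "tʃ"]
print(reverse_word(peach, split_affricates=False))  # tʃip
print(reverse_word(peach, split_affricates=True))   # ʃtip
```

The first call models what a random English speaker would likely do, and the second models the reversal that Quinn appears to aim for.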

Sound patterns

The words in a language are not random strings of sounds. Every language has specific rules and constraints on how sounds can be ordered in a word. These are technically known as “phonotactic rules”. In English, for example, words never start with a sequence of a plosive and a nasal. English lacks words like *bnick, *ntip, or *kmop. Another rule is that if a word starts with three consonants, the first one has to be /s/, e.g. we have the words “string” and “strip” but not *ktring or *wtrip. (I actually have a whole post over here explaining how phonotactic rules affect the number of possible words in a language.)
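The two rules just mentioned can be written as a rough check on the consonants at the start of a word. This is a sketch under simplifying assumptions: the symbol sets are deliberately tiny, the function names are hypothetical, and real English phonotactics involves many more constraints than these two.

```python
# Toy symbol classes; only the symbols needed for the examples are listed.
PLOSIVES = {"p", "b", "t", "d", "k", "ɡ"}
NASALS = {"m", "n"}
VOWELS = {"ɪ", "ɑ"}

def onset(phonemes):
    """Return the consonants that come before the first vowel."""
    cluster = []
    for p in phonemes:
        if p in VOWELS:
            break
        cluster.append(p)
    return cluster

def breaks_onset_rules(phonemes):
    """Check only the two word-initial rules described in the text."""
    c = onset(phonemes)
    if len(c) >= 2:
        a, b = c[0], c[1]
        # No word starts with a plosive and a nasal, in either order.
        if (a in PLOSIVES and b in NASALS) or (a in NASALS and b in PLOSIVES):
            return True
    # A three-consonant onset must begin with /s/.
    if len(c) >= 3 and c[0] != "s":
        return True
    return False

print(breaks_onset_rules(["s", "t", "r", "ɪ", "p"]))  # False: "strip"
print(breaks_onset_rules(["k", "t", "r", "ɪ", "ŋ"]))  # True: *ktring
print(breaks_onset_rules(["b", "n", "ɪ", "k"]))       # True: *bnick
```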

Phonotactics vary considerably across languages. Japanese, for example, has very strict rules where consecutive consonants are not allowed (although the Japanese nasal consonant is an exception). Georgian, on the other hand, has crazy complicated syllables and allows up to 6 consonants in a row.

These differences present major barriers when learning to speak other languages. People generally find it very difficult to pronounce sequences of sounds that don’t appear in their native language. Japanese speakers often struggle with English consonant clusters, and words like “trip” come out as “turipu”, with added vowels so that the word conforms more to Japanese phonotactics.

How does this relate to the guy who talks backwards? The regular phonotactic patterns of English don’t apply when you reverse words. For example, no words start with [lw], and in fact pronouncing a word like “lwob” is awkward for most English speakers (try it and you’ll probably insert a whole vowel between [l] and [w]). However, Quinn does just this with the word “bowl” in “dirty fish bowl”. The word “bowl” ends with [wl], so when Quinn reverses it he has to say [lwob], therefore producing a sequence of consonants that he otherwise has no practice producing. This is much, much more difficult than it sounds, and Quinn must have worked very hard on his articulatory skills.

A more subtle part of the trick

When we look up phoneme representations of words in our mental lexicon, it’s not just a single step from there to the surface form. Several other kinds of processes need to take place first. For example, underlying phonemes with systematic variation, like the plosives discussed earlier, have to be transformed into the correct surface sound. Prefixes and suffixes may get added to root words. Stress gets assigned to particular syllables. And so on.

In broad terms, these additional operations are called phonological rules (or morphological rules, depending on what they do exactly). There exist theories of language that don’t rely on rules, and instead use other things like constraint satisfaction or neural networks. I’m just going to say “rules” for simplicity here.

Now here’s something really interesting about Quinn’s reverse speech. He’s not applying all the rules, then reversing the result. I think he’s reversing the underlying phoneme order, then applying rules to the reversed forms.

A good example is word stress. This is not part of the underlying representation in the English mental lexicon. It’s added by some kind of rule. The reason it isn’t part of the underlying representation is that the addition of suffixes can change where stress goes. Stress is not simply a fixed fact that we memorize for any given word. Consider the word Canada, where stress is on the first syllable. If you add a suffix to make Canadian, the stress moves to the second syllable (and the quality of the vowel in the first syllable changes too).

Now, let’s look at how stress works in Quinn’s backwards-English. (Maybe we can call it Shilnge? We can’t call it Hsilgne for reasons I’ve already explained.) A good example is the first word he says backwards, which is “banjo”. In phonemic notation, it looks like the following, with syllables separated by periods:

/bæn.dʒow/

When Quinn says it backwards he says

/woʒd.næb/

Now, let’s think about the stress pattern. “Banjo” is a two-syllable word, where the first syllable is stressed, and the second is unstressed (“weak”). A simple diagram of the stress pattern looks like this:

 S    W
[bæn.dʒow]

But when you hear the reversed playback of Quinn’s [woʒd.næb], there’s something odd about the stress. Rather than [bæn] being stressed, it’s [dʒow] that gets stressed. You can hear it as a rising tone in the playback. We could notate the stress this way:

 S     W
[woʒd.næb]

In other words, Quinn reversed the phoneme order, and then applied the regular stress pattern to that reversed order.
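The two possible orderings of “reverse” and “assign stress” can be contrasted in a small sketch. The stress rule here (always stress the first syllable) is a deliberately crude assumption that happens to fit “banjo”, and `assign_stress` is a hypothetical name; real English stress assignment is far more involved.

```python
def assign_stress(syllables):
    """Toy stress rule (an assumption): first syllable strong, the rest weak."""
    return ["S"] + ["W"] * (len(syllables) - 1)

forward = ["bæn", "dʒow"]
backward = ["woʒd", "næb"]  # reversed form as transcribed in the text

# Option A: stress the normal word first, then reverse the pattern with it.
stress_then_reverse = list(reversed(assign_stress(forward)))

# Option B: reverse first, then stress the reversed form like any other word.
reverse_then_stress = assign_stress(backward)

print(list(zip(backward, stress_then_reverse)))  # [('woʒd', 'W'), ('næb', 'S')]
print(list(zip(backward, reverse_then_stress)))  # [('woʒd', 'S'), ('næb', 'W')]
```

Option B matches what the playback of “banjo” seems to show: the strong syllable is [woʒd], so the played-back version ends up with stress on [dʒow].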

For another example, listen to how Quinn says the cat’s name, which is Ganymede [ɡæ.nə.mid] with stress on the first syllable [ɡæ]. In the reversed version the stress sounds like it’s on the syllable [mid], rather than [ɡæ]. (And as a side-note, I was super impressed by this particular feat. Ganymede is such an unusual word, there’s a good chance that Quinn had never heard it before. So to take a brand-new word, convert it into phonemes, then reverse those phonemes and repeat the word within milliseconds…it’s just incredible.)

Now to be fair, I’m not sure that I’ve identified a fully consistent pattern. When Quinn says Alabama, the stress sounds like it is on [bæ] in both the backwards version and the playback version. So there may be something more complicated happening.

Final thoughts

Quinn’s skill at reversing words is remarkable. As far as I know, this is not something that ever occurs naturally in human languages – there are no languages where reversing phoneme order is a normal thing to do. He is one of the very rare humans who have ever learned to do this. After all, voice-recording technology was only invented very recently, so prior to that humans would have been unable to practice or prove such a skill. The closest natural phenomenon is metathesis, where two sounds swap order for grammatical reasons, but this is not a reversal of an entire word or phrase, and it’s very rare in any case.

What Quinn can do is not just amazing for the difficulty of the mental task involved; it also provides a unique look at how humans organize language in their brains. I highly recommend you go watch that Smarter Every Day video and check out Quinn’s own channel too.
