# Phonology Deep Dive | Vietnamese pronunciation, tones, and tone marks
> Vietnamese is a tonal language. This guide walks through all of the Vietnamese tones so you can learn how to pronounce them. Audio samples & diagrams. 
**URL:** https://migaku.com/blog/language-fun/vietnamese-tones-overview
**Last Updated:** 2025-05-02
**Tags:** deepdive, fundamentals
---
I'm not going to lie: Vietnamese phonology is pretty complex. Learning to hear and distinguish Vietnamese's tones will require you to learn about some new concepts, think about how you speak English, and listen carefully.

Having said that, you _can_ learn the tones.

This article walks through each tone in Vietnamese and everything you need to make sense of it. We've also gathered dozens of audio recordings from real Vietnamese speakers so that you can literally _hear_ what we're talking about.

This article isn't short—it'll be a 15–20 minute read—but by the time you're done, you'll know what you're doing and where to go from here.

Let's get into it:

<toc></toc>

---

> <CenteredText bold underline>Important note</CenteredText><br> <CenteredText>There are multiple dialects of Vietnamese. Some tones and sounds differ between them. This article will focus on the tones of Northern Vietnamese, which is what is typically taught to foreigners in Vietnamese language classes. </CenteredText>

## Is Vietnamese a tonal language?

I hate to be the bearer of bad news, but yes: Vietnamese is a tonal language.

In fact, you might call it a _very_ tonal language.

Whereas Mandarin has 4 tones (plus a neutral one), Vietnamese has 6 tones:

- A flat tone <custom-audio src="/assets/blog/vi_ba.mp3" :type="3"></custom-audio>

- A low falling tone <custom-audio src="/assets/blog/vi_bà.mp3" :type="3"></custom-audio>

- A mid / high rising tone <custom-audio src="/assets/blog/vi_bá.mp3" :type="3"></custom-audio>

- A glottalized mid falling tone <custom-audio src="/assets/blog/vi_bạ.mp3" :type="3"></custom-audio> <br> _(we'll explain what "glottalized" means down below; it's not as scary as it sounds)_

- A falling-rising tone <custom-audio src="/assets/blog/vi_bả.mp3" :type="3"></custom-audio>

- A glottalized rising tone <custom-audio src="/assets/blog/vi_bã.mp3" :type="3"></custom-audio>

_(Yes, each of those audio recordings had a different tone. If you couldn't tell how, you're in the right place 💪)_

Here they are visualized in this same order:

<img src="/assets/blog/migaku_vietnamese_tones_table.webp" width="1413" height="965" alt="A visualization of Vietnamese's six tones" />

(More precisely, Vietnamese is what's called a "register language" because its tones are defined not only by pitch contour/melody but also voice quality and length.)

That's a lot of big words, but hang on a second before you start panicking.

I've got some good news.

## The #1 thing you need to understand to make sense of Vietnamese tones

English makes heavy use of tones, too.

Don't believe me?

Well, check this out. [I'm going to say the word _dude_ eight times](https://www.sinosplice.com/life/archives/2015/01/27/kaisers-dude-system-of-tones), and it's going to mean something different every time.

Here we go:

1. <custom-audio src="/assets/blog/dude_annoyed.m4a" :type="3"></custom-audio>
2. <custom-audio src="/assets/blog/dude_high.m4a" :type="3"></custom-audio>
3. <custom-audio src="/assets/blog/dude_high_short.m4a" :type="3"></custom-audio>
4. <custom-audio src="/assets/blog/dude_you_dawg.m4a" :type="3"></custom-audio>
5. <custom-audio src="/assets/blog/dude_exasperated.m4a" :type="3"></custom-audio>
6. <custom-audio src="/assets/blog/dude_low_short.m4a" :type="3"></custom-audio>
7. <custom-audio src="/assets/blog/dude_falling.m4a" :type="3"></custom-audio>
8. <custom-audio src="/assets/blog/dude_rising.m4a" :type="3"></custom-audio>

Cool, huh?

I'm not much of a voice actor, but there are some cool things we can point out:

- #2 and #3 are both high tones, but the third is much shorter... and that matters! This is the difference between "dude, did you really eat my last candy bar?" and "dude—it's not that hard."
- #6 and #7 are both falling tones, but #6 is much shorter and #7 is much more intense. This is the difference between "dude, chill, it's not that big of a deal" and "dude, I said NO!"
- #5 is a dude of pure exasperation. _This_ dude in particular will be important later on in the article, so listen to it a few times. In particular, notice how my voice's "ooh" vowel gets grittier over the course of the recording.

And now we need to get technical for a second:

### Tones vs intonation

English has tones, and Vietnamese has tones, but they're not quite the same sort of tones:

- English has _intonation_, which means that we use tones to communicate information about how we feel about whatever we're saying

- Vietnamese is _tonal_, which means its speakers use tones in a much more mechanical way to mark syllables (and they have intonation like we do, too; more on that in [Tjuka, Nguyen, Vijver, and Spalek (2024)](https://www.frontiersin.org/journals/education/articles/10.3389/feduc.2024.1411660/full))

What this means is that a rising tone (dude #8) is a sure signal that we're voicing a question or communicating our uncertainty/anxiety in English, but this isn't necessarily the case in Vietnamese. In Vietnamese, some syllables just rise by nature, whether they're part of a question or not.

If this is hard to wrap your mind around, say the word _articulation_ out loud for a second. Now find the beat. Notice what you said? arTIcuLAtion. Some syllables are stressed (you pronounce them louder, more clearly, and hold them for slightly longer) and some syllables are unstressed. In _articulation_, the stressed syllables are TI and LA. Importantly, this isn't anything to do with TI or LA themselves: these same syllables are unstressed in words like PAtty and GElatin.

Tones work similarly in Vietnamese.

Each syllable can have a different "melody"—it might have a pitch that's high and flat, a pitch that initially dips before rising up high, or one of a handful of other shapes. Tones work differently in Vietnamese than they do in English, and that'll take getting used to... but, for now, just remember:

> As a native English speaker, you're already comfortable with the concept of tones. You use them in every single sentence you utter. To learn Vietnamese, you just need to get comfortable using tones in a mechanical fashion to mark words, rather than to express emotion.

---

## The five components of every tone in Vietnamese

In the section after this one, we're going to walk through all of Vietnamese's tones, step by step. To prepare for that, there are five things about tones in Vietnamese that I think you should pay attention to as a learner:

1. Syllables
2. The tone mark
3. The relative pitch a tone starts at
4. The quality of voice used when making a particular tone
5. Whether a syllable ends in a consonant or a vowel

These aren't necessarily The Things™ a textbook or journal article will use to categorize tones by... but, as a learner, each one does give you a nice little handhold to latch onto.

### 1. The syllable, as every tone occupies a syllable

Your journey to getting comfortable with Vietnamese tones starts with identifying Vietnamese syllables. Thankfully, this is very easy. Spaces are placed before and after each syllable in Vietnamese.

For example, there are three syllables in the phrase "in Vietnam":

- ở Việt Nam <custom-audio src="/assets/blog/vi-ở việt nam.mp3" :type="3"></custom-audio>
  - ở
  - Việt
  - Nam

What's a bit tricky is that these spaces are _also_ inserted between syllables that form words, such as between "Việt" and "Nam". You'll get used to this before long, but you might initially struggle to determine whether a particular syllable stands by itself or is part of a word.
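
If you ever want to count syllables programmatically, the "one space, one syllable" convention makes it almost trivial. Here's a minimal sketch; the function name `split_syllables` is just something I made up for illustration, and it doesn't try to handle punctuation or work out which syllables belong together as a word:

```python
def split_syllables(phrase: str) -> list[str]:
    """Split written Vietnamese into syllables (one per whitespace-separated chunk)."""
    return phrase.split()


syllables = split_syllables("ở Việt Nam")
print(syllables)       # ['ở', 'Việt', 'Nam']
print(len(syllables))  # 3
```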

### 2. Dấu, or tone mark, which tells you the tone of a syllable

The Vietnamese word "dấu" <custom-audio src="/assets/blog/vi-dấu.mp3" :type="3"></custom-audio> means mark/sign/symbol. _(Yes, in Vietnamese, "d" makes a Z sound.)_ There are five dấu, plus a dấu-less dấu, and each one serves to indicate the tone (thanh <custom-audio src="/assets/blog/vi_ngang.mp3" :type="3"></custom-audio>) of a word.

You may see these two terms used seemingly interchangeably online, but, strictly speaking, dấu refers to the written symbol that appears above/below a vowel, while thanh refers to the sound/pitch pattern of a tone.

Here are the Vietnamese tone markers:

| Dấu (Tone Mark) | Audio                                                               | Tone Mark Name                    | Description                                         | Thanh (Tone)                        |
| --------------- | ------------------------------------------------------------------- | --------------------------------- | --------------------------------------------------- | ----------------------------------- |
| a               | <custom-audio src="/assets/blog/vi-a.mp3" :type="3"></custom-audio> | **không dấu** <br> "no mark"      | No visible marking, as shown in "Nam" of "Việt Nam" | thanh ngang <br> "flat tone"        |
| à               | <custom-audio src="/assets/blog/vi-à.mp3" :type="3"></custom-audio> | **dấu huyền** <br> †"grave mark"  | A descending accent marker over a vowel             | thanh huyền <br> "falling tone"     |
| á               | <custom-audio src="/assets/blog/vi-á.mp3" :type="3"></custom-audio> | **dấu sắc** <br> "sharp mark"     | An ascending accent marker over a vowel             | thanh sắc <br> "rising tone"        |
| ạ               | <custom-audio src="/assets/blog/vi-ạ.mp3" :type="3"></custom-audio> | **dấu nặng** <br> "heavy mark"    | A dot under a vowel                                 | thanh nặng <br> "heavy tone"        |
| ả               | <custom-audio src="/assets/blog/vi-ả.mp3" :type="3"></custom-audio> | **dấu hỏi** <br> "asking mark"    | What looks like a ? without the dot, above a vowel  | thanh hỏi <br> "dipping tone"       |
| ã               | <custom-audio src="/assets/blog/vi-ã.mp3" :type="3"></custom-audio> | **dấu ngã** <br> "stumbling mark" | A tilde/squiggly mark above a vowel                 | thanh ngã <br> "broken rising tone" |

_† Note: Each of the tone names is just a normal Vietnamese word, and different people may translate them slightly differently. For example, you'll sometimes see thanh hỏi translated as "asking tone", and other times as "question tone"._

> <CenteredText bold underline>Warning: Non-tone markers</CenteredText><br> <CenteredText>There are seven Vietnamese letters—⟨ă⟩, ⟨â⟩, ⟨ê⟩, ⟨ô⟩, ⟨ơ⟩, ⟨ư⟩, and ⟨đ⟩—which include a marker/accent by default. As such, ⟨ấ⟩ is really ⟨â⟩+ ⟨á⟩. This will be confusing at first, but you'll get the hang of it. </CenteredText>
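
If you're curious how a computer tells tone marks apart from those "built-in" letter accents, here's a rough sketch using Python's standard `unicodedata` module. It relies on Unicode decomposition (NFD), which splits a letter like ⟨ấ⟩ into a + circumflex + acute. The `TONE_MARKS` table and the `tone_of` function are my own illustrative names, and the sketch assumes you pass in exactly one syllable:

```python
import unicodedata

# Combining characters that act as tone marks (dấu).
TONE_MARKS = {
    "\u0300": "huyền",  # combining grave accent
    "\u0301": "sắc",    # combining acute accent
    "\u0323": "nặng",   # combining dot below
    "\u0309": "hỏi",    # combining hook above
    "\u0303": "ngã",    # combining tilde
}


def tone_of(syllable: str) -> str:
    """Return the tone name of one written syllable ('ngang' if unmarked)."""
    # NFD splits each letter into its base letter plus combining marks.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    # The circumflex, breve, and horn of â, ă, ư (and friends) are not in
    # TONE_MARKS, so they never count as a tone.
    return "ngang"


for s in ["Nam", "huyền", "dấu", "nặng", "hỏi", "ngã", "không"]:
    print(s, "->", tone_of(s))
```

Note that "không" comes back as ngang: the hat on ⟨ô⟩ is part of the letter, not a dấu.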

<accordion heading="How to know which vowel to put the marker on">

While it may not appear so at first, the placement of Vietnamese tone markers is actually standardized.

1. If a syllable has only one vowel, the dấu goes on that vowel → không, ngã
2. If a syllable has multiple vowels and ends in a <u>vowel</u>, the dấu goes on the second-to-last vowel → hỏi, dấu
3. If a syllable has multiple vowels and ends in a <u>consonant</u>, the dấu goes on the last vowel → huyền, Việt

</accordion>
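
For the programmers in the room, the three placement rules from the accordion above can be sketched in a few lines of Python. Treat this as a simplified illustration, not a full spelling engine: `DAU`, `is_vowel`, and `place_dau` are names I invented, the input is assumed to be a single syllable that doesn't have a tone mark yet, and spellings where ⟨u⟩ or ⟨i⟩ belongs to the initial consonant (⟨qu⟩, ⟨gi⟩) would need extra handling that isn't covered here.

```python
import unicodedata

# Combining marks for the five written dấu (ngang has none).
DAU = {
    "huyen": "\u0300",  # grave accent
    "sac": "\u0301",    # acute accent
    "nga": "\u0303",    # tilde
    "hoi": "\u0309",    # hook above
    "nang": "\u0323",   # dot below
}


def is_vowel(ch: str) -> bool:
    """True if the letter's base form (diacritics stripped) is a vowel."""
    base = unicodedata.normalize("NFD", ch)[0].lower()
    return base in "aeiouy"


def place_dau(syllable: str, tone: str) -> str:
    """Add a tone mark to a toneless syllable using the three rules above."""
    s = unicodedata.normalize("NFC", syllable)
    vowels = [i for i, ch in enumerate(s) if is_vowel(ch)]
    if not vowels:
        raise ValueError(f"no vowel in {syllable!r}")
    if len(vowels) == 1:
        target = vowels[0]    # rule 1: the only vowel
    elif is_vowel(s[-1]):
        target = vowels[-2]   # rule 2: ends in a vowel
    else:
        target = vowels[-1]   # rule 3: ends in a consonant
    marked = s[: target + 1] + DAU[tone] + s[target + 1:]
    # NFC recombines the base letter and its marks into one character.
    return unicodedata.normalize("NFC", marked)


print(place_dau("hoi", "hoi"))      # hỏi
print(place_dau("dâu", "sac"))      # dấu
print(place_dau("huyên", "huyen"))  # huyền
print(place_dau("Viêt", "nang"))    # Việt
```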

### 3. Starting point, or the relative highness/lowness of a tone's pitch

Bear with me for a moment, but I'm going to state something very obvious:

> Different people have different voices.

And this is super important to understand.

- A piano has _absolute_ tones—if you whack the middle C key on any piano, it'll sound at exactly the same pitch
- Vietnamese has _relative_ tones—what matters is not necessarily the specific pitch of a tone, but rather its pitch in relation to the tones that appear around it

For example, consider that we have two people: a young girl with a naturally higher-pitched voice and an old man with a naturally lower-pitched voice. Chances are, the young girl's thanh nặng (mid falling tone) will be pronounced at a higher pitch than the old man's thanh sắc (mid / high-rising tone).
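
To make "relative" a bit more concrete, here's a small sketch that measures a pitch against the speaker's own baseline instead of in raw hertz. It uses the standard semitone formula, 12 × log2(pitch / reference). The frequencies below are numbers I made up purely for illustration (they are not measurements of real speakers), and using each speaker's ngang as the reference is just one reasonable choice.

```python
import math


def semitones_from(reference_hz: float, pitch_hz: float) -> float:
    """Distance of pitch_hz from the speaker's own reference pitch, in semitones."""
    return 12 * math.log2(pitch_hz / reference_hz)


# Made-up values for illustration only, not real measurements.
girl = {"ngang": 300.0, "nang_end": 240.0}  # end point of her falling nặng
man = {"ngang": 120.0, "sac_end": 160.0}    # end point of his rising sắc

# In absolute terms, her falling tone still ends higher than his rising tone.
print(girl["nang_end"] > man["sac_end"])  # True

# Relative to each speaker's own ngang, the directions come out as expected.
print(semitones_from(girl["ngang"], girl["nang_end"]))  # negative: below her baseline
print(semitones_from(man["ngang"], man["sac_end"]))     # positive: above his baseline
```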

Another important implication of the fact that everybody's voice is unique is that not everybody will make these tones in exactly the same way. Take a moment to skim these charts (both taken from [Nguyễn & Edmondson (1998)](http://www.sealang2.net/archives/mks/pdf/28:1-18.pdf)) for me:

Vietnamese's six tones, as uttered by a male from northern Vietnam:

<img src="/assets/blog/wikipedia_vietnamese_tones_1.jpeg" width="1243" height="620" alt="A diagram of one Hanoi speaker's tones" />

Vietnamese's six tones, as uttered by a female from Hanoi (in northern Vietnam):

<img src="/assets/blog/wikipedia_vietnamese_tones_2.jpeg" width="939" height="648" alt="A diagram of a second Hanoi speaker's tones" />

The _shape_ of the tones is more or less the same, but there's also a fair bit of variance!

- The first speaker's sắc tone (blue) started _higher_ than their ngang tone (black), whereas the second speaker's sắc tone started _lower_ than their ngang tone.
- The first speaker's hỏi tone (green) dropped and didn't rise, whereas the second speaker's hỏi tone dropped and rose. (Typically, it _does_ rise back up when it appears at the end of an utterance or you're speaking carefully, but it _doesn't_ rise back up when you're speaking normally/quickly.)
- (You'll notice a lot of things if you look closely)

> <CenteredText bold underline>The point</CenteredText><br> <CenteredText>You don't need perfect pitch to make Vietnamese tones. Your mid flat / level tone doesn't need to be pronounced at the same pitch every single time. Your _relative_ pitches are what matter: it's not a big deal if your sắc starts a bit higher or a bit lower, but it should end higher than your ngang.</CenteredText>

### 4. Phonation type, or the type of voice you use

This section was originally very complex, but I've decided to simplify it. Go ahead and check out [the Wikipedia article](https://en.wikipedia.org/wiki/Phonation) if you want a bit more complexity, or [Edmondson (2006)](https://www.researchgate.net/publication/231909247_The_valves_of_the_throat_and_their_functioning_in_tone_vocal_register_and_stress_laryngoscopic_case_studies) for a much more anatomically-heavy look. (It's cool, if you're a nerd.)

Having said that, there are two main things you need to understand about speech:

1. Speaking involves expelling air from your mouth (and sometimes your nose)
2. The manner in which you manipulate that airflow significantly affects how your voice sounds

For example, from more constricted to more lax airflow:

| Tension<br>(↑more) | Quality       | Audio                                                                            | Description                                                                                                           |
| ------------------ | ------------- | -------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
| ↑                  | Glottal stop  | <custom-audio src="/assets/blog/glottal_stop.m4a" :type="3"></custom-audio>      | The sound in the middle of "uh-uh"                                                                                    |
| ┃                  | Creaky voice  | <custom-audio src="/assets/blog/creaky_phonation.m4a" :type="3"></custom-audio>  | AKA "vocal fry", here's [a lot of audio samples](https://www.youtube.com/watch?v=Q0yL2GezneU&t=629s)                  |
| ┃                  | Tense voice   | <custom-audio src="/assets/blog/tense_phonation.m4a" :type="3"></custom-audio>   | Creaky voice, but with more airflow—a kind of strained voice                                                          |
| —                  | Modal voice   | <custom-audio src="/assets/blog/modal_voice.m4a" :type="3"></custom-audio>       | Just your normal talking voice                                                                                        |
| ┃                  | Slack voice   | N/A                                                                              | _(Not important for Vietnamese, but between modal and breathy)_                                                       |
| ┃                  | Breathy voice | <custom-audio src="/assets/blog/breathy_phonation.m4a" :type="3"></custom-audio> | Talking and letting more air out than normal—think [Marilyn Monroe](https://www.youtube.com/watch?v=heQaJPFP5gU&t=44) |
| ↓                  | Whisper       | <custom-audio src="/assets/blog/whisper_phonation.m4a" :type="3"></custom-audio> | If you relax your larynx further, you end up whispering                                                               |

The audio recordings likely sound a bit weird because people don't normally talk "purely" in one of these other voices. We use our modal/normal voice, and then we sprinkle in bits of creakiness or breathiness for effect. It might help to think of these voice qualities as being knobs you dial up or dial down:

- If you constrict the muscles in your throat (around your larynx), the reduced airflow leads to creakiness and eventually cuts off your sound altogether (a glottal stop)
- If you relax the muscles in your throat (around your larynx), air can escape more freely, and you get a more breathy/wistful quality

_(If you ctrl + f for "#5 is a dude of pure exasperation", that recording features a transition from modal to tense voice)_

> <CenteredText bold underline>Important point</CenteredText><br> <CenteredText>Vietnamese tones aren't _just_ about pitch. Some tones are made with modal (normal) voice quality, some are made with a creaky voice, and some are made with a breathy voice. We'll talk about this down below in the step-by-step guide.</CenteredText>

### 5. Checked tones, or whether a tone ends in a consonant or vowel

Eventually, you'll need to build mastery over the tones, recognize them when you hear them, and become able to reproduce each one confidently.

For now, though, here's a quick hack for you:

- If a syllable ends in a P, T, or K sound, its tone _must_ be either sắc (sharp/mid rising) or nặng (mid glottalized falling)
- If a syllable ends with any other sound, it can be any of Vietnamese's 6 tones
  - If it ends in a high tone, it's either sắc or ngã (and may sound slightly different than the "normal" sắc/ngã, but you can pick this up as you go)
  - If it ends in a low tone, it's either nặng or huyền
  - Hỏi can go both ways—it might dip and stay low (in normal/fast speech) or it might dip and rise back up (in careful speech/the end of an utterance)
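
Here's that first rule as a tiny lookup, in case it helps to see it spelled out. This is a sketch under one assumption of mine: that final P, T, and K sounds are written ⟨p⟩, ⟨t⟩, ⟨c⟩, or ⟨ch⟩ in Vietnamese spelling. The names `possible_tones` and `STOP_ENDINGS` are made up for this example.

```python
ALL_TONES = ["ngang", "huyền", "sắc", "nặng", "hỏi", "ngã"]
CHECKED_TONES = ["sắc", "nặng"]

# Assumption: final P/T/K sounds are spelled p, t, c, or ch.
STOP_ENDINGS = ("p", "t", "c", "ch")


def possible_tones(syllable: str) -> list[str]:
    """Return the tones a syllable could carry, judging only by its final letters."""
    if syllable.lower().endswith(STOP_ENDINGS):
        return list(CHECKED_TONES)
    return list(ALL_TONES)


print(possible_tones("tóc"))   # ends in c, so only sắc or nặng
print(possible_tones("mịch"))  # ends in ch, so only sắc or nặng
print(possible_tones("Nam"))   # any of the six tones
```

This narrows down your options when listening: hear a clipped P/T/K ending, and you only have two tones to choose between.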

And now let's get a crash course in each of Vietnamese's 6 tones and, more importantly, hear a bunch of audio samples for each one.

---

## A step-by-step guide to all six tones (+ audio)

As we said at the beginning of the article, Vietnamese has six tones. We're now ready to talk about them, but before you start thinking too hard about this, take a second and just listen.

In particular, I want you to do three things:

1. Look at each tone's name
2. Find that tone on the graphic below and note its general shape
3. Listen to the recordings; think about (a) how different speakers pronounce the same tone, and (b) what distinguishes each tone from the other tones

- **Ngang**: Ba <custom-audio src="/assets/blog/vi_ba.mp3" :type="3"></custom-audio> <custom-audio src="/assets/blog/vi_ba_creaky.mp3" :type="3"></custom-audio> <custom-audio src="/assets/blog/vi_ba_breathy.mp3" :type="3"></custom-audio> ・ La <custom-audio src="/assets/blog/vi-la.mp3" :type="3"></custom-audio> <custom-audio src="/assets/blog/vi_la_creak.mp3" :type="3"></custom-audio>
- **Huyền**: Bà <custom-audio src="/assets/blog/vi_bà_m.mp3" :type="3"></custom-audio><custom-audio src="/assets/blog/vi_bà_f.mp3" :type="3"></custom-audio>・ Là <custom-audio src="/assets/blog/vi_là_m.mp3" :type="3"></custom-audio><custom-audio src="/assets/blog/vi_là_f.mp3" :type="3"></custom-audio>
- **Sắc**: Bá <custom-audio src="/assets/blog/vi_bá_m.mp3" :type="3"></custom-audio> <custom-audio src="/assets/blog/vi_bá_f.mp3" :type="3"></custom-audio> ・ Lá <custom-audio src="/assets/blog/vi_lá_breath.mp3" :type="3"></custom-audio><custom-audio src="/assets/blog/vi_lá.mp3" :type="3"></custom-audio>
- **Nặng**: Bạ <custom-audio src="/assets/blog/vi_bạ_f.mp3" :type="3"></custom-audio><custom-audio src="/assets/blog/vi-bạ_m.mp3" :type="3"></custom-audio>・Lạ <custom-audio src="/assets/blog/vi-lạ_f.mp3" :type="3"></custom-audio><custom-audio src="/assets/blog/vi-lạ_m.mp3" :type="3"></custom-audio>
- **Hỏi**: Bả <custom-audio src="/assets/blog/vi_bả_m1.mp3" :type="3"></custom-audio><custom-audio src="/assets/blog/vi_bả_m2.mp3" :type="3"></custom-audio>・Lả <custom-audio src="/assets/blog/vi_lả_m.mp3" :type="3"></custom-audio>
- **Ngã**: Bã <custom-audio src="/assets/blog/vi-bã.mp3" :type="3"></custom-audio>・Lã <custom-audio src="/assets/blog/pronunciation_vi_lã.mp3" :type="3"></custom-audio><custom-audio src="/assets/blog/vi-lã.mp3" :type="3"></custom-audio>

<img src="/assets/blog/migaku_vietnamese_tones_overview.webp" width="1413" height="965" alt="A visualization of Vietnamese's six tones" />

_I've summarized the following notes from the points on which [Brunelle, 2009;](https://www.researchgate.net/publication/293182693) [Pham, 2003;](https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2003/papers/p15_1703.pdf) [Phạm & McLeod, 2019;](https://researchoutput.csu.edu.au/ws/portalfiles/portal/199645633/32337908_Accepted_manuscript.pdf) and Wikipedia overlap. Also featured are wonderfully clear audio recordings from [Tung Hoang of Open Lib](https://openbooks.lib.msu.edu/vietnamese/chapter/section-2-tone-and-tone-marks/), via [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)._

### Ngang, the "flat" tone

Ngang is your baseline: it's what every other tone you make in Vietnamese is set against. Thankfully, it's also the easiest tone to make.

Before we get started, close your eyes and listen to several syllables with this tone:

> <CenteredText bold underline> Tung Hoang's audio: <custom-audio src="/assets/blog/Tung_Hoang_Ngang.mp4" :type="3"></custom-audio></CenteredText>

<img src="/assets/blog/migaku-vietnamese-ngang.webp" width="1875" height="1131" alt="A screenshot showing the pitch contour of Vietnamese's 'ngang' tone" />

Some pointers:

- This tone is completely flat. As a native English speaker, you'll likely want to drop your pitch toward the end of it. _Don't_. Maintain a flat, almost monotonous tone quality that does not waver or move.
- Use your normal voice to make this tone. If that doesn't sound quite right, place your fingers lightly on your Adam's apple and then say the tone slightly higher and lower in pitch. You're looking for a place that is very comfortable and creates the most vibration against your fingertips.
- Your pitch should be in the middle, or slightly above the middle, of your vocal range

Now listen to these audio samples while thinking about the above details:

- con voi <custom-audio src="/assets/blog/pronunciation_vi_con_voi.mp3" :type="3"></custom-audio> _(elephant)_
- hoa sen <custom-audio src="/assets/blog/pronunciation_vi_hoa_sen.mp3" :type="3"></custom-audio> _(lotus)_
- năm châu <custom-audio src="/assets/blog/pronunciation_vi_năm_châu.mp3" :type="3"></custom-audio> _(5 continents)_
- ban đêm <custom-audio src="/assets/blog/pronunciation_vi_ban_đêm.mp3" :type="3"></custom-audio> _(at night)_
- tivi <custom-audio src="/assets/blog/pronunciation_vi_tivi.mp3" :type="3"></custom-audio> _(television)_

### Huyền, the "deep" tone

Huyền is very similar to Ngang, but it is lower in pitch. With only that in mind, listen carefully to Hoang's audio recordings and see what your ears pick up on:

> <CenteredText bold underline> Tung Hoang's audio: <custom-audio src="/assets/blog/Tung_Hoang_Huyen.mp4" :type="3"></custom-audio></CenteredText>

<img src="/assets/blog/migaku-vietnamese-huyen.webp" width="1875" height="1134" alt="A screenshot showing the pitch contour of Vietnamese's 'huyền' tone" />

Some tips:

- Huyền is somewhat flat, like ngang, but drops in pitch gradually over the tone's duration
- The #1 differentiator between ngang and huyền is that huyền is pronounced with breathiness, whereas ngang is pronounced with your normal voice; the amount of breathiness varies from speaker to speaker
- The pitch of huyền is lower than that of ngang, but this isn't necessarily something you need to worry about consciously. Say _ahh_ normally, and then, staying just as relaxed, say _ahh_ in a breathy fashion. You'll notice that when you add breathiness to the sound, the pitch of your voice drops.

Now listen to these audio samples with the above pointers in mind:

- bà già <custom-audio src="/assets/blog/vi_bà_già.mp3" :type="3"></custom-audio> _(old woman)_
- mùa màng <custom-audio src="/assets/blog/vi-mùa màng.mp3" :type="3"></custom-audio> _(crops)_
- vừa lòng <custom-audio src="/assets/blog/vi-vừa lòng.mp3" :type="3"></custom-audio> _(satisfied / pleased)_
- lờ đờ <custom-audio src="/assets/blog/vi-lờ đờ.mp3" :type="3"></custom-audio> _(dull/lack-lustre)_
- hài hoà <custom-audio src="/assets/blog/vi-hài hoà.mp3" :type="3"></custom-audio> _(harmony)_

### Sắc, the "sharp" tone

Sắc is one of Vietnamese's rising tones. Again, before we say more, listen to Hoang's audio and see what you observe:

> <CenteredText bold underline> Tung Hoang's audio: <custom-audio src="/assets/blog/Tung_Hoang_Sac.mp4" :type="3"></custom-audio></CenteredText>

<img src="/assets/blog/migaku-vietnamese-sac.webp" width="1875" height="1131" alt="A screenshot showing the pitch contour of Vietnamese's 'sắc' tone" />

Some pointers:

- Sắc starts at a lower pitch, hangs flat for just a moment, and then quickly rises; it ends at a pitch higher than ngang, but how high it goes depends on the speaker
- Sources disagree: some scholars suggest sắc is pronounced with a modal (normal) voice, others that it is pronounced with a †tense voice (between normal and creaky)
- Sometimes (I can't quite figure out when/why) this tone is simply pronounced as a high, short, almost flat pitch—note the 3rd word in Hoang's audio, _tóc_ below, and _sắc_ below; it seems to be related to when two sắc tones appear in succession
- As you get more advanced, start paying attention to how this tone sounds when it occurs in syllables that end in a P, T or K sound vs. those that don't

Now that you've got an image of this tone, try listening to it again:

- mái tóc <custom-audio src="/assets/blog/vi-mái tóc.mp3" :type="3"></custom-audio> _(hair)_
- bánh tráng <custom-audio src="/assets/blog/vi_bánh_tráng.mp3" :type="3"></custom-audio> _(rice paper)_
- pháo sáng <custom-audio src="/assets/blog/vi-pháo sáng.mp3" :type="3"></custom-audio> _(flares)_
- dấu sắc <custom-audio src="/assets/blog/vi_dấu_sắc.mp3" :type="3"></custom-audio> _(sharp tone mark)_
- phấn trắng <custom-audio src="/assets/blog/vi-phấn trắng.mp3" :type="3"></custom-audio> _(white chalk)_

_† Note: If you don't know what this means, go skim #4 of the "five components" section above._

### Nặng, the "heavy" tone

Nặng is a low tone, and it is also a very short tone. With this in mind, go ahead and listen to Hoang's audio:

> <CenteredText bold underline> Tung Hoang's audio: <custom-audio src="/assets/blog/Tung_Hoang_Nang.mp4" :type="3"></custom-audio></CenteredText>

<img src="/assets/blog/migaku-vietnamese-nang.webp" width="1875" height="1134" alt="A screenshot showing the pitch contour of Vietnamese's 'nặng' tone" />

Some pointers:

- Say _uh-oh_ a few times, slowing down as you go. Notice how you kind of "swallow" the middle of the word—your "uh" suddenly stops? This is called a glottal stop, and nặng's defining characteristic is that it ends in a glottal stop
- This is Vietnamese's lowest tone: start it off lower in your vocal register, and quickly drop your pitch, tensing your voice until a glottal stop snuffs the sound out
- This is Vietnamese's shortest tone; it lasts much less time than the other ones
- As you get more advanced, start paying attention to how this tone sounds when it occurs in syllables that end in a P, T or K sound and those that don't

And now to the audio:

- bận rộn <custom-audio src="/assets/blog/vi_bận_rộn.mp3" :type="3"></custom-audio> _(busy)_
- tận tụy <custom-audio src="/assets/blog/vi-tận tụy.mp3" :type="3"></custom-audio> _(devoted/dedicated)_
- chậm chạp <custom-audio src="/assets/blog/vi_chậm_chạp.mp3" :type="3"></custom-audio> _(slow/sluggish)_
- thật tệ hại <custom-audio src="/assets/blog/vi_thật_tệ_hại.mp3" :type="3"></custom-audio> _("that's terrible")_
- tịch mịch <custom-audio src="/assets/blog/vi-tịch mịch.mp3" :type="3"></custom-audio> _(lonely/quiet)_

### Hỏi, the "asking" tone

This is perhaps Vietnamese's most complex tone. Before thinking about it logically, take a few listens to Hoang's recording and see what stands out to you:

> <CenteredText bold underline> Tung Hoang's audio: <custom-audio src="/assets/blog/Tung_Hoang_Hoi.mp4" :type="3"></custom-audio></CenteredText>

<img src="/assets/blog/migaku-vietnamese-hoi.webp" width="1875" height="1134" alt="A screenshot showing the pitch contour of Vietnamese's 'hỏi' tone" />

Some pointers:

- Hỏi is one of Vietnamese's "dipping" tones—it initially drops down, then rises back up
- In careful speech (such as that of these speakers, who know they are being recorded) or when it occurs at the end of an utterance, the complete hỏi tone is used—it dips and rises; in normal speech, or in the middle of a sentence, hỏi often simply drops low without rising back up
- While this tone rises, it does not rise as much as sắc or ngã; the highest-pitch portion of this sound (its end) is only roughly as high as ngang
- The researchers came to quite different conclusions about the voice quality of this tone, and their findings aren't easily reconciled; from my (non-professional) interpretation of several audio recordings, this tone (a) uses a breathy voice as it drops, and then (b) transitions into a tenser voice as it rises back up

And now for the examples:

- khỏi phải <custom-audio src="/assets/blog/vi-khỏi phải.mp3" :type="3"></custom-audio> _(no need to / not necessary to)_
- khủng hoảng <custom-audio src="/assets/blog/vi_khủng_hoảng.mp3" :type="3"></custom-audio> _(crisis)_
- thỉnh thoảng <custom-audio src="/assets/blog/vi_thỉnh_thoảng.mp3" :type="3"></custom-audio> _(sometimes)_
- cả nể <custom-audio src="/assets/blog/vi-cả nể.mp3" :type="3"></custom-audio> _(easily agreeing to everyone's requests (because of fear of letting people down))_
- tưởng thưởng <custom-audio src="/assets/blog/vi-tưởng thưởng.mp3" :type="3"></custom-audio> _(to reward/compensate)_

### Ngã, the "tumbling" tone

Ngã is similar to sắc, but with a distinct difference that'll pop out at you. Listen to Hoang's audio and see if you can pick it out:

> <CenteredText bold underline> Tung Hoang's audio: <custom-audio src="/assets/blog/Tung_Hoang_Nga.mp4" :type="3"></custom-audio></CenteredText>

<img src="/assets/blog/migaku-vietnamese-nga.webp" width="1875" height="1134" alt="A screenshot showing the pitch contour of Vietnamese's 'ngã' tone" />

Some pointers:

- This is Vietnamese's other dipping tone: it initially drops a bit, then rises
- This sound is broken up by a glottal stop: say _uh-oh_, but drop when you say _uh_ and rise when you say _oh_, taking care to preserve the "swallowing" that happens in between _uh_ and _oh_—that's the rough "shape" of this tone
- This tone's final pitch is on par with that of sắc for some speakers, but may go notably higher for other speakers
- To differentiate this from hỏi, listen for (a) the glottal stop or significant tenseness and (b) a high ending pitch

Without the break in its middle, ngã would sound like sắc, a rising tone.

- ngã ngũ <custom-audio src="/assets/blog/vi-ngã ngũ.mp3" :type="3"></custom-audio> _(to be settled / to reach a conclusion)_
- dĩ vãng <custom-audio src="/assets/blog/vi_dĩ_vãng.mp3" :type="3"></custom-audio> _(the past)_
- dễ dãi <custom-audio src="/assets/blog/vi-dễ dãi.mp3" :type="3"></custom-audio> _(overly permissive)_
- mỹ mãn <custom-audio src="/assets/blog/vi-mỹ mãn.mp3" :type="3"></custom-audio> _(perfect)_
- cũ kĩ <custom-audio src="/assets/blog/vi-cũ kĩ.mp3" :type="3"></custom-audio> _(old / worn out)_

## Some quick notes about Southern Vietnamese vs Northern Vietnamese

This is going to be very brief, but on the off-chance you're learning Southern Vietnamese, here are a few things you should keep in mind.

For an in-depth contrast of Vietnam's four main dialects, see:

- [Consonants, vowels and tones across Vietnamese dialects by Phạm, B. & McLeod, S. (2016)](https://pubmed.ncbi.nlm.nih.gov/27172848/) _(emphasis on consonants and vowels)_
- [The ups and downs of Vietnamese tones by Bauman, Blodgett, Rytting, and Shamoo (2009)](https://sealinguist.wordpress.com/wp-content/uploads/2015/04/tto_2118_e-5-3_the_ups_and_downs_of_vietnamese_tones_section2.pdf) _(emphasis on tones)_

### Hỏi and ngã have merged

Hỏi and ngã, the final two tones discussed above, are Vietnamese's two "dipping" tones. In Southern Vietnam, they have merged. Notice how, in the bottom-right chart below, the purple and brown lines are almost completely superimposed.

<img src="/assets/blog/migaku_northern_southern_vietnamese.jpeg" width="1244" height="538" alt="A comparison of the tonal inventory of Northern and Southern Vietnamese accents" />

> <CenteredText> _Diagram sourced from Bauman, Blodgett, Rytting, and Shamoo (2009)_</CenteredText><br>

### Pitch contour (the shape of your tone) is more important than phonation type (the type of voice you use)

If you're learning Southern Vietnamese, you can ignore all of the above notes about modal/creaky/breathy voice. While both pitch contour and voice quality matter in Northern Vietnamese, tones in Southern Vietnamese are differentiated purely by their pitch contour (their "melody").

### Vietnamese vowels and consonant clusters

This is beyond the scope of this article—for in-depth information, see Phạm, B. & McLeod, S. (2016), linked above.

From a very high-level perspective:

- TR is pronounced differently in each region, and neither sound exists in English. Both sounds are similar to T, but are made with the middle of your tongue against the roof of your mouth in Northern Vietnam (/c/ <custom-audio src="/assets/blog/Voiceless_palatal_plosive.ogg" :type="3"></custom-audio>) and with the tip of your tongue raised up and back toward the middle of the roof of your mouth in Southern Vietnam (/ʈ/ <custom-audio src="/assets/blog/Voiceless_retroflex_stop.ogg" :type="3"></custom-audio>)
- P at the beginning of a syllable is pronounced like a B in Southern Vietnam
- Q at the beginning of a word is pronounced like a K in Northern Vietnam but a W in Southern Vietnam
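- D and GI at the beginning of a word are pronounced like a Z in Northern Vietnam but a /j/ (the sound in the beginning of "yes") in Southern Vietnam; V is pronounced like an English V in Northern Vietnam but is often also pronounced /j/ in Southern Vietnam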
- S is pronounced as you'd expect in Northern Vietnam, but with the tip of your tongue angled up and backwards in Southern Vietnam (/ʂ/ <custom-audio src="/assets/blog/Voiceless_retroflex_sibilant.ogg" :type="3"></custom-audio>)
- R can be pronounced in multiple ways in both Northern and Southern Vietnam; it's too much to summarize here, so make a point to listen out for it as you consume Vietnamese media
- The diphthongs iê and yê are pronounced as monophthongs like the EE in "feet" in Southern Vietnam, whereas both the i and the ê get pronounced in Northern Vietnam (sounding like a more carefully articulated version of the "ye" in "yes")
- The diphthong uô is pronounced as a monophthong like the "oo" in "goose" in Southern Vietnam, whereas each vowel is pronounced in Northern Vietnam (oow-aww); the same goes for ươ, but the vowel quality is slightly different (you'll hear it)

This isn't a complete list of all the changes; it's just the things that I felt could be summarized in a bullet point.

---

## A big table of tone pairs

Tones do not occur in isolation in Vietnamese: they always occur next to other tones. This is important for two reasons:

- If you can learn to recognize and produce the following tone pairs, you'll be able to pronounce the tones in any word
- Some tones undergo small changes when they come before or after another tone; this is subtle and we won't cover it here, but be aware of it, and listen out for it as your Vietnamese improves (_the technical terms for this are [progressive and regressive assimilation](https://en.wikipedia.org/wiki/Assimilation_(phonology))_)

As you progress from left to right in this table, the tone of the second syllable will change; as you progress from top to bottom, the tone of the first syllable will change.

| Tone<br>combo | **o**                                                                                                     | **ò**                                                                                                       | **ó**                                                                                                  | **ọ**                                                                                                       | **ỏ**                                                                                                          | **õ**                                                                                                   |
| ------------- | --------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
| **o**         | **con ong** <custom-audio src="/assets/blog/vi_con_ong.mp3" :type="3"></custom-audio> <br> bee            | **ban đầu** <custom-audio src="/assets/blog/vi-ban đầu.mp3" :type="3"></custom-audio> <br> initially        | **ca hát** <custom-audio src="/assets/blog/vi-ca hát.mp3" :type="3"></custom-audio> <br> to sing       | **đi bộ** <custom-audio src="/assets/blog/vi_đi_bộ.mp3" :type="3"></custom-audio> <br>to walk               | **em nhỏ** <custom-audio src="/assets/blog/vi-em nhỏ.mp3" :type="3"></custom-audio> <br>little child           | **lơ đễnh** <custom-audio src="/assets/blog/vi-lơ đễnh.mp3" :type="3"></custom-audio> <br>absent-minded |
| **ò**         | **mười năm** <custom-audio src="/assets/blog/vi-mười năm.mp3" :type="3"></custom-audio> <br> ten years    | **nhà hàng** <custom-audio src="/assets/blog/vi_nhà_hàng.mp3" :type="3"></custom-audio> <br>restaurant      | **quần áo** <custom-audio src="/assets/blog/vi_quần_áo.mp3" :type="3"></custom-audio> <br>clothing     | **trường học** <custom-audio src="/assets/blog/vi_trường_học.mp3" :type="3"></custom-audio> <br>school      | **tiền lẻ** <custom-audio src="/assets/blog/vi_tiền_lẻ.mp3" :type="3"></custom-audio> <br>coins/change         | **hà mã** <custom-audio src="/assets/blog/vi-hà mã.mp3" :type="3"></custom-audio> <br> hippo            |
| **ó**         | **phóng viên** <custom-audio src="/assets/blog/vi_phóng_viên.mp3" :type="3"></custom-audio> <br> reporter | **bánh mì** <custom-audio src="/assets/blog/vi_bánh_mì.mp3" :type="3"></custom-audio> <br>bread             | **máy tính** <custom-audio src="/assets/blog/vi_máy_tính.mp3" :type="3"></custom-audio> <br>calculator | **rát họng** <custom-audio src="/assets/blog/vi-rát họng.mp3" :type="3"></custom-audio> <br>sore throat     | **xét hỏi** <custom-audio src="/assets/blog/vi-xét hỏi.mp3" :type="3"></custom-audio> <br>to interrogate       | **bác sĩ** <custom-audio src="/assets/blog/vi_bác_sĩ.mp3" :type="3"></custom-audio> <br>doctor          |
| **ọ**         | **học sinh** <custom-audio src="/assets/blog/vi_học_sinh.mp3" :type="3"></custom-audio> <br> student      | **thịt gà** <custom-audio src="/assets/blog/vi_thịt_gà.mp3" :type="3"></custom-audio> <br>chicken (as food) | **Phật giáo** <custom-audio src="/assets/blog/vi_phật_giáo.mp3" :type="3"></custom-audio> <br>Buddhism | **hoạt động** <custom-audio src="/assets/blog/vi-hoạt động.mp3" :type="3"></custom-audio> <br>activity      | **địa điểm** <custom-audio src="/assets/blog/vi_địa_điểm.mp3" :type="3"></custom-audio> <br>location           | **rực rỡ** <custom-audio src="/assets/blog/vi-rực rỡ.mp3" :type="3"></custom-audio> <br>bright/radiant  |
| **ỏ**         | **hỏi han** <custom-audio src="/assets/blog/vi-hỏi han.mp3" :type="3"></custom-audio> <br> to inquire     | **cửa hàng** <custom-audio src="/assets/blog/vi_cửa_hàng.mp3" :type="3"></custom-audio> <br>shop/store      | **cảnh sát** <custom-audio src="/assets/blog/vi_cảnh_sát.mp3" :type="3"></custom-audio> <br>policeman  | **nghỉ bệnh** <custom-audio src="/assets/blog/vi-nghỉ bệnh.mp3" :type="3"></custom-audio> <br>sick leave    | **thỉnh thoảng** <custom-audio src="/assets/blog/_vi_thỉnh_thoảng.mp3" :type="3"></custom-audio> <br>sometimes | **giải mã** <custom-audio src="/assets/blog/vi-giải mã.mp3" :type="3"></custom-audio> <br>to decode     |
| **õ**         | **mỹ nhân** <custom-audio src="/assets/blog/vi_mỹ_nhân_(美人).mp3" :type="3"></custom-audio> <br> nymph   | **đỡ đần** <custom-audio src="/assets/blog/vi-đỡ đần.mp3" :type="3"></custom-audio> <br>to assist           | **bẫy cá** <custom-audio src="/assets/blog/vi-bẫy cá.mp3" :type="3"></custom-audio> <br>fish trap      | **nghĩa vụ** <custom-audio src="/assets/blog/vi-nghĩa vụ.mp3" :type="3"></custom-audio> <br>duty/obligation | **trễ nải** <custom-audio src="/assets/blog/vi-trễ nải.mp3" :type="3"></custom-audio> <br>tardy/delay          | **mỹ mãn** <custom-audio src="/assets/blog/vi-mỹ mãn.mp3" :type="3"></custom-audio> <br>perfect         |
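
If you're the programming type, you can work out which cell of the table a two-syllable word belongs to just by reading its tone marks. Here's a minimal Python sketch of that idea. It assumes the word is written in standard Vietnamese spelling with a space between syllables, and the names `TONE_MARKS`, `tone_of`, and `tone_pair` are made up for this illustration (they aren't part of any library). The trick is to split each letter into a base letter plus its accent marks (Unicode's NFD form) and then look for one of the five tone marks; a syllable with none of them carries the flat tone, ngang.

```python
import unicodedata
from typing import Tuple

# Combining accent marks that spell a tone in Vietnamese orthography.
# (Hypothetical helper table for this sketch, not from any library.)
TONE_MARKS = {
    "\u0300": "huyền",  # grave accent, as in ò
    "\u0301": "sắc",    # acute accent, as in ó
    "\u0323": "nặng",   # dot below, as in ọ
    "\u0309": "hỏi",    # hook above, as in ỏ
    "\u0303": "ngã",    # tilde, as in õ
}


def tone_of(syllable: str) -> str:
    """Return the name of the tone a single syllable is written with."""
    # NFD splits a letter like "ễ" into "e" + circumflex + tilde, so the
    # tone mark can be found separately from the vowel-quality marks.
    for char in unicodedata.normalize("NFD", syllable):
        if char in TONE_MARKS:
            return TONE_MARKS[char]
    return "ngang"  # no tone mark means the flat tone


def tone_pair(word: str) -> Tuple[str, ...]:
    """Return the tone of each space-separated syllable in a word."""
    return tuple(tone_of(syllable) for syllable in word.split())


print(tone_pair("trường học"))
```

Running this on _trường học_ gives huyền followed by nặng, which matches its spot in the **ò** row and **ọ** column above. Keep in mind that this only reads spelling: it tells you which tone a syllable is _written_ with, not how a particular speaker or dialect pronounces it.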

---

## How am I ever going to learn all of this?

First: _Yes_, you can learn how to pronounce Vietnamese's tones. It'll eventually be second nature and won't require thought or conscious effort on your part at all.

On the way there, though, you'll pass through several stages:

- Previously, you didn't know which tones Vietnamese had
- Now, you know them, but you likely can't reliably distinguish them
- Next, you'll be able to pick them out when you hear them, but won't be able to pronounce them well
- After that, you'll be able to make tones confidently when you can focus, but will make mistakes when stressed (like when having a conversation, for example)
- Eventually, you'll be able to produce consistently correct tones that Vietnamese people understand without effort, _but_ there'll still be little nuances you get wrong
- Perhaps, with a lot of listening, intentional practice, and (likely) some guidance from a professional, you'll be able to make the tones perfectly naturally

In fact, this same basic pipeline can be applied to learning pretty much any aspect of Vietnamese—or, indeed, learning anything about anything. It's a process!

> Right now, the most important thing you can do is listen to a lot of Vietnamese content—whether that's real people, YouTube, Netflix, podcasts, or anything that lets you hear Vietnamese as it's actually spoken.

Migaku facilitates this process by making text interactive—you can simply click on words you don't know to see definitions of what they mean. This makes it possible to make sense of Vietnamese media even if you aren't that good at Vietnamese yet.

<img src="/assets/blog/migaku_vietnamese_mobile_youtube.jpeg" width="1806" height="1256" alt="A screenshot of Migaku's app, showing how we make text in Vietnamese subtitles interactive" />

To get more out of the time you spend in Vietnamese, you can click that orange button in the top-right corner of the dictionary to make flashcards out of useful-looking words. We'll nudge you to review them periodically (this is called _[spaced repetition](/blog/language-fun/spaced-repetition-language-learning)_) and, gradually, these words will work their way into your long-term memory.

<img src="/assets/blog/migaku_vietnamese_mobile_mining.jpeg" width="1802" height="1254" alt="A screenshot of Migaku's app, showing how we allow you to make flashcards from content you consume" />

When you eventually begin interacting with native Vietnamese speakers, every conversation will give you immediate feedback. You'll either find that you spoke clearly enough to be understood—in which case, great!—_or_ you'll gradually identify trouble syllables that you need to pay a bit more attention to.

For now: immerse, enjoy, and improve!

---

## Phonology, articulation, or whatever this has been...

...Phew.

That was a lot. Even for me.

If you're feeling a bit overwhelmed right now—that's OK. This takes time, and it'll come with its fair share of challenges.

For now, just remember one thing:

> The way we _really_ learn languages is by interacting with them. If you consume Vietnamese media, and you understand at least some of the sentences and messages within it, you'll make progress. _Period_.

Now, go take a break.

Bookmark this page, come back to skim through the explanations from time to time, and try to see if the audio recordings have become any clearer to you.

Good luck!
