Intro

As language learners, we need to be able to talk about different levels of ability. It's what allows us to set concrete goals, measure how effective different methods are, and determine how qualified someone is to give advice. Yet, in my opinion, we still don’t have an adequate system for discussing language ability in a nuanced and technical way.

At this point, it's already a cliché to complain that the terms "fluency" and "native level" are too vague to be helpful. Everyone's definition of "fluency" is different, and not all native speakers are at the same level of ability. Systems such as CEFR or JALUP's "Levels" are big steps up, as they provide a full continuum of well-defined levels, rather than a simple binary. But these systems are still extremely limited in that they don't account for the multi-faceted nature of language ability. What if you can write academic articles in a language, but don't have enough verbal fluency to hold a conversation? What if you're a native speaker who never learned how to read? What if your overall linguistic competence exceeds that of most native speakers, but you speak with a thick foreign accent? These models can't account for such imbalances in the different aspects of ability. And because almost everyone is imbalanced in one way or another, this constitutes a big problem.

And this problem isn't easy to solve, as there is no obvious way of defining what counts as a separate "aspect" of language ability. For example, we could break "speaking ability" into "pronunciation," "grammar," and "word choice." We could also further divide "pronunciation" into "timbre" (pronunciation of vowels and consonants), "stress accent/pitch accent/tones," and "intonation." Or maybe "cadence" and "flow" should be given their own categories as well. And maybe how much "flow" someone can speak with depends on how familiar they are with a topic, so we might want "flow when speaking about a familiar topic" and "flow when speaking about an unfamiliar topic." Or do we need "flow when speaking about a semi-familiar topic" as well? As you can see, there is a virtually unlimited number of different categories that we could hypothetically slice and dice language ability into. The more distinct categories we create, the greater the detail with which we can describe language ability. But more distinct categories also mean a more complicated and convoluted model. If the model is too complicated, it will be confusing, cumbersome, and therefore effectively useless.

The problem is further complicated by the fact that many aspects of "language ability" exist on a continuum alongside skills that are seemingly unrelated to language. For example, it's hard to draw a clear line between "speaking ability" and "social skills," or "vocabulary" and "knowledge." (Is not knowing the difference between a "lunar eclipse" and a "solar eclipse" an issue of scientific knowledge, vocabulary, or both?) We also must take into account that judgments about linguistic ability are often inherently subjective. For example, let's say we decide that someone who strings words together in a creative and elegant way is "better" at speaking than the average person. How exactly do we measure how "creative" and "elegant" someone's speech is?

After taking all the above into consideration, I created what I call the "6-Point Model of Language Ability." In this model, language ability is first broken down into "spoken language ability" and "written language ability," and then each of these two categories is further divided into six distinct aspects of ability. Each aspect is rated on a one-to-ten scale, ten being the hypothetical peak of that aspect of ability.
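To make the structure concrete, here is a minimal sketch of how a profile under this model could be represented in code. The aspect names are taken from the Categories section of this article; the function name `validate_profile` and the choice to represent an unrated aspect as `None` are my own assumptions for illustration, not part of the model itself.

```python
from typing import Dict, Optional, Sequence

# Aspect names as listed under "Categories" in this article.
SPOKEN_ASPECTS = ("Comprehension", "Grammar", "Phrasing",
                  "Pronunciation", "Delivery", "Content")
WRITTEN_ASPECTS = ("Comprehension", "Grammar", "Phrasing",
                   "Handwriting", "Conventions", "Content")

MIN_SCORE, MAX_SCORE = 1, 10  # the one-to-ten scale described above


def validate_profile(scores: Dict[str, Optional[int]],
                     aspects: Sequence[str]) -> Dict[str, Optional[int]]:
    """Return a complete profile for the given aspects.

    Hypothetical helper: aspects missing from `scores` are stored as
    None, meaning "not rated" (an assumption, not part of the model).
    """
    unknown = set(scores) - set(aspects)
    if unknown:
        raise ValueError(f"unknown aspects: {sorted(unknown)}")
    profile: Dict[str, Optional[int]] = {}
    for aspect in aspects:
        score = scores.get(aspect)
        if score is not None and not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"{aspect} must be between {MIN_SCORE} and {MAX_SCORE}")
        profile[aspect] = score
    return profile


# Example with made-up scores; unlisted aspects stay unrated.
spoken = validate_profile({"Grammar": 6, "Delivery": 3}, SPOKEN_ASPECTS)
```

Keeping spoken and written profiles as two separate mappings mirrors the model's top-level split, and lets the same aspect name (such as Grammar) carry a different score in each.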

Considerations

Before I jump into the details of the different categories, let me touch on a few broader points about the model:

In order to be applicable to both native speakers and learners alike, the model is set up in such a way that the average native speaker won't receive a perfect score in any category. For the model to be comprehensive, a perfect score must signify the hypothetical peak of ability in that category, so it wouldn't make sense to base it on the level of the average speaker. For example, we would expect a professional voice actor to be able to give a much better vocal performance than someone with no vocal training, so if the model were calibrated so that the average person received a perfect score in Delivery, then the model wouldn't be able to quantify the difference in ability between a professional voice actor and the average person.

Each category of ability is composed of myriad smaller, more specific abilities (micro-abilities), which can each develop independently from one another. This means that rating a given aspect may not always be completely straightforward. An extreme example might be someone who is a phenomenal public speaker but profusely stutters in one-on-one conversations. How would we rate their Delivery? Although it may be tricky, in cases like these, try to average the major micro-abilities that comprise the category to come up with an overall score.

As a category of ability develops, the range of micro-abilities that comprise that category becomes broader. For example, at the lower levels, Phrasing develops along a relatively straightforward path (becoming increasingly able to use words and phrases in natural and idiomatic ways). But at the higher levels, when Phrasing becomes about using words in the most skillful or elegant ways possible, development branches off into countless specific domains of language, such as giving a formal speech, creating an engaging YouTube video, or being a standup comedian.

Development at these highest levels is mainly about truly excelling in the limited number of domains most relevant to the person. Even if other domains that aren't relevant to the person remain at a more average level, that shouldn't negatively affect their rating. This may seem to contradict what I said above, but the difference is whether a micro-ability is relevant to the person or not. Everyday conversation is a domain that is relevant to nearly everyone, so if someone can't speak to other people face-to-face, that would negatively affect their score. On the other hand, if a world-class motivational speaker couldn’t control their voice as well as a standup comedian, that wouldn't affect their score, as that specific micro-ability wouldn't be required by any of the domains relevant to the person.
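As a rough illustration of the averaging idea, the sketch below averages only the micro-abilities that are relevant to the person. The function name, the micro-ability labels, and every number in the example are hypothetical; the model does not prescribe a formula, so treat this as one possible reading rather than the official method.

```python
from typing import Dict, Set


def category_score(micro_scores: Dict[str, float], relevant: Set[str]) -> float:
    """Average the micro-ability scores that are relevant to the person.

    Hypothetical helper: micro-abilities outside `relevant` are ignored,
    so an irrelevant domain cannot pull the rating down.
    """
    used = [score for name, score in micro_scores.items() if name in relevant]
    if not used:
        raise ValueError("no relevant micro-abilities to average")
    return sum(used) / len(used)


# Made-up Delivery scores for the public speaker who stutters one-on-one.
delivery_micro = {
    "public speaking": 9,
    "one-on-one conversation": 3,
    "comedic voice control": 4,  # not required by this person's domains
}
relevant_domains = {"public speaking", "one-on-one conversation"}

delivery = category_score(delivery_micro, relevant_domains)  # (9 + 3) / 2
```

Everyday conversation stays in the relevant set because it is relevant to nearly everyone, which is why the stutter still lowers the result while the comedic voice control does not.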

Reaching the highest levels of development in a certain category may require development in other categories as a prerequisite. For example, a certain level of development in Pronunciation is necessary to reach the highest possible levels of Delivery.

Although the model breaks language ability down into various categories, that isn't meant to imply that each category is equally important, or that someone must be at a high level in every category to be considered a competent user of a language. The relative importance of each category will differ depending on the specific goals and priorities of the individual, and in some cases, certain categories may be given little to no importance at all.

Applying the model to specific individuals will require a significant amount of subjective judgment. This is inevitable, as there is no ultimate authority to refer to when thinking critically about language use, and much of what people find "skillful," "ideal," or "good" boils down to personal preference.

At the end of the day, it's just a model. By definition, a model is a representation of reality, so some amount of detail is inevitably going to be lost when compared to reality itself. The model is meant to be a pragmatic tool for thinking about language ability, not the end-all-be-all description of someone's capacity to use language.

I am sure that the model has many flaws that I was not able to foresee. I plan on continuing to update the model as these flaws are brought to my attention.

Rubric

In order to make applying the model as clear and straightforward as possible, I have created a detailed rubric which specifies the differences between each possible score in each of the categories. I made the rubric in Google Sheets for easy viewing and editing. And because the details of the rubric are subject to change as flaws in the model are discovered, I have decided to keep the rubric hosted on Google Sheets for the time being. This way, I don’t have to worry about keeping it up to date in multiple places.

The rubric can be found here.

Categories

Spoken Language

Comprehension

The extent to which one can understand messages communicated through vocal and nonverbal language. Nonverbal language includes things like gestures and body language, which native speakers regularly use to communicate ideas both directly and indirectly. Comprehension is the one aspect of language ability that isn't related to producing language. Comprehension can be thought of in terms of accuracy, ease, and range. Native speakers tend to understand very close to 100% of the kinds of language that they are regularly exposed to, so we could say that within the range of what they are comfortable with, their accuracy is close to 100%. Native speakers also usually find that understanding the kinds of language that they are comfortable with doesn't require any effort; what is heard is automatically and instantaneously understood. So, we could also say that native speakers can understand familiar kinds of language with great ease. Range refers to the spectrum of different language domains that can be understood, and is likely to differ significantly between native speakers. For example, someone who didn't graduate from high school may have a hard time understanding a high-level lecture by a verbose college professor.

Comprehension can also be thought of in terms of "phoneme-parsing ability" and "processing ability." "Phoneme-parsing ability" is the ability to determine which vowels and consonants (and by extension, which words) were actually spoken. For example, second language speakers often have a hard time making out what a fast-paced native speaker is saying, even though they might have no trouble understanding if they were given a written transcript of what was said. "Processing ability" is the ability to put together the words that are spoken in order to understand the message that is being communicated. In the example above, although the high school dropout would be able to clearly hear what words the verbose college professor was saying, they still wouldn’t be able to comprehend the message, due to not having a large enough passive vocabulary and knowledge of concepts. In this way, knowledge of concepts is directly related to Comprehension ability. Inferential ability (the ability to use what you do understand to fill in the gaps of what you don’t understand) can be considered part of "processing ability" as well.

Grammar

The extent to which the syntax used in speech is consistent with what is considered natural and correct by the majority of native speakers of the variety of the language that is being spoken. Grammar is something that second language speakers tend to struggle with much more than native speakers, but native speakers still slip up now and again. Native speakers will often make spontaneous grammatical errors due to getting tongue-tied in the midst of conversation. In such cases, they will usually be fully aware of their errors. Other times, native speakers will consistently break some generally accepted prescriptive grammar rules. For example, although "me and Matt" may be consistent with some native speakers' grammar, those speakers may look uneducated if they use it instead of "Matt and I" in formal situations.

Phrasing

The extent to which words, phrases, and expressions are used skillfully in speech. Phrasing is equally relevant to both native and second language speakers. At the lower end of the spectrum, Phrasing is about whether usage would be considered natural and correct by the majority of native speakers of the variety of the language spoken. For example, "he said a lie," although grammatically correct, would be considered unnatural by nearly all native English speakers. Around the middle range of the spectrum, Phrasing becomes about using language that is appropriately tailored to the situation, such as speaking with the right amount of formality in a business meeting. At the high end of the spectrum, Phrasing becomes about using language in exceptionally creative, clever, entertaining, elegant, or poetic ways. At the highest levels, Phrasing transitions into art. And as with all forms of art, value judgements become inherently subjective. At the lower end of the spectrum, distinguishing between Phrasing and Grammar is straightforward: Grammar is accuracy with regard to syntax rules, and Phrasing is accuracy with regard to word usage. "He told a lies" is an error in Grammar, whereas "He said a lie" is an error in Phrasing. But a high level of development in Grammar is practically a prerequisite for reaching the higher end of the Phrasing spectrum, so making a strict distinction between the two can become difficult.

Pronunciation

The extent to which the timbre (vowels and consonants), pitch accent/stress accent/tones, and intonation of speech match what is considered to be natural and correct by the majority of native speakers of the variety of the language spoken. Said another way, when reading a text out loud, how close does the person sound to a native speaker? Because Pronunciation is mainly a matter of not deviating from native speakers, it is most relevant to second language speakers. That said, many native speakers will still mispronounce words from time to time. For example, let's say a native speaker learns a word from reading, incorrectly guesses the spoken pronunciation, and then uses that mispronunciation in speech. This could be an entirely idiosyncratic deviation from all other native speakers of the language. Native speakers with speech impediments may struggle with pronunciation as well.

If multiple varieties of the language exist (for example, American English, British English, etc.), scores should be based on the amount of deviation from the specific variety of the language that the person’s speech most closely resembles. Speakers who are a perfect mix of multiple varieties of a language may be difficult to score.

Delivery

The extent to which a speaker's verbal performance and body language enhance, or detract from, the experience of listening to that speaker. Delivery is equally relevant to both native and second language speakers. Some examples of qualities that would detract from a speaker's Delivery are stuttering, overuse of filler words, mumbling, hesitation, awkward or unnatural body language, a sense of being strained or tensed up, speaking too fast or too slow, and constantly restarting sentences mid-way through. Some examples of qualities that would enhance a speaker's Delivery are an engaging and easy-to-understand cadence, an ideal amount of enunciation, an appropriate amount of confidence, and a sense of flow or effortlessness.

Delivery is independent from Pronunciation, Grammar, Phrasing, and Content. If two native speakers read the same text out loud, and one speaker was easier to understand and more pleasant to listen to than the other, that would be a difference in Delivery. Thought of another way, it is possible for a second language speaker to have poor Pronunciation, Grammar, and Phrasing, yet still have great Delivery.

Content

The extent to which messages verbally and nonverbally communicated are skillful. Whereas Phrasing is about words, Content is about ideas. Content is the most subjective category of language ability, as well as the category that most greatly diverges from what is traditionally thought of as "language ability." Content is equally relevant to both native and second language speakers.

At the lower end of the spectrum, Content is about assimilating into a linguistically bounded group of people through following cultural and societal norms. For example, in Japanese culture, accepting compliments is highly frowned upon; when complimented, playing down or denying the compliment is considered to be the appropriate reaction. Consistently complying with all these sorts of cultural norms (with the exception of intentionally breaking one in order to make a point) puts one at least at the mid-range of Content-related ability. Development in this area will often have a greater influence on how a second language speaker is perceived by native speakers than improvements in the more technical aspects of language ability.

At the high end of the spectrum, Content becomes about conveying messages that strongly influence other people. Some examples might be making people laugh, educating or informing, or providing encouragement or inspiration. The influence that a speaker exerts doesn't necessarily have to be positive, but it does have to be intentional. For example, accidentally making someone laugh would not reflect a high level of development in Content.

This high end of the Content spectrum covers an extremely broad range and is decidedly subjective. But language is a tool to interact and communicate with other people, and so the ability to produce language that affects listeners in a desired way clearly must be taken into account. And because the actual Content that is communicated is undeniably the largest deciding factor in how language will affect someone, Content must be included in a comprehensive model of language ability.

Written Language

Comprehension

The extent to which one can understand messages communicated through written language. Similar to Comprehension in spoken language, except "phoneme-parsing ability" (the ability to determine which vowels and consonants were actually spoken) is not relevant, so only "processing ability" (the ability to put together words in order to understand the message that is being communicated) is required.

In written language, Comprehension can be thought of in terms of accuracy, speed, ease, and range. Native users of a language tend to understand very close to 100% of the language domains they regularly read in, so we could say that within the range of what they are comfortable with, native users’ accuracy is close to 100%. Native users also usually find that they can read in the language domains they are most comfortable in relatively quickly, and without a large amount of effort. Thus, we could also say that native users can read familiar kinds of language with a high level of speed and ease. Range refers to the spectrum of different language domains that can be understood, and is likely to differ significantly between native users. For example, someone who didn't graduate from high school may have a hard time understanding a philosophy thesis. Knowledge of concepts, as well as inferential ability (the ability to use what you do understand to fill in the gaps of what you don’t understand), are directly related to Comprehension ability.

Grammar

The extent to which the syntax used in writing is consistent with what is considered natural and correct by the majority of native users of the variety of the language that is being written. Largely the same as Grammar in Spoken Language Ability. Grammar is something that second language users tend to struggle with much more than native users, but most native users still slip up now and again. Instead of making spontaneous syntax errors due to getting tongue-tied as in spoken language, when writing, native users often make spontaneous syntax errors due to typos, but are able to notice and self-correct these errors when re-reading what they wrote. Similar to spoken language, some native users will consistently break some generally accepted prescriptive grammar rules. For example, although "me and Matt" may be consistent with some native users' grammar, those users would look uneducated if they used that instead of "Matt and I" in formal writing.

Phrasing

The extent to which words, phrases, and expressions are used skillfully in writing. Largely the same as Phrasing in Spoken Language Ability. Phrasing is equally relevant to both native and second language users. At the lower end of the spectrum, Phrasing is about whether usage would be considered natural and correct by the majority of native users of the variety of the language written in. For example, "he said a lie," although grammatically correct, would be considered unnatural by nearly all native English users. Around the middle range of the spectrum, Phrasing becomes about using language that is appropriately tailored to the situation, such as writing with the appropriate amount of formality in a business email. At the high end of the spectrum, Phrasing becomes about using language in exceptionally creative, clever, entertaining, elegant, or poetic ways. At the highest levels, Phrasing transitions into art. And as with all forms of art, value judgements become inherently subjective. At the lower end of the spectrum, distinguishing between Phrasing and Grammar is straightforward: Grammar is accuracy with regard to syntax rules, and Phrasing is accuracy with regard to word usage. "He told a lies" is an error in Grammar, whereas "He said a lie" is an error in Phrasing. But a high level of development in Grammar is practically a prerequisite for reaching the higher end of the Phrasing spectrum, so making a strict distinction between the two can become difficult.

Handwriting

The extent to which handwriting is legible and aesthetically pleasing. This category only applies when using a pen or pencil to write language out by hand. Judgements about Handwriting are largely subjective. Although having good handwriting was valued by many cultures in the past, due to the advent and proliferation of typing and digital text, in recent years its perceived value has waned immensely. Many people do not place any value on Handwriting when thinking about the big picture of writing ability.

Conventions

The extent to which the Conventions agreed upon by native users of the variety of the language written are adhered to. "Conventions" include things like spelling and punctuation use. I have also placed the frequency of typos into this category. Technology has made it much easier to get by without having a detailed knowledge of Conventions. Judgements about competence in Conventions should be based on one's level when not relying on technology. For example, kanji knowledge in Japanese should be based on the ability to recall characters from memory. Ability related to reading and understanding kanji would fall under Comprehension. Because technology is increasingly replacing knowledge of Conventions for many people, and Conventions are largely independent from the rest of Written Language Ability, many people may not place a lot of emphasis on Conventions when thinking about the big picture of writing ability.

Content

The extent to which messages communicated through writing are skillful. Largely the same as Content in Spoken Language Ability. Whereas Phrasing is about words, Content is about ideas. Content is the most subjective aspect of language ability, as well as the aspect that most greatly diverges from what is traditionally thought of as "language ability." Content is equally relevant to both native and second language users. At the lower end of the spectrum, Content is about assimilating into a linguistically bounded group of people through following cultural and societal norms. At the high end of the spectrum, Content becomes about conveying messages that strongly influence other people. Some examples might be entertaining, educating, inspiring, or motivating people. The influence that a user exerts doesn't necessarily have to be positive, but it does have to be intentional. This high end of the Content spectrum covers an extremely broad range and is decidedly subjective. But language is a tool to interact and communicate with other people, and so the ability to produce language that affects readers in a desired way clearly must be taken into account. And because the actual Content that is communicated is undeniably the largest deciding factor in how language will affect someone, Content must be included in a comprehensive model of language ability.

Application

Defining Fluency

So, now that we have a nuanced working model of language ability, how are we going to define fluency? Of course, we can’t hope for a be-all-end-all definition, as the word means something different to everyone. But in the context of MIA, the minimum criteria to be considered “fluent” are as follows:

Spoken Language:

Written Language:

In the charts above, each rung represents “two” of the ten possible points (i.e., Grammar is “six”). You can make your own ability graphs with this tool. Here is a chart that compares fluency (blue) to the minimum amount of ability required to reliably get confused for a native speaker (purple):

Although I would argue that you could still consistently get confused for a native speaker with only a “2” in Delivery, because this low of a level would significantly get in the way of fluent communication, I made “3” the minimum requirement for spoken language fluency. On the other hand, although I didn’t make development in Handwriting a requirement for written language fluency (as it is largely irrelevant in modern times), I would argue that nearly all natives could score at least “4” in this category.

Below you will find examples of the model applied to second language speakers of English and Japanese at various levels. The full extent of a speaker’s Comprehension and Content can’t be accurately judged from a single video, so I have left these categories blank. Please note that all scores are based on my own subjective opinion, which I formed after watching the video in question. These scores are for strictly educational purposes and are not meant to be taken as criticism of the speakers in question.

Japanese:

English: