아리랑
the right goal, the wrong strategy, 7 minutes to fix it

Hearing a sound feels like knowing it, until your mouth proves otherwise.
Imagine you're at a concert. Everyone around you knows the words. You’ve been listening on repeat hoping that recognition will become production. But it hasn’t. You have 7 minutes before this matters.
This was me… After attempts to sing along, I realize repetition isn’t working. Not because the song is impossible. Because this alphabet, this time constraint, and this brain need a different approach.
I stop. Drop what isn’t working. Design an alternative instead. If you want a name for this, it is metacognitive flexibility.
I open my phone’s AI1 app to do something that no one had templated for me. I use it for just-in-time learning—to pick up a skill at the moment when it’s needed—for the concert where everyone knows the words. I don’t need to learn how to speak fluent Korean. I need to learn enough to sing along tonight.
what makes this possible
In two prompts, I cross three representational systems:
Hangul glyph-looking alphabet → mouth shape → English sound-alike
None of this I could have mapped from memory alone, so I design around that. I need the LLM to meet me where my memory has no foothold.
I use AI to commission something my brain needs, at this moment, with full knowledge of what it cannot deliver.
metacognitive moves
This is a lip-shape mapping challenge, not a translation one. The goal: match mouth geometry, not meaning. Vowels matter most; consonants where visible.
One thing to watch: the NEOM-EO-GAN-DA row. In the actual melody that phrase spans four distinct notes, so you have time for all four mouth positions. Don't rush it.
This table gives me what listening alone could not: the physical anchor of mouth position before asking my working memory to hold onto an unfamiliar sound.
Two things are at play here:
I break an impossible whole (sing in Korean) into two manageable parts — mouth mechanics first, in the shapes column; phonetic mnemonic second, in the English approximation column. Cognitive scientists call this task decomposition.
Each sound gets hooked onto an English word my brain already owns. A familiar mouth position, a recognizable phrase. The load drops; the production becomes possible. That is cognitive anchoring: latching new, unstable information onto memories that are already stable.
Here’s the misheard lyrics version. Each line reads as a real English phrase:
A few notes on the trade-offs: Our radio (A-RA-RI-YO) and narwhal burrito (NA-REUL BEO-RI-GO) are both phonetically tight and absurd enough to feel like genuine mishearing. Nemo’s gonna is solid on mouth shape and has that dangling unfinished quality authentic to misheard lyrics.
Weakest line: Shimmy dough, not gaze — oh is a stretch. It has that mid-word Korean syllable break that resists clean English phrasing. “Simply don’t, my gaze, oh” is an alternative if you want more grammatical sense at the cost of some phonetic looseness.
BAL-BYEONG-NAN-DA is genuinely hard — the byeong cluster has no natural English equivalent, so bawling sacrifices the palatal but preserves the emotional register, which is fitting given the song is about abandonment.
By flagging its trade-offs (e.g., the weakest line, its phonetic compromises, and the difference between phonetic fit and usability2), this LLM prevents a new illusion from forming. The one that narwhal burrito is somehow exact Korean.
The transparency gives me what auto-magical consumption never does: the output is mine to evaluate, edit, and own.
map of the goal
Owning the output starts before the output exists. It starts in how you ask.
The LLM did not figure out that I needed to overcome the lip-shape problem. I did. What I asked for was specific: make it look right from the outside, not sound right to a Korean speaker.
But conspicuous limitations—the trade-offs the model can surface—are a feature that the human has to request.
By default, the complexity is hidden. The model gives us narwhal burrito without flagging it. Pulling the reasoning out into the open is a move worth practicing. I didn't prompt "teach me Korean." I asked to map mouth shapes. The specificity of the ask is what made the trade-offs visible. Visibility is step one.
The stakes determine what you do with the limitations. Imagine you are about to meet your Korean future in-laws. The flagged weaknesses become your practice list. You spend another hour on BAL-BYEONG-NAN-DA and SIP-RI-DO MOT-GA-SEO. However, minutes before a concert, you belt out Shimmy dough, not gaze, oh with confidence, because you know exactly what to approximate and what to trust.
The brain was right. Practicing the strategy of asking for the trade-offs is where our agency lies.
steal this prompt structure
I need to [specific task] in [time/context]. I don't need [what you don't need]. Map me [the specific thing]. Flag where this falls short.
The last sentence changes what you own. Without it, the AI model delivers output. With it, the AI model delivers output and the underlying assumptions, which you need to evaluate the output. This makes the difference between consuming a result and being able to use, edit, and trust it.
the research behind this
The ideas here draw on cognitive load theory, metacognition research, and work on AI-human interaction. Here are papers worth going deeper on, depending on which thread interests you most:
Bjork E, Bjork RA. Making Things Hard on Yourself, But in a Good Way: Creating Desirable Difficulties to Enhance Learning. In: Psychology and the Real World: Essays Illustrating Fundamental Contributions to Society. Worth Publishers; 2011:56-64.
Efklides A. Metacognition: Defining Its Facets and Levels of Functioning in Relation to Self-Regulation and Co-Regulation. European Psychologist. 2008;13(4):277-287. doi:10.1027/1016-9040.13.4.277
Fleur DS, Bredeweg B, Van Den Bos W. Metacognition: Ideas and Insights from Neuro- and Educational Sciences. npj Sci Learn. 2021;6(1):13. doi:10.1038/s41539-021-00089-5
Rozenblit L, Keil F. The Misunderstood Limits of Folk Science: An Illusion of Explanatory Depth. Cognitive Science. 2002;26(5):521-562. doi:10.1207/s15516709cog2605_1
Schellaert W, Martínez-Plumed F, Vold K, et al. Your Prompt is My Command: On Assessing the Human-Centred Generality of Multimodal Models. JAIR. 2023;77:377-394. doi:10.1613/jair.1.14157
Sweller J, Van Merriënboer JJG, Paas F. Cognitive Architecture and Instructional Design: 20 Years Later. Educ Psychol Rev. 2019;31(2):261-292. doi:10.1007/s10648-019-09465-5
Weis PP, Wiese E. Know Your Cognitive Environment! Mental Models as Crucial Determinant of Offloading Preferences. Hum Factors. 2022;64(3):499-513. doi:10.1177/0018720820956861
I use the terms AI (artificial intelligence) and LLM (large language models) interchangeably here.
Best phonetic match and medium confidence can coexist. Narwhal burrito fits the mouth shapes well but the English phrase is absurd enough that the LLM hedges on whether it will serve me.







I love how self aware and funny this is. And a very interesting exercise to try out with AI!
I need the whole stadium to 🥳🥳