Learning in the LLM Era
The original Chinese version of this article is here.
Since the meteoric debut of ChatGPT in the second half of 2022, we have moved from initial curiosity and anticipation to an era of fierce competition among tech giants and ubiquitous productivity tools. LLMs and VLMs have repeatedly shattered our paradigms, taking more and more tasks off our cognitive plate. While we fixate on how AI is altering our workplace dynamics, a much larger question has quietly surfaced: in the age of LLMs, when the answers to almost everything are at our fingertips, how and what should we learn? What should the next generation of education look like? Now that I have a child of my own, this question has become acute for me. Below, I use an imaginary conversation as a thought experiment.
Sometime in 2040, he angrily asks me, what is the point of learning all this boring algebra and formulas, reciting these antiquated poems and articles, and poring over obscure assembly code and algorithms? Exams shouldn’t even exist anymore—as long as we know how to use AI, that is enough! After his rant, he earnestly preaches his so-called “happy path”:
“It goes without saying that I’ll use AI for homework. As for exams, as long as I can handle them with AI, I can scrape by. I won’t be challenged at work either: just solve everything with AI. Tokens are so cheap now, and I can generate enough electricity riding my bike to cover the costs…”
I suddenly realize something and cut him off:
“Okay, let’s say you sail through everything smoothly, you’re incredibly gifted at using AI, and your prompts are just better than everyone else’s. As a result, you become a renowned figure in your field. Now, you’re invited to give speeches, attend lectures, and debate with others in a setting where real-time AI usage is completely impossible. How do you handle that?”
“Well, I’d obviously use AI to prepare my speech, help me familiarize myself with the topic, and get ready. Wouldn’t that work?”
“But how exactly do you prepare? By reading the answers in advance? Even if the organizers give you the context beforehand, you can’t accurately predict or control what kind of challenges the other guests or the audience will throw at you.”
“Haha, then I’ll just have it prepare a bit more, and I’ll memorize it all to get familiar.”
I interrupt him like I’ve grabbed a lifeline:
“Haha, you fell right into my trap. Think about it: if you can familiarize yourself with AI-prepared materials to the point where you can fluidly converse and debate with others on the spot, how is that any different from mastering the content through your own learning? Furthermore, could you really achieve that level of mastery just by rote memorization of what AI prepared for you? If this is the ultimate scenario, why not start learning it now so it will be easier when the time comes?”
My son is completely caught off guard by my response, utterly speechless.
Part I: The Brain is the Bottleneck
This thought experiment interrogates the essence of the problem: What exactly is learning, and where do the limits of AI’s capabilities lie when assisting humans?
As someone who uses AI heavily in daily life, I can clearly sense that the bottleneck in problem-solving lies within my own brain, not the AI tool. When a tool’s capabilities exceed the boundaries of your brain’s processing capacity, the tool effectively sidelines you. For an individual user, this means the tool morphs from your assistant into your master, leading you into a world you fundamentally do not understand. Imagine the frustration of Li Hongzhang [a late Qing dynasty diplomat]: The foreigners are saying this and that, demanding this and that, so in the end, you just need to sign here; but I really don’t understand what they are saying! Fine, I’ll sign it. Next page.
For a team, this means entering a mode of telepathic communication where each teammate’s AI starts talking directly to the others, and the humans are reduced to messengers: What did you change in this MR (merge request)? AI wrote a document. Okay, let me take a look, but I don’t understand it, so I’ll let my AI read it too. Here are its suggestions. Of course, if you believe that future AI will be powerful enough that you can trust it with your eyes closed, you can ignore this section.
What is the crux of the problem? The brain is the bottleneck. Since we know where the bottleneck is, we must eliminate it. This is the first entry point to answering how to learn in the LLM era. We need to de-bottleneck our brains. That means improving thinking efficiency, processing speed, and cognitive load capacity.
You have probably noticed that your brain navigates familiar things effortlessly: a song you can sing, a text you’ve memorized, a subject you excel in. It absorbs familiar information without resistance, even with a sense of pleasure. But with unfamiliar things or cross-disciplinary communication, like listening to post-punk for the first time or reading a paper outside your field, your brain might start to shut down within minutes. It manifests as: What the hell is this awful noise? I know every single word written here, but what do they mean when put together? Under such high cognitive load, the brain’s processing speed drops significantly. Even if you brute-force your way through one article, moving on to the next might make you feel like your head is going to explode, making it impossible to concentrate. This is where the performance of the “brain processor” is tested. People who regularly persist in deep thinking have trained their anterior cingulate cortex against this cognitive resistance. Under full load, they can maintain high-speed operation with astonishing willpower, outlasting those whose attention has already begun to scatter. This can also be seen as a manifestation of “mental stamina.” And this process of deep thinking is exactly what we call “learning.” In other words, learning in the LLM era is driven not by memorizing or reciting a certain amount of content, but by increasing one’s own cognitive load capacity. This involves two aspects:
• Internal Capacity Building: Breaking through cognitive resistance to train your brain to process more content faster. “Mental gymnastics” like spatial geometry, linear algebra, and physics, combined with subjects that require summarization, integration, and mind-mapping like biochemistry, languages, and history, can greatly enhance this capability—both in terms of receptiveness and system construction. Make no mistake: taking fragmented pieces of information you’ve seen, creating a mental map, and internalizing it from a surface-level blueprint into your own mind system consumes an immense amount of mental energy.
• External Familiarization: Turning as many unfamiliar things as possible into familiar ones. This has two benefits: first, the transition from unfamiliarity to familiarity inherently involves breaking boundaries and piercing through cognitive resistance; second, once something is familiar, the cognitive load it places on the brain naturally decreases.
After long-term training on these two points, you will find that reading massive blocks of text and plans spit out by an AI agent is no longer strenuous. You can read through all of it effortlessly over a sustained period and engage in meaningful interaction with it. Put bluntly, to use AI well, you have to catch up to AI’s level, and the only way to catch up is through learning.
Part II: The Solution Space and AI’s Technical Limits
Based on a thought experiment, the first part took a bottom-up approach and reasoned backward from the outcome to reach a conclusion. In the second part, I will briefly explain from a technical perspective why AI cannot replace humans, even in text generation—the domain where AI excels the most.
Let’s start with another experiment: You ask today’s most advanced AI video model to generate a Hollywood-level clip of an alien invasion on Earth. Let’s say, 2 minutes long. It works hard and produces it: various camera cuts, special effects, a sense of breathing, a doomsday vibe, epic sci-fi aesthetics, massive scale, intense plot. You hold your breath watching it, deeply shocked, getting goosebumps. Wow, AI is too powerful.
Okay, now ask it to generate another video: A piece of white paper with a single line. At one end of the line is a triangle; it tumbles and rolls to the other end, slowly morphing into a regular pentagon along the way. Nothing else, just the simplest lines, like an old Flash animation.
Suddenly, it malfunctions. The most basic Flash-style animation, just a few lines, but no matter how much you tweak it, it just can’t nail the effect. What happened?
No one has actually seen an alien invasion, so the model can let its imagination run wild. Scavenging and piecing things together from its massive training data will generally yield a satisfying result. But faced with the simplest, yet highly specific requirement, it freezes up. Why? Because the solution space is too small. A generative model produces output by sampling from a broad learned distribution; when only a tiny, almost singular region of that distribution counts as correct, a sample almost never lands in it.
The same principle applies to text. If you have completely thought out exactly what you want to say, then no matter how you explain it to the AI, it will not generate it identically. The only feasible way is to feed your entire cause-and-effect thought process to the AI as context; even then, there’s no guarantee it will generate the exact same phrasing. And why bother? Isn’t it better to just write it yourself? It’s like wanting to shoot a movie for which you already have a perfectly concrete vision in your head, down to exactly which props are placed where. In this case, no matter how you describe it, AI cannot generate what you imagine; you’d have to give it a reference photo first. As for those “dream generators” that look so cool: that’s only because you don’t actually remember the exact details of your dream, which gives the AI freedom to improvise. In other words, if you think the stuff AI produces is mind-blowingly good, it’s either because you didn’t really know what you wanted in the first place (“just wing it”), or because you’re doing highly context-dependent grunt work (in simple terms, corporate PPT slave labor). As for that high-value idea in your mind, the one exact token combination whose probability is effectively P=0, AI simply cannot generate it. If it can, it means your context was too simple, so simple that you could have just told it directly.
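To put a rough number on the “P=0” point, here is a minimal Python sketch. It is my own back-of-the-envelope illustration, not a measurement of any real model: it assumes, hypothetically, that the model assigns the same probability `per_token_prob` to each token of the exact text you have in mind, and that tokens are independent. The function name and the values 0.5, 10, 100, and 1000 are made up for illustration.

```python
import math


def exact_match_log10_prob(per_token_prob: float, num_tokens: int) -> float:
    """Return log10 of the chance of sampling one specific token sequence.

    Assumption (illustrative only): every token of the target text is
    sampled independently with the same probability `per_token_prob`,
    so the sequence probability is per_token_prob ** num_tokens.
    Working in log10 avoids floating-point underflow for long texts.
    """
    if not 0.0 < per_token_prob <= 1.0:
        raise ValueError("per_token_prob must be in (0, 1]")
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return num_tokens * math.log10(per_token_prob)


if __name__ == "__main__":
    # Even a generous 50% chance per token collapses quickly with length.
    for n in (10, 100, 1000):
        exponent = exact_match_log10_prob(0.5, n)
        print(f"{n:>5} tokens: probability is about 10^{exponent:.1f}")
```

Under these assumptions, a 1,000-token passage at 50% per token has a probability of roughly 10^-301 of coming out word for word. That is not literally zero, but it is close enough that "P=0" is a fair shorthand: the longer and more specific the target, the less room the sampler has to hit it.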
Therefore, truly brilliant ideas, high-value writing, and genuine creativity are things AI cannot do. What it can do is summarize, synthesize, and use the massive data it has seen to fill in the gaps of imagination. The grunt work of scouring literature is exactly the kind of scenario where AI can save us mental energy. As for what you ultimately do with the summarized results and that saved energy, well, that depends on your own capability. This is where learning helps you add that final, masterful finishing touch.
Part III: Experiencing the Winding Path
In the third part, I will discuss why we need learning from an experiential point of view. Since the advent of AI, knowledge has become far too accessible. There is no longer a process of waiting and excavating; everything can be instantly gratified. Perfectly summarized frameworks and details can be spoon-fed to you immediately. Learning seems to have become a highly accelerated process; work and life now prioritize efficiency above all else—if you aren’t fast, someone else will be faster. If there’s something you don’t understand, a large model will instantly unfold all relevant content in that field, from shallow to deep, right before your eyes. It has everything you need.
But experientially, this is a downgrade. Over the course of these recent productivity revolutions, our sensory experiences have already been downgraded many times. It’s already hard for modern people to resonate with the poetic longing of “As the bright moon shines over the sea, from far away you share this moment with me.” Not to mention today’s “post-moderns” whose attention spans, ravaged by short videos, last barely 10 seconds. Or consider the delicate emotional experiences derived from food—like the poetic longing for water shield and perch, or the appreciation of the bony *Hilsa herring—which have long been drowned in a red sea of chili peppers in an era where “spicy” reigns supreme and everything can be barbecued. *(As an aside, I don’t think there’s much point in emphasizing the original flavor of Chinese cuisine to Americans. They grew up on hyper-processed foods or heavily flavored meals. Even if their taste buds could perceive it, their neural synapses wouldn’t process the taste of ingredients like cattail or fava beans; to them, it’s no different than tasting water).
The emergence of large models just pushes this experiential downgrade to the extreme, putting a shortcut in your brain. People are gradually forgetting what the processes of learning and thinking actually entail, let alone feeling their beauty. Learning itself is a process of following a guide inward. But with this “cut to the chase” approach, people have lost the patience for winding paths. The subtle, implicit beauty of a classic southern Chinese garden, created by the use of occlusion, becomes an obstacle—if it were popular, they’d smash all the walls down for you. This strips humanity of the ability to slowly uncover and appreciate the beauty of knowledge.
I remember taking the introductory Electrical Engineering course, ELEC 241, as an undergrad. The concepts were brought in one by one, unfolding slowly before my eyes like the scenes of a revolving lantern, walking me through this journey of learning. The surprise and charm of finally piecing together the complete picture is incomparable to, and unattainable through, non-systematic, short-term, high-dose exposure.
From this perspective, the true process of learning is: when others can’t explain it clearly, you help yourself and attempt to understand it. The direct answers provided by large models are just rote memorization—a forceful memoization of QKV without the journey.
The three parts above are my reflections on what learning is and how we should learn in the era of LLMs. As technology advances, perhaps some of these views will become outdated, but I will leave them stated here for now.