The Field of Education is Due For a Copernican Revolution

You’d think that teacher training programs would focus on the mechanics of learning, but instead they typically focus on ritualistic compliance. If we trained doctors like we do teachers, then we’d still be bloodletting. Teacher credentialing severely lacks rigor, and this lack of rigor leads to a massive loss in human potential. Students suffer for it, and it drives serious educators out of the profession. It attracts and supports the type of people who think it’s more important to practice sharing circles than to learn about the importance and implementation of spaced review. When you make it your mission to maximize student learning — including leveraging the learning-enhancing practice techniques that have been known, reproduced, and yet ignored by the education system for decades — you realize that there is a massive amount of human potential being left on the table. Students can be learning way, way, way more than they currently are.


CONTENTS

My Experience with Teacher Credentialing and Professional Development
It’s Not Just Me Who Thinks Teacher Training Lacks Rigor — It’s a Known Phenomenon in the Literature
Lots of People in Education Disagree with the Premise of Maximizing Learning
Why It’s Important to Take Learning Seriously, Especially in Math
Some Learning-Enhancing Practice Techniques So Replicable They Might As Well Be Laws of Physics
We’ve Known About These Techniques For Decades, So Why Aren’t They Being Taught/Used?
These Techniques Connect All The Way Down to Mechanics In The Brain
“The Future Is Already Here, It’s Just Not Very Evenly Distributed”
Further Reading

My Experience with Teacher Credentialing and Professional Development

Speaking as someone who had to suffer through a teacher credentialing program… it’s actually an anti-signal when someone references their teaching credential as a qualification to speak about how learning happens. Credentialing is centered around political ideology rather than the science of learning.

There exist learning strategies that have been scientifically shown to improve student learning, such as mastery learning, spaced repetition, retrieval practice (the testing effect), and varied practice (interleaving).

These learning strategies have been researched extensively since the early to mid 1900s, with key findings being successfully reproduced over and over again since then.

Yet, when I completed my teaching credential from 2020-21 and attended numerous professional developments (PDs) from 2019-23, not once did I hear any mention of these learning strategies!

Instead, the focus was 100% on diversity, equity, & inclusion: readings (and an essay) on hegemonic heteronormativity, anti-racism training, “sharing circle” training, and even a presentation on the gender unicorn, complete with an extraordinarily complex gender classification flowchart, just to name a few examples.

Forget the science of learning – even the most obvious practical skills that a teacher would need to exercise on a daily basis, such as managing a rowdy classroom, communicating with parents, holding students accountable for their work, and dealing with academic dishonesty, were not covered at all in either teacher credentialing or PD.

However, there was no shortage of pointless activities.

I vividly remember a virtual PD in which the first 30 minutes were spent going around the room of 30+ teachers, with the PD leader asking each teacher to describe the weather outside in their physical location and explain how their personal feelings that day related to the weather.

At another PD, there was an activity involving a circle of traffic cones, each cone with a different emoji taped to the top. The PD leader read through a list of words and asked the teachers to walk over to the cone that they felt matched their feelings in response to the word.

I wish that my experience with teacher credentialing was an edge case, a dysfunction that is not widespread. But if you look at the curricula of standard teacher credentialing programs, and even the schools of education within well-reputed universities, you’ll find the same phenomenon. The curricula are focused entirely on making education engaging, diverse, and unbiased, and there’s little to nothing about the science of learning.

This lack of rigor would not pass in other disciplines. Engineers are required to take plenty of rigorous courses on math and physics. Doctors are required to take plenty of rigorous courses on biology and chemistry. But educators are not required to take a single course on the science of learning, much less a rigorous one.

It’s Not Just Me Who Thinks Teacher Training Lacks Rigor — It’s a Known Phenomenon in the Literature

By the way, it’s not just me who’s pointing this out. There are numerous peer-reviewed academic studies about how the science of learning is missing from teacher education. Here are some entrypoints into the literature:

Teaching the science of learning (Weinstein, Madan, & Sumeracki, 2018)

*“The science of learning has made a considerable contribution to our understanding of effective teaching and learning strategies. However, few instructors outside of the field are privy to this research.

In particular, a review published 10 years ago identified a limited number of study techniques that have received solid evidence from multiple replications testing their effectiveness in and out of the classroom (Pashler et al., 2007).

A recent textbook analysis (Pomerance, Greenberg, & Walsh, 2016) took the six key learning strategies from this report by Pashler and colleagues, and found that very few teacher-training textbooks cover any of these six principles — and none cover them all, suggesting that these strategies are not systematically making their way into the classroom.

This is the case in spite of multiple recent academic (e.g., Dunlosky et al., 2013) and general audience (e.g., Dunlosky, 2013) publications about these strategies.”*

How learning happens: Seminal works in educational psychology and what they mean in practice (Kirschner & Hendrick, 2024, p. 275)

*“…[M]ost students, and also many or even most teachers, don’t have an accurate picture of the effectiveness of their study approach.

After more than a hundred years of research into learning and memory, there are a few things that we know about good and less good approaches. Since the turn of this century, people have been trying to figure out how to remember as much as possible, how to ensure that we forget as little as possible, and how to do this in as little time as possible.

The reason we have our doubts with respect to teachers is because the findings that have emerged from this research aren’t yet included in textbooks for teachers (both in research in the US, as well as in the Netherlands and Flanders; Pomerance, Greenberg, & Walsh, 2016; Surma, Vanhoyweghen, Camp, & Kirschner, 2018).”*

Unanswered questions about spaced interleaved mathematics practice (Rohrer & Hartwig, 2020)

“We fear, however, that continued advocacy might fall on deaf ears… [E]mpirical evidence is not highly valued by many of the educators who recommend learning methods and train teachers (e.g., Robinson, Levin, Thomas, Pituch, & Vaughn, 2007; Sylvester Dacy, Nihalani, Cestone, & Robinson, 2011). Against this backdrop, it might be difficult to inspire the kind of support for evidence-based interventions like those that sparked the dramatic improvements in Western medicine over the last century. Doing so, we believe, is the most pressing challenge facing learning scientists.”

Lots of People in Education Disagree with the Premise of Maximizing Learning

Ironically, the teaching profession seems to drive out the people who are most interested in optimizing student learning. That’s one thing I didn’t expect when I entered the profession: lots of people in education disagree with the premise of maximizing learning.

For instance, “testing” and “repetition” have become dirty words in education.

However, practice testing and distributed practice (also known as spaced repetition) are widely understood by researchers to be two of the most effective practice techniques.

Moreover, deliberate practice – which has been shown to be one of the most prominent underlying factors responsible for individual differences in performance, even among highly talented elite performers – is centered around using repetitious training activities to refine whatever skills move the needle most on a student’s overall performance.

What gives? Why are there debates about scientifically proven learning techniques like testing and repetition?

Because lots of people in education disagree with the premise of maximizing learning. The debates aren’t about whether testing and repetition are effective learning techniques – the debates are about whether education should seek to maximize students’ learning.

There are plenty of students who would prefer for their education to maximize other things like fun and entertainment while, as a secondary concern, meeting some low bar for shallowly learning some surface-level basic skills.

And there are plenty of teachers who are incentivized to use easy, fun, low-accountability, hard-to-measure practice techniques that keep students, parents, and administrators off their back. (Unfortunately, these practice techniques tend to be ineffective.)

Why It’s Important to Take Learning Seriously, Especially in Math

The subfield within education that seeks to maximize learning is known as “talent development.”

In talent development, the optimization problem is clear: an individual’s performance is to be maximized, so the methods used during practice are those that most efficiently convert effort into performance improvements.

Practitioners of talent development tend to be found in hierarchical skill domains like sports and music, where each advanced skill requires many simpler skills to be applied in complex ways. This is because it’s hard to climb up the skill hierarchy without intentionally trying to do so.

To learn an advanced skill, you must be able to comfortably execute its prerequisite skills, and the prerequisite skills underlying those, and so on. Getting to the point of comfortable execution on any skill takes lots of practice over time – and even after you get there, you have to continue practicing to maintain your ability.

None of this happens naturally. If you don’t carefully manage the process, then you struggle. Nobody gets to be really good at a sport or instrument without taking their talent development seriously and intentionally trying to maximize their learning.

Success in sports and music requires talent development. But most students aren’t expected to achieve a high level of success in sports or music, so they can get away with de-prioritizing talent development. If every student in gym class were expected to be able to do a backflip by the end of the year, things would have to change – but the expectations are so low that meeting them does not require talent development.

But when it comes to math, de-prioritizing talent development becomes problematic. Like sports and music, math is an extremely hierarchical skill domain, so achieving a high level of success requires a dedication to talent development. However, unlike sports and music, most students are expected to achieve a relatively high level of success in math: many years of courses increasing in difficulty, culminating in at least algebra, typically pre-calculus, often calculus, and sometimes even higher than that.

As a result, in math, de-prioritizing talent development leads to major issues. When students do the mathematical equivalent of playing kickball during class, and then are expected to do the mathematical equivalent of a backflip at the end of the year, it’s easy to see how struggle and general negative feelings can arise.

Some Learning-Enhancing Practice Techniques So Replicable They Might As Well Be Laws of Physics

So what do we know about the techniques for optimizing student learning? I mentioned a few above: mastery learning, spaced repetition, retrieval practice, and interleaving. But here’s a more thorough list with descriptions.

All of these learning-enhancing practice strategies have been tested scientifically, numerous times, and are completely replicable. They might as well be laws of physics.

I’ll start with a few obvious findings, but there are plenty of less obvious findings later down the list.

(If they’re obvious, why cover them? Because in education, obvious strategies often aren’t put into practice. For instance, plenty of classes still run on a pure lecture format and don’t review previously learned material unless it’s the day before a test.)

Anyway, here we go:

Mastery learning: a student will be much more likely to succeed in learning a new topic if they’ve mastered the prerequisites.
Active learning: actively solving problems produces more learning than passively watching a video/lecture or re-reading notes. (To be clear: active learning doesn’t mean that students never watch and listen. It just means that students are actively solving problems as soon as possible following a minimum effective dose of initial explanation, and they spend the vast majority of their time actively solving problems. Also note that active learning does not imply unguided learning or group work — active learning is most effective when all information to be learned is explicitly communicated and all active practice is performed with corrective feedback and guidance. Ideally, over the course of a learning session, students will complete numerous cycles rapidly alternating between minimum effective doses of guided instruction and active practice.)
Review: if you don’t review information, you forget it. You can actually model this precisely, mathematically, using a forgetting curve. I’m not exaggerating when I refer to these things as laws of physics — the only real difference is that we’ve gone up several levels of scale and are dealing with noisier stochastic processes (that also have noisier underlying variables).
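
To make the forgetting-curve idea concrete, here is a minimal sketch. It assumes the simplest textbook form, exponential decay R(t) = exp(-t / S), where S is a hypothetical "stability" parameter measured in days. Both the functional form and the numbers are illustrative assumptions, not fitted to any data discussed in this post.

```python
import math


def retention(days_since_review: float, stability: float) -> float:
    """Estimated recall probability under an ASSUMED exponential forgetting curve."""
    if stability <= 0:
        raise ValueError("stability must be positive")
    return math.exp(-days_since_review / stability)


def days_until(threshold: float, stability: float) -> float:
    """Days until estimated recall decays to `threshold` (strictly between 0 and 1)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    return -stability * math.log(threshold)


if __name__ == "__main__":
    # Hypothetical stability of 5 days, chosen only for illustration.
    for day in (0, 1, 3, 7, 14):
        print(day, round(retention(day, stability=5.0), 3))
```

Under this assumed model, doubling the stability doubles the time it takes for recall to fall to any given threshold, which is one simple way to picture why well-timed reviews let you wait longer before the next one.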

Here are some less obvious findings.

The spacing effect: more long-term retention occurs when you space out your practice, even if it’s the same amount of total practice.
A profound consequence of the spacing effect is that the more reviews are completed (with appropriate spacing), the longer the memory will be retained, and the longer one can wait until the next review is needed. This observation gives rise to a systematic method for reviewing previously-learned material called spaced repetition (or distributed practice). A “repetition” is a successful review at the appropriate time.
To maximize the amount by which your memory is extended when solving review problems, it’s necessary to avoid looking back at reference material unless you are totally stuck and cannot remember how to proceed. This is called the testing effect, also known as the retrieval practice effect: the best way to review material is to test yourself on it, that is, practice retrieving it from memory, unassisted.
The testing effect (retrieval practice effect) can be combined with spaced repetition to produce an even more potent learning technique known as spaced retrieval practice.
During review, it’s also best to spread minimal effective doses of practice across various skills. This is known as mixed practice or interleaving — it’s the opposite of “blocked” practice, which involves extensive consecutive repetition of a single skill. Blocked practice can give a false sense of mastery and fluency because it allows students to settle into a robotic rhythm of mindlessly applying one type of solution to one type of problem. Mixed practice, on the other hand, creates a “desirable difficulty” that promotes vastly superior retention and generalization, making it a more effective review strategy.
To free up mental processing power, it’s critical to practice low-level skills enough that they can be carried out without requiring conscious effort. This is known as automaticity. Think of a basketball player who is running, dribbling, and strategizing all at the same time — if they had to consciously manage every bounce and every stride, they’d be too overwhelmed to look around and strategize. The same is true in math. I wrote more about the importance of automaticity in a recent post here.
The most effective type of active learning is deliberate practice, which consists of individualized training activities specially chosen to improve specific aspects of a student’s performance through repetition (effortful repetition, not mindless repetition) and successive refinement. However, because deliberate practice requires intense effort focused in areas beyond one’s repertoire, which tends to be more effortful and less enjoyable, people will tend to avoid it, instead opting to ineffectively practice within their level of comfort (which is never a form of deliberate practice, no matter what activities are performed). I wrote more about deliberate practice here.
Instructional techniques that promote the most learning in experts, promote the least learning in beginners, and vice versa. This is known as the expertise reversal effect. An important consequence is that effective methods of practice for students typically should not emulate what experts do in the professional workplace (e.g., working in groups to solve open-ended problems). Beginners (i.e. students) learn most effectively through direct instruction. I wrote more about that here.
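
Two of the findings above, spaced repetition and interleaving, are mechanical enough to sketch in a few lines. The code below is a toy illustration under stated assumptions: the doubling rule (`growth=2.0`), the reset-to-one-day rule after a failed retrieval, and the names `ReviewState`, `update`, and `interleave` are all hypothetical choices made for this example, not a published scheduling algorithm.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass
class ReviewState:
    interval_days: float = 1.0  # how long to wait before the next review
    repetitions: int = 0        # successful, well-timed reviews so far


def update(state: ReviewState, recalled: bool, growth: float = 2.0) -> ReviewState:
    """Lengthen the interval after a successful unassisted retrieval; reset after a failure."""
    if recalled:
        return ReviewState(state.interval_days * growth, state.repetitions + 1)
    return ReviewState(1.0, 0)


def interleave(problems_by_skill: dict[str, list[str]]) -> list[str]:
    """Round-robin across skills so that no skill is practiced in one long block."""
    mixed: list[str] = []
    for round_ in zip_longest(*problems_by_skill.values()):
        mixed.extend(p for p in round_ if p is not None)
    return mixed


if __name__ == "__main__":
    state = ReviewState()
    for outcome in (True, True, False, True):
        state = update(state, outcome)
        print(outcome, state)
    print(interleave({"fractions": ["f1", "f2"], "exponents": ["e1", "e2", "e3"]}))
```

The growing interval mirrors the observation that each well-timed successful review lets you wait longer before the next one, and the round-robin ordering is the simplest possible contrast to blocked practice.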

There’s a lot more detail I want to include, along with hundreds of scientific references, but I’m going to skip it so as not to continue blowing up the length of this already-gigantic post. If you want to take the deep dive, here’s a draft I’m working on that covers all these findings (and more) with hundreds of references and relevant quotes pulled out of those references.

We’ve Known About These Techniques For Decades, So Why Aren’t They Being Taught/Used?

Now, this might seem like a lot of new information – a common reaction is “Wow, the field of education is experiencing a revolution!”

But here’s the thing: Most key findings have been known for many decades.

It’s just that they’re not widely known / circulated outside the niche fields of cognitive science & talent development, not even in seemingly adjacent fields like education.

These findings are not taught in school, and typically not even in credentialing programs for teachers themselves – no wonder they’re unheard of!

But if you just do a literature review on Google Scholar, all the research is right there – and it’s been around for many decades.

So why aren’t these key findings being leveraged in classrooms or even appearing in teacher credentialing / professional development? Why do they remain relatively unknown?

The biggest reason that I’m aware of:

Leveraging them (at all) requires additional effort from both teachers and students.

In some way or another, each strategy increases the intensity of effort required from students and/or instructors, and the extra effort is then converted into an outsized gain in learning.

This theme is so well-documented in the literature that it even has a catchy name: a practice condition that makes the task harder, slowing down the learning process yet improving recall and transfer, is known as a desirable difficulty.

Desirable difficulties make practice more representative of true assessment conditions. Consequently, it is easy for students (and their teachers) to vastly overestimate their knowledge if they do not leverage desirable difficulties during practice, a phenomenon known as the illusion of comprehension.

However, the typical teacher is incentivized to maximize the immediate performance and/or happiness of their students, which biases them against introducing desirable difficulties and incentivizes them to promote illusions of comprehension.

Using desirable difficulties exposes the reality that students didn’t actually learn as much as they (and their teachers) “felt” they did under less effortful conditions. This reality is inconvenient to students and teachers alike; therefore, it is common to simply believe the illusion of learning and avoid activities that might present evidence to the contrary.

These Techniques Connect All The Way Down to Mechanics In The Brain

The whole situation feels very similar to the Copernican Revolution. There are vested interests in keeping things the way they are, but enough evidence is compounding that it’s becoming impossible to ignore.

For instance, in addition to their effectiveness being so replicable that they might as well be laws of physics, the cognitive learning strategies discussed above actually connect all the way down to the mechanics of what’s going on in the brain.

The goal of learning is to increase the quantity, depth, retrievability, and generalizability of concepts and skills in your long-term memory (LTM). At a physical level, that amounts to creating strategic connections between neurons so that the brain can more easily, quickly, accurately, and reliably activate more intricate patterns of neurons. This process is known as consolidation.

Now, here’s the catch: before information can be consolidated into LTM, it has to pass through working memory (WM), which has severely limited capacity. The brain’s working memory capacity (WMC) represents the degree to which it can focus activation on relevant neural patterns and persistently maintain their simultaneous activation, a process known as rehearsal.

Most people can only hold about 7 digits (or more generally 4 chunks of coherently grouped items) simultaneously and only for about 20 seconds. And that assumes they don’t need to perform any mental manipulation of those items – if they do, then fewer items can be held due to competition for limited processing resources. (Note that this is an emergent behavior of a more complicated underlying mechanism: the actual WM limitation is not a fixed number of storage units, but rather, the ability to sustain relevant neural activity while suppressing interference from irrelevant activity.)

Limited capacity makes WMC a bottleneck in the transfer of information into LTM. When the cognitive load of a learning task exceeds a student’s WMC, the student experiences cognitive overload and is not able to complete the task. Even if a student does not experience full overload, a heavy load will decrease their performance and slow down their learning in a way that is NOT a desirable difficulty.

Additionally, different students have different WMC, and those with higher WMC are typically going to find it easier to “see the forest for the trees” by learning underlying rules as opposed to memorizing example-specific details. (This is unsurprising given that understanding large-scale patterns requires balancing many concepts simultaneously in WM.)

It’s expected that higher-WMC students will more quickly improve their performance on a learning task over the course of exposure, instruction, and practice on the task. However, once a student learns a task to a sufficient level of performance, the impact of WMC on task performance is diminished because the information processing that’s required to perform the task has been transferred into long-term memory, where it can be recalled by WM without increasing the actual load placed on WM.

So, for each concept or skill you want to teach:

it needs to be introduced after the prerequisites have been learned (so that the prerequisite knowledge can be pulled from long-term memory without taxing WM),
it needs to be broken down into bite-sized pieces small enough that no piece overloads any student’s WM, and
each student needs to be given enough practice to achieve mastery on each piece (and that amount of practice may vary depending on the particular student and the particular learning task).
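
The first requirement above (teach each topic only after its prerequisites) can be pictured as a graph problem. Here is a minimal sketch, assuming a made-up prerequisite map; the topic names, the `PREREQS` dictionary, and the helper functions are hypothetical and exist only for illustration. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: topic -> topics that must be mastered first.
PREREQS: dict[str, set[str]] = {
    "arithmetic": set(),
    "fractions": {"arithmetic"},
    "linear_equations": {"arithmetic"},
    "rational_expressions": {"fractions", "linear_equations"},
}


def teaching_order(prereqs: dict[str, set[str]]) -> list[str]:
    """One valid order in which every topic appears after all of its prerequisites."""
    return list(TopologicalSorter(prereqs).static_order())


def ready_to_learn(prereqs: dict[str, set[str]], mastered: set[str]) -> set[str]:
    """Topics not yet mastered whose prerequisites have all been mastered."""
    return {
        topic
        for topic, needs in prereqs.items()
        if topic not in mastered and needs <= mastered
    }


if __name__ == "__main__":
    print(teaching_order(PREREQS))
    print(ready_to_learn(PREREQS, mastered={"arithmetic"}))
```

The sketch covers only the ordering requirement. How finely to split a topic, and how much practice counts as mastery for a given student, are separate judgment calls that it does not model.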

But also, even if you do all the above perfectly, you still have to deal with forgetting. The representations in LTM gradually, over time, decay and become harder to retrieve if they are not used, resulting in forgetting.

The solution to forgetting is review – and not just passively re-ingesting information, but actively retrieving it, unassisted, from LTM. Each time you successfully actively retrieve fuzzy information from LTM, you physically refresh and deepen the corresponding neural representation in your brain. But that doesn’t happen if you just passively re-ingest the information through your senses instead of actively retrieving it from LTM.

“The Future Is Already Here, It’s Just Not Very Evenly Distributed”

William Gibson’s quote applies here. The science of learning has been applied in the classroom, with phenomenal results: 8th graders – who are NOT prodigies, but have talent on par with kids in a typical honors class at a typical school – earning a perfect 5 out of 5 on the AP Calculus BC exam (which is normally taken by honors 12th graders and can count for a year’s worth of calculus credit at most colleges).

This happened in Math Academy’s original school program. It’s the most accelerated math program in the USA, there have been plenty of other news articles written about it, and there’s plenty of other information straight from the horse’s mouth including a summary of events 2014-20 (from Sandy and Jason’s perspective), a summary of events 2019-23 (from my perspective, with a focus on teaching in the school program and getting the algorithms in place to turn it into a fully automated system), and another summary of events 2014-20 that I gave on Anna Stokke’s Chalk and Talk Podcast #42 (I’ll paste the relevant snippet below):

*“So back to this eighth graders taking AP Calc BC story. We originally started as a nonprofit school program founded by Jason and Sandy Roberts. One of their kids, Colby, was on the fourth-grade math field day team, and his parents were coaching that team. Their kid and his friends were all really excited about learning math, so they did the standard fourth-grade field day stuff. But the kids were so excited that they didn’t want to just stop at fourth grade. Something they would often ask Jason and Sandy was, “What’s the highest level of math?”

Jason and Sandy would have to say, “Well, it goes really, really high, but for your purposes, let’s just say it’s calculus, because that’s what seniors in high school take if they are on the honors track.” And the next question was, of course, “When do we get to learn it? Can we learn it now? Can we learn calculus tomorrow?” They were just so excited about it.

Jason and Sandy were teaching a bunch of these kids advanced math, even through fifth grade. They got up through a bunch of high school math and to the point where they could start learning calculus. One thing led to another, and this turned into an official school program that was not just a pullout class but became a daily Math Academy class. There were other cohorts that came in following years.

What this turned into was that we would get students in sixth grade who were solid on their arithmetic. They might know what a variable is, but they didn’t really know how to solve equations or anything. They were kind of at an early pre-algebra level. We would scaffold them up, teach them all of high school math within the next two years—sixth and seventh grade. Pre-algebra, algebra one, geometry, algebra two, and pre-calculus. In eighth grade, they’d be ready to take calculus.

Then, they would take the AP Calculus BC exam. We got to the point where most of the students who took the AP Calc BC exam in eighth grade passed, and most who passed got a perfect five out of five on the exam.

A couple of things I should say, these are not national talent search students.

How the kids were selected was that they scored at or above the 90th percentile on a middle school math placement exam, which is typically taken by all fifth graders in the district around February or March. They were then invited to join the program. It’s a seventh-grade math skills test, so it provides a somewhat high skill level, but it’s not designed to identify math aptitude.

This is also in the Pasadena Unified School District, where about two-thirds of the student population qualifies for the federal free and reduced lunch program, and about 44 percent of all K-12 students are educated in private schools, compared to the California average of 11%.

This is not a particularly talented group of students. It’s not a biased group of the top students in the nation. Just think of a standard school and kids in the standard honors class. They can be accelerated way, way, way higher than they currently are.

When Jason and Sandy were teaching, they were doing this all manually and achieving very good results. But these results got even better once students started working on the Math Academy system. Jason got tired of the kids saying, “I forgot to do my homework,” or “Oh, I forgot a pencil,” or all these excuses for not doing work. So, he just built a system where he could pick problems for them to do, and then all they had to do was log in at home and do the problems online.

It would automatically grade the problems and keep track of all the kids’ stats, keep track of the class accuracy, and various topics. Over time, this evolved into a system that did more and more of the teaching work.

In the summer of 2019, that’s when Jason pulled me in to make this system a fully automated platform that would actually select learning tasks for students. So, we built this automated task selection algorithm and continued refining it. By the time the pandemic hit in 2020, the big question was how to maintain this level of efficiency from manual instruction.

The answer was, “Well, we have this halfway baked task selection algorithm. Let’s just get it all in place over the summer and put the whole school program on it.” And that’s what we did. That’s how our AP Calc BC scores skyrocketed, from putting them on the system.”*

We really leaned into the automated system – it allowed us to leverage these learning-enhancing practice techniques to the fullest extent, selecting individualized learning tasks tailored to each student’s own knowledge profile, so that every student would always be working on the exact tasks that would move the needle most on their learning.
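
To give a feel for what "selecting individualized learning tasks" can mean mechanically, here is a deliberately tiny toy. It is not Math Academy's algorithm. The `StudentProfile` class, the `next_task` function, and the priority rule (overdue reviews first, otherwise a new lesson whose prerequisites are mastered) are hypothetical assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StudentProfile:
    mastered: set[str] = field(default_factory=set)
    # topic -> day number on which its next review falls due (hypothetical bookkeeping)
    review_due: dict[str, int] = field(default_factory=dict)


def next_task(
    profile: StudentProfile, prereqs: dict[str, set[str]], today: int
) -> Optional[tuple[str, str]]:
    """Return ("review", topic), ("lesson", topic), or None if nothing is available."""
    overdue = [t for t, due in profile.review_due.items() if due <= today]
    if overdue:
        # Most overdue review first; ties broken alphabetically for determinism.
        return ("review", min(overdue, key=lambda t: (profile.review_due[t], t)))
    ready = sorted(
        t
        for t, needs in prereqs.items()
        if t not in profile.mastered and needs <= profile.mastered
    )
    return ("lesson", ready[0]) if ready else None


if __name__ == "__main__":
    prereqs = {"arithmetic": set(), "fractions": {"arithmetic"}, "decimals": {"arithmetic"}}
    student = StudentProfile(mastered={"arithmetic"}, review_due={"arithmetic": 10})
    print(next_task(student, prereqs, today=5))   # no review due yet
    print(next_task(student, prereqs, today=12))  # review is overdue
```

Even in this toy, two students with different mastery sets and review dates get different tasks from the same content, which is the core idea behind individualizing practice.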

From my summary of events 2019-23:

*I got involved at the core of Math Academy’s software during the summer of 2019. At that time, the software had existed for several years as a tool that Math Academy instructors used to create and grade assignments — they would manually select problems from the database (which contained a mountain of content written by Math Academy’s team of PhD mathematicians), students would complete the problems online for homework, and the software would automatically grade the assignments and keep track of each student’s grades. But it was very time-consuming to manually choose a mix of problems that covered the multiple topics taught during class each day and also provided spaced review on previously-learned topics.

It was clear that while students were learning an incredible amount of math, giving the same assignment to each student in a class still left a lot of learning efficiency unrealized: even within a single class, different students had different strengths and weaknesses, had different sets of topics that they were ready to learn, and needed different amounts of practice on different topics to reach a sufficient level of mastery. Giving each student the same assignment virtually guaranteed that every student wasted lots of time being bored or lost — either way, not actually learning. To solve this problem, different students would need to learn different topics at different times, and get different amounts of practice (including review) on those topics, and each student’s learning plan would need to continually adapt to their individual performance.

The need for fully individualized learning, as well as other needs (e.g., financial sustainability and the constant effort to maintain accountability & standards across multiple classes / teachers / schools), led Jason and Sandy to the realization that the only way forward was for the system to become a fully-automated standalone online learning platform, commercially available to the general public. Aware of my background, Jason asked me to develop an algorithm that would automatically assign personalized learning tasks (personalized to each student’s individual knowledge profile) while leveraging effective learning techniques like mastery learning, spaced repetition, and interleaving.

By the end of summer we had an implementation that, despite being very rough, brittle, and in many ways incomplete, was good enough to test out with a real student. During the 2019-20 school year, we started out with a single independent learner, a student who was previously in Math Academy and had moved to another state. She learned AP Calculus BC using only the system (i.e. no external help, no human intervention) and got a 5 on the AP exam. This proved the concept that we could upgrade the software from a manual assignment creation tool to a fully-automated adaptive learning system supporting independent learners without any human intervention.

A major transition point happened in Spring 2020, with the onset of the COVID-19 pandemic. I moved in with Jason and his family to quarantine, which led to a makeshift startup incubator experience run out of his living room. We worked every waking hour, well into the night, every day including weekends, so that by the end of summer we were ready to run entire school classes on the automated system. During the 2020-21 school year and COVID-19 pandemic, the automated system proved to be significantly more effective than traditional remote instruction, and by spring 2021 nearly all of Math Academy’s school classes were running on it (a couple of which I personally managed as a makeshift usability lab until 2023).

During the 2021-22 school year, even after school was back in person, we reached the point where the system was 4x as efficient as traditional in-person classes covering the same material. Seemingly impossible things started happening, like some highly motivated 6th graders (who started midway through Prealgebra) completing all of what is typically high school math (Algebra I, Geometry, Algebra II, Precalculus) and starting AP Calculus BC within a single school year. Math Academy’s AP Calculus BC exam scores rose, with most students passing the exam and most students who passed receiving the maximum score possible (5 out of 5). Four other students took AP Calculus BC on our system, unaffiliated with our Pasadena school program, completely independent of a classroom, and all but one of them scored a perfect 5 on the AP exam (the other one received a 4).

Finally, during the 2022-23 school year, we opened up mathacademy.com commercially to the world at large, became accredited, grew to hundreds of commercial users, and hit operating break-even. Even more commercial students aced the AP Calculus BC exam, including an elementary schooler.*

There’s plenty more to this story – we also ran a quantitative CS course sequence within the original school program from 2020-23, where we scaffolded our high schoolers up from having little to no coding experience to doing masters/PhD-level coursework (reproducing academic research papers in artificial intelligence, building everything from scratch in Python). We called this the “Eurisko” program.

It’s still pretty early on, and as of spring 2025 the very first cohort is still in undergrad (it’s currently their junior year). However, there have already been some amazing student outcomes in terms of college admissions, accelerated graduate degrees, research publications, and science fairs.

Just to name some impressive stats on 4 of the 16 students in the Eurisko program:

Anton is attending MIT
Justin is attending Caltech
Colby started taking grad courses his 2nd year of college and is considering an early master’s degree
Matteo published a math-heavy research paper, solo author, in high school, and won 1st place in the Regeneron Science Talent Search (his poster and presentation are available here). He’ll be attending Stanford in 2026.

We haven’t been systematically tracking this info or sending out alumni surveys or anything, so there’s probably even more interesting stuff going on that we haven’t heard about yet.

But my point is this: when you make it your mission to maximize student learning – including leveraging the learning-enhancing practice techniques that have been known, reproduced, and yet ignored by the education system for decades – you realize that there is a massive amount of human potential being left on the table. Students can be learning way, way, way more than they currently are.

Follow-Up Questions

Q: What motivated you to get your teacher credential? I’ve also thought about getting my teacher credential, but decided against it for the same reasons as you’ve outlined. What made you decide to suffer through it?

A: I had to do a teaching credential in order to teach in Math Academy’s original school program, which operated in public schools (Pasadena Unified School District).

I.e., I worked as a public school teacher who exclusively taught Math Academy classes, and the district required all teachers to do credentials.

The credential itself was a complete waste of time, but I was so serious about Math Academy – even back then – that I was willing to put up with the suck and suffer through it (among other things).

In hindsight, I’m actually glad I did it because it gave me more firsthand experience with how bad things have gotten.

Some of these things sound so ridiculous and unbelievable that some people actually don’t believe they’re happening. But I can say “no, it’s real, I spent years right there in the train wreck.”