Challenges One Might Encounter While Leveraging Effective Pedagogical Techniques

Want to get notified about new posts? Join the mailing list and follow on X/Twitter.

Anatoly Vorobey posted a list of challenges that he ran into when tutoring a student while leveraging effective pedagogical techniques. These were interesting to evaluate and I expect they may be useful reading for any educator in a similar position.

Challenge 1

“But when in a process of solving something else (like a word problem) two unknown quantities come together in two different ways, and you get two equations, she wouldn’t jump from that to solving a system.”

Word problems would need to be a separate topic that the student gets repetitions on. There are two things going on here: 1) you can set up a system of equations describing a word problem, and 2) you can solve that system using previously learned techniques. I generally wouldn’t expect a student to naturally make the jump from (1) to (2) on their own unless they’re exceptionally bright. The vast majority of students need to be taught this stuff explicitly and it takes lots of repetitions spaced out over time before it becomes a natural reflex (kind of like muscle memory).

“They [inequalities] come up all the time in different contexts (problems with absolute value, domains of functions etc.). But knowing how to solve an inequality and knowing to recognize something as an inequality to solve are two different skills. … Solving inequalities on their own doesn’t help with this. … Solving lots of problems that depend on inequalities? If they’re all typical and basic, this doesn’t seem to train up the concept enough. Maybe I should try to assemble a *diverse* set of very different problems that hide inequality in different ways. But that seems too clever.”

Yes, assembling a diverse set of problems that hide the inequality in different ways is what needs to be done. For the vast majority of students, this is a skill that needs to be hit over and over again, spaced out over time, in each context you expect them to apply it in. Students who are particularly mathematically gifted will require fewer repetitions and a smaller variety of examples from which to generalize. But most students need a large spread like that where they can get plenty of explicit practice in all these contexts. Of course, it takes a ton of work to have them hit all those bases, but it’s what needs to be done.

Challenge 2

“a meta-algorithm that looks something like: you’re given some setup with lots of interrelated thingies and some facts about them, and you need to find the numeric value of one of the thingies. Denote some of the thingies as variables and express others in terms of those variables using the facts. Work towards an equation, knowing it must come. When it comes, solve it; sometimes there’ll be more than one, then solve the system. … Maybe the meta-algorithm is trainable for everyone/most people, but some need x100 fewer reps than others.”

Yes, I would agree that students who are particularly gifted can generalize from far far fewer reps, while less gifted students may need 10x or 100x or maybe even more. I actually benchmarked this once by estimating how many practice problems it took to prepare some of my students for the AP Calc BC exam – it took about 400 problems for a kid who was prodigy-level gifted, and about 4000 problems (10x more) for kids who were “just” highly gifted. For students who are “just” above average, or typical, this multiplier will shoot up way higher. This is discussed in detail here, and I’ll paste a particularly relevant snippet below:

*“Midway through 7th grade, M and his parents started thinking more about his future and came to agree that it would be a good idea for him to knock out the AP Calculus BC exam in 8th grade so that he could make a convincing case for enrolling in university courses the following year. But at the same time, he wasn’t going to take a separate calculus course, and he didn’t want to do a whole lot more work outside of our weekly discussions. The totality of M’s calculus instruction before the exam was limited to an hour-long chat each weekend and 5-10 homework problems per week for about 12 months, about 400 problems total, plus 3 or 4 practice exams.

Meanwhile, students that I taught in the radically accelerated school program did about 300 lessons (each with about 10 questions) and 300 reviews (each with about 3 questions) for a total of about 4000 questions, plus 6 practice exams. Not only did they solve an order of magnitude more problems than M, they also had more one-on-one time with me (there were only 5 students in the class), they were doing this every single school day for an hour, and they were also working from a far more scaffolded and comprehensive curriculum (whereas for M, I had to slim down the curriculum to the bare essentials, otherwise we wouldn’t get through it all). Yet, M thought the AP exam was pretty easy, came out of the exam quite confident that he got a 5 out of 5, and indeed he did – whereas in my class, the average score was 3.6 out of 5 (two 5’s, a 4, two 2’s), and even the students who ended up getting a 5 did not come out of the exam confident that they got a 5.

(For more context about the level of giftedness of those 4000-problem students: those students were 8th graders studying AP Calculus BC in a radically accelerated math program, but were around the same giftedness level as typical kids in a typical honors math class at a typical school. They started the program in 6th grade, taking Prealgebra. We got them all the way up through AP Calc BC on one class period’s worth of work per school day from 6th-8th grade – we were increasing efficiency, not workload. How they were selected: they scored at or above the 90th percentile on a middle school math placement exam typically taken by all fifth graders in the district in the spring. They were then invited to join the program. It’s a seventh-grade math skills test, so it provides a somewhat high skill level, but it’s not designed to identify math aptitude. This was in the Pasadena Unified School District, where about two-thirds of the student population qualifies for the federal free and reduced lunch program, and about 44% of all K-12 students are educated in private schools, compared to the California average of 11%. Four other students took AP Calculus BC on our system, unaffiliated with our Pasadena school program, completely independent of a classroom, and all but one of them scored a perfect 5 on the AP exam – the other one received a 4. More info here and here.)”*
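As a quick sanity check on the ballpark figures in the quoted snippet, the arithmetic can be reproduced in a few lines. This is only a sketch: the 52-week figure is my assumption for "about 12 months," and the score list is read off the "two 5’s, a 4, two 2’s" breakdown.

```python
# Rough check of the problem counts and average score quoted above.
WEEKS = 52  # assumption: "about 12 months" of weekly homework

# M: 5-10 homework problems per week
m_low, m_high = 5 * WEEKS, 10 * WEEKS
m_mid = (m_low + m_high) / 2

# Class: 300 lessons x ~10 questions plus 300 reviews x ~3 questions
class_total = 300 * 10 + 300 * 3

# Class AP scores: two 5's, a 4, two 2's
scores = [5, 5, 4, 2, 2]
avg_score = sum(scores) / len(scores)

print(m_low, m_high, m_mid)   # 260 520 390.0
print(class_total)            # 3900
print(class_total / m_mid)    # 10.0
print(avg_score)              # 3.6
```

The midpoint for M lands near 400, the class total near 4000, and the ratio near 10x, which matches the rounded numbers in the text.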

Challenge 3

“3. Negative knowledge. Remembering what NOT to do, even though it seems obvious. We can solve many inequalities with a method of intervals, then after a while I throw in a simple “1/(5-x) > 4”, and her first instinct is to - obviously - turn this into 1 > 4(5-x). It seems harder to train NOT to do something than to do something right. This problem does seem to be helped by doing many very simple reps interspaced, but it remains to see if the learned negative knowledge will stick.”

It’s good to hear that interleaving between both categories of problems is helping. I would expect the learning to stick provided that you follow a reasonable spaced repetition schedule. The knowledge is always in a state of decay; there is no one-and-done solution to make it stick forever on the first exposure, but if you periodically have the student solve review problems then the rate of forgetting should gradually slow down enough that no more explicit review is needed (because the student ends up happening to practice this as a subskill in more advanced material frequently enough to keep the forgetting from getting too extreme).
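To make the pitfall in the quoted inequality concrete, here is a small check (a sketch, not part of the original exchange). Multiplying both sides of 1/(5-x) > 4 by (5-x) is only valid when 5-x is positive. Moving everything to one side instead gives (4x-19)/(5-x) > 0, whose solution by the method of intervals is 19/4 < x < 5. The function names below are my own.

```python
from fractions import Fraction

def original(x):
    """The inequality 1/(5 - x) > 4, which is undefined at x = 5."""
    return x != 5 and 1 / (5 - x) > 4

def naive(x):
    """What you get by multiplying through by (5 - x) without checking its sign."""
    return 1 > 4 * (5 - x)

def correct(x):
    """Solution set from the method of intervals: 19/4 < x < 5."""
    return Fraction(19, 4) < x < 5

# Sample x from 4 to 7 in steps of 1/8, using exact arithmetic.
samples = [Fraction(n, 8) for n in range(32, 57)]

# The interval solution agrees with the original inequality everywhere.
assert all(original(x) == correct(x) for x in samples)

# The naive rewrite wrongly accepts every sampled x >= 5.
disagree = [x for x in samples if original(x) != naive(x)]
print(min(disagree), max(disagree))  # 5 7
```

The naive rewrite reduces to x > 19/4, so it is right on the interval (19/4, 5) and wrong everywhere from 5 onward, which is exactly the region where the sign of 5-x flips.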

Challenge 4

“4. Focused attention. So many small errors are just inattention. Algebraic manipulation especially. Repeated focus on fundamentals makes it better but it’s not clear that it makes it go away completely. It usually helps to say “re-check your work” even without pointing out the mistake, so maybe the idea is to train *that* to happen automatically.”

Small errors due to inattention should go away gradually as the student develops full mastery of the material (by getting more practice spaced out over time while also layering more advanced knowledge on top).

This won’t happen super quickly, but the rate of making errors should decrease as 1) the mathematical manipulations become almost like muscle memory, 2) the student experiences less cognitive load while solving the problems, and 3) they can better maintain a “big picture” view of what they’re doing and detect early on when something is going off the rails.

(3) in particular comes down to perceptual learning, the ability to extract key features from complex environments while filtering out irrelevant noise, which comes down to building long-term memory representations (elaborated here).

As you’ve pointed out, a critical component of developing full mastery is holding the student accountable for solving problems correctly, independently. If they are always told where their mistake is, they will not develop the ability to catch mistakes on their own, which means they will never reach full mastery of the skill.

(If a student is unable to identify their mistake then of course you’d need to provide some scaffolding by nudging them towards where they need to look, or if they’re really stuck, then you might even need to point it out explicitly – but this scaffolding needs to be gradually ripped away.)

Additionally, this is another reason why it’s critical that the student builds sufficient mastery on prerequisite skills – if there are 4 subskills involved and the student executes each of those subskills correctly 70% of the time, then the student’s probability of landing the compound maneuver correctly is only 0.7^4 = 24%. The student needs to practice their subskills enough to become really really solid executing them. Frequent silly mistakes are often a symptom of not being solid enough on component skills (i.e., not having gotten enough practice on those component skills).
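The compounding effect is easy to tabulate. The sketch below assumes, as the 0.7^4 calculation implicitly does, that the subskills succeed or fail independently of each other; the function name is my own.

```python
def compound_success(p, k):
    """Probability that k independent subskills all succeed,
    when each one is executed correctly with probability p."""
    return p ** k

# Four subskills, at increasing levels of per-skill reliability.
for p in (0.70, 0.90, 0.95, 0.99):
    print(f"{p:.2f} per subskill -> {compound_success(p, 4):.0%} overall")
# 0.70 per subskill -> 24% overall
# 0.90 per subskill -> 66% overall
# 0.95 per subskill -> 81% overall
# 0.99 per subskill -> 96% overall
```

Even at 90% reliability per subskill, the compound problem is missed about a third of the time, which is why the component skills need to be very solid before silly mistakes fade.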

Challenge 5

“5. Long-term memory. This is the obvious one. Everything fades. Specific examples are the formula for roots of the quadratic equation, and the order of terms in a derivative of a fraction (f’g - fg’, not the other way around). Spaced repetition can and does help with forgetting, but I also do wonder if different students have vastly different lengths of retention before they need to be reminded.”

Yes, different students will have vastly different retention intervals (and so will different topics depending on intrinsic difficulty). This is actually factored into Math Academy’s spaced repetition system, which I wrote about here. Here’s a particularly relevant snippet from the section “Calibrating to Individual Students and Topics”:

*“The speed at which students learn (and remember what they’ve learned) varies from student to student. It has been shown that some students learn faster and remember longer, while other students learn slower and forget more quickly (e.g., Kyllonen & Tirre, 1988; Zerr et al., 2018; McDermott & Zerr, 2019). Similarly, learning speed also varies across topics: easier topics are learned faster and remembered longer, while harder topics take longer to learn and are forgotten more quickly.

So, for each student, each topic has a learning speed that depends on the student’s ability and the topic’s difficulty. Student ability and topic difficulty are competing factors – high student ability speeds up the overall student-topic learning speed, while high topic difficulty slows it down. In this view, a student-topic learning speed can be measured as a ratio between
1. the speedup due to student ability, and
2. the slowdown due to topic difficulty.

Student-topic learning speeds are used to adjust the speed of the spaced review process.”*
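The ratio described in the snippet can be illustrated with a toy calculation. To be clear, this is a hypothetical sketch and not Math Academy’s actual implementation: the snippet does not say how the speed feeds into the review schedule, so the linear scaling of the review interval, the function names, and all of the numbers below are assumptions for illustration only.

```python
def learning_speed(student_speedup, topic_slowdown):
    """Student-topic learning speed as a ratio of the two competing factors."""
    return student_speedup / topic_slowdown

def adjusted_interval(base_interval_days, speed):
    """ASSUMPTION: review intervals stretch or shrink linearly with speed."""
    return base_interval_days * speed

# Hypothetical values: a fast student on an average topic,
# and an average student on a hard topic.
fast_student = learning_speed(student_speedup=2.0, topic_slowdown=1.0)
hard_topic = learning_speed(student_speedup=1.0, topic_slowdown=2.0)

print(adjusted_interval(7, fast_student))  # 14.0
print(adjusted_interval(7, hard_topic))    # 3.5
```

Under these assumptions, a review that would normally come due after a week is pushed out to two weeks for the fast student and pulled in to half a week for the hard topic.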

I also wrote about this in the post that I mentioned earlier about the prodigy-level student versus students who are “just” highly gifted. I’ll paste two particularly relevant paragraphs below:

*“M would also retain it much longer after the initial practice – for instance, we could cover a new topic one week and then he’d be able to recall most of it a week or two later, whereas many students in my class would forget most of a new topic within a couple days of learning it if they did not receive additional practice. That said, M is not immune to forgetting, and it’s not like he’s “locking things into place” indefinitely in his brain. It’s just that his rate of forgetting is much slower.

I’ve worked with plenty of students who are well above average mathematically but not nearly to the extent of M. They are much slower to absorb new information, and even after they are able to consistently solve problems correctly, they will forget it almost entirely within a week or two. Imagine you’re writing code to develop some application, but you’re using some buggy version control where each day, 10% of your code is deleted. That’s how it feels working with these other students. It’s like writing in disappearing ink. On the other hand, for M, it’s like his code gets implemented in a more robust way, and less than 1% gets deleted each day.”*
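The gap between the two deletion rates in that analogy grows quickly over time. Here is a minimal sketch that takes the analogy literally, assuming the loss compounds daily on whatever remains and that no review happens in between; the 10% and 1% rates are the analogy’s figures, not measurements.

```python
def fraction_remaining(daily_loss, days):
    """Fraction left after `days` if a fixed share of what remains
    is lost each day and nothing is reviewed."""
    return (1 - daily_loss) ** days

for days in (7, 14, 30):
    fast_forgetter = fraction_remaining(0.10, days)
    slow_forgetter = fraction_remaining(0.01, days)
    print(f"day {days}: {fast_forgetter:.0%} vs {slow_forgetter:.0%}")
# day 7: 48% vs 93%
# day 14: 23% vs 87%
# day 30: 4% vs 74%
```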

Challenge 6

“Say when solving a geometry problem with trigonometry, you get to a triangle, and you know some side lengths/angles, you need others. You can set up an equality with a law of sines, and it’s often helpful. But if your triangle is right-angle, you don’t need the law of sines. … [but] It seems counterproductive to specifically teach ‘do NOT use this in a right angle triangle’”

This strikes me as a variation of Challenge 3 (remembering what NOT to do, even though it seems obvious), and I would expect interleaved spaced practice to help here, provided that the student is receiving corrective feedback to use the simpler formula.

One additional thing that I’d expect to be helpful here is timed practice. Many students will happily resort to always using a more cumbersome method if it means there’s less to remember, but as you point out, it’s typically counterproductive to treat a correct answer as incorrect on the basis of a non-preferred solution methodology. However, you can shut down inefficient solutions by adding a time constraint – the student legitimately can’t solve the problem within the expected amount of time, and learning to pick the more efficient solution methodology can help them overcome this. The time constraint helps frame the situation more productively, and timed practice (with follow-up practice on anything the student can’t do quickly enough) is needed anyway to help build automaticity (only after a student is able to successfully perform the skill in an untimed setting, of course).

Follow-Up Questions

Q: What worries me is less the amount of reps needed for generalization, and more the question of whether reps of practice problems alone, however many of them, will get the student there. When you teach “solve word problems by choosing unknowns judiciously, make sure to exploit all the known facts, above all search for an equation, and not a trivial one which just restates what you already assumed”, I just don’t see how doing 100 or 1000 problems necessarily gets the student to create the right mental space. We’ll cover various types of problems, but the student just learns to generalize within each subtype. The variety is endless and when she gets a subtype she’s never seen before, she struggles. At the same time some other student just takes it in and builds the right generalization without much conscious effort. So I’m looking for additional ways to transfer that understanding, to help grow that mental space, besides grinding problems. Maybe there isn’t one!

A: I don’t know of any other techniques to more efficiently grow that mental space. My experience has been that although some students will need a (potentially prohibitively) large volume of practice problems, practice is indeed the most efficient way. It seems that there’s a limit to any particular student’s generalization ability: it’s not particularly trainable, and it limits how big of “mental leaps” they can make off of their bridge of accumulated problem-solving experiences. So the way you increase a student’s ability to make mental leaps is not actually by training them to jump farther, but rather by building bridges that reduce the distance they need to jump.

This seems to be echoed in the cognitive science literature as well, e.g., Sweller, Clark, & Kirschner (2010) state the following:

“…[T]he research suggests that we can teach aspiring mathematicians to be effective problem solvers only by providing them with a large store of domain-specific schemas. Mathematical problem-solving skill is acquired through a large number of specific mathematical problem-solving strategies relevant to particular problems. There are no separate, general problem-solving strategies that can be learned.”

If interested, I’ve elaborated further on this “bridge-building” perspective here.

Q: For the prodigy solving 400 problems and the gifted non-prodigies solving 4000 problems, what is this in terms of minutes?

A: For the 4000-problem group, about 1-2m on average. There were of course some shorter problems and some longer problems, but it averaged out to about 1-2m/problem.

The prodigy-level kid would take a bit longer per problem (maybe like twice as long) but that’s because he was working exclusively on harder problems with much less scaffolding, taking larger “leaps” between problem types, and doing less repetition on each problem type.

(In hindsight, I probably should have put some error bars on the ballpark estimates for the number of problems as well: the approximate ranges would be 3000-4000 problems for the gifted non-prodigy group and 300-400 problems for the prodigy.)
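Combining those ranges with the per-problem times gives a rough sense of total time spent on problems alone. This is a back-of-the-envelope sketch: the 2-4 minute range for the prodigy is my reading of "maybe like twice as long," and the totals exclude lessons, discussion, and practice exams.

```python
def total_hours(problems, minutes_per_problem):
    """Total problem-solving time in hours."""
    return problems * minutes_per_problem / 60

# Gifted non-prodigy group: 3000-4000 problems at about 1-2 minutes each
group_low, group_high = total_hours(3000, 1), total_hours(4000, 2)

# Prodigy: 300-400 problems at roughly twice the time per problem (assumed 2-4 minutes)
prodigy_low, prodigy_high = total_hours(300, 2), total_hours(400, 4)

print(f"group:   {group_low:.0f}-{group_high:.0f} hours")      # group:   50-133 hours
print(f"prodigy: {prodigy_low:.0f}-{prodigy_high:.0f} hours")  # prodigy: 10-27 hours
```

So even with the prodigy spending longer on each problem, the total time on problems still differs by roughly a factor of five.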

Note that the 4000-problem cohort of accelerated middle schoolers were spending roughly the same amount of time as a typical high school math student: about an hour per school day. They weren’t slaving away for multiple hours every day. We increased efficiency, not workload. More info here.

Q: Why not have the 4000-problem group work with less scaffolding, take larger “leaps” between problem types, and do less repetition on each problem type, just like the prodigy? Wouldn’t they learn the material faster that way?

A: The non-prodigies wouldn’t learn the material at all under the same setup as the prodigy.

In order to solve problems successfully, they needed heavier initial scaffolding, the difficulty needed to be increased in smaller increments, and they needed more repetition before moving forward to the next difficulty increment. Without those supports, they wouldn’t be solving problems successfully, and they wouldn’t be learning the material.

They’d just be flailing, completely out of their depth – not only failing to solve the problems, but also not getting much practice on their foundational skills. So not only are they not learning much new material, but they’re also forgetting a lot of previously learned material, and their overall learning is actually moving backwards.

I’ve seen that situation play out as well. In fact, the rationale behind our high-initial-scaffolding, smaller-leaps, more-repetition setup was in part motivated by seeing the situation play out in earlier cohorts of students in the same program. The year we really got our arms around the problem of delivering a high-initial-scaffolding, smaller-leaps, more-repetition learning experience to these students, was the year their AP scores skyrocketed.
