Some thoughts on teaching & learning

What lecturing teaches you about learning and why it's silly to compete with people who will do anything to gain an inch.

What did I learn making the transition from quiet grad student to lecturing a course with almost 1000 students at UC Berkeley? These thoughts are pretty orthogonal to electronic-first teaching, but I think all of it is more polarized now (like our world).

Above: My first lecture ever, this was a time. Source, cs188.

Being up on the stage of Wheeler Hall at UC Berkeley (known enough to have its own Wikipedia page) has definitely brought on imposter syndrome, but it’s forced me to be real about how I am performing, what I should change, and what I will never know. I’m teaching CS188: Artificial Intelligence at UC Berkeley and have no past lecture experience. Wheeler Hall (below) has hosted many famous lecturers, including a Nobel Laureate acceptance speech from Ernest Lawrence during WW2. The wood is unchanged, and it has now witnessed a long list of impressive displays (the list of AI researchers alone includes Stuart Russell, Pieter Abbeel, and more).

Above: The auditorium built with famous Cali redwood. Source — UC Berkeley.

I use the word performance intentionally, as the more I do it, the more I learn lecturing is an art. It’s an art of engagement, it’s an art of information transmission, it’s an art of belief.

I’ve watched full professors give the same lectures I am giving, and am trying to take my own practical spin on things (as someone who works in the area, and truly wants to master the concepts as well). Here are my takeaways from the first half of mastering a new creative activity — under the gazes of 1000s of eyes (and permanent digital video).

1. Being weird on stage is a win.

I remember I finished a lecture almost 5 minutes early because I was overwhelmed, had gone through 10 slides of unprepared material, and I was left empty. After babbling, I tried:

I’m totally happy standing up here for 2 minutes until I get any questions.

The student response was fantastic. Either they were laughing at my awkwardness or they were engaged by the honesty. After a couple of minutes I got a super precise question leading into how Markov Decision Processes (MDPs) are the tool that enables modern Reinforcement Learning (RL). Something along the lines of:

How is one-step lookahead sufficient in MDPs when distilling into a global policy?

The idea of sufficient, but not necessary, is important to learning, because we need to know when something will converge, but do not necessarily care how. The recommended time to wait for a question or answer is 7 seconds, which feels like an eternity on stage, but works. So 120 seconds was a long time.
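To make the student’s question concrete, here is a minimal sketch of one-step lookahead on a tiny MDP. The states, transitions, rewards, discount, and function names are all made up for illustration and are not taken from the course code. Once the values have converged, looking one step ahead from each state is enough to read off a policy for the whole problem.

```python
# Hypothetical 3-state MDP (not from CS188 materials).
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    "a": {"stay": [(1.0, "a", 0.0)], "go": [(0.8, "b", 0.0), (0.2, "a", 0.0)]},
    "b": {"stay": [(1.0, "b", 0.0)], "go": [(1.0, "goal", 10.0)]},
    "goal": {},  # terminal state: no actions available
}
GAMMA = 0.9  # assumed discount factor


def q_value(values, state, action):
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(
        prob * (reward + GAMMA * values[next_state])
        for prob, next_state, reward in TRANSITIONS[state][action]
    )


def value_iteration(sweeps=100):
    """Repeatedly back up values with a one-step lookahead over all actions."""
    values = {state: 0.0 for state in TRANSITIONS}
    for _ in range(sweeps):
        values = {
            state: max(
                (q_value(values, state, action) for action in TRANSITIONS[state]),
                default=0.0,  # terminal states keep value 0
            )
            for state in TRANSITIONS
        }
    return values


def extract_policy(values):
    """One-step lookahead: in each state pick the action with the best Q-value."""
    return {
        state: max(actions, key=lambda action: q_value(values, state, action))
        for state, actions in TRANSITIONS.items()
        if actions
    }


values = value_iteration()
policy = extract_policy(values)
print(policy)
```

The policy only ever looks one step ahead, but because the values already summarize everything that happens afterwards, that local choice is globally optimal for this toy problem.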

2. It matters.

Every lecture morning is an exercise in stress management. According to my co-lecturer, who has been doing this for years, it never goes away, because the education of students matters. I was expecting to get better at this, but as in life, it’s not about getting rid of stress; it’s about getting better at dealing with it. This has been a huge lesson in the middle of my PhD, and I am sure accepting the important stresses will help me on so many journeys.

I’ve learned that the only way I am going to be okay with the stress is if I am just being myself. I am here and I am present to help the students. I give the best I can, and they’ll understand my mistakes, which are bound to happen in the learning process. I’ll just say my weird quirks in the reinforcement learning unit were a poorly tuned exploration function!

3. You know when you mess up.

There have been one or two lectures (not detailed to keep my cards hidden from potential student readers) where I have not prepared in the right way, so my explanations were clearly lacking. The result of a weak explanation is a wall of confused faces (or backs of smartphones). This level of direct feedback is so impactful; it goes directly to my bones. There aren’t a lot of careers where you can get such real-time feedback on performance, so it’s a blessing and a curse, but I do apologize to students when I am not doing well enough for them.

4. You cannot prepare too much for things that matter.

Reflecting on the last point, you can’t really prepare too much. Every minute of lecture is multiplied by 1000, so improving one sentence or one slide is incredibly time-efficient in terms of total future world-productivity. If I spend 30–60 minutes on two slides and that removes 5 minutes from each student’s learning time on the homework, I have saved roughly three and a half days of productive minutes.
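The arithmetic is easy to check. A minimal sketch, assuming a round 1000 students and the upper end of the prep time (both are illustrative round numbers, and the variable names are mine):

```python
STUDENTS = 1000                # assumed round class size
MINUTES_SAVED_PER_STUDENT = 5  # homework time saved by clearer slides
PREP_MINUTES = 60              # upper end of the 30-60 minute prep estimate

total_saved_minutes = STUDENTS * MINUTES_SAVED_PER_STUDENT
net_saved_minutes = total_saved_minutes - PREP_MINUTES
days_saved = total_saved_minutes / (60 * 24)  # minutes in a day

print(f"total saved: {total_saved_minutes} min (~{days_saved:.1f} days)")
print(f"net of prep: {net_saved_minutes} min")
```

Even after subtracting the hour of preparation, the trade comes out to a bit over three days of student time.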

I always used to be one to prepare the material, but not the words. I’ve learned that you certainly should not always do that: your words affect how safe people feel, whether they believe you, and whether they’ll be open to their classmates. Little things like changing ‘you guys’ to ‘you all’ add up to create the defining nature of the university.

5. Education is changing.

While I only get 400–800 students in the lecture, I can get 1000s more in webcast views. I’ve noticed there’s a group of students who want to engage and build strong conceptual webs, and there are students who want to treat it as a Massive Open Online Course (MOOC). Every week I go from super fun questions after lectures to a room filled with students wanting to ask project questions (yes, they were not in lecture). I’m not saying one method is better, but I think universities need to take care in balancing student life and experience.

The students of today are extremely motivated, but also can be more high-strung and less connected. I have made a big push in my class to foster community — as it’s the people they all could meet that will control the AI industry in a decade, but it’s easy to see where people can slip up.

Onward to the next half of the course! I’m sure I’ll have more reflections at the end, but I am still thrilled to have this opportunity to teach a phenomenally structured class at a top institution. Here’s a tool I used to learn to speak:

Source — Author, in Costa Rica.

How top CS students problem solve

How I’ve been impressed by the top students of CS188: Artificial Intelligence at UC Berkeley. The top students of today earn the craze of recruiting from top companies — they show consistent ways to out-innovate problems and create value.

Stay one step ahead

When lecturing, I notice how the best students are always asking the question as to where we are going in a lecture before I introduce it. Underpinning the next two sections is a clear ability to build conceptual maps. I know I am a fairly idea-centric lecturer, and these students don’t miss a beat. The raw compute power is impressive.

I’ll tell this story with a running multi-agent learning contest, where we wanted to optimally solve new Pac-man puzzles. This is designed to be a small amount of extra credit for above and beyond students, but it ended up giving great insight into what the top minds are doing. The scoring was as follows:


  • +10 for each pellet

  • +500 for collecting all pellets

  • -0.4 for each action taken

  • -1000 x (compute time in seconds)
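The scoring rule above can be written as a small function. This is a sketch of the rule as listed, not the actual autograder code; the function name, argument names, and example numbers are hypothetical.

```python
def contest_score(pellets_eaten, total_pellets, actions_taken, compute_seconds):
    """Score one game using the four terms listed for the contest."""
    score = 10 * pellets_eaten          # +10 for each pellet
    if pellets_eaten == total_pellets:  # +500 for collecting all pellets
        score += 500
    score -= 0.4 * actions_taken        # -0.4 for each action taken
    score -= 1000 * compute_seconds     # -1000 x (compute time in seconds)
    return score


# Made-up example: 80 pellets, all collected, 100 actions, 10 ms of compute.
example = contest_score(
    pellets_eaten=80, total_pellets=80, actions_taken=100, compute_seconds=0.01
)
print(example)
```

With these weights, ten milliseconds of planning costs as much as 25 actions, which is why the strongest entries worked so hard on both shorter paths and cheaper computation.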

Brute Force

I’ve heard so many stories about how hard undergrads at UC Berkeley work, but it’s hard to know what that means or how it translates into practical productivity without examples.

A student who got second place in the Pac-man contest ultimately ran a brute force bash script overnight to create a feedback loop over the random seeds and upload new solutions to the test case we were trying (poorly) to keep hidden. The reason the student wasn’t disqualified: he documented his approach and we had not created a rule against this. A new feature of the contests is averaging over multiple seeds (obvious to scientists, yes) so that the signal returned from the contest is not as direct.
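Here is a rough illustration of why averaging over seeds weakens that feedback loop. The `play_once` function is a made-up stand-in for running one contest game (it just returns a noisy number around an assumed "true" score), so treat this as a sketch of the idea rather than the contest infrastructure.

```python
import random
import statistics

TRUE_SKILL = 1200.0  # hypothetical underlying quality of one submission
NOISE_STD = 30.0     # hypothetical seed-to-seed variation in score


def play_once(seed):
    """Stand-in for one contest game: a noisy score that depends on the seed."""
    rng = random.Random(seed)
    return TRUE_SKILL + rng.gauss(0, NOISE_STD)


def averaged_score(seeds):
    """Report the mean score over several seeds instead of a single game."""
    return statistics.mean(play_once(seed) for seed in seeds)


single_seed_scores = [play_once(seed) for seed in range(200)]
best_single = max(single_seed_scores)   # what a sweep over seeds would find
averaged = averaged_score(range(200))   # what the averaged contest reports

print(f"best single seed: {best_single:.1f}, average over seeds: {averaged:.1f}")
```

Sweeping seeds and keeping the luckiest one inflates a single-seed score, while the averaged score stays close to the underlying quality of the agent.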

A wow, sample entry

Below is a summary of the ‘wow’ contest submission. It seemed like the student may have used some shady practices to hard-code solutions (hard-coding removes the computation time penalty from the score), but he set up a minor class competition as a serious engineering problem. Here’s how, in some of his own words:

  1. Copy my answer from a project into the Mini-Contest — 645 points.

  2. Make a tool to get feedback on contest scores from the auto-grader, including individual scores for each puzzle (*a huge step because it adds feedback to what was a black box test case).

  3. Add a “Targeting data” to each Pac-man — reducing computation by planning and executing trajectories (don’t replan each step).

  4. Add shared data for Pac-men, so that if a Pac-man is targeting dot A, other Pac-men decide to ignore dot A. This encourages each Pac-man to take a different route — 1185 points.

  5. Instead of just ignoring dot A, Pac-man decides to treat a targeted dot as a wall. This further encourages each Pac-man to take different paths, if, for example, the two closest dots are on the path of another Pac-man (*a great insight into how intelligent state-space reduction will help search problems.).

  6. Hardcode the initial state information so that I can decide which AI to use. Then use sweeping to match seeds.

  7. Modify’s reporting feature so that after a successful run, it copies the actions taken to a text file, with a unique text file per layout. If a text file already exists, then it compares the score of that file with the new proposed path, and only selects the best option.

  8. Produce a program that combined all the AIs into one dictionary structure, so that I could quickly copy/paste it.

  9. Run step 8 on my best AI, and modify my AI such that if the current layout is exactly the same as one of my hardcoded layouts, it decides to follow the hardcoded path; otherwise, it uses the AI found in step 4 — 1226 points.

  10. Run step 8 once on ~3 variants of my AI, combine the best results, and update the hardcoded paths — 1252 points.

  11. Modify so that after one successful round, it analyzes the sequence of moves generated, and determines which pellets each Pac-man tried to reach, in what order. (*a crucial engineering step).

  12. Run step 9 once, and combine the new AIs into my file — 1255 points.

  13. Modify my code so that it automatically downloads better AIs from the files, and updates itself, so that my AI can improve without human intervention. I then created a bash file that ran the modified auto-grader repeatedly, thus automatically improving the paths. (*a second crucial engineering step)

  14. Run step 11 for 1 hour, and combine the new AIs into my file. This corresponds to my sixth submission, which scored 1261 points.

  15. Run the bash script overnight. Total training was about 8 hours, but most of the improvement was made in the first hour — 1266 points. And the student was done.

Submission from step 15.
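The shared targeting idea from step 4 can be sketched in a few lines. The student’s code is not shown here, so the greedy nearest-dot rule, the Manhattan distance, and every name below are assumptions made only for illustration.

```python
def manhattan(a, b):
    """Grid distance between two (x, y) positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def assign_targets(pacmen, dots):
    """Each Pac-man in turn claims the nearest dot nobody else is targeting."""
    claimed = set()  # shared data: dots already targeted by some Pac-man
    targets = {}
    for name, position in pacmen.items():
        free_dots = [dot for dot in dots if dot not in claimed]
        if not free_dots:
            targets[name] = None  # nothing left to chase
            continue
        target = min(free_dots, key=lambda dot: manhattan(position, dot))
        claimed.add(target)
        targets[name] = target
    return targets


# Made-up layout: both Pac-men are closest to (2, 0), but only one may take it.
pacmen = {"p0": (0, 0), "p1": (1, 0)}
dots = [(2, 0), (0, 5)]
targets = assign_targets(pacmen, dots)
print(targets)
```

Without the shared `claimed` set, both agents would head for the same dot; with it, the second agent is pushed onto a different route, which is the effect the student describes.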

There are some serious insights in this process, combined with the force of will and motivation to get it done. So what it takes to be a top student at a top CS university is the ability and the will to walk oneself through this kind of rigorous implementation. I would not say I have that level of fine tuning; maybe that’s why I am a graduate student and not a software engineer.
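The "keep the best path per layout, otherwise fall back to the planner" pattern from the student’s steps might look something like the sketch below. It uses an in-memory dictionary instead of the text files he describes, and the class, method, and layout names are all hypothetical.

```python
class PathCache:
    """Remember the best-scoring action sequence seen for each layout."""

    def __init__(self):
        self._best = {}  # layout key -> (score, list of actions)

    def record(self, layout, score, actions):
        """Store a run only if it beats the best run recorded for this layout."""
        current = self._best.get(layout)
        if current is None or score > current[0]:
            self._best[layout] = (score, list(actions))

    def actions_for(self, layout, fallback_planner):
        """Replay the stored path for a known layout, else plan from scratch."""
        if layout in self._best:
            return self._best[layout][1]
        return fallback_planner(layout)


def slow_planner(layout):
    """Stand-in for the real search agent (hypothetical)."""
    return ["Stop"]


cache = PathCache()
cache.record("layout_1", 1185, ["North", "East"])
cache.record("layout_1", 1226, ["East", "North"])  # better run replaces the old one
cache.record("layout_1", 1100, ["West"])           # worse run is ignored

print(cache.actions_for("layout_1", slow_planner))
print(cache.actions_for("unseen_layout", slow_planner))
```

Replaying a stored path costs almost no compute time, which is how hard-coding sidesteps the time penalty on known layouts while the fallback planner still handles new ones.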

Don’t build in the box — break the box

The brute force solution is all about building a better toolkit, but ultimately the student lost to a clever solution that came from one stroke of genius.

The winning submission.

The top scorer in a class of 900 students was the one who changed the problem setting itself, which pushed their score well clear of the brute force entry (1327 against 1266, roughly a 5% margin). In a game penalizing extra actions, this student made 3 of the agents take no actions, and solved the game with one search agent.

Score — 1327.

This is the other half of the equation to brute force engineering. It’s the stroke of insight that seems simple post-hoc, but so few have.

The best students find the ways to win, and their approaches often become the rules of the future. As course staff, I find the level of detail impressive. This is the kind of effort that gets products out the door, and I strongly stand by these students’ scores. You don’t need this type of effort and insight to be successful, but they can take you very far.

I choose to define my own success on how I perform, so I don’t worry about trying to replicate any crazy win like this. You can have a great career through consistency and enjoyment, but it’s fun to see how some of the best minds break problems down.

I completed my undergrad almost 3 years ago at Cornell University, and it was almost 6 years ago that I was really trying to make a name for myself in my first competitive engineering classes. I’m not sure if it’s just a change of perspective, but I have been impressed by the students. I’m sure they’ll be rewarded for their hard work.

Now, seeing these types of paths, I don’t think it’s worth competing directly. I don’t think there is ever a reliable way to “break the box,” so to speak, but if you figure that out you will always be ahead. I am thinking more along the lines of being anti-competitive: be yourself, and no one can compete with you.