Summary
Writing code is only one piece of creating good software. Code reviews are an important step in the process of building applications that are maintainable and sustainable. In this episode, On Freund shares his thoughts on the myriad purposes that code reviews serve and explores some of the patterns and anti-patterns that grow up around a seemingly simple process.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing On Freund about the intricacies and importance of code reviews
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by giving us your description of what a code review is?
- What is the purpose of the code review?
- At face value a code review appears to be a simple task. What are some of the subtleties that become evident with time and experience?
- What are some of the ways that code reviews can go wrong?
- What are some common anti-patterns that get applied to code reviews?
- What are the elements of code review that are useful to automate?
- What are some of the risks/bad habits that can result from overdoing automated checks/fixes or over-reliance on those tools in code reviews?
- identifying who can/should do a review for a piece of code
- how to use code reviews as a teaching tool for new/junior engineers
- how to use code reviews for avoiding siloed experience/promoting cross-training
- PR templates for capturing relevant context
- What are the most interesting, innovative, or unexpected ways that you have seen code reviews used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while leading and supporting engineering teams?
- What are some resources that you recommend for anyone who wants to learn more about code review strategies and how to use them to scale their teams?
Keep In Touch
Picks
- Tobias
- On
Closing Announcements
- Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
Links
- Wilco
- Code Review
- Home Assistant
- Trunk-based Development
- Git Flow
- Pair Programming
- Feature Flags
- KPI == Key Performance Indicator
- MIT Open Learning Engineering Handbook
- PEP Repository
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode. With their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, and dedicated CPU and GPU instances. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover.
Go to pythonpodcast.com/linode today to get a $100 credit to try out their new database service. And don't forget to thank them for their continued support of this show. Your host as usual is Tobias Macey. And today, I'm interviewing On Freund about the intricacies and importance of code reviews. So, On, can you start by introducing yourself?
[00:01:06] Unknown:
Sure. Hey, Tobias. Thanks for having me. I'm On. I'm the cofounder of Wilco, which is kinda like a flight simulator, but for software engineers to practice their hands on skills. I spent most of my career in engineering teams, as an engineer and as a manager. And as part of that, I was really fascinated by how to make teams and individuals
[00:01:28] Unknown:
run better. And do you remember how you first got introduced to Python?
[00:01:32] Unknown:
I'll start by adding a disclaimer. I'm in no way a Python expert, not even close. It's probably not 1 of the top 2 languages in my arsenal, but I do love it. I think it's a great language. Got exposed to it many times throughout my career, but I seriously started using it when I was contributing to an open source project that was using Python. It's a project called Home Assistant for home automation. The entire system is in Python, so that kind of forced me to really get into the language.
[00:02:03] Unknown:
Yeah. Definitely an interesting project and 1 that was on the show a few years ago at this point. I think early on, before it kinda hit its hockey stick of growth. Awesome. Definitely a great project. And so in terms of the topic at hand for today, you know, code reviews, it's a bit of an overloaded term. You know, at its base level, it's just somebody going and saying, yep. There's code there. I approve. And at the other end, it could be something that takes, you know, days or weeks to, you know, bring to resolution. I'm wondering if you can just start by giving kind of a shared working definition of what a code review is and kind of the overall purpose of what you're trying to achieve with conducting 1?
[00:02:43] Unknown:
Sure. You can go very formal and talk about change sets and all of that, but I think it's less interesting. I think what's really interesting is a code review is the opportunity for developers to discuss a given piece of functionality. And when I say discuss, it's because there are so many different aspects to it. And I think that 1 of the things that are missing from most conversations about code review is why are we even doing them, and it doesn't necessarily have to be the same answer for everyone. And that's why I'm keeping it very generic. You know, you wanna discuss a piece of functionality that's about to enter the code base. And there are so many different aspects of that functionality to discuss.
[00:03:26] Unknown:
It's fun because there are a lot of different kind of philosophies about code review and how you should structure your development. You know, there are camps that say, just commit straight to trunk and everybody just pushes to trunk and integrates all the time. And, you know, in that case, why do you need a code review? So, you know, another interesting aspect is, like, how does code review fit into your overall development practice? Like, do you need it in trunk based development? If so, how do you do that? You know, do you need it for kind of Gitflow? Like, how does your kind of branching or, you know, development model factor into the ways that you think about code review and its importance?
[00:03:59] Unknown:
Yeah. Definitely. And even though I'm very much pro code reviews, there are certain workflows that don't lend themselves well to code reviews or don't need them. If you're doing 100% pair programming, why do you even need code reviews? Right? The piece of functionality has already been discussed as you were developing it, and some might even say this is better. You know, there is no bottleneck at the end of development. The discussion happens right away along with everything. So if you will, the whole moving from waterfall to agile is kind of completed by taking the code review and, instead of having it as a separate step, having it as part of the regular workflow.
[00:04:40] Unknown:
In terms of the actual activity of conducting a code review, you know, it can seem very simple of just, okay. I look at this code. I don't see any obvious bugs. Go ahead. Or it can be, as you said, very formalized. And I'm wondering if you can maybe talk to some of the subtleties that become evident as you do more code reviews and some of the ways that some of the activities and behaviors that can be incorporated into the code review process and, you know, some of the ways that different teams might prioritize those activities.
[00:05:12] Unknown:
Sure. And I'll start by saying, you know, you said there are no bugs. But to me, bugs is just 1 aspect of code reviews, and I think that deserves another tangent and piece of conversation just to talk about what should be discussed in a code review, what shouldn't, and bugs definitely are an interesting aspect in that regard. But as for the subtleties of code review, I think the biggest subtlety of the code review is that it's probably the best example of stringing together soft and hard skills. So when we write code, it's mostly hard skills. There are some soft skills involved, like knowing how to structure commenting and code, for example, I think is a soft skill, but it's mostly hard skills.
When you're having, I don't know, a sprint planning, that would mostly be soft skills. Right? You're communicating with the rest of the team. There is some aspect of negotiation. Even though it's not called that way, there is some aspect of negotiation happening. But code review is where you put those 2 together. And during a code review, both the author and the reviewer are discussing very technical topics and trying to convince each other of certain viewpoints, but the way that it's conducted and the fact that it's conducted over text makes everything so complicated.
The potential to get into a fight is so high. It's amazing. This is, like, the most conflicted channel of communication, I think, between 2 people because, basically, 1 person is giving feedback to the other. It's written text, and the power dynamics could be completely inverted because the person doing the review could potentially be way more junior than the person who's being reviewed. And once you throw all of this together, you start to realize, you know, we take these reviews for granted, but there's just so much potential for bad things to happen.
[00:07:11] Unknown:
Another aspect of it too is that by the time you get to the code review process, if you're not doing that pair programming approach, you've already invested a lot of time and thought and energy into getting to that point. And you're just like, I just wanna get this merged. Like, let's just get this done with. And so it adds that extra dimension of kind of urgency or friction where, you know, this is just something I have to get done, kind of raises the stakes of that communication because somebody has already put so much of their work into it. Yeah. Especially if you have scheduled releases, by the way. Exactly. And, like, oh, no. I'm gonna miss that train if I don't get this review over the line today. Right. And I'm curious if you've seen any strategies for being able to maybe reduce that aspect of I've already put a lot of time and effort into this. So, you know, ways maybe you can like, I know, for instance, with GitHub, they recently introduced the idea of kind of a draft PR where you can open a change set against the main branch before you're actually ready to merge it. So you can, you know, start getting the review early while you're still doing work in progress. And just some of the ways that you've seen teams approach that aspect of getting feedback early in the process before you've already made changes that are maybe harder to diverge from because maybe you made an API design choice and then carried that to its logical conclusion, but somebody says, oh, you actually should have done it a different way. And so then you have to back out, you know, 3 days' worth of work kind of a thing. Yeah. Certainly. So, you know, first of all, if you're
[00:08:37] Unknown:
already at the stage where, you know, the scheduled release is tomorrow and you're pushing a very big change set and no 1 has seen it yet, then, yes, you're probably too late. And I don't have any tips that are gonna make this run smoother. So you really have to think in advance about these things. But then it kind of runs at odds with everything we've been taught the past 20 years, which is, you know, forget waterfall, try to get all these stages happening at once, but all of a sudden, wait, if we are about to meet a deadline, we do have to take some steps ahead to make sure that we're aligned on what we're going to do. So, by the way, if you do have something big, I think pair programming is actually a great approach to do that. You have something big coming up. You're under a tight deadline.
You know you're gonna have to have, you know, at least another set of eyes looking at it. Why not do it together if you can, if the company and the way it's set up supports it? I think pair programming makes a lot of sense, especially in those situations. If not, figure out who would be the right reviewer for this piece of code and try to work with them in advance. It's kinda like, you know, preparing for a meeting by talking to the people who you think are the most influential people in this meeting and prepping them before. So this is kinda like prepping the person for the review. Now it doesn't guarantee that it's gonna work. You know? Maybe the person gave you a piece of advice and you didn't listen, or maybe they gave you a piece of advice that ended up being wrong and they noticed that they were wrong afterwards. So it's not necessarily anyone's fault, but could be that no amount of preparation is going to prepare you for the fact that at the last minute, you have a change set that's not going to pass review, and then you have to figure out, out of the options that you have, what you wanna do. But, you know, that's not directly related to code review anymore, but, you know, reducing scope, etcetera, etcetera always helps. And I think that could go back and you can, you know, backtrack and say, if reducing scope helps, then maybe I can reduce the scope that I'm introducing in every change set even if it's a huge feature.
Try to introduce it in smaller amounts as much as possible. And that way, you also are not afraid of scheduled releases and what they can do to you because, you know, at least you managed to get something out there. And that also brings
[00:11:03] Unknown:
in kind of the practice of things like feature flagging where you say, okay. I'm just gonna introduce this small change, but I'm going to hide it behind a feature flag so that behavior doesn't turn on until I've delivered the entire feature, and I've been able to test it. And then that gives you a way to incrementally deploy it and adds in a whole another dimension. And I'll add a link in the show notes to an interview I did on that whole topic so that we don't have to go down that road.
[00:11:28] Unknown:
Yeah. It's a good point, but, you know, everything in engineering is a trade off. So, you know, when you do that, you can end up with a lot of code in production that doesn't do anything or is even disabled. I would say that's also an anti pattern. Right? You don't wanna have half of your production code turned off with feature flags just sitting there rotting away. Absolutely.
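As an aside from the editor rather than the episode, the feature flag pattern being discussed can be sketched in a few lines of Python. The flag name and checkout functions below are hypothetical, and a production system (LaunchDarkly, Unleash, and similar tools) would add per-user targeting, gradual rollout, and the flag cleanup tooling that On alludes to:

```python
# A minimal feature-flag sketch. All names here are made up for
# illustration; real flag systems add targeting, rollout percentages,
# auditing, and cleanup reminders so flags don't rot in the code base.
FLAGS = {
    "new_checkout_flow": False,  # merged incrementally, still dark in production
}

def is_enabled(flag_name: str) -> bool:
    """Return whether a flag is on; unknown flags default to off."""
    return FLAGS.get(flag_name, False)

def legacy_checkout(cart: list) -> str:
    return f"legacy:{len(cart)}"

def new_checkout(cart: list) -> str:
    return f"new:{len(cart)}"

def checkout(cart: list) -> str:
    # The new code path can ship in small, reviewable change sets and
    # sit dark behind the flag until the whole feature is ready.
    if is_enabled("new_checkout_flow"):
        return new_checkout(cart)
    return legacy_checkout(cart)
```

The design point is exactly the trade-off raised above: the guard makes small merges safe, but every flag you add is disabled code you must eventually remove.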
[00:11:48] Unknown:
And another comment that you made brings up the question of who should be involved in the code review where you said, you know, oh, as you're doing the code review, you identify, oh, who's the best person to review this code? And that's a very subjective question. That's also a missed opportunity to onboard more people into that area of the code base so that they can become familiar with it and start having questions and, you know, conversations about what the code does, why it does it that way. And I'm just wondering if we can explore that a little bit about kind of who should be involved in a code review. Like, what does that decision process look like?
[00:12:30] Unknown:
You're absolutely right. And when I said the best person to review this piece of code, we were, in a given context, talking about something that needs to urgently get out. And, you know, when you do that, a lot of the other considerations just fly out the window. Right? But when you're thinking about your normal day to day operations, you really have to start with why. You know? Why are we doing this code review? And there's so many pieces out there that focus on what and how. You know? What do you do in a code review? How do you perform it? But very few actually discuss the why, and you were right. You mentioned a few motivations that most people don't necessarily think about. Getting people to understand the code base better is amazing, and this works both for the reviewer and the author. Right? If I'm authoring a piece of code, obviously, the person reviewing me has the opportunity to teach me. I'm doing air quotes, which are not visible in podcasts, but that person can teach me, you know, both general objective lessons and individual lessons related to the code base that I'm working on. But I can also learn as the reviewer because I might be reviewing someone who understands the code base really well and is maybe doing things differently.
And, initially, I won't understand why, but seeing them in action might actually help me explore other ways in which to utilize that code, or some of the best practices around this code base. So that's just 1 example for motivation, you know, that ability to share the knowledge with the entire team. You mentioned fixing bugs earlier. Obviously, another potential motivation for a code review, enforcing code styles. And then, you know, we're gonna probably discuss later about what should be automated and what should be manual. But a code review does have the potential to explore a lot of these things. So bugs, performance considerations, security considerations, design, APIs, which might be the most important aspect of a code review, making sure that you have the right APIs.
Maintainability. I don't wanna test that the code doesn't have bugs. I wanna mostly invest in making sure that this code is maintainable and other people can work on it. So once you, as a team, decide on the motivations and sort of give weights to the different motivations and it doesn't have to be very formal. Right? You don't have to say code reviews are 40% about maintainability, 20% about performance. It doesn't have to work that way. But once you do have a shared knowledge of the things that you care more about in a review and the things you care less, that's when you can decide on, you know, who gets to review what or who is the right person to be brought into a review.
And to me, a system where the person reviewing is always more senior or more knowledgeable of the specific code base than the person authoring the pull request has something wrong in it. I think that, you know, motivations might not be the best motivations for a code review if this is always happening in this 1 direction.
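One concrete mechanism for routing reviews, which the episode doesn't name explicitly but which fits this discussion, is GitHub's CODEOWNERS file: it maps paths to default reviewers, and later patterns win over earlier ones, like `.gitignore`. The paths and team names in this sketch are hypothetical:

```
# .github/CODEOWNERS (hypothetical example)
# The last matching pattern takes precedence.
/billing/        @example-org/billing-team
/web/frontend/   @example-org/web-team
*.sql            @example-org/dba-team
```

Note that this encodes the "most knowledgeable reviewer" default On is cautious about; teams that want reviews to spread knowledge may prefer rotating or self-selected reviewers instead of, or alongside, ownership rules.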
[00:15:48] Unknown:
And then on the question too of who should be involved, there's also the question of how many people. You know? Is it just 1 person writing the code and 1 person reviewing it? Is it 2 people collaborating on the code and 3 people reviewing it? Like, what are some of the ways that that plays out where, you know, maybe you have 1 person writing the code, and then you have a junior person reviewing to kind of learn from it and a senior person reviewing to give the kind of feedback that gets it merged into production or, you know, wondering what you've seen as some of the useful kind of thought processes around how many people to involve in a code review because you also wanna make sure that you're not overloading people with doing code review all the time so that they can't get their actual work done. I say actual work as if code review isn't your work. So that's another aspect of it where, you know, there's this idea of, oh, my actual work is just writing code, and then code review is just something I have to do. You know, that also brings the question of kind of the value that you apply to the different types of work that are done as an engineer.
[00:16:43] Unknown:
Yeah. So first of all, an engineer's job is definitely not just writing code. In fact, I would say writing code is probably the least important skill in a software engineer. Not that it's not important, but it's the least important and potentially also the 1 that develops the least throughout their career. So a friend of mine once gave me a great example from a different domain. He said that when he wants to understand the state of the art in medicine, he's going to go to someone who just graduated med school. But if he needs someone to operate on his shoulder, he's going to go to a surgeon with 15 or 20 years of experience. It's kinda like that in software engineering as well. I think that college grads might write better code. And when I say code, I mean, like, at a function level. Right? You're looking at a very specific piece of code. I think it's very possible for a very junior engineer to master this skill better than someone with 20 years of experience.
But if you look at understanding an entire system, understanding workflows, understanding the implication of maintaining a production system, understanding what it's like to work in a team, that's where, you know, the more experienced engineers really shine. So, definitely, your job isn't just to write code. Code reviews are an important part of your job. As for how many people, you know, throughout my career, I noticed that absolutes don't work. You know? If I go the Star Wars way, only a Sith deals with absolutes. 1 of the most ironic pieces of dialogue in cinema history.
But you have a team, and people always say, you know, do what works for your company, but it's actually more granular than that. It's what works for a team. And if I'm managing a large engineering organization and that organization has a team that deals with the billing system and a team that does the front facing website, and I'm asking both of them to follow the same practices, I'm probably doing something wrong. Because there's absolutely no reason that I'm asking them for different expectations, but, you know, at the same time, asking them to work the same way. Right? So if I'm working on a billing system and every number has to be accurate no matter what, it probably makes sense for me to introduce several people into a code review and have them go over it.
But if I'm on a front facing website and my main KPI is running experiments all the time, you know, getting the most experiments out there and learning from them as quickly as possible. And if anything goes wrong, I wanna roll forward rather than roll back. Then, you know, a single person is probably good enough, and maybe sometimes you don't even need a code review. You just wanna, you know, roll with it and get things out quickly, but also make sure that you get them out of the code base if they don't work.
[00:19:31] Unknown:
Yeah. And that question of kind of different areas of functionality brings up an interesting point too about, you know, if you're working in front end versus back end teams or you're working in a microservices environment where if you're working on code that is at the perimeter or at the boundary between those different interactions, you likely wanna get somebody from the other team to review some of that code too so that they understand what changes you're making, and if the, you know, new API that you're creating or updates to an existing API match their expectations. Like, how do you think about kind of doing code reviews across teams or across kind of functionality layers?
And then we can get to this as a separate conversation, but I just wanna bring up the question of we've been discussing code reviews largely in the context of a given team within a company, but there's also the question of code reviews in open source and, you know, bringing multiple organizations together to collaborate on a piece of software. So a couple of different interesting aspects too of thinking about, you know, who should review what code.
[00:20:31] Unknown:
Absolutely. And I mentioned earlier that APIs are 1 of the best motivations for code review. And that goes to your point about, you know, when you're touching the outer perimeter of what you're doing, you want people from other teams involved as well, whether because they're the clients of this API or whether because you're making assumptions about the API they're exposing to you. So definitely. And open source is a whole, you know, a whole new dimension of complexity being added into code reviews. You know, we can discuss that at length as well, but open source reviews also usually have very different time constants
[00:21:11] Unknown:
than code reviews within a given team at a company. Yeah. The time aspect is also worth discussing. We talked a little bit about, you know, I have this change set. I need to get it merged before the release gets cut, but there are a lot of different aspects to this. So, you know, I open a pull request or changeset or whatever the terminology is in the system that you're working with, and there's a question of, okay. Do I then assign this code review to somebody because I know who is the best person to do it? Do I just say, okay. I've got an open pull request. I'm just gonna wait for somebody to have time to review it. You know? How do you think about the kind of balance of potentially interrupting somebody else's work by saying, I have this thing I need you to do versus interrupting my own work because I'm waiting for the code review to happen so I can move on? And how do you think about kind of what are some time horizons in which you need to take certain actions to make sure that that code review gets merged? And then that also brings in the assumption that getting that code review merged is your responsibility, which largely it probably is, but a lot of dimensions to that. Yeah. Definitely a lot of dimensions to that. And, you know, that's also an answer I can split between companies and open source. At a company, obviously, you want the time constants to be
[00:22:26] Unknown:
extremely short. The more time the code review is out there, the more likely you are to run into merge conflicts, the more likely you are to simply forget what you were doing or at least lose some of the context of what was happening. You know, as much as we all like to think that we can memorize the entire code base and know what's going on, it's very easy for us to forget the motivations behind some of the things we were doing earlier, which could be, you know, a week ago or a month ago. And, by the way, that's another reason why it's so important, not just for code reviews, but also for the code itself. And I always say that good comments answer why questions, not what questions.
So, you know, too much time, you forget what you've done. That's obviously bad. I would say that, ideally, you wanna get a code review the same day that you have the change set out there, hopefully, merge it within the same day. You can't expect everyone to just sit and wait for code reviews to come in. But at a large enough team, I would imagine that you usually can find someone who's available to do a code review sometime in that given day. And, you know, as much as we like to think that we're very busy as software engineers, it is a very mentally intensive profession, and people need to take some breaks.
And sometimes when you take a break, you don't actually need it to be a coffee break, just the context switch of doing something else, perhaps reviewing someone's code, I think, in many cases, goes a long way towards giving you that mental break that you need.
[00:24:04] Unknown:
The aspect of context also brings up the question of the pieces beyond just the actual change set that go into kind of a code review, particularly in GitHub, but I know in most other systems, there's the opportunity to add additional context. There's obviously the commit messages that you can add in, assuming that you're using Git or a similar source code management system. But in the interface in which you're conducting the code review, I guess, for the sake of simplicity, we'll just assume that we're talking about GitHub because most other systems have, you know, converged on a similar pattern. So if you're opening a pull request on GitHub, there's, you know, the text box where you type in, you know, this is the title. This is what I'm gonna say about it. And I'm wondering, you know, what you've seen as some of the best ways to think about what goes into that text box. Like, there's the option for pull request templates where you can say kind of here's a, you know, structure that I want you to fill out. You know, fill out this piece of information, this piece of information, and that's useful for either saying, I need this piece of information as a reviewer or as an open source maintainer so that I can know, you know, why this is being done or what's happening. Or it's an opportunity to encourage people to kind of reflect on what it is that they're doing so that they can have that context so that when they come back to the code after it's been merged, they know what the next step is. Like, what are some of the types of information that you want to see captured in that kind of code review interface so that everybody has that shared context?
[00:25:27] Unknown:
I think templates are great, by the way. And the more structure you can put into the information you're giving out to the reviewer, the better. But it's never a complete replacement. You can't really have an empty message or not do anything because the code should communicate a lot of what you're doing, but it's never enough and the structured data could never be enough. I would say, 1st and foremost, start with your motivation. Why are you making this change? And, you know, arguably, you could say a really good outcome of a code review is we don't need to do it. Right? I mean, sorry that you had to waste all this time writing code, but, actually, we don't need this.
Now when something like this happens, you wanna examine your practices and see if you could have realized that earlier on, but it's still very important that people understand why is this specific piece of code going into the code base. So focus a lot of effort on that. And then everything that you can do structured is great, and it could also inform your CICD system. For example, if it's structured, then, you know, maybe different tests are running as a result of the data that you're putting in there or maybe different things need to be verified as a result of that. And this is, by the way, very common in open source projects when you can't really trust the people because many of the volunteers are just volunteers. I mean, no 1 interviewed them to join that project. Right? They didn't go through a regular interview process like you do at a company. And then you wanna be extra careful. And then if you have something super structured and you can actually, you know, automatically tag the PR, use the CICD system to base its testing on top of those tags, I think it could go a long way.
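To make the "CI keyed on structured PR data" idea concrete, here is one way it can look with GitHub Actions, where a job runs only when the pull request carries a particular label. The label, workflow name, and script path are hypothetical:

```yaml
# .github/workflows/labeled-checks.yml (hypothetical example)
# Run an extra, expensive test suite only when a PR is labeled
# 'db-migration', so routine PRs stay fast.
name: labeled-checks
on:
  pull_request:
    types: [opened, labeled, synchronize]
jobs:
  migration-tests:
    if: contains(github.event.pull_request.labels.*.name, 'db-migration')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/run_migration_tests.sh  # hypothetical script
```

The same pattern works in other CI systems; the point is that structure captured at PR time (labels, template fields) can drive what gets verified automatically.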
In a team, though, you usually trust the other people. And if you don't trust them, then you have larger issues than code reviews. So if you trust the other people, then it's okay if you have a template but people don't always conform to it. Because I know what I'm doing. I know this specific pull request doesn't actually need this piece of information in the template. Now you have to be careful if you have people who are always not following the template, or if there's a pattern where senior engineers never follow the template and junior engineers do, because that's, you know, also kind of a role model problem that you're causing. But as long as the exceptions are well thought out, go for it. Don't be too strict.
It is more art than science, and, I mean, this is probably the biggest paradox of software engineering. We call ourselves engineering. We think we're super scientific, but an average marketing team is way more data driven and structured than your average engineering team. Right? So we are dealing with humans. It's okay to break the mold. Templates are great. They're there to guide you, not necessarily to enforce.
[00:28:32] Unknown:
In terms of the structure and consistency aspects of code reviews and the maintainability of the software, that brings us to the question of automation. What are the pieces of a code review that humans should be doing, and what are the pieces that we should be pushing to the bots, letting the bots take care of them and maybe eliminating some of that potential for conflict and tension that we discussed earlier?
[00:28:58] Unknown:
Yeah. I'd start by saying the obvious. No human should be involved in tabs versus spaces. Right? This is definitely a job for the bots. So formatting in general. You know, some languages like Go have their own formatter, which is great. You don't have to do anything, and there are no arguments, which is even better. Some languages don't, but you can always use a linter or anything of that sort. And when you do that, that's also great. Bots excel at this. And then there's static analysis, whether it's for performance, whether it's for security, whether it's for, quote, code quality. Anything you can push to machines is great.
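As a toy illustration of the kind of check that belongs entirely to bots, here is a minimal, hypothetical tab-indentation detector. It is not any real linter, just the shape of the job:

```python
# Toy formatting check of the kind bots should own, not humans:
# flag any line whose indentation contains a tab character.

def find_tab_indented_lines(source):
    """Return the 1-based line numbers whose leading whitespace has a tab."""
    offenders = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]  # leading whitespace only
        if "\t" in indent:
            offenders.append(lineno)
    return offenders
```

In practice you would reach for an off-the-shelf formatter or linter rather than writing this yourself; the point is that a deterministic rule like this never needs a human reviewer's attention.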
And then have the humans do the parts that humans are good at. Right? I mean, as much as we think AI mimics humans, it actually doesn't. It does something else very well that we are not very good at. So, you know, forget code reviews for a second and look at AI for games, like chess. The state of the art AI today doesn't play like a human player. It plays very differently, and that's totally okay. For code reviews as well, let the bots do what they do best. Let the humans do what they do best. But one of the biggest complexities I find with bots is that, on the one hand, you want a very cohesive, small, scope limited change set.
And when you introduce bots, it's really easy to get into scope creep. So, you know, you've changed your code style. You've switched from tabs to spaces. And this very simple change set now has all of the tabs changed to spaces, and this simple change in code looks like it's huge. Right? So you have to be very careful with bots and make sure that you set them up in a way that doesn't introduce scope creep. And, generally speaking, I think it's always great to split the actual change that you wanna drive from the syntax enhancements slash refactorings that you want to introduce that don't directly interact with the code that you're writing.
And this is always the contention point with bots. You do that, and all of a sudden your rules for line breaking change, and this very simple PR has turned from 5 lines into 20, which is not that bad, but maybe it turned from a hundred lines into 500, etcetera. So be very wary, and make sure the bots are scoped.
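One common way to keep a bot's output scoped, assuming your CI can hand you the set of changed line numbers from the diff, is to filter its findings down to those lines. A rough, hypothetical sketch:

```python
# Sketch: restrict bot findings to the lines actually touched by the
# change set, so a style-rule change doesn't balloon a five-line PR.

def scoped_findings(findings, changed_lines):
    """Keep only findings that land on lines modified in the diff.

    findings: iterable of (line_number, message) pairs from some checker.
    changed_lines: set of line numbers present in the change set.
    """
    return [(line, msg) for line, msg in findings if line in changed_lines]
```

The pre-existing violations elsewhere in the file still exist, but they become a separate, deliberate cleanup change instead of noise on this review.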
[00:31:41] Unknown:
And there's also the question of how many bots. You know, there are definitely projects I've contributed to where I've seen, oh, you need to wait for all these checks to run, and the list just keeps growing. Sometimes it's, oh, here are one or two things that it's gonna check automatically. Sometimes it's, oh, here are 15 different processes that have to run to make sure that your PR is doing what we want it to do. And that also adds back in the question of timeliness: what is the turnaround cycle from, I've submitted this pull request, to, okay, now I've gotta wait an hour for the bots to run before I can actually do the review, or before I wanna bother doing the review. So
[00:32:19] Unknown:
Yeah. So, you know, back in the day, when trains were running on steam engines, we would write our code and then we would compile it. And a project could take an hour to compile, and you would just go and wait around. And, actually, an hour is very optimistic. If we go even further back in time, projects could take a day to compile and more. So, luckily, we're no longer there, and that is no longer the bottleneck. And you're right. What we've done in many cases is introduce bots to replace that very long process.
Now it is great that we've introduced those bots, because back in the day, all we got as a result of that very lengthy process was compiled code. Now we actually have compiled code that has gone through a lot of automated checks, whether it's with regard to correctness through tests, whether it's with regard to performance, security, etcetera. So it is great to introduce those bots, and I think it's less about how many bots you introduce. It's about making sure, like you said, that timeliness is respected. And there are two things at play here. You know, timeliness, obviously: automated checks run after you have created a change set.
And in many cases, you don't want a human reviewer until those have run. Sometimes you do wanna parallelize, but sometimes you just say, I'll wait until all the checks pass. But it's also: how do you ensure that those checks run quickly on your own computer, so that whenever you're making a change, you can actually make sure that it's passing? The fact that it's running as a result of your change on a larger system is just for verification purposes. It's never supposed to fail there, because it already succeeded on your local machine.
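One way to make that local run fast enough to be habitual is to pick checks based on which files changed. The check names and suffix mapping below are placeholders, purely for illustration of the idea:

```python
# Sketch: a fast local pre-push pass that selects checks by file type,
# so the local loop stays quick and CI is just confirmation.
# The check names are placeholders, not real tools' invocations.

CHECKS_BY_SUFFIX = {
    ".py": ["format-check", "lint", "unit-tests"],
    ".md": ["spell-check"],
    ".yaml": ["schema-validate"],
}


def checks_for_files(paths):
    """Return the ordered, de-duplicated checks implied by the changed files."""
    seen, ordered = set(), []
    for path in paths:
        suffix = path[path.rfind("."):] if "." in path else ""
        for check in CHECKS_BY_SUFFIX.get(suffix, []):
            if check not in seen:
                seen.add(check)
                ordered.append(check)
    return ordered
```

Wired into a git pre-push hook, something in this spirit gives you most of CI's signal in seconds instead of hours.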
[00:34:05] Unknown:
That also opens up the door for the conversation on anti-patterns in code review, where one of them is, I have 30 different bots, and now it takes an hour and a half for all the checks to run. But what are some of the types of anti-patterns or negative feedback cycles or ways that code review can be a hostile process in an engineering team?
[00:34:27] Unknown:
Yeah. So first of all, for an open source project, where you know that no reviewer is even going to look at your PR within at least 24 hours of it being opened, then maybe it's okay to have a two-hour CI/CD pipeline. Just make sure that people can run some version of that pipeline locally that doesn't require testing the entire project. If they can run it just on their change set before they push, and do that very quickly, then who cares that after they've pushed, it takes two hours? But if you're working at a company, then those two hours could be very costly.
That's definitely an anti-pattern of code reviews. Another anti-pattern of code reviews is bottlenecks: certain things that can only go through one person, or even one team. You know, if you have a dozen teams, but there is one piece of code that only one team can review, regardless of what's happening, who's touching it, who's doing what, and it has to go through someone from that specific team, you've introduced a bottleneck into the system. And, you know, Conway's law tells us that systems look like the teams that make them. Right? Or it doesn't directly say that, but that's the corollary to it. And if you have a bottleneck in your architecture, it's going to cause a bottleneck in your team. If you have a bottleneck in your team, it's usually an indication you have an architectural bottleneck.
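That kind of bottleneck can even be spotted mechanically. Given a CODEOWNERS-style mapping of paths to owning teams (the paths and team names below are invented for illustration), a sketch might flag any team that gates a large share of the tree:

```python
# Sketch: flag teams that are sole required reviewers for a large
# fraction of the mapped paths, i.e. potential review bottlenecks.

def bottleneck_teams(owners, threshold=0.5):
    """Return teams owning at least `threshold` of all mapped paths, sorted.

    owners: dict mapping path -> owning team (one required reviewer team).
    """
    total = len(owners)
    if total == 0:
        return []
    counts = {}
    for team in owners.values():
        counts[team] = counts.get(team, 0) + 1
    return sorted(team for team, n in counts.items() if n / total >= threshold)
```

Whether the fix is re-architecting the code or redistributing ownership is exactly the Conway's law question raised here, but measuring the skew is a reasonable first step.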
So it's something to keep in mind, definitely something you wanna avoid. But the biggest anti-pattern of all, I would say, is reviews turning into heated discussions. Right? That goes back to what we said earlier about reviews being where soft and hard skills meet. And if reviews end up being fights all the time, then, obviously, there is something wrong happening in the team. Written communication is tough, so it's super easy for those fights to start happening, and, you know, I've done that before. I try to be a bit calmer these days, but earlier in my career, I got into heated discussions on code reviews. So I can totally get it, but that's something you wanna make sure you don't run into.
[00:36:38] Unknown:
Yeah. The heated discussions piece also brings in the question of ego and the overall principle of, you are not your code: trying to separate yourself from the work that you do and your identity being represented in that work. And that's definitely a very hard thing to do, particularly earlier in your career, because you invest so much of your time and your energy and yourself into making sure that the code that you write is as good as it can be, and that you're using all of the things that you learned in the process of getting to that point. So when somebody says, oh, no, that's wrong, you should do it this other way, it can be an even bigger issue than it might seem. And if you're later on in your career and you've already been through that cycle, you might say, oh, yeah, no, that doesn't make any sense, do it this way instead, because you've been working with the system for years and you know exactly what it's doing. So there's the challenge of figuring out how to deliver that message in a way that is sympathetic to the fact that other people's egos are wrapped up in that work.
[00:37:33] Unknown:
Yeah. And it could also play into insecurities. I mean, if you're new on the team, even if you're very experienced, people questioning your judgment could feel like, wait, maybe this is going to hurt my reputation on this team. Maybe they're gonna think that I don't know what I'm doing because my code isn't up to par with everyone else's. So insecurities are not just about your first year as a software engineer. You could run into them at every stage of your career, especially if you're switching companies.
But even if you're switching teams within a company, you know, there's always the risk of getting too attached to your code as a result of thinking that it reflects badly on you if the code is not ideal.
[00:38:18] Unknown:
Going back to that question of, I'm new to this project or I'm new to this team, and maybe I write code in a different way because it's what I'm used to. That also brings up the problem of onboarding onto a project and establishing shared context, shared norms, and trust among the people who are working on the project, or ensuring that everybody is working from a place of assuming the best intentions. And particularly in the space of open source, it can be very easy to have that kind of heated exchange of, you know, what do you mean you're not gonna accept this PR? I just worked for hours on this. And, well, you didn't ask me first. So what are some of the ways that both teams within an organization and open source projects can set those expectations, set those standards, and maybe bypass some of those potential mixed messages or miscommunications or over-reliance on assumptions, to make sure that everybody is working from that shared space of understanding?
[00:39:20] Unknown:
So I think a team has a few unfair advantages over an open source project, specifically when it comes to code reviews. I'm not slamming open source projects. I'm a contributor to a lot of them. I love the concept and the idea. But, specifically, when it comes to code reviews, teams have some great advantages. First of all, the team is curated. Right? Like I said earlier, everyone has gone through some process to be included on that team. And that process usually determines two things: one, the technical ability of that person to work on this project, and two, the social ability or cultural ability of this person to work on this project.
You know, it's not as if people are objectively good or bad, but you've gone through some vetting process, whether it's an internal one or an external one, it doesn't matter. Maybe you came from a different company. Maybe you came from a different team. You still had to go through some vetting. The second aspect is, like you said, you've onboarded. So you've spent a bunch of time understanding how this team works, and very few teams work alike. Right?
There are just too many variables at play. So very few teams work alike. And like I said earlier, even within the same company, different teams have different expectations. You don't expect them to work alike. So you've onboarded. You know what this team expects of you and what to expect of other people on the team. And then, third and maybe most important, you know the other people on your team. And like you said, you assume good intentions. But it's not just assuming good intentions. It's the ability to get up and, if you're in the same office, tap someone on the shoulder, or, if you're working remotely, hit them up on Slack and try to move the conversation to a more private space where you feel like you can take the flame level down. All of these are just unfair advantages of doing a code review at a company. It doesn't mean that it doesn't have the same potential for getting heated, for causing problems, but it's harder to get into those types of arguments. Now an open source project is the complete opposite.
No one has vetted you, and the whole concept of a pull request being against a forked version of the repo is all about no one trusting you with the original version of the repo, you know, the main version of the repo. Right? That's the first part. No one has vetted you. No one has onboarded you. People are hoping that you've read the documentation, assuming there is good documentation. And, you know, this is a pitfall for teams as well. They don't necessarily have good documentation, but at least they're working together, so they can pick up on missed stuff. People assume that there is good documentation, that you've read it, and that you know how the team works.
You're creating a PR that is based on an understanding of how the system works, and that assumption is false probably in most cases. And, you know, I was just looking at something yesterday where someone was saying something like, I'm not really interested in understanding how this project works and the architecture and all of that. I just want this very simple piece of code, which was not simple and was actually very controversial. So, you know, that's a problem with open source. And then once things do get heated, you might not even have a way of getting in touch with the person on the other end. It's not just that you don't know them.
You don't really have an avenue to say, hey, buddy, sorry things are getting out of hand. Let's talk about this for a few minutes and try to work things out between us instead of having a very heated discussion. You don't have a way to do that. You know, I've had people curse at me on GitHub, and I can't reach out to them and try to take the flame level down. It's like, alright, fine, I'm just gonna take myself out of this discussion, because it's not fun for me anymore, and it's an open source project, and I only wanna do things that are fun for me. Whereas on a team, if someone cursed at me, which would be very rare, I would just get off my chair, reach out to them, and say, hey, listen.
This isn't cool. Let's talk about this. Maybe I was wrong too. That's fine. But let's figure out how we solve this. And, you know, we have to work together in the future, so let's make sure we can communicate in a very respectful manner, which you can't do with strangers on the Internet in many cases. And you know what? Sometimes you can't even get a hold of the maintainers. You wanna know how to do something, and you can't get in touch with a maintainer. They're not available, or maybe they're not even there anymore. It's just, you know, all these complexities that make it really hard and create an enormous trust issue. And I love open source projects. I'm going to contribute to them hopefully till my last day, or at least my last day writing code.
But, you know, sometimes these things are super hard. And if you look up some of my open source contributions, you're definitely going to see heated discussions. You're going to see me get slammed, both technically and on my approach to criticism, and that's fine as long as things eventually resolve in a respectful manner. But there are just a lot of examples where they don't.
[00:45:08] Unknown:
So in terms of your experiences, both working in teams and as a manager and contributing to open source projects, what are some of the most interesting or innovative or unexpected ways either that you have seen code reviews used to achieve a particular outcome or workflows around code reviews that you've seen?
[00:45:27] Unknown:
Yeah. So I don't know if it's, you know, that innovative, but what I really like to see is code reviews that aren't about code, or aren't about code in a programming language. So we have this notion, especially now that DevOps is such a big thing, we have this notion of configuration as code. Right? But documentation as code also makes a lot of sense. And workflows as code also make a lot of sense. Like, I wanna know how this team is working. Having a repo, and having code reviews against it, really lets the team have good discussions about how it wants to work. And if you remember, back in the beginning, when you asked me what I think a code review is, I said it's a discussion about a specific piece of functionality that you want to introduce. And that piece of functionality doesn't have to be functionality for the production system. It could be functionality for the team itself, the operation that is the team, and I really love seeing those. I love seeing repos that are not your usual suspects. They're not a code base in the traditional sense, but I like to see the envelope being pushed in any direction possible. Like, I've even seen VCs that document the way that they work in a repo. I haven't seen pull requests on those, but that would have been really awesome. If you could see the partners in the firm discussing the way that due diligence is being done over a code review, that's nuts. These are the things I wanna see.
[00:47:01] Unknown:
Another good example of that too is things like the Python enhancement proposals, where it's not software that's being written and discussed, but they go through a code review process of, I have this suggestion of how we should think about functionality in this programming language or in this broader ecosystem. So I'm going to write a document, I'm going to submit it for review, and we're going to have that discussion inline before it gets merged. So that's another example where a code review isn't necessarily about code.
[00:47:30] Unknown:
Exactly. And that's why I think, you know, Python is ahead of the curve in many ways, and that's one of them. It's not the only language that does this, but it's definitely one of the best communities out there because of things like that.
[00:47:45] Unknown:
In your own work, both in engineering and being involved with code reviews and leading and supporting engineering teams, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:47:55] Unknown:
So I think the biggest lesson for me was understanding the difference between writing good code and being a good engineer. It's something I touched upon in the beginning. But when you push your first change set out there, that code could be perfect for all you know, but you haven't taken into account that it actually goes into a production system. You know? Maybe it's an amazing piece of code. You've implemented this algorithm so efficiently, but it doesn't have instrumentation, so if anything goes wrong, we don't know what's gonna happen.
It doesn't take into account, you know, bad data being in the database from legacy versions, so it's gonna fail once it hits production, even though it's working on your machine. These are the aspects that you really pick up with experience, that you really have to practice. As software engineers, we have to practice our craft. It's not just about learning. It's kinda like pilots: they have to practice, and a regular flight is not going to prepare you for the time you need to land on the Hudson. You need to go through a simulator. And as developers, we're used to practicing on the job, and we really need to get a lot of practice to pick up those skills.
And once I realized that, a lot of things clicked for me, and I realized what's the best way to mentor developers to get them up to speed quickly, but also how you get senior developers to invest more efficiently in their career growth, because, you know, it's a never ending process. Right? One of the great things about software engineers is that we have this intrinsic motivation to become better at what we do. So once you realize that becoming better is not necessarily about writing better code in the narrow sense, to me, that was the realization that really affected my career.
[00:49:51] Unknown:
As you were saying that, one of the other things that came to mind, which we didn't really discuss in detail and don't need to spend a lot of time on, is that question of who should be doing code review. As somebody who is managing a team of software engineers, you should probably be involved in code review as well, even just to know what is being discussed and the ways that your team is doing the work. And that's another valuable aspect of code review: it surfaces these interactions that can otherwise be very invisible or opaque.
[00:50:23] Unknown:
Yeah. Definitely. And, you know, as a manager, your ability to go over code reviews actually exposes a lot of what's happening in the team, both from a technical perspective and from a communication perspective. I keep coming back to this about code reviews being the intersection between soft and hard skills, and it works on so many levels. As a leader, as a manager, you definitely want to be involved in code reviews. And, you know, I hear a lot of managers say, I'm not hands on. But then I realize, actually, they're doing a lot of code reviews. They're giving out great suggestions. They're also picking up new things about how the code base works, which they then integrate into conversations with other teams or into conversations with customers.
To that, I say, well, if you're doing code reviews, you are hands on. That's part of your hands on job. That's part of what you're supposed to do, and it's great that you're doing that. And, you know, if you look at the Wilco code base, I think the first ever pull request was created by me, and I've also reviewed a few. Now, that code review was mostly changing text. It wasn't about code. But still, I think that being part of that workflow was super important.
[00:51:36] Unknown:
For people who want to dig a bit deeper into this topic and understand more of some of the nuances and complexities around code review and all of the technical and social aspects that go into it, what are some of the resources that you found most useful?
[00:51:51] Unknown:
That's one area where I think open source projects really shine, because it's all out there. Right? There is a huge corpus of knowledge in open source code reviews: best practices, anti-patterns, good reviews, bad reviews, you name it. Everything is out there. So I would say start by just going over code reviews. I mean, there are a lot of great blog posts. There's a lot of great theoretical knowledge about code reviews, but just get yourself into them. Start reading reviews. Try to understand: is the communication going well in that review that I'm reading? Could it be improved? What would I do differently as the reviewer?
Or what would I be doing differently as the author of this pull request? And just get as many reviews in as possible. I think that's the best way to learn. And if you look at open source projects, there are some really great reviewers out there. And because reviews in open source take a lot of time, sometimes they actually strive for better quality overall, because, you know, if this review already took three weeks, then we might as well spend the extra three days to really get it to the highest level possible. So I think it's just great to see. Do remember that code reviews at a company are different in many ways, but just familiarize yourself with as many code reviews as possible.
And, of course, if I can shamelessly plug Wilco: by using Wilco and gaining practical experience, one of the things you'll be doing almost all the time is code reviews, because everything revolves around them. Absolutely.
[00:53:41] Unknown:
Are there any other aspects of this overall space of code reviews and the role that they play in engineering teams and supporting the techno-social interactions that we didn't discuss yet that you'd like to cover before we close out the show?
[00:53:58] Unknown:
Just think about the other side. Whether you're the author or the reviewer, there's someone on the other side who might not be as secure as you are, or might not feel as much a part of the conversation as you are, or part of the clique, whether it's a team or an open source project, or just feels like they're unheard, or anything like that. Try to be in their shoes. You know, what I'm saying right now might be trivial, but we tend to not think about it, especially when it's over text, especially when we don't know the other party. Just try to put yourself in their shoes and think about how you leverage your clout within a given project, or your clique, or your understanding, and basically try to get them to absorb as much as possible of your knowledge, your access, or anything else that goes with it.
[00:54:56] Unknown:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And with that, I'll move this into the picks. This week, I'm going to choose a book that I've been reading with my son called The Girl Who Drank the Moon. It's just a very interesting story, very well written, with a lot of layers to it and a very slow reveal. We're about halfway through, and I have some guesses as to some of the things that are going to happen and some of the things that they're hinting at, but it's just a very well written book, so I definitely recommend that. And with that, I'll pass it to you, On. Do you have any picks this week?
[00:55:29] Unknown:
Sure. Well, you know, if we're going all out and off topic, I can't ignore the timing. Better Call Saul just ended, and I have to say this was one of the best TV shows ever, I think. You know, a lot of people have watched Breaking Bad and didn't really get into Better Call Saul, but I would say this is a piece of television that is just phenomenal. And then on the technical side, I mentioned Home Assistant earlier. I think it's one of the best Python projects out there. I use it. I contribute to it. I encourage everyone to try it out. You can really do great stuff in your home with very little effort.
[00:56:09] Unknown:
Well, thank you very much for taking the time today to join me and share your thoughts and experiences on this overall space of code reviews and the role that they play in technical teams and open source projects. It's definitely a very important subject, and one that a lot of people start to take for granted, so it's great to be able to have a more nuanced conversation about it. I appreciate all of the time and energy you put into helping engineers upskill in that space, and I hope you enjoy the rest of your day. Thank you so much. Thanks for having me. This was fun.
[00:56:42] Unknown:
Thank you for listening. Don't forget to check out our other shows, the Data Engineering Podcast, which covers the latest on modern data management. Visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you learned something or tried out a project from the show, then tell us about it. Email host at pythonpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction and Sponsor Message
Interview with On Freund: Introduction and Background
Defining Code Reviews and Their Purpose
Subtleties and Activities in Code Reviews
Who Should Be Involved in Code Reviews?
Code Reviews in Open Source Projects
Automation in Code Reviews
Anti-Patterns in Code Reviews
Innovative Uses and Workflows in Code Reviews
Lessons Learned from Code Reviews
Resources for Learning About Code Reviews
Final Thoughts and Picks