Ep. 3 | Onboarding to a new codebase - Overcommitted

Erika (00:00) Hey everyone, I’m Erica and welcome to this week’s installment of the Overcommitted Podcast. We are a group of developers talking about the work that we do and how we get better every day at doing it. And I’m joined today by Bethany and Jonathan and Brittany.

And the topic today is learning a new code base. Brittany recently wrote a GitHub blog post on this subject, which was inspired by her joining a new team at GitHub and onboarding to all the new code and the domain knowledge required there. But we figured this is a relevant topic for

really any day to day of a software developer since part of the job is constantly learning new things, getting up to speed, staying up to date on new technologies and having to do that quickly. So we are going to start our discussion by going through talking about some of our strategies. So…

Brittany, I don’t know if you want to kick us off with kind of your overview on what you found on some helpful approaches and strategies and tools.

Brittany Ellich (01:13) Yeah, for sure. So this blog post, was originally an internal post, I think early last year that I put together after talking to, getting input from a bunch of different engineers at including you three. So thank you all for your input on this. But yeah, so the goal was just to reach out to as many people as possible and figure out, hey, how do you get up to speed quickly? Especially since we work remotely, everybody’s doing things a different way. So it was kind of nice to bring everything together.

One of the things that I found the most useful for me was documenting and going through and, you know, I put together a high level document that I just threw everything into as I was researching this new

base that I was getting on board onto. And then having that reference was super helpful. And it was also really helpful for working on the documentation of that code base later on. That was one of my hackathon projects once I, you know.

started getting into it was to actually revamp our documentation as well. So it was helpful for multiple people in multiple cases.

Erika (02:14) Well, that’s great. I know that documentation can be amazingly helpful when it’s up to date and accurate, but also a pain point when it is not. When it’s either outdated or not there. yeah, paying it forward, using that knowledge to help others and help yourself is amazing.

Jonathan, Bethany, do you have any thoughts on your initial approaches when faced with the new code base? Like, go to starting points and does it differ depending on the code base or the language?

Bethany (02:50) For me, I’m very much a person that thrives by just having something to run with. So a starting issue, a problem or whatnot. When I started on Copilot API, I actually dove into our sentry alerts and trying to figure out where a lot of our exceptions were coming from and figuring out how to fix

them or how they work and stuff and that kind of helped me start off strong with understanding what metrics we care about, where we’re getting errors, how they populate through this, propagate through the system and whatnot. But I really love just having an issue or something that I understand the value of doing and then figuring it out from there.

So that was definitely very helpful when I was starting. Additionally, pair programming was really helpful. Loved being able to work with people who were used to the system, who understood how it worked and its quirks and all that. And so that was very helpful while starting out as well.

Jonathan Tamsut (03:53) Yeah, and you know, it’s kind of it’s interesting because there’s you know code bases that you know vary in size and vary, you know Among a bunch of different dimensions. I think when I joined github there are like a lot of large code bases and I sometimes feel the pull to like Redocumentation and spend a bunch of time doing an overview and I found myself actually coming, you know to the point where I was like well

Why don’t I take an issue and let me try to finish this issue and understand as little as possible. And like about the code base. And I think that’s kind of an interesting principle because it’s like you’re forced to like, A, it kind of solves like I’m totally being overwhelmed. B, I think just like diving in and doing things is super, super useful in like just how I learn. And C, like

It also kind of teaches you maybe a little bit about the abstraction boundaries in a code base. Hopefully you can work on this little module, understand it, change it, and there’s this tangled mess reaching out into far off reaches of the code base that you can change. And if there are, well, you’ll learn about it when you either run it locally or test it or, case scenario, ship it and things break. So I think that was helpful. Yeah, I think…

Yeah, pair programming is certainly helpful. think like architectural diagrams are like if there’s like a strong architectural philosophy to code base, which I think like generally good code bases, there’s a little bit of forethought into like, this is how we’re organizing things. These are sort of our guiding principles. I think those are really powerful and kind of just like universally applicable through a code base, hopefully.

Erika (05:21) you mentioned in your blog post, Brittany, the importance of having some good first issues tagged as the maintainer of a code base. And I think that definitely touches on what Jonathan is saying of, you know, like having a large code base can sometimes be overwhelming. And sometimes the smallest things can take the longest time if you’re

if you don’t have the basics of running the code base and getting everything set up. yeah, those are often really good places for initial contributors to jump in.

Brittany Ellich (05:50) sure. think one thing too that Jonathan said that really resonates with me is when I working at GitHub is the first time I’ve worked at a big tech company. And one of the things that I had to learn was to learn to be uncomfortable with not or get used to the discomfort of not knowing everything where previously I’ve worked in a place where, you know, I could see end to end where everything was.

GitHub and a GitHub scale, that’s just not possible. You know, I worked in actions, we all worked in actions for two years and I still at the end of that I was like, I’m not really sure how this works. you know, like there’s, yeah, it’s a different scale and you have to really be, accept the fact that you’re not gonna know everything, which is hard.

Erika (06:32) Yeah, I think that also highlights the importance of knowing where your contributions fit in into the wider scale. So even if you don’t understand how everything fits together, if you know where your piece is being used and how it potentially relates to other parts of the code base.

that’s helpful. Like, for example, like Bethany was talking about with using telemetry, like, is this a hot path? Is it very error prone? Like, do we get rate limited? Does this have existing issues? Like, how is this performing now? So that when you look at it, you know, in production, like, did I make it worse? Did I make it better? Yeah, so that’s also…

Yeah, a good place to start with kind of understanding the context of your change or your contribution without boiling the ocean and having to understand every single layer of the application.

Jonathan Tamsut (07:33) One other thing that came to mind is using a debugger. I know there are members of our community who are anti-debugger, but I think just putting a debugger and walking through your code has definitely been helpful to me. it’s sort of similar to like, if you’re using a distributed tracing system, right? It’s like that, but just locally.

Erika (07:52) Yeah, another tool that you brought up, Brittany, was using Copilot. And I have to admit that since I read your post, I have been using some of the prompts that you suggest around asking Copilot for specific input-output scenarios or to explain in plain text what a method is doing. And I have found it super helpful.

for jumping into parts of the code that I’m not as familiar with. So thank you for providing those suggestions. And anything else that you’ve found since writing that or Jonathan, Bethany, any experiences you’ve had with using Copilot for jumping into a code base?

Jonathan Tamsut (08:34) I used copilot yesterday actually to, I was reading a file and it was actually a relatively straightforward file but I don’t know, was just, there’s something about it that wasn’t parsing and I just asked copilot just questions like, know, what happens when this flag’s not set? Does this line get called? Et cetera, et cetera. And it did a pretty good job, I think, right, like with cross file stuff.

copilot can maybe struggle, I think I mean generally my general philosophy with copilot and and maybe I should make it more sophisticated I just ask it questions in You know that I would ask any human or that I just organically have but I’m sure I could be prompting it better

Erika (09:12) But one other use case I’ve used it for is creating documentation from database schemas.

So having Copilot create like an entity relationship diagram from a schema or something if that’s not available. It’s actually done a pretty good job. It’ll generate like a mermaid diagram. So yeah, that’s kind of a nice thing to have.

Brittany Ellich (09:35) Something I came across recently is I’ve kind of accepted that I’m just not ever going to be a good Ruby on Rails developer. And so I’ve really offloaded a lot of that to a lot of that logic to copilot. There was something yesterday where I was like, why is this? You know, I was trying to just do a Boolean check and for some reason, like the bang before the like different ans. Anyway, I can’t explain it with words.

but I could not figure out why it wasn’t, you know, equating to what I thought it was. And I gave it to Copilot and it was like, oh, you just have to do this because this, you know, in Ruby, this is going to be evaluated before that. I’m like, there’s no way I would have ever known that, that that was the reason that that was the issue. So I’m happy that that exists.

Erika (10:11) Yeah

Bethany (10:15) Yeah, I agree. I think I always struggled with writing in Rails because it was very magic to me. And so I really appreciated that, like, you can look up documentation for anything. I could say, yeah, how do I do this in Rails online? But it’s harder to understand how to do something idiomatically in these, like, very opinionated languages. Like, I feel both Go and

Erika (10:15) them.

Bethany (10:41) are very opinionated but in opposite directions and so it was really helpful to say, hey, I want to do this, how do I do this but idiomatic grails then it would…

tell me and I was like, OK, that makes sense and walk me through exactly what’s happening in the code snippet so I would understand for the future. I do just really love that it almost turns your code base into your own personal Stack Overflow. So you can go in and ask a question like you would in Google and have references and examples and stuff from your code base, which is just really cool, I think.

Erika (11:15) It is cool with the caveat that as with Stack Overflow, it’s dependent on the quality of the previously written code. So if you’ve written a quality code base, then you’re going to get quality answers. Something that we have…

all mentioned as a helpful communication tool or practice for getting up to speed is pairing. And this is a way to get tips from coworkers, colleagues, expertise from people who’ve worked in the co-base previously. But something that I know I’ve had is varying degrees of success with.

with pair programming, sometimes for reasons of the person that I’m pairing with, but sometimes also for myself and holding back any questions I have because as a newcomer, sometimes I don’t want to like I’m asking too basic of a question or something like that. So is this something that any of you have struggled with and maybe have you?

overcome it by, yeah, like knowing what questions to ask or, yeah, making pairing successful.

Jonathan Tamsut (12:25) Shout out to episode 1 on imposter syndrome.

Bethany (12:28) Yeah.

I think with pairing, you have to kind of treat it like an investment of some sort. Like, you’re probably not going to be, no individual is going to be as productive in terms of output of work because you’re not necessarily just purely working, you’re having conversations. It’s more of an investment in understanding the code and having

this shared knowledge and shared collaboration experience. So I think it’s beneficial in ways that aren’t related to maybe like how long it takes to code something. Because it’ll nearly always be quicker to just like pump out code, at least if you’re familiar with the code base and whatnot. So I don’t think, I think it’s important to not be scared to ask questions because that’s kind of the…

purpose of pair programming is to have that outlet and that way to understand better. At least when I pair program with people or especially people who are newer, I always can tell when they’re holding back just because I’m like, okay, I understand what I’m saying doesn’t make sense, but I can’t really.

unless I know how it doesn’t make sense or how I can rephrase. I really think the sessions with just a lot of questions are the most fun and the most productive in a way.

Jonathan Tamsut (13:47) Yeah. And I think, you know, it’s interesting, you know, I don’t like, like what makes a good pair programmer. think right. Like, I think if, if you’re, if it’s sort of like a teacher, you know, there’s kind of different ways you can pair program. Like sometimes I’m just like, there’s like something that I’m stuck with, but like, I just, like, I could solve it, but I don’t have sort of the right mental configuration. So it’s not like, it’s not necessarily that I’m like missing facts. I just need someone who has like a different perspective.

And then there’s pair programming where like, yeah, one person is teaching another. I think it’s like, when I try to do that, I try to be as like selfless as possible, know, sort of, you know, really, really try to focus on like their experience. I think like they should lead if you’re trying to teach someone and like you should kind of be there to, but it, you know, to guide them, but it requires like patience. And I know, you know, that’s hard.

And I think also people, I think people in general, especially people who are more experienced forget how difficult things were when they started and forget that they struggled too. you know, there can be frustration there.

Erika (14:43) I also find that sometimes when you’re working with somebody who has worked on the code base before, sometimes they’ll apologize or qualify that the code that they’ve written as sort of a cover-up scenario. you know, code is code. It’s written. If it’s done, it’s there. Like, it’s really okay. You don’t need to qualify it. Sometimes it’s a good learning opportunity to say,

this is an area that could be improved. I know that this is somewhere with a lot of tech debt that we could invest in at some point. But yeah, I’ve found that to take tangents often in these sort of teacher-student scenarios. And I found that like going in even as a student in these like initial onboarding.

programming pair programming sessions with a plan and like a set of questions and things that I want to know Maybe it’s like, you know, what what are your workflows? Like what are the tips and obviously like giving room depending on who you’re pairing with and like how focused they are like how good of a communicator they are but like Often having like key points that you’re trying to get away from the session That you know will be helpful for you in the future

has helped me stay on track. Yeah. And then also like reminding myself that it’s okay to ask even the most basic questions and none of this is magic. If it’s not clear to me, it’s probably not clear.

Brittany Ellich (16:05) One thing I’ve noticed working with more and more senior engineers is like the most senior people that I’ve worked with don’t apologize for asking any questions. And they all also ask like the most basic of questions without, know, nobody else is out there saying like, wow, this person’s really dumb. How do they not know this? I don’t think I’ve ever had anybody at least say that to me. Maybe they’re thinking it, but I think the further I’ve…

When I was starting out, was like so apologetic, like, no, I’m so sorry. I have so many questions. And now I’m realizing ask them all earlier because the faster you understand things and the faster you get on board, the faster you can actually contribute. people on your team wants you to ask lots of questions because it not only helps get you up to speed, but it also helps them solidify their understanding. And maybe they’ll learn something by trying to answer those questions too. So it’s a win win for everybody to ask, ask lots of questions.

and continue to do it.

Erika (16:56) Yeah, that’s a great point.

Jonathan Tamsut (16:57) And

I think maybe this is a defense mechanism, but to me, there’s the distinction, least, I’m just being dumb, which what does dumb even mean? We can have a whole podcast on that. Versus just you not having that experience, right? And so you can’t really fault yourself for not having had experience. mean, you are where are. Some people who were just born before you started in this career earlier got switched to a team earlier.

There’s kind of, yeah, there’s like, feel less bad about that and more bad about, know, maybe I should have known this had I just thought harder. But yeah.

Erika (17:33) Another thing that you call out in your post-Brittany that I think is very perceptive is not only asking other engineers on your team for input or for clarification, but also product owners and members of other teams, other dependencies, any domain experts that can help you understand the problem area that you’re working in.

is a way to get up to speed quickly.

Brittany Ellich (18:03) Yeah, absolutely. One of the first things that I do anytime I join a new team is I set up one-on-one with every single engineer on the team, but also every single product manager that I know I’m going to be working with and everybody in design or like anybody somewhat adjacent to it because I found like having that their context and their knowledge is super helpful.

Erika (18:22) Yeah, that’s a great idea. All right.

Jonathan Tamsut (18:24) And it’s,

one more thing on that. It’s also interesting in a larger company, the wildly different perspectives or even goals that people in different positions have. And so yeah, I think it’s always interesting getting those different perspectives.

Erika (18:38) The last thing we’re going to talk about is our spicy hot take topic. Because we’ve, I’m sure all had the experience of running across things in a new code base that drive us absolutely nuts. So we are going to voice those grievances in this section of what drives us crazy when you’re spelunking or.

And I can kick us off. We mentioned documentation at the top of this podcast and it drives me absolutely crazy when I follow the Read Me and dutifully do the setup steps and it doesn’t work. And for me, the Read Me is your entry point. And if you’re going to keep anything up to date with your instructions for new people, you’ve got to keep that up to date.

but I can’t even count on one hand the number of times I have done that and everything falls flat on the floor. So for me, that is my absolute pet peeve. Please keep your read me up to date.

Brittany Ellich (19:33) I like that one. I’ll go next. My only real pet peeve, I think, is when I see like to-do comments that are like, we’re going to fix this later. And it was like two years ago. I’ve just stopped writing comments like that because I know like we’ll fix it when we need to, you know, well, it’ll, I’ll file an issue for it and, you know, get to it when we need to. But otherwise, if you’re going through there and like, why, why were you going to fix this? What’s wrong with it? Is it still broken? Is it, you know, is it still out of date? That’s probably.

That’s probably mine.

Jonathan Tamsut (20:02) Bethany, you want to give yours? I can give mine if you don’t want.

Bethany (20:03) Yes. No,

I can give mine. I would say for me it’s maybe PR templates that aren’t very helpful or I don’t see the purpose of all the information.

Jonathan Tamsut (20:12) Mmm.

Bethany (20:17) I know at GitHub at least we use PR templates heavily, but I don’t think I’ve ever seen one that has necessarily made me think more about the PR I was making or made it easier as a reviewer to file through that information. So I think I like PR templates that tell you how to deploy, how to…

merge your change in and what you need for getting approvals, but I think past that it’s being overly prescriptive is just not necessarily helpful or conducive to making it easy to contribute.

Jonathan Tamsut (20:51) Yeah, it’s funny. I was kind of gonna, you know, get on my high horse and say like, I have empathy for all these developers committing these pet peeves because I’ve done them all and it would be hypocrisy. But I will say issue templates or PR templates that have like required fields are super annoying. And to do’s are super annoying and yeah, bad readme’s are super annoying.

I’ve done them all, though. I’ve committed them all. So I feel, so I was, know, maybe they’re pet peeves, but I understand the human who made them, who did the peeve. mean, I’ve worked on, like, before here, I’ve worked on terrible code bases where, like, I was, like, a consultant, and, like, sometimes we’d be hired to, like, kind of…

Brittany Ellich (21:12) Same, yeah.

Jonathan Tamsut (21:31) like replace the like like we were hired by this company to replace their foreign dev team and they were just like openly antagonistic to us and they just like gave us the code base and like like a compressed like a tar file and it just like didn’t even run and So, yeah, happy having a code base that doesn’t run locally Or is really difficult to run in some local environment has driven me. I mean, I know we’ve all experiences of you’d have has driven me crazy

And you just want to test one little thing and you want this thing to run and it just can’t run and especially when it’s like a big complex distributed system. You’re just losing life hours trying to get this thing to a point where it

Erika (22:11) Yeah, it takes longer to get the thing up and running than to make the change. That’s a painful day of development. Great. Well, thank you all so much for your time and your contributions to this discussion. We will definitely link the article, the blog post in our show notes and

Please stay tuned for more episodes from Overcommitted about becoming a better engineer, one committed at a time.

Thank you all, talk to you later.

Listen

Hosted By

3: Ep. 3 | Onboarding to a new codebase

Show Notes

Episode Transcript