The “Dependency Cutout” Workflow Pattern, Part I
It’s important to be able to fix bugs in your open source dependencies, and not just work around them.
Tell me if you’ve heard this one before.
You’re working on an application. Let’s call it “FooApp”. FooApp has a dependency on an open source library, let’s call it “LibBar”. You find a bug in LibBar that affects FooApp.
To envisage the best possible version of this scenario, let’s say you actively like LibBar, both technically and socially. You’ve contributed to it in the past. But this bug is causing production issues in FooApp today, and LibBar’s release schedule is quarterly. FooApp is your job; LibBar is (at best) your hobby. Blocking on the full upstream contribution cycle and waiting for a release is an absolute non-starter.
What do you do?
There are a few common reactions to this type of scenario, all of which are bad options.
I will enumerate them specifically here, because I suspect that some of them may resonate with many readers:
1. Find an alternative to LibBar, and switch to it.
   This is a bad idea because replacing a core infrastructure component could be extremely expensive.
2. Vendor LibBar into your codebase and fix your vendored version.
   This is a bad idea because carrying this one fix now requires you to maintain all the tooling associated with a monorepo [1]: you have to be able to start pulling in new versions from LibBar regularly, reconcile your changes even though you now have a separate version history on your imported version, and so on.
3. Monkey-patch LibBar to include your fix.
   This is a bad idea because you are now extremely tightly coupled to a specific version of LibBar. By modifying LibBar internally like this, you’re inherently violating its compatibility contract, in a way which is going to be extremely difficult to test. You can test this change, of course, but as LibBar changes, you will need to replicate any relevant portions of its test suite (which may be its entire test suite) in FooApp. Lots of potential duplication of effort there.
4. Implement a workaround in your own code, rather than fixing it.
   This is a bad idea because you are distorting the responsibility for correct behavior. LibBar is supposed to do LibBar’s job, and unless you have a full wrapper for it in your own codebase, other engineers (including “yourself, personally”) might later forget to go through the alternate, workaround codepath, and invoke the buggy LibBar behavior again in some new place.
5. Implement the fix upstream in LibBar anyway, because that’s the Right Thing To Do, and burn credibility with management while you anxiously wait for a release with the bug in production.
   This is a bad idea because you are betraying your users, by allowing the buggy behavior to persist, for the workflow convenience of your dependency providers. Your users are probably giving you money, and trusting you with their data. This means you have both ethical and economic obligations to consider their interests.
   As much as it’s nice to participate in the open source community and take on an appropriate level of burden to maintain the commons, this cannot sustainably be at the explicit expense of the population you serve directly.
   Even if we only care about the open source maintainers here, there’s still a problem: as you are likely to come under immediate pressure to ship your changes, you will inevitably relay at least a bit of that stress to the maintainers. Even if you try to be exceedingly polite, the maintainers will know that you are coming under fire for not having shipped the fix yet, and are likely to feel an even greater burden of obligation to ship your code fast.
   Much as it’s good to contribute the fix, it’s not great to put this on the maintainers.
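To make option 3 concrete, here is a minimal Python sketch of what such a monkey-patch tends to look like. Everything in it is made up for illustration: the `libbar` stand-in module, its `parse_flag` function, the bug, and the version number. The version guard is the telling part: the patch is only known to be safe against the one release it was written for.

```python
import types

# Hypothetical stand-in for the real library, so the sketch runs on its own.
libbar = types.ModuleType("libbar")
libbar.__version__ = "1.4.2"
libbar.parse_flag = lambda text: text == "true"  # made-up bug: case-sensitive

# The only LibBar release this patch has been checked against.
PATCHED_VERSION = "1.4.2"


def apply_monkey_patch(module):
    """Wrap module.parse_flag, refusing to touch an unverified version."""
    if module.__version__ != PATCHED_VERSION:
        raise RuntimeError(
            f"patch written for LibBar {PATCHED_VERSION}, "
            f"found {module.__version__}"
        )
    original = module.parse_flag

    def parse_flag(text):
        # The fix: normalize case before delegating to the original.
        return original(text.lower())

    module.parse_flag = parse_flag


apply_monkey_patch(libbar)
print(libbar.parse_flag("TRUE"))  # prints True
```

Every LibBar upgrade trips the guard, and someone then has to re-verify the patch by hand against the new internals.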
The respective incentive structures of software development (specifically, of corporate application development and open source infrastructure development) make options 1-4 very common.
On the corporate / application side, these issues are:
It’s difficult for corporate developers to get clearance to spend even small amounts of their work hours on upstream open source projects, but clearance to spend time on the project they actually work on is implicit. If it takes 3 hours of wrangling with Legal [2] and 3 hours of implementation work to fix the issue in LibBar, but 0 hours of wrangling with Legal and 40 hours of implementation work in FooApp, a FooApp developer will often perceive it as “easier” to fix the issue downstream.
It’s difficult for corporate developers to get clearance from management to spend even small amounts of money sponsoring upstream reviewers, so even if they can find the time to contribute the fix, chances are high that it will remain stuck in review unless they are personally well-integrated members of the LibBar development team already.
Even assuming there’s zero pressure whatsoever to avoid open sourcing the upstream changes, there’s still the fact inherent to any development team that FooApp’s developers will be more familiar with FooApp’s codebase and development processes than they are with LibBar’s. It’s just easier to work there, even if all other things are equal.
Systems for tracking risk from open source dependencies often lack visibility into vendoring, particularly if you’re doing a hybrid approach and only vendoring a few things to address work in progress, rather than a comprehensive and disciplined approach to a monorepo. If you fully absorb a vendored dependency and then modify it, Dependabot isn’t going to tell you that a new version is available any more, because it won’t be present in your dependency list. Organizationally this is bad, of course, but from the perspective of an individual developer this manifests mostly as fewer annoying emails.
But there are problems on the open source side as well. Those problems are all derived from one big issue: because we’re often working with relatively small sums of money, it’s hard for upstream open source developers to consume either money or patches from application developers. It’s nice to say that you should contribute money to your dependencies, and you absolutely should, but the cost-benefit function is discontinuous. Before a project reaches the fiscal threshold where it can be at least one person’s full-time job to worry about this stuff, there’s often no one responsible in the first place. Developers will therefore gravitate to the issues that are either fun, or relevant to their own job.
These mutually-reinforcing incentive structures are a big reason that users of open source infrastructure, even teams who work at corporate users with zillions of dollars, don’t reliably contribute back.
The Answer We Want
All those options are bad. If we had a good option, what would it look like?
It is both practically necessary [3] and morally required [4] for you to have a way to temporarily rely on a modified version of an open source dependency, without permanently diverging.
Below, I will describe a desirable abstract workflow for achieving this goal.
Step 0: Report the Problem
Before you get started with any of these other steps, write up a clear description of the problem and report it to the project as an issue (specifically, as opposed to writing it up as a pull request). Describe the problem before submitting a solution.
You may not be able to wait for a volunteer-run open source project to respond to your request, but you should at least tell the project what you’re planning on doing.
If you don’t hear back from them at all, you will have at least made sure to comprehensively describe your issue and strategy beforehand, which will provide some clarity and focus to your changes.
If you do hear back from them, in the worst case scenario, you may discover that a hard fork will be necessary because they don’t consider your issue valid, but even that information will save you time, if you know it before you get started. In the best case, you may get a reply from the project telling you that you’ve misunderstood its functionality and that there is already a configuration parameter or usage pattern that will resolve your problems with no new code. But in all cases, you will benefit from early coordination on what needs fixing before you get to how to fix it.
Step 1: Source Code and CI Setup
Fork the source code for your upstream dependency to a writable location where it can live at least for the duration of this one bug-fix, and possibly for the duration of your application’s use of the dependency. After all, you might want to fix more than one bug in LibBar.
You want to have a place where you can put your edits, that will be version controlled and code reviewed according to your normal development process. This probably means you’ll need to have your own main branch that diverges from your upstream’s main branch.
Remember: you’re going to need to deploy this to your production, so testing gates that your upstream only applies to final releases of LibBar will need to be applied to every commit here.
Depending on LibBar’s own development process, this may result in slightly unusual configurations where, for example, your fixes are written against the last LibBar release tag, rather than its current [5] main; if the project has a branch-freshness requirement, you might need two branches, one for your upstream PR (based on main) and one for your own use (based on the release branch with your changes).
Ideally, for projects with really good CI and a strong “keep main release-ready at all times” policy, you can deploy straight from a development branch, but it’s good to take a moment to consider this before you get started. It’s usually easier to rebase changes from an older HEAD onto a newer one than it is to go backwards.
Speaking of CI, you will want to have your own CI system. The fact that GitHub Actions has become a de-facto lingua franca of continuous integration means that this step may be quite simple, and your forked repo can just run its own instance.
Optional Bonus Step 1a: Artifact Management
If you have an in-house artifact repository, you should set that up for your dependency too, and upload your own build artifacts to it. You can often treat your modified dependency as an extension of your own source tree and install from a GitHub URL, but if you’ve already gone to the trouble of having an in-house package repository, you can pretend you’ve taken over maintenance of the upstream package temporarily (which you kind of have) and leverage those workflows for caching and build-time savings as you would with any other internal repo.
Step 2: Do The Fix
Now that you’ve got somewhere to edit LibBar’s code, you will want to actually fix the bug.
Step 2a: Local Filesystem Setup
Before you have a production version on your own deployed branch, you’ll want to test locally, which means having both repositories in a single integrated development environment.
At this point, you will want to have a local filesystem reference to your LibBar dependency, so that you can make real-time edits, without going through a slow cycle of pushing to a branch in your LibBar fork, pushing to a FooApp branch, and waiting for all of CI to run on both.
This is useful in both directions: as you prepare the FooApp branch that makes any necessary updates on that end, you’ll want to make sure that FooApp can exercise the LibBar fix in any integration tests. As you work on the LibBar fix itself, you’ll also want to be able to use FooApp to exercise the code and see if you’ve missed anything, and this you wouldn’t get in CI, since LibBar can’t depend on FooApp itself.
In short, you want to be able to treat both projects as an integrated development environment, with support from your usual testing and debugging tools, just as much as you want your deployment output to be an integrated artifact.
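One quick sanity check for this setup, sketched in Python under the assumption that both projects are Python packages: confirm that the name FooApp imports really resolves to your local checkout rather than to a previously installed copy. How you create the local reference in the first place depends on your packaging tool and is not shown here. The helper name `resolves_under` is hypothetical; `importlib.util.find_spec` is from the standard library.

```python
import importlib.util
from pathlib import Path


def resolves_under(module_name: str, checkout: Path) -> bool:
    """Return True if module_name would be imported from inside checkout."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        # Not importable at all, or a namespace package with no single file.
        return False
    origin = Path(spec.origin)
    if not origin.is_absolute():
        # Built-in and frozen modules report a label, not a file path.
        return False
    return origin.resolve().is_relative_to(checkout.resolve())
```

For example, `resolves_under("libbar", Path("../libbar"))` (both names hypothetical) should be `True` while you are iterating locally, and `False` in an environment that installed LibBar from a package index. `Path.is_relative_to` needs Python 3.9 or later.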
Step 2b: Branch Setup for PR
However, for continuous integration to work, you will also need to have a remote resource reference of some kind from FooApp’s branch to LibBar. You will need 2 pull requests: the first to land your LibBar changes to your internal LibBar fork and make sure it’s passing its own tests, and then a second PR to switch your LibBar dependency from the public repository to your internal fork.
At this step it is very important to ensure that there is an issue filed on your own internal backlog to drop your LibBar fork. You do not want to lose track of this work; it is technical debt that must be addressed.
Until it’s addressed, automated tools like Dependabot will not be able to apply security updates to LibBar for you; you’re going to need to manually integrate every upstream change. This type of work is itself very easy to drop or lose track of, so you might just end up stuck on a vulnerable version.
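One way to keep this debt visible is a small check, run in FooApp’s CI, that flags any forked dependency that does not name its tracking issue. The sketch below is in Python and rests on several assumptions: dependencies are listed one per line in the `name @ url` direct-reference form from PEP 508, and the `# tracking:` comment convention, the URLs, and the ticket ID are all invented for illustration. It deliberately ignores extras and environment markers.

```python
import re

# Matches PEP 508 direct references such as "libbar @ git+https://...".
DIRECT_REF = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*@\s*(\S+)")
# Hypothetical convention: a trailing comment naming the backlog ticket.
TRACKING = re.compile(r"#\s*tracking:\s*(\S+)")


def untracked_forks(requirement_lines):
    """Return names of direct-reference pins that lack a tracking comment."""
    missing = []
    for line in requirement_lines:
        match = DIRECT_REF.match(line)
        if match and not TRACKING.search(line):
            missing.append(match.group(1))
    return missing


example = [
    "otherlib==2.0",
    "libbar @ git+https://git.example.invalid/forks/libbar.git@0123abc",
    "libbaz @ git+https://git.example.invalid/forks/libbaz.git@fedcba9  # tracking: FOOAPP-123",
]
print(untracked_forks(example))  # prints ['libbar']
```

Failing the build on a non-empty result makes it hard to switch to a fork without filing the cleanup task at the same time.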
Step 3: Deploy Internally
Now that you’re confident that the fix will work, and that your temporarily-internally-maintained version of LibBar isn’t going to break anything on your site, it’s time to deploy.
Some deployment heritage should help to provide some evidence that your fix is ready to land in LibBar, but at the next step, please remember that your production environment isn’t necessarily emblematic of that of all LibBar users.
Step 4: Propose Externally
You’ve got the fix, you’ve tested the fix, you’ve got the fix in your own production, you’ve told upstream you want to send them some changes. Now, it’s time to make the pull request.
You’re likely going to get some feedback on the PR, even if you think it’s already ready to go; as I said, despite having been proven in your production environment, you may get feedback about additional concerns from other users that you’ll need to address before LibBar’s maintainers can land it.
As you process the feedback, make sure that each new iteration of your branch gets re-deployed to your own production. It would be a huge bummer to go through all this trouble, and then end up unable to deploy the next publicly released version of LibBar within FooApp because you forgot to test that your responses to feedback still worked on your own environment.
Step 4a: Hurry Up And Wait
If you’re lucky, upstream will land your changes to LibBar. But, there’s still no release version available. Here, you’ll have to stay in a holding pattern until upstream can finalize the release on their end.
Depending on some particulars, it might make sense at this point to archive your internal LibBar repository and move your pinned release version to a git hash of the LibBar version where your fix landed, in their repository.
Before you do this, check in with the LibBar core team and make sure that they understand that’s what you’re doing and they don’t have any wacky workflows which may involve rebasing or eliding that commit as part of their release process.
Step 5: Unwind Everything
Finally, you eventually want to stop carrying any patches and move back to an official released version that integrates your fix.
You want to do this because this is what the upstream will expect when you are reporting bugs. Part of the benefit of using open source is benefiting from the collective work to do bug-fixes and such, so you don’t want to be stuck off on a pinned git hash that the developers do not support for anyone else.
As I said in step 2b [6], make sure to maintain a tracking task for doing this work, because leaving this sort of relatively easy-to-clean-up technical debt lying around is something that can potentially create a lot of aggravation for no particular benefit. Make sure to put your internal LibBar repository into an appropriate state at this point as well.
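The condition that tracking task is waiting on can be stated as a minimal Python sketch: the latest official LibBar release is at least the first release containing your fix. It assumes plain dotted numeric versions such as `1.10.0`; real version strings can be messier (pre-releases, local suffixes) and are better handled by a proper version-parsing library. The function names and version numbers are hypothetical.

```python
def parse_release(version: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so that components compare numerically."""
    return tuple(int(part) for part in version.split("."))


def fork_can_be_dropped(latest_upstream: str, first_fixed: str) -> bool:
    """True once an official release includes the fix."""
    return parse_release(latest_upstream) >= parse_release(first_fixed)


print(fork_can_be_dropped("1.4.9", "1.5.0"))   # prints False
print(fork_can_be_dropped("1.10.0", "1.9.3"))  # prints True
```

The second call shows why the components are compared as integers: as plain strings, `"1.10.0"` would sort before `"1.9.3"`.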
Up Next
This is part 1 of a 2-part series. In part 2, I will explore in depth how to execute this workflow specifically for Python packages, using some popular tools. I’ll discuss my own workflow, standards like PEP 517 and pyproject.toml, and of course, by the popular demand that I just know will come, uv.
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!
[1] If you already have all the tooling associated with a monorepo, including the ability to manage divergence and reintegrate patches with upstream, you already have the higher-overhead version of the workflow I am going to propose, so, never mind. But chances are you don’t have that; very few companies do.
[2] In any business where one must wrangle with Legal, 3 hours is a wildly optimistic estimate.
[5] In an ideal world every project would keep its main branch ready to release at all times, no matter what, but we do not live in an ideal world.
[6] In this case, there is no question. It’s 2b only, no not-2b.




