remove separate file fetcher command#13275
Conversation
|
Removing the filesystem IO and base64 encoding has sped up the smoke tests by 30 seconds across the board. We should see that in production as well. Edit: Nope, I was wrong. I was comparing these smoke tests to the smoke tests that downloaded the image during the run. So the 30 seconds was just the difference in how the tests are setup. This is still a nice cleanup, but no major performance improvements unless there are a lot of large binaries in the manifests (which there usually aren't). |
honeyankit
left a comment
There was a problem hiding this comment.
@jakecoffman Can we also update the docs related to this change. I remember there are few places where we have instruction to use file_fetch command to debug.
There was a problem hiding this comment.
This change makes good sense to me as my understand is that the separate fetch- and update-commands predates the invention of the proxy being used everywhere, i.e. you had to be careful to supply a subset of creds to the first to pull repo content that you didn't necessarily want the latter to have.
The way we run this now means it's weird indirection and it's much simpler to think about the job definition as just input.
| sig { returns(T.nilable(Integer)) } | ||
| def save_job_details | ||
| # TODO: Use the Dependabot::Environment helper for this | ||
| return unless ENV["UPDATER_ONE_CONTAINER"] |
There was a problem hiding this comment.
🔍 We're always executing as a single container these days so as I understand it, this config point is just a hold over I think
There was a problem hiding this comment.
Yep, I added this in https://github.com/dependabot/dependabot-updater/pull/1743 so that in Actions we could avoid all the passing around of the outputs.
| dependency_snapshot = Dependabot::DependencySnapshot.create_from_job_definition( | ||
| job: job, | ||
| job_definition: Environment.job_definition | ||
| fetched_files: @fetched_files |
There was a problem hiding this comment.
💯 I like the implication that the job_definition is now immutable as it was hard to reason about in some tests and fixtures
In dependabot/dependabot-core#13275 the `fetch_files` command was made a no-op, and it does not need to be called. Also cleaned up environment variables that are not used as a result. Copied from: github/dependabot-action#1550
In dependabot/dependabot-core#13275 the `fetch_files` command was made a no-op, and it does not need to be called. Also cleaned up environment variables that are not used as a result. Copied from: github/dependabot-action#1550
What are you trying to accomplish?
Originally Dependabot ran
fetch_filescommand in a different container thanupdate_fileswhich might run arbitrary code. To communicate between containers, it would write the gathered files to disk base64 encoded, and the update_files would read and decode it.I dug into it and it seems the original idea was that the update_files could run with a smaller set of credentials, but that was never implemented.
So let's discard all that and simply fetch the files in the update_files command and keep it in memory. No need to encode, decode, and write to disk. A nice simplification!
Anything you want to highlight for special attention from reviewers?
If you ignore the VCR that needed an update, this PR is
+159 -251so it removes almost 100 lines of code. Nice!How will you know you've accomplished your goal?
Things will still work, but slightly faster.
Checklist