Unlocking Lockfiles - A developers Guide to Package Management
Posted on:
sourceUnlocking Lockfiles
A Developers Guide to Package Management
Another CodingWithCallum session
What are we covering?
- History
- Fundamentals
notes: I could spend hours talking about package managers and how they work so I'm going to try to filter the content to the most relevant information. If you want me to go into more depth on any topic please ask at the end
History Lesson
notes:
The easiest way for me to explain package management is to go through how we got here and explain some concepts and the needs along the way
History Lesson
Need (>2010)
- include libraries via
<script>
tags - copypasta code from stackoverflow etc.
notes: Back in the day there was no package management system We just cowboy'd script tags (if you'd like an example of this check out the analytics package 😉) Source control (if you were using it) would contain a bunch of vendor files
History Lesson
Need (>2010)
an extremely embarrassing example
notes: Open url: https://github.com/csi-lk/silk-ajax-comments/blob/master/ajaxcomments.js (written for wordpress v3, we're currently at v7) The reason for no indentation is that ie7 would sometimes stop executing when it would hit tabs in the code super secure approach
History Lesson
Birth (2010)
npm
is created- Originally designed for Nodejs
- Created standardised registry
package.json
created
notes:
npm - node package manager was created
first centralised registry for JavaScript packages... also confusingly named npm
standardised versioning and dependency management
History Lesson
Birth (2010)
another... extremely embarrassing example
notes:
https://www.npmjs.com/package/aws-ses-local
defined package.json
with version fields etc.
overall, a very manual npm publish
workflow
show dependency list
History Lesson
Birth (2010)
- Recursive dependency resolution
- 🐢🐢🐢🐢
- Nested
node_modules
notes:
Ok so the main point here is that module resolution was created and happened recursively
So when you install a package, npm
would read that package.json's file and install their packages and so forth
(turtles all the way down)
Originally npm
decided to create a node_modules
folder in each dependencies dependency
History Lesson
Birth (2010)
callums-bad-code/
├── node_modules/
│ └── A/
│ ├── package.json
│ └── node_modules/
│ └── B/
│ ├── package.json
│ └── node_modules/
│ └── C/
│ └── package.json
History Lesson
Birth (2010)
- This was pretty revolutionary
node_modules-black-hole.png
- Poor windows
notes:
No more manual management, projects became portable and conflicts could happen automatically Crazy amount of duplication and caused path length issues on Windows (affecting dozens of front end developers!) Great base to start from Ok we can move on to the next era, the rise
History Lesson
Rise (2012-2015)
- Bower / RequireJS / Browserfy / Grunt / Gulp
npm
gets used for frontend things alotnode_modules
folder becomes truely unusable
notes:
A bunch of new tooling starts to come on the scene
Bower specifically came out for front end dependencies and started to tackle fonts, static files etc. as well
RequireJS and Browserify started to bridge the gap between Node and JS dependencies
Then we had build systems of Grunt / Gulp start to integrate into the package management fray
But, overall... npm
starts to get adopted for the front end
The amount of dependencies increases, node_modules
becomes crazy complicated. We need a better solution
History Lesson
Flat (2016)
yarn
released (from Facebook -> meta)- lockfiles are born
- offline mode
- security (spoliers!)
npm
createspackage-lock.json
- flat dependency to avoid duplications
notes:
yarn
was my favourite package manager until very recently
Introduced lockfiles for deterministic installation
Parallelize dependency installation
So npm went, hey we can do that too and created the package lock
Ok, lets explain flat dependency resolution first
History Lesson
Flat (2016)
node_modules/
├── package-a/
│ └── node_modules/
│ └── package-c/ (version 1.0.0)
└── package-b/
└── node_modules/
└── package-c/ (version 1.0.0)
notes:
Ok remember our nested dependency structure from before that npm
introduced
package-c
appears twice, creating duplication even when both package-a
and package-b
use the same version
Well, lets start "hoisting" things
History Lesson
Flat (2016)
node_modules/
├── package-a/
├── package-b/
└── package-c/ (version 1.0.0, shared by both package-a and package-b)
notes:
What's hoisting
?
Dependencies are "lifted" to the highest possible level in the node_modules
tree.
By doing this we can de-duplicate our dependencies
But what happens when different versions are required? For example, if package-b
needs package-c@2.0.0
:
History Lesson
Flat (2016)
node_modules/
├── package-a/
├── package-b/
│ └── node_modules/
│ └── package-c/ (version 2.0.0)
└── package-c/ (version 1.0.0, used by package-a)
notes:
The most commonly used version is hoisted to the top level, and other versions remain nested where needed.
This solved a ton of the issues we saw introduced by npm
but created some more like the "Phantom dependency" problem 👻
But before we get to that... security!
History Lesson
Flat (2016)
yarn
's new security model
- checksums
- offline mode
- resolution
- license checks
- script execution
- auditing
notes:
Security!
- cryptographic checksums so that package content was verified against checksum
- avoids tampering and malicious substitutions
- offline meant it would protect against network attacks during install
- great for government 😉
- resolution security
- Deterministic installation algorithm that reduced the risk of inconsistent packages
- Stricter handling of package.json files
- license checking
- identify and report license types for all installed packages (MIT ISC etc.)
- script execution
- controlled environment for running lifecycle scripts
- isolation of script execution
- options to disable running package scripts
- auditing
- that thing that comes up saying your package has a vuln
History Lesson
Flat (2016)
Phantom dependencies (I ain't afraid of no ghost 👻)
(Using things not listed in package.json
)
Your package.json: { "dependencies": { "package-a": "1.0.0" } }
Package A's package.json: { "dependencies": { "package-b": "2.0.0" } }
node_modules/
├── package-a/
└── package-b/ (hoisted from package-a/node_modules/)
notes:
Because Node.js searches for modules in the node_modules directory, your code can access any package in the top-level node_modules, whether or not you declared it as a dependency
Because the packages are hoisted they exist in the node_modules
dir
This causes massive headaches around dependency hell (it's unclear which package is the problem)
Example of issue I had at Clearscore where we had an undeclared css dep that had a breaking version update that caused the site to go green
Phew, that was a lot... lets get into the modern era
History Lesson
Modern (2017+)
pnpm
created- content-addressable store
- symlink all the things
- solved "phantom" dependencies
yarn
(v2 berry) releases Plug'n'Play- Deno switches to URL imports
notes:
ok there's a lot here and that I'm going to go into individually
let's start with the content-addressable store
History Lesson
Modern (2017+)
pnpm
Content-Addressable Storage
- Global store in
~/.pnpm-store
- All packages x versions are stored here
- Hash-based addressing
- Unique + Free Integrity check
- Immutable
notes:
Data is stored and retrieved based on it's content NOT it's location
Each package version stored under a directory named based on a hash of it's content
Once a package is in the store, it's never modified so builds are reproducible and reliable
History Lesson
Modern (2017+)
pnpm
Symlinks
- Symlink local to the Global store
- Strict node module structure
- Multi-level linking
- symlinks on top of symlinks
notes:
Instead of copying packages into your project's node_modules
, pnpm creates symbolic links to the global store
pnpm creates a structure that strictly reflects the actual dependencies declared in your package.json.
- Direct dependencies are linked at the top level of
node_modules
- Nested dependencies are properly linked within their parent packages
- A
.pnpm
directory contains flattened links to all packages
History Lesson
Modern (2017+)
node_modules/
├── express -> ./.pnpm/express@4.17.1/node_modules/express
└── .pnpm/
├── express@4.17.1/
│ └── node_modules/
│ ├── express/ (actual link to global store)
│ ├── body-parser -> ../../body-parser@1.19.0/node_modules/body-parser
│ └── ... (other express dependencies)
├── body-parser@1.19.0/
│ └── node_modules/
│ ├── body-parser/ (actual link to global store)
│ └── ... (body-parser dependencies)
└── ... (other packages)
notes:
Solves the biggest dependency management issues we currently have
Disk space / Install speed / Strick boundaries / Smaller node_modules
/ High Security
And, lastly solves the "Phantom Dependency" problem we talked about before
Ok, quickly; what's Plug'n'Play?
History Lesson
Modern (2017+)
yarn
(v2 berry) Plug'n'play
- Solves the same problems I just covered
- Peer dependency issues
- Zero install
notes:
Yarn will specifically call out peer dependency mismatches Example from more of my code when I did a coding exercise when going for the ANZ role (I wanted to show I'm fancy) https://github.com/csi-lk/seek-coding-exercise/blob/main/.pnp.cjs
History Lesson
Today (2024+)
- Package management is everywhere yo
- Security concerns are now just supply chain attacks
- Monorepo management
- The rise of
bun
notes:
Phew, ok we're up to current state
Bun seems to be rapidly gaining market share (and I'm now using it for my own projects) as it's faster than pnpm
but includes build tools and runtime integration (best of pnpm, gulp, requirejs etc. mashed together)
Ok that's enough history lesson, now the main reason I wanted to give this talk...
Fundamentals
notes:
They're fun!
Fundamentals
SemVer
- Basic format:
MAJOR.MINOR.PATCH
(e.g.,2.3.1
) - MAJOR: Breaking changes
- MINOR: New features, no breaking changes
- PATCH: Bug fixes, no new features or breaking changes
notes:
Pretty sure everyone knows SemVer by now but just for a refresher
Fundamentals
Common Version Ranges
- Exact version:
"react": "17.0.2"
- Only use exactly version 17.0.2
- Caret (^):
"react": "^17.0.2"
- Allow updates to any 17.x.x version but not 18.0.0 (MINOR / PATCH)
- Tilde (~):
"react": "~17.0.2"
- Allow updates to 17.0.x but not 17.1.0 (PATCH only)
- Wildcard (*):
"react": "17.*.*"
or"react": "17.*"
- Any version starting with 17
notes:
Version range specifiers is the npm
defined spec for how to reference dependency versioning
Fundamentals
Un-common Version Ranges
- Greater than (>):
"react": ">17.0.0"
- Any version higher than 17.0.0
- Greater than or equal (>=):
"react": ">=17.0.0"
- Version 17.0.0 or higher
- Less than (<):
"react": "<18.0.0"
- Any version lower than 18.0.0
- Range:
"react": ">=16.0.0 <18.0.0"
- Between 16.0.0 and 18.0.0 (excluding 18.0.0)
- OR:
"react": "15.0.0 || 16.0.0"
- Either exactly 15.0.0 or 16.0.0
notes:
Please don't use any of these unless you have a very good reason to
Fundamentals
Lockfile
- Locking / Freezing your dependencies
- Same immutable state of dependencies
- Stop drift of dependencies between deployments
notes:
Before lockfiles package installations could vary from one day to the next or from one machine to another, even with the same package.json file WHICH WAS GREAT FUN
The name has become standard terminology across package managers:
(npm
called it shrinkwrap for a while but lets forget about that)
package-lock.json (npm), yarn.lock (Yarn), pnpm-lock.yaml (pnpm), Gemfile.lock (Ruby), Cargo.lock (Rust)
Fundamentals
Updating the Lockfile
- Install dependencies within defined ranges
"react": "^18.3.1"
- registry has
react@18.4.2
- registry has
- Update lockfile to point to
react@18.4.2
package.json
stays the same
notes:
What happens when we update the lockfile
We currently have react defined at 18.3.1 with a caret meaning take minor or patch updates
So when you update the lockfile you are taking on a tiny bit of risk that you're moving the dependencies for the whole project but it's within the risk profile we've accepted
Fundamentals
Renovate
- Update the
package.json
- Within defined ranges
- Update lockfile
notes:
Ok what is renovate doing?
https://github.com/anzx/bluestone/pull/8220
Effectively the package.json
updating part but within the same defined ranges
We currently have react defined at 18.3.1 with a caret meaning take minor or patch updates
Phew
we made it
Questions?
- How does
pnpm
handleworkspace:*
resolution? - You didn't go into peer dependencies, do you hate them?
- How is your beard not grey by now?