OS lifecycle


projectlint is a project-wide linter and style checker I've been working on for the last few weeks. One of its rules checks that the version of the operating system the code is running on is still maintained and up to date. But is there an npm package with info about operating system lifecycles? Nope... enter OS lifecycle.

OS lifecycle offers a function to query info about different operating systems on a specific date, inspired by the @pkgjs/nv package, which queries info about the maintenance of Node.js versions. In addition, the info is provided in raw form as a JSON file. So far, this is a simple package... What's interesting is how the JSON file is generated.
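To give an idea of what querying by date means, here is a minimal sketch. The `Release` shape, the `maintainedOn` function, and the "ExampleOS" data are all made up for illustration; the real package and its JSON file may use different names and fields.

```typescript
// Hypothetical record shape; the real JSON file may use different field names.
interface Release {
  name: string;
  version: string;
  released: string; // ISO date, e.g. "2018-01-01"
  endOfLife: string; // ISO date
}

// Return the releases that are already out and not yet end-of-life on `date`.
function maintainedOn(releases: Release[], date: Date): Release[] {
  const t = date.getTime();
  return releases.filter(
    (r) =>
      new Date(r.released).getTime() <= t &&
      t < new Date(r.endOfLife).getTime()
  );
}

// Made-up sample data, for illustration only.
const sample: Release[] = [
  { name: "ExampleOS", version: "1.0", released: "2016-01-01", endOfLife: "2019-01-01" },
  { name: "ExampleOS", version: "2.0", released: "2018-01-01", endOfLife: "2023-01-01" },
  { name: "ExampleOS", version: "3.0", released: "2021-01-01", endOfLife: "2026-01-01" },
];

const maintained = maintainedOn(sample, new Date("2020-03-19"));
console.log(maintained.map((r) => `${r.name} ${r.version}`));
```

The date is a parameter instead of always being "now", so the same query also works for past or future dates.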

The OS lifecycle code has two distinct parts, the builder and the querier (in fact, it would make sense to move the builder to another module...). The builder is a script that fetches and aggregates the info from several operating system sites: currently Carnegie Mellon University SCS Computing Facilities as the base info for several operating systems, and the Ubuntu releases page for more up-to-date info about the Ubuntu operating system, with the intention of adding other sources in the future. These sites provide the info as HTML tables, so the tabletojson package is used to extract it (with some tune-ups to add support for rowspan cells).
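The rowspan problem is that a cell spanning several rows only appears in the first of them, so the following rows come out with fewer cells and their columns get shifted. The sketch below shows one way to fix that, by carrying each spanning cell down into the rows it covers. It is not the actual tune-up applied to tabletojson: the `Cell` type and the `expandRowspans` function are hypothetical, and it works on already-parsed cells rather than on HTML.

```typescript
// Hypothetical parsed cell: its text plus an optional rowspan (default 1).
interface Cell {
  text: string;
  rowspan?: number;
}

interface Carry {
  text: string;
  remaining: number; // how many more rows this cell still covers
}

// Expand rowspan cells so that every output row has a value in every column.
function expandRowspans(rows: Cell[][]): string[][] {
  const pending: Array<Carry | undefined> = [];
  const grid: string[][] = [];

  // True if some column at or after `col` is still covered by a cell from above.
  const hasPendingFrom = (col: number): boolean =>
    pending.some((p, i) => i >= col && p !== undefined && p.remaining > 0);

  for (const row of rows) {
    const out: string[] = [];
    let next = 0; // index of the next unused cell of this row

    for (let col = 0; next < row.length || hasPendingFrom(col); col++) {
      const carry = pending[col];
      if (carry !== undefined && carry.remaining > 0) {
        // This column is covered by a spanning cell from a previous row.
        out.push(carry.text);
        carry.remaining--;
      } else if (next < row.length) {
        const cell = row[next++];
        out.push(cell.text);
        pending[col] = { text: cell.text, remaining: (cell.rowspan ?? 1) - 1 };
      } else {
        // No own cell left, but a later column is still carried: leave a gap.
        out.push("");
      }
    }
    grid.push(out);
  }
  return grid;
}

// Made-up table: the first cell spans two rows.
const table: Cell[][] = [
  [{ text: "ExampleOS", rowspan: 2 }, { text: "1.0" }],
  [{ text: "2.0" }],
];

console.log(expandRowspans(table));
```

Once the grid is rectangular, each row can be mapped to an object using the header row as keys, in the same way as a table without rowspans.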

So far, this is a regular web scraper that needs to be run by hand. I've used GitHub Actions to automate it, checking at midnight whether there were updates in the data sources that day. The updates are stored, versioned, in the git repo by creating new commits with the git-auto-commit action. Thing is, GitHub Actions v1 was more powerful (and more resource-consuming) than v2, and the workflow was trying to publish every nightly version, crashing with an error because it tried to overwrite the previous version in the npm registry even when there were no updates in the data. So git-auto-commit needed to signal that case so the next steps could be skipped. I would have preferred to stop the workflow entirely instead of using that hack, but GitHub Actions v2 removed the neutral output on purpose.

The good thing is that this hack showed me how to define environment variables in the workflow, which is almost identical to how Azure Pipelines does it (as I learned just a few weeks ago at work). So, in the same style, I added a --print argument that shows the new version, and I use it not only to get nicer commit messages and tags... but also to detect whether there were updates in the data sources and short-circuit the call to the git-auto-commit action itself :-)
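The idea behind using the printed version as a "did anything change?" signal can be sketched like this. Everything here is an assumption made for illustration: the `nextVersion` function, the patch-number bump, and the comparison by serialized JSON are not necessarily what the real --print argument does.

```typescript
// Hypothetical: return the version to publish, or null if the data is unchanged.
// Assumes a plain "major.minor.patch" version and bumps the patch number.
function nextVersion(
  current: string,
  previousData: unknown,
  newData: unknown
): string | null {
  // Comparing serialized JSON is sensitive to key order, which is fine
  // as long as the builder always emits keys in the same order.
  if (JSON.stringify(previousData) === JSON.stringify(newData)) {
    return null;
  }
  const [major, minor, patch] = current.split(".").map(Number);
  return `${major}.${minor}.${patch + 1}`;
}

// Made-up data: the second scrape adds one release.
const before = { ExampleOS: ["1.0"] };
const after = { ExampleOS: ["1.0", "2.0"] };

// Print the new version, or an empty line when there is nothing to publish.
console.log(nextVersion("0.1.4", before, after) ?? "");
```

A workflow step can capture that output in an environment variable and skip the commit and publish steps when it is empty.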

And that's it! With all these steps, I've finally been able to fully automate the data scraping and normalization, and to publish the newly generated packages to both the npm and GitHub Packages registries... a bit of duplicated effort now that GitHub has bought npm :-P

Written on March 19, 2020
