JVM Advent

The journey of providing Kotlin bindings for GitHub Actions

It’s been almost 3 years since the inception of the github-workflows-kt library, a tool that lets you write GitHub Actions workflows in Kotlin instead of YAML (featured in Java Advent in 2022 here). One of its flagship features, and an interesting subproblem, is providing type-safe Kotlin bindings for as many GitHub actions as possible. We started by hand-crafting and unit-testing each individual binding, but that clearly doesn’t scale when the goal is to support as many actions as possible. Curious about the challenges we faced (and the ones we still face!), and what approaches we tried on our way? Want to learn how we implemented a Maven-compatible server as a part of the current solution? Does generating and compiling Kotlin code on the fly sound interesting? Read on!

What’s the problem again?

I assume that you know what GitHub Actions are, and that you’ve had a chance to look up what github-workflows-kt is. Both are crucial for following the rest of the article.

To get us started with something, here’s a simple workflow (but complex enough to see some interesting parts):

Today we’re going to focus on the pieces that start with uses:. Let’s analyze them:

actions/cache is used to restore the contents of build and .cache directories, from the key build-cache-dirs. Notice that the first parameter is a multi-line string, enumerating directories to cache.

The most popular action in the world: actions/checkout, used to clone repos. Your attention is probably brought to the peculiar value of fetch-depth: are we cloning 0 commits? Nope, it means cloning the whole history for all branches and tags.

No parameters this time, but this action is special because its coordinates don’t point to the repository root. See gradle/actions, there’s a setup-gradle directory there, meaning we run an action from a subtree of the GitHub repo. v4 still points to the git ref (branch/tag).
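The three shapes of `uses:` values described above can be told apart mechanically. The following is a minimal sketch, assuming the general form `<owner>/<repo>[/<path>]@<ref>`; the `ActionCoords` type and the `parseUses` function are hypothetical names introduced only for this illustration, not part of github-workflows-kt.

```kotlin
// Hypothetical model of the value that follows `uses:` in a workflow step.
data class ActionCoords(
    val owner: String,
    val repo: String,
    val subPath: String?, // e.g. "setup-gradle" when the action lives in a subdirectory
    val ref: String,      // git ref: a branch or a tag
)

fun parseUses(uses: String): ActionCoords {
    // Split "<owner>/<repo>[/<path>]@<ref>" into the path part and the git ref.
    val (path, ref) = uses.split("@", limit = 2).also {
        require(it.size == 2) { "Expected '<owner>/<repo>[/<path>]@<ref>', got '$uses'" }
    }
    val segments = path.split("/")
    require(segments.size >= 2) { "Expected at least '<owner>/<repo>' in '$uses'" }
    return ActionCoords(
        owner = segments[0],
        repo = segments[1],
        // Anything after the repository name points into a subtree of the repo.
        subPath = segments.drop(2).joinToString("/").ifEmpty { null },
        ref = ref,
    )
}
```

With this sketch, `parseUses("gradle/actions/setup-gradle@v4")` yields the owner `gradle`, the repository `actions`, the sub-path `setup-gradle` and the ref `v4`, while `parseUses("actions/checkout@v4")` has no sub-path.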

Now that we’ve got most of the interesting cases in front of us, our goal is to represent these so that the users of github-workflows-kt can conveniently consume them from within the Kotlin script. We’re obviously skipping the stringly-typed approach, like:

although something like this is also possible, as a last resort, for whatever reason. We need more type-safety!

The journey

It turns out it’s one of these challenges where it’s just impossible to sit down for a few evenings and code it. It’s like walking in the mountains: once you’re done with the first hill, there’s a peak waiting for you, and the next one, and the next one. Somewhere behind the bushes, you hear a melody of the perfect solution, the sweet spot – but it’s not on your map, you have to find it by trial and error. But wait, is the source of the melody moving? Is it even a single source?

Let’s see what it’s been like to tackle this problem so far. Step by step.

Step #0 – hand-craft the bindings

We started really small. The first iteration on a binding for actions/checkout looked similar to this:

Just this single parameter. You can see that we got excited with the way the fetchDepth parameter can be modeled, especially the misleading 0 that actually means “check out everything”. Now we had a way of saying it explicitly, with the designated object Infinite.
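A minimal sketch of what such a first, hand-written binding might look like. Only the `fetchDepth` parameter and the `Infinite` object come from the description above; the remaining names (`Value`, `toYamlArguments`) are assumptions made for this illustration and may differ from the library’s actual code.

```kotlin
// Hypothetical hand-written binding; names are illustrative, not the library's actual API.
sealed class FetchDepth {
    /** Fetch the whole history; serialized as the confusing "0". */
    object Infinite : FetchDepth()

    /** Fetch only the given number of most recent commits. */
    data class Value(val commits: Int) : FetchDepth() {
        init {
            require(commits > 0) { "Use FetchDepth.Infinite instead of 0" }
        }
    }
}

class Checkout(private val fetchDepth: FetchDepth? = null) {
    // Only inputs that were explicitly set end up in the YAML `with:` block.
    fun toYamlArguments(): Map<String, String> = buildMap {
        when (fetchDepth) {
            null -> Unit
            FetchDepth.Infinite -> put("fetch-depth", "0")
            is FetchDepth.Value -> put("fetch-depth", fetchDepth.commits.toString())
        }
    }
}
```

The point of the sealed class is that the caller writes `Checkout(fetchDepth = FetchDepth.Infinite)` and never has to know that the underlying action spells "everything" as `0`.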

After some time, we noticed some arguments are missing, so here they are, along with proper types:

In the above snippet, you can also see how the parameters get converted to a map that is then easy to serialize to YAML. Pretty much boilerplate.
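To give a feeling for that boilerplate, here is a sketch of the parameters-to-map conversion for a binding with a few typed inputs. It uses actions/cache as the example; the class shape and the selection of inputs are assumptions made for this illustration, not the library’s real binding.

```kotlin
// Hypothetical hand-written binding for actions/cache; only a few inputs are modeled.
class Cache(
    private val path: List<String>,
    private val key: String,
    private val restoreKeys: List<String>? = null,
    private val failOnCacheMiss: Boolean? = null,
) {
    // Every value ends up as a string, because that is what the YAML `with:` block carries.
    fun toYamlArguments(): Map<String, String> = listOfNotNull(
        // A list of directories becomes a multi-line string, one entry per line.
        "path" to path.joinToString("\n"),
        "key" to key,
        // Optional inputs are emitted only when the caller set them.
        restoreKeys?.let { "restore-keys" to it.joinToString("\n") },
        failOnCacheMiss?.let { "fail-on-cache-miss" to it.toString() },
    ).toMap()
}
```

For example, `Cache(path = listOf("build", ".cache"), key = "build-cache-dirs").toYamlArguments()` produces a two-entry map where `path` is the multi-line string mentioned earlier.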

Wait, where are the unit tests? Yes sir!

After an hour or so, it was ready. Tedious translation, from YAML to Kotlin, for which one would probably use an LLM today (and then double-check it manually). Anyway, this baby was hand-written. And so were the bindings for 12 other actions.

Even though it was time-consuming and involved a massive amount of boilerplate and repetition, this phase was necessary to get a feeling of what it’s like to create such bindings, learn what the most common input types are, and figure out which Kotlin primitives model them best. That’s how I’d start today as well, as a necessary step on the evolution path.

The “13” (actions) was our lucky number that motivated us to make the bindings easier to create and maintain, thanks to…

Step #1 – use code generation

After hand-coding a dozen action bindings, some patterns emerged. The inner lazy developers in us started screaming “automate!”.

The naive idea was to build the Kotlin code using string templates, just gluing strings together. With plain strings, however, nothing would stop us from producing invalid code, and we’d learn about it only when compiling the result. That’s why we thought it would be best to generate Kotlin… from Kotlin. Please welcome our guest, square/KotlinPoet.

To give you a sense of what it’s like to use KotlinPoet, here’s a snippet:

I realize it may be moderately readable at first glance. At the second one, it does read in a pretty straightforward way – you just add certain members of the class, set its various properties. Most of the chained function calls above come from KotlinPoet, and some are our domain-specific extension functions.
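As a rough, independent illustration of the KotlinPoet style (not the generator’s actual code), the sketch below builds the source of a tiny binding class with one nullable `String` property per input. It assumes the `com.squareup:kotlinpoet` dependency is on the classpath; the package name `com.example.bindings` and the function `generateBindingSource` are placeholders invented for this example.

```kotlin
import com.squareup.kotlinpoet.FileSpec
import com.squareup.kotlinpoet.FunSpec
import com.squareup.kotlinpoet.ParameterSpec
import com.squareup.kotlinpoet.PropertySpec
import com.squareup.kotlinpoet.STRING
import com.squareup.kotlinpoet.TypeSpec

// Builds the source of a small class with one nullable String input per name.
// Package and class names are placeholders chosen for this sketch.
fun generateBindingSource(className: String, inputNames: List<String>): String {
    val nullableString = STRING.copy(nullable = true)

    // Each input becomes a constructor parameter that defaults to null.
    val constructor = FunSpec.constructorBuilder()
        .addParameters(
            inputNames.map { name ->
                ParameterSpec.builder(name, nullableString).defaultValue("null").build()
            },
        )
        .build()

    // A property initialized from the same-named parameter is merged by KotlinPoet
    // into a primary-constructor property.
    val properties = inputNames.map { name ->
        PropertySpec.builder(name, nullableString).initializer("%N", name).build()
    }

    val type = TypeSpec.classBuilder(className)
        .addKdoc("Generated binding sketch for %L.", className)
        .primaryConstructor(constructor)
        .addProperties(properties)
        .build()

    return FileSpec.builder("com.example.bindings", className)
        .addType(type)
        .build()
        .toString()
}
```

Calling `generateBindingSource("Checkout", listOf("fetchDepth", "path"))` returns the source text of a class with two nullable constructor properties. Because the class is assembled from typed builders rather than concatenated strings, many mistakes surface when building the spec instead of when compiling the output.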

This was cool, but what was more important and ground-breaking wasn’t actually the way we generate the code, but the whole infrastructure around it, including how we get the data to generate it from. We started calling GitHub to fetch the action’s manifest: the action.y(a)ml file where action owners store info about the inputs and the outputs. There was also a way to store information about the typings. In the early stage, we stored it in Kotlin like this (inputs of type “string” were initially omitted):

We ended up with a module that we know today as the Action Binding Generator. It accepts action coordinates (owner, name, version), the typings, and gives you a piece of Kotlin code with the action binding.

Did we also generate unit tests for each binding? Not really. It was enough to unit-test a synthetic binding with all possible kinds of inputs, so it’s just a single test that changes rarely. We also thoroughly tested a piece of logic that converts action and input names to camel case suitable for Kotlin.
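The name conversion mentioned above is easy to sketch. This version assumes that names are split on anything that is not an ASCII letter or digit; the real generator may handle more edge cases (for instance names starting with a digit, or Kotlin keywords), and the function names here are made up for the example.

```kotlin
// Splits on characters that are not ASCII letters or digits, e.g. "-", "_" or ".".
private fun words(name: String): List<String> =
    name.split(Regex("[^A-Za-z0-9]+")).filter { it.isNotEmpty() }

/** "fetch-depth" -> "fetchDepth": suitable for a Kotlin property name. */
fun toCamelCase(name: String): String =
    words(name).mapIndexed { index, word ->
        if (index == 0) word.replaceFirstChar { it.lowercase() }
        else word.replaceFirstChar { it.uppercase() }
    }.joinToString("")

/** "setup-gradle" -> "SetupGradle": suitable for a Kotlin class name. */
fun toPascalCase(name: String): String =
    words(name).joinToString("") { word -> word.replaceFirstChar { it.uppercase() } }
```

Inputs that already use camel case, such as `enableCrossOsArchive`, pass through `toCamelCase` unchanged because they contain no separators.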

What’s also worth noting is that for a long time, we kept the generated code in the repository. While it may seem like a smell, it was crucial for us to track how each change in the generation logic and the action manifest affects the produced code. This way we could iron out any remaining issues, including bugs (not always on our side) when putting inputs’ descriptions as KDoc comments, and ultimately got high enough confidence that the module works fine 99.9% of the time.

To make it easier for library users to find out which actions have their bindings, and to find their source code, a special page in the docs was generated and automatically updated alongside the bindings (they were called wrappers back then):

We lived with this state of affairs for almost a year, and it let us add support for 81 actions, 98 if you count each version separately.

Step #2 – let action owners host the typings

Subsequent library versions were released, the customers were happy, everything was fine. Well, not really. Some releases looked like this (an extreme case):

When adding support for a new action, we had to add typings for it. If someone wanted to contribute them, they had to learn how our little code generator works, and how to describe the typings.

There were also the “update” kind of changes. Each one usually meant that the action owner changed something in their action, even fixed a typo in some input’s description, so we had to regenerate the Kotlin bindings. Sometimes they added or modified the inputs, so we had to check with the action’s manifest or the docs what the type was.

It was super-boring. Our release cadence was two weeks, as a compromise between providing people with updated bindings fairly frequently, and not having to think about these updates too often. Sure, we had some automation that created PRs for us or detected a change in inputs, but it still didn’t feel like a long-term solution.

We decided to start a journey to make this process more sustainable. A new tool, typesafegithub/github-actions-typing, was born. It’s a way to describe types of action’s inputs and outputs in a machine-readable format, so that our binding generator could easily parse it. It’s language-agnostic, isn’t aware of github-workflows-kt or Kotlin, and any other code generator could use it as well. The theory was to talk to the action owners, and ask them to store an extra file called action-types.yml, also for the benefit of their action’s users, as a standardized format of action’s API docs. If they agreed, we would remove the typings from the library’s repo, and some part of the problem of keeping the bindings up-to-date for a given action would be solved.

In practice, it didn’t always work, and still doesn’t. For some actions, the issues/PRs related to adding the typings were never addressed, as if the action wasn’t maintained. When we did get in touch with the maintainers, some of them just didn’t like the idea, and responded that e.g. if GitHub adds such a feature, they will consider using it. Well, without getting too deep into such an approach, let me say I respect it. The GitHub folks haven’t been very enthusiastic about the idea so far (see this PR).

But it wasn’t all that negative. We’ve got some adoption! Some action owners did like the idea; the most prominent examples are microsoft/setup-msbuild (yes, Microsoft!), benchmark-action/github-action-benchmark and ReactiveCircus/android-emulator-runner, the latter two with ~1000 stars each. Actions that hosted their own typings got a fancy marker on our “Supported actions” page in the docs:

So far (November 2024), excluding forks, this protocol of providing the typings has been adopted by 24 actions. I’d say it’s fairly good for a bottom-up initiative! The more actions adopt this solution, the higher the chances of it becoming a de facto standard. I really hope GitHub will revisit the idea of making GitHub Actions more type-safe.

To make things coherent, we ditched the Kotlin-based typing definitions stored in the library’s repo:

in favor of YAML-based typings, stored in a directory structure with well-defined conventions (e.g. actions/checkout/v2/action-types.yml):

After some time, we thought it would be a good idea to extract all typings to a separate repo, like what DefinitelyTyped is to TypeScript libraries, typeshed is to the Python ecosystem, or SchemaStore is to JSON. It would address all these cases where we couldn’t get the typings hosted with the action in its repo. That’s how typesafegithub/github-actions-typing-catalog was created.

Looks like we managed to start a little pro-type-safety movement in the GitHub Actions community. It felt good!

Step #3 – allow client-side binding generation

All right, we did partially delegate maintenance of typings for some actions to their owners and the community, but we did not get rid of the need to maintain and vend the bindings. The generated code was still sitting in the library’s repo. The clients had to wait for at most two weeks for updated bindings, and for bindings for new major versions. There had to be a better way.

The first idea was to use a conventional technique of code generation (or alike) usable with standard Kotlin: KSP or a compiler plugin. Unfortunately, neither of them is supported with Kotlin Scripting (see e.g. [KT-47384] Add ability to use compiler plugins in .main.kts (Kotlin Script) files). It means that the entry point of code generation had to be provided in a different way.

What’s the most obvious and explicit way of generating the code? Ask the user to do it! We’d ship the action binding generator as a stand-alone library, along with some convenience functions so that the amount of boilerplate is minimal. We’d also need a way of listing which actions in which versions should get their bindings.

As a result, as an experimental approach, we introduced client-side binding generation. The user only (ironically, of course) had to add a new Kotlin script like:

enable the feature via a flag in the workflow (generateActionBindings = true) which resulted in an extra step in the workflow YAML:

and add one more file

which repurposed the GitHub workflow’s YAML to drive the client-side binding generation. The nice thing about this was that dependency updating bots like Renovate or Dependabot could bump versions here, and such PRs would be auto-merged.

Upon generating the bindings, they landed in a .github/workflows/generated directory, and could be imported from the workflow scripts with e.g. @file:Import("generated/actions/checkout.kt").

It worked! We were proud we moved the needle by just a bit, and gave more freedom to the users in which actions they could use, and they weren’t tied to the library’s release cadence anymore.

However, it turned out that it’s just too much ceremony. I was blinded by the good parts of this approach, and didn’t see its bad ergonomics. The early adopters had to remember to regenerate the bindings every now and then. It was also impossible to get proper IDE support for multi-file Kotlin scripts, which has been a long-standing issue (see e.g. [KT-42101] Scripts: @file:Import() in kotlin-main-kts uses a stale cache or [KTIJ-14580] Imported script are not supported for scripts outside of a source root).

That’s why we removed this experimental feature, and went back to the drawing board. We did collect some important findings, though.

Step #4 – create a Maven-compatible binding server

After the last experiment, we were back to the bundled bindings, with no prospect of change.

One day, it hit me. My chain of thoughts was similar to this:

Action bindings are just another kind of dependency, but not on Maven libraries. We could publish them to Maven Central, but publishing over 100 artifacts with each library release sounds like it could fail a lot (recalling random issues with uploading to Central). Besides, we’d be tied to the release cadence, and it doesn’t really solve any problem.

I wish publishing to Maven Central was more flexible, perhaps done on demand when a user wants some action, but it’s not possible.

Can we create our own Maven server?

Yes, we can! And it’s not that difficult. Let’s ignore all the voices in our heads telling us to not create a service if it’s not absolutely needed, and go step by step to see how hard it is to get it working.

The following problems needed to be solved:

Starting with the API, Kotlin Scripting supports declaring dependencies on Maven artifacts, like this:

If we look at the typical coordinates of a GitHub action, they match pretty well. For example, for actions/checkout@v4:

It means that a URL to a JAR that contains the example action’s Kotlin binding could look like this: https://some-custom-maven-repo.com/actions/checkout/v4/checkout-v4.jar. Several other auxiliary files like POM or maven-metadata.xml would need to be hosted as well.

There’s one edge case here: actions that have their manifests hosted not in their repository’s root, so e.g. gradle/actions/setup-gradle. Notice that the first separator is in fact different from the second one, and if we were to treat “actions/setup-gradle” as Maven’s artifact ID, it would create a problem. Why? Because it would map to a URL like https://some-custom-maven-repo.com/gradle/actions/setup-gradle/v4/actions/setup-gradle-v4.jar. The JAR’s name would contain the slash, plus we wouldn’t really be able to tell if “gradle” is the owner and “actions/setup-gradle” is the action name, or maybe it’s “gradle/actions” and “setup-gradle” respectively. That’s why for the purpose of the binding server, we went ahead with replacing any slashes in the path relative to the repository root with a double underscore, so adding a dependency on such an action looks like this: @file:DependsOn("gradle:actions__setup-gradle:v4"). The double underscore is rare enough to not expect it in owner or action names.
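The mapping described above can be sketched in a few lines. The `MavenCoords` type and the `bindingCoords` function are hypothetical names for this illustration; the slash-to-double-underscore rule and the URL layout are taken from the text.

```kotlin
// Maven-style coordinates for a binding, in the "group:artifact:version" form
// accepted by @file:DependsOn.
data class MavenCoords(val groupId: String, val artifactId: String, val version: String) {
    override fun toString() = "$groupId:$artifactId:$version"

    /** Path of the JAR relative to the repository root, following the Maven layout. */
    fun jarPath() = "$groupId/$artifactId/$version/$artifactId-$version.jar"
}

/** Maps e.g. "gradle/actions/setup-gradle@v4" to gradle:actions__setup-gradle:v4. */
fun bindingCoords(uses: String): MavenCoords {
    val path = uses.substringBefore("@")
    val version = uses.substringAfter("@", missingDelimiterValue = "")
    require(version.isNotEmpty()) { "Missing '@<ref>' in '$uses'" }

    // The first slash separates the owner from the rest; further slashes are part of the name.
    val owner = path.substringBefore("/")
    val name = path.substringAfter("/", missingDelimiterValue = "")
    require(owner.isNotEmpty() && name.isNotEmpty()) { "Expected '<owner>/<name>' in '$uses'" }

    return MavenCoords(owner, name.replace("/", "__"), version)
}
```

For a top-level action, `bindingCoords("actions/checkout@v4").jarPath()` gives `actions/checkout/v4/checkout-v4.jar`, matching the example URL above; for a nested one, the artifact ID stays free of slashes.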

Creating a JAR turned out to be much simpler than I thought. Since generating the source code of a binding class is already solved with the previously described Action Binding Generator module, and putting files into a ZIP is fairly simple, the only true challenge was to run the Kotlin compiler. Luckily, the Kotlin compiler is available as a stand-alone Kotlin library! Its usage resembles how we’d use it through the CLI (kotlinc), along with some extra config. This function depicts how easy it is:
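The compiler invocation depends on the Kotlin compiler artifact and is not reproduced in this sketch. The packaging half of the step, putting compiled files into a JAR, needs only the JDK and might look roughly as follows. The function name `buildJar` and the in-memory approach are assumptions for illustration; the actual server may write to disk instead.

```kotlin
import java.io.ByteArrayOutputStream
import java.util.jar.Attributes
import java.util.jar.JarEntry
import java.util.jar.JarOutputStream
import java.util.jar.Manifest

/**
 * Packs already-compiled files into an in-memory JAR.
 * Keys are entry paths inside the JAR (hypothetical example: "com/example/Checkout.class").
 */
fun buildJar(files: Map<String, ByteArray>): ByteArray {
    val manifest = Manifest().apply {
        // Without a Manifest-Version attribute the manifest is written out empty.
        mainAttributes[Attributes.Name.MANIFEST_VERSION] = "1.0"
    }
    val bytes = ByteArrayOutputStream()
    JarOutputStream(bytes, manifest).use { jar ->
        // Sorted so that the entry order does not depend on the map implementation.
        files.toSortedMap().forEach { (path, content) ->
            jar.putNextEntry(JarEntry(path))
            jar.write(content)
            jar.closeEntry()
        }
    }
    return bytes.toByteArray()
}
```

In a server like the one described here, the map would be filled with the `.class` files found in the compiler’s output directory, and the resulting bytes would be served as the artifact.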

It’s time to expose the JAR (et al.) generation logic via a REST API compatible with Maven, to allow providing the bindings on demand, on the fly. The lightweight ktor was used to expose a server, and the required routing can be depicted with a short code snippet:

where the artifacts from the route with package version are described as follows:

Providing the checksums isn’t a result of me being a purist, they’re needed to make Kotlin Scripting happy. Otherwise we get a nasty warning.
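Producing such checksum files takes very little code. The sketch below derives sidecar files from the primary artifacts using only the JDK. Which checksum flavours a given client actually requests is an assumption here (`.md5` and `.sha1` are the classic Maven ones), and both function names are invented for this example.

```kotlin
import java.security.MessageDigest

// Hex-encoded digest of the given bytes, the form stored in Maven checksum files.
fun checksum(bytes: ByteArray, algorithm: String): String =
    MessageDigest.getInstance(algorithm).digest(bytes)
        .joinToString("") { "%02x".format(it) }

/**
 * Given the primary artifacts (file name -> content), adds ".md5" and ".sha1" sidecar files.
 * The choice of these two algorithms is an assumption of this sketch.
 */
fun withChecksums(artifacts: Map<String, ByteArray>): Map<String, ByteArray> =
    artifacts + artifacts.flatMap { (name, content) ->
        listOf(
            "$name.md5" to checksum(content, "MD5").toByteArray(),
            "$name.sha1" to checksum(content, "SHA-1").toByteArray(),
        )
    }
```

A request for e.g. `checkout-v4.jar.sha1` can then be answered from the same map as the JAR itself, so the checksum always matches the bytes that were served.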

Regarding performance, generating a JAR of a single binding takes from 1 to 5 seconds, depending on how many HTTP requests the logic of fetching action metadata and typings has to make (“.yml” or “.yaml” extension is possible, and the typing may live in the action or the typing catalog). A simple in-memory caching mechanism was put in place (using cache4k), to address a case where a flood of requests coming from compiling Kotlin scripts from a single repo arrives at the server:
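To show the idea without depending on a particular library, here is a stdlib-only stand-in for such a cache with time-based expiry. It is not cache4k’s API; the class `ExpiringCache`, its parameters and the injectable clock are all assumptions made for this sketch.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Stand-in for a caching library: stdlib only, entries expire after a fixed time.
class ExpiringCache<K : Any, V : Any>(
    private val ttlMillis: Long,
    // The clock is injectable so that expiry can be exercised without sleeping.
    private val now: () -> Long = System::currentTimeMillis,
) {
    private data class Entry<V>(val value: V, val storedAt: Long)

    private val entries = ConcurrentHashMap<K, Entry<V>>()

    /** Returns the cached value for [key], calling [load] only if it is missing or expired. */
    fun getOrLoad(key: K, load: () -> V): V =
        // compute() runs atomically per key, so concurrent requests for the same
        // binding trigger a single generation instead of one per request.
        entries.compute(key) { _, existing ->
            if (existing != null && now() - existing.storedAt < ttlMillis) existing
            else Entry(load(), now())
        }!!.value
}
```

Keyed by something like `actions:checkout:v4`, such a cache means that a burst of requests for the same artifact pays the 1 to 5 seconds only once. One trade-off of this simple version is that a slow `load` blocks other callers touching the same part of the map, which a dedicated caching library may handle more gracefully.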

After asking around who would be able to host the service, Leo Colman from Brazil agreed to host it on his private VPS, making this project truly inter-continental.

That’s it! The service has been alive for several months now, and in github-workflows-kt starting from v3.0.0 this is the only way of providing bindings. I’m free from releasing the library every month with updated bindings, it’s all driven by the users and the community. The server supports any action, within seconds, and anyone can contribute typings for any action.

Current challenges

Is it the end of the story? No, of course not. Although the service and the hosting have proved to be stable and cope well with the current load, several challenges have appeared.

Let’s start with the most customer-facing problem: dependency updating bots cannot handle all cases. They work fine when it comes to bumping versions of actions stored at the top level of their repo, so e.g. actions/checkout. In this case, e.g. Renovate creates a single PR that correctly updates both the Kotlin script and the YAML, and what’s important, it can be auto-merged without the user’s intervention. The problematic case is for actions stored in a subdirectory, so e.g. gradle/actions/setup-gradle. Renovate creates two PRs in such a case: the first one that updates the Kotlin script wherever we have @file:DependsOn("gradle:actions__setup-gradle:v4"), and the other one to bump all occurrences in YAML that refer to the “gradle/actions” repo. It’s because the bindings for such sub-actions are modeled by the bindings server as a separate artifact for each sub-action; for Maven, e.g. gradle:actions__setup-gradle and gradle:actions__wrapper-validation merely share the same group ID, but are disjoint libraries.

The ideal solution would be to mimic what’s done on the YAML level, so perhaps have a single Maven artifact to gather all actions in a given repo. However, it would be problematic because I can imagine repos with a dozen sub-actions, and code generation for them would take significantly longer, so it’s a scalability concern. Another approach is making the dependency updating bots aware of such cases, so that they create a single PR. So far, this problem hasn’t been too painful, so we’re staying with the current approach, and waiting for more data on how painful it is for the users.

The second problem is about backward-incompatible changes in the bindings provided by the server. Although none have been released so far, intentionally, just to let people adapt to the new approach, we have a couple of improvements in the queue that would break at least some users. These are:

It’s generally possible to expose a “v2” of the bindings server (already implemented by a faithful contributor here), but it is extra hassle to keep the library in sync with the server, especially since the library exposes a RegularAction class that the bindings provided by the service inherit from. It will certainly require adding some validation to ensure that the users use mutually compatible library and server versions.

The third problem is that the library isn’t just a library with the bundled bindings anymore, that is, a single JAR you could security-review to ensure the bindings’ code does what the user expects. If the server gets compromised, one can potentially inject some harmful logic into the bindings, causing e.g. a data leak or impacting performance, depending on the context your GitHub workflows run in. It’s been a blocker for at least one of the library’s users. While I think the shared, first-party server (https://bindings.krzeminski.it/) is fine for most open-source projects, I definitely hear the concern.

The ideal solution would be to follow a similar practice that is used to harden YAML-based workflows, so pinning to specific revisions for both the action logic (by commit hash) and the JAR (by checksum):

This, however, isn’t supported by Kotlin Scripting as of today, and if one uses Maven Central, this feature is not really needed because it’s guaranteed the artifacts are immutable. There would also be other problems with this approach, e.g. the dependency updating bots would have to be made aware of the JAR’s checksum.

What can be done about it right now? There are several possibilities:

We’ve got too little feedback yet to officially support any of the above approaches, so please let us know if you need help!

Summary

Looking back, it’s been a fascinating and fun journey of evolving the solution, trying out various approaches, automating whatever makes sense, and listening to the users.

I hope that this article showed that code generation and in-process Kotlin compilation aren’t that hard, that thinking outside the box can bring surprisingly good results, and that scaling a solution requires creativity at each step.

I’d like to thank all the contributors and the users who provided valuable feedback and improvements. In particular (alphabetically):

I feel like we’ll have yet another revolution when it comes to providing the bindings if [KT-47384] Add ability to use compiler plugins in .main.kts (Kotlin Script) files ever gets implemented… 😉

Author: Piotr Krzemiński

Piotr is a software engineer who likes digging deeper into how great software is created, and following best practices in his daily work. He thinks that by becoming a true software craftsman, long-term project maintenance becomes a pleasure, for the benefit of the end customers. Piotr is a fan of type-safety, small and simple components (Web services, classes, functions and alike), and automating to the max. He’s a happy husband and father.