According to Wikipedia, a monorepo is a software development strategy where many projects are stored in the same repository. This strategy allows for quick detection of potential problems and breakages caused by changes in dependencies, and it has been adopted by many organizations that work with large-scale codebases, such as Google, Facebook, and Twitter.
You too can apply this strategy if Gradle is your build tool of choice, thanks to a feature known as Composite Builds, introduced back in version 3.1 (at the time of writing, the latest version is 5.0). Let’s have a look at a typical monorepo workflow when this feature is not in use.
Life without Composite Builds
Let’s imagine you’ve just started work at a company where projects are kept in a single repository. Each project has a separate build, and the only relationship between them is via dependencies on one another, as fits their needs. Some projects will have more dependencies than others; some may not depend on the others at all.
The number of projects is important. When it’s low, you could say that all of them fit under an umbrella project, just as is done with Maven and its reactor feature. Gradle has a similar feature (multi-project builds), except that it’s easier to target a particular project without triggering all the others; in a way, you can say that Gradle’s reactor is smarter at picking the targets to be executed.
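As a rough illustration of the umbrella approach, a Gradle multi-project build might be wired up as in the sketch below. The project names `library` and `app` and the root name `umbrella` are made up for this example, and `library/build.gradle` is assumed to apply the `java` plugin as well.

```groovy
// ---- settings.gradle (repository root) ----
// Hypothetical umbrella build; 'library' and 'app' are made-up project names.
rootProject.name = 'umbrella'
include 'library', 'app'

// ---- app/build.gradle ----
apply plugin: 'java'

dependencies {
    // Project dependency: resolved inside the same build, no published artifact needed.
    // 'compile' matches the Gradle 5.0 era of this article; newer versions use 'implementation'.
    compile project(':library')
}
```

With a layout like this, running `gradle :app:classes` from the root executes only the tasks needed by `app` and `library`, although Gradle still has to configure every project listed in `settings.gradle`.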
But what happens when the number of projects goes above a dozen, say to a couple of hundred? Even with a smarter reactor, Gradle would have to read the configuration of all projects and then resolve the appropriate targets. This takes precious time out of your daily work, and that’s a big no-no.
The solution would be to break each project out into an individual build. Gone is the reactor feature, so we don’t have to pay the price of reading and configuring all projects only to discard most of them later. However, we have now lost the opportunity to react when a dependency introduces a bug or a binary incompatibility, which is one of the reasons to organize code in a monorepo in the first place.
Now we have to follow the tried and tested workflow:
- Make a change on the dependency project.
- Build and publish artifacts to a repository. Most people rely on snapshot artifacts.
- Make sure the dependent project consumes the freshly published artifacts/snapshots.
- Compile and run tests to figure out if the code works again.
- Rinse and repeat until it works.
The problem with this approach is that we waste time publishing intermediate artifacts, and from time to time we’ll forget to publish a snapshot release and spend hours in a debugging session until we realize the binaries are out of date. Ugh.
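To make the publish-and-consume loop concrete, here is a minimal sketch of what the two builds could look like when snapshots go through the local Maven repository. The directory names `dependency` and `dependent`, the snapshot version, and the publication name `mavenJava` are all assumptions for illustration; the artifact name is assumed to default to the producer's directory name.

```groovy
// ---- dependency/build.gradle (hypothetical producer) ----
apply plugin: 'java'
apply plugin: 'maven-publish'

group = 'com.acme'
version = '1.0.1-SNAPSHOT'   // assumed snapshot version

publishing {
    publications {
        // 'mavenJava' is an arbitrary publication name
        mavenJava(MavenPublication) {
            from components.java
        }
    }
}

// ---- dependent/build.gradle (hypothetical consumer) ----
apply plugin: 'java'

repositories {
    mavenLocal()   // where publishToMavenLocal installs the snapshot
}

dependencies {
    compile 'com.acme:dependency:1.0.1-SNAPSHOT'
}
```

In this sketch every change requires running `gradle publishToMavenLocal` in the producer before running `gradle test` in the consumer. Skip the first step and the consumer quietly compiles against the previous snapshot.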
Composite Builds to the rescue
Let’s now look at how Composite Builds may solve the problem we’ve found ourselves in. We begin with the following projects and the dependencies between them:
Project1
Project2 <-- depends --- Project1
Project3 <-- depends --- Project2
This small dependency graph tells us that any change made to Project1 will affect Project2 and, by consequence, Project3 as well, because changes to Project2 also affect Project3. The directory structure for this monorepo looks like this:
.
├── project1
│   └── build.gradle
├── project2
│   └── build.gradle
└── project3
    └── build.gradle
Here we can see the three projects with their respective build files. Each project has its own release lifecycle and version, as we can observe in their build files:
project1/build.gradle
apply plugin: 'java'

group = 'com.acme'
version = '1.0.0'
project2/build.gradle
apply plugin: 'java'

group = 'com.acme'
version = '2.3.0'

dependencies {
    compile 'com.acme:project1:1.0.0'
}
project3/build.gradle
apply plugin: 'java'

group = 'com.acme'
version = '1.2.0'

dependencies {
    compile 'com.acme:project2:2.3.0'
}
Activating the Composite Builds feature requires configuring the link between projects in a file named settings.gradle. Projects 2 and 3 require this file, so our repository now looks like this:
.
├── project1
│   └── build.gradle
├── project2
│   ├── build.gradle
│   └── settings.gradle
└── project3
    ├── build.gradle
    └── settings.gradle
Next we write down the links between projects like so
project2/settings.gradle
includeBuild '../project1'
project3/settings.gradle
includeBuild '../project2'
Great. With this setup in place we can now build project3 by issuing the following commands
$ cd project3
$ pwd
/tmp/project3
$ gradle classes
> Task :processResources
> Task :project2:processResources
> Task :project1:compileJava
> Task :project1:processResources
> Task :project1:classes
> Task :project1:jar
> Task :project2:compileJava
> Task :project2:classes
> Task :project2:jar
> Task :compileJava
> Task :classes
As you can see, both project1 and project2 were built as well. Making a change in project1 and triggering the build on project3 once again will rebuild all three projects, as expected. Now imagine growing this monorepo to dozens or hundreds of projects and you’ll quickly realize that there’s little need for snapshot releases, if any. Gradle has other features up its sleeve, such as tracking task inputs and outputs so that up-to-date tasks are skipped, which makes builds faster as well; similarly, the recently announced build cache feature speeds up builds by “yoinking” outputs that have already been computed by other nodes in a CI farm.
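As a rough sketch of how the build cache could be configured, the fragment below goes in a build's settings.gradle. The remote URL is a placeholder, and using a `CI` environment variable to decide who may push is an assumption about the CI setup, not something Gradle requires.

```groovy
// settings.gradle (sketch); the remote URL is a placeholder
buildCache {
    local {
        enabled = true
    }
    remote(HttpBuildCache) {
        url = 'https://cache.example.com/cache/'
        // Only CI nodes push results; developer machines just read them.
        // The 'CI' environment variable is an assumption about the CI setup.
        push = System.getenv('CI') != null
    }
}
```

The cache itself still has to be switched on, either with the `--build-cache` command-line flag or by setting `org.gradle.caching=true` in gradle.properties.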
If you enjoyed this article you may find other interesting posts about Gradle and build tools in general at my blog.
Author: Andres Almiray
Andres is a Java/Groovy developer and a Java Champion with more than two decades of experience in software design and development. He has been involved in web and desktop application development since the early days of Java. Andres is a true believer in open source and has participated in popular projects like Groovy, Griffon, and DbUnit, as well as starting his own projects. He is a founding member of the Griffon framework and the Hackergarten community event.