How I approach a new open source project

When I am asked to evaluate a new technology we might pull in, I do a few things first, before even launching the application. Here is the list:

  1. Read the documentation. It generally comes in two types: "here is a thing" and "here is how to solve a problem with the thing". Developers typically start with the first. If the documentation merely lists the configuration parameters, you know the project is not mature yet. If you see sections like "Clustering" that describe how to solve a particular problem, that is a point in favor of the technology.

  2. Read the performance tests, both the results and the code. If you did not mutter a single WTF while doing this, that's a point in its favor. If you see things like, for example, in Kafka - where the performance section of the docs points at a blog post that is many years old, while you know the application's design has been revamped since - you know it is a pass. When load tests are run "alone", without comparison against any competing technology, it is a pass. When you look at the benchmark code and it is atrocious, such as (again Kafka) missing a warmup phase, it is a hard pass: the authors have no idea what performance is or how to measure it. (See the benchmark sketch after this list.)

  3. Read the code. Clone the repo and look at the crucial parts of it. How does the application handle its IO? You'd be surprised how often this alone is enough to find bugs; the IO sketch after this list shows a typical one. What is the coding style like: can you read the code and make sense of it right away? You'll be reading this code once you have the tech in production, so you'd better be able to read it instantly. Finally, and this is controversial: are there any references to academic papers in the code? Keep a close eye on any code that has them. The crucial question is whether the cutting-edge academic reference sits in a place where it belongs. If you see an RDBMS with those references in its query optimizer, it is probably justified. If you see a message bus with those references in its custom scheduler code, which deals with just a few tasks for removing obsolete files, the authors are chasing shiny things. They optimized code that is not on any critical path, and you should avoid the project.

  4. Read the bug tracker. Get a sense of the authors' response time, their style, their ability to fix bugs. I usually sort the issues by the number of comments on them; the most commented ones are the most interesting to read. Try it on the systemd bug tracker, for example - you'll see the best bugs. (There is a query sketch after this list.)

  5. Find a bug. And I mean a real bug. You already have the code cloned, so just find a bug in it. This usually takes anywhere from ten minutes to an hour, and you may have already found one during step 3. A good place to look is any Utils or Tools file, or a helper. Quite often a parser - see the example after this list.

  6. Report that bug. Did you get a response within a few days? Did the author acknowledge it is a bug? Did she invite a PR, or write one herself? For example, when looking at Spring Cloud and Zuul, I found an HTTP-RFC-breaking bug in ten minutes. I reported it, and neither of the projects even responded to the bug. And this is an HTTP bug in an HTTP proxy, mind you. Guess what my conclusion about the quality of those projects was...
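
To make the warmup complaint from step 2 concrete, here is a minimal benchmark skeleton on the JVM. This is a sketch of the principle only - it is nobody's actual harness - and for real measurements you would use a dedicated tool such as JMH:

```java
import java.util.concurrent.ThreadLocalRandom;

// Minimal micro-benchmark skeleton with an explicit warmup phase.
public class WarmupBenchmark {
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += ThreadLocalRandom.current().nextInt(100);
        }
        return sum;
    }

    public static void main(String[] args) {
        long sink = 0; // consume results so the JIT cannot discard the work

        // Warmup: let the JIT find and compile the hot path first. Skip this
        // and you are timing interpreted, still-compiling code instead of
        // steady state - exactly the mistake step 2 complains about.
        for (int i = 0; i < 20; i++) {
            sink += workload();
        }

        // Measured phase: report the best of several runs.
        long best = Long.MAX_VALUE;
        for (int i = 0; i < 20; i++) {
            long start = System.nanoTime();
            sink += workload();
            best = Math.min(best, System.nanoTime() - start);
        }
        System.out.printf("best run: %.3f ms (sink=%d)%n", best / 1e6, sink);
    }
}
```

A benchmark that omits the first loop is not measuring the application; it is measuring the JIT compiler.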
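
As for reading the IO code in step 3, here is the kind of bug that exercise turns up. The FrameReader below is invented for illustration, but the partial-read pattern is among the most common real bugs in network code:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FrameReader {
    // BUGGY: InputStream.read(byte[]) may return fewer bytes than requested,
    // especially on sockets. This silently truncates frames under load.
    static byte[] readFrameBuggy(InputStream in, int length) throws IOException {
        byte[] buf = new byte[length];
        in.read(buf); // return value ignored: a partial read goes unnoticed
        return buf;
    }

    // FIXED: readFully loops until the whole frame has arrived.
    static byte[] readFrame(InputStream in, int length) throws IOException {
        byte[] buf = new byte[length];
        new DataInputStream(in).readFully(buf);
        return buf;
    }

    // A stream that returns at most one byte per read(), as a socket might.
    static InputStream trickle(byte[] data) {
        return new java.io.ByteArrayInputStream(data) {
            @Override public int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] frame = "hello".getBytes();
        System.out.println(new String(readFrameBuggy(trickle(frame), 5))); // "h" plus zero bytes
        System.out.println(new String(readFrame(trickle(frame), 5)));      // "hello"
    }
}
```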
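
For the sorting in step 4, if the tracker lives on GitHub (systemd's does), you do not even have to sort by hand. A minimal sketch, assuming GitHub's documented REST endpoint for repository issues and its sort=comments parameter:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Fetch the most-commented open issues of a repo via GitHub's REST API.
// Unauthenticated, so subject to GitHub's rate limits.
public class MostCommented {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.github.com/repos/systemd/systemd/issues"
                        + "?sort=comments&direction=desc&per_page=10"))
                .header("Accept", "application/vnd.github+json")
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // raw JSON array of the ten noisiest issues
    }
}
```

The web UI offers the same ordering under Sort, "Most commented".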
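
Finally, a taste of step 5. The helper below is hypothetical, but the underlying gotcha is real Java behavior: String.split silently drops trailing empty fields unless you pass a negative limit, which is exactly the kind of bug that lives in a Utils file:

```java
import java.util.Arrays;

public class CsvUtils {
    // BUGGY: "a,b,,".split(",") yields ["a", "b"] - two columns vanish.
    static String[] splitRowBuggy(String row) {
        return row.split(",");
    }

    // FIXED: a negative limit keeps trailing empty strings.
    static String[] splitRow(String row) {
        return row.split(",", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitRowBuggy("a,b,,"))); // [a, b]
        System.out.println(Arrays.toString(splitRow("a,b,,")));      // [a, b, , ]
    }
}
```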

If the project has made it this far without major red flags, I will install and launch the thing. I will load test it, and I will test the way it handles HA. I will find more bugs, and this time I will write a pull request to fix each of them. All of this takes time, but I very rarely have to do it: not many projects get past the first stages.