
Gitflow Workflow vs. Trunk-based development: A true story in a startup

This is my story about how I stumbled upon Trunk-based development through the Atlassian pages while in a real situation where I had to advise my team on how to handle branching, development, testing, and releasing. You will read about the thought process I went through while evaluating the Atlassian page about Trunk-based development, which we used to decide whether that would be the flow for us in a startup environment.

Real-life situation

For several years, as I was part of different teams across companies, we were using Gitflow Workflow. In some projects, it was modified a little bit to fit the needs of the organization better, but more or less, we were following the original flow.

At a startup

Recently I joined a new startup, and we had just reached a phase where testing and production needed to be taken more seriously. We only used the dev branch, and we did not have a defined release process. There was no environment where we would do releases other than for development. The only “release” we had was for development: setting up our Kubernetes Helm charts and making sure that the app worked not only in a local environment but in Azure as well.

We had the following challenges.

As the software matured, including the applications we wrote, management asked us to close Epics, and as part of that process, the QA team would have to test the features and sign them off as done. The challenging part was that we did not have any QA members on our team, and although we had an additional “demo” environment, it was exactly the same as the “dev” environment.

In these environments, when we closed a merge request, the CI/CD pipeline only updated the Docker images tagged with “dev.” That meant two things:

  1. As soon as we closed a merge request, the app in the Azure environment changed with it.
  2. We could not provide the QA team with an environment that would not change under their hands.
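To make this concrete, here is a minimal sketch of what that pipeline step effectively did; the registry and image names are made up for the illustration:

    # Runs after every closed merge request on dev (registry/image names are hypothetical):
    docker build -t registry.example.com/our-app:dev .
    docker push registry.example.com/our-app:dev
    # Every environment whose Helm chart points at the "dev" tag picks this image up
    # on its next deployment roll, whether that environment is dev or demo.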

It caused problems for QA. When we said a feature could be tested, it worked because we had tested it, but by the time they got to it, it was sometimes already broken because another new feature had broken it. In the development branch, that is acceptable, since we are going to test those features anyway, but because updates became visible in an instant, we had no time to react to them before QA was affected.

It also had a chance to cause problems during the demo. If an untested merge request flew in, it updated the demo environment. Even if it did not break the entire application, it might alter it enough to cause confusion during the demo.

And we were in need

On top of that, we lacked any dedicated QA team members who would have had time to write automated tests for our features, tests that could tell us in the CI/CD pipeline if anything was broken.

We did not have many unit tests either, so we relied on manually smoke testing the main features before asking the QA team to validate the features and close the Epic tickets.

Since our pipeline constantly updated the Docker images with “dev” tags, and our Kubernetes Helm charts were all set up to use those “dev” Docker images, it was challenging for the QA team to validate the features. For us on the development team, it was embarrassing: it seemed unprofessional to say we had tested the features and that they had worked just a few hours ago, while they were not working for QA. It was true at that time, but lacking any ideal process, including automated tests, we knew we needed a better workflow.

That is when I raised my hand: I could help with Gitflow Workflow

The Gitflow Workflow idea has been around since 2010, and in my experience, it had its footing in almost all of the projects I participated in. It is adaptable to the project’s specific needs and easy to work with once you get used to the flow.

Its processes can be automated, but it also allows you to do steps manually. It gives freedom, although it requires a high degree of control and attention from multiple parties on the team, including Developers, QA, and POs (Product Owners).

At another company, in 2014, I recommended that same flow, and it delivered on the expectations we had. Based on my experience with Gitflow Workflow, I knew that I could help the team set up a workflow that could solve the key pain points.

We would only need a way to build images with release version tags. Once we had these, we could set up the Helm charts for the demo environment so that they would use versioned image tags instead of a wildcard like the “dev”-tagged Docker images we had at the time.

That way, once a sprint ended, we (the developers) would release new images with versioned tags, and we could test the dedicated environment before handing it over to QA.
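As a rough sketch of the idea (the chart path, image name, and version number are only illustrative), the end-of-sprint release would push a versioned image, and the demo environment would be pinned to that exact tag via its Helm values:

    # End-of-sprint release step (illustrative names and version):
    docker build -t registry.example.com/our-app:1.4.0 .
    docker push registry.example.com/our-app:1.4.0

    # The demo environment stays pinned to the released version instead of "dev":
    helm upgrade our-app ./chart --namespace demo --set image.tag=1.4.0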

It also meant that our development process and environment would stay the same as before. We could keep using the latest images with “dev” tags, and our environment would be updated as merge requests were closed.

Having versioned images also meant additional benefits.

  • If QA found any issues, they could tell us in which version the bug was introduced and how to reproduce it.
  • As the Dev team, we could validate the report against the mentioned versions and against our latest “dev” environment. The issue might already have been fixed by the time QA noticed it. It could also happen that the bug was only introduced by a particular combination of services, so we would have the control to investigate more easily which service introduced the problem and in which version (see the sketch after this list).
  • Further down the line, when we introduce production, it would allow us to have a proper release flow with dev, test, staging, and production environments.
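To illustrate the debugging benefit (all names and versions here are hypothetical), a reported combination of service versions could be recreated in a throwaway namespace next to the latest “dev” setup:

    # QA reports a bug seen with backend 1.4.0 and worker 1.3.2 (hypothetical example):
    helm upgrade --install backend ./charts/backend --namespace qa-repro --set image.tag=1.4.0
    helm upgrade --install worker ./charts/worker --namespace qa-repro --set image.tag=1.3.2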

At this point, the only thing left was to give a demo to my team members and convince them of the benefits, in case we decided to take on this additional complexity.

The sudden realization

I suggested using Gitflow Workflow as described in the Atlassian documentation. As I was preparing for my meeting to dive deeper into the complexities and benefits of this flow, I discovered that the first paragraph of the documentation had been updated: it now calls the Gitflow Workflow a legacy flow and recommends using trunk-based workflows instead.

Since I only had a few hours to prepare for my meeting, I quickly started to go through the new trunk-based workflow documentation on the Atlassian website. I was surprised, and I did not want to recommend a “legacy Git workflow” for a startup without knowing what the newly recommended alternative was. I was ready to alter the topic of my meeting and be honest that I had planned to recommend something that I knew had worked for many years. Still, I needed some time to consider the new workflow before deciding to make everyone use it.

Fortunately, the trunk-based development documentation on the Atlassian website is not long. It does not go into implementation details but gives a high-level overview of the trunk-based development workflow. I have to be honest: this documentation is quite biased in highlighting the benefits of the new workflow, and it does not highlight what new requirements you have to take on as a counterweight to achieve trunk-based development.

Disclaimer

I have always placed high trust in the posts written by the Atlassian people. When I first learned about Git, I used their material, and it gave me high confidence that I knew what I was doing. This time, when I found out about Trunk-based development on their website and read the pages, I was confused, especially since I had a really short time to figure out whether I was giving good advice to my team members or not.

Based on that experience, I started writing this article to learn, and to teach what I learned, by comparing the Gitflow Workflow with Trunk-based development. Once I finished writing the part about Trunk-based development and spent more time understanding what the Atlassian team was saying, I realized that my reaction had ended in a good dose of confusion and criticism. After that, I stopped, read more about the topic on other blogs, and realized that what is on the Atlassian pages has some room for improvement.

As I learned more, I started to have a broader understanding of the topic than what I describe below. Still, I want to keep my original text to show what I had in mind relying solely on the knowledge from Atlassian. At the end of the article, I will give you resources where you can learn more about the topic.

The pains and gains of trunk-based development

With trunk-based development, you gain some benefits once you automate the processes. Reading the Atlassian documentation about Trunk-based development makes it look like a dream to work with, especially if you read it with a business mindset and not with a developer mindset. In the text below, when I say “documentation,” I am referring to the Atlassian page describing trunk-based development.

Allows continuous code integration

In the documentation, this part is written to highlight how good it is when you add automated test suites and code coverage monitoring. While it is inarguable that committing to a high level of test automation has incredible value, it is also quite expensive. You have to ask your developers to spend about as much time writing tests as they spend on the feature implementation itself, sometimes more. It supports code quality, and you can be reasonably sure that the requested feature works as developed. I would highlight, though, that automated tests written by developers only ensure the feature works the way they understood the story ticket.
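As a minimal sketch of such a gate, assuming a Python service tested with pytest and the pytest-cov plugin (our actual stack may differ), the CI job fails, and the merge is blocked, when tests fail or coverage drops below a threshold:

    # Hypothetical CI step; a non-zero exit code fails the job and blocks the merge request.
    pytest --cov=app --cov-fail-under=80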

This point uncovers another requirement: your PO (Product Owner) has to create well-written stories to develop, and together with your QA team (Quality Assurance team / testers), must write the edge cases into the ticket so they are covered in the first development iteration. Developers will prepare the logic for edge cases, but their time is not fully spent thinking about all the ways their code will or could be used. It is the QA team whose full-time job is to uncover edge cases, even ones the POs would not think about. After that is done, and the developers submit their code with their initial automated tests, the QA team has to take over and add another layer of automated tests (end-to-end tests, integration tests, smoke tests, etc.) from a user’s perspective.

Reality checks from a startup viewpoint

Lately, working on startup projects, where the main goal is to gain ground in the market quickly, we as developers usually have to deliver features fast. Refactoring (resolving tech debt) comes as an afterthought, at a later point when the business has gained the trust of some customers and the cash flow starts to move in a positive direction.

All of the above practices are good when there are time and money to invest in a growing market with a well-established product. In the world of a startup, all of the basic requirements needed for trunk-based development are regarded as a luxury.

Lack of Quality Tickets

Usually, we have to take notes in meetings and add them to the ticket. Unfortunately, that is how most tickets are written. POs are overwhelmed with the needs of business leaders and can only create tickets with a title and a very brief description, if that. The rest of the information comes during refinements, where the details end up in the hands of the developers.

That is not a foolproof solution, but as logical thinkers, we can deliver on those small bits of information. If something is missed or misunderstood, we usually create follow-up tickets with the new requirements. Most of the time, though, those follow-ups are left untouched until the feature itself proves there is a market need for it. If not, they are left in the dust, to be removed once there are paying customers.

Need for more QA

Usually, the company does not believe in hiring enough people for its QA team. Most of the time, there is no QA at all in a scrum team, or there are just a few people in a separate team trying to test everything the company does. Imagine having one QA team covering features for ten or more developer teams. It is not a fair requirement, and it definitely will not result in quality testing.

Testing too early

If you are in early development, TDD (Test-Driven Development) will slow you down, and if a change is needed, the tests need to be changed with it as well.

Note: On paper, TDD is a practice where you first write failing tests based on the new requirements, then make them pass with matching code as you build the required features. In reality, I have never seen it done that way, but we also refer to TDD when we suggest having high test code coverage, usually above 80 percent.
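As a toy illustration of that cycle (the function and its requirement are invented for the example), in a shell script the test is written first, fails, and then just enough code is added to make it pass:

    # Step 1: write the test first; it fails because slugify does not exist yet.
    test_slugify() {
      [ "$(slugify 'hello world')" = "hello-world" ] && echo "PASS" || echo "FAIL"
    }
    test_slugify                               # prints FAIL

    # Step 2: write just enough code to make the test pass.
    slugify() { printf '%s' "${1// /-}"; }
    test_slugify                               # prints PASS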

Once you have customers and need reliability, high test coverage is considered best practice. In reality, when just starting out with the product, writing little to no tests is more practical. This goes against most software development best practices, but from a business perspective, it makes more sense (only at a very early stage). You want to deliver fast, and once you have confirmation from your customers that you are building the right product, you can focus on building it with quality. It would make no sense to spend the limited budget a startup has on features that are not needed at all. That could destroy a young company with limited resources and time.

CI/CD management

Setting up CI/CD is not something developers usually do. As a developer myself, I can read the documentation on how to set up CI/CD and, based on that, do it, but it is not part of my daily routine. In most places I have seen, this is handled by the DevOps team or by a developer with a strong DevOps interest who deals with these problems in exchange for a lighter feature-ticket load.

DevOps, in most companies, has the same fate as QA. Not enough people are hired, and to go cheap, there are too few senior members who would actually be fluent with the required DevOps tools.

This is the reality for most companies

When reading about best practices, it seems like there is no other way to operate properly, and that is ok. In most startups, we do not operate “properly” but still deliver something, and we do that quickly. It will be full of bugs. Sometimes it will only resemble what the business leaders wanted, but due to the high demand for speed, issues are only fixed if they are critical. If not, they can wait to be fixed or removed, as I mentioned above.

Ensures continuous code review

In this section, the original documentation highlights the benefits of small feature branches over long-lived feature branches. I agree: I have experienced what it is like when a feature branch lives for three months (or more), and it is not fun resolving all the merge conflicts and dealing with the confusion when it comes to integrating your code with the main branch, where other teams collaborated much faster.

Sometimes there are even shifts in thinking about how the code is meant to work. It is not just that some features collide and conflict; it can be that libraries have changed, or new ones have been added that do not work together with the ones waiting to be merged. The code style might have changed without being aligned across all the teams. Or the main framework was updated, and the merged code does not even compile. This list could go on. Not fun.

The benefits of short-lived feature branches are not in question, but this is not a change from Gitflow Workflow. It is not new, and it is recommended in both workflows.

Enables consecutive production code releases

The Atlassian documentation mentions that “Teams should make frequent, daily merges to the main branch.” As exciting as that sounds, it is equally scary to think about the level of team orchestration that would be needed to implement it in reality.

You should have well-prepared tickets

First, all the tickets that developers get to implement should be well prepared, meaning that every aspect of the needed feature or change is described, with no questions remaining for the developer who is going to implement it. This level of refinement requires a lot of attention and meetings from all stakeholders, including Product Owners, User Experience designers, Quality Assurance team members, Architects, and Developers who are familiar with the code base. These tickets require a lot of effort before they even get in front of the developer who will work on them.

Small features

A second requirement would be that the feature is small enough that a developer can implement it in less than a day, ideally in half a day. Once it is finished and tested by the developer, including the time to write automated tests, it then needs to be handed over to another developer to review the code and approve the changes.

Note: Although it is not highlighted in the documentation, I later found, reading other articles, that code review is eliminated from trunk-based workflows; it is a practice of Gitflow Workflow. The reaction below, though, still holds true to the information presented in the Atlassian documentation.

You have to have a simpler flow.

Let’s assume that, on average, there is only one round trip where the original developer has to fix something. Once the change is reviewed, it needs to be tested by QA, and QA still has to write their own tests on top of the developer’s changes. When that is done, it probably needs to be reviewed by another QA. Also, somewhere in the flow, the new feature needs to be demoed to the PO. After the demo, there might be additional change requests before it can be released. In that final step, the “feature flags” that the documentation also discusses are a huge help, since even if the code is in the main branch, it does not mean it is available to the customer. So there is time to create follow-up tickets to alter the implemented features.
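As a minimal illustration of that idea (the flag name and functions are hypothetical), merged-but-unreleased code can sit behind a flag that defaults to off:

    # All names are hypothetical; the new path only runs when the flag is enabled.
    run_legacy_checkout_flow() { echo "legacy checkout"; }
    run_new_checkout_flow()    { echo "new checkout"; }

    if [ "${FEATURE_NEW_CHECKOUT:-false}" = "true" ]; then
      run_new_checkout_flow
    else
      run_legacy_checkout_flow      # stays the default until the flag is flipped
    fi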

Another obstacle to achieving this ideal state of daily merges is that many stories are complex enough (even if you strip away all the complexity you can think of) that they cannot be implemented in a single day, unless the developer works 24 hours a day and is a 10x developer with all the infrastructure knowledge and the full permissions needed for the story to work. Not to mention when you have dependencies on other teams.

Now, the practicalities of how to achieve this level of teamwork are not discussed in the documentation I read. For me, it is hard to imagine any flow that could jam all of this into a single day.

There is a simpler way.

This is outside the scope of my first reaction to the Atlassian documentation, but what I wrote above needs some explanation so it does not confuse you needlessly for long.

Other blogs highlight that if you place high trust in your developers, all of them are at a senior level, and they form a good team and know each other, there is no need for code reviews at all. That removes most of the complexity and gives high speed. It makes it possible to commit to the main branch daily, but it still requires a level of maturity in the tickets that most startups do not have.

Trunk-based development and CI/CD

In the documentation, this section is again listed under the “Benefits of trunk-based development,” yet it only talks about requirements. It actually states that “trunk-based development is a requirement of continuous integration,” because with continuous integration you can only do trunk-based development. As I understand it, it is saying that if you use a CI pipeline, you are already doing trunk-based development. Go and read it yourself; I read it many times, and I cannot understand it differently. There is a circular, vice-versa dependency in that statement that fails to compile in my mind.

The last sentence states: “This ensures the project works at all times.” I agree; it is a benefit of a pipeline of that nature. As I see it, though, it has nothing to do with trunk-based development. Also, if you rely only on the tests that developers write, you are in big trouble.

Developers have very limited time to test compared to QA, whose entire time and mindset are focused on testing. Also, developers writing tests for their own code are biased, so their tests will be as well. In a CI pipeline, you can define that if the code fails to compile, it is not allowed to be merged. You can also ask the CI pipeline to run the tests and, if those fail, block the merge. But from a business standpoint, we are still far from achieving business functionality just by requiring the existence of a CI pipeline. I have never done trunk-based development, I only recently discovered that it exists, but I have used CI/CD for many years alongside Gitflow Workflow. There is a lot to refine in this section of the documentation.
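As a sketch of such a gate (the build and test commands are placeholders for whatever the project actually uses), each step must exit with code 0, otherwise the pipeline fails and the merge request stays blocked:

    # Placeholder commands; any non-zero exit aborts the script and fails the pipeline.
    set -e
    make build      # does the code compile?
    make test       # do the existing tests pass?
    echo "All gates passed; the merge request can be merged."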

The decision we made

Going through the lightly written Atlassian pages about Trunk-based development, it was challenging to make a decision in favor of the new, recommended workflow.

I presented my findings to my peers, and we decided it was better not to spend more time on something new and shiny that we did not fully understand. We started to discuss what was already in place for the project and, based on our knowledge, which flow stood closer to what we had and what we needed.

What we had in place

It turned out that the branching strategy we had was already really close to what Gitflow Workflow recommends. There was also a release process in place: when merging from dev to master, a new Docker image with an appropriate version tag was created. The only change needed was to make a merge request from dev to master at the end of every sprint.
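Simplified, the end-of-sprint release boiled down to the equivalent of the following (the version number is illustrative, and in practice this happened through a merge request rather than a local merge):

    git checkout master
    git merge --no-ff dev
    git tag v1.4.0
    git push origin master --tags
    # The existing pipeline then built and pushed the Docker image tagged with the new version.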

We already had environments set up for dev and demo, although at this time they were the same. Fortunately, since each had separate configuration files, it was not hard to switch the demo environment to versioned image tags, so it could be used as a stable environment for testing and demoing.

We also had to add one more step: creating release tickets at the end of each sprint, so we would not forget to update the demo environment with the latest application changes that needed to be tested.

The Assessment of Trunk-based Development

In the meantime, we also played with the idea of using a trunk-based workflow, but since we were lacking most of the prerequisites behind its “benefits,” we felt we were not ready to take this step.

We were missing the following:

  • Quality tickets
  • Quality automated tests.
  • QA team members to cover all the types of test cases there would need to be.
  • We still needed a review process to teach each other and to maintain code quality, so daily merges were out of the question.
  • Also, the CI/CD pipeline would need a lot of work before we could enjoy the real benefits of the new processes.

The winner flow is

Gitflow Workflow.

The outro

In this article, I purposefully concentrated on giving you a glimpse into our original story of evaluating Gitflow Workflow against Trunk-based development, based on the information found on the Atlassian pages.

Since then, having more time at hand, I have dived more deeply into the topic and read more comparison articles from other websites. Although my story above is true, there is a lot more to learn about the topic. With a compelling combination of team and project to work on, Trunk-based development can sometimes be the better choice.

Referenced sources

I hope you enjoyed this long article and that it serves you with some useful information. As promised, here are the resources I used, to give you more detailed knowledge of the topic.

The Atlassian documentation:

To learn more and get a more positive view of Trunk-based development, there is a really good article on Toptal that I recommend:

https://www.toptal.com/software/trunk-based-development-git-flow

By Botond Bertalan

I love programming and architecting code that solves real business problems and delivers value to the end user.
