The Impact of Hidden Changes in DevOps

November 17, 2022

Organizations know how valuable it is to deliver services in a fast, highly automated, and secure way without compromising code quality. Organizations with a mature DevOps practice can make code changes hundreds of times a day, using a repository for version control and continuous integration and continuous delivery (CI/CD) pipelines that automate testing and deployments. There is also growing interest in, and adoption of, GitOps within the DevOps practice, which manages the infrastructure and application development lifecycle using a unified tool: the Git repository. GitOps provides many benefits, including faster deployments and better coordination between teams, because it lets them collaborate in a central location.

What is GitOps?

GitOps is an evolution of Infrastructure as Code (IaC) and a DevOps practice that uses Git as the single source of truth and control mechanism for creating, updating, and deleting a system’s infrastructure. More simply, GitOps uses automation and tooling to manage infrastructure and code deployments. The goal of GitOps is to eliminate manual infrastructure provisioning and configuration so teams can manage cloud resources effectively. While DevOps is a culture that focuses on CI/CD and collaboration, GitOps is a technique that uses Git repositories, CI/CD, and other tools to implement infrastructure automation.

The Missing Link in Observability

While practices like CI/CD and IaC allow services to scale in the cloud, the availability and reliability of these services remain a big challenge. DevOps teams are constantly balancing the need to ship new features fast against the need to keep services in the cloud reliable.

Observability practices and tools have not kept up with the speed and scale of modern software development and delivery. Despite efforts underway to consolidate and standardize observability tools, monitoring tool sprawl and disconnected data remain a problem. The challenge is that operational data (metrics, events, logs, traces) is not just disconnected. It is also dynamic, and it keeps growing as the applications and their supporting infrastructure change.

Recent data shows that 76% of all performance problems can eventually be traced back to changes in the environment, and most organizations don’t have an effective way to identify changes that cause production issues. As a result, intentional changes, such as deploying new code, are accompanied by change anxiety.

Research from CtrlStack and Techstrong Group has found that observability data is the primary data source for troubleshooting production problems. If most performance problems trace back to changes, though, it would seem that most organizations can’t be getting full value from their operational data. Are they missing a key piece of the puzzle by not looking at change data, including code commits and configuration changes?

The answer is yes. Here are some of the sources of change events we find valuable when correlated with existing operational data to understand change impact:

  • Amazon EventBridge: Using EventBridge, CtrlStack can analyze the firehose of events occurring within your AWS environment.
  • Kubernetes events: When changes occur in your Kubernetes cluster, the emitted events are tied into CtrlStack’s knowledge graph to show the impact of Kubernetes events on the cluster as well as on externally connected systems (e.g., AWS).
  • Terraform files: CtrlStack captures changes to Terraform state, allowing engineers to identify a change in infrastructure and trace it back through the AWS API call to the Terraform change that triggered it.
  • CI/CD code deploys: CI/CD pipeline events can be traced back to Git commits and traced forward to view impacted infrastructure and other related events.
  • SSM/Terminal commands: Yes – people are still making manual changes this way.

CtrlStack serves as the system of record for changes that happen in production. This system for change tracking is critical for showing teams when, where and why changes are made, and how those changes impacted operations.
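To make the idea of correlating change events with operational data concrete, here is a minimal Python sketch. It is not CtrlStack's implementation, and it does not model the knowledge graph mentioned above. It only assumes that each change can be recorded with a source, an affected resource, and a timestamp, and that a simple lookback window is a reasonable first filter. The `ChangeEvent` fields, the `changes_before` helper, and the sample resource names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeEvent:
    """A single recorded change. All field names are illustrative assumptions."""
    source: str          # e.g. "terraform", "kubernetes", "ci_cd", "ssm"
    resource: str        # identifier of the thing that changed
    timestamp: datetime
    summary: str


def changes_before(events, resource, incident_time, lookback=timedelta(hours=1)):
    """Return changes to `resource` inside the lookback window, newest first."""
    window_start = incident_time - lookback
    candidates = [
        e for e in events
        if e.resource == resource and window_start <= e.timestamp <= incident_time
    ]
    return sorted(candidates, key=lambda e: e.timestamp, reverse=True)


# Hypothetical change log drawn from several of the sources listed above.
events = [
    ChangeEvent("terraform", "orders-db", datetime(2022, 11, 17, 9, 40), "instance class changed"),
    ChangeEvent("ci_cd", "orders-api", datetime(2022, 11, 17, 9, 55), "deploy of commit abc123"),
    ChangeEvent("ssm", "orders-api", datetime(2022, 11, 17, 7, 10), "manual config edit"),
]

# An alert fires on orders-api at 10:05; which recent changes touched it?
incident = datetime(2022, 11, 17, 10, 5)
suspects = changes_before(events, "orders-api", incident)
for e in suspects:
    print(e.source, e.summary)
```

A time window on a single resource is the simplest possible correlation. A real system would also need to follow dependencies between resources, so that a database change can surface as a suspect for an API incident.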

Change Tracking is Increasingly Important

The CtrlStack and Techstrong Group report also gives us a closer look at where organizations are headed in terms of operational objectives. While only 15% of respondents centralize production change tracking, 45% are moving in that direction. This trend could be fueled by GitOps challenges that require teams to do things in a new way.

GitOps requires three core components: IaC, merge requests (MRs), and CI/CD. The MR serves as the change mechanism for all infrastructure updates, and it is where formal approvals take place. Under a GitOps approval process, whenever a developer changes the code, they create a merge request, and an approver merges those changes for deployment. For engineers who are used to making quick changes directly in production or changing something manually, complying with this process is challenging. In fact, this was one of our DevOps horror stories from past experiences.
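The approval flow described above can be sketched as a small Python model. This is only an illustration of the idea that nothing deploys until a merge request is approved and merged. The `MergeRequest` class, the `deploy` function, and the rule that authors cannot approve their own changes are assumptions made for the example, not the behavior of any particular Git hosting platform.

```python
from dataclasses import dataclass, field


@dataclass
class MergeRequest:
    """Hypothetical model of a merge request used as the change mechanism."""
    author: str
    description: str
    approvals: set = field(default_factory=set)
    merged: bool = False

    def approve(self, reviewer):
        # Assumption for this sketch: self-approval is not allowed.
        if reviewer == self.author:
            raise ValueError("authors cannot approve their own merge request")
        self.approvals.add(reviewer)

    def merge(self, required_approvals=1):
        # The merge is the formal approval gate.
        if len(self.approvals) < required_approvals:
            raise PermissionError("merge request needs approval before merging")
        self.merged = True


def deploy(mr):
    """Stand-in for the CI/CD step: only merged changes are deployed."""
    if not mr.merged:
        raise RuntimeError("refusing to deploy an unmerged change")
    return f"deployed: {mr.description}"


mr = MergeRequest(author="dev", description="scale web tier to 3 replicas")
mr.approve("reviewer")
mr.merge()
print(deploy(mr))
```

A manual change made directly in production never passes through `merge`, which is why it is invisible to this process and why such changes need to be tracked separately.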

On December 5, our CEO and Head of Customer Success will discuss the details of the report with Techstrong Group. Each will share their experience and perspective on observability best practices for modern environments and how adding change impact and root cause analysis into your DevOps pipeline improves MTTR. Register for the webinar today.

About Author
Mary Chen
Sr. Director, Product Marketing