What you are missing about DevOps

Friday, July 24, 2020

What are you still not getting about DevOps? Kevin Crawley of Containous shares his expertise on applying game theory, using mesh technologies, serverless computing, and more in this exclusive with App Developer Magazine.

DevOps and Kubernetes have their challenges, but most of them center on common misunderstandings or rushing into things too quickly. Kevin Crawley, a Developer Advocate at Containous, shares his thoughts with ADM on how you can overcome many of these challenges, along with some best practices and even the solution to the prisoner’s dilemma.

ADM: What is management still not getting about DevOps?

Crawley: The biggest misconception about DevOps is that simply hiring a "DevOps" engineer or implementing a "DevOps" task force will solve an organization's inability to ship software efficiently and effectively. In reality, management needs to focus on fostering a genuine DevOps culture to be successful.

The most effective approach to implementing a DevOps culture comes by empowering and enabling application developers who deliver business value to continuously enhance their applications' performance. Embracing DevOps means defining, measuring, and improving KPIs, which might include:

•   Reliability – error rates, SLAs, availability
•   Stability – MTTD, MTTR, Defect Escape Rate
•   "Ship-ability" – deployment frequency, lead time, failed deployments
•   Happiness – customer tickets, application performance, application usage

Developing and managing these KPIs almost always means moving from a "project" mindset to a "product" mindset. Organizations that still apply traditional manufacturing approaches to their software delivery model will find adopting a DevOps culture exceedingly challenging.
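As a rough illustration of the KPIs listed above, the following Python sketch computes deployment frequency, failed deployment rate, MTTD, and MTTR from a handful of records. The records, the field layout, and the choice to measure MTTR from incident start to resolution are all assumptions made for the example, not something described in the interview.

```python
from datetime import datetime
from statistics import mean

# Hypothetical deployment records: (finished_at, succeeded)
deployments = [
    (datetime(2020, 7, 1, 10, 0), True),
    (datetime(2020, 7, 2, 15, 30), False),
    (datetime(2020, 7, 3, 9, 0), True),
    (datetime(2020, 7, 6, 11, 0), True),
]

# Hypothetical incident records: (started, detected, resolved)
incidents = [
    (datetime(2020, 7, 2, 15, 35), datetime(2020, 7, 2, 15, 45), datetime(2020, 7, 2, 16, 35)),
    (datetime(2020, 7, 5, 2, 0), datetime(2020, 7, 5, 2, 20), datetime(2020, 7, 5, 3, 0)),
]


def deployment_frequency(deploys, days):
    """Average number of deployments per day over the observed window."""
    return len(deploys) / days


def failed_deployment_rate(deploys):
    """Fraction of deployments that did not succeed."""
    failed = sum(1 for _, succeeded in deploys if not succeeded)
    return failed / len(deploys)


def mean_minutes(deltas):
    """Mean of an iterable of timedeltas, expressed in minutes."""
    return mean(d.total_seconds() / 60 for d in deltas)


# MTTD: mean time from incident start to detection.
mttd = mean_minutes(detected - started for started, detected, _ in incidents)
# MTTR: mean time from incident start to resolution (one common convention).
mttr = mean_minutes(resolved - started for started, _, resolved in incidents)

print(f"deployments/day: {deployment_frequency(deployments, days=7):.2f}")
print(f"failed deployment rate: {failed_deployment_rate(deployments):.0%}")
print(f"MTTD: {mttd:.0f} min, MTTR: {mttr:.0f} min")
```

Tracking numbers like these per product over time, rather than per project, is what lets a team see whether a change actually improved them.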

ADM: Do you think some organizations / DevOps teams are rushing into microservices too quickly?

Crawley: We do see some organizations roll back their microservices strategy after realizing the significant effort required to effectively track and improve the KPIs that gauge how successful they’ve been at delivering software value. Before committing to a microservices strategy, teams should ensure they have the resources and organizational support to implement and manage a significantly decoupled, large-scale microservices environment successfully.

Teams pursuing microservices need to recognize when they’re acting like the metaphorical hammer in search of a nail, asking if they could but not if they should. Many organizations can still benefit greatly from modern DevOps practices without implementing microservices, or by implementing a small subset of them where it makes the most sense for their use case.

ADM: How can game theory be applied as a way to break down DevOps silos?

Crawley: Much like the “prisoner’s dilemma” paradox in decision analysis, many software organizations are structured such that DevOps teams acting in their own self-interest won’t produce the best outcome for the business. These outcomes often result from a lack of shared ownership and prioritization around technical debt, on-call rotations, and improving key metrics around software delivery performance.

The optimal solution to the prisoner’s dilemma is developing shared ownership, collaboration, and mutual consideration to break down competing silos. With this approach in mind, we can overcome the dilemma by deliberately adopting strategies that reward cooperation. By sharing responsibility, measuring and tracking performance, incrementally improving performance metrics, and rewarding those who are actively driving these changes, we will eventually see DevOps silos dissolve, success rates increase, and business outcomes improve.
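The dilemma described above can be sketched in a few lines of Python. The payoff values are the textbook ones (higher is better), and mapping "C" to "invest in shared work" and "D" to "prioritize only your own backlog" is an assumption made for illustration. The sketch shows that defecting is each team's best response in a single round, while sustained cooperation scores higher over repeated rounds.

```python
# Classic prisoner's dilemma payoffs as (row player, column player).
# "C" = cooperate (e.g. invest in shared work), "D" = defect (own backlog only).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def best_response(opponent_move):
    """Move that maximizes this player's one-round payoff against a fixed opponent move."""
    return max("CD", key=lambda move: PAYOFF[(move, opponent_move)][0])


def always_defect(opponent_history):
    return "D"


def tit_for_tat(opponent_history):
    """Cooperate first, then mirror the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def play(strategy_a, strategy_b, rounds):
    """Play repeated rounds and return the total score for each player."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


# In a single round, defecting is the best response to either move.
print(best_response("C"), best_response("D"))

# Over repeated rounds, mutual cooperation beats mutual defection.
print(play(tit_for_tat, tit_for_tat, rounds=10))
print(play(always_defect, always_defect, rounds=10))
```

The repeated game is the relevant one for teams that work together indefinitely, which is why strategies that reward cooperation can change the outcome.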

ADM: What are three k8s CI/CD pipeline best practices that you think are underutilized?

Crawley:

1. Phased canary testing of production services
2. SLAs and SLOs on CI/CD performance and reliability
3. Evaluation of new tools/techniques (GitOps: Flux, ArgoCD, Jenkins X)
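To make the first practice concrete, here is a minimal Python sketch of the decision logic behind a phased canary rollout. The phase weights, the error-rate threshold, and the `observe_error_rate` callback are hypothetical; in a real pipeline a deployment controller would shift the traffic and a monitoring system would supply the metric.

```python
def run_canary(phases, observe_error_rate, max_error_rate):
    """Step through traffic weights, rolling back as soon as a phase looks unhealthy.

    phases: increasing percentages of traffic sent to the new version.
    observe_error_rate: callable taking the current weight and returning the
        canary's observed error rate (hypothetical metrics hook).
    max_error_rate: highest error rate tolerated before rolling back.
    Returns (outcome, weight), where weight is the last traffic percentage reached.
    """
    for weight in phases:
        error_rate = observe_error_rate(weight)
        if error_rate > max_error_rate:
            return ("rolled_back", weight)
    return ("promoted", 100)


# Hypothetical metric sources for the example.
def healthy(weight):
    return 0.001


def degrades_under_load(weight):
    # Pretend errors only appear once the canary takes a quarter of the traffic.
    return 0.001 if weight < 25 else 0.08


print(run_canary([5, 25, 50], healthy, max_error_rate=0.01))
print(run_canary([5, 25, 50], degrades_under_load, max_error_rate=0.01))
```

Starting with a small weight limits how many users see a bad release, which is the main argument for phasing the rollout instead of switching all traffic at once.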

ADM: Why has distributed tracing and monitoring become so important in microservices?

Crawley: The adoption of microservices exposes what is historically one of the most unreliable aspects of distributed computing: the networking layer. Distributed tracing gives operators visibility into the network performance of their infrastructure and supports reliability by capturing details about each network transaction. It also captures transactional information that software organizations use in KPIs to determine the performance, availability, and reliability of their applications. Distributed tracing brings value both in troubleshooting incidents and in measuring performance, by providing the framework and data required to define meaningful service level agreements and objectives across the organization.
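As one way to picture how trace data can feed service level objectives, the following Python sketch aggregates span records into per-service indicators and checks them against a target. The span fields, the sample values, and the SLO thresholds are assumptions for the example and do not follow any particular tracing system's schema.

```python
import math
from collections import defaultdict

# Hypothetical span records, as a tracing backend might export them.
spans = [
    {"trace_id": "t1", "service": "checkout", "duration_ms": 100, "error": False},
    {"trace_id": "t2", "service": "checkout", "duration_ms": 120, "error": False},
    {"trace_id": "t3", "service": "checkout", "duration_ms": 150, "error": False},
    {"trace_id": "t4", "service": "checkout", "duration_ms": 900, "error": True},
    {"trace_id": "t1", "service": "payments", "duration_ms": 40, "error": False},
    {"trace_id": "t2", "service": "payments", "duration_ms": 60, "error": False},
]


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def service_slis(span_records):
    """Group spans by service and compute request count, error rate, and p95 latency."""
    grouped = defaultdict(list)
    for span in span_records:
        grouped[span["service"]].append(span)

    slis = {}
    for service, items in grouped.items():
        durations = sorted(item["duration_ms"] for item in items)
        errors = sum(1 for item in items if item["error"])
        slis[service] = {
            "requests": len(items),
            "error_rate": errors / len(items),
            "p95_ms": percentile(durations, 95),
        }
    return slis


def meets_slo(sli, max_error_rate, max_p95_ms):
    """True when both the error-rate and latency objectives are satisfied."""
    return sli["error_rate"] <= max_error_rate and sli["p95_ms"] <= max_p95_ms


for service, sli in service_slis(spans).items():
    print(service, sli, meets_slo(sli, max_error_rate=0.01, max_p95_ms=500))
```

The same records serve both purposes mentioned above: an individual span helps troubleshoot one incident, while the aggregate gives the numbers an objective is measured against.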

ADM: Why are some organizations hesitant to adopt service mesh technologies?

Crawley: Several issues have complicated service mesh adoption, and it isn't easy to point to a single contributing factor. Organizations adopting microservices have several other challenges to overcome before even looking at what problems a service mesh can solve. Introducing a service mesh comes at a rather high cost: the actual overhead of running the solution (CPU, memory, network), as well as the cost of developing application services to be aware of the service mesh topology. Applications built to manage authentication, rate limiting, retries, security, and other concerns themselves will likely need refactoring once the service mesh handles those concerns. The prospect of offloading this burden is a huge selling point of service meshes; however, if applications weren't built with this abstraction in mind, implementing the service mesh becomes tremendously complicated and costly.

Many of the most popular service mesh technologies in the ecosystem now offer a tremendous number of features and capabilities. The tradeoff is often added complexity and lock-in. There are emerging alternatives in the ecosystem, such as Maesh, which provides a lightweight approach to many of the commonly desired features of a service mesh without requiring an "all-in" approach (instead offering opt-in capabilities that use features already present in most off-the-shelf Kubernetes installations).

ADM: Will serverless computing be able to reach the widespread adoption/usage that containers have hit?

Crawley: Serverless will continue to evolve and transform into a platform that is as accessible as containers. The barriers to entry on serverless are not trivial and include broad topics such as practical change management, dependency mapping, testing, debugging, and security. As frameworks and tools gain maturity and adoption, serverless adoption will increase; however, there are no indications that serverless will gain the widespread adoption that containers enjoy today. The most exciting aspect of the serverless model is how it has begun to shape services like Fargate, bringing some of the advantages of serverless to the container ecosystem.

About Kevin Crawley

Kevin Crawley is a Developer Advocate at Containous, a cloud-native networking company behind the open source projects Traefik and Maesh. He is passionate about championing the benefits of Open Source, DevOps, automation, observability, distributed tracing, and control theory.