Experimenting Your Way to DevOps Success

Nathen Harvey

Tuesday, April 16, 2019 - 09:30

nathenharvey

:wave: I'm Nathen and I'll be answering your DevOps questions all day here. This is a #TechWellHub Slack Takeover!

My hope for the day is that we can have a great conversation about DevOps here. What is it? How do you adopt DevOps practices? etc. I won't claim to have all the answers but we can have some interesting discussions on the topic!

I am a Developer Advocate at Google focused on the DevOps and SRE community. Prior to Google, I spent 6 years at Chef fostering and amplifying the community there. I am the co-host of the Food Fight Show podcast, a podcast about DevOps and Chef.

As it happens, today is also the five year anniversary of this video I made explaining DevOps - https://www.youtube.com/watch?v=0P0HD5pE-zU. There is some adult language in that video, though. If you prefer one that's a little more safe for work, use this one - https://www.youtube.com/watch?v=g-BF0z7eFoU

One of the questions that comes up all the time:

*What is the definition of DevOps?*

The DevOps community has never adopted a single, canonical definition of DevOps. There are many schools of DevOps.

My favorite definition, though, was created in collaboration with a bunch of people and is part of an open source repository.

>*A cultural and professional movement, focused on how we build and operate high velocity organizations, born from the experiences of its practitioners.*

Gene Gotimer

How would you define Continuous Delivery? Where does it stop and end? What makes it successful?

nathenharvey

I'd go to the source for a definition!

https://continuousdelivery.com/

>*Continuous Delivery is the ability to get changes of all types — including new features, configuration changes, bug fixes and experiments — into production, or into the hands of users, safely and quickly in a sustainable way.*

Two important measures of success in that definition are the words *safely* and *sustainable*

Continuous Delivery (CD) requires a number of social and technical prerequisites to be successful. For example, continuous integration is a prerequisite: you have to run automated tests and build an artifact when code is checked in to the repository.

Jeff Payne

How would you define Continuous Deployment, then? I had read that Jez has gone back and forth on these two definitions a bit and confused lots of people (including me!).

nathenharvey

It's easy to get the terms confused! Continuous Integration (CI), Continuous Delivery (CD), and Continuous Deployment (CD). We certainly didn't do ourselves any favors by calling 2 of those "CD" though it is a well known computer science practice to overload terms :joy:

The common definitions are that "delivery" is what happens when the tech team decides that this change is ready to go into production. We have delivered value to the business.

deployment is what happens when that change lands in a production environment and is in the hands of our customers. We have delivered value to our customers.

The question of "should we deploy this change?" becomes a business question, not a technical one.

*but*...

at the end of the day aren't all of the questions we're asking business questions? does the value really exist if customers can't use it?

My recommendation: don't get hung up on debating is this step `CD` or is it `CD`? Focus on improving the speed, safety, and sustainability of putting change in the hands of your customers.

Jeff Payne

Absolutely agree it doesn't matter in the end but people do get confused.

Jez's blog: “Continuous Delivery is about putting the release schedule in the hands of the business, not in the hands of IT,” wrote Continuous Delivery expert Jez Humble in a blog post. “Implementing Continuous Delivery means making sure your software is always production ready throughout its entire life cycle—that any build could potentially be released to users at the touch of a button using a fully automated process in a matter of seconds or minutes.”

The main difference between Continuous Deployment and Continuous Delivery is that deployment takes automation one step further. While Continuous Delivery is left in the hands of the business to release when it feels it’s time, Continuous Deployment automatically does it as soon as the code is ready.

And COMPLETELY agree that we tend to overload terms in our industry and it confuses the heck out of both non-technical and technical people at times!

nathenharvey

so, there you go. even Jez and I might disagree or get confused :slightly_smiling_face:

given those definitions, @Jeff Payne, I'd say it's a question of where / how many human gates exist between saving the code in a file and that code landing in production.

if it passes all the automated tests should it automatically be deployed to production? If the answer is yes, you're doing continuous deployment.

So be sure you've got some good mechanisms in place to detect when you have an issue in production.
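
The distinction above can be sketched in a few lines of Python. This is only an illustration, not anyone's actual pipeline: `Change`, `should_deploy`, and the `human_gate` flag are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """A hypothetical change moving through a delivery pipeline."""
    tests_passed: bool
    business_approved: bool = False


def should_deploy(change: Change, human_gate: bool) -> bool:
    """Decide whether a change lands in production.

    human_gate=True models continuous delivery: the change is ready,
    but a person still decides when it is released.
    human_gate=False models continuous deployment: passing the
    automated tests is the only gate.
    """
    if not change.tests_passed:
        return False
    if human_gate:
        return change.business_approved
    return True
```

With `human_gate=False`, a change that passes its tests goes straight to production, which is why good production monitoring matters so much in that mode.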

Jeff Payne

That's kind of the conclusion I've drawn from reading all of the different opinions, too, @nathenharvey: 1) It doesn't really matter, as it's about getting value to customers. 2) If you decide it helps you explain the difference to your boss, consider delivery to be the automated approach to delivering to downstream environments where business execs may make manual (or automated) decisions to move toward production. 3) Continuous deployment is a completely automated delivery process that results in changes going automatically into production. It's not perfect but it's the best I can come up with!

Appreciate your view and perspective on this!

nathenharvey

The other important thing is to remember those keywords:

• safe - faster without a safety net is a recipe for disaster

• sustainable - we can't expect our people or systems to be effective when they're constantly in a high g-force centrifuge

• users - it's about getting change to the users so that you can get feedback on the change and decide what to build next. Few changes deliver business value before users can interact with them.

Jeff Payne

Those are great. Wrote a blog on what happens when you don't have a safety net ... the blog is called 'continuous bugs'

Eliel Goco

Is there a particular team/org structure that is more geared for DevOps? E.g. an org with a UI team, backend team, etc versus one grouped by product/feature.

cbird12

I was just thinking the same question, thank you for posting.

nathenharvey

There's some interesting content on this question over on https://web.devopstopologies.com/

The primary goal of any DevOps effort within an organisation is to improve the delivery of value for customers and the business, not in itself to reduce costs, increase automation, or drive everything from configuration management; this means that different organisations might need different team structures in order for effective Dev and Ops collaboration to take place.

I think there are a lot of things to unpack when it comes to this question of team/org structure though. When you are just getting started, you don't know what the org should look like and you need to do some experimentation to discover that.

Starting your DevOps journey by changing the org chart is probably a misstep. Though you may need to work with your HR counterparts or incentive plans to allow a temporary team to come together without having to change everything about how you are structured.

I've seen success when you pick a single, customer-facing service or capability and pull together a cross-functional team that is going to be responsible for all aspects of this service, and give them the authority, responsibility, and autonomy to discover a new way to build and operate that service. The team should be dedicated to that service full time within some time box. Maybe it's an 8-16 week project? But be clear up-front about what the outcome will be and what measures will be impacted.

Depending on the size of your organization you may end up with teams that are cross functional focused on delivering certain outcomes. This has the benefit of focusing each team on delivering value instead of some capability. An example of a team focused on delivering a capability might be the "network team". If all of the networking work falls to a single team you will have friction when prioritizing the work of that team. Each business unit will view their work as the most important. Breaking up the team such that each team has the networking expertise necessary eliminates that friction.

With the potential downside of losing some common networking policies and imbalanced utilization of the expertise across the various humans that have that expertise.

You can remediate these downsides by creating communities of practice around various functional areas. The Networking Community of Practice is open to everyone and seeks to help expand networking expertise across more people in the organization while also discussing and agreeing on various policies, etc.

Eventually, your org may evolve back to a place where you have specialized teams focused on providing abstractions on top of the functionality that make it easier for application developers to understand how the platform works.

Of course, if your organization is 5 people delivering and operating one service that's very different than an organization of 50,000 delivering 50 different services.

So, circling all the way back....

Which org is best?

Figure out the outcome that you're driving for.

Experiment your way to the org that works best for you to deliver that outcome.

Kelly M

Hi! So my question is: who should be involved in DevOps within the organization?

nathenharvey

Hi, Kelly! Great question. Part of the definition I like for DevOps is that we are *focused on how we build and operate high velocity organizations*. I would say that you can't change an organization by focusing on only 2 departments (dev and ops). Taking that to its logical conclusion: every part of the org needs to be involved in DevOps.

but ... if you're just getting started, that's a bit too much to take on!

Kelly M

that makes sense - thank you!

nathenharvey

When you're getting started, I'd say get everyone involved with delivering a particular service involved. From the product owner through to the support team. Not every part of each of those orgs, just the people directly involved with that service.

Kelly M

okay great, yeah, that makes total sense.

Lobna Eslim

Hi, my question is about DevSecOps, is it a kind of replacement of just DevOps?

nathenharvey

Great question! I don't think DevSecOps is actually sufficient. We need to replace DevOps with *Rugged Enterprise DevSecNetQAGovOps*

actually, that's a joke :joy:

Lobna Eslim

I get lost though :joy:

nathenharvey

I think that the DevOps tent is actually large enough for everyone and we should only call it DevSecOps if that's what is required to bring a security mindset and security professionals to the table.

Gene Gotimer

https://youtu.be/H8M69Ljiito?t=11380

nathenharvey

Going back to my definition, DevOps is about *...how we build and operate high velocity organizations...* I truly believe that requires every part of the organization, not just dev and ops

wow, @Gene Gotimer thanks for that! I have not been able to find any videos of me giving that talk in a long time.

Gene Gotimer

Nathen did a great explanation of DevSecOps, Rugged DevOps, etc. that is my go to for these types of questions.

nathenharvey

Here's the github repo with that talk and a script for the talk, too - https://github.com/nathenharvey/rugged_enterprise_devsecnetqagovops

Lobna Eslim

I look at security as being as essential as QA; however, no one talks about DevQAOps! lol

nathenharvey

@Lobna Eslim 10 years ago, when DevOps was first coined, people were absolutely asking about QA.

I think we can thank agile for helping us change the way we think about QA and about how the QA team works. And the parallels for security are absolutely there.

Gene Gotimer

I like to explain that Dev and Ops are just the bookends. We are really talking about them AND everything in between: QA, Performance, Security, Support, etc.

nathenharvey

So swap in QA or Security but think about it like this:

When X is someone else's job, I don't need to worry about it. My job is to write features for the application. X's job is to make sure the application is [bug free | secure].

That's flawed thinking. X needs to be built in from the beginning.

That means the role of X becomes more of a coaching role. X can show developers how to write [better tests | more secure applications] etc.

And it comes back to incentives. What gets QA a bonus, raise, or promotion? Finding more bugs?

What if we aligned incentives around delivering value to customers and agreed on some common metrics that we could use as proxies for that? For example the researchers at DORA (https://devops-research.com/) run an annual State of DevOps Report and look at 4 top-level metrics:

• Time from commit to deploy

• Deployment frequency

• Time to resolve production incidents

• Change failure rate

Let's get QA and Sec and the rest of the org thinking about how to impact those specific metrics as a team not as individual groups
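
As a minimal sketch, here is one way a team might compute those four metrics from its own deployment records. The `Deployment` record and `four_metrics` function are hypothetical names for this example, not DORA's methodology, and the sketch assumes an incident starts at the moment of the deployment that caused it.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    caused_failure: bool = False
    restored_at: Optional[datetime] = None


def _hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def four_metrics(deployments: List[Deployment], period_days: int) -> dict:
    """Summarize the four top-level metrics over a reporting period."""
    lead_times = [_hours(d.committed_at, d.deployed_at) for d in deployments]
    failures = [d for d in deployments if d.caused_failure]
    # Assumption: time to resolve is measured from the failing deploy.
    restore_times = [
        _hours(d.deployed_at, d.restored_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "lead_time_hours": median(lead_times) if lead_times else None,
        "deploys_per_day": len(deployments) / period_days,
        "time_to_restore_hours": median(restore_times) if restore_times else None,
        "change_failure_rate": (
            len(failures) / len(deployments) if deployments else None
        ),
    }
```

The point of putting all four in one summary is that the whole team looks at the same numbers, rather than each group tracking its own.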

How does that change security? Well, we start with the code we're bringing into our applications and need to have a view of our entire *software supply chain*. We need processes and tools that help us vet those things and allow us to update them frequently and safely. We need more testing (security and all other types) earlier in the process, *shift-left on security*, and we need to have the organizational muscle to quickly deploy change, especially when a vulnerability is discovered, but change is change is change and it shouldn't matter if we're patching a vulnerability or shipping a new feature, the same process should be used.

Alan

Local incentives (focused on one team or role) will at best get you local optimizations; at their worst they will create goal conflicts between teams.

nathenharvey

@Lobna Eslim does that help answer your question? What follow-up questions do you have about DevSecOps?

Lobna Eslim

Thanks Nathen. I am just not a fan of new terms that add nothing! Your answers cover it all :slightly_smiling_face:

nathenharvey

@Lobna Eslim I've come to think that "DevOps" is a terrible choice of a word that describes what we do. Yet it has proven an excellent banner for us all to rally behind.

Jeff Payne

A great discussion with @nathenharvey in this thread and sorry I missed it! A couple of thoughts. It should have been called Dev2Ops, LOL, as @Gene Gotimer said, Dev and Ops are the bookends, not the entire process.

Also, I equate 'continuous testing' to testing/quality properly integrated into DevOps. For a small amount of time you would see the portmanteau DevTestOps appear but it seems to have fallen out of favor and been replaced with Continuous Testing. As Nathen mentioned, if you are doing DevOps right, security / quality / availability / etc. are all in the definition, so these mutants like DevSecOps and Continuous Testing are only helpful if they get groups in your company engaged in the DevOps process when they otherwise would not have been motivated to do so.

Alan

How would you describe the relationship between SRE & DevOps?

nathenharvey

My friends Seth and Liz have published a blog post and video series on this topic: https://cloud.google.com/blog/products/gcp/sre-vs-devops-competing-standards-or-close-friends

I have some other thoughts that I'll share in a bit, too. Just need to take a short break first :slightly_smiling_face:

Alan

Thanks!

nathenharvey

OK, back...

I would say that SRE is focused on reliability, heck it's in the name. And that it provides a bit more concrete framework for implementation whereas DevOps is less prescriptive.

They both realize the benefit of getting the organization aligned on common goals, objectives, and incentives.

SRE picks reliability as the primary objective and provides a framework, error budgets, that helps the organization to balance feature velocity and reliability.

For an error budget to be effective it must:

• be indicative of customer happiness. If customers are unhappy but you are meeting your error budget, you need to revisit the parameters of that budget.

• have real consequences for overspending that everyone across the business agrees to in advance. Getting technical practitioners to agree that we should focus on reliability is a different conversation than convincing product owners or "the business" of the same. But buy-in across all departments is absolutely required.
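
To make the error budget idea concrete, here is a rough sketch of the arithmetic, assuming a simple request-based availability target. The function names and the 99.9% figure in the usage note are illustrative assumptions, and the consequence for overspending is whatever the business has agreed to in advance.

```python
def error_budget_remaining(
    slo_target: float, total_requests: int, failed_requests: int
) -> float:
    """Return the fraction of the error budget still unspent.

    slo_target is the agreed success ratio, strictly between 0 and 1.
    A negative result means the budget has been overspent.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


def can_ship_features(
    slo_target: float, total_requests: int, failed_requests: int
) -> bool:
    """One possible agreed consequence: pause feature work once the budget is spent."""
    return error_budget_remaining(slo_target, total_requests, failed_requests) > 0
```

For example, with a 99.9% target over 1,000,000 requests, about 1,000 failures are allowed, so 250 failures leaves roughly three quarters of the budget.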

both SRE and DevOps look at failure as a normal thing that provides an opportunity for learning in the same way and both encourage deep, meaningful reflection on those failures with things like blameless post mortems.

both SRE and DevOps encourage automation as a way to increase speed, safety, and reliability of systems. SRE does so by using a specific concept: toil. Toil can be measured and, importantly, limits can be set on how much toil a team participates in.
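
A small sketch of what measuring toil against a limit could look like. The function names are hypothetical, and the default limit of 0.5 is only an illustrative parameter; each team would set its own.

```python
def toil_fraction(toil_hours: float, total_hours: float) -> float:
    """Share of a team's working time spent on toil."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return toil_hours / total_hours


def over_toil_limit(
    toil_hours: float, total_hours: float, limit: float = 0.5
) -> bool:
    """True when toil exceeds the agreed limit (0.5 is an assumed default)."""
    return toil_fraction(toil_hours, total_hours) > limit
```

When the check comes back true, that is the signal to spend the next stretch of time automating the toil away rather than taking on more of it.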

I think DevOps is more concerned with some of the social components. For example, you will often hear talks about inclusion and diversity at DevOps conferences. SRE certainly benefits from building more diverse, inclusive teams, but it's not necessarily an explicit part of SRE.

Both SRE and DevOps practices can improve performance at organizations of any size.

I am of the opinion that a DevOps "team" should not exist or should only exist for a time that is long enough to demonstrate the value of adopting DevOps practices. It should then be disbanded as DevOps becomes "the way". Likewise I do not believe an individual can be a "DevOps Engineer" or "DevOps Manager".

I do believe SRE can be both a team and a job title. At Google, for example, we definitely have SREs and an SRE team that collaborates with the rest of the organization.

There is one caveat though. If calling yourself a "DevOps Engineer" gets you a significant pay raise, you should totally do that. Likewise if calling yourself an SRE but working in a more traditional operations role without error budgets, treating failures as opportunities, etc. gets you a pay raise then you should definitely call yourself that.

But do what you can to influence the organization to actually adopt DevOps or SRE practices.

JannaLoeffler

Hi @nathenharvey! One thing that I know you are passionate about is community and change. I'm currently in an organization that tends to resist change, especially when it comes to DevOps. They are very focused on "learn and implement all the DevOps tools and then we can say we are doing 'the DevOps'". Any suggestions on helping guide the organization into a cultural shift instead of just thinking "we have DevOps tools, so we do DevOps"?

nathenharvey

The tools used in an organization amplify and reinforce the culture and practices in the same way that the culture and practices reinforce and amplify the tool choices, how the tools are used, and their effectiveness. Trying to separate tools from culture is folly.

There is a bit of a :chicken: and :egg: problem though. How do we change our culture without changing our tools? or our tools without changing our culture?

One approach is to really think about the outcomes you are trying to deliver and the metrics that matter.

I mentioned in another thread this morning that the researchers at DORA (https://devops-research.com/) run an annual State of DevOps Report.

Their research looks at 4 top-level metrics:

• Time from commit to deploy

• Deployment frequency

• Time to resolve production incidents

• Change failure rate

And the book, Accelerate, further defines 24 capabilities to drive improvement. Those capabilities are classified into:

• Continuous delivery

• Architecture

• Product and process

• Lean management and monitoring

• Cultural

https://itrevolution.com/book/accelerate/

I think the best way to create cultural change is to do the work. So come back to those 4 questions / metrics that the DORA team identified, set some goals for how you want a service to meet those metrics, identify the capabilities that require improvement to get there, and then *iterate* your way there.

Take one step forward then reassess. What's the next most impactful thing you can do?

At the same time, the organization should create some room, time, space, incentives, etc. for the people in the org to participate in "community" activities either within the four walls of the organization, or outside, or both!

activities like meetups, lunch-n-learns, conferences, podcasts, demo days, etc.

I have been known to say that the only "DevOps tool" is the person that calls themself the "Director of DevOps"

There is no such thing as a "DevOps tool" but there are certainly some tools you will use along your journey to improve the four metrics most closely associated with DevOps.

@JannaLoeffler does that help? what follow-up questions can I answer for you?

JannaLoeffler

I think that's a really good point. Pick one thing and do it, you don't have to just wake up one day and decide you're doing 'the DevOps'. I'm going to ponder on those questions and see what I come up with. Thank you! :slightly_smiling_face: I'm also looking at how to start up internal lunch-n-learns, like I helped with at another company. We shared what we were doing internally, but also brought in outside people to learn from as well.

Thanks!

nathenharvey

lunch-n-learns, especially with folks from the outside, are a great step!

One thing I like to encourage is the *learn* part of that. It's meant for everyone in the room. So if you're tasked with demonstrating something or sharing a new concept, this is a time for a discussion where everyone in the room can *learn*.

It's not called a lunch-n-teach; calling it that would put some pressure on the person in the front of the room to prove that they are the expert.

JannaLoeffler

That's an awesome perspective! Thank you! I know that's stopped me from presenting before because I feel like I have to be the expert and answer every question.

cbird12

As someone who is about to co-present my first ever lunch-n-learn this week that advice is SO helpful! It’s terrifying to think I have to be an expert but it’s not as terrifying to think “I’m going to do my best to share the knowledge I have and maybe learn more from questions I get” Thank you!

nathenharvey

@cbird12 you got this!!

You can add this tweet to the beginning of every lunch-n-learn you host - https://twitter.com/nathenharvey/status/1118189138820378624

cbird12

LOVE it, I most definitely will.

Himu H

nathenharvey

Himu, I'm not sure how to distinguish a "DevOps pipeline" from any other sort of pipeline that gets change from the developer's workstation through to the production environment.

As part of the testing team, I think you will find success by moving closer to the team that is building the applications both physically and mentally.

When one team is the team that builds things and another is a team that tests things you have misaligned incentives and separate responsibilities. Whereas aligned incentives and shared responsibilities will serve you and your customers better.

If your teams are not distributed, start by sitting with one another. If they are distributed start with shared slack rooms, video chat, etc. Get as close to one another as possible and increase the chances that you will hear about what the other team is doing just by showing up to do your work.

The next thing might be to create a community of practice around testing. Everyone is welcome and maybe you hold informal meetings once or twice a month. During these meetings you discuss testing practices, share what's working, and what's causing issues.

Have your team and the development team start participating in post mortems that happen after every outage or production incident and ask the question: how could we have detected this sooner?

Offer your team as application development apprentices as a way to pair program with the developers and start introducing them to better testing practices: static analysis, unit testing, etc. Come as "apprentices" who are interested in learning more about the craft of developing applications, not as "experts" who are going to ensure the developers are testing properly.

@Himu H, I'm not sure if that's helpful or in line with what you are asking so I'll stop and ask what clarifications or follow-up questions do you have?

Himu H

Thanks @nathenharvey. I think you gave me the right starting point that's suitable for my team. A major change in culture will be required in our team as we adopt DevOps. Testers sort of becoming "apprentices" to the dev team sounds like a great approach.

nathenharvey

Most teams would rather take on an apprentice than someone who's going to tell them what to do :slightly_smiling_face:

Gene Gotimer

"A major change in culture will be required in our team as we adopt DevOps." Getting the org to understand this is one of the biggest challenges.

Himu H

@Gene Gotimer Agree. But at least the organization is taking the first step so I'm hopeful. :slightly_smiling_face:

nathenharvey

In the spirit of learning, I'd like to ask you:

*What has been the most impactful thing (positive or negative) on your organization's DevOps journey?*

Alan

The most positive impact has been the willingness of members of the DevOps community to openly share their experiences, methods, and approaches in open ways through conferences, books, posts, opensource software and especially taking their time to visit our teams and share their knowledge with us.

nathenharvey

@Alan thanks for sharing that. That feels like something that can be reciprocal, too. Your team can now go back out into the world and share your experiences!

Gene Gotimer

I'm a consultant, so we are generally tasked with helping others adopt DevOps, and getting them to understand the culture shift is the hardest part. The biggest negative impact I am seeing with my current clients is adding extra approval processes to slow down the team since they are making so many changes quickly. In their move to DevOps, they have _added_ a process to be able to deploy an app to the development system.

nathenharvey

@Gene Gotimer I wonder if an Error Budget might be a good construct to help them balance speed and reliability?

Gene Gotimer

It would be logical, but they aren't really worried about errors or reliability. They are concerned about accountability.

nathenharvey

Here's a paraphrased quote from a CIO that I met with once:

> We do DevOps here. I ask the QA team to tell me who's to blame when there's an outage...dev or ops. QA is my secret weapon.

Accountability and blame aren't, necessarily, the same thing, of course. And saying "everyone's accountable" often leads to a reality where "nobody's accountable"

You might start asking and measuring how often the new approval steps prevented an error from reaching production.

oh, and I just reread your statement, Gene. "...deploy an app to the *development* system."?! They're slowing down changes to _development_? I'm glad they hired you as a consultant :joy:

and hope they'll take your advice to heart

Gene Gotimer

Yep. "Nobody's accountable" and "everyone's accountable" are exactly what they want to prevent. The process is designed to slow progress down so the Security group, primarily, can absolve themselves of responsibility if anything goes wrong. The process identifies 2 specific individuals that are responsible for each change. If you don't name the 2 people in writing, you aren't allowed to make the change.

This is all in reaction to moving into the cloud and implementing Jenkins and Sonatype Nexus. I can't wait to see how they react once we introduce Chef.

Jessica Romero

In our company we don't have DevOps - How can we start integrating DevOps and QA?

nathenharvey

Thanks for asking!

What is your role in the organization @Jessica Romero? Could you expand a bit on your vision of what integrated DevOps and QA might look like in your organization?

I touched a bit on some of the QA stuff in @Lobna Eslim's DevSecOps question earlier --> https://techwellhub.slack.com/conversation/C6K82BV34/p1555424434030800

And in @Himu H's question about the DevOps pipeline --> https://techwellhub.slack.com/conversation/C6K82BV34/p1555428570051000

I think the big takeaway is that Agile and DevOps have both shown us that getting the developers to write and execute more tests earlier in the process is a win for everyone. Your role as an SQA Engineer is likely becoming more of a coaching role helping developers understand and implement better testing practices.

@Jessica Romero what follow-up questions do you have?

Jessica Romero

I think this is good for right now

Actually how do you start injecting devOps in a team who is not familiar

in other words how can you sell the Idea

to the developers?

nathenharvey

I think it depends on your perspective of the idea of "DevOps". Part of the problem, as I stated early this morning, is that there is not really an adopted, universally agreed on definition of that word.

But selling ideas, DevOps or otherwise, starts with understanding what's important to the people you're selling the ideas to and challenging them to imagine a new world.

So, maybe go back to those four metrics from the state of DevOps reports:

• Time from commit to deploy

• Deployment frequency

• Time to resolve production incidents

• Change failure rate

Which of these do the developers care about? Which cause them pain?

Guide them in a discovery of how impacting those metrics would improve their daily lives. Get management on board to align incentives, bonuses, promotions, etc. around those four metrics.

Jessica Romero

Thank you

Alan

How would you describe the relationship between DevOps and Resilience Engineering?

nathenharvey

My initial thought is that Resilience Engineering (RE) is a specialization that comes "after DevOps" but that's based on a bunch of things in my personal experience. First and foremost, RE really came into my context after DevOps did :wink:

But I do think RE is similar to SRE in that there's a venn diagram that shows there is a lot of overlap with DevOps.

I'm not very well read or versed in RE but I do think it maybe begins the discussion at the opposite side of the workflow from most DevOps journeys.

Where an organization looking to adopt DevOps might initially say/ask:

> It takes us too long to respond to changing market conditions. How can we increase the velocity of our releases?

An organization coming at the system from an RE-perspective might say / ask:

> We are seeing more failures in the system than we are comfortable with. What can we learn from these failures and how will those lessons cause us to change the system?

They may both lead to similar discoveries and changes but start from a very different place.

@Alan, @Jessica DeVita what do you think? Does that resonate and/or align with your own perspective?

Alan

I think it definitely starts from different sides, but connects to some important facets of DevOps for many practitioners. For one example, we owe much of the “blameless postmortem” or “learning review” methods to the Resilience Engineering field. Things like the Stella Report, Etsy’s Debriefing Facilitation Guide, conference talks by John Allspaw and Richard Cook, and the ReDeploy Conference come to mind.

https://snafucatchers.github.io/

https://extfiles.etsy.com/DebriefingFacilitationGuide.pdf

nathenharvey

totally! I think they're adjacent and/or overlapping approaches and there's a ton of crossover in the humans involved in both communities

Alan

https://www.youtube.com/watch?v=xA5U85LSk0M

nathenharvey

these are all awesome resources, too! Thanks @Alan and @Jessica DeVita! Keep them coming :slightly_smiling_face:

Alan

https://www.youtube.com/watch?v=3ZP98stDUf0

https://re-deploy.io/videos/

Another connection comes through the SRE lens in John & Richard's chapter 28 in the Seeking SRE book.

https://www.adaptivecapacitylabs.com/blog/2018/12/06/chapter-in-seeking-sre-sre-cognitive-work/

nathenharvey

One thing that's true about DevOps: it's always been influenced by and embraced learning and practices from other fields. Just some that come immediately to mind:

• Agile

• Safety

• Lean

• Resilience

• Chaos Engineering

• Continuous Delivery

Jessica DeVita

RE is very much about the _capacity_ to respond to disturbances and to learn what went right: how experts know what they know, what signals they trust or distrust, and, in incidents and post-incident reviews, what goal conflicts operators faced, how automation surprise factored in, and how they _thought_ they were defended, and reconciling that surprise. Absolutely foundational to any "devops" or automation, and fully in alignment with the goals of "devops", however you define that. (edited)

• Anticipate

– Ready for threats and changes (_a chronic sense of unease_)

• Monitor

– sensemaking of system states

• Responding

– immediate actions to restore service

• Learning

– Learn from all outcomes: success/normal work, near misses, as well as incidents/outages. (edited)

Kelly M

Hey @nathenharvey do you have any resources you like to reference or recommend for those that want to get more familiar with DevOps or just to get a better understanding? I know you shared a few in the threads but wanted to see if there were any more! All the threads have been really insightful!

nathenharvey

My favorite books on the subjects (in no particular order but if you can only buy one, go with Accelerate):

• The Phoenix Project

• Accelerate: Building and Scaling High Performing Technology Organizations

• Lean Enterprise

• Effective DevOps

• Site Reliability Engineering

• The Site Reliability Workbook

• Seeking SRE

• Continuous Delivery

• Driving Technical Change

For shorter form reading, I recommend the Accelerate State of DevOps report from 2018 - https://devops-research.com/2018/08/announcing-accelerate-state-of-devops-2018/

I also recommend getting to conferences like Agile + DevOps, DevOpsDays, the DevOps Enterprise Summit, and Velocity

For newsletters there's

• https://devopsish.com/

• https://www.devopsweekly.com/

• https://sreweekly.com/ (edited)

Kelly M

Awesome, these are great, thanks! I will definitely get on the newsletter lists & bookmark the report. I'm possibly attending Agile + DevOps West, so if I do, I look forward to meeting you and attending your talks!

Alan

Videos are available for many of the talks from the conferences Nathen mentioned; reviewing current & past talks has been a great resource for me.

Another helpful book is the DevOps Handbook by Kim, Debois, Willis & Humble.

Also check out podcasts from the DevOps Cafe, Arrested DevOps & The Food Fight Show

http://devopscafe.org/

https://www.arresteddevops.com/

http://foodfightshow.org/

nathenharvey

oh, yeah...I'm bad at self-promotion, apparently :slightly_smiling_face: Check out the Food Fight Show :slightly_smiling_face:

Also the definition of DevOps I mentioned this morning comes directly from this repository, which has slides and a video and gives you the opportunity to join that school of DevOps by sending a pull request to the README file adding your name to the list of practitioners. https://github.com/chef/devops-kungfu

cbird12

@foodfightshow on Twitter! :wink:

nathenharvey

Since you are here, you are interested in DevOps. There is an annual survey that you should complete: https://google.qualtrics.com/jfe/form/SV_0v2VZMeA2Eha365?sp=2

This survey is the basis of the annual state of devops report so participating will help advance the state of the art within our space.

cbird12

Hey @nathenharvey, I am wondering how do you measure the success of implementing devops?

nathenharvey

I'd recommend the 4 measures from the state of devops report and other research from the team at DORA:

• Time from commit to deploy

• Deployment frequency

• Time to resolve production incidents

• Change failure rate

Get baselines for each of these today, make some changes that you think will impact them, measure, repeat.

commit-to-deploy is the time from when a line of code goes into the source code repository to when that code is running in a production environment where a user can interact with it.

deployment frequency is how often a service is updated.

resolve production incidents is the time from an incident starting until the time it's fully resolved.

change fail rate is the percentage of changes pushed to production that cause an incident or outage in production.

three of four should go down. you decide which three :joy:

no, I'm kidding...it's the deployment frequency that should be increasing :slightly_smiling_face: the other three should go down :slightly_smiling_face:
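As a rough illustration of getting a baseline, here is a minimal Python sketch of the four measures as they are described above. Everything in it is an assumption made for the example: the `Deployment` and `Incident` records, their field names, the `four_metrics` helper, the use of medians, and the sample data are all hypothetical, not something from the State of DevOps report or any DORA tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Deployment:
    # Hypothetical record; field names are assumptions for this sketch.
    committed_at: datetime   # change landed in the source code repository
    deployed_at: datetime    # change running in production
    caused_incident: bool    # change led to an incident or outage


@dataclass
class Incident:
    # Hypothetical record of a production incident.
    started_at: datetime
    resolved_at: datetime


def hours(start, end):
    """Elapsed time between two datetimes, in hours."""
    return (end - start).total_seconds() / 3600


def four_metrics(deployments, incidents, period_days):
    """Summarize the four measures over a period of `period_days` days."""
    lead = [hours(d.committed_at, d.deployed_at) for d in deployments]
    restore = [hours(i.started_at, i.resolved_at) for i in incidents]
    failed = sum(1 for d in deployments if d.caused_incident)
    return {
        # Using the median is a choice made here, not a prescribed method.
        "commit_to_deploy_hours": median(lead) if lead else None,
        "deployments_per_day": len(deployments) / period_days,
        "time_to_resolve_hours": median(restore) if restore else None,
        "change_failure_rate": failed / len(deployments) if deployments else None,
    }


# Made-up sample data: four deployments over four days, one of which failed.
deployments = [
    Deployment(datetime(2019, 4, 1, 9), datetime(2019, 4, 1, 11), False),
    Deployment(datetime(2019, 4, 2, 9), datetime(2019, 4, 2, 13), True),
    Deployment(datetime(2019, 4, 3, 9), datetime(2019, 4, 3, 15), False),
    Deployment(datetime(2019, 4, 4, 9), datetime(2019, 4, 4, 17), False),
]
incidents = [
    Incident(datetime(2019, 4, 2, 13, 30), datetime(2019, 4, 2, 15, 0)),
]

baseline = four_metrics(deployments, incidents, period_days=4)
print(baseline)
```

Running the same function again after a change gives a second set of numbers to compare against the baseline: deployments per day should rise, while the other three should fall.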

cbird12

Very helpful! Thank you.


About the Author

Nathen Harvey has been active in the DevOps community for almost a decade, putting the practices of DevOps to work and helping others learn and implement those practices. As a cloud developer advocate at Google, he helps the community understand and apply DevOps and SRE principles and practices in the cloud to drive business outcomes. Prior to joining Google, Nathen led the Chef community, helping them adopt continuous automation to build, deploy, and manage applications in fast, secure ways. He also has a background in running operations and infrastructure for a diverse range of web applications. Nathen is a cohost of the Food Fight Show, a podcast about Chef and DevOps.
