This is part 2 of a multi-part article series; see the other articles here:
Innovation is your chaos monkey! (Bob’s your uncle)
Innovation and agility are buzzwords I hear a lot in IT. Innovation is more about culture than capabilities. It is inherently a proactive activity: you see a problem and choose to solve it in a new way. Agility is the ability to embrace change quickly; it is inherently reactive.
I had some exposure to an IT environment where everyone was compensated based on not being the cause of outages. After an outage, a root cause analysis would be done to determine which group’s compensation would be negatively affected. As you can imagine, this policy was created to reduce outages. In practice, it had a direct negative effect on mean time to resolution: during an outage, everyone was focused on making sure they didn’t get any of the blame. Innovation did not exist in this company because it had the potential to create outages, which were unacceptable. No one would work together on anything; they had a culture of blame instead of innovation.
Innovation requires organizations to be willing to endure acceptable downtime. Acceptable downtime was defined by Google as part of its site reliability engineering practice, where it is known as an error budget. The idea is that we can continue to innovate until we have passed the threshold for acceptable downtime for the month. Once that threshold is passed, risky changes wait; once the month has passed, the budget resets and innovation can continue. Site reliability engineers were expected to spend no more than about 50% of their time on operations, with the rest going to engineering work such as automating operations. Using acceptable (or allowed) downtime has turned the traditional SLA model upside down. It allowed Google’s IT to innovate at a much faster pace. Increased proactive innovation directly reduces the amount of reactive work being done.
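To make the acceptable-downtime idea concrete, here is a minimal sketch of the arithmetic in Python. The 99.9% availability target, the 30-day month, and all function names are assumptions chosen for illustration; they are not Google's actual tooling or targets.

```python
def allowed_downtime_minutes(availability_target: float, days_in_period: int = 30) -> float:
    """Minutes of downtime the availability target permits in the period."""
    total_minutes = days_in_period * 24 * 60
    return total_minutes * (1.0 - availability_target)


def remaining_budget_minutes(
    availability_target: float, downtime_so_far: float, days_in_period: int = 30
) -> float:
    """Allowed downtime minus the downtime already spent this period."""
    return allowed_downtime_minutes(availability_target, days_in_period) - downtime_so_far


def can_ship_risky_change(
    availability_target: float, downtime_so_far: float, days_in_period: int = 30
) -> bool:
    """Innovation continues only while some of the budget is left."""
    return remaining_budget_minutes(availability_target, downtime_so_far, days_in_period) > 0


if __name__ == "__main__":
    target = 0.999  # hypothetical target: 99.9% availability per 30-day month
    print(f"Allowed downtime: {allowed_downtime_minutes(target):.1f} minutes")  # 43.2
    print(f"Ship after 10 minutes down? {can_ship_risky_change(target, 10.0)}")  # True
    print(f"Ship after 50 minutes down? {can_ship_risky_change(target, 50.0)}")  # False
```

With these assumed numbers, a team has about 43 minutes of downtime to spend each month. An outage that uses up the budget pauses risky changes until the next month begins, rather than triggering a search for someone to blame.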
The second real challenge is that innovation and agility demand change.
We are focused on the initial state
Manufacturing is concerned with the initial state. Auto manufacturing has optimized every portion of the process. Manufacturers have the supply chain whipped and huge buildings full of robots, and they produce tens of thousands of cars a day. All of these efforts optimize the end deliverable: a car delivered to the consumer. Once the consumer takes ownership, all optimized automation ends. Once you reach 5,000 miles, you have to take the car to a shop where a human changes the oil. If something breaks, humans change the parts and immediately start to break the standardization and quality created by the initial automation. End-to-end creation of a car takes roughly 17 hours. That same car is likely to be in the wild for 87,600 hours (10 years), yet everything is focused on optimizing the 17 hours of initial state. There are a number of parallels between cars and IT. Most IT shops seem to be focused on delivering initial state quickly (day one); a lot less thought is given to day two operations, which will persist for the next five to ten years. The major difference is the outcome the customer expects. With a car, you expect a drivable product with some level of quality. With a server, you expect it to operate in the fifth year the same as it did at initial delivery.
The third challenge is that we continue to focus on initial state instead of treating the life of a service as a constant source of change.