Site Reliability Engineering – From DevOps to NoOps
Site Reliability Engineering (SRE) is a practice that combines software development skills and IT operations into a single job function. It relies on automation and on continuous integration and delivery to improve the reliability of highly dynamic systems. The concept originated at Google in the early 2000s and was documented in a book of the same name, Site Reliability Engineering (a must-read). SRE shares many governing concepts with DevOps: both rely on a culture of sharing, metrics, and automation. SRE can be thought of as an extreme implementation of DevOps.

The SRE role is common in cloud-first enterprises and is gaining momentum in traditional IT teams. An SRE is part systems administrator, part second-tier support, and part developer. The role suits people who are inquisitive by nature: always acquiring new skills, asking questions, and solving problems by embracing new tools and automation.