13 April 2021 Automating governance in the Cloud with AWS and Azure
Governance automation in the cloud
The goal of automating governance is to improve productivity and raise levels of compliance and adherence to best practice by using automation in place of approval forums.
Why automate governance?
This approach has been hugely successful for companies like Netflix (see: Jason Chan, from Gates to Guard Rails). Automating governance takes a variety of forms, including best practice patterns, templates and guard rails. In this article we will focus on guard rails: preventative, detective and remedial controls that run against configuration items in your Cloud. Guard rails typically deal with cyber security, access control and compliance concerns (e.g. PCI DSS), but also with best practice related to operability, cost management, tagging and the like.
Imagine you work within a constraint around what access your Cloud Account or Subscription has to the outside world. A manual governance approach might see you seek approval from a networking or cyber team prior to making changes, or even delegate this work to those specialists. The governance automation approach takes a different tack: your Account/Subscription is set up so that the only options available are approved, known-good ones, and monitoring rules fire regularly to determine whether best practice is being adhered to. The results are made available to a wider community who can step in and help, as well as talk through with you the reasons for the intervention. In some cases auto-remediation may kick in, with automatic encryption of AWS S3 Buckets being a frequently cited example.
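To make the three kinds of guard rail concrete, here is a minimal sketch in plain Python. It is not tied to any Cloud SDK: `Bucket` and `EncryptionGuardRail` are hypothetical names invented for illustration, and the "encrypt unencrypted buckets" rule simply mirrors the S3 example above.

```python
from dataclasses import dataclass, field
from enum import Enum


class ControlType(Enum):
    PREVENTATIVE = "preventative"  # blocks the change before it happens
    DETECTIVE = "detective"        # reports the problem after the fact
    REMEDIAL = "remedial"          # fixes the problem automatically


@dataclass
class Bucket:
    """Hypothetical stand-in for a storage bucket configuration item."""
    name: str
    encrypted: bool = False


@dataclass
class EncryptionGuardRail:
    """Hypothetical guard rail: every bucket should be encrypted."""
    control_type: ControlType
    findings: list = field(default_factory=list)

    def apply(self, bucket: Bucket) -> Bucket:
        if bucket.encrypted:
            return bucket  # already compliant, nothing to do
        if self.control_type is ControlType.PREVENTATIVE:
            # Refuse the change outright.
            raise PermissionError(f"{bucket.name}: unencrypted buckets are not allowed")
        if self.control_type is ControlType.DETECTIVE:
            # Record a finding but leave the bucket untouched.
            self.findings.append(f"{bucket.name}: encryption is disabled")
            return bucket
        # Remedial: record what happened and return a corrected copy.
        self.findings.append(f"{bucket.name}: encryption was enabled automatically")
        return Bucket(bucket.name, encrypted=True)
```

The same rule produces three different outcomes depending only on the control type, which is why it is attractive to trial a rule as detective first and tighten it later.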
The benefits of working in this way are substantial: once set up correctly, product and DevOps teams get autonomy without having to worry about significant consultation and approval burdens, and people concerned with governance and standards get a guaranteed outcome and a standardised way of assessing compliance status. Of course, up-front investment is required, but once it is in place greater levels of trust develop between teams aligned to the DevOps ethos: using collaborative engineering to solve our challenges. Our Site Reliability Engineering (SRE) team also get involved to take the sting out of this undifferentiated heavy lifting, which is just as well, since the fine-grained nature of access control and config management in the Cloud means there is a fair amount of skilled work in setting this up correctly.
At Centrica, we’re primarily an AWS and Azure shop. So how do these platforms compare on this?
Amazon provides AWS Organizations, a directory hierarchy of organisational units (OUs) linked to a Payer account. Service Control Policies (SCPs) attached to each node govern what can and can’t be done in the accounts beneath it (unlike other controls, SCPs also restrict the root user of those accounts).
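As an illustration of a preventative control, the sketch below builds an SCP document that denies creating or attaching internet gateways, in the spirit of the "access to the outside world" constraint discussed earlier. The choice of actions and the `Sid` are assumptions made for this example, not a recommended policy, and attaching the policy to an OU through AWS Organizations is not shown.

```python
import json


def build_deny_internet_gateway_scp() -> dict:
    """Return an illustrative SCP that blocks internet gateway changes.

    The statement ID and the selection of actions are assumptions for
    this example; a real policy would be agreed with your network team.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInternetGatewayChanges",  # hypothetical name
                "Effect": "Deny",
                "Action": [
                    "ec2:CreateInternetGateway",
                    "ec2:AttachInternetGateway",
                ],
                "Resource": "*",
            }
        ],
    }


# Serialise to the JSON text that would be stored as the policy content.
scp_json = json.dumps(build_deny_internet_gateway_scp(), indent=2)
print(scp_json)
```

Because the effect is an explicit `Deny` applied from the organisation level, teams in the affected accounts cannot grant themselves the permission back, which is what makes this a guard rail rather than a guideline.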
Also provided is AWS Config, a rules engine that fires periodically and whenever a configuration item in your Account changes. AWS Config ships with managed rule sets, and you can develop your own custom rules. An example of a managed rule is access-keys-rotated which, as the name suggests, fires an alert if access keys in an account are not rotated within a specified time period. These alerts are placed in CloudTrail and are therefore bound to an AWS Account, where they can be viewed in AWS Security Hub and even shipped to your SIEM. At the time of writing there were 180 managed rules available, all capable of working hard on your behalf.
Note that, unlike SCPs in AWS Organizations, AWS Config rules are associated with a given Account and must be duplicated if you want to apply them more widely. Rolling them out across accounts is possible with some work and perhaps a bit of Open Source (e.g. Cloud Custodian), but there are complications. For example, I may wish to trial a detective control before making it preventative at a later date, which is useful because a restrictive control may turn out to be too draconian or require other changes to make it work. With the AWS toolset we have to jockey things around, from detective using AWS Config at the account level to preventative using SCPs at the AWS Organization/Payer level. And AWS Config is not free.
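The logic behind a detective rule like access-keys-rotated is simple enough to sketch. The function below is not the AWS implementation and does not call AWS at all: `evaluate_key_rotation` is a hypothetical helper, and the 90-day default is an assumption chosen for the example. The return values borrow the compliance labels AWS Config uses.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def evaluate_key_rotation(
    key_created_at: datetime,
    max_age_days: int = 90,  # assumed threshold for this example
    now: Optional[datetime] = None,
) -> str:
    """Classify an access key by age.

    Returns "COMPLIANT" if the key is no older than max_age_days,
    otherwise "NON_COMPLIANT".
    """
    now = now or datetime.now(timezone.utc)
    age = now - key_created_at
    if age <= timedelta(days=max_age_days):
        return "COMPLIANT"
    return "NON_COMPLIANT"


# Usage with fixed dates so the result does not depend on today's date.
check_time = datetime(2021, 4, 13, tzinfo=timezone.utc)
fresh_key = datetime(2021, 3, 1, tzinfo=timezone.utc)   # 43 days old
stale_key = datetime(2020, 10, 1, tzinfo=timezone.utc)  # 194 days old
print(evaluate_key_rotation(fresh_key, now=check_time))
print(evaluate_key_rotation(stale_key, now=check_time))
```

A custom rule would wrap a check like this in whatever plumbing the platform requires; the point is that the rule only reports, it does not stop anyone from keeping an old key.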
Microsoft provides Azure Policy to achieve all guard rail outcomes (preventative, detective, remediative) for Subscriptions in a Tenant from a single dashboard. And with a whopping 635 managed rules at the time of writing together with custom options, Microsoft clearly see this as a strategic underpinning of their Enterprise Cloud. They describe their governance architecture as follows:
Azure Policy also has the concept of Initiatives, a way of grouping policies that reinforce a shared goal (e.g. tagging, PCI DSS compliance) to simplify management. As you would expect, there is a hierarchy of management groups, much like AWS Organizations. Microsoft report that all 300 of their top customers are using Azure Policy and the quality of this feedback shows in the product, and unlike AWS Config it’s free to use. Another potential advantage is simplified integration with Azure AD: many organisations, regardless of which Cloud provider they use, will want to integrate with the Microsoft stack when it comes to internal user identity. There is also a nice tie-in with Azure DevOps, allowing API calls to be made to the policy engine to enforce policy as code and illustrating how Microsoft’s strategic emphasis on E2E lifecycle management is paying off.
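To show how a single Azure Policy definition can cover both the detective and the preventative case, here is a sketch that builds a "required tag" definition with the effect exposed as a parameter. The helper name `build_required_tag_policy`, the display name and the `costCentre` tag are assumptions for illustration; deploying the definition and grouping it into an Initiative are not shown.

```python
import json


def build_required_tag_policy(tag_name: str, default_effect: str = "Audit") -> dict:
    """Return an illustrative policy definition requiring a tag.

    With the effect set to "Audit" the policy only reports resources
    missing the tag; switching it to "Deny" blocks them instead.
    """
    if default_effect not in ("Audit", "Deny"):
        raise ValueError("default_effect must be 'Audit' or 'Deny'")
    return {
        "properties": {
            "displayName": f"Require the {tag_name} tag",  # hypothetical name
            "mode": "Indexed",
            "parameters": {
                "effect": {
                    "type": "String",
                    "allowedValues": ["Audit", "Deny"],
                    "defaultValue": default_effect,
                }
            },
            "policyRule": {
                # Match any resource where the tag is absent.
                "if": {"field": f"tags['{tag_name}']", "exists": "false"},
                # The effect is resolved from the parameter at assignment time.
                "then": {"effect": "[parameters('effect')]"},
            },
        }
    }


policy = build_required_tag_policy("costCentre")
print(json.dumps(policy, indent=2))
```

Trialling the control as detective and later making it preventative then becomes a change of one parameter value rather than a move between two different tools.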
Automating governance should be near the top of your priority list when evolving Cloud capability, particularly if you’re a growing or large enterprise wrestling with the balance between centralised and devolved models of control. There are some great tools out there to help, but you will need to invest in solid engineering and, depending on the platform, even bear some product costs. You can achieve it whichever public Cloud provider you use, but there are differences: while AWS is great, its automated governance is more fragmented, requiring work on the part of SRE and DevOps teams to integrate. As a result it’s a little less friendly for professionals working in the compliance space. I do question the implicit AWS strategy of making customers pay for automated governance, as it could serve as a disincentive. Microsoft’s approach, perhaps as a result of their Enterprise heritage, seems more thought through.