Being the largest cloud platform in the world comes with serious responsibilities, not just to your customers, but also to regulators, governments, and the wider internet as a whole.
Security and privacy are core pillars for any cloud provider to uphold and maintain, but the stakes don’t come much higher than for Amazon Web Services (AWS) – after all, it’s a long way to fall.
We spoke to leaders from Amazon’s cloud computing division at its recent security-focused AWS re:Inforce 2023 event to gain insight into the emerging trends of cloud security, the changing face of cyberattacks thanks to the generative AI boom, and the importance of getting the basics right before anything else.
AI on both sides
Since the public release of ChatGPT in November 2022, generative AI has grown rapidly to dominate the market, with powerful automated tools now available to anyone with access to the internet. This includes threat actors, who can use them to code malware or pen convincing phishing emails in an instant. So how do the defenders protect the cloud from this powerful new threat?
Mark Ryland, Director at the Office of the CISO at AWS, thinks that while the new danger is real, we need to get such advanced threats into perspective:
“There’s a lot of basic things we need to get better at [phishing etc.]. Cloud can help with that: we have a lot of automation, a lot of tooling, a lot of features and capabilities that make it easier to manage infrastructure, to get away from all the low level technology you have to manage on premises, it uses higher level services, so that continues to be a theme.”
“I’ll have a sales team that says ‘my customer wants to talk to you about the evolving threat landscape – the most sophisticated actors, what are they doing?’ Well, that’s an interesting topic, but probably that’s not gonna be your problem – your problem is probably gonna be some really basic, you know, Russian hacker gang that’s just out there hitting everybody and if you don’t patch that last Windows CVE then you’re gonna get in trouble. And it’s not very exciting, it’s not very sexy – it’s just routine, but that is what you should really focus on, then you can talk about advanced actors.”
And when it comes to deploying AI on the other side of the fight, Hart Rossman, VP of Global Security at AWS, believes that thoughtfulness needs to be applied in the development of such defensive tools:
“The way I’ve been talking to customers and our team about it is by trying to push some of the hype to the side and talk about the practical applications. And so, the mental model I’ve been using is kind of: AI is the new DevOps. If you go back a little more than a decade, particularly in the security space… we were moving some traditional development methodologies to… DevOps, and it took about a decade for the world to get comfortable with that… And that’s like the modern way of building applications now.”
“AI and some of the advanced applications of machine learning [are having] that watershed moment… where we’re going to democratize it, we’re going to make it available broadly for everybody, and now’s the time to start thinking about the practical applications.”
“Today, that seems to be, for the most part, as what we used to call expert systems – so, automation that helps somebody do their job better, faster, smarter, cheaper, but not necessarily showing groundbreaking new capability; I think that’s not far off, though.”
“And I know we’re working on cool stuff… but today, I’m really encouraging the team to look for practical applications; again, teach them about the technology that demonstrate meaningful outcomes (security) and then really use that to prime an innovation pump right over the next several years to take care of our needs today.”
“And so, again, it’s those practical applications that show real material benefit to the customer today that I’m really trying to remind people to stay focused on.”
Security concerns today
So while the fundamentals of security remain as important as ever within the cloud, how does AWS tackle them?
Rossman was enthusiastic about Cedar, the newly developed open source language from AWS designed for writing access controls:
“Cedar [is] fundamentally two things: it’s a human-readable security-validated policy language – so it makes it easier for people to write correct authorization policies – and then there’s an engine that… parses those policies [in a way] that’s provably correct.”
“Our automated reasoning group… uses… mathematics and logic to prove that a particular algorithm performs as advertised and only as expected – we call it computational correct. And so Cedar is the first open source authorization policy language – certainly that I’m aware of, but I think ever – that has been designed with correctness as a first principle… it’s provably correct; it’s automated reasoning.”
“And so, what does that do? It increases the trust for a user, it increases the ease of implementation for a developer, and if you’re in compliance or audit it’s also human readable, so you can actually look at the policy… when you’re having that conversation about PCI compliance, a lot of it has to do with identity access control – whether it’s access to… computational resources or network access – and being able to look at that in Cedar just makes it substantially easier for a QSA to determine whether or not you’re compliant.”
“And then what we really care about is: is it effective in application? And that’s where the verifiability comes in. So you’ve got compliance and audit saying, ‘you’re doing the right thing’ – and it was easy to figure that out – and then you’ve got the system enforcing it which is what makes the security valuable.”
“I’m excited that so many people are talking about Cedar… most authorization policy languages and authorization engines are closed source and built into a service …and the cool thing about Cedar is anybody – it could be one person in their garage, it could be a multinational corporation – can now have provably secure authorization at scale and a validation engine to go with it, in any application they want to write. It’s awesome – it demonstratively raises the bar for security for anybody on the planet.”
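To make the idea of an authorization engine more concrete, here is a minimal Python sketch of the decision rule that Cedar documents: a request is denied unless some policy permits it, and any matching forbid policy overrides a permit. This is a toy model only. It is not Cedar syntax, not Cedar's API, and it says nothing about the formal verification Rossman describes. The names `Request`, `Policy`, `is_authorized` and the sample principals and resources are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Request:
    """Who wants to do what to which resource (all names are illustrative)."""
    principal: str
    action: str
    resource: str


@dataclass(frozen=True)
class Policy:
    """A toy policy: an effect plus a predicate deciding whether it applies."""
    effect: str  # either "permit" or "forbid"
    matches: Callable[[Request], bool]


def is_authorized(policies: list[Policy], request: Request) -> bool:
    """Deny by default; a matching forbid always beats a matching permit."""
    effects = [p.effect for p in policies if p.matches(request)]
    if "forbid" in effects:
        return False
    return "permit" in effects


# Hypothetical policy set: alice may view things, but nobody may touch payroll.
POLICIES = [
    Policy("permit", lambda r: r.principal == "alice" and r.action == "view"),
    Policy("forbid", lambda r: r.resource == "payroll.csv"),
]

allowed = is_authorized(POLICIES, Request("alice", "view", "photo.jpg"))
blocked = is_authorized(POLICIES, Request("alice", "view", "payroll.csv"))
unknown = is_authorized(POLICIES, Request("bob", "view", "photo.jpg"))
```

Because the rule set is this small and explicit, an auditor can read the policies directly, which is the readability benefit Rossman points to.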
Another key consideration for a cloud service as large and expansive as AWS is to make sure that elements are effectively partitioned, so that if unauthorized access were to occur in one layer, the keys to your entire kingdom wouldn’t be handed over. Ryland explains:
“Even the network is not just the ‘outside’ and the ‘inside’ – even inside we should have all these sub-insides, sub-environments, where lateral movement is harder or impossible because there’s really no need for that part of your system to talk to that part except under very controlled circumstances, so you don’t have a generally open network, instead you have clear limits and demarcations.”
“And in the cloud for years we’ve had this feature called security groups which are built within the EC2 environment and very easy to use dynamic software-defined firewalling technology where you could just literally configure any server running with the security group and they can talk to each other but they can’t talk to other things… that technology… has always been used in EC2 but people are beginning to use it in other environments… in a way [it’s] doubling down on network, but just making it so the segments are smaller and more contained if there’s a problem, and then adding that identity based signal as well wherever you can.”
Ryland does note, however, that, “it does require, sometimes, upgrading applications or putting proxies in because there is legacy technology that doesn’t understand identity signals in a network path – but there are ways to [do] it, and more and more we see people building those capabilities and it’s a very good trend.”
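The behavior Ryland describes, where servers in the same security group can talk to each other but not to anything else, can be sketched in a few lines of Python. This is a simplified model, not the EC2 API: real security groups are built from explicit allow rules, and the sketch assumes every group carries a rule allowing traffic from its own members. `Server` and `may_connect` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A server and the names of the security groups attached to it."""
    name: str
    groups: set[str] = field(default_factory=set)


def may_connect(source: Server, destination: Server) -> bool:
    """Allow traffic only when both servers share at least one group.

    Assumption: each group allows traffic from its own members and
    nothing else, so everything outside a shared group is denied.
    """
    return bool(source.groups & destination.groups)


web1 = Server("web1", {"web"})
web2 = Server("web2", {"web"})
db = Server("db", {"database"})

same_segment = may_connect(web1, web2)    # both are in "web"
cross_segment = may_connect(web1, db)     # no group in common
```

Smaller groups mean smaller segments, so a compromised server can reach fewer neighbors.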
For his part, Rossman similarly explains:
“Services are isolated, in a variety of ways – all the way back to the idea of a two-pizza team where, you know, you have a single development team that deeply owns a feature or service end-to-end, working backwards from the customer, driving the innovation – and so, from that sense, just like in a VPC… we kind of limit the blast radius in a number of different ways as you work out from the account down into [a] region, availability zone… a particular service implemented… particular VPC, particular subnet.”
Customer privacy
With all the power and security features one needs to implement large-scale cloud computing successfully, how is customer privacy preserved, both from threat actors and AWS itself? With the copious amounts of sensitive data the company holds, users are placing a huge amount of trust in the cloud provider to keep it away from prying eyes – whoever they belong to.
“One thing that’s emerging now that’s quite interesting,” says Ryland, “is privacy-preserving data sharing… for example… I can share data with you in some way that still limits your visibility into the data.”
“We have a service we launched not long ago called Clean Rooms and [it] is specifically designed for data sharing between parties who don’t want to fully trust each other, but they need to collaborate on data, and it gives you a lot of really sophisticated and rich access controls and limits as to who can see what, time-bound access – things like that.”
“Also, when used with client software that we provide, which encrypts data in a way that still allows searchability, for example, and queries, then you can really do some interesting things to create these sort of shared data environments where you can collaborate with someone.”
Ryland gives the rather innocuous example of sharing an IP address, which, as he points out, “under many data regulations is considered… Personally Identifiable Information: if it’s the IP address of your home router, that’s considered PII.”
“I can’t share IP addresses with you, but if you also have that same IP address in your database, and I have it and we encrypt it… we recognize, ‘hey, we both have this.’ So we haven’t shared anything, we haven’t violated any transfer, but… we can correlate data even though we don’t actually have to share the data.”
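The correlation Ryland describes can be illustrated with a keyed hash: both parties transform their IP addresses with the same secret key and compare only the resulting tokens. The Python sketch below uses HMAC-SHA256 from the standard library. It is an illustration of the general idea, not a description of how Clean Rooms or its client software works, and a real deployment would need a stronger protocol, since anyone holding the key could enumerate the small IPv4 address space. The key, the function name `blind`, and the addresses (taken from documentation-only ranges) are all hypothetical.

```python
import hashlib
import hmac


def blind(values: set[str], shared_key: bytes) -> set[str]:
    """Replace each value with a keyed hash so the raw value is never exchanged."""
    return {
        hmac.new(shared_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        for value in values
    }


# Assumption: both parties agreed on this key out of band.
SHARED_KEY = b"example-shared-key"

party_a_ips = {"203.0.113.7", "198.51.100.4"}
party_b_ips = {"203.0.113.7", "192.0.2.55"}

tokens_a = blind(party_a_ips, SHARED_KEY)
tokens_b = blind(party_b_ips, SHARED_KEY)

# Only the tokens are compared; a match means both sides hold the same address.
shared_count = len(tokens_a & tokens_b)
```

Here each side learns that one address is common to both databases without either sending a raw IP address to the other.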
“A lot of the core use cases are marketing and advertising… because they want to understand things like demographics and users without being able to literally share data, but there’s broad applicability to this kind of technology… so if you’re working with fraudulent behavior on your platform, you would much rather be able to tell some other equivalent business about fraud that’s occurring on your platform, but you’re not able to just share the data in a raw form because of privacy concerns.”
“But if you could do correlation of data, and say, ‘if you’re seeing this, then that to us is an indicator of a fraudulent user,’ we could potentially do that in ways that preserves privacy, and so that’s a project that I’ve been working on with some other companies to see if we can come up with a way.”
“We’re getting a lot of encouragement, if you will, or pressure from the US government and other governments to do more work to try to prevent fraudulent abusive use of cloud infrastructure… because you don’t want bad actors using cloud.”
Regarding the points of entry for these bad actors, Ryland again explains that the basic methods remain effective, saying that they often get in by, “compromising innocent customers who have unpatched infrastructure… you don’t patch your WordPress site, someone’s gonna hack in and establish a presence there and you won’t even notice it.”
When it comes to protecting the privacy of AWS’ customers from itself, Ryland compares the cloud to an envelope: “We don’t look at the letters – we see the outside, we see traffic coming and going, we see DNS lookups (if you use our DNS service), we see – as part of your billing information – how many instances you’re using and how much data you’re storing – all that stuff is part of the metadata surrounding your environment that we process.”
“We still protect that data very carefully and keep humans away from it, but it is part of what a cloud is. But inside your compute nodes, inside your storage, inside your databases – that to us is toxic: we don’t want to see that, we try not to see it, and we have many protections against ever seeing it and keeping that isolation as much as possible.”
“We have some services now that we call hermetic services where we literally can’t see it – we don’t have any technical means to look inside your workload, and that most importantly includes our core compute service, the EC2 Nitro architecture… we designed it from the start to be completely isolated from even privileged operators in AWS, so we have no way to look inside the storage, the memory and so forth. And our key management service is also like that – in terms of the cryptographic keys that are used to encrypt all your data.”
“Between those core services and spreading out from that, we’ve built more and more protections such that we, again, can see the outside edge of your workload, [but] we don’t see the inside. And then we work with audit teams and compliance organizations to provide evidence to prove that that’s true and then customers can rely on that in making decisions about running very sensitive workloads in AWS.”
Despite this, Ryland says there are still instances within AWS where complete privacy isn’t possible:
“It’s an ongoing trend, and we still have some services where it’s hard to operate in a completely ‘lights out’ way; probably the longest challenge will be when you run commercial database engines… and so forth – those were never really designed for completely automated operation, they always kind of assumed that a DBA could log in and do certain things: rebuild an index or drop a table because of this or that.”
“So even though we have a lot of automation to do some of those DBA functions, there are times when a human has to log in and do something – it’s rare, but it does sometimes occur, just to keep the system up and running… to meet the SLA.”
“So now we’re developing technology to inform customers of those rare – but not zero – operations. And over the long term, probably provide capability to give you a workflow where you can say yes or no… we can’t speak for availability, but if you don’t want us to log in, we won’t.”
“But a lot of the core technology is effectively already to the level where our privileged operators just can’t see your data at all, which is what we want and what customers normally want – we have some customers, like law enforcement that wishes that we could see inside all of your data.”
“So we have these weird conversations… where we’re talking to the law enforcement agency and they’re like… ‘here’s a court order, give me access… to that data’ – we literally can’t do that; we don’t have the technology.”
“And then another government agency is like, ‘thank God, I run workloads on AWS and I sure hope nobody can see inside!’”