Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds
Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds Thomas Ristenpart, Eran Tromer, Hovav Shacham, and Stefan Savage, Proceedings of Computer and Communications Security -- CCS '09
Reviews due Thursday, 3/8.
Comments
In this paper the authors present their work exploring the novel vulnerabilities introduced by cloud computing. Their exploration is done mainly on the Amazon EC2 service.
First, they use network probing to measure the placement of virtual machines in the cloud. Assuming that different availability zones are likely to correspond to different internal IP address ranges, they map EC2 IP addresses to availability zones using both internal and external network probing. With this information they are able to infer the instance type and availability zone of public target instances.
Next, they explore how to decide whether two virtual machines are on the same physical machine (co-resident). They use several methods. The best method is a set of network-based checks: they compare the Xen host (Dom0) IP address, the round-trip time of packets, and the distance between the two internal IP addresses. They also use a hard disk covert channel to transmit bits as a co-residence check, which gives very good results. Moreover, they can use a load-based method to detect co-residence, which does not rely on the network-based checks.
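The three network-based checks can be combined in a few lines. The following is a minimal Python sketch, not the authors' tooling: the function names, the 1 ms round-trip threshold, and the address-gap threshold of 8 are illustrative assumptions, and the inputs (first-hop Dom0 address, measured round-trip time) are assumed to have been collected separately, for example with traceroute and ping.

```python
import ipaddress


def ip_distance(ip_a: str, ip_b: str) -> int:
    """Numerical distance between two IPv4 addresses."""
    return abs(int(ipaddress.IPv4Address(ip_a)) - int(ipaddress.IPv4Address(ip_b)))


def coresidence_signals(dom0_a, dom0_b, rtt_ms, ip_a, ip_b,
                        rtt_threshold_ms=1.0, max_ip_gap=8):
    """Evaluate the three network-based hints; both thresholds are assumptions."""
    return {
        "same_dom0": dom0_a == dom0_b,
        "low_rtt": rtt_ms < rtt_threshold_ms,
        "close_ips": ip_distance(ip_a, ip_b) <= max_ip_gap,
    }


# Hypothetical measurements for two instances (addresses are made up).
signals = coresidence_signals("10.254.8.1", "10.254.8.1", 0.4,
                              "10.254.8.12", "10.254.8.15")
likely_coresident = all(signals.values())
```

Requiring all three hints to agree is a conservative choice for this sketch; a real check would need to be calibrated against instances whose placement is known.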
The next step is to deploy a VM instance co-resident with a target instance so that the attacker can do something malicious. They find that instances created at about the same time are likely to be deployed on the same physical machine, and they use this placement locality to deploy instances co-resident with the target. Since the instances of a service start automatically to meet scaling needs, they can detect the start of a target instance and start their own instances within a small time lag. In this way, they have a 40% chance of getting an instance co-resident with the target. Their experiment shows that the probability decreases as the time lag increases, which is evidence of placement locality.
Once they are able to place an instance co-resident with a target instance, there are many ways they can attack the target, such as stealing cryptographic keys, denial of service, measuring cache usage, estimating traffic rates, and keystroke timing attacks. If core migration exists, it is more difficult to steal cryptographic keys through cache-based side channels. But through the side channels they can still extract fewer bits of information in a robust and simple way. By measuring cache usage they can measure the load of the target instance and therefore estimate its traffic. They can also gather inter-keystroke times, which can be used to recover passwords through statistical analysis.
The authors also give suggestions for preventing the attacks they explored, such as obfuscating both the internal structure and the placement policy, inhibiting the side channels, and exposing the risk and placement decisions directly to users.
Posted by: Xiaoming Shi | March 8, 2012 10:51 AM
This paper is fairly different from the others we've reviewed. The authors are analyzing security properties of Amazon's EC2 cloud service. EC2 is a distributed system and should include in its design parameters that the clients be kept isolated from each other. Certainly they should never be able to target specific other clients. This paper shows that the system is vulnerable in this way.
The attack is split into two parts: getting coresidence on a cloud machine with a process you want to attack, then actually performing the attack. The authors' approach to the first is to map Amazon's assignment mechanism for processes. It turns out that this assignment is not completely random across the system but clustered according to some parameters. An attacker can be more sure that their process starts on the same machine if they know some properties of the process being targeted. Once instantiated, a process can check for coresidence by comparing network addresses. The discussed method demonstrated (effectively) a zero false positive rate. The authors conclude that the cloud service has made it too easy for processes to map its internal structure and that the design of the assignment mechanism, while making the job of maintenance and load balancing easier, is too simple with respect to the security of clients. Various measures ought to be taken to restrict the amount of information a client can gain about the cloud system.
Once coresident on a machine, the attacker can exploit the shared hardware to examine the target. One method in particular uses the shared cache to measure activity of the target. The proposed defenses to these attacks would lead to inefficiency, and examining alternatives lies outside the scope of a distributed systems course.
The methodology of the authors was to think of a vulnerability and experiment, documenting its success rate. Some impressive results came out of it, but I wish it were clearer how the attack ideas were generated. I don't feel that reading this paper has left me better capable of designing a secure system or demonstrating that a system is insecure. Perhaps that is simply the nature of security work. Earlier I was wondering why the authors didn't approach Amazon for confirmation that, say, their guess about machine assignment being linked to size was correct. After some thought, I realized that the authors have proved not only that these vulnerabilities exist but also the stronger fact that no help is needed to discover that they exist.
Posted by: Brian Nixon | March 8, 2012 08:39 AM
1. one or two sentence summary of the paper
The paper mainly claims that the use of virtualization in third-party cloud computing services introduces new vulnerabilities. Focusing on the case study of the Amazon EC2 service, the authors claim that one can map the internal cloud infrastructure, figure out the location of a particular target virtual machine, and instantiate new VMs that are placed co-resident with the target VM. The authors also state that such co-resident placement may lead to information leakage from the target VM. In addition, the authors discuss defenses a cloud provider might try in order to prevent such attacks.
2. A description of the problem they were trying to solve (and how the problem solved)
This paper aims to explore how the attacks mentioned above can work in a real cloud computing system and what the risk of such attacks is. In particular, as mentioned in the introduction of the paper, there are four main questions that this paper tries to answer. First, can one determine where in the cloud infrastructure an instance is located? Second, can one easily determine whether two instances are co-resident on the same physical machine? Third, can an adversary launch instances that will be co-resident with other users' instances? Finally, can an adversary exploit cross-VM information leakage once co-resident? In addition, the paper proposes defenses that a cloud provider could use to prevent such attacks.
3. A summary of the contributions of the paper
The main contribution of this paper is that it points out new vulnerabilities of the virtualization techniques (users instantiate VMs, and VMs may share the same physical machine) used by third-party cloud computing services. Moreover, through the case study of the Amazon EC2 service, the paper proposes a practical method for carrying out the attack and claims that a successful attack is quite feasible. First, the authors describe how they map the internal structure of the EC2 service. They find that all the internal IP addresses associated with the service are divided among three different availability zones, and that there is a strong correlation between instance type and internal IP address. The authors claim that one can achieve co-resident placement using this technique with a success rate of up to 8%, and that the rate can be higher by exploiting the observed sequential and parallel placement locality. In addition, the authors explore information leakage, primarily using previously established techniques to measure system load on the other VMs, and they suggest that such techniques could be used to estimate traffic rates, monitor keystrokes, and measure cache usage.
4. The one or two largest flaws in the paper.
First, I wonder whether the conclusion of the paper is general and applicable to all cloud computing services, since this paper focuses only on the case study of EC2. It is quite possible that other services with a different infrastructure do not suffer from such risks.
Second, the suggestion of offering customers the option, for an extra fee, of getting a machine on which theirs is the only allowed VM may not be a good choice. I think it may go against the original purpose of cloud computing. In addition, similar services, such as dedicated server rental, are already available.
Posted by: Junyan Chen | March 8, 2012 08:00 AM
The paper exposes many security vulnerabilities that are possible in public cloud settings like EC2, which use virtualization to provide security, isolation among virtual machines, and better utilization of a single physical machine (by sharing). The authors show the practicality of such attacks using experiments on EC2.
The main idea behind the paper is that any possibility of getting two virtual machines co-resident on the same physical machine without any insider knowledge is a serious security loophole in the cloud infrastructure and is enough to enable many well-known security attacks (side channels, etc.). The authors explore this possibility and were able to successfully realize it on EC2. Though the method of achieving co-residence is not trivial, it seemed straightforward, pointing out the poor security levels in public cloud settings. For example, the assignment of new EC2 instances to machines is fairly deterministic, and the Dom0 IP address is allowed to appear in traceroute results, compromising critical security information. The possibility of achieving co-residence and detecting it in such a public cloud setting is the biggest contribution of this paper. After achieving co-residence, the authors experiment with establishing covert channels between two co-resident instances and with many such attacks. This points out that the Virtual Machine Monitor (Xen) is not providing good isolation between VMs running on the same physical machine.
The paper brings out very important and critical security loopholes in public cloud settings and also provides suggestions to avoid such vulnerabilities. But some of the attacks described are not convincingly motivated. For example, the keystroke timing attack doesn't seem to be an apt attack in a cloud setting. It is hard to imagine that a cloud instance will be used interactively in a way that requires typing passwords through a direct user interface. Even if it is remotely possible, it is hard to see that happening with high frequency. It should also be noted that remote access to these virtual machines would not involve keypresses transmitted one by one over the network, but rather in a burst.
The ideas presented in the paper are very relevant to the current setting and it is important for cloud providers, and customers to realize the possible security vulnerabilities when any sensitive application is hosted on the cloud.
Posted by: Venkatanathan Varadarajan | March 8, 2012 07:59 AM
"Hey, You, Get Off of My Cloud" is a paper detailing security holes in cloud
computing services, showing some staged attacks on Amazon's EC2 service. The
goal in the attacks was to achieve placement of your EC2 instance on the same
physical machine as the target victim instance, which allows attacks to be
mounted across covert channels on the physical machine. For this objective, it
was enough to verify that an attack is likely if co-placement could be achieved
with feasible probability on one or a small set of specific victims.
The main result/contribution is the discovery of Amazon's simple IP addressing
scheme among their machines used for cloud services, and the use of simple
methods to gain information about the neighboring machines such as sending ping,
traceroute, and HTTP requests. The authors showed that reasonably guessing
where a machine was located in Amazon's cloud was as simple as starting many
instances with known parameters, seeing what internal IP address was assigned to
the instance, and using traceroute requests to glean information on other
machines. This traceroute technique was also used to confirm whether an
instance was cohabited with another (possible target victim) instance. This
information alone was not enough to mount an attack with high probability on a
single instance, but Amazon's tendency to schedule instances on machines that
have just given up instances was. This knowledge allowed an attacker to monitor
a certain instance, wait until it is stopped and restarted, and launch many
instances. In one case, the likelihood of achieving co-residence was 40%.
The security holes defined in the paper are very simple, which is a neat aspect.
Amazon could easily prevent end users from inferring information about their
environment, but that is not a flaw in the paper, since the goal of security
papers is to find these holes so they may be patched. Through all this work
though, the
paper assumes VMs are inherently non-secure if they are located on the same
physical machine as a malicious VM. I am not familiar enough with VM security
to know if this is true, but the attacks laid out seem like they would deliver a
lot of false positives or negatives to the attacker. This may just be the
nature of security hacking though, a world I am not familiar with.
This study brings up many applicable points. One could be simply that, if a
service needs complete confidentiality, it probably should not be outsourcing
work to a public cloud service. These holes are avoided by keeping sensitive
computations in-house and protecting it. Also, it is easier to have predictable
IP addresses in a distributed system, but obviously insecure in this realm. If
a system is to be used this publicly, obfuscation of addressing may be a good
idea. On a service like EC2, preserving the customer's freedom to use a machine
as they could a local one (like allowing the use of traceroute, reading /proc, etc.)
is part of what makes the service attractive. The tradeoff is that allowing
these simple tools to be used from one EC2 instance to another can decrease
security significantly.
Posted by: Evan Samanas | March 8, 2012 07:50 AM
This paper reveals a potential security issue in public clouds: different tenants sharing the same physical infrastructure may be exposed to DDoS attacks or information leakage.
To achieve this kind of attack, the authors first use DNS lookups and controlled assignment to get the distribution of internal IPs across VM types and locations. They also use network probing and covert channel checking to verify whether two VMs are coresident. Based on this verification method, they propose two methods to deploy an attacker VM on the victim VM's physical machine. Brute-force placement simply creates new VMs again and again until one is placed with the victim VM. The other method uses placement locality: VMs that are started at the same time are more likely to be deployed on the same physical machine, so the attacker detects the victim's start time and creates new VMs at the same time. Finally, after the attacker deploys a VM on the victim VM's physical machine, he can read the shared cache to get information about the victim, or he can observe the physical machine's resource usage, learn the bottleneck resource, and perform a DDoS attack.
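The brute-force strategy can be given a rough back-of-the-envelope model. As a sketch, assume (this is an assumption for illustration, not a result from the paper) that each attacker VM launch independently lands on the victim's machine with probability p; the chance of at least one hit in n launches is then 1 - (1 - p)^n. The function names below are hypothetical.

```python
def hit_probability(p: float, n: int) -> float:
    """Chance that at least one of n independent launches is co-resident."""
    if not 0.0 <= p <= 1.0 or n < 0:
        raise ValueError("p must be in [0, 1] and n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def launches_needed(p: float, target: float) -> int:
    """Smallest n with hit_probability(p, n) >= target, found by simple search."""
    if not 0.0 < p <= 1.0 or not 0.0 <= target < 1.0:
        raise ValueError("need 0 < p <= 1 and 0 <= target < 1")
    n = 0
    while hit_probability(p, n) < target:
        n += 1
    return n
```

Placement locality breaks the independence assumption in the attacker's favor: launches timed to follow the victim's start are more likely to hit than this model predicts for the same number of VMs.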
Discussions: (1) In all their placement detection experiments, they use small VMs. A physical machine can hold more small VMs, which increases the probability of coresidence. I am wondering whether they can get the same high probability of coresidence if they use large VMs.
(2) In fact, the success of placing a VM co-resident with the victim VM depends heavily on the VM placement algorithm. Without knowing EC2's placement algorithm, we do not know whether this kind of attack is practical in other clouds.
(3) I am wondering why EC2 makes internal IP DNS public. I think only external IPs should be exposed to tenants. For connections between an internal VM and an external IP, a gateway should translate the internal VM IP to/from the external VM IP. For connections between internal VMs, the hypervisor should translate the internal VM IP to/from the tenants' external IP.
(4) In the DDoS attack, the attacker uses the bottleneck resource to affect the victim VM. When a provider deploys a VM, they should consider satisfying the VM's maximum usage of these resources. Currently, CPU, disk space, and memory are already considered, while network IO rate and disk IO rate are not. But I believe the DDoS problem can be solved by a more careful VM model. There is already work on VM bandwidth guarantees in clouds, but I am not aware of similar work on disk IO.
Posted by: Wenfei Wu | March 8, 2012 07:34 AM
This paper explores possible non-obvious vulnerabilities in the third-party cloud computing services. The obvious risk of using cloud services is that of services providers not respecting the confidentiality of customers data, which is very unlikely. The non-obvious vulnerabilities arise from multiple instances of virtual machines sharing the same physical machine. That allows instances of different customers to run on shared physical resources. As reported in the paper, important information, such as traffic rates and inter-keystroke times that could be used to guess which keys were typed, can be extracted from a victim instance by an instance of an attacker residing in the same physical machine. The paper describes techniques on how an attacker can manage to launch his instance in the same physical machine as his victim, using Amazon’s EC2 cloud computing service as a case study. It also presents how side-channels can be used to leak the information through shared resources.
The vulnerability presented in the paper shows that cloud computing service providers need to be careful with their implementation. A seemingly harmless algorithm can be exploited by a clever attacker. Amazon's cloud computing service (understandably) seems to try to maximize the efficiency of its data centers by using instance placement algorithms that result in temporal and sequential locality. However, once the authors observe this locality through heuristics and experiments, it can be exploited to place an attacker's instance on the same machine as a victim's instance. This points out that cloud service providers need to do the hard work of not only optimizing performance (for their profitability), but also considering how these optimizations could result in unforeseen exploits.
The paper presented the exploit clearly. It presented the questions one would ask about such an exploit and addressed each of them. For each of the attack steps, it also includes a discussion of how to prevent that kind of attack. In addition, because the attack was demonstrated on Amazon's EC2 without any information that is not available to users, we can be sure that it is practical.
To conclude, I believe this paper raised awareness in both service providers and users about vulnerabilities that could result from sharing physical machines for cloud computing services.
Posted by: Min Thu Aung | March 8, 2012 07:28 AM
This paper performs a practical analysis of security risks in
virtualization-based cloud-computing environments such as Amazon's EC2 and
Microsoft's Azure. These services offer flexible provisioning of rented time
on virtual servers running on physical hardware that is often shared with
virtual servers rented by other customers. The authors address two primary
topics in such environments, using EC2 as their test subject: influencing VM
allocation in order to achieve "co-residency" with a given target (controlling
a VM running on the same hardware as the target), and having done so,
constructing communication side-channels by which information can be leaked
from the victim to an attacker. Ultimately, they argue that mitigating these
risks entirely would be infeasible or have undesirable effects on efficiency,
and that cloud-computing providers should explicitly acknowledge these risks
and expose VM placement to customers.
The authors first explore how an attacker might attain co-residency by
performing a black-box mapping of the IP address space allocated to EC2
servers, finding very regular patterns corresponding to EC2 "availability
zones" and instance types. Easing detection of co-residency is the fact that
the IP address of the Xen Dom0, common to all DomUs on a physical machine,
appears in the IP route of each DomU. This bit of helpful information would
be simple for the provider to eliminate, however, so the authors also present
alternative methods of detection by inducing contention for shared resources,
such as CPU cache, disk access, and network bandwidth.
Having established the very realistic feasibility of achieving co-residency,
the authors then describe a number of covert channels an attacker could use to
surveil a Xen "neighbor", including even the potential recovery of passwords
by keystroke timing detection and analysis. These are largely similar to
previously-known covert channels in the context of multiple processes running
under the same operating system, but are complicated by the differences
between guest VMs on a hypervisor as compared to a multiprocessing OS. Though
it does not seem specific to VM-based covert channels, one technique I
particularly liked was introducing differential signaling via set-associative
caches in order to improve the signal-to-noise ratio of a CPU-cache channel.
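The benefit of differential signaling can be illustrated with a toy model.
The Python sketch below is not the paper's implementation: the latency
numbers, the two cache-set groups A and B, and the assumption that background
noise shifts both groups equally are all illustrative.

```python
BASE = 100.0           # assumed probe latency for an undisturbed group
EVICTED_EXTRA = 20.0   # assumed extra latency when the sender evicted a group


def measure(bit: int, noise: float):
    """Toy receiver measurement: the sender evicts group A for 1, group B for 0.

    The same noise term is added to both groups (common-mode noise).
    """
    latency_a = BASE + (EVICTED_EXTRA if bit == 1 else 0.0) + noise
    latency_b = BASE + (EVICTED_EXTRA if bit == 0 else 0.0) + noise
    return latency_a, latency_b


def decode_absolute(latency_a: float, threshold: float = BASE + EVICTED_EXTRA / 2) -> int:
    """Single-ended decoding: compare one group against a fixed threshold."""
    return 1 if latency_a > threshold else 0


def decode_differential(latency_a: float, latency_b: float) -> int:
    """Differential decoding: compare the two groups against each other."""
    return 1 if latency_a > latency_b else 0


# A burst of unrelated cache activity adds 50 units to both groups.
lat_a, lat_b = measure(bit=0, noise=50.0)
absolute_guess = decode_absolute(lat_a)                # fooled by the noise
differential_guess = decode_differential(lat_a, lat_b)  # noise cancels out
```

In this model the fixed threshold misreads the 0 as a 1 because the noise
lifts group A above it, while the differential comparison still recovers the
bit because the noise is common to both groups.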
The paper makes some effort to emphasize that their experiments in achieving
placement collisions were realistic (if not pessimistic) in their difficulty.
However, they did not seem to acknowledge the fact that pursuing only
"m1.small" instances makes achieving such collisions much easier (though this
does not particularly diminish the significance of their findings). Also, I
wondered when they were describing the disk-IO based side-channel whether or
not the randomness of the IO performed by the sender would really be very
significant. I would guess it likely the guests are each allotted a partition
(or something analogous, like an LVM volume) of a disk; given the
presumably-large displacement between any two VM's partitions then, I would
think any concurrent IO from different VMs, even if purely sequential from the
perspective of the VMs performing it, would induce enough head-seeks to
degrade performance quite detectably.
Posted by: Zev Weiss | March 8, 2012 05:35 AM
In 'Hey, You, Get Off of My Cloud' Ristenpart, Tromer, Shacham, and Savage present a number of techniques for mapping a cloud infrastructure, identifying where a target VM is likely to reside, gaining a VM co-resident with the target, and exploiting a cross-VM side channel. Most of the techniques are not new, but the combination of all of them is the major contribution of this paper. There are some possibilities for mitigating the threat posed by one or more of the techniques presented in the paper; here are a few.
Placement Attack:
A potential technique for avoiding being located by the brute-force placement attacks in 7.1 might be to move a service within the cloud at some interval: create a new instance, transfer the state of the old instance to it, and kill the old instance. That way there would be a moving target inside the cloud. With the data rates mentioned, even if someone did gain co-occupancy, a short instance lifespan could limit potential attacks. A more complicated defense might be to run a number of detection instances alongside your real services. An attacker's probes could be detected at that point, and since the attacker would probably eliminate the probed instance from their search space, the target instance could be moved to run where the probed instance was, actively hiding from the attacker.
Ssh timing attack
The referenced paper does not mention the population of users. Adding more users could spread out the statistics enough to increase the search space. Or, if the population included non-English speakers, the letter-pair timings might be different.
Is there a maximum password size? Passwords could be sent in a uniform-sized block so as not to leak password length information. If sending passwords in one packet is impractical, the Nagle algorithm could be turned on for passwords so that timing information between keystrokes is obscured.
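The uniform-sized block idea can be sketched as follows. This is a hypothetical framing written for illustration, not part of SSH or of the paper: the 64-byte block size and the one-byte length prefix are assumptions.

```python
BLOCK_SIZE = 64  # assumed fixed block size; one byte is reserved for the length


def pad_password(password: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Pack a password into a fixed-size block: length byte, password, zero padding."""
    if block_size > 256:
        raise ValueError("length must fit in one byte, so block_size <= 256")
    if len(password) > block_size - 1:
        raise ValueError("password too long for the block")
    padding = bytes(block_size - 1 - len(password))
    return bytes([len(password)]) + password + padding


def unpad_password(block: bytes) -> bytes:
    """Recover the original password from a padded block."""
    length = block[0]
    return block[1:1 + length]
```

Every block has the same size regardless of password length, but this only hides the length from an observer if the block is encrypted before it is sent, as it would be inside an SSH session.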
Avoiding cloud cartography:
If the Amazon EC2 web pages are being interpreted correctly, the Linux instances available offer root access to the customer. It is unclear if the kernel can be altered by the customer, but it's a possibility. It appears that many operating systems have a default TTL ( http://www.binbert.com/blog/2009/12/default-time-to-live-ttl-values/ ). Is there any reason for a smaller TTL other than TCP SYN-based traceroutes? TCP SYN packets with low TTLs could be filtered.
Further, one could avoid geolocation by ping round-trip times by causing the Dom0 controllers to add a random delay when responding to any ICMP echo request.
Avoiding Load Based Co-Residency Detection:
If a web service is replicated across more than one instance, a DNS hack could be installed that maps requests for that service across the replicas in a random fashion. The mapping could be done so as to avoid any consistent correlation between GET requests and load. Perhaps instead of constantly mapping a client to one server, it could map it in random spurts.
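The "random spurts" idea can be sketched in Python. This is an illustrative model only: the function name, the replica labels, and the maximum burst length are assumptions, and a real deployment would apply the mapping in DNS or a load balancer rather than in a list.

```python
import random


def assign_replicas(num_requests, replicas, max_burst=5, rng=None):
    """Assign each request to a replica, switching after random-length bursts."""
    if rng is None:
        rng = random.Random()
    assignments = []
    while len(assignments) < num_requests:
        replica = rng.choice(replicas)      # pick the replica for this burst
        burst = rng.randint(1, max_burst)   # burst length in requests
        assignments.extend([replica] * burst)
    return assignments[:num_requests]


# Example: 20 requests spread over three hypothetical replicas.
schedule = assign_replicas(20, ["replica-a", "replica-b", "replica-c"],
                           rng=random.Random(0))
```

Because both the replica and the burst length are random, an attacker sending probe requests cannot rely on a steady relationship between their requests and the load on any single instance.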
Questions:
How does Amazon's Elastic Load Balancing affect locating a victim instance?
Posted by: Matt Newcomb | March 8, 2012 03:41 AM
This paper talks about a possible security attack in cloud services through co-existing VM instances. Cloud services generally provide the registered user a VM instance. These VM instances are sometimes shared across users, as in services like SQL Azure. This paper considers the case of specific VM instances per user. An adversary who wishes to attack a certain service can deploy a VM on the same machine as the VM running the service and clog the CPU, steal passkeys, etc. For the most part, this paper deals with techniques that help in deploying the adversary's VM to co-exist with the service's VM.
The paper experiments with EC2 as the cloud service. The authors started with a port scan of the public IPs and, from the responsive IPs, could find the ones using EC2. This could easily be done using network map (nmap), which broadcasts ping requests to all nodes on a subnet. The paper also identifies patterns in the way internal IPs are assigned, and from those the authors can determine the availability zone of a machine. The paper gives an overview of some basic network-level tests that can decide on co-occurrence of VMs on a physical machine. However, methods exist that can easily work around this, like setting up VLANs or setting appropriate iptables rules during the bootstrapping of the instance.
The paper starts off with a brute-force approach wherein a large number of nodes are started at random locations and kept alive for some duration so that co-existence can be detected with some probability. An optimization is suggested on top of this brute-force approach: monitor the state of the service to be attacked and only then start the attack instances, as this reduces the cycles wasted on checking.
The paper also talks about an interesting method for checking the CPU cache usage which is an indirect indicator of the load on the system which might be due to multiple VMs coexisting together.
Overall, I feel that the paper makes a good case for why these attacks are difficult to shield against. It does suggest some mitigation mechanisms, like obfuscating the path to the services and minimizing the information that is leaked. I feel these mitigation mechanisms might hamper performance. For example, when you try to reduce caching in the machines, you in turn reduce the ability to hold things in memory for longer. This will result in writes to disk and hamper performance.
Posted by: Srinivas Govindan | March 8, 2012 03:29 AM
The paper discusses vulnerabilities that can occur when several different users are able to run on the same machine in a cloud environment and identify other users. Despite certain precautions, information can be extracted from other users using the same machine.
Cloud based systems usually provide some type of insulation to a user through layers like virtual machines. The question is how secure this setup is on a cloud server and how hard it would be for an attacker to end up running on the same machine or what types of information the attacker could possibly extract from that situation. The paper examines specifically the Amazon EC2 service, but says that many others are likely quite similar.
The first question presented in the paper deals with how easily an attacker can get an instance onto the same machine as another client. This is more difficult if the target is a specific client of the cloud service rather than any client at all. One option is to run many instances and hope that one of them ends up at the same location, but this would not usually be practical. With the Amazon EC2 service that was examined, a user has some control over which machine an instance goes to, because placement is partitioned by user-provided parameters, especially availability zones, which correspond to certain areas of the world. By running different instances, it is possible to map out the locations of different servers and determine what might be running on them. Two instances can be compared to see if they are running on the same server by checking for close IP addresses or for quick responses in communication between the two. A third method, preferred by the paper, is to identify the IP address of the first hop out of an instance, which should usually be the same for another client's instance if both are indeed on the same machine. The corresponding address for another client's instance can be determined by running traceroute toward that instance. With the design of Amazon EC2 at the time, another option was for an attacker to create a new instance immediately after detecting that someone else's had started up, which greatly increased the chance of ending up on the same machine. Once the attacker is on the same machine, side-channel attacks can be used to measure input timings, which could help identify keystrokes if they are sent to the cloud at the times they are originally typed. Statistics could also be gathered about the other user if information such as traffic can be determined.
One solution proposed is to allow the client of the cloud server to request that only certain other users be allowed to run on the same machines.
Much of the paper's focus was on specifics of Amazon's EC2 distribution of instances which may not be that relevant for the more general problem of identifying locations in other cloud services. The paper seemed to be more heavily weighted towards the idea of locating a specific machine than on what bad effects could occur when this happens.
This paper shows the importance of keeping instances on a cloud system isolated as much as possible from each other. Both issues of keeping a user from knowing who else is running on the same machine and blocking the ability to extract information are general problems that can occur in cloud computing. It is especially important to be aware of the security concerns when using cloud computing based resources.
Posted by: Daniel Crowell | March 8, 2012 02:57 AM
This paper discusses security implications that are unique to cloud computing environments, specifically how shared resources on physical machines could cause a virtual machine (or its data) that happens to be on that physical machine to be compromised. These are rather new and interesting attack vectors that would not appear in a classic single-OS server. One of the main focuses of this paper was on methods to determine a target VM's physical location on a cloud service (EC2 is used as an example) and then how to spawn a VM on the same machine as the target VM to attempt to obtain information from these shared resources. In the end, they were able to successfully determine when they were on the same physical machine as the target VM, and they were able to take advantage of how Amazon places VM instances to reduce the search space required for their brute-force launching approach (since they only needed to launch one type of instance multiple times to get located on the same physical machine instead of having to test all instance types on EC2). The solution they proposed in this paper was to give the user more choice in selecting where their VMs run, so that a user can request that all of their VMs run on the same physical machine with exclusivity (Amazon does have some products now that might reduce the risk of this attack vector, like EC2 Virtual Private Cloud). This paper then goes on to talk about specific threats to target VMs such as side-channel attacks on caches, keystroke timing attacks, estimation of network throughput, and covert channels.
The specific attacks discussed in this paper seem to be more theoretical than an actual issue that would cause serious data loss to a VM on EC2. However, they did not try to optimize some of these attacks, so there could still be some way for these to become viable attack vectors (such as improving bandwidth in covert cache channels). The security issue brought up with determining the location of a target VM and being able to get a VM on the same physical machine as the target VM could prove to be very significant if there is any vulnerability that could be exploited on that machine to perform a VM escape. If a VM escape can be managed on the same machine as the target VM, then all data and control of the VM could be in the attacker's hands (and this could be done in a way such that the target VM is not aware that it's been compromised). It seems that all of the attacks mentioned in this paper could be made completely impractical if all methods for determining the physical location of a VM were removed. However, this is likely not possible in the current cloud data center environment. It seems like it's a very hard problem to try to totally obscure all information about the underlying network/machine from the VM [IP address, routing table information, etc.]. However, without a comparison to another major cloud service environment, I'm not sure if this is just an issue with Amazon's EC2 setup or if this is a problem for most cloud data centers. In reality though, without a major security threat that would cause data loss or VM takeovers using this type of attack, it is unlikely this will be addressed anytime soon.
Posted by: Benjamin Welton | March 8, 2012 02:21 AM
This paper discusses the security issues of third-party clouds. The main idea of this paper is that a VM can be attacked by placing a malicious VM on the same machine as the target VM and then performing side-channel attacks to extract information from the target VM.
While both cloud providers and cloud consumers get lots of benefits from third-party clouds, the security of the service is ignored to some extent. Virtualization technology provides good isolation between virtual machines, which makes cross-VM attacks hard to carry out. Also, internal structures, including the organization of physical networks and machines, and even management details such as the algorithms for placing VMs on physical machines, are hidden from the outside, so clouds are often considered safe. However, are they really as safe as a physical appliance? If not, why? This paper tries to answer these questions.
The contributions of this paper include several aspects. First, it defines a threat model, which is reasonable in a cloud environment. Second, by practically drawing the cloud cartography of EC2 with internal and external probes, the paper shows the possibility of uncovering the internal structure of a cloud service, and even the placement algorithm. Third, it presents two strategies to exploit the placement. The first one is brute-force placement. As its name suggests, this is a time-consuming strategy that repeatedly runs probe instances in the target zone. The second one is abusing placement locality, which is based on the assumption that the attacker can launch instances relatively soon after the launch of a target victim. The attacker uses instance flooding to achieve the goal of placing the malicious VM on the same physical machine as the target. Finally, three ways to do a side-channel attack after being placed on the same physical machine are discussed.
One flaw that I see in this paper is the limited generality of its ideas. The attacks presented in this paper mainly include two steps: placement and side-channel attack. The paper also discusses ways to reduce the vulnerabilities, such as offloading placement choices to users, and blinding techniques to minimize the information that can be leaked. These strategies sound simple. It is also interesting to consider how other cloud service providers handle these issues. If these problems exist only on EC2, the ideas provided by this paper might be less useful. However, the paper doesn’t provide enough information about other platforms.
In sum, I think the ideas provided in this paper are interesting, especially the first part of the paper. The methods proposed to reduce the vulnerabilities sound low-cost for cloud providers, so I think they might be useful and applicable for today’s cloud providers to make their services more secure.
Posted by: Xiaoyang Gao | March 8, 2012 01:43 AM
The paper describes how one can exploit cloud computing facilities like the
Amazon EC2 where multiple applications belonging to different parties run on
virtual machines which share physical resources. The paper describes the
attack in two stages. The first is how to detect and ensure colocation of the
attacker and the victim using networking techniques like DNS and traceroute,
and by making use of communication through covert channels. To ensure that the
VMs of attacker and victim are co-located, they try to understand the VM
allocation through a brute-force mechanism, running experiments and
gathering the allocation data by zones. In the second stage, once co-located,
the attacker can use the side-channel attacks described to extract private
information that the victim is not aware of exposing. The attacker can create synthetic
loads on the physical machine and exploit the victim. The paper also talks
about how to avoid such attacks by encouraging measures like dynamic IP and
employing some level of obfuscation etc.
The paper in my opinion is a good example of how black box techniques can be
used to extract useful information and use the information extracted to mount
an attack on a victim. Achieving about 40% co-location with a conservative
approach is interesting and looks promising when one can fire off VM
instances for a pretty cheap rate. The whole paper is very practical in nature
and has enough numbers to support its claims and to show that what appears
safe to the outside world may not be.
While detecting and ensuring co-location seems convincing, most of the attacks
demonstrated by the authors seem subtle and require the machine to be in some
idle state. From my understanding, they would not work well when there is
enough noise from third-party VMs (other than the attacker and victim)
functioning on the same physical machine. Also, the cloud service provider
can enforce much stronger rules for internal communication between the VMs
registered under different users so that the attacker cannot map the internal
IP to the external ones and detect colocation.
Posted by: Sathya Gunasekar | March 8, 2012 12:08 AM
This paper describes a feasible way to do cross-VM side-channel attacks on services residing on third-party cloud infrastructure. It probes the target, distributes agents to the same machine on which the target resides, and finally creates a covert channel.
Cloud computing is thriving, and third-party cloud computing decreases costs and utilizes computing resources better. However, giving important data to a third party introduces security vulnerabilities. Even assuming we can trust the third party, does it mean our data is safe in the cloud? If not, how can the third party improve its system to provide safer service? These are the questions that the paper tries to answer.
This attack makes full use of the simple placement policy adopted by EC2. There exist geographical locality, sequential locality and parallel locality. Geographical locality indicates that zones have disjoint internal IP address sets, so that we can focus on a certain range of IP addresses and a certain location. Parallel locality says that instances that start close in time have a high probability of residing on the same machine, which I think is the foundation of the attack because it makes co-residence of the adversary instance and the user instance much easier to achieve. These localities show that EC2 is using a relatively simple placement policy, which is easier to administer and less likely to be faulty, but also easier to exploit. To solve this problem, it seems that no matter how complicated a placement policy EC2 adopts, as long as it is a single global policy, it is possible for an adversary to figure it out. Therefore, it is natural to say we should let the users provide the placement policy so that there is no global policy to figure out. In this case, an attacker targeting a certain service cannot acquire useful information from observing other services, which makes the attack much more difficult.
One flaw of this paper is that it claims that these attacks are generally applicable to other cloud infrastructure. The feasibility of the attack relies heavily on the locality of the EC2 placement policy. There is no evidence that other infrastructure shares similar locality or even has locality properties at all.
Security of the cloud is definitely going to be of greater importance. Even though this paper only describes one attack mechanism for one service provider, it still reminds us that security should always be an active topic from the view of the service provider because it is impossible to have a 100% safe system.
Posted by: Xiaozhu Meng | March 7, 2012 11:38 PM
Summary:
This paper discusses some novel risks in the cloud due to the co-existence of VMs (in particular different kinds of side-channel attacks such as DoS, cache usage measurement, load-based co-residence detection, traffic rate estimation and keystroke timing attacks). Generally, the paper addresses four questions. (1) Cloud cartography. How can one determine the location of an instance in the cloud infrastructure? (2) Determining co-residence. How can one easily determine if two instances are co-resident on the same physical machine? (3) Causing co-residence. How can an adversary launch instances that will co-reside with the victim’s instances? (4) Exploiting co-residence. How can the attacker exploit cross-VM information leakage once co-residence is achieved? Further, the paper gives some possible countermeasures for these potential attacks.
Problem:
The cloud is a promising infrastructure for hosting data and deploying software and services. While it brings unprecedented benefits such as cost savings, scalability and flexibility, it also brings up several security challenges like trust and dependence. This paper explores another cloud-specific threat – potential cross-VM attacks due to VM co-residence in the cloud.
Contributions:
+ As far as I know, it reveals for the first time the possibility of leveraging co-residence in the cloud for attacks. The idea of covert channel communication is really smart (e.g. disk-based, cache-based and load-measurement). Experiments are conducted mostly in the real EC2 cloud, which makes it quite convincing.
+ It also proposes some practical countermeasures including dynamic IP allocation, preventing identification of Dom0, allowing users to decide placement and exposing the cost/security tradeoff to users.
Flaws:
I’m still not quite sure how some of the attacks mentioned in this paper would actually happen in the real cloud, given the authors’ assumptions. For instance, for the keystroke timing attack experiment, the authors pinned the VMs to specific cores. But EC2 may sometimes migrate the virtual CPUs between physical cores, as mentioned in the paper. So it seems this will reduce the probability of success of a cache-based attack.
Also, I don’t quite understand how attacks can occur when multiple victims co-exist on the same physical machine. It seems the authors assume there is only one victim on each machine. In the case of multiple victims on the same machine, how can an adversary tell, e.g., which victim causes the load variance it observes?
Applicability:
As far as I know, this paper offers the first look at cloud-specific security problems. And it has led to a bunch of follow-up works -- some try to come up with new solutions to prevent the attacks discussed in this paper, while others want to gain more insight into particular scenarios, such as Microsoft Research's study of side-channel leaks in web applications. This paper mainly introduces the problems with co-residence and talks a bit about possible countermeasures. It would be interesting to see how cloud providers will actually react and how well the countermeasures perform in practice.
Posted by: Yizheng Chen | March 7, 2012 11:37 PM
The paper presents ways of stealing information from other Virtual Machines (VMs) running on the same machine, using cross VM attacks. The authors provide highly practical approaches and results from a case study run on Amazon's Elastic Compute Cloud.
The main problem is to extract confidential information from other VMs. This is a challenging problem which contains yet other challenging subproblems. They propose practical ways to solve each problem by carefully analyzing the system, which I believe constitutes the main contribution of the paper. The first problem is to determine where the target instance is in the cloud. This can be accomplished by analyzing the IP distribution results and merging them with internal network probing and DNS resolution queries. After identifying the location of the target instance machine, the second problem is to put the adversary on the same machine. This problem also contains another challenge, which is to identify whether two instances are on the same machine. They accomplish these by running many instances sequentially and using some heuristics (e.g., dom0 addresses, round-trip times) to determine whether they arrived at the correct machine. After having a VM running on the same machine as the target, the final challenge is to steal the confidential information by exploiting cross-VM / side-channel information leakage.
I particularly like the part where they discuss how to determine co-residence. Once we know where a target instance is and place our instance on the same machine, we can apply many attacks since we share the same physical resources with the target. Therefore, I believe it is the key point. It is also surprising that accomplishing this is not that hard. On the other hand, I highly suspect there are many hacking tricks and a long trial-and-error history behind this approach.
One thing that I can criticize is the claim that the approaches presented in this paper are applicable to other providers such as Azure. Throughout my reading, I had the impression that the approaches are too specific to Amazon EC2. They heavily depend on the results of the analysis of the system, such as the regularity of the EC2 addressing algorithm. If we apply a similar analysis to other systems, it is very probable that we end up having different results, which means we should find unique solutions to those problems. I strongly believe that the authors would be able to reveal security vulnerabilities in other services as well; however, I do not agree that the approaches in this work are general enough to do that.
Another criticism is that the discussion on “what should be done to prevent” is too brief. But I guess security people generally do not care about this part.
The paper is a recent case study of the most popular cloud service, and obviously it is highly applicable to practice. It is inevitable that multiple instances share physical resources in the cloud; therefore, the confidentiality problem is getting more and more important. In addition, there is a tendency to use the cloud not only to store data but also to run applications through VMs. At this point, the paper clearly shows that there are security vulnerabilities in running applications through VMs in the cloud, since one can steal confidential information by locating targets and then performing placement and extraction.
Posted by: Halit Erdogan | March 7, 2012 11:25 PM
David Capel, Victor Bittorf, Igor Canadi, Seth Pollen
Hey, You, Get off of my cloud (2009, not the 1965 version) explores the changing threat models introduced by cloud computing, which is the latest big thing in our field, and provides methods to use these threat models in ways that were not originally considered feasible. This paper focuses on Amazon’s EC2 service, which falls toward the infrastructure-as-a-service end of the spectrum of cloud varieties. After introducing these threat models, it demonstrates two attacks on EC2’s setup: Cloud Cartography and VM isolation attacks like side channels. Specifically, they exploit the cache to reveal keystroke timing and server load.
The primary contribution of this paper is the changing of the threat model for cloud servers. Previously, people seemed to assume that shared resources, when virtualized, could be treated as if they were perfectly shared (eg, perfect virtualization). However, shared resources can still fall victim to side-channel attacks and DoS attacks. Furthermore, it was known that the cloud provider must be trusted since it ultimately has hardware or VMM access, but it was not recognized that third parties can also damage your business or snoop upon your data. Another contribution is the so-called cloud cartography. This is using properties of the cloud’s infrastructure (in this case limited to Amazon’s EC2) to discover the internal layout and even the location of a specific target. Once the attacker is co-resident on the machine with the victim, further attacks can be mounted. This cartography is aimed at EC2, and it is unclear how general it is: on one hand, internal infrastructure details vary by provider and in some cases (eg, the dom0 trick), the paper relies on behavior that is very specific to Xen. On the other hand, some design decisions are practically universal (eg ip address allocation such that routers can maintain reasonably small routing tables). The work done in the paper has interesting parallels to the Grey Box work done by the Arpaci-Dusseaus.
The attacks made in the paper do not seem particularly dangerous (or are simply very difficult to pull off), however, they hit an interesting sweet spot in a danger-to-cost-to-fix ratio. As mentioned in the paper, most vulnerabilities have well-known solutions (such as ip address randomization or dedicated machines) but the cost of deploying them outweighs the benefit of preventing the attacks. Thus, such attacks can continue unabated until the balance changes.
In our views, the direct applicability of this paper is low, as it depends on EC2-specific details and many attacks are impractical at best, but the field of cloud security is a new one and this is a solid foundation for further work which may be more dangerous in a real-world situation. Cloud cartography, if generalized or at least applied to other clouds, can be a particularly dangerous stepping-stone, as it sets up more dangerous escape-from-VM or data theft attacks. It would be interesting to see where the cloud security field has gone since this paper was published.
Other than mild applicability concerns, the only flaw we noticed in this paper was that certain experiments were run on simulations instead of real machines. Although this likely saved a good deal of time, it makes it harder to claim real-world applicability.
Posted by: David Victor Igor Seth | March 7, 2012 11:12 PM
This paper claims to show that the approach commonly taken by cloud computing providers -- that of providing compute-on-demand services by placing virtual machines for customers -- opens up new security vulnerabilities. The authors focus on Amazon's EC2 service. They show that the EC2 cloud's infrastructure can be mapped, that VM's in the EC2 cloud can often be located, and that there are ways of detecting colocation with a target VM and ways of greatly increasing the chance of such colocation such that, if an attacker is willing to spend enough, she can more or less guarantee colocation with a target VM. The paper also explores attacks that might be made possible with such colocation.
The authors start off by detailing how they mapped the internal structure of the EC2 service. They were able to discover that all the internal IP addresses associated with the service were divided among the three zones. They also discovered that there seemed to be a tight correlation between instance type and internal IP address. This means that, by examining the internal IP of a target VM, an attacker can determine the best zone and instance type to use to maximize her chances of coplacement. Using this technique alone, the authors show that one can achieve a success rate of about 8-9%. However, the authors also observed sequential and parallel locality, such that, by launching a group of VMs soon after the target VM was launched, the probability of coplacement rose to nearly 40%.
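To make the mapping step concrete, here is a minimal sketch of looking up a target's internal IP in a cartography table to guess its zone and instance type. The prefixes, zone names and table contents are invented for illustration (a real table would have to be built by probing, as the paper does), and `infer_placement` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical cartography table: internal prefix -> (zone, instance type).
# These prefixes and labels are made up; they are not EC2's real layout.
CARTOGRAPHY = {
    "10.251.0.0/16": ("zone-1", "m1.small"),
    "10.252.0.0/16": ("zone-2", "m1.small"),
    "10.253.0.0/16": ("zone-2", "c1.medium"),
    "10.254.0.0/16": ("zone-3", "m1.large"),
}


def infer_placement(internal_ip):
    """Return the (zone, instance type) guess for an internal IP, or None."""
    address = ipaddress.ip_address(internal_ip)
    for prefix, label in CARTOGRAPHY.items():
        if address in ipaddress.ip_network(prefix):
            return label
    return None


print(infer_placement("10.252.17.4"))
print(infer_placement("192.168.1.1"))
```

The lookup only narrows the search: it tells the attacker which zone and instance type to request, not which physical machine the target is on.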
Next, the authors explored several potential methods of using coplacement to exploit the new attack surface of the target VM. In this paper, the authors only seemed to explore information leakage, primarily using previously established techniques to measure system load on the other VMs. The authors suggest that such techniques could be used to estimate traffic rates, monitor keystrokes, and measure cache usage.
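The idea behind measuring cache usage can be shown with a toy model. The sketch below simulates a small direct-mapped cache and a prime-then-probe measurement: the attacker fills the cache, the victim runs, and the attacker counts how many of its own lines were evicted. This is a simplified simulation under assumed parameters, not the authors' measurement code; a real attacker cannot see hits and misses directly and has to infer them from access timing. The names `ToyCache` and `prime_probe` are hypothetical.

```python
class ToyCache:
    """Direct-mapped cache model: each set holds one (owner, tag) line."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.lines = [None] * num_sets

    def access(self, owner, address):
        """Access an address; return True on a hit, False on a miss."""
        index = address % self.num_sets
        entry = (owner, address // self.num_sets)
        hit = self.lines[index] == entry
        self.lines[index] = entry  # a miss evicts whatever was in the set
        return hit


def prime_probe(cache, victim_addresses):
    """Return how many attacker lines the victim's accesses evicted."""
    # Prime: the attacker fills every cache set with its own data.
    for address in range(cache.num_sets):
        cache.access("attacker", address)
    # The victim runs in between and touches some addresses.
    for address in victim_addresses:
        cache.access("victim", address)
    # Probe: the attacker re-reads its data and counts the misses.
    return sum(
        1 for address in range(cache.num_sets)
        if not cache.access("attacker", address)
    )


idle = prime_probe(ToyCache(8), [])
busy = prime_probe(ToyCache(8), [0, 8, 3])
print(idle, busy)
```

In this model an idle victim evicts nothing, while a busier victim evicts more attacker lines, which is the signal a load estimate would be built on. Noise from other VMs on the same machine, raised as a concern in several reviews here, would appear as extra evictions that the attacker cannot attribute.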
One flaw with the paper is that the authors do not clearly establish the seriousness of the ability to get coplacement with target VMs. Of the side channel techniques covered, none of them seem especially notable. Estimating traffic rates for web servers, for example, could surely be better done using Alexa or some other analytics site. Similarly, it is not clear to me exactly how measuring cache usage could be particularly useful information, especially since there could be VMs other than the probe and the target on the same machine that are causing noise. Keystroke monitoring, of course, would be a very serious concern, but the authors fail to show that it would be a viable technique in the EC2 service.
Another flaw is the authors' continued suggestion of offering customers the option of getting a machine on which they are the only allowed VM for an extra fee. I believe that a service like this defeats the entire purpose of cloud computing. It would surely require implementing VM migration, adding complexity to application development and cost to customers. In addition, there are already services that lease dedicated servers, and it seems to me that if a customer wanted such a thing, a dedicated server would be the more logical solution.
Posted by: James Paton | March 7, 2012 10:51 PM
The paper talks about the new vulnerabilities that are introduced when cloud computing resources from a third-party cloud service provider are utilized. The reason behind this is that cloud providers, as a means to reduce cost and resources, multiplex the same physical machine across VMs that may have different owners. Experiments are conducted on Amazon EC2, and the paper shows that it is possible to place the attacker’s instance on the same physical machine as the target instance with reasonable accuracy; the attacker's instance can then be used to extract information from the target using cross-VM side-channel attacks.
The attack proceeds in two phases. In the first phase, the attacker launches instances that co-reside on the same machine as the target. To do this, the attacker has to know the location of the target instance and should also be able to verify that the malicious instance and the target instance actually co-reside. To this end, experiments are conducted to gain prior knowledge of how internal IP addresses are assigned among instances, and it is observed that there is a direct mapping between the availability zone, instance type and the internal IP. This reduces the number of instances that need to be created to place the malicious instance on the same machine as the target.
It is shown that even a brute force method of creating a number of instances such that an insignificant number of them co-reside with targets gives a coverage of 8.4%. However, it can be argued that the target size is large and that the m1.small instance type was chosen, which gives the advantage of a greater probability of co-residence when compared to larger instances. Nevertheless, it seems like an insignificant amount of coverage for a brute force technique. When this is combined with locality knowledge that is learnt via the internal mapping, the coverage increases to 40%, which is indeed huge! To verify co-residence of two instances, the authors check if they have the same Dom0 IP address, have small round-trip times and have numerically close internal IP addresses, which gives almost zero false positives. Once co-residence is achieved, it can be used to measure cache usage using the Prime + Trigger + Probe measurement, which helps in knowing when the other instances experience load. This can also be used to infer the timing between keystrokes, which can then be used to perform password recovery.
As a solution, the cloud provider may introduce randomness in the internal IP mapping, which would increase the attacker’s time but not prevent the attack completely. Another solution is to use blinding techniques to minimize information leakage. The authors, however, suggest that these do not completely remove the possibility of an attack and that giving the end user the choice to have the full physical resource run only his instances would be the best solution.
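The three network checks mentioned above can be combined into a single test, sketched below. The thresholds (`max_rtt_ms`, `max_ip_distance`), the dictionary layout and the sample addresses are assumptions picked for illustration rather than values taken from the paper, and requiring all three checks to pass is a deliberately conservative choice of this sketch.

```python
import ipaddress


def ip_distance(ip_a, ip_b):
    """Absolute numeric distance between two IPv4 addresses."""
    return abs(int(ipaddress.ip_address(ip_a)) - int(ipaddress.ip_address(ip_b)))


def co_residence_checks(own, target, rtt_ms, max_rtt_ms=1.0, max_ip_distance=8):
    """Run the three network-based checks and report each result.

    `own` and `target` are dicts with "dom0_ip" and "internal_ip" keys.
    Both thresholds are hypothetical defaults, not measured values.
    """
    return {
        "same_dom0": own["dom0_ip"] == target["dom0_ip"],
        "small_rtt": rtt_ms <= max_rtt_ms,
        "close_ip": ip_distance(own["internal_ip"], target["internal_ip"])
        <= max_ip_distance,
    }


def likely_co_resident(own, target, rtt_ms):
    """Conservative verdict: every check must pass."""
    return all(co_residence_checks(own, target, rtt_ms).values())


# Made-up example instances.
probe = {"dom0_ip": "10.254.8.1", "internal_ip": "10.254.8.20"}
neighbor = {"dom0_ip": "10.254.8.1", "internal_ip": "10.254.8.23"}
stranger = {"dom0_ip": "10.254.9.1", "internal_ip": "10.254.9.77"}

print(likely_co_resident(probe, neighbor, rtt_ms=0.3))
print(likely_co_resident(probe, stranger, rtt_ms=1.8))
```

Returning the individual check results alongside the verdict makes it easy to see which signal disagreed when a pair is rejected.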
In conclusion the paper nicely brings out the specific security threats that using a third party cloud service may introduce. In a practical setting, it may be more difficult than described to attack *ANY* specific target instance without some assumptions regarding launch time and instance size. Nevertheless, the paper clearly brings out that the possibility cannot be ruled out.
Posted by: Madhu Ramanathan | March 7, 2012 10:38 PM