In this blog series, I will describe what secure multiparty computation (MPC) is, what it can be used for, how it works (to some extent), and more. My aim is for most of the content to be accessible to a general reader, with some of the material requiring a little, but hopefully not much, technical background. I hope that I succeed in this task.
The setting of MPC is one where a number of distinct, yet connected, computing devices (or parties) wish to carry out a joint computation of some function, while preserving certain security properties in the face of adversarial behaviour. In order to be concrete, consider a group of friends – all working in the same industry – who wish to compute basic statistics on their salaries. They may wish to know the average salary for those with 0-2 years’ experience, 3-5 years’ experience, and greater than 5 years’ experience. This would enable them to understand if they are being underpaid, or in general where they are on the scale (don’t worry, no one thinks that they are overpaid). Until now, in order to carry out such a computation, they would have to provide this information to a trusted third party who would receive their information, compute the statistics, and send back the results. The basic idea of MPC is to enable a set of parties to carry out such a computation without any trusted party (because usually no such trusted party exists). We call such a process an “MPC protocol”.
The concept of MPC considers an adversarial setting, where some of the parties (or an external entity) try to “attack” the protocol. The aim of such an attack may be to learn private information (e.g., the salary of some of the participants) or to cause the result of the computation to be incorrect (e.g., that the average salary is some specific value, so that they can offer someone in the group a job and have them think that their offer is good). As such, two of the most basic security requirements for an MPC protocol are privacy and correctness. The privacy requirement states that nothing should be learned beyond what is absolutely necessary; more exactly, parties should learn the designated output and nothing else. The correctness requirement states that each party should receive its correct output. Therefore, the adversary must not be able to cause the result of the computation to be different from the function that the parties had set out to compute.
Secure multiparty computation can be used to solve a wide variety of problems, enabling the utilisation of data without compromising privacy. To take another example, consider the problem of comparing a person’s DNA against a database of cancer patients’ DNA, with the goal of finding if the person is in a high-risk group for a certain type of cancer. Such a task clearly has important health and societal benefits. However, a person’s DNA is highly sensitive, and should not be revealed to private organisations (see here for good reasons why). This dilemma can be solved by running a secure multiparty computation protocol that reveals only the category of cancer that the person’s DNA is close to (or none). In this example, the privacy requirement ensures that only the category of cancer is revealed, and nothing else about anyone’s DNA (neither the DNA of the person being checked nor the DNA of the patients in the database). Furthermore, the correctness requirement guarantees that a malicious party cannot change the result (e.g., make the person think that they are at risk of a type of cancer, and therefore need screening).
Although the notion is relatively straightforward at a high level, the devil is very much in the details. When considering the security guarantees of MPC, how can we formally state these requirements of privacy and correctness? What type of adversary are we protecting against, and with what adversarial power? For example, do we assume that the adversary can run malicious code? Do we assume that the adversary can control a subset of the parties of any size, or only up to a certain percentage? These questions and many more need to be addressed before we can really understand what MPC is. We will address all of these, and much more, in later posts.
I would like to conclude this first blog post in the series with an example of how it is possible to compute on information without revealing it. Consider a set of people who wish to compute the average of their salaries without revealing their personal salary to anyone at all. If you want a challenge, spend a few minutes thinking about how to do this before continuing. Now to the solution: we can number the parties from first to last, in any arbitrary order. Then, the first party chooses a very large random number, adds their salary to it, and forwards the result to the second party. The second party adds their salary to the value they received and forwards the result to the third, and so on to the end. The last party adds their salary and sends the final result back to the first party. Finally, the first party subtracts the original large random number, and divides the result by the number of parties. This is the average (since it is just the sum of all salaries divided by the number of parties). To see why this preserves privacy, note that the large random number chosen by the first party hides its own salary and all intermediate sums of the salaries (as long as it is large enough). Thus, no individual party sees anything but a large random number. Finally, the first party receives the sum of the salaries, but this is equivalent to the average: given the average, multiplying by the number of parties yields the sum of all salaries, so the sum reveals nothing more than the average does. The result is that no party learned anything about anyone's salary, but they are still able to learn the average. Magic! (Actually, I hate calling it magic; it's just science.)
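The chain protocol above can be sketched in a few lines of Python. One detail is an assumption on my part: rather than picking a "very large" random number and hoping it is large enough, the sketch performs all additions modulo 2**64, which makes the masking exact no matter how large the salaries are.

```python
import secrets

def average_salaries(salaries, modulus=2**64):
    """Chain protocol sketch: party 1 masks the running total with a
    random value, each party adds its own salary, and party 1 finally
    removes the mask and divides. Illustrative only, not a real MPC
    implementation."""
    # Party 1 chooses a uniformly random mask; working modulo 2**64
    # means every intermediate message is itself uniformly random.
    mask = secrets.randbelow(modulus)
    running = (mask + salaries[0]) % modulus
    # Each subsequent party adds its salary and forwards the result.
    for salary in salaries[1:]:
        running = (running + salary) % modulus
    # The last value returns to party 1, which subtracts the mask
    # to recover the sum, then divides by the number of parties.
    total = (running - mask) % modulus
    return total / len(salaries)
```

Note that each party only ever sees a single message: a sum blinded by the first party's random mask, which looks like a uniformly random number to everyone but party 1.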
To complicate things a little, this protocol for computing the average is actually a great example of why carefully defining the adversary matters. If at most one party is corrupted, then indeed that party can learn nothing. However, consider the case where the 3rd and 5th parties are corrupted. These two corrupted parties can collude to learn the 4th party's input by subtracting the value the 3rd party sent to the 4th from the value the 4th party sent to the 5th. In addition, nothing prevents the first party from lying about the final result. As we will see in later blog posts, this technically means that the protocol is secure only against a semi-honest adversary corrupting at most one party. For now, just enjoy the fact that no one learns anyone's salary, but they can still compute the average.
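To see the collusion attack concretely, here is a small Python sketch (again assuming arithmetic modulo 2**64, which is my own choice rather than part of the protocol as described) that records every message sent along the chain and then lets the 3rd and 5th parties combine their views to recover the 4th party's salary:

```python
import secrets

MODULUS = 2**64

def run_chain(salaries):
    """Run the averaging chain and return the message each party
    sends to the next one (messages[i] is sent by party i+1)."""
    mask = secrets.randbelow(MODULUS)
    messages = []
    running = mask
    for salary in salaries:
        running = (running + salary) % MODULUS
        messages.append(running)
    return messages

salaries = [52000, 61000, 58000, 75000, 64000]
msgs = run_chain(salaries)

# Collusion: party 3 knows the value it sent to party 4 (msgs[2]),
# and party 5 knows the value it received from party 4 (msgs[3]).
# Their difference is exactly party 4's salary -- the mask cancels out.
recovered = (msgs[3] - msgs[2]) % MODULUS
assert recovered == salaries[3]  # party 4's salary leaks
```

The mask protects each individual message, but the difference between two messages surrounding a party depends only on that party's input, which is exactly what the two colluding neighbours exploit.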