Alice runs a large corporation. Various employees send in reports containing secret information. These reports need to be sorted and routed to various departments, depending on the subject of the message. Of course, each report could have a set of keywords attached, and the keywords could determine which departments receive the reports. But maybe the keywords are sensitive information, too. Therefore, the keywords need to be encrypted. But if the person who sorts the reports decrypts the keywords, then this person sees the sensitive keywords. It would be good to have the keywords encrypted in a way that an authorized person can search for a certain keyword in a message and determine either that the keyword is present or not, and receive no other information about other keywords. A solution to this problem was found by Boneh, Di Crescenzo, Ostrovsky, and Persiano.
Start with a supersingular elliptic curve and point as in Section 22.1. We need two public hash functions:
maps arbitrary length binary strings to multiples of A little care is needed in defining since no one should be able, given a binary string to find with See Exercise 7.
maps the th roots of unity (in the finite field with elements) to binary strings of fixed length; for example, if the roots of unity are expressed in binary, can be a standard hash function.
Alice sets up the system as follows. Incoming reports (encrypted) will have attachments consisting of encrypted keywords. Each department will have a set of keyword searches that it is allowed to do. If it finds one of its allotted keywords, then it saves the report.
Alice chooses a secret integer and computes
The point is sent via secure channels to each department that is authorized to search for
When Daphne writes a report, she attaches the relevant encrypted keywords to the documents. These encrypted keywords are produced as follows:
Let be a keyword. Daphne chooses a random integer and computes
The encryption of the keyword is the pair
A searcher looks at the encrypted keywords attached to a document and checks
If yes, then the searcher concludes that is a keyword for that document. If no, then is not a keyword for it.
Why does this work? If is the encrypted form of the keyword then
for some Therefore,
Suppose conversely, that is another keyword. Is it possible that corresponds to both and ? Since the same value of could be used in the encryptions of and it is possible that occurs in encryptions of both and However, we’ll show that in this case the number comes from at most one of and Since is collision resistant and and we expect that and (See Exercise 1.) Since is collision resistant, we expect that
Therefore, passes the verification equation for at most one keyword
Each time that Daphne encrypts the keyword, a different should be used. Otherwise, the encryption of the keyword will be the same and this can lead to information being leaked. For example, someone could notice that certain keywords occur frequently and make some guesses as to their meanings.
There are many potential applications of this keyword search scheme. For example, suppose a medical researcher wants to find out how many patients at a certain hospital were treated for disease in the previous year. For privacy reasons, the administration does not want the researcher to obtain any other information about the patients, for example, gender, race, age, and other diseases. The administration could give the researcher Then the researcher could search the encrypted medical records for keyword without obtaining any information other than the presence or absence of