Slashdot contributor Bennett Haselton has written an essay on a subtle privacy issue affecting many websites (including Slashdot!). He says "Suppose your girlfriend called up Match.com and said, 'I think my boyfriend might be cheating on me. His e-mail address is joeblow - at - aol - dot - com. Can you tell me if he's a member?' And Match.com phone support told her, 'Why, yes, he is a member. You'd better have a talk with him.' After you had gotten over the guilt of getting caught -- I mean, the guilt of cheating -- would you not feel like Match.com had violated your privacy by telling a third party that you were a member?" Keep reading to see what he's getting at and to decide if and when it's a problem.
Something like this is actually possible with quite a few well-known sites -- given a person's e-mail address, it is possible to find out if they have an account with Match.com, PayPal, Netflix, eBay, Amazon, and Google (and, by the way, Slashdot [CT: We'd fix it if I thought it mattered]). For some of those sites, it may even be possible to take a long list of e-mail addresses and use an automated process to find out which of those addresses have accounts with those sites (something I didn't want to risk trying myself, but as a general rule, if you can do it once, you can do it many times, at least if you do it slowly enough). This technique does not enable the attacker to extract addresses from a site's membership rolls, which would be a much more serious type of breach -- the attacker would have to already know a list of e-mail addresses, and would only be able to find out which of those addresses have accounts with a given service. And it definitely wouldn't enable an attacker to extract more sensitive information like passwords or personal data. But the ability to get a yes/no answer for whether an e-mail address belongs to a member of a given site is something that the site designer should take into account. I'm not even saying that it should necessarily be considered a security hole in most cases, just that site designers should decide whether or not they want to permit it, rather than leaving it open by accident. Representatives from PayPal and Netflix assured me that they knew about the possibility of this attack and had countermeasures to detect it. In the case of Match.com, on the other hand, I would argue it looks like an oversight. For other sites, whether it's a security hole or not depends on your point of view.
There are three main causes for concern with this issue. The first is simple privacy -- for a site like Match.com, a person may not want other people to be able to find out that they're a member. The second is the possibility of making phishing attacks easier. If a phisher sends spam to a huge number of recipients, hoping to trick them into entering their login details on a counterfeit site, then generally the number of successful attacks would be proportional to the number of recipients who are members of that site (of which a certain percentage will be duped into entering their login info), but the speed at which the phishing site is shut down would be proportional to the total number of recipients (since any recipient would carry the same likelihood of reporting the phishing site to an ISP and helping to get it shut down). So if the phisher could find out which addresses on their list belong to actual members of a given site, and send mail to just those people, they could get more successful attacks in proportion to the number of e-mails sent. This is especially true of "puddle phishing" attacks, where only a small percentage of recipients are likely to be members of the site being phished. The third is that the data could be valuable to spammers wanting to advertise a competing site -- a spammer advertising a dating site, for example, could get more bang for their buck by advertising only to Match.com members. (Maybe even try a hybrid spam-with-just-a-hint-of-phish -- spam that says "Rejected a lot on Match.com?" to make the user think at first that the e-mail really is from Match.com, but then steer them towards a competitor.)
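To make the proportionality argument concrete, here is a back-of-the-envelope sketch in Python. Every number in it (list size, number of members on the list, fraction of members who are duped) is a made-up assumption for illustration, not a measurement, and the function name `phishing_yield` is hypothetical.

```python
def phishing_yield(messages_sent, members_reached, dupe_rate):
    """Return (expected victims, expected victims per message sent).

    messages_sent   -- total e-mails sent (drives how fast the site is reported)
    members_reached -- how many of those recipients really have an account
    dupe_rate       -- assumed fraction of members who fall for the phish
    """
    victims = members_reached * dupe_rate
    return victims, victims / messages_sent


# Hypothetical figures: a list of 1,000,000 addresses, of which 20,000
# belong to members of the targeted site, and 1% of members are duped.
LIST_SIZE = 1_000_000
MEMBERS_ON_LIST = 20_000
DUPE_RATE = 0.01

# Untargeted: mail the whole list.
untargeted = phishing_yield(LIST_SIZE, MEMBERS_ON_LIST, DUPE_RATE)

# Targeted: mail only the addresses already known to be members.
targeted = phishing_yield(MEMBERS_ON_LIST, MEMBERS_ON_LIST, DUPE_RATE)

print("untargeted:", untargeted)
print("targeted:  ", targeted)
```

Under these assumed figures, both runs expect the same number of victims, but the targeted run sends 50 times fewer messages, so there are 50 times fewer recipients in a position to report the counterfeit site.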
With a build-up like this, the attack is disappointingly simple. (In fact, I listed the possible consequences of the attack first, because otherwise the attack itself is too easy to dismiss.) If you haven't already guessed at least one of these methods, the three easy ways to find out if an e-mail address is associated with an account at a given site are:
- Try to create a new account with that e-mail address. See if you get an error message saying the address is already associated with an account.
- Log in under an existing account, and try to switch to another e-mail address. See if you get an error message saying the address is already associated with an account.
- Use the forgot-your-password feature to request a password be sent to a given e-mail address. See if you get an error message saying that address is not associated with an account.
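The common thread in all three methods is that the form answers differently depending on whether the address is on file. The toy model below illustrates that with an in-memory stand-in for a site; `ToySite`, `form_leaks`, and the message strings are hypothetical and do not reflect any real site's code or API. For simplicity the forms do not change any stored state.

```python
class ToySite:
    """In-memory stand-in for a site with the three forms (illustration only)."""

    def __init__(self, member_emails):
        self.members = set(member_emails)

    def signup(self, email):
        # Method 1: new-account form.
        if email in self.members:
            return "That address is already associated with an account."
        return "Check your inbox to confirm your new account."

    def change_email(self, new_email):
        # Method 2: assumes the caller is already logged in under another account.
        if new_email in self.members:
            return "That address is already associated with an account."
        return "Your e-mail address has been updated."

    def forgot_password(self, email):
        # Method 3: password-reminder form.
        if email not in self.members:
            return "That address is not associated with an account."
        return "A password reminder has been sent."


def form_leaks(form, known_member, known_nonmember):
    """A form leaks membership if its visible reply differs between the two cases."""
    return form(known_member) != form(known_nonmember)


site = ToySite({"member@example.com"})
for form in (site.signup, site.change_email, site.forgot_password):
    print(form.__name__, form_leaks(form, "member@example.com", "nobody@example.com"))
```

Each form in this toy model gives away one yes/no answer per submission, which is all the attack needs.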
With most popular sites that I tested, at least one of the above methods fails, but at least one other method succeeds. On Netflix, for example, the forgot-your-password form requires you to enter a last name and a credit card number, so that form can't be used to find out who is a member. On the new member signup page, though, you can enter an e-mail address and be told whether that e-mail address already belongs to a member. With Match.com, on the other hand, I already mentioned the weakness in the password-reset form; but when I tried to sign up for a new account without correctly passing the Turing test (reading numbers off a graphic and entering them in a text field), Match.com wouldn't tell me whether the e-mail address was associated with an existing account. So the signup form could not be used to sift through 100,000 addresses and find which ones were Match.com members, but it could be used to find out if an individual person was a subscriber.
There are at least two simple countermeasures to this type of attack. The first is to require a Turing test when a user creates a new account, requests a password reset, or changes their e-mail address on file, and make sure that if the Turing test isn't completed correctly, then no error message is displayed about whether a given e-mail address does or does not exist in the system. This makes it hard for attackers to sift through a mountain of e-mail addresses finding out which ones already belong to accounts, but it still enables someone to check if someone is a member, one person at a time. For sites where that would be a privacy concern (again I'm thinking of Match.com), the other solution is better: send any error message to the e-mail address entered, rather than displaying it to the user in their browser. If you try to sign up as firstname.lastname@example.org, and that address is already associated with an account, then the site should display the normal message telling the user to check their inbox for confirmation -- but send the owner of that address a message saying their address is already in the system. eBay, for example, gets this right on their "forgot your userid" page -- if you enter an e-mail address not associated with an eBay account, it simply says, "eBay just sent your User ID to email@example.com. Check your email to get your User ID." (On the other hand, eBay's new user signup page lets you check if an e-mail address is assigned to an existing member, without needing to pass a Turing test.)
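The second countermeasure can be sketched as follows: the browser always sees the same reply, and whatever differs goes only to the inbox of the address entered. This is a minimal sketch, not how eBay or any other site actually implements it; `QuietSite`, its method names, the message wording, and the `outbox` list (a stand-in for real mail delivery) are all assumptions made for illustration.

```python
class QuietSite:
    """Toy site whose forms show the same reply whether or not the address is on file."""

    def __init__(self, member_emails):
        self.members = set(member_emails)
        self.outbox = []  # (recipient, body) pairs; stands in for sending real e-mail

    def signup(self, email):
        if email in self.members:
            # The "error" is delivered privately to the address owner.
            body = "Someone tried to sign up with your address, which already has an account."
        else:
            body = "Follow the link in this message to confirm your new account."
        self.outbox.append((email, body))
        # The on-screen reply never depends on membership.
        return "Check your inbox for a confirmation message."

    def forgot_user_id(self, email):
        if email in self.members:
            self.outbox.append((email, "Here is the User ID you requested."))
        # Same on-screen reply even when no account exists for the address.
        return f"We just sent your User ID to {email}. Check your email."


site = QuietSite({"member@example.com"})
print(site.signup("member@example.com"))
print(site.signup("nobody@example.com"))
```

Someone probing the form sees identical text in both cases; only a person who can actually read mail at the address learns whether it already has an account.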
Netflix, eBay and PayPal also responded to say that they had monitors in place to detect "suspicious" activity, saying that even in cases where the forms did not require a Turing test, they could dynamically detect if someone were using a script to submit the form over and over to harvest data, but they declined to go into more detail. It seems to me this could work for forms that require you to be logged in, but not for forms that don't. For example, on the Netflix new user page, how would they detect if it's the same person submitting e-mail addresses over and over again? Not by IP address -- you can use Tor and farms of open proxies scattered across the Internet to make it appear as if you're coming from lots of different IP addresses. However, consider the PayPal add-a-new-email-address form. This form does not require a Turing test, and does give you an error message if you try to add an address associated with another account. At first I thought this might be a loophole that an attacker could use to find all the PayPal users in a long list of addresses, but PayPal told me that if you do this enough times under the same account, eventually you will hit a limit where the form starts requiring a Turing test. I never made enough attempts to hit that limit. However, in this case the "dynamic detection" could actually work -- because you can only perform this action while logged in, and after you hit the limit, to continue testing more addresses would require another PayPal account -- and creating additional throwaway PayPal accounts does require a Turing test for each one. So I'll take their word for it that that attack is blocked, although it seems to me it would be easier just to require a Turing test on the add-a-new-address page.
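The per-account limit that PayPal described might look roughly like the sketch below. PayPal did not share details, so this is only a guess at the general shape: the class name `AddEmailForm`, the threshold of five free attempts, and the message strings are all hypothetical.

```python
class AddEmailForm:
    """Toy add-an-address form that demands a Turing test after a per-account limit."""

    def __init__(self, member_emails, free_attempts=5):
        self.members = set(member_emails)
        self.free_attempts = free_attempts  # hypothetical threshold
        self.attempts = {}                  # account id -> submissions so far

    def submit(self, account_id, new_email, passed_turing_test=False):
        used = self.attempts.get(account_id, 0)
        self.attempts[account_id] = used + 1

        # Past the limit, reveal nothing about the address until the
        # Turing test has been passed for this submission.
        if used >= self.free_attempts and not passed_turing_test:
            return "Please complete the Turing test to continue."

        if new_email in self.members:
            return "That address is associated with another account."
        return "Address added."


form = AddEmailForm({"member@example.com"}, free_attempts=5)
for i in range(7):
    print(i + 1, form.submit("attacker-account", "member@example.com"))
```

Because the counter is keyed on the logged-in account rather than on an IP address, switching proxies does not reset it; the only way around it is a fresh account, which in the PayPal case costs a Turing test of its own.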
On the other hand, perhaps in the case of a site like Netflix, it's not something that users really need to worry about, if the company has no problem with it. Big deal, an attacker can find out whether you're a Netflix user -- but that's not a huge privacy violation; it's not like I shamefully hide those red envelopes under my shirt while I'm scurrying back from the mailbox. Now, a spammer can take a list of addresses and run them through the form to find out who is a Netflix customer, and then spam those users trying to lure them to a competing service -- but that's Netflix's problem, not ours, isn't it? (Well, it's our problem that we get the spam. But without using this attack, the alternative was that the spammer was just going to spam everybody on their list anyway, so by that argument, this attack actually results in less spam all around!)
Except... perhaps an attacker could try the phishing attack described earlier to get people's Netflix usernames and passwords, not in order to compromise their Netflix accounts, but to see if the person has an account with the same password at eBay or PayPal. Perhaps a user would be wary of a PayPal phish since they see so many of them, but they might fall for a Netflix one -- although then the attacker's success would be limited to people who had Netflix and PayPal accounts, and were using the same password for them both...
So it seems to me it's not obvious when this should be considered a problem. (All of the sites mentioned in this article were e-mailed about this issue months ago, and so far none of them considered it a serious enough threat to block all three of the avenues of attack listed above.) If abuse of this type becomes common, perhaps eventually these "queryable membership lists" will come to be considered in the same way as open mail relays -- which were never considered a glaring security hole, but were abused in ways that triggered a shift in people's thinking that got them to be gradually phased out, going from open relays being the default standard up to the early '90s, to the point where many ISPs today prohibit customers from running them. Maybe "queryable membership lists" will start to be abused more, if anti-spam technologies get smart enough that spammers can't send 1 million messages at a time any more and have to limit themselves to, say, 100,000 messages at a time to get through people's filters, so they have to pick which 100,000 of their addresses they could get the most value out of. Or maybe things will go in a completely different direction and this will never become a problem. I just think that, for now, we should be aware that some form of this trick works on the majority of sites that require an account, and the types of abuses described are at least possible.