
I think it's a bit conspiracy-theorist to say that companies do this because they want to use everything they can get. The relatively easy privacy-preserving alternative (hash the address book contents, store the hashes, and check against them when people join) is just not as obvious as uploading whatever you get from the API.
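
For concreteness, here is a minimal sketch of what that hash-and-match approach might look like. The `ContactIndex` class, its method names, and the choice of unsalted SHA-256 over normalized email addresses are all assumptions made for illustration, not anything a particular app is known to do.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Lowercase and trim so the same address always hashes the same way."""
    return email.strip().lower()


def hash_contact(email: str) -> str:
    """Unsalted SHA-256 of the normalized address (hypothetical scheme)."""
    return hashlib.sha256(normalize_email(email).encode("utf-8")).hexdigest()


class ContactIndex:
    """Hypothetical server-side store: contact hash -> ids of users who uploaded it."""

    def __init__(self) -> None:
        self._uploaders: dict[str, set[str]] = {}

    def upload_address_book(self, user_id: str, emails: list[str]) -> None:
        # Only hashes are stored; the raw addresses never reach the index.
        for email in emails:
            self._uploaders.setdefault(hash_contact(email), set()).add(user_id)

    def who_knows(self, new_user_email: str) -> set[str]:
        """When someone joins, find existing users who had them as a contact."""
        return self._uploaders.get(hash_contact(new_user_email), set())


index = ContactIndex()
index.upload_address_book("alice", ["Bob@example.com", "carol@example.com"])
print(index.who_knows("bob@example.com"))  # {'alice'}
```

The server can still tell Alice that Bob has joined, but it only ever holds the hashes.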

Most app developers are just trying to get a job done as quickly as they can, and in that hustle they choose the path of least resistance, rather than thinking, "I really want to exploit this data as much as possible and invade as much privacy as possible."



Totally agree. I'm actually surprised at how many people assume this was done with malicious intent.

There are still plenty of sites storing plaintext passwords. I doubt there's a data mining conspiracy there (although I bet you could make some interesting guesses about people based on their password choice). It's just a poor design that accomplishes its task in the simplest way possible.


> I'm actually surprised at how many people assume this was done with malicious intent.

I don't care whether or not it was done with malicious intent. What bothers me is that copies of my address book are floating around out there without my permission.


Would hashing contact information placate those who are outraged at this practice? It would still enable the app to associate you with other users of the app without your explicit permission.

It seems to me that the biggest complaints are that Apple doesn't pop up a permission dialog before allowing an app to access your address book, and that Path's privacy policy seemed to omit that they were using your address book.


Keep in mind, too, that the data involved is all small enough that rainbow tables could easily be used to reverse the hashes. My cell phone number's unsalted MD5 hash is trivially reversible via a Google search - and if you salted it, you couldn't then compare it to the hash out of someone else's address book.
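
A rough sketch of both halves of that point, under some assumptions: the phone number is made up (from the 555-01xx range reserved for fiction), the attacker is assumed to already know the first six digits so the demo finishes instantly, and `brute_force_phone` is a hypothetical helper rather than any real tool. Searching the full 10-digit space is the same loop with more iterations.

```python
import hashlib
from typing import Optional


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def brute_force_phone(target_hash: str, prefix: str, digits: int) -> Optional[str]:
    """Try every number with the given prefix until one hashes to target_hash."""
    for n in range(10 ** digits):
        candidate = f"{prefix}{n:0{digits}d}"  # zero-padded to `digits` places
        if md5_hex(candidate.encode("ascii")) == target_hash:
            return candidate
    return None


# An unsalted hash falls to plain enumeration of the keyspace.
leaked = md5_hex(b"2025550123")
print(brute_force_phone(leaked, prefix="202555", digits=4))  # 2025550123

# Per-user salts stop precomputed tables, but they also stop matching:
# the same number hashed under two users' salts no longer compares equal.
salt_alice, salt_bob = b"salt-for-alice", b"salt-for-bob"
print(md5_hex(salt_alice + b"2025550123") == md5_hex(salt_bob + b"2025550123"))  # False
```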

I've seen rainbow tables claiming 100% coverage of all <14 lowercase characters. I'd bet reasonable money that there's a rainbow table specifically generated for email-address-like strings and another for name-like strings. I'm pretty sure both names and email addresses have a lot less entropy than random lowercase letters of the same lengths.
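
For a sense of scale, a back-of-the-envelope comparison. It assumes "<14 lowercase characters" means every string of length 1 to 13 over a-z, and it counts all 10^10 ten-digit phone numbers, which overestimates the real space since many numbers are never assigned.

```python
import math

# Every lowercase string of length 1..13 ("<14 lowercase characters").
lowercase_keyspace = sum(26 ** k for k in range(1, 14))

# Every 10-digit phone number, ignoring numbering-plan rules (an upper bound).
phone_keyspace = 10 ** 10

print(f"lowercase, <14 chars: {math.log2(lowercase_keyspace):.1f} bits")
print(f"10-digit phones:      {math.log2(phone_keyspace):.1f} bits")
print(f"ratio:                {lowercase_keyspace / phone_keyspace:.2e}")
```

That works out to roughly 61 bits against roughly 33 bits, so if tables covering the first space really exist, the phone number space is tiny by comparison.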

Using hashes to obfuscate while still maintaining comparison ability of low entropy data really doesn't help security much…


Particularly because 85% of "average" people in North America keep their email in one of 5 mail domains (hotmail.com, aol.com, yahoo.com, gmail.com, facebook.com) - that, plus the low entropy of names, means the rainbow tables would probably have a 95% hit rate at relatively small sizes.



