
Ask Slashdot: Best Practices For Collecting and Storing User Information?

Unknown Lamer posted more than 2 years ago | from the design-by-committee dept.

Privacy 120

New submitter isaaccs writes "I'm a mobile developer at a startup. My experience is in building user-facing applications, but in this case, a component of an app I'm building involves observing and collecting certain pieces of user information and then storing them in a web service. This is for purposes of analysis and ultimately functionality, not persistence. This would include some obvious items like names and e-mail addresses, and some less obvious items involving user behavior. We aim to be completely transparent and honest about what it is we're collecting by way of our privacy disclosure. I'm an experienced developer, and I'm aware of a handful of considerations (e.g., the need to hash personal identifiers stored remotely), but I've seen quite a few startups caught with their pants down on security/privacy of what they've collected — and I'd like to avoid it to the degree reasonably possible given we can't afford to hire an expert on the topic. I'm seeking input from the community on best-practices for data collection and the remote storage of personal (not social security numbers, but names and birthdays) information. How would you like information collected about you to be stored? If you could write your own privacy policy, what would it contain? To be clear, I'm not requesting stack or infrastructural recommendations."


Just don't do it (5, Insightful)

sublayer (2465650) | more than 2 years ago | (#41296233)

Best practice from my perspective: do not collect the data at all.

Re:Just don't do it (5, Insightful)

puterguy (642044) | more than 2 years ago | (#41296277)

If you really feel the need to collect personal data and you *truly* care about the privacy concerns and needs of your customers, then don't go burying such disclosures in a privacy statement that the average user is unlikely to ever see let alone read.

If you truly care about privacy, then either require the user to *opt-in* to such sharing or prominently display the lack of such privacy on the initial splash screen.

Burying the collection of personal data in the middle of some lawyerly gobbledygook privacy statement is like mortgage lenders burying key terms in the middle of hundreds of pages of documentation. Yeah, it's legally there, but no one is actually going to read or understand it.

Re:Just don't do it (0)

Anonymous Coward | more than 2 years ago | (#41296439)

That's good stuff. I think there's a lot to be said for being honest and normal with people. You don't have to be obnoxious about it, just tell me what you're doing, up front, in plain English. It'll earn you a little trust and appreciation from your users.

Re:Just don't do it (2, Insightful)

philip.paradis (2580427) | more than 2 years ago | (#41296783)

Alternately, people could simply take responsibility for themselves and choose to avoid services which require agreement to miles of terms. Given your attitude on the topic, you probably haven't even bothered to read the terms of service for anything you're using right now. It seems you're trying to divert responsibility for yourself onto the backs of the service organizations you choose to deal with. Again, note the word "choose."

You've also managed to miss the opportunity to discuss where data goes and how it's protected after it's submitted. Oddly enough, this is the essential question posed by the submitter in the first place, and regardless of what any given set of terms says, it is actually the most important piece that very few people think about at all. In other words, you can trust an organization to high heaven based on what they say they will or won't do with your data, but if their infrastructure is a gaping mess of channels by which your information could get compromised, all of a sudden those terms don't mean much. I applaud the submitter for asking the right questions, and remind you to think more about your responses in terms of real-world data acquisition and retention mechanisms before posting again.

Re:Just don't do it (5, Funny)

davester666 (731373) | more than 2 years ago | (#41296845)

Yes, just store the data in plaintext, in a mysql database connected directly to the internet.

Bonus points if you create mysql users for each unique user and use their username/password to authenticate connections to the database.

Re:Just don't do it (1)

rwise2112 (648849) | more than 2 years ago | (#41298205)

I was going to say "email it to Anonymous - they'll back it up for you too", but your method will be just as effective!

Re:Just don't do it (3, Interesting)

CodeBuster (516420) | more than 2 years ago | (#41296949)

Whenever I'm signing up for a new site or using a service for the first time, I always do a recon of their sign-up procedures using a fake name / email address so I can see what sort of information they "require" before I even get started, and even then I only give up what I absolutely have to. If I can get away with using the fake information permanently, then I do that. I keep track of all my fake identities in an encrypted file container by site name so that I can be consistent with my aliases. This strategy works well for me and I'm sure that I can't be the only person out there who does this. As Robert De Niro's character, Jack Byrnes, said in Meet the Fockers (paraphrased), "If you're outside the circle of trust, you're on a need to know basis and right now you don't need to know."

Re:Just don't do it (1, Insightful)

hutsell (1228828) | more than 2 years ago | (#41296507)

Best practice from my perspective: do not collect the data at all.

Exactly: "Put the Database down now, and step away from the Internet."

Sorry, but my interest in giving the benefit of the doubt to the question's possible sincerity was lost when reading the part about the unoriginal solution for ensuring honesty and transparency -- the solution being hidden in (the lawyer make-work terms of) "our privacy disclosure".

Re:Just don't do it (2)

isaaccs (1854142) | more than 2 years ago | (#41298737)

There were little to no details given as to how the privacy disclosure would be phrased or provided to users. As it were, your assumption is wrong. There is no desire to squirrel away anything in legalese. Indeed, the question asks: "If you could write your own privacy policy, what would it contain?". You describe the "hidden" (which you've assumed) solution as unoriginal, but provide no alternative suggestions (which was the point of submitting the question to the community in the first place).

Re:Just don't do it (-1)

Dupple (1016592) | more than 2 years ago | (#41296575)

Ask Slashdot? Ask Google!

Re:Just don't do it (2)

fm6 (162816) | more than 2 years ago | (#41296613)

So, Slashdot made a mistake in allowing you to create an account?

Re:Just don't do it (2)

c0lo (1497653) | more than 2 years ago | (#41296729)

Best practice from my perspective: do not collect the data at all.

More detailed:
Rule 1. don't do it
Rule 2. if for some reasons, rule 1 cannot be followed, collect them but discard them immediately
Rule 3. if for some reasons, the prev 2 rules cannot be obeyed, after collection put them on a WORN storage (that is: "Write Only, Read Never" media)

Re:Just don't do it (0)

Anonymous Coward | more than 2 years ago | (#41298027)

Use the NULL storage engine.

Meaning: (0)

Anonymous Coward | more than 2 years ago | (#41296953)

Be very careful indeed what you really need, and collect only that. The less data you collect the less you have to worry about.

Note that the easy cop-out is to stick someone else with the trouble, like "supporting" facebook logins or something, but that's actually worse. The why is left as an exercise, but rest assured that if you do that I'm certainly never going to sign up with you, just like I won't be signing up with facebook, or google accounts, or any of the others you might be "supporting".

Re:Just don't do it (1)

Anonymous Coward | more than 2 years ago | (#41298089)

I hate to IANAL here but here goes:
In your country of origin there is legislation you must prove compliance with should the relevant government body find out that you are collecting user information. Personally identifiable information (name, address, phone number, e-mail) is considered sensitive, and sensitive personal records (social insurance numbers, health records, employment history, criminal records, etc., ad nauseam) come with hefty legislation such as HIPAA for each type of data. Again, the parent is correct: by not collecting, saving, or distributing anything, you save yourself the headache of having to tell your government what you are collecting and how long you have to store it, never mind that you need the storage and an audit trail for whatever you do collect (e.g., SOX). Europe and the Commonwealth countries take this VERY seriously; their governments have armies of lawyers whose sole joy in life is to SUE some poor non-compliant (potentially unscrupulous) company into oblivion. Canada's Privacy Commission sued Google over its data collection practices more than once, the EU has filed numerous complaints, and these things are not cheap. AGAIN, IANAL and YMMV, but you should hire one to CYA.

Re:Just don't do it (1)

Eadwacer (722852) | more than 2 years ago | (#41299049)

I think it was Robert X. Cringely who compared personal user data to toxic waste. You don't ever want to produce it. If you do produce it, it's your responsibility forever because you don't know where an undiscovered drum of it is hiding. If it touches something, that something becomes toxic also. Finally, the legal implications of it getting out into public are capable of destroying your company.

risk vs. investment tradeoffs (4, Informative)

noh8rz10 (2716597) | more than 2 years ago | (#41296235)

I think your mind is on the right track in identifying your resource limits (i.e. no tip-of-the-spear experts) and the sensitivity of the data (i.e., it's not all nuclear bomb codes). That is the first step. Next, think on the exact types of data that you're collecting, and try to group like data together, for example, all text data, screen caps, keylogging, audio or webcam video if you have it, and find a way to store them in an efficient structure while everything stays linked together. Finally, if possible, associate all data collection events with time (timestamp) and location (gps). this will allow a more complete analysis on the back end.

Re:risk vs. investment tradeoffs (3, Insightful)

SomePgmr (2021234) | more than 2 years ago | (#41296453)

Finally, if possible, associate all data collection events with time (timestamp) and location (gps).

It started getting a little creepy there at the end, bud. ;)

Re:risk vs. investment tradeoffs (0)

Anonymous Coward | more than 2 years ago | (#41297511)

What, you missed the screen caps and keylogging part?

Re:risk vs. investment tradeoffs (1)

asylumx (881307) | more than 2 years ago | (#41298757)

If you think that's bad, keep reading:

this will allow a more complete analysis on the back end.

He wants to analyze users "back ends"!!!

Re:risk vs. investment tradeoffs (0)

Anonymous Coward | more than 2 years ago | (#41296623)

associate all data collection events with time (timestamp) and location (gps)

Why location? Why is everybody obsessed with location data these days?

Re:risk vs. investment tradeoffs (-1)

Anonymous Coward | more than 2 years ago | (#41296667)

Who controls the past controls the future. Who controls the present controls the past...

Re:risk vs. investment tradeoffs (0)

Anonymous Coward | more than 2 years ago | (#41298195)

"Microsoft: Where are you going today? (never mind, don't tell us...)"

Re:risk vs. investment tradeoffs (0)

Anonymous Coward | more than 2 years ago | (#41298313)

If they know where you are:
Lead them to a restaurant down the street.
Steer them past a store front and entice them.
Figure out their normal routine and put stuff up in front of them.

Re:risk vs. investment tradeoffs (0)

Anonymous Coward | more than 2 years ago | (#41296787)

While you're at it, add some remote desktop feature. You never know when you may need to assist your users in using your app properly. You'll have to get them to install some sort of app exploiting an OS vulnerability for best effect though.

Re:risk vs. investment tradeoffs (1)

noh8rz10 (2716597) | more than 2 years ago | (#41298407)

Good point. White hat root kits!

user info (-1)

Anonymous Coward | more than 2 years ago | (#41296239)

You are just another effin asshole collecting user data. you should be assfucked!

Give all the information to me (1)

For a Free Internet (1594621) | more than 2 years ago | (#41296249)

I promise that I will only use it to fantasize about my honeymoon in Italy with Laura. Woooowoowowowoooooooo. Did you know that Italy is shaped like a boot? More proof that Italy is awesome, whereas the U.S., a crappy little country full of morons and goatfuckers is shaped, appropriately, like a big splatted turd.

It's a Trap! (-1)

Anonymous Coward | more than 2 years ago | (#41296259)

Seriously.

Don't store the data. (0)

micheas (231635) | more than 2 years ago | (#41296287)

Just don't.

When you get the expertise to store the data securely then consider it.

Once you get into the habit of justifying everything that you store, you will be less prone to the "whoops!" moment of a plain-text password/username/real-name/credit-card table being found by intruders.

Re:Don't store the data. (1)

isaaccs (1854142) | more than 2 years ago | (#41298647)

To start, I do appreciate the spirit of the comment - as a professional in a field, it's an argument I make often. But I don't totally agree in this context. It would prove extremely difficult, for example, to build a search engine such as Google without collecting or correlating user information. To build Instagram without collecting pictures (which I'd very much consider private user data/personal identifiers) might also prove vexing. The question wasn't "Should I collect user information?" but "How can I do something that I must do - popular opinion of the widespread practice notwithstanding - responsibly?". You suggest that it should only be done when done by an expert: I am admittedly not an expert in securing data. I am an expert in software development, and this is now an area I need to begin to explore. To simply suggest that an ambitious tech startup "shouldn't" innovate in a space because they don't have the material resources to hire an established specialist on one of the myriad topics that go into building a software product is, to me, quite close-minded and defies the spirit of do-it-yourselfedness and indeed innovation that makes the startup space and tech sector so exciting to begin with.

Let me have a login? (1)

aliquis (678370) | more than 2 years ago | (#41296289)

Let me have a login for the benefit of having my data saved?

If I don't log in then don't store my details.

As for the rest whatever. Hash + salt or whatever?

If no-one can reach / use the data for anything then maybe say just e-mail address or something such as identifier.
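For readers unfamiliar with the "hash + salt" shorthand above, here is a minimal sketch using only Python's standard library. The function names and the iteration count are assumptions chosen for illustration, not anything specified in the thread, and a production system would likely use a vetted password-hashing library instead.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the password itself is never stored."""
    salt = secrets.token_bytes(16)  # fresh random salt for every user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

Because each user gets a different salt, two users with the same password end up with different stored digests, which makes precomputed lookup tables much less useful to an intruder.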

I'm an experienced developer (0, Flamebait)

Osgeld (1900440) | more than 2 years ago | (#41296291)

I am not an experienced developer, but if I were I sure as shit would not be asking about it on slashdot, to be honest as I stand here now, if I needed some serious advice about any situation, I would not be asking on slashdot, I would be asking on a forum where people live and breathe the topic at hand like their lives depend on it, cause they are professionals, and not the peanut gallery of random trolls, tards, fanbois and neckbeards.

Re:I'm an experienced developer (-1)

Anonymous Coward | more than 2 years ago | (#41296341)

ck fu tam bien

Re:I'm an experienced developer (1)

ThatsMyNick (2004126) | more than 2 years ago | (#41296373)

Well, if you are looking for developer/legal opinions there are better forums, but if you want legal, developer and user opinion (and a discussion based on them), Slashdot is not bad. Besides, you don't really know that OP has not also posted in a better developer/legal-oriented forum (and I find it strange that you mention that you wouldn't post on Slashdot, but fail to mention the forum that is appropriate for this question (unless you yourself were just trolling)).

Re:I'm an experienced developer (0)

Osgeld (1900440) | more than 2 years ago | (#41296425)

I don't know the appropriate forum, as I am not an experienced web developer, nor would I expect any serious answer from Slashdot when I do need it. I develop electronics; I don't post which FET has the best ESD damage resistance on Slashdot, nor would I expect anything but random opinion from it.

When you're serious, you get the data from people who have been down that road, and test it yourself, not post to some news recycler and hope for the best.

Re:I'm an experienced developer (3, Insightful)

SomePgmr (2021234) | more than 2 years ago | (#41296491)

I'd give him the benefit of the doubt, and assume this isn't the only place he's looking for best practices.

Meanwhile, "I'm an experienced developer, I'm familiar with all the general rules for securing customer data, but I'd like to hear of any 'gotchas' that you know about"? That seems like a reasonable thing to ask.

Again, assuming this isn't the one-and-only source. So instead of grabbing our pitchforks, maybe someone has some examples of what he asked about?

Re:I'm an experienced developer (1)

Anonymous Coward | more than 2 years ago | (#41296705)

There's the blatantly obvious stuff: keep the data heavily encrypted on a back-end d/b or file store, on a server nowhere near a public-facing interface (or DMZ); obfuscate and/or consolidate the individual, personal data as soon as you gather it, assuming you don't need specific per user info to be retained. Needless to say, keep all your OS/software/services/apps/etc patched with latest security on a weekly, if not daily basis, FFS!

Also, invite some wannabe hack-meisters you can kind-of-trust to try & break into your environment, just to see if they can... ;)

Re:I'm an experienced developer (1)

Electricity Likes Me (1098643) | more than 2 years ago | (#41297181)

Isn't your first bit of advice right there a classic gotcha?

Encryption doesn't mean anything unless the access routes to that encrypted data are well defined and understood - since at some point it has to be unencrypted to be used. So who's doing the unencrypting, who holds the keys etc.

Re:I'm an experienced developer (1)

isaaccs (1854142) | more than 2 years ago | (#41298519)

In this forum, I submitted to seek the opinions of a community of technically minded individuals on a question that hinges on broader social concern. I did/do not expect a uniform or comprehensive answer. I expected to hear the voices of different people who have thought about, dealt with, or otherwise concern themselves with data collection. I am much aware that this is not a legal or technical venue - and I appreciate your acknowledgement that this may not be the only avenue I've pursued to inform myself.

Re:I'm an experienced developer (2, Interesting)

Anonymous Coward | more than 2 years ago | (#41296375)

Agreed. People mistake this for a technical forum.

Re:I'm an experienced developer (1)

isaaccs (1854142) | more than 2 years ago | (#41298183)

The question specifically says "I'm not seeking stack or infrastructural recommendations." This is not a technical question. The question is posed to the community as it bears on *social* issues.

Re:I'm an experienced developer (1)

noh8rz10 (2716597) | more than 2 years ago | (#41296449)

thank you. I'm updating my sig with your quote.

Re:I'm an experienced developer (0)

Anonymous Coward | more than 2 years ago | (#41296675)

Welcome to the club!

Re:I'm an experienced developer (0)

Anonymous Coward | more than 2 years ago | (#41296767)

Here's the rub - professionals want to get paid.

Re:I'm an experienced developer (0)

Anonymous Coward | more than 2 years ago | (#41296821)

.... not the peanut gallery of random trolls, tards, fanbois and neckbeards.

So which category are you in? /. has good comments but you have to wade through the dross.

Re:I'm an experienced developer (1)

Anonymous Coward | more than 2 years ago | (#41297529)

The problem is not that /. has both good and bad; it is that if you don't know the answer, how the hell do you think they will know enough to sort the good from the bad? Reading through, he has already gotten a mix of both for this topic.

Re:I'm an experienced developer (1)

bloodhawk (813939) | more than 2 years ago | (#41297513)

Not sure why you got marked flamebait. Even as a developer I find your comments spot on. If you are not experienced enough to know the answer to this topic then /. is not the place to be asking, as you won't have the knowledge to sort the garbage from the good advice. Incidentally, I would love to know the name of this new site, as I think it is one I would avoid for my own safety.

Don't (4, Informative)

SmartyPants (27576) | more than 2 years ago | (#41296317)

honestly... try not to store it.

You need to examine why you actually need the data, and if you can't think of a good reason (except it might be valuable in the future), then don't store it.
If you do need it for analysis, machine learning apps, etc, try to anonymize it as early as possible, and not to keep raw data longer than you need it. (say raw data for 3 months, then just store aggregate info).

also.. for behavior.. you don't need years of information, studies have shown people change, so make sure the things people do recently are more important, and the old stuff gradually decays.
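A rough sketch of the two ideas above (keep raw events for about three months and then only aggregates, and let older behaviour count for less). The 90-day window follows the "3 months" in the comment; the 30-day half-life and all names are assumptions made up for the example.

```python
from collections import Counter
from datetime import datetime, timedelta

RAW_RETENTION = timedelta(days=90)  # "raw data for 3 months"
HALF_LIFE_DAYS = 30.0               # assumed: an event's weight halves every 30 days


def roll_up(events, now):
    """Split events into (recent raw events, aggregate counts of expired ones).

    Each event is a (timestamp, action) pair. Expired events keep only
    their action name in the aggregate; their timestamps are dropped.
    """
    recent, aggregate = [], Counter()
    for ts, action in events:
        if now - ts <= RAW_RETENTION:
            recent.append((ts, action))
        else:
            aggregate[action] += 1
    return recent, aggregate


def decayed_weight(ts, now):
    """Exponential decay: 1.0 for an event happening now, 0.5 one half-life ago."""
    age_days = (now - ts).total_seconds() / 86400.0
    return 0.5 ** (age_days / HALF_LIFE_DAYS)
```

In a real system the roll-up would run as a scheduled job that also deletes the expired raw rows; the sketch only shows the split itself.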

Re:Don't (1)

mcrbids (148650) | more than 2 years ago | (#41296789)

As a counterpoint, Don't process all that data.

At my company, we store everything. Every click, every bit of data, nightly snapshots of all data, etc. Forever. This results in stupid amounts of data about our users and we pretty much don't bother to try to correlate the data, we just provide it upon request of the customer.

Why try to correlate it, when our customers are eager to pay us to do other things with it? Just because you have the data, doesn't mean you have to be devious with it. Save everything relevant, and then look after the interests of your customers. You'd be surprised at just how far a policy like this works for you!

Our customers are almost unusually loyal; they almost never leave after 1 year of using our services, and the trust is universally present even when the inevitable problems appear as we upgrade/enhance/update our software.

The truth is that operating in the best interests of your clients is actually a rather effective business strategy!

Re:Don't (1)

stephanruby (542433) | more than 2 years ago | (#41297357)

If your purpose is really just for "analysis and ultimately functionality, not persistence" then there is really no reason to keep an email or a name. Just assign a unique identifier, and then you're done.

So if for some reason, the user wants to get in touch with you to file a bug report, or what not, then assign a unique identifier for the device to the bug report (in case you get other bug reports coming from the same source), but don't ask for his/her contact information unless the user ticks a box asking specifically for a reply from you. Basically, if you tie your requests for information directly to your user actions, then it will become obvious to your users why you need the information you need.

And if you need their emails for marketing reasons, basically the marketing department wants it -- then be upfront about that too. If you're upfront and honest with the way you're going to handle or mishandle information, and not try to bury it under vague language and pages and pages of terms and conditions, then I think most users will still be willing to share their information with you. That being said, don't forget to research and comply with local laws too, in the regions where your application(s) will be made available.
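One way to picture the "unique identifier, contact details only on request" approach described above. This is a sketch under assumptions: the `BugReport` type, the field names, and the use of a random UUID as the device identifier are all hypothetical choices for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class BugReport:
    device_id: str                     # random, carries no personal information
    description: str
    reply_email: Optional[str] = None  # only set when the user asks for a reply


def new_device_id() -> str:
    """Generate a random identifier once per install and keep it on the device."""
    return uuid.uuid4().hex


def file_bug(device_id, description, wants_reply=False, email=None):
    """Keep the e-mail address only if the user ticked the reply box."""
    return BugReport(device_id, description, email if wants_reply else None)
```

Reports from the same install can still be grouped by `device_id`, but no name or address is stored unless the user explicitly asked to be contacted.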

Start reading about PII (3, Informative)

Anonymous Coward | more than 2 years ago | (#41296331)

Wikipedia (http://en.wikipedia.org/wiki/Personally_identifiable_information) is a good start.

Break the association (4, Insightful)

cheros (223479) | more than 2 years ago | (#41296377)

If at all possible, stay away from personally identifiable data. If your aim is to use identity as an index, work out a way in which you can translate an identity into an index or hash value (i.e. one way). This is not going to be perfect (there will be about a million "John Smith"s out there), but if you have a consistent pair such as name and phone number, turn that into a hash and use it as data index.

That means you can still do correlations, but a leak will not result in exposure of personal data.

However, first of all, look at what you're holding on personal data and simply assume you got hacked and it's "out there" - plan for that crisis first because there is one question you need to answer:

If you cannot afford to pay for security advice, can you afford to pay for the inevitable consequences?
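A minimal sketch of the one-way identity index described above. Note one deliberate substitution: instead of a plain hash it uses a keyed hash (HMAC-SHA256), because names and phone numbers are guessable and an unkeyed hash could be reversed by brute force. The key handling, the normalisation rules, and the function names are assumptions for the example.

```python
import hashlib
import hmac

# Assumption: a secret key kept outside the database (e.g. in server config).
INDEX_KEY = b"replace-with-a-long-random-secret"


def normalize(name: str, phone: str) -> bytes:
    """Canonicalise inputs so the same person always maps to the same bytes."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    return (" ".join(name.lower().split()) + "|" + digits).encode("utf-8")


def identity_index(name: str, phone: str) -> str:
    """One-way index for a (name, phone) pair using HMAC-SHA256."""
    return hmac.new(INDEX_KEY, normalize(name, phone), hashlib.sha256).hexdigest()
```

Records keyed by `identity_index(...)` can still be correlated with each other, but a leaked table shows only hex strings. This protection holds only as long as the key does not leak along with the data.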

Re:Break the association (1)

noh8rz10 (2716597) | more than 2 years ago | (#41296467)

If you cannot afford to pay for security advice, can you afford to pay for the inevitable consequences?

put another way: if you can't afford to do it right, how will you afford to do it again?

Re:Break the association (1)

isaaccs (1854142) | more than 2 years ago | (#41298227)

There is validity to this point, but followed to its conclusion, many of the great boot-strapped startups of our time wouldn't exist. As your exposure and user base grows, so does your ability to consult with specialists and experts - but everyone must start somewhere.

Re:Break the association (1)

dgatwood (11270) | more than 2 years ago | (#41296663)

Or keep personally identifiable information separate from everything else. Ensure that you cannot get to one data set from the other and vice versa. Use login information as a hash into the identity database and the behaviors database. If you must store any time stamps on database records, make sure you do so in a way that prevents using them to easily correlate the two data sets (e.g. update the time stamp on the personal info record only when the user changes his/her password, address, or whatever, rather than at every login).

To the extent possible, store the information locally on the client side, or if you must store it on the server (for synchronizing between multiple computers), encrypt it client-side and send the encrypted blob to the server. Sure, that wouldn't prevent you from getting the information (because you control the site's code), but it does make it unlikely that somebody who compromises your database server can get the information.
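A rough sketch of the "login information as a hash into both databases" idea, assuming in-memory dictionaries in place of the two real databases. The purpose labels, the key-derivation parameters, and the function names are hypothetical. One limitation of this particular sketch: the keys change whenever the password changes, so a real design would need a migration step at that point.

```python
import hashlib
import hmac

identity_db = {}   # stand-in for the personal-information store
behaviour_db = {}  # stand-in for the behaviour store


def derive_key(login: str, password: str, purpose: str) -> str:
    """Derive a per-database lookup key from the login credentials.

    Different purpose labels give unrelated keys, so a row in one store
    cannot be matched to a row in the other without the credentials.
    """
    secret = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), login.encode("utf-8"), 100_000
    )
    return hmac.new(secret, purpose.encode("utf-8"), hashlib.sha256).hexdigest()


def record_login(login, password, profile, event):
    """Store the profile and the behaviour event under unrelated keys."""
    identity_db[derive_key(login, password, "identity")] = profile
    behaviour_db.setdefault(derive_key(login, password, "behaviour"), []).append(event)
```

Someone who copies both stores sees two sets of opaque keys with no shared column to join on; the link exists only while the user is logged in and the server holds the credentials.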

Re:Break the association (1)

Lorens (597774) | more than 2 years ago | (#41296869)

If your aim is to use identity as an index, work out a way in which you can translate an identity into an index or hash value (i.e. one way). This is not going to be perfect (there will be about a million "John Smith"s out there), but if you have a consistent pair such as name and phone number, turn that into a hash and use it as data index.

Bad idea when you get a hash collision. Account numbers do not have to be seen by the user, but there aren't (m)any useful ways of avoiding their use internally.

If OP is storing data for analysis and not for immediate reuse, there are some often overlooked but stupidly easy things to do like making sure that the user-facing machines collecting the data only have append/insert access to the data (no read, no modify). Analysing the data would be done from another machine/subnet/database account whatever.
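A small sketch of the "no modify" half of the idea above, using SQLite triggers to reject updates and deletes on the collection table. SQLite has no per-account privileges, so it cannot demonstrate the "no read" half; on a server database that would presumably be done by granting the collecting account INSERT rights only. The table and function names are made up for the example.

```python
import sqlite3


def open_append_only_store(path=":memory:"):
    """Open a database whose events table rejects UPDATE and DELETE."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS events (
            id      INTEGER PRIMARY KEY,
            payload TEXT NOT NULL
        );
        CREATE TRIGGER IF NOT EXISTS events_no_update
        BEFORE UPDATE ON events
        BEGIN
            SELECT RAISE(ABORT, 'events is append-only');
        END;
        CREATE TRIGGER IF NOT EXISTS events_no_delete
        BEFORE DELETE ON events
        BEGIN
            SELECT RAISE(ABORT, 'events is append-only');
        END;
        """
    )
    return conn


def append_event(conn, payload):
    """The only write operation the collecting side is meant to perform."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))
```

Any attempt to run an UPDATE or DELETE against `events` through this connection raises a `sqlite3.DatabaseError`, while inserts keep working.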

Re:Break the association (2)

cheros (223479) | more than 2 years ago | (#41296919)

He said he had little money available, so I figured I gave him something that was easy vs. perfect. The key question is if the delta introduced by the odd hash collision is actually significant in the volume of data he is planning to process. If it isn't, I would not try to develop perfection - he can use his little funding better elsewhere..

In other words, in theory you're absolutely right, in practice I suspect there is little difference. But my favourite way of avoiding issues with personal data is simply not collecting them in the first place. Unless you are Google and get away with a pathetic fine, of course..

Re:Break the association (1)

fa2k (881632) | more than 2 years ago | (#41297757)

Great idea for some cases. If you need "telemetry" data to understand how people are using your application, assign each session a unique ID and don't store which user did it. It also works for some other statistical data. The argument against is that you may need the correlation between sessions later.

Depending on the application, you could have a hierarchical system of databases where the lowest level contains session information, the next contains persistent user information but not personally identifiable info, and the highest level contains username, password, name, etc. You could have just a few components that have access to the top level, including the login component. The latter could load the information into the session state, so you could display the username on every page, for example. It's just something I thought of, I'm not an expert (I wrote a quiz and a page for a small event 8 years ago;).

Re:Break the association (1)

fa2k (881632) | more than 2 years ago | (#41297769)

Regarding my second paragraph, an important part was not obvious: Each session in the session database has a unique ID, and each anonymised user in the middle database has a list of sessions, and each user in the top database points to an anonymised user.
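The three-level layout described in the two comments above could be sketched roughly like this, with plain dictionaries standing in for the three databases. All type and function names are hypothetical, and access control between the tiers is not modelled.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:   # lowest tier: telemetry only
    session_id: str
    events: List[str] = field(default_factory=list)


@dataclass
class AnonUser:  # middle tier: persistent, but no personal details
    anon_id: str
    session_ids: List[str] = field(default_factory=list)


@dataclass
class Account:   # top tier: personal details, tightly restricted
    username: str
    name: str
    anon_id: str  # pointer down to the anonymised user


sessions: Dict[str, Session] = {}
anon_users: Dict[str, AnonUser] = {}
accounts: Dict[str, Account] = {}


def register(username, name):
    """Create the top-tier account and its anonymised counterpart."""
    anon = AnonUser(uuid.uuid4().hex)
    anon_users[anon.anon_id] = anon
    accounts[username] = Account(username, name, anon.anon_id)
    return accounts[username]


def start_session(username):
    """Login component: the only code here that touches the top tier."""
    anon = anon_users[accounts[username].anon_id]
    session = Session(uuid.uuid4().hex)
    sessions[session.session_id] = session
    anon.session_ids.append(session.session_id)
    return session
```

Analysis code that only has access to `sessions` and `anon_users` can follow a user across sessions without ever seeing a username or real name.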

Collect as little as possible, throw it away... (4, Interesting)

IBitOBear (410965) | more than 2 years ago | (#41296407)

I have been toying with a site idea. Your account name is your public key fingerprint. Your public nickname is whatever you use in the message. Your login is validated because everything you send is signed with the key that matches the fingerprint (and encrypted with my public key for transmission). Input to user form is constrained and validated within those constraints (to prevent padding attacks).

I would then have a database "key x","paid through date y".

Sure, I couldn't sell any farmed data a la Facebook, but subpoena requests would be a breeze... "here's your hex dump..."
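A sketch of just the account table from the idea above: fingerprint in, paid-through date out. Signature checking is deliberately left out, since it needs a real OpenPGP implementation, and the SHA-256 "fingerprint" here is a stand-in rather than the actual OpenPGP fingerprint format. All names are hypothetical.

```python
import hashlib
from datetime import date

accounts = {}  # fingerprint -> paid-through date; nothing else is stored


def fingerprint(public_key_bytes: bytes) -> str:
    """Stand-in fingerprint: SHA-256 of the raw key bytes (not the OpenPGP format)."""
    return hashlib.sha256(public_key_bytes).hexdigest()


def extend_subscription(public_key_bytes: bytes, paid_through: date) -> str:
    """Record a payment; never shortens an existing paid-through date."""
    fp = fingerprint(public_key_bytes)
    accounts[fp] = max(accounts.get(fp, paid_through), paid_through)
    return fp


def is_active(fp: str, today: date) -> bool:
    """True if the key has an account that is paid up on the given day."""
    return fp in accounts and today <= accounts[fp]
```

A dump of `accounts` is the whole user database: hex strings and dates, with no names, e-mail addresses, or passwords to leak.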

P.S. (1)

IBitOBear (410965) | more than 2 years ago | (#41296477)

Return email will be sent, if necessary, to whatever address(es) are registered in the public key database for that fingerprint, encrypted with that key.

Obviously I have no control over your passphrase and can do nothing to help you "recover your password" or whatever. Please see your GPG or PGP documentation for a better explanation.

Your account will not be "renewed" past the key expiration date.

Re:Collect as little as possible, throw it away... (1)

Rob Kaper (5960) | more than 2 years ago | (#41297191)

I have been toying with a site idea. Your account name is your public key fingerprint. Your public nickname is whatever you use in the message. Your login is validated because everything you send is signed with the key that matches the fingerprint (and encrypted with my public key for transmission). Input to the user form is constrained and validated within those constraints (to prevent padding attacks).

I would then have a database "key x","paid through date y".

Sure, I couldn't sell any farmed data a la Facebook, but subpoena requests would be a breeze... "here's your hex dump..."

If you accept payments, wouldn't those keys still be linked to contact information and/or payment transactions?

Payment Receipts (1)

IBitOBear (410965) | more than 2 years ago | (#41297335)

Not for any longer than necessary. Likely I would make that opt-in.

I would have a payment history (bob paid x dollars for y time) as an atomic event. Bob could check a box to say "remember this for me", or not at the time of payment.

At the time of payment I would also send Bob a receipt. That receipt would say "Bob paid for a service". The receipt would also contain a dot-splash (e.g. a QR code or a linear or 2D barcode, depending on how much info space I turn out to need) that was the "proper join record for the database" (e.g. the key tuple that proved that payment X was for service Y on date Z). That tuple would be encrypted with _my_ secret key. Bob could use this receipt by sending it back to me, but I would only have that record until the payment cleared and was essentially irreversible, or when Bob sent it back via email or phone scan etc.

The actual membership information that Key X was paid-up-and-valid until Date Y would be a separate entry.

Think double-entry accounting, but where the account holder and not the institution has the journal that collates things.

There would be no start date, and a person could buy any amount of time. That is necessary because the key the customer made is only valid until it expires, and that expiration date is chosen to the second by the key creator.

There is some ability to back-figure the expiration dates to the purchases, and so to the purchasers, while both sets of data are present. So the user would have the option to "randomize the duration": by gambling a little of the funds paid, they would gain or be shorted a random amount of time within a reasonable percentage of the purchase duration.

The idea is that, at every chance, you give the user the magic cookie to join the information, but you keep the results. As long as the cookie is cryptographically secured it doesn't matter that they are holding it.
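A rough sketch of such a "magic cookie", with one loudly flagged substitution: instead of encrypting the join tuple with the site's secret key, this version signs it with an HMAC using only the Python standard library. That makes the receipt tamper-evident but readable by its holder, which seems acceptable since the holder already knows what they bought. The field names and the `issue_receipt`/`verify_receipt` helpers are invented for the example.

```python
import base64
import hashlib
import hmac
import json


def issue_receipt(secret_key, key_fingerprint, payment_id, paid_through):
    """Build a receipt holding the join tuple plus an HMAC tag; the site keeps no copy."""
    payload = json.dumps(
        {"key": key_fingerprint, "payment": payment_id, "paid_through": paid_through},
        sort_keys=True,
    ).encode("utf-8")
    tag = hmac.new(secret_key, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(tag).decode("ascii")
    )


def verify_receipt(secret_key, receipt):
    """Return the join tuple if the receipt is genuine, otherwise None."""
    try:
        payload_b64, tag_b64 = receipt.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        # Wrong number of parts or invalid base64.
        return None
    expected = hmac.new(secret_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return json.loads(payload)
```

When the user sends the receipt back, `verify_receipt` recovers the link between payment and key without the site having stored it in the meantime.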

It wouldn't be that hard to figure out who paid what, and when, while the user base is just starting out, but as the base and the transactions mounted the anonymity of payments would increase.

So imagine you want to buy a year, and your public key is good for at least a year, you could buy a year as one transaction, or cut it up into several transactions (like 2 and 3 and 7 months each) to get the year, or you could buy eleven months and bet a month hoping to go long not short. Without the record that you get into your exclusive custody, there is no good way to ask the site how 12 months ended up on that key from which purchaser.

If you invalidate your key, you get no money back. If you lose your receipt you have nobody to blame but yourself. That's the risk you take for your privacy. It's basically using an information system hole to make things same-as-cash.

I haven't figured out how to deal with credit card "charge-backs" or fraudulent disputes. I'd rather take the gift-card route for payment if it came to that kind of problem.

You could, I suppose, put people who paid via revocable means (like credit cards) in "risk pools" and if someone games, you penalize the pool but let people out of the pool using their receipts as proof that they are not the scammer. As each person used their receipts to change pools, the pool would get smaller but each member would lose more, until only the scammed account and people who didn't care or lost their receipt would lose anything.

The idea started out as more of a social media/blog/rant site idea than a profit-oriented thing, but I could make it work pretty easily. The "business rules" for an anonymized service seem totally workable, but the anonymous people would have to accept some of the risk for the privacy.

People who opt in to having _me_ keep the payment records are, of course, buying the surety of service for the loss of anonymity, at least in part.

And the un-paid people are much less work (e.g. none) to track.

And a spammer would lose all their content for spamming, as the key and all its content would be forfeit in a single "hide/delete where" equivalent action. So rather than make fake accounts on my system they would be "paying" CPU to make keys and encrypt and sign their transmissions to me. Not impossible to script but pretty hard to javascript.

The little niceties (1)

IBitOBear (410965) | more than 2 years ago | (#41297423)

There would be other little niceties.

Aggressive use of POST instead of GET messages on all forms so that pen-trap requirements, if levied, would be largely moot, as in "user XXXXXX did POST to "/" at this site on these dates and times". [POST data is not legal to collect in pen traps in the USA as I understand the law.]

Services a site could sell? POST the URL you want as part of the encrypted blob you send to this site; we will retrieve it, scrub it and send its content back to you encrypted with your key.

Pay for encrypted, advertisement-free page delivery with/without the unpaid people's noise at your leisure. 8-)

Encrypted mailbox where the records in the mailbox are encrypted to your public key the instant we get them if the "From" matches particular criteria you specify. (This burns time off your subscription key expiration date etc., so you might not want to encrypt "from *".)

Note that this is not a bar to law enforcement if they show up with a court order to "tap" a particular key going forward. It is a barrier to having law enforcement fish into your past. I am not a lawyer so I don't know if this last bit is legal, it's just the noise floating in my head.

Of course such a site would have no way of knowing whether the "identity" information in the key, if any, was real, so just as I could make a key that said I was both Mittens and B.H.O. today, anybody would be foolish to assume that an unsigned and unverified key was anybody it claimed to represent.

In short the site design is not to confound the law, but to make the entire issue of identity "Somebody Else's Problem" since I want to be in the business of passing messages for fun and profit, not being the arbiter of who is whom.

(you should see my thoughts on replacing DNS... 8-)

Give me control and earn my trust (3, Insightful)

johnnick (188363) | more than 2 years ago | (#41296411)

The short requirements:

1) Explain what you're collecting in real-time at the moment when you give me the option whether or not to permit you to collect it. Tell me what you will use it for, when you will delete it and the consequences if I don't give it to you. People don't read privacy disclosures. Give notice and ask permission at the moment of proposed collection. Make it opt-in, not opt-out.

2) Only request the information required to perform the service I've requested. Use the information I provide only to provide the service I've requested. Only share the information I provide with third parties to the limited extent necessary to provide the services I've requested. Obtain contractual commitments from those third parties that cause them to protect my information and delete it as soon as they've done what's required to provide the service I've requested. Keep information only as long as necessary to provide the service I've requested and delete it after you've done what's required to provide the service I've requested.

3) Protect my information. Encrypt in transit and at rest. Delete thoroughly and don't give in to the urge to collect and keep information just because it might be useful some time in the future. You can't lose what you don't have.

You say the collection "... is for purposes of analysis and ultimately functionality, not persistence." That seems inconsistent with the collection of name and email address. I can't think of too many use cases where you're collecting my name and email address and don't plan to keep it (and use it for marketing or otherwise share it in some way). If you need to contact me or I need to create a user-id that is my email address, you don't need my name.

Your privacy policy is your contract with your user. It is an operational document that must be consistent with your practices. The privacy policy should be consistent with your policies and procedures. If the information you collect, or the way you handle it changes, you must change your privacy policy.

Re:Give me control and earn my trust (1)

TheDarkMaster (1292526) | more than 2 years ago | (#41297925)

I think your answer is the best I've seen for the issue.

Nice admission of guilt. (-1, Offtopic)

Anonymous Coward | more than 2 years ago | (#41296525)

"but I've seen quite a few startups caught with their pants down on security/privacy of what they've collected — and I'd like to avoid it to the degree reasonably possible given we can't afford to hire an expert on the topic."

Your Honor, with his own words, the responsible developer admitted that they were too cheap to hire an expert and that they knew the consequences that they would have to bear.

I rest my case.
Saul Goodman, Attorney at law.
http://www.bettercallsaul.com/ [bettercallsaul.com]

Support OpenID (1)

interval1066 (668936) | more than 2 years ago | (#41296601)

...and let your users, investors, and you sleep easier at night. Don't store anything at all except a few prefs.

Store nothing. (0)

Anonymous Coward | more than 2 years ago | (#41296673)

"This is for purposes of analysis and ultimately functionality, not persistence."

Store nothing.
Ask people what they want from your service.
Listen to them.

You can't afford it, by your own admission. (3, Insightful)

VendettaMF (629699) | more than 2 years ago | (#41296699)

If you can't afford the expert then you can't afford to collect such data. Move away from this project to something you have the ability to do.

Re:You can't afford it, by your own admission. (2)

Mike610544 (578872) | more than 2 years ago | (#41297231)

If you can't afford the expert then you can't afford to collect such data. Move away from this project to something you have the ability to do.

I'm surprised it took this long for someone to say that. The people who will exploit your system and extract something valuable from it can afford those experts.

Re:You can't afford it, by your own admission. (0)

Anonymous Coward | more than 2 years ago | (#41297545)

Bingo, finally we have a winner. If you don't already have the knowledge to handle this and you can't afford the experts to do it for you or teach you to do it then you should stay the hell away from this area. Every dev thinks they can write good secure code that protects privacy; the reality is most devs have no concept of the methods and skills that experts in this field can utilise to compromise your system or, more importantly, your users' privacy. Move on to something you are more comfortable with.

IN MY BUNGHOLE! (-1)

Anonymous Coward | more than 2 years ago | (#41296701)

I can store a lot of information there.

FU DATA MINERS!!!

OWASP (5, Informative)

FormOfActionBanana (966779) | more than 2 years ago | (#41296709)

OWASP has guidance; for instance, here: https://www.owasp.org/index.php/IOS_Developer_Cheat_Sheet#Insecure_Data_Storage_.28M1.29 [owasp.org]

From https://www.owasp.org/images/5/5e/Mobile_Security_-_Android_and_iOS_-_OWASP_NY_-_Final.pdf [owasp.org]
2. Insecure data storage
Solution
  Avoid local storage inside the device for sensitive information
  If local storage is “required”, encrypt data securely and then store
  Use the Crypto APIs provided by Apple and Google
  Avoid writing custom crypto code – prone to vulnerability

Re:OWASP (1)

fa2k (881632) | more than 2 years ago | (#41297815)

  Avoid local storage inside the device for sensitive information

That does make sense, but it still feels like I've fallen into opposite land.

  Avoid writing custom crypto code – prone to vulnerability

Yes! I'll repeat it a couple of times

  Avoid writing custom crypto code – prone to vulnerability

  Avoid writing custom crypto code – prone to vulnerability

Book of best practices (5, Insightful)

Okian Warrior (537106) | more than 2 years ago | (#41296751)

In the US, we have the National Electrical Code [wikipedia.org] which explains in clear detail how house wiring is constructed.

Following the code is a legal requirement in many (most?) states, but from the point of view of an electrician it's a "book of best practices". Use this gauge wire for this current, staple the wire within 6" of the box, and so on. The code gets revised and added to over time as questions crop up and new technologies get added and people get more experience.

There's a reason for everything. For example, the light in a bathroom should be on a separate breaker from the outlet next to the sink. It makes sense in retrospect, but this is not something that is obvious beforehand.

It's very detailed, but also very clear. Homeowners routinely understand the instructions and are able to make simple repairs and modifications to their home wiring which conform to the code.

We throw a lot of "best practices" around here as if they were simple and obvious at the outset, but maybe they're not. Hash your passwords, salt the hash, sanitize the form inputs, don't keep CC info... lots of best practices which in hindsight make sense but which aren't necessarily obvious beforehand.
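As one concrete entry such a book might contain, here is a minimal sketch of "hash your passwords, salt the hash" using only Python's standard library. PBKDF2-HMAC-SHA256 is one reasonable choice rather than the only one, and the iteration count below is an assumed work factor for illustration, not a recommendation taken from this thread.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your own hardware and policy


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Store the salt and digest per user and never the password itself. The per-user salt means two users with the same password end up with different digests.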

Most web apps have common requirements for login, identity management, privacy, various forms of functionality, and so on.

Should we have a "book of best practices"?

Re:Book of best practices (1)

fuzzyfuzzyfungus (1223518) | more than 2 years ago | (#41296863)

I suspect that the big problem with that analogy is that data collection (unlike electrical wiring) is a substantially adversarial field.

There is a certain amount of tension, (fast, cheap, good, pick any two, and the usual buyer/seller desire to not leave money on the table); but the buyer and the seller both share roughly the same ideal, though they may deviate from it out of laziness, cheapness, or incompetence.

With data collection, the purely security/architectural aspects are somewhat similar; but there is the more fundamental problem that data collection is frequently not for the good of the collected. There is only the merest pretence of aligned interests, and it mostly is a matter of what the collector can get away with.

Re:Book of best practices (1)

khallow (566160) | more than 2 years ago | (#41298117)

The same tension exists in electrical wiring. But one can physically inspect the entire work. With data collection, it's pretty easy to hide what you are doing from the target of your collecting.

Aggregate Data (1)

Archangel Michael (180766) | more than 2 years ago | (#41296761)

Aggregate the data as quickly as possible to anonymize it.

Collect "Mary did X, Y but not Z", but aggregate it to Three people did X, Two Y and TWELVE Z and drop Mary from the data. You don't need to know Mary did anything.
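A small sketch of that aggregation step, assuming events arrive as hypothetical `(user, action)` pairs. The user names exist only in memory while counting and never appear in the returned result.

```python
from collections import Counter


def aggregate(events):
    """Count how many distinct users performed each action, then drop the users."""
    seen = set()        # (user, action) pairs, kept only for the duration of this call
    counts = Counter()
    for user, action in events:
        if (user, action) not in seen:
            seen.add((user, action))
            counts[action] += 1
    return counts       # only action totals leave this function


events = [("mary", "X"), ("mary", "Y"), ("bob", "X"), ("bob", "X")]
totals = aggregate(events)  # persist `totals`, discard `events`
```

One caveat: very small counts can still point at an individual, so tiny groups may need to be suppressed or merged before the totals are stored.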

What is it for? (1)

Silvanis (152728) | more than 2 years ago | (#41296785)

You say you aren't interested in persistence, so I don't see any reason why the data needs to be personally identifiable. Whether your index is John Smith in Albany,NY or User #71829382 doesn't matter for usage analytics. Even demographic information can at least be stripped of things like name and phone number.

If you REALLY need to tie this information to a particular instance, then use a hardware key from the mobile device and not a user's information. A hacked phone is easier to deal with than identity theft.

As someone else mentioned, work from the assumption that anything you save will end up being hacked and used for nefarious purposes. Make the data as useless as possible to a hacker and THEN design the systems and storage to be as hackproof as you can.
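One way to get an opaque "User #..." index for analytics, sketched under the assumption that a keyed hash (HMAC-SHA256) of some stable identifier is acceptable. The `pseudonym` helper and the idea of keeping the key outside the analytics store are illustrative choices, not something prescribed above.

```python
import hashlib
import hmac


def pseudonym(identifier, secret_key):
    """Map an identifier (e-mail, device ID, ...) to a stable opaque token."""
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
```

The same input always yields the same token, so usage records can still be grouped per user. Without the key, an attacker who steals the analytics data cannot simply hash a list of known e-mail addresses to reverse the tokens, which is the weakness of a plain unsalted hash.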

Also consider TLDR-TOS (2)

Krishnoid (984597) | more than 2 years ago | (#41296875)

This site [tos-dr.info] provides summaries of the terms-of-service policies for various companies covering privacy, retention, and use of user information. You can use it to compare your plans with those of major companies and identify privacy or TOS concerns you may have overlooked.

"We aim to be completely transparent and honest" (2)

stiebing.ja (836551) | more than 2 years ago | (#41296891)

+5 Funny

License under AGPLv3 (0)

Anonymous Coward | more than 2 years ago | (#41296911)

Doing this means that you will really respect the privacy of anyone using your software since they would have the source code to do as they wish.

https://en.wikipedia.org/wiki/AGPLv3

Give your users Freedom and they'll respect you.

FBI Laptop (0)

Anonymous Coward | more than 2 years ago | (#41297011)

Cut out the middle man. Just put it straight onto an FBI laptop.

On a need to know basis only (1)

istartedi (132515) | more than 2 years ago | (#41297059)

My car insurance company needs to be able to pull my DMV records, perhaps even periodically. They could retain *none* of that information and ask me to visit a web site periodically where the info gets entered so they can do the query (and then forget the information required to perform the query). Most customers wouldn't mind them holding that information; but if I'm *that* security minded and they make it clear to me that I'll have to hit their site once a month to maintain my insurance... well... There are always trade-offs, aren't there?

Just ask yourself, "what do I need in order to serve my customers?". Yes. REAL customer service. People doing good things for other people, and getting paid for it as opposed to just herding people like cattle and exploiting them. Yeah, I know. Strange concept.

That can be a very successful business model though. Zappos is said to be very customer focused, and AFAIK they are very successful. I can't say I care much for their work environment; but I believe that's a separate issue from customer service. I mean, do you really need to have conga lines and party with your co-workers after hours to be spot-on with a customer? I don't think so; but I just can't think of a good counter-example off the top of my head.

Best Practices For Collecting and Storing User Inf (0)

Anonymous Coward | more than 2 years ago | (#41297069)

Consider whether your service is intended for only one country or several (a global service).
Regulations on user information/user data are VERY DIFFERENT in different parts of the world.
There are quite a few countries where parts of the population have first-hand experience of the severe implications of wrongfully used user information, like
-Friends
-Location
-Behaviour
-Etc
In some of these countries data collection is outright forbidden, or the user data can never leave the country.

Don't (0)

Anonymous Coward | more than 2 years ago | (#41297481)

Only collect what is absolutely necessary at that particular junction; afterwards move that data to a different datastore where only legally required information is stored (e.g. for tax purposes) and everything else is omitted. Encrypt it. Don't store IPs longer than 30 days. Don't set permanent cookies.
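A sketch of the 30-day IP rule as a periodic cleanup job, assuming a hypothetical `access_log` table in SQLite with a Unix-timestamp column. The schema and function names are invented for the example.

```python
import sqlite3
import time

RETENTION_SECONDS = 30 * 24 * 60 * 60  # 30 days


def create_schema(conn):
    """Hypothetical log table: one row per request."""
    conn.execute("CREATE TABLE access_log (id INTEGER PRIMARY KEY, ip TEXT, created_at REAL)")


def purge_old_ips(conn, now=None):
    """Blank the IP on rows older than the retention window; return rows changed."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "UPDATE access_log SET ip = NULL WHERE ip IS NOT NULL AND created_at < ?",
        (now - RETENTION_SECONDS,),
    )
    conn.commit()
    return cur.rowcount
```

Running `purge_old_ips` from a daily scheduled job keeps the rule enforced by the system rather than by someone's memory. Note that backups and replicas need the same treatment, which this sketch does not cover.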

Use (1)

MrKaos (858439) | more than 2 years ago | (#41297489)

/dev/null

Functionality Only, Eh? (0)

Anonymous Coward | more than 2 years ago | (#41297559)

From the article,

I'm a mobile developer at a startup. My experience is in building user-facing applications, but in this case, a component of an app I'm building involves observing and collecting certain pieces of user information and then storing them in a web service. This is for purposes of analysis and ultimately functionality, not persistence. This would include some obvious items like names and e-mail addresses, and some less obvious items involving user behavior.

If the intended reason is as stated, then why store the names and email addresses at all? Analysis of user behaviour in the aggregate does not require that individually-identifiable information be collected, much less stored.

Read "Translucent Databases" by Peter Wayner (1)

cornicefire (610241) | more than 2 years ago | (#41297567)

It explains how to store personal information so it can be used correctly. http://wayner.org/node/46 [wayner.org]

So it sounds like... (0)

Anonymous Coward | more than 2 years ago | (#41297977)

Either you're truly concerned about what parts of the interface/product/service they use the most and how, or you're collecting sales and marketing data so you can trim the fat and rape people. If the first is the case, just put a counter and/or timer on everything so every time it is accessed, clicked, etc, it is counted and you can see the amounts of time they are spending doing what without ever collecting any personal data at all. That would be completely anonymous and give you all the data you need to build a better interface. If the latter is the case go jump in a river from a very high bridge.
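The counter-and-timer idea could look roughly like this. It is a minimal sketch in which the `FeatureStats` class and the feature names are hypothetical, and no user or device identifier is accepted anywhere in the interface.

```python
from collections import defaultdict


class FeatureStats:
    """Per-feature hit counts and total time spent; nothing per-user is recorded."""

    def __init__(self):
        self.hits = defaultdict(int)
        self.seconds = defaultdict(float)

    def record(self, feature, duration_seconds=0.0):
        """Count one use of a feature and add the time spent on it."""
        self.hits[feature] += 1
        self.seconds[feature] += duration_seconds

    def report(self):
        """Return {feature: (hit_count, total_seconds)}."""
        return {f: (self.hits[f], self.seconds[f]) for f in self.hits}


stats = FeatureStats()
stats.record("search", 2.5)
stats.record("search", 1.5)
stats.record("settings")
```

Because `record` takes only a feature name and a duration, there is nothing to anonymise afterwards.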

Policies, Procedures, Standards, Trust all Useless (1)

anorlunda (311253) | more than 2 years ago | (#41298043)

If your company goes bankrupt, or is sold to another, all its assets become the property of someone else. That someone cannot be constrained to respect anything you have promised. You may not even have the opportunity to wipe disks or change passwords.

For example, a hospital failed to pay the rent on a warehouse storing patient records. The landlord seized and sold those records as scrap. None of the hospital's patient privacy obligations transfer to the landlord, or to the scrap dealer.

  Heed the advice of others who told you don't do it.

Ugh (0)

Anonymous Coward | more than 2 years ago | (#41298091)

"How would you like information collected about you to be stored?" ::Must resist unhelpful comment::

Keep it on the user's computer, not in the cloud (1)

jbrohan (1102957) | more than 2 years ago | (#41298093)

Obviously not the solution for everybody. We write apps for Android tablets (for old people actually). All the data like name, email, pictures, and messages are stored on the Android tablet and kept in the cloud only until they are downloaded. They are encrypted, even the pictures, while waiting in the cloud database. In the registration part of the app the user does type in his email, but we do not keep it. How to contact the user? We put a record in a table which is checked periodically by an active user's Android and they can get the message. Payment is tricky since the PayPal record contains the email of the payer and the AndroidId of the user's tablet. It's just a matter of throwing away data that does not belong to me!

Google Mobile Analytics (1)

monkeyhybrid (1677192) | more than 2 years ago | (#41298611)

Although you state you're not looking for stack or infrastructure recommendations, I'd still recommend having a look at Google Mobile Analytics [google.com] . They have an SDK for Android and iOS that makes it very easy to integrate in your apps.

Please (0)

Anonymous Coward | more than 2 years ago | (#41299089)

Let us know the name of this startup, so that we may avoid it like the plague.
