An anonymous reader writes "The Foursquare blog has an interesting post about some of the math they use to evaluate and verify the massive amount of user-generated data that enters their database. They need to figure out the likelihood that any given data point accurately represents reality, so they've worked out a complicated formula that will minimize abuse. Quoting: 'By choosing the points based on a user’s accuracy, we can intelligently accrue certainty about a proposed update and stop the voting process as soon as the math guarantees the required certainty.
So far, we’ve taken a very user-centric view of p-sub-k (this is the accuracy of user k). But we can go well beyond that. For example, p-sub-k could be “the accuracy of user k’s vote given that they have been to the venue three times before and work nearby.” These clauses can be arbitrarily complicated and estimated from a (logistic) regression of the honeypot performance. The point is that these changes will be based on data and not subjective judgments of how many “points” a user or situation should get.'"
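
For anyone wondering what "points based on a user's accuracy" might look like in practice, here is a rough sketch. It is not Foursquare's code: it assumes votes are independent, that each voter's points are the log-odds of their accuracy p-sub-k, and that voting stops once the summed log-odds cross the log-odds of the required certainty. The names vote_points and tally, the 0.5 prior, and the 0.99 threshold are all made up for illustration.

```python
import math


def vote_points(accuracy: float) -> float:
    """Assumed scoring rule: log-odds of a voter's accuracy p_k (0 < p_k < 1)."""
    return math.log(accuracy / (1.0 - accuracy))


def tally(votes, prior=0.5, required_certainty=0.99):
    """Accumulate votes until the required certainty is reached.

    votes: iterable of (accuracy, approves) pairs, in arrival order.
    Returns (decision, probability, votes_used), where decision is
    True (accept), False (reject), or None (still undecided).
    """
    log_odds = math.log(prior / (1.0 - prior))
    threshold = math.log(required_certainty / (1.0 - required_certainty))
    used = 0
    for accuracy, approves in votes:
        points = vote_points(accuracy)
        # A "yes" adds the voter's points, a "no" subtracts them.
        log_odds += points if approves else -points
        used += 1
        if abs(log_odds) >= threshold:
            break  # certainty reached, no more votes needed
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    if log_odds >= threshold:
        return True, probability, used
    if log_odds <= -threshold:
        return False, probability, used
    return None, probability, used


if __name__ == "__main__":
    # Three 90%-accurate voters agree; the fourth vote is never consulted.
    print(tally([(0.9, True), (0.9, True), (0.9, True), (0.6, True)]))
```

Under these assumptions, two 90%-accurate "yes" votes are not quite enough to reach 99% certainty, but a third one is. The accuracy passed in for each vote could just as well come from a regression on features such as prior visits, as the quoted post suggests; the tally itself would not change.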