NCSA Compares Google and Yahoo Index Numbers
The assumption seems to be that search results are randomly distributed. But by the very nature of search - a targeted and subjective request for information - that is clearly the wrong model. I don't see why we should assume that a 2x bigger index returns 2x more results for any query with fewer than 1000 results.
A better test would be to see how much overlap there is between the two engines' results for the same queries. Do the top 50 returns on queries (of any size, not just limited to those with N < 1000 returns) match? To within what percentage?
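
To make the overlap idea concrete, here is a rough sketch of how it could be scored. It assumes you already have each engine's ranked results for a query as a list of normalized URLs; the function names and the sample URLs are made up for illustration, and nothing here comes from the NCSA study.

```python
def top_k_overlap(results_a, results_b, k=50):
    """Share of the top-k results that two engines have in common.

    results_a, results_b: ranked lists of URLs (hypothetical input,
    assumed to be normalized so the same page compares equal).
    Returns a fraction between 0.0 and 1.0.
    """
    top_a = set(results_a[:k])
    top_b = set(results_b[:k])
    if not top_a or not top_b:
        return 0.0
    # Divide by the shorter list so a query with few results is not penalized.
    return len(top_a & top_b) / min(len(top_a), len(top_b))


def mean_overlap(per_query_results, k=50):
    """Average the top-k overlap over many queries.

    per_query_results: dict mapping query -> (results_a, results_b).
    """
    if not per_query_results:
        return 0.0
    scores = [top_k_overlap(a, b, k) for a, b in per_query_results.values()]
    return sum(scores) / len(scores)


# Made-up sample data, only to show the call.
engine_a = ["u1", "u2", "u3", "u4"]
engine_b = ["u3", "u4", "u5", "u6"]
print(top_k_overlap(engine_a, engine_b, k=4))  # 0.5
```

Rank order inside the top 50 is ignored here; it only asks whether the same pages show up at all, which is the question above.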