Wikia Search Engine to be Launched on January 7th

timothy posted more than 6 years ago | from the wisdom-of-crowds dept.

The Internet 189

cagnol writes "The Washington Post reports that Jimmy Wales, the founder of online encyclopedia Wikipedia, has announced the launch of a new open-source search engine, Wikia Search, on January 7th, 2008. The project will allow the community to help rank search results, in a model close to Wikipedia. However the company is a for-profit organization. This new search is supposed to challenge Google and Yahoo."

Challenging Google? (2, Interesting)

sykopomp (1133507) | more than 6 years ago | (#21878576)

I guess that's their response to Google's Knol (http://en.wikipedia.org/wiki/Knol) Pity to see things heat up between the 'good guys'.

Re:Challenging Google? (2, Insightful)

Anonymous Coward | more than 6 years ago | (#21878664)

What? Why? It's called "competition" and it's healthy.

Re:Challenging Google? (0, Flamebait)

jt2377 (933506) | more than 6 years ago | (#21878928)

Good guys? Are you in special ed. or something?

Re:Challenging Google? (5, Informative)

jwales (97533) | more than 6 years ago | (#21878996)

No, it is no response to Knol. I have been working on this for a year. The press has talked about it endlessly. :-)

It'd be sort of cool if we could create a search engine in a week or two to respond to Knol, but actually it takes a bit longer. :)

I see Larry and Sergey socially from time to time. I spoke about the search project at Google Zeitgeist a few months ago. Going to a google party next month. The media loves a "fight" but really, that's just a nice story arc the press makes up. (Notice: google is not in the search business, google is in the advertising matching business. This search engine doesn't hurt that business at all, indeed it probably makes it marginally less likely we will see the emergence of a proprietary competitor to topple them.)

It is actually possible for people to just enjoy doing cool stuff without being bastards about it. People forget this sometimes, maybe due to the reputation of a certain dominant software provider. :)

Re:Challenging Google? (3, Interesting)

ThreeGigs (239452) | more than 6 years ago | (#21879086)

It looks like you've entered some sort of partnership with Grub http://www.grub.org/ [grub.org] .
If so, kudos... Grub's been languishing in not-ready-for-primetime land for far too long, and the ability to crawl your own site to keep results current is a bonus, too.

Re:Challenging Google? (0, Redundant)

jo42 (227475) | more than 6 years ago | (#21879256)

google is in the advertising matching business
And this is why Google is Evil. They use their massive advertising revenue to give away services for free. Much like when a Walmart comes into town, the local shops go out of business.

Re:Challenging Google's Revenue Model (2, Interesting)

TaoPhoenix (980487) | more than 6 years ago | (#21879284)

There have only been two fundamental revenue models of content for 25 years now - EndUser and Advertiser. The ISP's went through the throes of the switch from PerHour to FlatRate in the 1990's, and the RIAA is struggling with it now.

I don't know anyone who would "pay to search" casual queries. There are some professional databases which do operate on this principle for high powered content.

From the RIAA threads we learn people don't want to pay as endusers for their content. The post above asks about the advertiser model.

The absolutely tough part about Free Open Source models is that it takes a MUCH longer cycle for the benefits to wind around the social benefit cycle. The monthly rent/mortgage whips around much sooner. The first person to absolutely nail this problem will be the mogul of the 2010 decade.

Re:Challenging Google's Revenue Model (5, Insightful)

sethawoolley (1005201) | more than 6 years ago | (#21879570)

From the RIAA threads we learn people don't want to pay as endusers for their content.
Great post, except this part doesn't make any sense. I pay as an end user for content all the time, and not just for high-end data: Magazine subscriptions, membership in various societies (and their publications), newspapers, my ISP, government funding (I pay through taxes), direct donations to non-profits, contributions to wikipedia and other open content systems directly. While some of them are for high-end data, a lot of it is not.

Is content ever going to be totally free? It will be if people understand the inherent rewards of an open society. Information's negligible cost of duplication is the revolutionary thing that is shattering the old models (cf. http://homes.eff.org/~barlow/EconomyOfIdeas.html [eff.org] ). Wikipedia is already doing that. As much as I'm a critic of Jimmy Wales, citizendium, etc. (with their NPOV lunacy), the system he's helped build is saving people's lives and improving quality of life in ways the old world just doesn't understand yet.

Personally, I'm hopeful that as long as we still have the Right to Read (cf. http://www.gnu.org/philosophy/right-to-read.html [gnu.org] ), we're on the path to freedom and salvation. A corporation that makes up a new "model" to take advantage of content producers isn't going to take hold anymore. There's just no point anymore. The price of content is already quite low for common knowledge. Even if the arbiters of knowledge try to keep it from common knowledge, we can paraphrase it. The greatest risk to real productive use of our knowledge still remains Patents. Information may finally be free, but the freedom to tinker is not.

Re:Challenging Google? (-1, Troll)

Niggy the Fist (1210970) | more than 6 years ago | (#21879270)

Yeah, that's actually a great point - but tell me, how can you speak with all that horse jizz in your mouth?

Re:Challenging Google? (2, Insightful)

R2.0 (532027) | more than 6 years ago | (#21879324)

Hey Jimmy: quit goofing around on Slashdot and get to straightening out the Wikipedia "administration" system. Check out your current fundraising campaign - that little green guy is moving REALLY slowly, and things like faked credentials for editors and the "notability purge" aren't helping.

Sincerely,
The Rest of the Internet

Re:Challenging Google? (-1, Troll)

Anonymous Coward | more than 6 years ago | (#21879364)

Go fuck yourself pornographer.

Re:Challenging Google? (-1, Flamebait)

Anonymous Coward | more than 6 years ago | (#21879450)

So Jimmy...

Enjoying all the fruits of other people's labors? Don't you find it the least ethically challenging to be making your riches by using all the free labor of your WikiDrones like so many whores?

Re:Challenging Google? (5, Funny)

STrinity (723872) | more than 6 years ago | (#21879664)

No, it is no response to Knol. I have been working on this for a year.
I'm sorry, but your post cites primary sources and thus does not meet Wikipedia's standards.

Re:Challenging Google? (4, Funny)

mblase (200735) | more than 6 years ago | (#21879686)

It is actually possible for people to just enjoy doing cool stuff without being bastards about it. People forget this sometimes, maybe due to the reputation of a certain dominant software provider. :)

Oh, come on. The people who matter already know that most Linux users aren't elitist snobs.

Re:Challenging Google? (0)

Anonymous Coward | more than 6 years ago | (#21879424)

I guess that's their response to Google's Knol (http://en.wikipedia.org/wiki/Knol) Pity to see things heat up between the 'good guys'.

Is it a pity because a clear-cut "evil vs good" model requires less thinking when reading the news? Such a pity indeed.

Re:Challenging Google? (1)

bigpat (158134) | more than 6 years ago | (#21879560)

Pity to see things heat up between the 'good guys'.
There is a difference between healthy competition and trying to drive your competition into the ground in order to squeeze every last penny from the market.

Competition should not be confused with the anticompetitive, mafia-like behavior that we all too often see from some other big businesses. For example, if Google acted like Microsoft (or the old AT&T and IBM) they would use their market position and simply require websites to exclude other search engines from indexing their web pages or else be excluded from Google's results.

Competing on quality of product or service is a good thing and helps consumers by giving them better choices. Ethical companies can compete without destroying the market or hurting their customers.

Just look at Google itself: it was a latecomer to the search engine market, yet it was able to supplant Yahoo by providing a better search service. Yahoo is still around and was able to adapt and improve the quality of its service. And ask.com was able to remake itself into a worthwhile alternative by improving its own service with some good features... healthy competition benefits people.

Easily Abused? (5, Insightful)

Shade of Pyrrhus (992978) | more than 6 years ago | (#21878582)

So basically...they're asking for people to abuse the ranking system. To patrol something like this would require a company with resources like Google, and most likely the reason Google doesn't have such functionality. Just my two cents.

Re:Easily Abused? (5, Funny)

Walzmyn (913748) | more than 6 years ago | (#21878618)

What this means is that no matter what you search for, the top hundred results will be to porn sites.

Well that's lame. (3, Funny)

raehl (609729) | more than 6 years ago | (#21879084)

the top hundred results will be to porn sites.

What a crappy search engine. When I search for something, I want the top 100 results to be 100 different porn sites! I can find two porn sites without help.

Re:Easily Abused? (1)

commodoresloat (172735) | more than 6 years ago | (#21879496)

What this means is that no matter what you search for, the top hundred results will be to porn sites.
So what's the problem here? This is exactly the functionality that was once promised in the vaporware project Net Nymph [fadetoblack.com] .

Re:Easily Abused? (1)

wizardforce (1005805) | more than 6 years ago | (#21878622)

An open search algorithm can more easily be checked for flaws and, by extension, fixed. Closed search can only be reverse engineered, for good or bad. With closed source you may be able to find a problem with the software/algorithm, but there is nothing you can really do about it: it's completely at the whims of whoever created it and that's the problem.

Re:Easily Abused? (5, Insightful)

Shade of Pyrrhus (992978) | more than 6 years ago | (#21878670)

Having an open algorithm is good, as non-disclosure isn't security, but the issue is allowing people to rank searches and such. Having that public is asking for people to abuse the system, and as noted before, a lot of malicious parties could seemingly legitimately rank their sites (porn sites, etc) higher, leading to ranking battles by bots. Of course, the issue of vandalism occurs with Wikipedia, however when people are looking to make money off of it they'll likely be more persistent.
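
To make the concern above concrete, here is a minimal sketch of how a ranking based on raw user votes could be flipped by throwaway accounts. Everything in it is hypothetical (the `rank_by_votes` helper, the voter names, the `.example` sites); nothing is known here about how Wikia Search actually counts votes.

```python
from collections import Counter

def rank_by_votes(votes):
    """Order URLs by raw vote count, highest first (ties broken by URL)."""
    tally = Counter(url for _voter, url in votes)
    return sorted(tally, key=lambda url: (-tally[url], url))

# Hypothetical (voter_id, url) votes for a single query.
honest = [("alice", "good.example"), ("bob", "good.example"), ("carol", "good.example")]
# Five throwaway bot accounts all voting for the same spam site.
bots = [(f"bot{i}", "spam.example") for i in range(5)]

print(rank_by_votes(honest))         # ranking from the honest votes only
print(rank_by_votes(honest + bots))  # ranking once the bot votes are counted
```

With one vote per account and no further checks, the cost of taking the top slot is just the cost of creating more accounts than there are honest voters.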

Re:Easily Abused? (2, Interesting)

Kjella (173770) | more than 6 years ago | (#21878884)

it's completely at the whims of whoever created it and that's the problem.
Funny, I prefer it to be under control of someone that's in the business of making good search results rather than a bunch of wankers/trolls/bots trying to lure me to their site even though there's a hundred others that would be more relevant to my search.

Re:Easily Abused? (5, Interesting)

jrothwell97 (968062) | more than 6 years ago | (#21878624)

Point well made - while spam attacks may be pretty obvious, they could be spread out over time to make them less obvious.

Additionally, I can see this search engine being very much affected by public mood. For example, say there was a royal death and a certain right-wing 'upmarket' tabloid newspaper [dailyexpress.co.uk] decided to claim that it was a conspiracy by the Government to kill the royal off. This is linked to from said newspaper's web site, and thus people improve its ranking. Therefore it floats to the top of the results pile, thus giving it more exposure and setting off a vicious cycle.

Just a hypothetical situation, but certainly possible. Such a model would also make it possible to carry out smear attacks and to ruin the rankings of competing companies, parties, organisations, whatever - a practice that IMHO should be left to search engine admins.

Re:Easily Abused? (1)

foreverdisillusioned (763799) | more than 6 years ago | (#21878942)

To patrol something like this would require a company with resources like Google, and most likely the reason Google doesn't have such functionality.

Errr, what? If they had the resources to do it, why wouldn't they? Especially considering their overall support of open source and Wikis in particular.

Re:Easily Abused? (5, Informative)

jwales (97533) | more than 6 years ago | (#21878958)

The question of abuse is obviously one that we are taking very seriously in thinking about design issues. My belief is that the key to solving this thorny question is hinted at by the success of wikis and the wiki model: the key is to put tools in the hands of the community that allow for broad oversight and control by the community in a process of open dialogue and discussion. This is very different from approaches that allow only for atomistic participation by a "community" which is never allowed to really become a community due to excessive reliance on algorithmic voting systems and similar.

One of the first lines of defense in the early days will be use of a community (wiki) generated whitelist [wikia.com] of sites to crawl. We will want to work outward from there, but basically the first thing is for us to assess "look, what are the most important must-have sites on the net" and crawl them. One thing that the mainstream media never seems to report very well, mostly because I think they don't get why it is important, is that we are doing everything here under free licenses. The software GPL, the data we generate under free licenses, etc. The aim here is not just to create a good search engine, but to create it and *give it all away* in a way that I think has a chance to restructure the entire search industry. Well, maybe not, maybe so, but what the hell, it'll be fun to see. :-)
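
As a rough illustration of the whitelist idea described above, the sketch below filters the links a crawler has discovered down to hosts on an approved list. The host names, the `allowed` and `filter_frontier` helpers, and the host-level granularity are all assumptions made for the example; the actual Wikia whitelist format is not described here.

```python
from urllib.parse import urlparse

# Hypothetical whitelist; the real one would be maintained by the community on a wiki.
WHITELIST = {"en.wikipedia.org", "www.gnu.org"}

def allowed(url, whitelist=WHITELIST):
    """Return True if the URL's host is on the whitelist."""
    host = urlparse(url).hostname  # None when the string has no host part
    return host is not None and host in whitelist

def filter_frontier(urls, whitelist=WHITELIST):
    """Keep only the discovered links the crawler is allowed to fetch."""
    return [u for u in urls if allowed(u, whitelist)]

discovered = [
    "http://en.wikipedia.org/wiki/Knol",
    "http://linkfarm.example/buy-now",
    "http://www.gnu.org/philosophy/right-to-read.html",
]
print(filter_frontier(discovered))
```

"Working outward" would then amount to the community adding hosts to the set over time, rather than the crawler following every link it finds.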

Your track record says otherwise (4, Insightful)

onyxruby (118189) | more than 6 years ago | (#21879168)

Unfortunately for you your track record disagrees with your promises. You and your website have a history of abuse and bias that rivals that of any on the Internet. Your management incompetence of Wikipedia is so bad that you have dedicated websites documenting it. From secret mailing lists to the junior high style politics that rule your sham open organization, you are incompetent.

The thought that Jimmy Wales, cofounder of Wikipedia, could have an open site without abuse is laughable. You operate under the sham of an open community, yet exclude those outside a very narrow political agenda. You're a fraud, using open source principles as a smokescreen that presents your personal world-view set as fact to the world. I don't buy what you're selling, and I'm calling your bluff. The sad thing is that this will probably make you a fair amount of money if more people don't start to see through you.

But then the wonderful thing about leading revisionist history is you can substitute your own revisions for reality....

Mod Parent (3, Interesting)

Anonymous Coward | more than 6 years ago | (#21879480)

As trollish as parent is perhaps, he is unfortunately speaking a trollish truth.

Speaking explicitly as a reader of slashdot, with all the group-think biases a site like this introduces, wikipedia is floundering in a mire of their own arrogance, and the dissatisfaction with this needs to be heard.

Re:Mod Parent (0)

Anonymous Coward | more than 6 years ago | (#21879564)

Modded Flamebait. The next guy can mod Underrated, perhaps, so we can try and end up with a +5 Flamebait? I've never seen a post where it would be more appropriate.


Re:Your track record says otherwise (3, Interesting)

jwales (97533) | more than 6 years ago | (#21879696)

"You operate under the sham of an open community, yet exclude those outside a very narrow political agenda. Your a fraud, using open source principals as a smokescreen that presents your personal world-view set as fact to the world."

Actually, no. Wikipedia can be criticized on a lot of grounds, some of them even valid :-); but that it presents my personal world-view or that we exclude people outside a narrow political agenda is just... not grounded in fact.

Perhaps you'd like to come to my talk page at Wikipedia and tell me what you're upset about.

Re:Easily Abused? (5, Insightful)

ivan256 (17499) | more than 6 years ago | (#21879488)

Is there an intersection between the people who decide what goes on the whitelist, and what is "notable" for inclusion in Wikipedia?

I thought so. Your solution is already broken.

Re:Easily Abused? (2, Interesting)

timothy (36799) | more than 6 years ago | (#21879512)

Hey, what would you say to another Slashdot interview so you could answer more questions at greater leisure? :)

timothy

Re:Easily Abused? (2, Insightful)

jacquesm (154384) | more than 6 years ago | (#21879518)

Abuse potential is the first entry on the whiteboard when it comes to designing a new internet site these days. It's a pity, but that's the way it is.

I've been running a (small, nothing compared to what you're doing) community powered search engine for a while now (little less than one year), it's been a neat little project and I've learned a lot.

I think the combined power of having your name and wikipedia as a launchpad and quite probably the capital to see this through may give you a chance worth taking. That said I wished that you'd go back to fixing what's still broken in wikipedia and that google would fix their search, I think you'd both be in better shape then. Wikipedia gives me a strong feeling the inmates have taken over the asylum and google has some serious issues (that your effort will probably not be able to address).

best regards, & best of luck,

  Jacques Mattheij

Re:Easily Abused? (2, Insightful)

nacturation (646836) | more than 6 years ago | (#21878966)

So basically...they're asking for people to abuse the ranking system. To patrol something like this would require a company with resources like Google, and most likely the reason Google doesn't have such functionality. Just my two cents.
And when you think about it, Google's pagerank algorithm already returns search results based on what the community thinks. This new venture is simply a means to take other peoples' sweat equity and turn it into profit for good old Jimmy while giving the people who did all the work little more than warm fuzzies inside, if that.
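
For readers unfamiliar with why PageRank can be described as reflecting "what the community thinks", here is a minimal sketch of the textbook power-iteration form, in which each link acts as a vote for its target. It is a simplified illustration, not Google's production ranking; the four-page graph and the damping value of 0.85 are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    A page with no outgoing links spreads its rank evenly over all pages.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            receivers = targets if targets else pages
            share = damping * rank[p] / len(receivers)
            for t in receivers:
                new[t] += share
        rank = new
    return rank

# Hypothetical four-page web: three pages link to "hub", which links back to "a".
web = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))
```

The page that collects the most incoming links ends up with the highest score, which is the sense in which site owners' linking decisions already shape the results.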
 

First Post!!! (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#21878584)

WooHoo!!!

yeah (4, Funny)

User 956 (568564) | more than 6 years ago | (#21878588)

The Washington Post reports that Jimmy Wales, the founder of online encyclopedia Wikipedia, has announced the launch of a new open-source search engine, Wikia Search, on January 7th, 2008.

Not only that, Wikipedia is reporting that its marketshare has tripled in the last six months.

Re:yeah (1)

aldheorte (162967) | more than 6 years ago | (#21878706)

Market share of what?

Re:yeah (0)

Anonymous Coward | more than 6 years ago | (#21878976)

Elephants in Africa.

Market share of... (4, Funny)

gringer (252588) | more than 6 years ago | (#21878982)

Market share of what?
Wikiality

Re:Market share of... (3, Funny)

CCFreak2K (930973) | more than 6 years ago | (#21879144)

[[citation needed]]

I don't care how they arrive at a rank! (5, Insightful)

garcia (6573) | more than 6 years ago | (#21878596)

The idea is to challenge the established players by offering a search service that is more transparent to end users, meaning they can see how search results are arrived at. Wales has described Yahoo and Google as opaque services that don't explain how results are arrived at.

Personally, I don't care how search engines rank the websites they return as long as what is returned is proper, relevant and useful.

notability: crawled results flagged for deletion (0)

Anonymous Coward | more than 6 years ago | (#21878632)

Please note that this web-crawled item has been flagged for deletion due to violation of the following:

notability: this item is not notable
references/authority: this item is missing authoritative references

Sincerely,

I-wasn't-cool-enough-for-hall-monitor-so-I-delete-Wiki-articles-instead

Re:I don't care how they arrive at a rank! (1)

John Hasler (414242) | more than 6 years ago | (#21878818)

When they return 700,000 results it is kind of nice when the proper, relevant and useful ones are near the top.

Re:I don't care how they arrive at a rank! (1)

larry bagina (561269) | more than 6 years ago | (#21878970)

If it's anything like Wikipedia, I'll have to use google to find what I'm looking for in their search engine.

Vandalism (1)

David_Shultz (750615) | more than 6 years ago | (#21878598)

I predict significantly more vandalism and self promotion with this project than with Wikipedia. That said, I still think it's a good idea. But the article had a very low content:words ratio, so I don't really have a good idea as to how it will be implemented.

My prediction: killed by nonprofit competition (2, Interesting)

Bombula (670389) | more than 6 years ago | (#21878600)

Since this project would seem to depend on the participation and good-will of users in order to work, my guess is that a nonprofit version will follow shortly afterwards, paralleling the open-source model. I also predict that without the benefit of a massive Microsoft-esque head start, the for-profit version will be put out of business in short order.

Re:My prediction: killed by nonprofit competition (2, Insightful)

pigiron (104729) | more than 6 years ago | (#21878688)

The non-profit is still going to have to make money. Crawling the web and returning results to queries is quite hardware and energy intensive.

Re:My prediction: killed by nonprofit competition (1, Interesting)

Anonymous Coward | more than 6 years ago | (#21879010)

200-400 boxes can handle the crawl/processing/indexing for the current 'important' parts of the web, similar to current Google crawling. It's handling the query load and replicated availability anywhere near what Google does that would scale that to a few thousand boxes around the world. It's hard to tell how much storage is required for AdWords or all of Google's non-web searches and projects. I figure that is where most of their tons of capacity really goes now.

Re:My prediction: killed by nonprofit competition (1)

pigiron (104729) | more than 6 years ago | (#21879044)

Running 400 boxes 24/7 is hardly a trivial expense.

What I always wanted (1)

Oligonicella (659917) | more than 6 years ago | (#21878612)

Someone loading the dice on what I get back from a search. At least with the current crop, I can more or less figure what they're doing. With a dynamic, anything goes approach, I seriously doubt I'll be using it much.

I certainly hope... (0, Flamebait)

QuietLagoon (813062) | more than 6 years ago | (#21878630)

... they do a better job than what they did with WikiPedia.

Next challenger: Niggia (-1, Troll)

Anonymous Coward | more than 6 years ago | (#21878642)

Information about the world in a niggy format.

Biased Rankings? (1)

ocirs (1180669) | more than 6 years ago | (#21878674)

I have a feeling the general user who would edit a Wikipedia article or something like Wikia Search will be much more tech savvy and promote more tech-oriented results for keywords than those that Google will provide. This can be a good thing or a bad thing depending on what you are looking for. Google personalized search is probably the superior method since it uses the user's own searches (without external input) to determine the rankings of the search results.

BUMRAPE! (-1, Troll)

Anonymous Coward | more than 6 years ago | (#21878708)

ARGH!

I'm glad you told me (1, Funny)

Anonymous Coward | more than 6 years ago | (#21878716)

This new search is supposed to challenge Google and Yahoo.
Really? Is a search engine startup going to be competing in the same industry as other search engines?

first things first (4, Insightful)

Paktu (1103861) | more than 6 years ago | (#21878764)

It would have been nice to see them fix Wikipedia's own search engine, which IMO is absolute garbage. I have a better chance of being linked to what I'm looking for by using a general search engine.

Re:first things first (5, Insightful)

phantomlord (38815) | more than 6 years ago | (#21878870)

Search for Kobar Towers and you get 0 relevant articles. Search for Khobar Towers and you get 62 articles. Yeah, the first is a misspelling, but it's 1 letter off, nothing difficult for a spell checker to check against a dictionary of existing articles. What use is a search engine if it is so strict that I have to enter the terms exactly to get an article when I could just do that in the URL?

As long as I need to use google to search Wikipedia, I don't see Wikipedia creating a google killer.
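
The "check against a dictionary of existing articles" idea can be sketched in a few lines. This version uses the similarity ratio from Python's standard `difflib` module rather than a true edit distance, and the title list, the `suggest` helper, and the 0.8 cutoff are all assumptions for illustration, not anything Wikipedia's search is known to do.

```python
import difflib

# Hypothetical sample of existing article titles.
TITLES = ["Khobar Towers", "Kobe Bryant", "Tower of London", "Khobar"]

def suggest(query, titles=TITLES, n=3, cutoff=0.8):
    """Return existing titles that closely resemble the query, best first."""
    lowered = {t.lower(): t for t in titles}  # compare case-insensitively
    hits = difflib.get_close_matches(query.lower(), list(lowered), n=n, cutoff=cutoff)
    return [lowered[h] for h in hits]

print(suggest("Kobar Towers"))
```

A search page could fall back to `suggest` only when the exact query returns nothing, which keeps the strict behaviour for correctly spelled titles.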

Re:first things first (5, Interesting)

Odiumjunkie (926074) | more than 6 years ago | (#21879038)

I completely agree. I am continually amazed at how good google's input-correction is - if I do a search for 'pale gire', it knows to correct it to 'pale fire [wikipedia.org] ', yet if I do a search for 'canadian gire', it's clever enough to work out that I mean 'canadian tire [wikipedia.org] '. I'm also continually amazed that people running other search services haven't yet realised just how vital this feature is - it's probably one of my favourite things about Google. Less so for monosyllables, but it's useful for words like "monosyllables". I'm particularly surprised that prominent online dictionaries don't have similar functionality, seeing as I would imagine a large portion of their usage is to find the correct spelling of words.
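
One plausible way to get the context-sensitive behaviour described above is to generate candidate words one letter away from the typo and pick the one that most often follows the previous word. The sketch below does exactly that; the bigram counts are made up for the example, only single-letter substitutions are considered, and nothing here is claimed to be how Google actually does it.

```python
import string

# Hypothetical bigram counts standing in for statistics mined from real query logs.
BIGRAMS = {("pale", "fire"): 120, ("pale", "tire"): 1,
           ("canadian", "tire"): 300, ("canadian", "fire"): 4}
VOCAB = {w for pair in BIGRAMS for w in pair}

def one_substitution(word):
    """All strings that differ from word by exactly one substituted letter."""
    return {word[:i] + c + word[i + 1:]
            for i in range(len(word))
            for c in string.ascii_lowercase if c != word[i]}

def correct(previous, word):
    """Pick the known word one substitution away that best follows `previous`."""
    if word in VOCAB:
        return word  # already a known word, leave it alone
    candidates = one_substitution(word) & VOCAB
    if not candidates:
        return word  # nothing plausible found, return the input unchanged
    return max(candidates, key=lambda c: (BIGRAMS.get((previous, c), 0), c))

print(correct("pale", "gire"))
print(correct("canadian", "gire"))
```

Both "fire" and "tire" are one letter away from "gire", so the neighbouring word is what breaks the tie.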

Bingo (1, Informative)

Anonymous Coward | more than 6 years ago | (#21879072)

Exactly. I have *always* used google to find wikipedia articles. You can't beat google with a 'site:' prefix. As a matter of fact, I have the following firefox bookmark stashed under the "wik" keyword:

http://www.google.ca/search?complete=1&q=site:en.wikipedia.org+%25s [google.ca]

Which means of course, that simply typing "wik integer" into my address bar provides me with a list of wikipedia articles relating to integers. No need (or desire) for wikipedia's own search.
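
The keyword bookmark trick boils down to substituting the typed terms for the `%s` placeholder in the bookmark URL. A minimal sketch of that substitution follows; the `expand_keyword` helper is hypothetical, and the exact way Firefox encodes the terms is an assumption here (`quote_plus` is simply a reasonable stand-in).

```python
from urllib.parse import quote_plus

# The bookmark URL from the comment, written with the intended %s placeholder.
TEMPLATE = "http://www.google.ca/search?complete=1&q=site:en.wikipedia.org+%s"

def expand_keyword(template, terms):
    """Mimic a keyword bookmark: substitute the URL-encoded search terms for %s."""
    return template.replace("%s", quote_plus(terms))

print(expand_keyword(TEMPLATE, "integer"))
print(expand_keyword(TEMPLATE, "prime number"))
```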

Re:Bingo (1)

c6gunner (950153) | more than 6 years ago | (#21879470)

Exactly. I have *always* used google to find wikipedia articles. You can't beat google with a 'site:' prefix. As a matter of fact, I have the following firefox bookmark stashed under the "wik" keyword:

Slashdot broke your URL. For anyone attempting to create a similar keyword, simply replace "%25s" with "%s".
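
The breakage is ordinary percent-encoding: a literal "%" in a URL is escaped as "%25", so the placeholder "%s" came out as "%25s". A quick check with Python's standard library shows the round trip (which step of Slashdot's pipeline did the escaping is not known here):

```python
from urllib.parse import quote, unquote

print(quote("%s"))      # the percent sign is escaped as %25
print(unquote("%25s"))  # decoding restores the %s placeholder
```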

What a joke... (3, Insightful)

Evil Kerek (1196573) | more than 6 years ago | (#21878770)

This is simply his response to Google starting what amounts to competition for Wikipedia. I'm sure google is having quite the laugh from it - one wonders how much of the donations for Wikipedia are being used towards this thing.

If you think Wikipedia gets vandalized, wait until there's money involved. Wikipedia, for all its trappings, doesn't directly influence spam. But a search engine... IF, and this is a big IF, this thing becomes mainstream, having the code public will make it very easy for the bot herders to control it. The idea is simply flawed. Google is currently dealing with bot herders attempting to manipulate its page ranks - while the idea of it being open source sounds great (well, ok it doesn't to me - I don't have the love affair with open source that most slashdotters do - I've never bought into the security myth that there's GOOD coders out there with so much free time on their hands that they are walking OTHER people's code. I don't like doing that when I'm PAID to do it. Not to mention there just aren't that many good coders out there....but I digress) it's simply going to work right into the hands of the malware crowd - especially now that it's more organized crime than it is vandalism.

EK

Re:What a joke... (1)

totally bogus dude (1040246) | more than 6 years ago | (#21878908)

having the code public will make it very easy for the bot herders to control it

This assumes that it's impossible to devise a web search algorithm that can automatically and reliably avoid poisoning. I don't think there's any particular reason to believe that this is true. Humans are quite capable of identifying with a fairly high amount of accuracy what sites are just linkfarms, or useful sites which are the target of a de-ranking campaign, so it should be possible to have a computer automatically do it.

I'm not saying they will succeed, just that it isn't necessarily flawed at a fundamental level, as you seem to imply. It is a difficult problem to solve, but that doesn't mean it can't be solved.

Regardless of that disagreement, it seems the big thing they're relying on is users ranking search results, which is easily gamed if there's money to be made from doing so. So ultimately I do agree with you, but not because the code will be "open".

Re:What a joke... (1)

Evil Kerek (1196573) | more than 6 years ago | (#21879670)

Valid points - I just think it being open will make it that much easier.

In the end, however, it won't matter - as you have said, money will eventually be involved and therefore people will figure out how to manipulate it.

Re:What a joke... (5, Informative)

jwales (97533) | more than 6 years ago | (#21879028)

Again, it would be hard for this to be a response to Knol, since I announced it and have been working on it for a year. :-)

And, if you read the linked article, you would know that *zero* donations from Wikipedia have anything at all to do with this: Wikia is a completely separate organization.

Also don't make the classic mistake of thinking that "open source" automatically means "volunteer coders". It generally does not, and the classic FUD from the proprietary world fails to describe reality for precisely this reason.

And finally, one of the most important concepts here is that of a broad deep whitelist, which is something that I think can be done reliably and well with appropriate tools in the hands of the end users. The entire problem of bot-driven spam comes from a lack of reliable quantities of human oversight in the process. All you have to do to massively spam google is fool a computer. (Well, even then, google does a pretty damned good job of preventing massive spam though of course there are always some problems.) Pretty hard to get that nonsense by a properly organized community effort.

(But of course, the design of a community which can move things forward quickly without a lot of useless work is nontrivial.)

Re:What a joke... (1)

RobBebop (947356) | more than 6 years ago | (#21879428)

What I think is a joke is the advertising model. Wikipedia wouldn't be Wikipedia if there were sponsored ads. One of the first things I saw on the Wikia Search link was an ad for Netflix and a promise of the site's four Open Principles (which include Transparency).

Can I ask a question I think is important... whose pockets does the advertising revenue go into? Because my desire to support the site is incredibly contingent on the answer to that question. And my inquisitiveness about the answer is what a guy like me can do to get on the payroll (if anything) or if the revenue from this site can be a source for global charities in the starving parts of the world.

If the money filters into the pockets of a small number of top executives at Wikia, no thank you.

dmoz (0)

Anonymous Coward | more than 6 years ago | (#21879438)

Have you been influenced any by dmoz?

Re:What a joke... (1)

Evil Kerek (1196573) | more than 6 years ago | (#21879648)

And, if you read the linked article, you would know that *zero* donations from Wikipedia have anything at all to do with this: Wikia is a completely separate organization
Which is why I put one wonders - honestly I'm not interested enough in wiki-anything to follow links to it. Good enough though, point taken. It was an uninformed jibe.

Also don't make the classic mistake of thinking that "open source" automatically means "volunteer coders". It generally does not, and the classic FUD from the proprietary world fails to describe reality for precisely this reason.
I don't see that I said that anywhere, but in any case it's not what I was talking about. Let me clarify: I was referencing the idea that open source is more secure because there are all these coders out there looking over the code for security/errors. I find that idea laughable. Don't get me wrong - I have no issue with open source - it's not what I do but that's not relevant to the idea of it. I just have an issue with this particular claim about it. There are strengths to open source, but to put a blanket 'more secure' on it is just silly.

All I know is Google attracts some of the top talent around the world and they have problems keeping the spammers out of their page ranking system - and that's with the spammers having to guess how it works. So I'm just a bit skeptical that having the ranking system methodology public isn't going to help the spammers, no matter what the system.

I understand the idea of a white list... but that opens up other issues. Now there's a select bunch of people determining page rank - there are other posts concerning how this will sway the results, so I'll leave that to those posters. Who gets to determine who's on the white list? It brings to mind the recent article where everyone got fired up over the edits made by the army guy. So many things are a matter of point of view - what happens when someone puts in a search for George W. Bush? Did you let the liberal minded people control the search results? See what I mean?

It will be a fascinating thing to watch - we'll see what articles come up on slashdot about it.

Thanks for the reply,

EK

All this said, more choices are always welcome.

I can see... (3, Funny)

Idiot with a gun (1081749) | more than 6 years ago | (#21878810)

by our tags, that we have a few Wikipedian Protestors [xkcd.com] in our midst.

Just in Time for the Election (1)

STrinity (723872) | more than 6 years ago | (#21878820)

All right! Googlebombing is time consuming and an organizational nightmare. This will simplify everything. Karl Rove

There's more to life than Google, Yahoo, MSN (0)

Anonymous Coward | more than 6 years ago | (#21878842)

Search today is nothing but "top lists" for your keywords. If we look at the major search engines (Google, Yahoo, MSN, Ask.com), what is search today anyway? It is the top results of the web's average searches for a given query. It has little to do with you as the individual, it isn't in a natural language (something we forgot exists), and it is old. We have recently released Assista.com, which is a different kind of search (I define it as "inquisitive search"). While we don't think it is the only solution out there, it is different, fresh, unique, and for the advance learner, it gives an unfair advantage on learning, education.
Wikia, while it may not be perfect, is a noble idea. When Jimmy Wales was asked why he is getting into search, he replied "Because it sounds fun" (or something like that). The bottom line is that search is just beginning and we are a long way from perfection. When you look back at Google/Yahoo/MSN 20 years from now you will not believe we have settled for that.
Cheers
Sahar Sarid
http://www.assista.com/ [assista.com]
http://www.conceptualist.com/ [conceptualist.com] (blog)

Re:There's more to life than Google, Yahoo, MSN (0)

Anonymous Coward | more than 6 years ago | (#21879460)

I just finished watching a Science channel piece on Galaxies, so I was curious what your search engine might offer. Using "Milky Way" as my keywords, these are the first results:

WHERE IS THE MILKY WAY? (15 results)
Who is find the milky tits? (3 results)
What is a milky performance? (3 results) ...

While we don't think it is the only solution out there, it is different, fresh, unique, and for the advance learner, it gives an unfair advantage on learning, education.
I don't think these words mean what you think they mean. Or maybe, "advanced learning" is a lot like what my Health Education teacher, Miss Juniper, offered me back in 10th grade as "extra credit" after class?

Search Engine based in Wiki? (0)

Anonymous Coward | more than 6 years ago | (#21878862)

I guess it will return all of the left wing Bush haters sites first.

Re:Search Engine based in Wiki? (1)

roguetrick (1147853) | more than 6 years ago | (#21878932)

Yes, the internet has a clear left wing bias. I'm no fan.

Re:Search Engine based in Wiki? (1)

ArikTheRed (865776) | more than 6 years ago | (#21879358)

Interesting that the internet is also populated by the more educated and affluent of our society. Makes you think...

Re:Search Engine based in Wiki? (2, Funny)

Donniedarkness (895066) | more than 6 years ago | (#21879524)

"Interesting that the internet is also populated by the more educated and affluent of our society."

And Furries.

Fix the wikipedia search! (1)

Bootle (816136) | more than 6 years ago | (#21878920)

Honestly, it's crap. It's like using infoseek back in 1995!

Google search for Wikia (1)

Broken Toys (1198853) | more than 6 years ago | (#21878962)

"Results 1 - 10 of about 1,190,000 for wikia. (0.04 seconds)"

Someone had to do it ;-)

What's gonna happen to Wikia? (1)

tlhIngan (30335) | more than 6 years ago | (#21879260)

There's a site that hosts Wikis called Wikia [wikia.com] as well. (Yes, it's owned by Jimmy Wales, as well).

So what's going to happen to those Wikis now that Wikia is turning from a MediaWiki host site to a search site?

Re:What's gonna happen to Wikia? (1)

Aluvus (691449) | more than 6 years ago | (#21879634)

Wikia is supposed to become both a wiki host and a search engine.

What a great idea! (1)

telso (924323) | more than 6 years ago | (#21879062)

But building a search engine is a little ambitious, even for the co-founder of Wikipedia. Maybe he should start off small, like searching one website. I even have the perfect one to start off with: its search feature is so bad that if your search is off by one letter, you have a good chance of not finding what you're looking for. Maybe you've heard of it [wikipedia.org].

Hope it works better than wikipedia's search (4, Insightful)

SoCalChris (573049) | more than 6 years ago | (#21879102)

Shouldn't they work on getting wikipedia's search to work half way decently before they try to compete with Google?

Don't get me wrong, I like wikipedia, but their search on the site is next to worthless.

Re:Hope it works better than wikipedia's search (1)

STrinity (723872) | more than 6 years ago | (#21879704)

Yes, anytime I want to find an article on Wikipedia, I Google for it.

Well, good news (1)

helpfulcorn (668048) | more than 6 years ago | (#21879118)

We know how the search engine will work, if anyone has ever used Wikipedia's search function, you're almost destined not to find what you're looking for, especially if you're one letter off. ;]

maybe this is just me (0)

Anonymous Coward | more than 6 years ago | (#21879148)

But this sounds a lot like, "Y'all write the code, I'll take the credit, do the magazine interviews and get all of the money. But you guys can have some fun with it too, debugging and doing localization and that kind of stuff."

Call it the Marc Fleury path to fame and fortune.

the worst idea ever (0)

ILuvRamen (1026668) | more than 6 years ago | (#21879182)

If it's an "open source" search engine and anyone can go and read the source of how it operates, everyone will know the secret to rigging their pages so that they show up in the top results. Google is extremely secretive about their algorithms and that's why there are relatively few rigged crap links.

Re:the worst idea ever (1)

AySz88 (1151141) | more than 6 years ago | (#21879412)

If it's an "open source" search engine and anyone can go and read the source of how it operates, everyone will know the secret to rigging their pages...

...but there's a big difference between "knowing the secret" and actually being able to break it. The "secret" to breaking RSA is factoring really big numbers, but you can't actually do that.
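The RSA comparison can be made concrete with a toy example. What follows is a sketch with deliberately tiny textbook primes (61 and 53), chosen here purely for illustration; real keys use primes hundreds of digits long, and the helper names `encrypt`, `decrypt`, and `factor` are made up for this example. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# Toy RSA: the algorithm and the public key (n, e) are fully open,
# yet decrypting requires d, which requires knowing the factors of n.
p, q = 61, 53                # secret primes (tiny, for illustration only)
n = p * q                    # public modulus: 3233
e = 17                       # public exponent
phi = (p - 1) * (q - 1)      # 3120, computable only if p and q are known
d = pow(e, -1, phi)          # private exponent (modular inverse of e)


def encrypt(m):
    """Encrypt an integer message with the public key."""
    return pow(m, e, n)


def decrypt(c):
    """Decrypt with the private exponent."""
    return pow(c, d, n)


def factor(n):
    """Trial division: instant for 3233, infeasible for a 2048-bit modulus."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    return n, 1
```

Here `factor(n)` recovers the primes immediately because `n` is tiny. The attack is public knowledge either way; what protects a real key is that the same loop would not finish in any useful amount of time.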

It sounds like the "secret" to breaking this new system, like Wikipedia, would be to overwhelm the community that is guarding the data. We know that Wikipedia is working fine (for the most part), but things get a bit more complicated with search. Wikipedia, at least, knows when every single edit occurs. But with a whitelist or "reputation" list of URLs, there's no notification when domains (or subdomains or such) change hands (I think?), and re-vetting too often is probably untenable. And you don't really want results based only upon how reliably a URL stays on the whitelist; people might want relevance based on the most recent data right now, sometimes even if it might disappear behind a nasty registration/subscription barrier in a week.

But we'll see whether Wales is onto something good here, I guess. :)

citation needed (1)

notnAP (846325) | more than 6 years ago | (#21879306)

The story lacks a link...

...hold on while I search for it on Google.

Re:citation needed (0)

Anonymous Coward | more than 6 years ago | (#21879540)

  • Your comment does not cite any references or sources.
  • Your comment may contain original research or unverified claims.
  • Your comment may require cleanup to meet slashdot's quality standards.
  • The tone or style of your comment may not be appropriate for slashdot.
  • The creator of your comment, or someone who has substantially contributed to it, may have a conflict of interest regarding its subject matter.

Sooo.... (1)

ArikTheRed (865776) | more than 6 years ago | (#21879318)

How is this different from Mahalo [mahalo.com], the MediaWiki-powered search engine that has been up for almost a year?

Re:Sooo.... (4, Informative)

jwales (97533) | more than 6 years ago | (#21879736)

Completely different. :) For one thing, we are doing everything completely freely licensed. Mahalo is proprietary.

For another thing, Mahalo is "human edited" search results for the top queries, which is not a bad idea of course, but it is not intended to be a full search engine. Mahalo have indicated an interest in replacing their google search backup with our open source alternative, if we get to be good enough, which is obviously a far from foregone conclusion.

Wikia, the place to go for furry fan fiction (3, Insightful)

Animats (122034) | more than 6 years ago | (#21879326)

Wikia has been something of a dud. What Wikia really does is monetize fancruft. Their big wikis are for Star [Trek|Wars|Gate|Craft], Everquest, Marvel comics, Yu-Gi-Oh, and similar subjects. They're the resting place for fan articles thrown out of Wikipedia. [wikia.com]

Wikia's search engine, based on the user demographic they have now, is going to have great coverage of furry fan fiction. [wikia.com]

There's already a good manually-updated search engine. It's called Open Directory [dmoz.org]. It's quite useful as a data source for answering the question "what is this web site about?" It tends to run months behind changes to the web, since it's manually updated. While not many people query DMOZ manually, it's used by Yahoo, Google, etc. to get some basic information about a web site.

As an example of how great Wikia search is going to be, Wales suggested searching for "Tampa hotels". [techcrunch.com] The major search engines return too many bottom-feeder reseller and directory sites for searches like that. As I point out occasionally, we've already solved that problem over at SiteTruth [sitetruth.com] , which looks for business legitimacy. Type in "Tampa hotels" there and watch it push the marginal sites to the bottom of the search results. We have that one handled.

Wikipedia works because people are willing to do substantial work for free for a non-profit organization. That doesn't work for a commercial business. You can get people to write about themselves (Myspace, Facebook, etc.) but beyond that, "crowdsourcing" doesn't go very far.

Re:SiteTruth? Well... (1)

gondwannabe (1028488) | more than 6 years ago | (#21879486)

...the documented "tests" of site legitimacy at your site are so lame and obvious they hardly bear discussion.

Example - you don't recognize our site as having a valid business address because it's embedded in a table - not best practice HTML, granted, but hardly obscure AND you only recognise certification from the BBB - which is hardly a sterling endorsement of business legitimacy.

Sorry to go on the attack, but I don't think you can fairly claim to have solved this problem at any level that this crowd would accept as meaningful. The web is not headquartered in Podunk and it doesn't meet for lunch at the Rotary Club.

Funny thing is (1)

WindBourne (631190) | more than 6 years ago | (#21879336)

that if this succeeds and can race past MSN in terms of popularity, it will show the world that MS's gripe about Google truly was worthless. Of course, MS will use that to tell our congressmen that the glass is half empty, not half full.

Why does everything have to be communist (-1, Flamebait)

Anonymous Coward | more than 6 years ago | (#21879454)

Open Sores? Everyone who loves communist open sores, including Fucktard Taco, BrokebackNeil, TwitterTheTroll, Communist Zonk, and every other fucktarded shitdot sheeple should go and slit their fucking wrists.

GO AHEAD FUCKING FLAME AWAY OR WASTE YOUR GODDAMNED MODPOINTS FUCKTARDED SHITDOT SHEEPLE!

Re:Why does everything have to be communist (1)

timmarhy (659436) | more than 6 years ago | (#21879484)

sheeple? your lisp is cute.

The problem here is... (2, Interesting)

Stan Vassilev (939229) | more than 6 years ago | (#21879494)

Wikipedia receives most of its traffic from its articles appearing in Google's search results, Wikipedia being relevant content, and Google being the top search engine.

How is Wikipedia to draw traffic to their search engine? Obviously not via Google, as search engines are content free on their own. Integrating it with Wikipedia? But again, Wikipedia is the end target, not a start point, so how could this work.

I don't think Wikipedia has the strategy or money for this to reach critical mass and show its potential, but it'll be interesting as an experiment.

Google vs. Wikia (1)

listen_to_blogs (1210278) | more than 6 years ago | (#21879516)

Finally, a new search engine to compete against Google. Looking forward to it. This kind of explains why Google came up with Knol. listen_to_slashdot [blogbard.com]

It's called relevance feedback (1)

Plutonite (999141) | more than 6 years ago | (#21879544)

and it's pretty useful. I don't know why everyone is complaining in the comments so far. This user-participates-in-ranking magic is not exactly news, and anyone who has studied or worked in information retrieval knows this. With a large enough number of benign participants, it should work.
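As a rough illustration of users participating in ranking, here is a minimal sketch that reorders an engine's results by summed user votes, optionally trusting some users more than others. It is a simple vote tally, not a full relevance feedback algorithm such as Rocchio, and the function name `rerank`, the vote format, and the weights are assumptions made up for this example.

```python
from collections import defaultdict


def rerank(results, votes, user_weight=None):
    """Reorder results by weighted user votes.

    results: list of URLs in the engine's original order.
    votes: iterable of (user, url, vote) tuples, vote being +1 or -1.
    user_weight: optional dict mapping a user to a trust weight (default 1.0).
    """
    user_weight = user_weight or {}
    score = defaultdict(float)
    for user, url, vote in votes:
        score[url] += vote * user_weight.get(user, 1.0)
    # sorted() is stable, so results with equal scores keep the engine's order.
    return sorted(results, key=lambda url: -score[url])
```

With enough benign voters, a handful of bad votes barely moves a score. The weakness people raise elsewhere in the thread shows up in the same place: whoever controls many accounts, or a heavily weighted one, controls the sum.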

And since people are bringing up Google as competition: Google Search has an estimated retrieval accuracy* of around 10%. Not very hard to beat, except that the Internet is a rather large document set. Have you ever browsed to the 50th page of results on Google? Good. Don't.

The problem is that to give decent results an engine needs time, and people are just not prepared to wait. That's why general purpose search engines on the web try to give you the best answer on the top hit. Results deteriorate a little (next 10) then improve again (next 20) then go completely nuts as you proceed. This fits the business plan, and almost everyone is happy. Google may have superb query processing and a decent Index system, but retrieval can be made to improve a lot if, say, there is an option to wait a little and get something better. Maybe Wikia can do this. If the users who get the most "insightful" (ergo time consuming) results get their feedback weighed more heavily than the point-and-click folks, this project can be very interesting.

*accuracy is a complicated metric that involves efficiency (fraction of retrieved that is relevant) and recall (fraction of relevant that is retrieved).
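The two fractions in that footnote are easy to compute once you know which documents are relevant. The sketch below uses the more common name "precision" for what the footnote calls efficiency; the function name and the sample document IDs are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision: fraction of retrieved documents that are relevant.
    recall: fraction of relevant documents that were retrieved.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if an engine returns documents a, b, c, and d while the relevant ones are a, b, and e, precision is 2/4 and recall is 2/3. On the open web the set of relevant documents is not actually known, which is why any single "accuracy" figure for a search engine is an estimate.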

Scary Implications (3, Funny)

Doomstalk (629173) | more than 6 years ago | (#21879558)

I foresee someone hacking this system to return goatse as the #1 result for every search made.