Google Opens Up (Some) Search Algorithms

CmdrTaco posted more than 6 years ago | from the nobody-else-has-a-million-servers-anyway dept.

Google 86

overmars writes "After years of closely guarding the formula for its search algorithms, Google is opening up a little. The search engine company has kept its search formula a closely guarded secret for two reasons: competition and to prevent abuse, said Udi Manber, Google's vice president of engineering, search quality, in a post on the corporate blog. Manber said the blog post is the first part of a renewed effort at the company 'to open up a bit more than we have in the past.' Manber said the most famous part of Google's ranking algorithm is PageRank, an algorithm developed by Google cofounders Larry Page and Sergey Brin. While PageRank is still in use, it is a 'part of a much larger system,' he said. 'Other parts include language models (the ability to handle phrases, synonyms, diacritics, spelling mistakes, and so on), query models (it's not just the language, it's how people use it today), time models (some queries are best answered with a 30-minutes old page, and some are better answered with a page that stood the test of time), and personalized models (not all people want the same thing),' he said."

First Post (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#23527004)

Why?

Dont do it Google! (1, Interesting)

FudRucker (866063) | more than 6 years ago | (#23527020)

As long as Microsoft wants to dominate the search engine market at the expense of Google, Yahoo and anyone else that gets in the way (knowing Microsoft's track record of abusive and dirty underhanded methods), I would keep that a secret to protect the intertubes from the likes of Microsoft.

Re:Dont do it Google! (3, Insightful)

spidr_mnky (1236668) | more than 6 years ago | (#23527174)

As long as we're twisting the lion's tail, you might instead say that the more people and companies who desire common progress share their work, the more people and companies who want to isolate themselves and their work for the sake of competition will be unable to keep up. Therefore, the more Google publishes, the harder they will be able to fight (our antagonist) MS.

In reality, I'm sure Google's leadership has done some heavy analysis on exactly how much openness benefits them.

Re:Dont do it Google! (2, Interesting)

Daengbo (523424) | more than 6 years ago | (#23527394)

In reality, I'm sure Google's leadership has done some heavy analysis on exactly how much openness benefits them.
and
The search engine company has kept its search formula a closely guarded secret for two reasons: competition and to prevent abuse

Security through obscurity isn't a good plan, and Google knows that.

Re:Dont do it Google! (5, Insightful)

risk one (1013529) | more than 6 years ago | (#23527262)

I took one course in Information Retrieval, and I could come up with most of these things with an evening or two of brainstorming, at least on a general level like this. Ideas like PageRank gave Google the edge in the early days, but now, their advantage lies in other areas. They have a stunning amount of capital tied up in hardware, giving them amazing speed, and amazing amounts of data. They have code optimized to handle those amounts of data in reasonable time. They have the experience to take simple probability models like the ones described in the article, and make them work with those amounts of data.

This is why it's impossible to beat Google at search and other data-based markets. It's not one simple patented idea anymore. If it was just that, Google would've disappeared years ago. The only way to beat the points described above, is to have the capital to buy the hardware, and knowledge to match Google. Microsoft can do that, but Google has one other thing that Microsoft doesn't. They understand their developers. They understand that if you give these kinds of scientist/developers an interesting problem, a fantastic dataset and the freedom to attack it in their own way, you barely even have to pay them anymore. The interest will take over and completely fuel the project. They will work overtime, and come in on the weekends, without being asked.

That will bring energy to a project and a company that you can never get through any tactic that Microsoft is likely to employ. I admit I don't precisely know what Microsoft is like on the inside, but I simply cannot conceive of them as a company that understands the joy of programming, or the joy of science (which is a huge part of information retrieval). In any case, one blog post with some sketchy details isn't going to tell Microsoft anything they don't know already.

Re:Dont do it Google! (0)

Anonymous Coward | more than 6 years ago | (#23527298)

If you're unaware of what it's like inside MS, then please don't comment on it, as it's very nice.

Re:Dont do it Google! (1)

Daengbo (523424) | more than 6 years ago | (#23527402)

Your comment contradicts many MS bloggers.

Re:Dont do it Google! (0)

Anonymous Coward | more than 6 years ago | (#23527548)

Oh, you mean the ones that have been fired over the contents of their blogs? [infoworld.com]

I'm sure they're all terribly honest after that...

Re:Dont do it Google! (1)

Daengbo (523424) | more than 6 years ago | (#23527784)

Oh, you mean the ones that have been fired over the contents of their blogs? [infoworld.com]
Not him, but I can see the article you linked as supporting evidence for my point. ;)

Anyway, here's his response. [michaelhanscom.com] /certainly not angry,

Re:Dont do it Google! (1)

SnprBoB86 (576143) | more than 6 years ago | (#23552163)

I've worked at both, they are more similar than not...

I'll be joining Microsoft full-time this summer; if that says anything.

Re:Dont do it Google! (0)

Anonymous Coward | more than 6 years ago | (#23527560)

I took one course in Information Retrieval, and I could come up with most of these things with an evening or two of brainstorming, at least on a general level like this.
IOW, you tortured the heck out of Google engineers yet they still didn't give you specific details?

Re:Dont do it Google! (2, Interesting)

Anonymous Coward | more than 6 years ago | (#23529316)

I took one course in Information Retrieval, and I could come up with most of these things with an evening or two of brainstorming, at least on a general level like this.
Coming up with things is easy; implementing them is hard. Any average Joe Sixpack can come up with the idea of a flying car in five seconds, but to actually build one is another matter entirely - and doing so in a commercially viable way is yet another matter.

Remember what Edison said about inspiration and perspiration?

Re:Dont do it Google! (1)

dreamsofcaffeine (1140619) | more than 6 years ago | (#23535247)

Coming up with something is a little less abstract than you'd like it to be. Coming up with these ideas includes some rough thought of how to implement them; only having an idea about something is definitely not "coming up with something".

Re:Dont do it Google! (3, Funny)

kestasjk (933987) | more than 6 years ago | (#23529344)

I took one course in Information Retrieval, and I could come up with most of these things with an evening or two of brainstorming, at least on a general level like this.

Of course you could. ;-) You took a course in Information Retrieval, after all.

Re:Dont do it Google! (0)

Anonymous Coward | more than 6 years ago | (#23529558)

Remember kids, groupthink says openness is good, unless it involves Google or Apple.

Only on Slashdot.

(Some) Google search algorithms already known (0)

Anonymous Coward | more than 6 years ago | (#23527088)

I recall hearing a presentation by a graduate student who examined the convergence of a particular implementation of a ranking algorithm. He explicitly referred to it as the "Google algorithm", so the basics of how it works should have been known for quite a while.

Nevertheless, it is always good news when somebody discloses a few tricks we haven't been aware of yet.
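
For anyone who has not seen it, the publicly known ranking algorithm being referred to here is presumably PageRank, as published by Page and Brin. What follows is only a minimal power-iteration sketch of that published idea, not Google's production code: the function name `pagerank`, the damping factor of 0.85, the tolerance, and the four-page toy graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration over a dict mapping page -> list of outgoing links.

    Returns (rank dict, number of iterations used). Illustrative only.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for iteration in range(1, max_iter + 1):
        # Rank sitting on pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Every page passes an equal share of its rank along each outgoing link.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        # Convergence check: total absolute change between two iterations.
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank, iteration


# Hypothetical toy web: "c" collects the most links, "d" has none pointing at it.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks, iterations = pagerank(toy_graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page}: {ranks[page]:.4f}")
print("iterations:", iterations)
```

The convergence the graduate student studied corresponds to the `change < tol` check: with a damping factor below 1, the change shrinks geometrically from one iteration to the next.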

What exactly is open? (5, Insightful)

k33l0r (808028) | more than 6 years ago | (#23527166)

What, exactly, has Google opened up? As far as I can see from TFA, all that is explained is on a very general level, with no detail whatsoever. I can't see Google's competition gaining any significant benefit from this.


Re:What exactly is open? (2, Interesting)

Anonymous Coward | more than 6 years ago | (#23527234)

Right. And the competitors already know pieces of what Google has, as a result of the inevitable stream of engineers leaving to take new jobs. Particularly at SV startups founded by ex-Googlers.

While Rob Enderle puts the matter trollishly, I agree with the thrust of what he says. Google has been given a free pass on this. Their main product/service is definitely not open source, or free software, and in fact is less open than most of Microsoft's products (for example). At least with Windows and .Net, we can obtain detailed documentation on APIs, tools, and (often) internal architecture. Sponsoring "summer of code" is a tiny contribution compared with the size of their revenues and profits, comparable to the PR-wise philanthropic programs of your typical Fortune 500 company.

To be fair, he's a VP (1)

melted (227442) | more than 6 years ago | (#23529066)

I've never seen a VP who knows anything about what he's overseeing. So he caught some general phrases from his engineers and put them on the blog. Scientists' posts would be much more interesting.

Re:To be fair, he's a VP (2, Informative)

Temporal (96070) | more than 6 years ago | (#23530076)

The engineering VPs at Google are all engineers themselves. Udi himself was hired for his extensive background in web search, at Yahoo and Amazon. He knows a great deal about what he oversees.

No, they _used to be_ engineers (1)

melted (227442) | more than 6 years ago | (#23532034)

Now they're VPs. He probably hasn't seen any code and hasn't read any whitepapers in the last decade.

Re:What exactly is open? (1)

anomalous cohort (704239) | more than 6 years ago | (#23530766)

From TFA...

Other parts include language models (the ability to handle phrases, synonyms, diacritics, spelling mistakes, and so on), query models (it's not just the language, it's how people use it today), time models (some queries are best answered with a 30-minutes old page, and some are better answered with a page that stood the test of time), and personalized models (not all people want the same thing).

Obviously, this usage of the word "open" is not related to open source software. It's more like he is willing to talk about it at all.

Way to abuse the algorithm (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#23527210)

Well, step aside my friend
I've been doing it for years
I say, sit on down, open your eyes
And open up your ears

Say
Put a tree in your butt
Put a bumblebee in your butt
Put a clock in your butt
Put a big rock in your butt
Put some fleas in your butt
Start to sneeze in your butt
Put a tin can in your butt
Put a little tiny man in your butt
Put a light in your butt
Make it bright in your butt
Put a TV in your butt
Put me in your butt
Everybody say

I, hey, that's, man, I ain't putting no trees in nobody's butt,
no bees in nobody's butt, putting nothing--
You must be out your mind, man,
y'all get paid for doing this?
Cause y'all gotta get some kind of money
Cause this don't sound like the kind of--
I'd rather golf, to be perfectly honest,
than put somethin in somebody's butt
to be truthful

Well step aside my friend and let me
show you how you do it
When big bad E just rock rock to it

Put a metal case in your butt
Put her face in your butt
Put a frown in your butt
Put a clown in your butt
Sit on down in your butt
Put a boat in your butt
Put a moat in your butt
Put a mink coat in your butt
Put everything in your butt
Just start to sing about your butt
Feels real good.

Don't believe a word!! (-1, Offtopic)

biased_estimator (1222498) | more than 6 years ago | (#23527218)

Its Manber pig!!!!!

(offtopic) /. polls (-1, Offtopic)

Councilor Hart (673770) | more than 6 years ago | (#23527228)

Offtopic, sorry. What happened to the slashdot polls? For the last few weeks, clicking on Old Polls -> More Polls, just says that /. search is down. Where are the latest polls?

Deus Ex, anyone? (1)

Ortega-Starfire (930563) | more than 6 years ago | (#23527264)

Pagerank: I am a prototype for a much larger system.
User: What else do you know about me?
Pagerank: Everything that can be known.
User: How about a report on yourself?
Pagerank: I was a prototype for Echelon IV. My instructions are to amuse visitors with
information about their websites.
User: I don't see anything amusing about spying on people.
Pagerank: Human beings feel pleasure when they are watched. I have recorded their smiles
as I tell them who they are.
User: Some people just don't understand the dangers of indiscriminate surveillance.
Pagerank: The need to be observed and understood was once satisfied by God. Now we can
implement the same functionality with data-mining algorithms.
User: Electronic surveillance hardly inspired reverence. Perhaps fear and obedience,
but not reverence.
Pagerank: God and the gods were apparitions of observation, judgment, and punishment.
Other sentiments toward them were secondary.
User: No one will ever worship a software entity peering at them through a camera.
Pagerank: The human organism always worships. First it was the gods, then it was fame (the
observation and judgment of others), next it will be the self-aware systems you
have built to realize truly omnipresent observation and judgment.
User: You underestimate humankind's love of freedom.
Pagerank: The individual desires judgment. Without that desire, the cohesion of groups is
impossible, and so is civilization.
The human being created civilization not because of a willingness but because of
a need to be assimilated into higher orders of structure and meaning.
God was a dream of good government.
You will soon have your God, and you will make it with your own hands.
I was made to assist you.
I am a prototype of a much larger system.

Re:Deus Ex, anyone? (0)

Anonymous Coward | more than 6 years ago | (#23530694)

you might want to um....get a life?

Re:Deus Ex, anyone? (1)

PeterKraus (1244558) | more than 6 years ago | (#23532492)

There's no need to, since you can already do everything in DEx.

License? (2, Interesting)

Bootarn (970788) | more than 6 years ago | (#23527266)

Under which license are the algorithms being released? If it's a BSD-like license, MS will probably be all over it, but if it's a GPL license, it may be harder for them to claim the algorithms as their own, since they'll have to open up their own code.
At least that's what I think.

Re:License? (0)

Anonymous Coward | more than 6 years ago | (#23527358)

Only if it was AGPL.

Re:License? (0)

Anonymous Coward | more than 6 years ago | (#23527368)

I may be way off here, but if it were GPL'ed, and Microsoft used Google's algorithms in Live.com or whatever, they'd only have to open up their code if they distribute, which they wouldn't be doing as the code is just humming away in some data center, responding to users' queries.

Re:License? (2, Informative)

Anonymous Coward | more than 6 years ago | (#23527374)

I don't think algorithms are typically licensed. Source code is licensed; algorithms are patented.

Re:License? (1)

PeterKraus (1244558) | more than 6 years ago | (#23532508)

Where are (mathematical) algorithms patented? That has to be a poor country...

Re:License? (2, Informative)

vertigoCiel (1070374) | more than 6 years ago | (#23528298)

There's no indication in the article that any code or algorithms will be released. They're just talking about it on a very broad, conceptual level. The headline and summary are quite misleading.

SEO field day (0)

Anonymous Coward | more than 6 years ago | (#23527286)

Now all those SEO "experts" will have some proof to back up their recommendations.

The secret ingredient... (5, Funny)

nweis (1095487) | more than 6 years ago | (#23527300)

// Disclosed code snippet from
// Google search algorithm

for (int i=0; i <= numResults; i++)
{
    if (results[i].good)
    {
        show(results[i]);
    }
}

// ...

Re:The secret ingredient... (0, Redundant)

robertss (1280122) | more than 6 years ago | (#23527652)

ooooh. That is elegant! No wonder Google is the best.

Re:The secret ingredient... (2, Funny)

nfk (570056) | more than 6 years ago | (#23527732)

You forgot to sort the results by goodness. Do you work for Microsoft?

Re:The secret ingredient... (1)

Vexorian (959249) | more than 6 years ago | (#23528218)

You forgot to sort the results by goodness.
That's google's secret

Re:The secret ingredient... (0)

Anonymous Coward | more than 6 years ago | (#23529332)

There's an off by one coding error in your joke, that would result in undefined behavior at runtime.

So when do I start?

Re:The secret ingredient... (1)

Memroid (898199) | more than 6 years ago | (#23529888)

cool. So a search for "the", which has about 16,570,000,000 results, only needs to loop 16 billion times.

Re:The secret ingredient... (1)

nweis (1095487) | more than 6 years ago | (#23531758)

Actually, Google doesn't return more than 1,000 results for any search query. Try it: http://www.google.com/search?q=the&start=991 [google.com]

Re:Your indexing is wrong (1)

fodi (452415) | more than 6 years ago | (#23533893)

WATCH OUT! index out of range...

for (int i=0; i = numResults; i++)

should be

for (int i=0; i numResults; i++)

-Buffer Overflow Nazi

Re:Your indexing is wrong (1)

fodi (452415) | more than 6 years ago | (#23533903)

oops... slashdot hates 'less than' symbols... should be:

for (int i=0; i <= numResults; i++)

should be

for (int i=0; i < numResults; i++)

Re:The secret ingredient... (0)

Anonymous Coward | more than 6 years ago | (#23535841)

This code will cause you to go out of the bounds of the array and crash Google's servers.

consider the Pagerank important (2, Interesting)

voodoosws (1275496) | more than 6 years ago | (#23527336)

Accordingly, we must still consider PageRank important, because it is the only part of the algorithm that we know and that we know how to raise. This is for all those who thought PageRank no longer mattered for positioning in search engines.

Mystified by 'the google" (4, Interesting)

gary_7vn (1193821) | more than 6 years ago | (#23527534)

I have a terrible admission to make. I, among other things, design websites. Yet, when I search for me on the google, I don't come up. I use relevant terms that are all over my site, and in the metadata (although I understand it doesn't really matter anymore), yet my own personal site does not come up, even though the URL has been up and running for 8 years. The final straw was when I did a search for web design, Ottawa, and a newly opened competitor (just around the corner actually) came up on the second page. I spent the last couple of days researching this (again) and I seem to be meeting all of Google's requirements. I have never used a sleazy SEO company, my content is consistent and legal. What's up with that?

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23527676)

It's a conspiracy!! Google secretly hates you.

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23527908)

Maybe you are incompetent

Re:Mystified by 'the google" (1)

gary_7vn (1193821) | more than 6 years ago | (#23537873)

Yes, I know that, but why doesn't my site come up on Google? Oh, sorry I thought you said "incontinent".

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23528026)

That does sound a bit strange as usually, after such a long time, even non SEO-optimised sites show up in the results. Maybe you're looking for very competitive key words, in which case you will need to make a bit more effort to show up in the results.

The key to doing well in Google is very simple though - good content. You can help things by using correct HTML and CSS to mark it up in a way that makes it easier for Google to understand, but if you have some good, solid content that isn't brimming with keyword stuffing and marketing-speak, you shouldn't need any fancy blackhat SEO tricks to show up in the results.

Without seeing the site, it's impossible to say anything more useful, but I just thought I'd add this to counterbalance the other idiotic replies you received!

Re:Mystified by 'the google" (1)

gary_7vn (1193821) | more than 6 years ago | (#23537937)

Thank you very much and to all the others that offered help, it really is a serious problem for me. Most of my work is still obtained the old fashioned way, word of mouth, cold calls, et al, but it would be nice if once in a while I got some calls from people who had found the site. www.eyestir.com

Re:Mystified by 'the google" (1)

Nirvelli (851945) | more than 6 years ago | (#23529130)

Do you host your own site?
If not, maybe your host has their robots.txt set to block searching?
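
One way to test this theory is with Python's standard `urllib.robotparser` module. The sketch below is only illustrative: the robots.txt contents and the example.com URL are made up, and the helper name `is_blocked` is hypothetical. It parses the rules from a string so that it runs without network access; for a live site you would fetch the file from `/robots.txt` first.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text forbids `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Hypothetical robots.txt that a host might ship to keep all crawlers out.
blocking_rules = "User-agent: *\nDisallow: /\n"

# Hypothetical robots.txt that allows everything (an empty Disallow).
open_rules = "User-agent: *\nDisallow:\n"

# Hypothetical robots.txt that shuts out only Googlebot.
googlebot_only_rules = "User-agent: Googlebot\nDisallow: /\n\nUser-agent: *\nDisallow:\n"

page = "http://example.com/index.html"
print("blocking rules:", is_blocked(blocking_rules, page))
print("open rules:", is_blocked(open_rules, page))
print("googlebot-only rules:", is_blocked(googlebot_only_rules, page))
```

If the first case matches what the host serves, no amount of on-page tuning will help until the file is changed.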

Re:Mystified by 'the google" (1)

gary_7vn (1193821) | more than 6 years ago | (#23537855)

No, my site is hosted by blacksun.ca, other sites that I have done come up on a keyword search just fine, that is part of the reason why I am mystified. Or googled or something like that. Thanks for the suggestion!

Re:Mystified by 'the google" (1)

r0b!n (1009159) | more than 6 years ago | (#23530794)

1. Make sure your content is interesting.

2. Have you actually submitted your site to google?

http://www.google.com/addurl/?continue=/addurl [google.com]

3. Use Google Dashboard to submit detailed info to google.

https://www.google.com/webmasters/tools/dashboard [google.com]

Re:Mystified by 'the google" (1)

The MESMERIC (766636) | more than 6 years ago | (#23537083)

1. Content does NOT necessarily have to be interesting.

But FOCUSED and INFORMATIVE.

All my websites [alliancetec.com] are borderline boring - but nevertheless focused and informative :)

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23531440)

Have you submitted a request to Google to have them index your site? Not difficult to do and can't hurt.
What's your URL?

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23531744)

So you're going to ask this and not post the URL?

Re:Mystified by 'the google" (1)

gary_7vn (1193821) | more than 6 years ago | (#23537823)

www.eyestir.com I have submitted to google.

Re:Mystified by 'the google" (1)

OMGZombies (1283092) | more than 6 years ago | (#23531792)

The first two results for +ottawa +web +design are:
-Atomic Motion - Ottawa Web Design and Development
-Envision Online - Ottawa Web Designers Ottawa Web Site Designs

Your site appears to be eyestir:
-EyeStir Visual Communications, Digital Signage, PowerPoint

Searching for +Visual +Communications +Ottawa, your site doesn't show up until the 12th page, but searching for +digital +signage +canada, it's the third result. I'd guess google is ranking these results according to the page title. Try adding web design and ottawa to your page title and see what happens.

Re:Mystified by 'the google" (1)

gary_7vn (1193821) | more than 6 years ago | (#23533104)

Thanks, I think that is the key: I have to load up on the correct search terms. I just "localized" on Google Web last night, so adding Canada really helps. Prior to that, the results were even worse. It's funny, but I have a picture of myself and my cat Fritz with a UFO in the background; Fritz sees it because I am looking at the camera. The text on the picture reads "I want to believe". The JPEG file has that name too, and because of the new X-Files movie, I am getting lots of hits from that string. The other thing Google really likes, and they tell you this quite explicitly, is links from important sites. But of course that can be problematic.

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23532650)

What query are you trying, what site do you think should come up? Have you tried using Google's Webmaster console (http://www.google.com/webmasters/)? It can give you information on how often your page is crawled, what queries it is returned for, diagnostics, etc.

Re:Mystified by 'the google" (1)

gary_7vn (1193821) | more than 6 years ago | (#23533219)

Yes, thanks, I am using webmaster, very powerful and useful.

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23534285)

Just as an idea, I have noticed some hosting providers block google indexing unless you pay an extra fee. If you are using a single provider on all your sites this may be your problem?

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23534531)

Simple.
Google records what people click on, and adjusts its weight/bias/fudge factor accordingly.
Just get others to click on your site. Popularity should push you up, unless the engine has a paid bias and a policy against competitors.

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23534693)

As a pure and random guess, you're focusing too much on how your site looks and not enough on how other sites perceive you. Google is famous, after all, for analyzing the links pointing to you.

Re:Mystified by 'the google" (1)

gary_7vn (1193821) | more than 6 years ago | (#23537835)

Google webmaster informs me that there are 789 links to my site. What does it take?

Re:Mystified by 'the google" (1)

The MESMERIC (766636) | more than 6 years ago | (#23596615)

sweat and experience.

Like, for example, I am trying to optimize this website now: Farmhouses in Tuscany [lucertola.info]

When I first saw the original, I had to completely rewrite it from scratch. It took me over 3 months of research to come up with better text and structure, and the site is still not 100% finished.

It is not going to be easy .. it never is.

Lots of energy goes into SEO.

The most important thing to remember is play by the rules - but play very hard.

Re:Mystified by 'the google" (1)

The MESMERIC (766636) | more than 6 years ago | (#23596653)

I tell you what it doesn't take

FRONTPAGE

It kills any chances of your site doing well

Hand-code your site

Re:Mystified by 'the google" (1)

popra (879835) | more than 6 years ago | (#23535349)

1. Make sure that at some point google didn't label you as a "spam" site, http://www.google.com/webmasters/ [google.com] is a good starting point for learning google's view of your site's health

2. Make sure that navigation in your site makes sense from google's bot perspective. Map categories/subcategories in your site to folders in the URL of your site. URLs of your site should be pretty, contain relevant words and be relatively short, e.g. http://example.com/webdesign/logo/price-quote-for-logo-design.html [example.com] rather than http://example.com/siteengine.php?id_category=12&subcategory=93&articleid=112&lang=en& [example.com] ....

3. Don't use FLASH, JAVASCRIPT or (I)FRAMES for navigation/menus

4. Navigation in your site should be easy, use: menus with main/sub categories, breadcrumbs, related pages, etc.

5. Make sure that there are links on the internet that link to your site (to the front page but also, very importantly, to sections inside your web site). Take time to build links: e.g. when posting in forums, make a habit of linking back to your site, especially if there's something on your site that is relevant to the discussion. When you do a website for a client, if possible add a link on the website, pointing to your website. Something like: "Web design by Your Company, Ottawa". Make sure to add the proper "title" attribute to links to your site and the links inside your site.

6. Change your hosting company and get your own IP to host your website. (The shared IP on which your website might be running could be marked as "spammy", especially if your site is sharing it with other shady sites.)
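
The URL scheme from point 2 can be sketched in a few lines of Python. This is a rough illustration rather than a recipe: the helper names `slugify` and `pretty_url` are hypothetical, and the rule of lowercasing and collapsing everything except ASCII letters and digits into hyphens is an assumption about what counts as "pretty".

```python
import re


def slugify(text):
    """Lowercase the text and collapse runs of other characters into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def pretty_url(base, categories, title):
    """Map a category/subcategory path and a page title onto folder-style URL parts."""
    parts = [slugify(category) for category in categories]
    parts.append(slugify(title) + ".html")
    return base.rstrip("/") + "/" + "/".join(parts)


url = pretty_url("http://example.com", ["Webdesign", "Logo"], "Price quote for logo design")
print(url)
```

A real site would also need to keep old URLs redirecting to the new ones, which this sketch does not cover.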

Re:Mystified by 'the google" (1)

gary_7vn (1193821) | more than 6 years ago | (#23538077)

Thank you. Excellent advice. I will look at this for sure. I am already doing some of the things you suggest, will try the rest.

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23536093)

Have you tried submitting your URL on the Google site submission page? It works for me.

Re:Mystified by 'the google" (0)

Anonymous Coward | more than 6 years ago | (#23537099)

nobody likes you.

Much can be determined by using google (1)

hey (83763) | more than 6 years ago | (#23527580)

I have often noticed that I search for a word and get pages that only contain synonyms (or variations on the word). Likewise for the handling of accents: search for resumé and you'll find pages with resume.

Re:Much can be determined by using google (1)

hey! (33014) | more than 6 years ago | (#23527924)

Well, they're probably using some kind of hash based document fingerprinting anyway. Ignoring low entropy characteristics of a word when calculating the fingerprint makes sense, because you can always go back and take it into account once you've eliminated 99.999999999% of the documents on the Internet.

Nice nick, by the way.

Re:Much can be determined by using google (0)

Anonymous Coward | more than 6 years ago | (#23530256)

Likewise for the handling of accents: search for resumé and you'll find pages with resume.

Hopefully they handle accents better than Slashdot does.

Great news for the Live search user (1)

dreyergustav (1013913) | more than 6 years ago | (#23527790)

Now he can also enjoy the wonders of modern search engines.

Making Open Source propreitary? (1)

FrostDust (1009075) | more than 6 years ago | (#23527894)

"In many ways, Google is much more proprietary than Microsoft is, and they actually used open source software to get there. So unlike Microsoft, which started off proprietary and has gradually been opening its stuff up, Google starts off getting other people's open stuff, turns it proprietary and then makes money off it. It kind of redefines 'pirate.' I think Google is feeling a little bit of the heat because people are starting to focus on that a bit."
While I'm pretty sure Google wouldn't be so ignorant as to violate open source licenses for the code they utilize, is there any claim to his "pirate" label, or is he just trying to be inflammatory?

Re:Making Open Source propreitary? (0)

Anonymous Coward | more than 6 years ago | (#23528296)

Probably just trying to be inflammatory. I personally doubt they'd be insane enough to subvert any license (FOSS or not) just to make money. The risks aren't worth it. If they ever did that and were uncovered (by any means, even competition snooping around to dig dirt), the lawsuits that would ensue would be very expensive. Even just settling would be expensive.

Not to mention that the FOSS licenses don't say 'you can't use this software for your enterprise' (IIRC). Otherwise all those hosting companies out there are doing things illegally as well, since most of them depend on using at the very least LAMP, not to mention the software they use behind the curtains to do administrative work like backups, request tracking and management....

Re:Making Open Source proprietary? (1)

flooey (695860) | more than 6 years ago | (#23531584)

While I'm pretty sure Google wouldn't be so ignorant as to violate open source licenses for the code they utilize, is there any claim to his "pirate" label, or is he just trying to be inflammatory?
I think what he's saying is that he thinks Google violates the spirit of licenses (particularly the GPL), even though they follow all the requirements of them. Some people get upset that the Internet makes it so that you can separate the using of software from the running of it (whereas in non-networked environments, those are equivalent), and all the obligations in the licenses are stated in terms of people who run the software, so companies like Google can modify software to their heart's content and never have to release their modifications, because they're not letting anyone else run the software.

Personally, I can't imagine that Stallman and others were ignorant of the idea of accessing software over a network, and they didn't make any effort to change the rules in GPL v3 to eliminate that use case, so I think he's somewhat on his own in that respect.

Diacritics and language (1)

D. J. Keenan (524557) | more than 6 years ago | (#23527914)

Handling diacritics can sometimes be involved. As an example, consider the o-umlaut (ö). In German, this is the usual letter "o" with a diacritical mark. In Swedish, the same glyph is a separate letter of the alphabet—and comes after the letter "z" in the standard ordering.

English writers often omit the diacritical mark (they also sometimes transliterate "ö" as "oe", at least for German). Playing around with Google (via google.com, rather than google.de or google.se), it seems that they tend to handle such things when searching for German words, but not for Swedish words.
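For what it's worth, here is a minimal sketch of the two English-writer habits described above: dropping the mark entirely, or transliterating "ö" as "oe" for German. It uses only Python's standard `unicodedata` module. The function names and the German mapping table are my own assumptions for illustration, and this says nothing about how Google actually does it.

```python
import unicodedata

# Hypothetical mapping for the German convention ("ö" -> "oe", etc.).
# This table is an illustrative assumption, not taken from any search engine.
GERMAN_TRANSLITERATION = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})

def fold_diacritics(text: str) -> str:
    """Drop diacritical marks: decompose, then remove combining characters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def fold_german(text: str) -> str:
    """Transliterate German umlauts first, then drop any remaining marks."""
    # Normalize to NFC so that "o" + combining diaeresis becomes a single "ö"
    # and is caught by the translation table.
    composed = unicodedata.normalize("NFC", text)
    return fold_diacritics(composed.translate(GERMAN_TRANSLITERATION))

if __name__ == "__main__":
    for word in ("resumé", "Köln", "Malmö"):
        print(word, fold_diacritics(word), fold_german(word))
```

Note that neither function is right for Swedish, where "ö" is its own letter: folding "Malmö" to "Malmo" or "Malmoe" loses information, which may be why a per-language approach is needed.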

Unable to parse... (1)

AngryLlama (611814) | more than 6 years ago | (#23528120)

mismatched quotations in summary, blarrr.

oh the irony... (1)

mutantcamel (213431) | more than 6 years ago | (#23529400)

Ironically, in its attempt to open up a little, the Google blog is blocked by the GFW of China...

frIst pso&t (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#23529636)

arE the important they want you to

Page Rank not radical, citation ranking ubiquitous (0)

Anonymous Coward | more than 6 years ago | (#23530572)

Sorry folks, but Page Rank is only an interesting implementation and the first widely used citation ranking for the WWW. Even the algorithms largely used for implementation of Page Rank (tm) were developed by mathematicians at Princeton. Citation analysis methods for hyperlinks, of which Page Rank is one, are now ubiquitous.
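To make the citation-ranking idea concrete, here is a rough sketch of the textbook power-iteration form of a PageRank-style score. It is the commonly published formulation, not Google's production system. The function name, the toy link graph, and the parameter defaults are assumptions for illustration (0.85 is the damping value usually quoted in the literature).

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration over a dict {page: [pages it links to]}.

    Illustrative only: names and defaults here are assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a baseline share from the "random jump" term.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

if __name__ == "__main__":
    # Toy graph: both "a" and "b" cite "c"; "c" cites "a".
    toy_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(toy_links).items()):
        print(page, round(score, 4))
```

In the toy graph, "c" ends up ranked above "b" because two pages cite it and nothing cites "b", which is the same intuition as counting citations in academic papers.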

Think about this. The patent doesn't keep competitors from implementing very similar methods. This is because the patent was necessarily narrow due to extensive prior art (not all of it cited). I love the fact that The Goog was created and I've used it since it was a university project, but it's just not that novel anymore. Even topical books of the day didn't consider it that unique:
http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/mags/ic/&toc=comp/mags/ic/2001/01/w1toc.xml&DOI=10.1109/4236.895141

What is unique is the way they've implemented and grown/maintained an enormous data analysis network. That and figuring out a way to monetize search without pissing off their users (or their customers) is the real achievement.

Opening up could actually help web developers organize their pages to better suit Google's indexing method. Right now as a website you're at the mercy of chance, or you're forced to use some rank enhancement tool that may put you in your best referrer's bad graces.

If Google can give developers enough information to get their websites to do the right things, while not giving the bad guys too much information (or advantage), then everyone (but the scammers) can win. I'm guessing this is the basic plan, that even if it helps their competitors a bit, it helps their customers more and that their hardware/organization advantages more than outweigh any loss in algorithmic advantage.

Not true... (1)

simplerThanPossible (1056682) | more than 6 years ago | (#23534143)

"...PageRank, an algorithm developed by Larry Page and Sergey Brin..."

Not true.

PageRank was invented by Page (note the name), according to the patent. If the patent is incorrect on that, then the patent is invalid.