Several Link-Spam Architectures Revealed

timothy posted about 4 years ago | from the labyrinthine-luring dept.

The Internet 38

workie writes "Using data derived from website infections, RescueTheWeb.org has found several interesting link-spam architectures. One architecture is where concentric layers of hijacked websites are used to increase the page rank and breadth of reach (within search engine search results) of scam sites. The outer layers link to the inner layers, eventually linking to a site that redirects the user to the scam site. Another architecture involves hijacked sites that redirect the user to fake copies of Google, having the appearance that the visitor is still within Google, but in reality they are on a Google lookalike that contains only nefarious links."
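As a rough illustration of why the layered arrangement in the summary could help a scam site, here is a minimal sketch of textbook PageRank run on a tiny hypothetical link graph. The node names, the damping factor of 0.85, and the graph shape are all assumptions made for the example; this is not Google's actual ranking and not anything taken from the RescueTheWeb data.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank by power iteration.

    links maps each page to the list of pages it links to. Pages with no
    outgoing links spread their rank evenly over all pages.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is shared by everyone.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical three-layer structure: outer hijacked sites link to middle
# hijacked sites, which link to a single redirector at the centre.
layered = {
    "outer1": ["middle1"],
    "outer2": ["middle1"],
    "outer3": ["middle2"],
    "outer4": ["middle2"],
    "middle1": ["redirector"],
    "middle2": ["redirector"],
    "redirector": [],
}

ranks = pagerank(layered)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page:12s} {ranks[page]:.3f}")
```

In this toy graph the redirector ends up with the highest score, the middle layer next, and the outer layer last, which is the funnelling effect the summary describes.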


38 comments

For the paranoid... (5, Interesting)

Antony-Kyre (807195) | about 4 years ago | (#31972568)

Consider doing all your banking, and any other sensitive stuff, on a computer totally separate from your web-surfing computer. Kind of like having a dummy wallet containing only petty cash and your ID when you go out at night versus your credit cards, etc.

Re:For the paranoid... (0)

Anonymous Coward | about 4 years ago | (#31972594)

better is a Live CD

Re:For the paranoid... (3, Insightful)

SlothDead (1251206) | more than 3 years ago | (#31972764)

When a vulnerability is found on your LiveCD you won't be able to patch it.

Re:For the paranoid... (2, Interesting)

oiron (697563) | more than 3 years ago | (#31972948)

This is 2010; install it onto a VM...

Re:For the paranoid... (1)

capo_dei_capi (1794030) | more than 3 years ago | (#31973640)

Seeing that hijacking your host OS would give an attacker access to your VMs, using a VM for general surfing purposes seems to be the better choice, especially if it runs an obscure OS like OpenSolaris.

Re:For the paranoid... (1)

Nerdfest (867930) | more than 3 years ago | (#31973792)

A VM isn't foolproof. There are exploits that will allow you to 'step out of the box'. A lot of the time the VM runs at a high privilege level as well. It's protection, just not as foolproof as a live CD.

Re:For the paranoid... (1)

Darkman, Walkin Dude (707389) | more than 3 years ago | (#31972878)

This doesn't represent an active threat though, it's just for those who get fooled by the camouflage of scam sites. And if they get pwned on one computer, they can get pwned on another just as easily.

Re:For the paranoid... (2, Insightful)

Runaway1956 (1322357) | more than 3 years ago | (#31972962)

That isn't paranoia - it's good common sense. Statistics tell us that an ungodly number of computers are compromised. Why do your banking and other sensitive online transactions from a potentially compromised machine? Use those LiveCD's, or a virtual machine, or almost ANYTHING other than your Windows browsing and porn watching machine!!

Re:For the paranoid... (0)

Anonymous Coward | more than 3 years ago | (#31972978)

Statistics tell us that an ungodly number of computers are compromised.

I'm curious as to what would be a godly number of compromised computers?

Re:For the paranoid... (1)

Hurricane78 (562437) | more than 3 years ago | (#31975330)

Protip: It’s called FinTS. With chip card. Look it up. :)
I have used it since it was still experimental and called HBCI 1.0.
No browser involved. You have a separate reader with keys (and optionally a display) that you interact with. Unless someone modifies the reader, there is no way anyone else can get your code. In short it’s two-factor authentication on a trusted client. The PC just shoves encrypted packets back and forth between the reader and the bank server.

I recommend having a reader with a display. That way it can even guarantee you that the amount displayed is the amount you actually agree to be transferred. Additionally it allows you to load stored-value cards, do legally valid digital signatures, and read a lot of other chip cards.

Re:For the paranoid... (1)

lsatenstein (949458) | more than 3 years ago | (#31994132)

I have a live CD and boot that when I want to do my banking. Since I also live near a branch of the bank, my wife goes there to do most of the non-electronic transactions, such as withdrawing grocery money, etc. Why withdraw money? Well, I don't want to be a victim of a business whose site gets compromised, and find that their site security was or is a copy of the security shortcomings experienced by TJMAXX. I want to own my personal information and not worry about it after it has been stolen.

Link Spam? (2, Insightful)

AndGodSed (968378) | about 4 years ago | (#31972570)

I thought that Google had ways of detecting these and down-ranking them?

Re:Link Spam? (0)

Anonymous Coward | about 4 years ago | (#31972632)

Google had ways of making you think they had down-ranked the spam links.

Re:Link Spam? (1, Interesting)

FuckingNickName (1362625) | more than 3 years ago | (#31972864)

Precisely. In fact, with Google for Domains etc., they know well how profitable this link spam is. Hell, 10 people employed 8 hours a day flagging sites would tackle the vast majority of repeated and obvious search engine spammers. But then Google would have to admit that they haven't refined interesting algorithms since the '90s, and might have to give actual work to the 2nd rate PhDs they hire to twiddle their thumbs.

Re:Link Spam? (3, Insightful)

asdf7890 (1518587) | more than 3 years ago | (#31972910)

Every time Google adjust the rankings to account for the current crop of deceptive SEO techniques, people think up new deceptive SEO techniques. It is a moving target, and Google can't move too fast without thinking: they risk disrupting unaffected parts of the algorithm and reducing its effectiveness when presented with genuine links.

Also, Google may be the biggest name in town, but they are not the only big name by a long shot. An SEO technique is not completely invalidated until all popular engines have a way to discount it.

And the summary (didn't RTFA, sorry) doesn't state that the techniques were proven to be working, just that this is what people are trying.

That was actually an interesting read (1)

bguiz (1627491) | more than 3 years ago | (#31972726)

While its assertions are believable, I'd now like to see the methods and data

Re:That was actually an interesting read (1, Flamebait)

bguiz (1627491) | more than 3 years ago | (#31972770)

Also, I dislike their main tagline

"The web is under attack from hackers. RescueTheWeb.org is working to reduce their chances of success."

I take issue with their ignorance toward the difference between a hacker [8hz.com] and a cracker [8hz.com] . (links to Eric Raymond's "The Jargon File")

Re:That was actually an interesting read (0, Insightful)

Anonymous Coward | more than 3 years ago | (#31972826)

The rest of us moved on about 20 years ago -don't you think it's time YOU did too?

Re:That was actually an interesting read (0, Troll)

bguiz (1627491) | more than 3 years ago | (#31973002)

Not at all.

When you say "The rest of us", you should say just yourself.

Re:That was actually an interesting read (2)

For a Free Internet (1594621) | more than 3 years ago | (#31972868)

I found a great web site for all fun people who like vriot7liugiy7z! Get the best vriot7liugiy7z free at my web stite! vriot7liugiy7z vriot7liugiy7z vriot7liugiy7z!!!!!! EXCludsibve! The bewsrt!!!!! vriot7liugiy7z!

rtfa? (0)

Anonymous Coward | more than 3 years ago | (#31973072)

umm, i would read the fine article, but afraid to click the link..

Link pyramids (1)

kmike (31752) | more than 3 years ago | (#31973264)

Sounds familiar: http://seoblackhat.com/2009/07/10/link-pyramids/ [seoblackhat.com]
By the way, if blackhat SEO's describe this technique in the open, it's either already well known, or its effectiveness has been diminished to the point where hiding the details isn't worth it.

Re:Link pyramids (1)

workie (1754464) | more than 3 years ago | (#31974668)

The RescueTheWeb article is a high level discussion of link architectures that currently exist in the wild. The article wasn't trying to show samples since disclosure of which websites are breached is against the privacy policy of RescueTheWeb. These are private websites that have been breached by others and used to create these various structures. Thus, their web addresses would reveal whose websites were breached. I can tell you that an example 'constellation' Google look-alike search engine consists of some 26 domains of this pattern: http://googpill_.com/ [googpill.com] where the '_' is one of the letters from 'a' to 'z'. When you visit these sites directly they say 'Under Construction', but when you visit them from a hacked site you get the Google look-alike. (Not all of the lettered domains appear to be working.) Follow this link to see an example: http://googpillc.com/zgyllgiaahkeiryy_idknxqkbi.py [googpillc.com] This constellation example is different than the pyramid example from the seoblackhat. The goal of this constellation, as an example, is to confuse the user into thinking they are on Google; it is not to increase page rank.
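The domain pattern described above (a fixed prefix followed by one letter from 'a' to 'z') is easy to enumerate or match, for instance when building a local blocklist. A minimal sketch, where the function names are made up for the example and nothing is assumed about which of the 26 hosts actually resolve:

```python
import re
import string

# Matches the hostname pattern quoted in the comment: googpill + one letter.
CONSTELLATION_HOST = re.compile(r"^googpill[a-z]\.com$")


def constellation_domains(prefix="googpill", tld="com"):
    """Return the 26 candidate hostnames: prefix + one letter + '.' + tld."""
    return [f"{prefix}{letter}.{tld}" for letter in string.ascii_lowercase]


def is_constellation_host(hostname):
    """True if hostname fits the one-letter-suffix pattern."""
    return CONSTELLATION_HOST.match(hostname.lower()) is not None


domains = constellation_domains()
print(len(domains), domains[0], domains[-1])
```

This only captures the naming scheme; it says nothing about the referrer-dependent behaviour of the sites themselves.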

Re:Link pyramids (1)

TaoPhoenix (980487) | more than 3 years ago | (#31975336)

I had basically known it, but it's still daunting to face as an actual search customer.

I like trying out freeware utilities. But sometimes it's tricky to know which are real links (could be some 15 real ones) and which are nastylinks (could be 85) for my 100-result first page of returns.

The problem: low standards in search engines. (1)

Animats (122034) | more than 3 years ago | (#31975900)

These guys are doing good work, but really, all they're doing is checking for some specific types of black-hat SEO. This is inherently a losing battle, because there's active opposition. It's a "negative file" approach - making a list of the bad guys. Credit cards once worked that way; merchants were sent daily lists of canceled or stolen credit cards. Back then, getting a credit card was tough; the customer had to be a good customer of the bank. Not until credit card transactions were validated remotely against a "positive file" that checked the actual account could everyone have one. Web search is still in the "negative file" era.

As I point out occasionally, the main search engines have very low standards for business legitimacy. It's an ongoing, and losing, battle to filter out the totally bogus sites. But if you insist on some minimal standard of business legitimacy for a commercial web site, you kick out most of the "bottom feeders" with no business address, and along with them, most of the total phonies. We do this at SiteTruth [sitetruth.com] , which exists to demonstrate that it's possible. SiteTruth tries to find some indication that a domain maps to a real-world business. If it can't, the site is moved down in search engine position. That's enough to move most "bottom feeders" downward, below the legit ones. It's not always successful in finding the business behind the site, but it looks harder than the average user would, looking through the site's "About", "Help", "Contact", etc. pages for a mailing address. If a search engine takes a hard line on this, the junk sites can be kicked out.
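The demotion idea in the paragraph above can be sketched as a stable re-sort: sites with an identified business keep their relative order at the top, and everything else sinks below them. This is only an illustration under assumptions; the `rerank` function, the example hostnames, and the `verified` set are hypothetical, and SiteTruth's real scoring is not described in enough detail here to reproduce.

```python
def rerank(results, has_known_business):
    """Move sites without an identified business below those with one.

    sorted() is stable, so the original search order is preserved inside
    each group. False sorts before True, hence the 'not'.
    """
    return sorted(results, key=lambda site: not has_known_business(site))


# Hypothetical search results, in original engine order.
results = ["spam.example", "shop.example", "junk.example", "store.example"]
# Hypothetical set of sites for which a business address was found.
verified = {"shop.example", "store.example"}

reranked = rerank(results, lambda site: site in verified)
print(reranked)
```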

Once you have a business address for a web site, there are extensive resources for finding out more about the business. It's easy to get annual sales and number of employees if you know what database to buy. Corporate registration information and D/B/A name information is available. Business credit rating info is available in bulk for a fee. Crank that info into search engine positioning and you've got hard data driving search. Rating web sites by looking only at the web is a process easy to manipulate. Use info from the real world, and it's much harder.

Phony mailing addresses do show up, but that's usually associated with phishing sites. Not showing a business address is a misdemeanor in some jurisdictions, but common. Using the address of another business is felony fraud and identity theft. That gets law enforcement attention. So only outright criminals try that. To catch that, we fetch the entire PhishTank database every few hours and blacklist the entire domain for a single phishing entry. That's draconian, but if you're running a site that lets users upload entire pages, it's your job to kick the phishers off. Most of the innocent victims there [sitetruth.com] are free hosting services with weak abuse departments. If you're in the free hosting business or the URL redirection business, you need a strong abuse department, or you will be pwned. Right now, "t35.com" is getting hit hard. By now, most free hosting sites with a clue automatically check PhishTank and the APWG list to see if they're on it. "t35.com" is still doing it by hand, and they're losing the battle.
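The "one phishing entry blacklists the whole domain" rule can be sketched with the standard library alone. The feed below is a hypothetical list of reported URLs, not the PhishTank format, and treating the reported hostname (plus its subdomains) as "the domain" is a simplifying assumption; a real system would likely need a public-suffix list to find the registered domain.

```python
from urllib.parse import urlsplit


def blacklisted_hosts(phishing_urls):
    """Collect the hostname of every reported phishing URL."""
    hosts = set()
    for url in phishing_urls:
        host = urlsplit(url).hostname  # already lowercased by urlsplit
        if host:
            hosts.add(host)
    return hosts


def is_blacklisted(url, hosts):
    """True if the URL is on a listed host or on any subdomain of one."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    return any(host == bad or host.endswith("." + bad) for bad in hosts)


# Hypothetical feed with a single phishing page on a free hosting service.
feed = ["http://freehost.example/~user1/paypal-login.html"]
blocked = blacklisted_hosts(feed)

print(is_blacklisted("http://freehost.example/~user2/index.html", blocked))
print(is_blacklisted("http://unrelated.example/", blocked))
```

The first check shows the draconian part: an unrelated user's page on the same host is blocked because of one bad entry.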

So why doesn't Google do this? Google's business model depends on those ad-heavy "bottom feeder" sites. About 36% [sitetruth.net] of Google's "content network" domains are "bottom feeders". When organic search takes you to the right place on the first try, Google doesn't make any money. But if you're led through an ad-heavy site, the Google cash register clicks. Google's business model thus takes them to the dark side. Google would take a big financial hit if they did even some basic legitimacy checking on their advertisers. Search Google for "craigslist auto posting tool" [google.com] , which brings up five Google ads for companies offering to spam Craigslist. There's even "Learn How to Auto Post to Cragslist. Robot Posts 1000's of Ads Per Day!", for which Google provides Google Checkout service and takes a cut of the profits. That's aiding and abetting computer crime.

Bing, however... Something might happen there.

Re:The problem: low standards in search engines. (1)

the_womble (580291) | more than 3 years ago | (#31980694)

This only works if someone is searching for a business or product. Most searches are for information. There are LOTS of valuable websites run by individuals. You rank them all low?

Why on earth do we want rankings to reflect credit ratings? You can trust sources with good credit ratings more? Lots of businesses with good credit ratings one year, have ended up with their CEO in the dock the next (e.g. Enron).

You need a lot more data coverage than you have: you cannot verify GlaxoSmithKline, Vodafone (main corporate site - country sites you do), Freshfields (a major law firm) or Oxfam International (but you can verify Oxfam UK).

Nice idea, but your current implementation sucks (yes, it is alpha, but it's not very encouraging). It is better than Cuil.

Re:The problem: low standards in search engines. (1)

Animats (122034) | more than 3 years ago | (#31987794)

Re SiteTruth complaints: (We have a blog [sitetruth.net] for that.)

Non-commercial web sites aren't rated at all. However, the presence of an ad link marks a site as "commercial", as does being in ".com". Our "commercial intent" detection is rather simplistic. We really should have a classifier system doing that. Yahoo search R&D, back when they had search R&D, built one of those, but never did much with it. We've been reluctant to use machine learning techniques, though, because they reduce the transparency of the system. At present, SiteTruth doesn't rely on "security by obscurity". Adding a classifier system would change that.
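The two "commercial intent" signals mentioned above (an ad link on the site, or a ".com" hostname) amount to a simple rule, which might look roughly like the sketch below. The ad-host list and the function name are hypothetical; how SiteTruth actually recognises an ad link is not stated.

```python
from urllib.parse import urlsplit

# Hypothetical ad-network hosts; a real list would be much longer.
AD_HOSTS = {"ads.example.net"}


def is_commercial(site_url, outbound_links, ad_hosts=AD_HOSTS):
    """Rule of thumb: '.com' hostname, or at least one link to an ad host."""
    host = urlsplit(site_url).hostname or ""
    if host.endswith(".com"):
        return True
    return any((urlsplit(link).hostname or "") in ad_hosts for link in outbound_links)


print(is_commercial("http://shop.example.com/", []))
print(is_commercial("http://hobby.example.org/", ["http://ads.example.net/click?id=1"]))
print(is_commercial("http://hobby.example.org/", ["http://wiki.example.org/"]))
```

A rule like this stays transparent, which is the trade-off against a trained classifier that the comment raises.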

Credit rating information is useful because, for businesses, you can get business size information. Annual sales and number of employees are worth knowing, and displaying to the user in search results. (We'll be doing something in that area soon.) There's a guy in Brooklyn, NY, who took pictures of camera stores that advertise on line or for mail order. [donwiss.com] There are companies with giant warehouses and loading docks, and there are, well, "marginal locations". It's very funny. Search engines need info like that.

As for specific sites:

  • GlaxoSmithKline [sitetruth.com] : We give them a yellow "?", which means we think they're legit, but don't have third-party verification that the domain is tied to the company. In our hard-ass view, that's an OK rating. SSL certs and BBBonline links provide such third-party verification. They did match our database. We weren't able to parse "Registered office: 980 Great West Road, Brentford, Middlesex, TW8 9GS, United Kingdom.", unfortunately; we only recognize multi-line postal addresses, usable on an envelope, at present.
  • Vodafone [sitetruth.com] : All the country sites have SSL certs, but the main ".com" site does not. It does have the address "Vodafone Group Plc / Vodafone House / The Connection / Newbury / Berkshire / RG14 2FN / England" on multiple lines, which we pulled out of the source HTML as a possible address, but did not parse successfully. Still, they got a yellow "?", and were matched to the UK business database.
  • Oxfam [sitetruth.com] gets a green checkmark, and the system was able to pull four business addresses from their web site.
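
The two addresses quoted in the list above are single-line or slash-separated, which is why an envelope-style parser would miss them. One possible preprocessing step, shown purely as a sketch and not as anything SiteTruth does, is to split such a string into envelope lines and look for something shaped like a UK postcode. The regular expression is a simplified approximation of the postcode format, and the function names are invented for the example.

```python
import re

# Simplified UK postcode shape, e.g. "TW8 9GS" or "RG14 2FN".
UK_POSTCODE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}\b")


def to_envelope_lines(text):
    """Split a one-line address on commas or slashes into separate lines."""
    # Drop a leading label such as "Registered office:" if present.
    _, _, body = text.rpartition(":")
    parts = re.split(r"\s*[,/]\s*", body.strip().rstrip("."))
    return [part for part in parts if part]


def looks_like_uk_address(lines):
    """True if any line contains a UK-postcode-shaped token."""
    return any(UK_POSTCODE.search(line) for line in lines)


gsk = to_envelope_lines(
    "Registered office: 980 Great West Road, Brentford, Middlesex, "
    "TW8 9GS, United Kingdom."
)
vodafone = to_envelope_lines(
    "Vodafone Group Plc / Vodafone House / The Connection / Newbury / "
    "Berkshire / RG14 2FN / England"
)
print(gsk)
print(vodafone)
```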