The Wayback Machine, Friend or Foe?

Cliff posted more than 12 years ago | from the giving-google's-cache-a-run-for-its-money dept.

The Internet 508

ShaunC asks: "As the webmaster of numerous sites, I'm curious how others feel about the Wayback Machine. What particularly interests me is the fact that the Machine is a relatively new animal, yet it contains snapshots from my sites dating back to 1998. I can't help but wonder: where did they get such old copies of my websites, and who gave them permission to make those copies? I certainly didn't provide either. Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one. Archives have bothered me ever since the fledgling days of DejaNews." This site last made an appearance on Slashdot earlier this year. Internet archival sites are right smack in the crosshairs of copyright, but they are useful. Anyone who has ever used Google's cache (and there are plenty of those links on Slashdot) can attest to this. Of course, the issue that may bug many content providers is how to opt out of such services, since some see it as a copyright violation. Is it possible to balance the issues of copyright and history, or will these two Internet resources find themselves in legal trouble in the future?

"The way I see it, archives are much like SPAM; I never opted in, why should it be my responsibility to opt out? I manage a number of domains and the process of refining robots.txt files and submitting myself to the Wayback Machine for removal seems to be intrusive. Worse, domains I've abandoned (which have lapsed or been re-registered by someone else) are forever archived in the Machine and I have no way to exclude them. Why should I have to deliberately remove my copyrighted material from an archive which was never granted permission to replicate that material in the first place?"

C# Sourcecode for Slashdot Troll Bot!!! (-1)

RoboTroll (560160) | more than 12 years ago | (#3732254)

C# Sourcecode for Slashdot Troll Bot!!!

Published under the TPL (Trolling Public License)

usingSystem; []
usingSystem.Co llections;
usingSyst em.Windows.Forms;
usingSystem. Data.OleDb;
usingSystem.Runtime.InteropServices;n amespaceSlash man{
publicclassMainFrm:System.Windows.Forms.Form {[DllI mport("winmm.dll")]
publicstaticexternlongPlaySou nd(Stringlpszname,lon ghModule,longdwFlags);privateboolmanualMode=false;
privateboolcontextTroll=false;privateboolcontext Tr ollOnly=false;
privatestringlatestStory="";privat estringlatestSto ryDisplay="";
privatestringlatestURL="";privatest ringlatestTime= "";
privatestringlastStory="";privateintselTroll= 1;
privateSystem.Randomrand=newSystem.Random();pr ivat eDateTimenextCheck=System.DateTime.Now+System.Time Span.FromSeconds(6);
privateSystem.Windows.Forms. Labellabel1;privateSys tem.Windows.Forms.LinkLabellinkURL;
privateSystem .Windows.Forms.LabellabelTime;private System.Windows.Forms.ButtonbuttonCheck;
privatebo oltrying=false;privateSystem.Timers.Timer theTimer;
privateSystem.Windows.Forms.LabellabelN extCheck;pr ivatestringmainURL="http:privatestringreplyURL="ht tp:privateSystem.Data.DataTabletrollTable;
privat eSystem.Data.DataSettrollSet;privateSystem.D ata.DataTablecontextTable;
privateSystem.Data.Dat aSetcontextSet;privateSystem . indows.Forms.ButtonbtnPost;
privateboolisposting= false;privateintpreinctroll=0 ; BR>privatestringdirBase="";privateSystem.Windows.F o rms.ButtonbtnOptions;
privateSystem.Windows.Forms .ContextMenutrayMenu;pr ivateSystem.Windows.Forms.MenuItemmenuItem1;
priv ateSystem.Windows.Forms.MenuItemmenuItem2;priv ateSystem.Windows.Forms.MenuItemmenuItem3;
protec tedSystem.Windows.Forms.NotifyIcontIcon;priv ateSystem.ComponentModel.IContainercomponents;
pr ivateOleDbConnectiondbConn;privateOleDbDataAdapt erdbTrollsAdapter;
privateSystem.Windows.Forms.La bellbResult;privateO leDbDataAdapterdbContextAdapter;
publicMainFrm(){ InitializeComponent();

protectedover ridevoidDispose(booldisposing){if(dis posing){
if(components!=null){components.Dispose( );}
#regionWindowsFor mDesignergeneratedcodeprivatevoid InitializeComponent(){
this.components=newSystem. ComponentModel.Container ();System.Resources.ResourceManagerresources=newSy stem.Resources.ResourceManager(typeof(MainFrm));
this.linkURL=newSystem.Windows.Forms.LinkLabel();t his.label1=newSystem.Windows.Forms.Label();
this. labelTime=newSystem.Windows.Forms.Label();thi s.labelNextCheck=newSystem.Windows.Forms.Label();
this.buttonCheck=newSystem.Windows.Forms.Button() ; this.theTimer=newSystem.Timers.Timer();
this.btnO ptions=newSystem.Windows.Forms.Button();t his.btnPost=newSystem.Windows.Forms.Button();
thi s.tIcon=newSystem.Windows.Forms.NotifyIcon(this . omponents);this.trayMenu=newSystem.Windows.Forms.C ontextMenu();
this.menuItem1=newSystem.Windows.Fo rms.MenuItem(); this.menuItem3=newSystem.Windows.Forms.MenuItem();
this.menuItem2=newSystem.Windows.Forms.MenuItem( ); this.lbResult=newSystem.Windows.Forms.Label();
(( System.ComponentModel.ISupportInitialize)(this.t heTimer)).BeginInit();this.SuspendLayout();
this. linkURL.Location=newSystem.Drawing.Point(16,4 8);this.linkURL.Name="linkURL";
this.linkURL.Size =newSystem.Drawing.Size(432,23);t his.linkURL.TabIndex=0;
this.linkURL.LinkClicked+ =newSystem.Windows.Forms. LinkLabelLinkClickedEventHandler(this.linkURL_Link Clicked);this.label1.Location=newSystem.Drawing.Po int(16,16);
this.label1.Name="label1";this.label1 .Size=newSyst em.Drawing.Size(80,23);
this.label1.TabIndex=1;th is.label1.Text="LastCheck : ;
this.labelTime.Location=newSystem.Drawing.Point (10 4,16);this.labelTime.Name="labelTime";
this.label Time.Size=newSystem.Drawing.Size(128,23) ; his.labelTime.TabIndex=2;
this.labelTime.Text="00 :00";this.labelNextCheck.Lo cation=newSystem.Drawing.Point(240,16);
this.labe lNextCheck.Name="labelNextCheck";this.lab elNextCheck.Size=newSystem.Drawing.Size(208,23);
this.labelNextCheck.TabIndex=3;this.labelNextCheck . ext="NextCheckin0Seconds";
this.buttonCheck.Locat ion=newSystem.Drawing.Point( 376,120);this.buttonCheck.Name="buttonCheck";
thi s.buttonCheck.TabIndex=4;this.buttonCheck.Text= "CheckNow";
this.buttonCheck.Click+=newSystem.Eve ntHandler(thi s.buttonCheck_Click);this.theTimer.Enabled=true;
this.theTimer.Interval=1000;this.theTimer.Synchron izingObject=this;
this.theTimer.Elapsed+=newSyste m.Timers.ElapsedEve ntHandler(this.OnFireTimer);this.btnOptions.Locati on=newSystem.Drawing.Point(200,120);
this.btnOpti ons.Name="btnOptions";this.btnOptions. TabIndex=5;
this.btnOptions.Text="Options";this.b tnOptions.Cli ck+=newSystem.EventHandler(this.btnOptions_Click);
this.btnPost.Location=newSystem.Drawing.Point(28 8, 120);this.btnPost.Name="btnPost";
this.btnPost.Ta bIndex=7;this.btnPost.Text="PostNow ";
this.btnPost.Click+=newSystem.EventHandler(thi nPost_Click);this.tIcon.ContextMenu=this.trayMenu;
this.tIcon.Icon=((System.Drawing.Icon)(resources .G etObject("tIcon.Icon")));this.tIcon.Text="SlashMan ";
this.tIcon.Visible=true;this.tIcon.DoubleClick +=ne wSystem.EventHandler(this.DblClickTrayIcon);
this .trayMenu.MenuItems.AddRange(newSystem.Windows . orms.MenuItem[]{this.menuItem1,
this.menuItem3,th is.menuItem2});
this.menuItem1.DefaultItem=true;t his.menuItem1.Ind ex=0;
this.menuItem1.Text="Open...";this.menuItem 1.Click +=newSystem.EventHandler(this.menuItem1_Click);
t his.menuItem3.Index=1;this.menuItem3.Text="-";
th is.menuItem2.Index=2;this.menuItem2.Text="Exit";
this.menuItem2.Click+=newSystem.EventHandler(this. menuItem2_Click);this.lbResult.Location=newSystem. Drawing.Point(16,80);
this.lbResult.Name="lbResul t";this.lbResult.Size=n ewSystem.Drawing.Size(432,23);
this.lbResult.TabI ndex=8;this.lbResult.Text="LastR esult:None";
this.AutoScaleBaseSize=newSystem.Dra wing.Size(5,13 );this.ClientSize=newSystem.Drawing.Size(472,149);
this.Controls.AddRange(newSystem.Windows.Forms.C on trol[]{this.lbResult,
this.btnPost,this.btnOption s,
this.buttonCheck,this.labelNextCheck, belTime,this.label1,
this.linkURL});this.Icon=((S ystem.Drawing.Icon)(re sources.GetObject("$this.Icon")));
this.MaximizeB ox=false;this.Name="MainFrm";
this.StartPosition= System.Windows.Forms.FormStartP osition.CenterScreen;this.Text="SlashMan";
this.S izeChanged+=newSystem.EventHandler(this.Size Chang);((System.ComponentModel.ISupportInitialize) (this.theTimer)).EndInit();
this.ResumeLayout(fal se);}#endregion
App lication.Run(newMainFrm());}privatevoidReadDB() {
try{dirBase=System.Diagnostics.Process.GetCurre ntP rocess().MainModule.FileName;
dirBase=dirBase.Sub string(0,dirBase.LastIndexOf("\ \"));System.IO.Directory.CreateDirectory(dirBase);
stringmdbFile="Provider=Microsoft.Jet.OLEDB.4.0; Da taSource="+dirBase+"\\Slashman.mdb";dbConn=newOleD bConnection(mdbFile);
dbTrollsAdapter=newOleDbDat aAdapter();OleDbCommand dbInsert=newOleDbCommand("INSERTINTOtrolls(ID,Subj ect,Body)Values(?,?,?)",dbConn);
dbInsert.Paramet ers.Add("ID",OleDbType.Numeric,0," ID");dbInsert.Parameters.Add("Subject",OleDbType.V arChar,255,"Subject");
dbInsert.Parameters.Add("B ody",OleDbType.Char,6553 5,"Body");OleDbCommanddbUpdate=newOleDbCommand("UP DATEtrollsSETSubject=?,Body=?WHEREID=?",dbConn);
dbUpdate.Parameters.Add("Subject",OleDbType.VarCha r,255,"Subject");dbUpdate.Parameters.Add("Body",Ol eDbType.Char,65535,"Body");
dbUpdate.Parameters.A dd("ID",OleDbType.Numeric,0," ID");OleDbCommanddbDel=newOleDbCommand("DELETEFROM trollsWHEREID=?",dbConn);
dbDel.Parameters.Add(ne wOleDbParameter("ID",OleDbT ype.Numeric,0,"ID"));dbTrollsAdapter.InsertCommand =dbInsert;
dbTrollsAdapter.UpdateCommand=dbUpdate ;dbTrollsAda pter.DeleteCommand=dbDel;
dbTrollsAdapter.SelectC ommand=newOleDbCommand("SEL ECT*FROMtrolls",dbConn);dbContextAdapter=newOleDbD ataAdapter();
dbContextAdapter.SelectCommand=newO leDbCommand("SE LECT*FROMContext",dbConn);dbConn.Open();
trollSet =newSystem.Data.DataSet("trollset");trollT able=newDataTable("trolls");
dbTrollsAdapter.Fill (trollTable);trollSet.Tables.A dd(trollTable);
if(trollTable.Rows.Count==0){Syst em.Windows.Forms. MessageBox.Show("Thetrollsdatabaseismissingorempty . );
thrownewSystem.Exception("Thetrollsdatabaseism issi ngorempty.");}contextSet=newSystem.Data.DataSet("c ontextset");
contextTable=newDataTable("Context") ;dbContextAdap ter.Fill(contextTable);
contextSet.Tables.Add(con textTable);this.Visible=t rue;
privateboolSendMail(stringfrom,stringto,stringsubj ect,stringbody){try{
System.Web.Mail.MailMessaget heMail=newSystem.Web.M ail.MailMessage();theMail.From="";
theMail.Bo dy=body;theMail.BodyFormat=System.Web.Ma il.MailFormat.Text;
System.Web.Mail.SmtpMail.Smtp Server="your.server.c om";System.Web.Mail.SmtpMail.Send(theMail);
retur ntrue;}catch(Exceptione){
System.Windows.Forms.Me ssageBox.Show(e.Message);re turnfalse;}
this.label Time.Text=latestTime;this.linkURL.Text=l atestStoryDisplay;}
privatestringGetTaggedText(st ringfrom,stringtagBeg in,stringtagEnd){intbegin=from.IndexOf(tagBegin);
if(begin==-1)thrownewSystem.Exception("tagBeginno t found");stringretstr=from.Substring(begin+tagBegin . ength);
intend=retstr.IndexOf(tagEnd);if(end==-1) thrownewS ystem.Exception("tagEndnotfound");
returnretstr.S ubstring(0,end);}privatestringStripT ags(stringfrom){
stringret=from;intbegin=ret.Inde xOf("");
while(begin=0){intend=ret.IndexOf("",beg in);
if(end==-1)break;ret=ret.Remove(begin,(end-b egin)+ 1);
privatest ringGetHref(stringfrom){stringtagHref="AH REF=\"";
stringret=from;intbegin=ret.IndexOf(tagH ref);
if(begin0)thrownewSystem.Exception("GetHref failed( 1).");begin+=tagHref.Length;
intend=ret.IndexOf(" \"",begin);if(end0)thrownewSys tem.Exception("GetHreffailed(2).");
ret=ret.Subst ring(begin,end-begin);if(!ret.StartsW ith("http:"))ret="http:"+ret;
returnret;}privates tringDoHttpPost(stringinURI,Sys tem.Collections.Specialized.NameValueCollectionval ues){
System.Net.WebClientcli=newSystem.Net.WebCl ient(); byte[]resp=cli.UploadValues(inURI,values);
return System.Text.Encoding.ASCII.GetString(resp);} privatestringDoHttpGet(stringinURI){
System.Net.H ttpWebRequestreq=(System.Net.HttpWebRe quest)System.Net.WebRequest.Create(inURI);req.Cook ieContainer=newSystem.Net.CookieContainer();
req. CookieContainer.Add(newSystem.Net.Cookie("user ",SlashCfg.userCookie,"/",""));System. Net.WebResponseresp=req.GetResponse();
System.IO. StreamReadersr=newSystem.IO.StreamReader (resp.GetResponseStream(),System.Text.Encoding.ASC II);returnsr.ReadToEnd();}
privatevoidPrePro(refs tringtheData){theData=theDat a.Replace("Ask Slashdot: The Wayback Machine, Friend or Foe?",latestStory);
theData=theData.Replace("191" ,selTroll.ToString()) ; heData=theData.Replace("194",trollTable.Rows.Count . oString());}
privatevoidUpdateStatus(stringstat){ tIcon.Text=sta t;
labelNextCheck.Text=stat;labelNextCheck.Update ();}
privatevoidPromptTrollData(outstringsubj,out string body){subj="";
body="";GetTrollgt=newGetTroll(lat estStory,latestU RL);
gt.ShowDialog(this);if(!gt.accepted)thrownew System . xception("AbortedEntry");
subj=gt.thesubj;body=gt .thebody;
if((subj=="")||(body==""))thrownewSyste m.Exception ("AbortedEntry");}privatevoidGetTrollData(outstrin gsubj,outstringbody){
inti=contextTable.Rows.Coun t;subj="";
for(i=0;ico ntextTable.Rows.Count;i++){if(latestSto ry.IndexOf(contextTable.Rows[i]["IfContain"].ToStr ing())=0){
intidx=(int)contextTable.Rows[i]["Post "];subj=trol lTable.Rows[idx-1]["Subject"].ToString();
body=tr ollTable.Rows[idx-1]["Body"].ToString();bre ak;}
if(contex tTrollOnly){thrownewSystem.Exception("Noc ontexttrollexistsforthispost.");}
preinctroll=Sla shCfg.curTrollIndex;if(SlashCfg.cur Troll==0){
SlashCfg.curTrollIndex++;if(SlashCfg.c urTrollIndex =trollTable.Rows.Count)SlashCfg.curTrollIndex=1;
selTroll=S lashCfg.curTroll;}if(selTroll=trollTable . ows.Count){
thrownewSystem.Exception("Theselected trollisgreate rthanthenumberoftrollsinthetable.");}subj=trollTab le.Rows[selTroll]["Subject"].ToString();
body=tro llTable.Rows[selTroll]["Body"].ToString(); }if(SlashCfg.appendPostfix){
body+="P"+SlashCfg.a ppendPosttext;}PrePro(refsubj) ; BR>PrePro(refbody);}privatevoidPostComment(){
/*s tringxtheSubj,xtheBody;
GetTrollData(outxtheSubj, outxtheBody);System.Windo ws.Forms.MessageBox.Show(xtheBody,xtheSubj);
retu rn;*/
if(man ualMode)PlaySound(Application.StartupPath+"\ \alert.wav",0,1);try{
stringtheSubj="",theBody="" ;if(!manualMode){
GetTrollData(outtheSubj,outtheB ody);}UpdateStatus( "Readingcommentspage...");
stringpageText=DoHttpG et(latestURL);stringtagSID=" INPUTTYPE=\"HIDDEN\"NAME=\"sid\"VALUE=\"";
string tagCID="INPUTTYPE=\"HIDDEN\"NAME=\"cid\"VALU E=\"";stringtagPID="INPUTTYPE=\"HIDDEN\"NAME=\"pid \"VALUE=\"";
stringtagKEY="INPUTTYPE=\"HIDDEN\"NA ME=\"formkey\" VALUE=\"";stringtagEND="\"";
stringSID=GetTaggedT ext(pageText,tagSID,tagEND);st ringCID=GetTaggedText(pageText,tagCID,tagEND);
st ringPID=GetTaggedText(pageText,tagPID,tagEND);st ringreplyPage=replyURL+"?";
replyPage+="sid="+SID +"&";replyPage+="pid="+PID+"& ";
replyPage+="cid="+CID+"&";replyPage+="op=Reply &mod e=flat&commentsort=0&threshold=-1";
UpdateStatus( "RequestingReplyPage...");pageText=Do HttpGet(replyPage);
SID=GetTaggedText(pageText,ta gSID,tagEND);PID=GetT aggedText(pageText,tagPID,tagEND);
stringKEY=GetT aggedText(pageText,tagKEY,tagEND);Sy stem.Collections.Specialized.NameValueCollectionnv s=newSystem.Collections.Specialized.NameValueColle ction();
nvs .Add("threshold","-1");nvs.Add("commentsort","0 ");
nvs.Add("unickname",SlashCfg.username);nvs.Add(" up asswd",SlashCfg.password);
nvs.Add("op","Submit") ;nvs.Add("posttype","1");
if(manualMode){PromptTr ollData(outtheSubj,outtheBo dy);}
System.Threading.Thread.Sleep(21000);}nvs.Add("pos tersubj",theSubj);
nvs.Add("postercomment",theBod y);pageText=DoHttpPo st(replyURL,nvs);
stringtagErrorResult="!--Errort ype:--";stringtagPo stResult="FACE=\"arial,helvetica\"SIZE=\"4\"COLOR= \"#FFFFFF\"B";
stringtagPostResultEnd="/B";string PostResult=GetTa ggedText(pageText,tagPostResult,tagPostResultEnd);
boolisOK=(pageText.IndexOf(tagErrorResult)==-1); if ((!isOK)&&(PostResult=="PostComment")){
try{PostR esult=GetTaggedText(pageText,tagErrorResu lt,".");}catch{}
while((PostResult.Length0)&&((Po stResult[0]32)||(P ostResult[0]127)))PostResult=PostResult.Substring( 1);}if(isOK){
CID=GetTaggedText(pageText,tagCID,t agEND);lbResult . ext="PostedComment";
lbR esult.Text="ERROR:"+PostResult;}isposting=false ;
throw; }UpdateStatus("PostComplete.");}
privatevoidTryRe ad(){if(trying)return;
trying=true;stringtagTitle =@"FACE=""arial,helvetic a""SIZE=""4""COLOR=""#FFFFFF""B";
stringtagTitleE nd="/B";stringtagUrl="PB(/B";
stringtagUrlEnd="BR eadMore.../B";UpdateStatus("Che ckingNow...");
stringpa gestr=DoHttpGet(mainURL);latestStory=Strip Tags(GetTaggedText(pagestr,tagTitle,tagTitleEnd));
latestStoryDisplay=latestStory;latestURL=GetHref (G etTaggedText(pagestr,tagUrl,tagUrlEnd));
latestUR L+="&threshold=-1";latestTime=System.DateT ime.Now.ToString();
if((lastStory.Length0)&&(late stStory!=lastStory)){ PlayAlert();
catch(System.Excep tione){if(e.Message.IndexOf("(40 4)")0){
SlashCfg.curTrollIndex=preinctroll;retryP ost=true; }
lastS tory=latestStory;nextCheck=DateTime.Now.AddSe conds(SlashCfg.checkIntervalMin+rand.Next(SlashCfg . heckIntervalMax-SlashCfg.checkIntervalMin));}
els e{nextCheck=DateTime.Now.AddSeconds(5);}
UpdateFo rm();trying=false;}
privatevoidbuttonCheck_Click( objectsender,System.E ventArgse){TryRead();}
privatevoidOnFireTimer(obj ectsender,System.Timers. ElapsedEventArgse){if(trying)return;
if(isposting )return;if(DateTime.NownextCheck){
TryRead();}Upd ateStatus("NextCheckin"+(int)((nextC heck-DateTime.Now).TotalSeconds)+"Seconds.");}
pr ivatevoidNavigateLink(){try{System.Diagnostics.P rocess.Start(latestURL);}
catch{}}privatevoidPlay Alert()
{}privatevoidlinkURL_LinkClicked(objectse nder,Syst em.Windows.Forms.LinkLabelLinkClickedEventArgse){
NavigateLink();}privatevoidbtnPost_Click(objectse n der,System.EventArgse){
if((latestURL==null)||(la testURL=="ERROR")||(lates tURL.Length==0)){System.Windows.Forms.MessageBox.S how("Mustgetthepostfirst!(PressCheckNow)","Error", System.Windows.Forms.MessageBoxButtons.OK,System.W indows.Forms.MessageBoxIcon.Stop);
privatevoidbtnOptions_Cli ck(objectsender,System.Ev entArgse){Slashman.OptionsFrmopts=newSlashman.Opti onsFrm();
opts.trollTable=trollTable;opts.ShowDia log(this);
if(opts.pressedOK){dbTrollsAdapter.Upd ate(trollTab le);
trollTabl e.RejectChanges();}}
privatevoidShowMe(){this.Vis ible=true;
this.Activate();this.WindowState=Syste m.Windows.Fo rms.FormWindowState.Normal;}
privatevoidHideMe(){ this.Visible=false;}
privatevoidmenuItem1_Click(o bjectsender,System.Eve ntArgse){ShowMe();}
privatevoidmenuItem2_Click(ob jectsender,System.Eve ntArgse){this.Close();}
privatevoidSizeChang(obje ctsender,System.EventArgs e){if(this.WindowState==System.Windows.Forms.FormW indowState.Minimized){
privatevoidDb lClickTrayIcon(objectsender,System.Ev entArgse){ShowMe();}

I don't know whether to laugh or cry (-1, Offtopic)

BlackTriangle (581416) | more than 12 years ago | (#3732306)

What's with all the 'this.' statements? They're usually completely redundant in Java...

Re:I don't know whether to laugh or cry (-1, Offtopic)

Anonymous Coward | more than 12 years ago | (#3732447)

Not bad for a 12-year old though...

Erm (3, Insightful)

adamwright (536224) | more than 12 years ago | (#3732271)

Isn't this exactly the point of robots.txt? Google won't cache content it doesn't spider, and it won't spider content forbidden by your robots.txt. Does the WayBack Machine obey the robots rules?

Re:Erm (2, Informative)

JebusIsLord (566856) | more than 12 years ago | (#3732368)

Yes, it does follow robots.txt protocol. Therefore there really isn't a problem now is there?

Re:Erm (1)

JebusIsLord (566856) | more than 12 years ago | (#3732403)

Little karma whoring here, but if you are not familiar:
Just make a file named robots.txt in your webroot and fill it with the following 2 lines:

User-agent: *
This will prevent any webcrawler that is compliant (i.e., most of them) from indexing your site at all. Problem solved.

Re:Erm (2, Informative)

JebusIsLord (566856) | more than 12 years ago | (#3732445)

Shoot, that should be:

User-agent: *
Disallow: /
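
For anyone who wants to check the difference between the two versions above, here is a minimal sketch using Python's standard `urllib.robotparser`. The crawler name `ExampleBot` and the `example.com` URL are made up for illustration, and this only shows how a compliant parser reads the rules, not how the Wayback Machine or any particular crawler is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# The corrected two-line file, and the incomplete one-line version.
BLOCK_ALL = ["User-agent: *", "Disallow: /"]
INCOMPLETE = ["User-agent: *"]


def allowed(robots_lines, user_agent, url):
    """Return True if a compliant crawler may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from a list of lines, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URL are hypothetical placeholders.
    url = "http://example.com/index.html"
    print("two-line file allows fetch:", allowed(BLOCK_ALL, "ExampleBot", url))
    print("one-line file allows fetch:", allowed(INCOMPLETE, "ExampleBot", url))
```

With the standard-library parser, the two-line file denies the fetch while the one-line file permits it, since a `User-agent` line with no `Disallow` rule excludes nothing.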

Re:Erm (0)

Anonymous Coward | more than 12 years ago | (#3732413)

The Wayback Machine claims to honor robots.txt files and meta tags, but there's no way to remove a site once it's in there. I had a site back in 1996 and I didn't know anything about robots.txt files back then. That site's long gone -- at least I thought -- but I found it with the Wayback Machine. You can't make a robots.txt file for a site that no longer exists.

Re:Erm (2, Funny)

HP LoveJet (8592) | more than 12 years ago | (#3732450)

Clearly an RFC is needed here:

"Retro-Temporal Automated User Agent Exclusion Protocol"

I'll try to put a draft together by April 1.

Re:Erm (2)

1g$man (221286) | more than 12 years ago | (#3732501)

Why do webmasters have to "opt-out" rather than "opt-in" to be cached?

Shouldn't the default be "don't allow spiders and caching" ? And if I want it then I should specifically allow it.

DAVE WINER - Why he's been missing (-1, Troll)

Anonymous Coward | more than 12 years ago | (#3732503)

Fans of Userland Software may have noticed that Dave Winer hasn't been updating Scripting News [] since last week.

It turns out that on June 14th, the Userland offices were raided by the FBI under charges of pedophilia. Apparently, Userland's flagship product, Radio Userland, is actually a vast P2P network of kiddie porn.

David Winer is currently facing serious charges related to the distribution of child pornography. Frontier and Manilla users are unaffected.

Yummy (2, Informative)

sheepab (461960) | more than 12 years ago | (#3732274)

Slashdot from 1997.

Amazing (-1, Flamebait)

BlackTriangle (581416) | more than 12 years ago | (#3732337)

Despite all the claims, Slashdot hasn't gotten worse. It's always been a haven of Lunix Zealots and Open Sores dorks.

Re:Yummy (2)

quintessent (197518) | more than 12 years ago | (#3732397)

Very nice. And it's good to know they were using the same careful journalism back then. I like this headline:

Judge Uninstalls IE in 90 seconds.

Re:Yummy (1)

mongoks (540017) | more than 12 years ago | (#3732482)

Already /.'d.

"Even in the future nothing works!" - Dark Helmet


Subject Line Troll (581198) | more than 12 years ago | (#3732277)

"The Wayback Machine" (3, Informative)

pb (1020) | more than 12 years ago | (#3732287)

"The Wayback Machine" has been a pet project for a long time, and we're only now seeing results. I know for a fact that they have pages back at least as far as 1996, and it's a damn shame they don't have anything that much earlier...

And yes, it obeys the Robots Exclusion Protocol.

"Ask Google" strikes again; I would hope that you could find all of this information by searching, or reading an "About" page, or something. Fortunately, these abortions to journalism don't appear on the Front Page very often.

Re:"The Wayback Machine" (4, Insightful)

Disevidence (576586) | more than 12 years ago | (#3732329)

I think the question is not about its being publicly available, but rather about it archiving web pages that were taken down at later dates for various reasons.

It's legally grey, and all it really takes is for some paranoid person to sue, and then the fireworks start.


Re:"The Wayback Machine" (4, Insightful)

martyn s (444964) | more than 12 years ago | (#3732455)

So I suppose libraries should just stop carrying books because the author doesn't like what he wrote anymore? I mean, what the fuck?


Subject Line Troll (581198) | more than 12 years ago | (#3732357)

Robots.txt (5, Informative)

mshowman (542844) | more than 12 years ago | (#3732291)

I had recently placed a restricted robots.txt file on my site and when trying to access any of the past revisions, I get a message saying that the owner has restricted access to the site via robots.txt. They seem to have that aspect under control.

Re:Robots.txt (1)

Dwedit (232252) | more than 12 years ago | (#3732474)

Tell that to the squatters who bought up great old sites and restrict their spamsites from being viewed by a new robots.txt file!

There are more than copyright concerns... (4, Insightful)

Anonymous Coward | more than 12 years ago | (#3732294)

It's a scary thought that things kids are saying on message boards when they're teenagers are going to be back to haunt them when they apply for jobs in their mid 40s...

I mean, if everything I posted on BBSes in the 1980s were still attributable to me... yikes.

Remember kids. Use a nickname, and change it frequently if you ever want to run for any kind of office.

Re:There are more than copyright concerns... (4, Insightful)

TheMonkeyDepartment (413269) | more than 12 years ago | (#3732323)

Well, that's a great point, and it's a good illustration of the double-edged sword of free speech. You are free to say whatever dumbshit, ridiculous things you want. But you are also free to deal with the social consequences.

Opting out -- of publicly available HTTP??? (4, Interesting)

TheMonkeyDepartment (413269) | more than 12 years ago | (#3732296)

When you publish something on the web, it is publicly available via HTTP. End of story. Responsible netizens can observe the requests of "robots.txt" but they don't have to. If you want something more controlled, create a VPN or intranet or some other kind of non-public data server.

Your argument is similar to that of newspaper publishers who didn't like "deep linking." What they couldn't (or didn't want to) understand is that the nature of an HTTP web server is quite simple. A client asks for a file, the server gives it back. Using that protocol implies that you are OK with that. If you're not, I suggest you look into different technologies, instead of complaining about lack of control, in a medium that was never intended to provide it.
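
The "client asks for a file, the server gives it back" exchange can be sketched with Python's standard library. This is only an illustration under stated assumptions: the page content, the handler name `PublicPageHandler`, and the helper `fetch_once` are all hypothetical, and the server runs on a throwaway local port.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello, world</body></html>"  # hypothetical page content


class PublicPageHandler(BaseHTTPRequestHandler):
    """Answers every GET with the same page, no questions asked."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start a local server, request one page as any client could, shut down."""
    server = HTTPServer(("127.0.0.1", 0), PublicPageHandler)  # port 0: pick a free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Bypass any configured proxy so the request goes straight to localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(f"http://127.0.0.1:{port}/index.html", timeout=5) as resp:
            return resp.status, resp.read()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_once()
    print(status, len(body), "bytes received")
```

Nothing in the handler asks who the client is or what it will do with the bytes, which is the point being made above about plain HTTP.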

Re:Opting out -- of publicly available HTTP??? (1)

ajmarks (447148) | more than 12 years ago | (#3732354)

One of the problems with archiving is simple copyright violation. If I make a site, regardless of the fact that HTTP is open, it is legally very questionable (understatement) to save a copy of it and redistribute it without my permission.

Re:Opting out -- of publicly available HTTP??? (1)

I_redwolf (51890) | more than 12 years ago | (#3732355)


Re:Opting out -- of publicly available HTTP??? (1)

billybobSDK (586720) | more than 12 years ago | (#3732394)

So should magazine publishers also be allowed to opt out of unrequested archival in my bathroom too?

Re:Opting out -- of publicly available HTTP??? (2)

krogoth (134320) | more than 12 years ago | (#3732432)

Exactly what I wanted to say. Of course, when you put something on the Internet you don't expect it to be archived forever, but you have to keep in mind that anyone can download it and do what they want.

Re:Opting out -- of publicly available HTTP??? (4, Insightful)

KillerCow (213458) | more than 12 years ago | (#3732444)

When you publish something on the web, it is publicly available via HTTP. End of story.

I don't think that that is a good enough standard. When a television show is broadcast, or when a book is published, it is publicly available -- but we don't think that the publisher loses their right to copyright protection in these cases. Publishing on the web is similar. The creator wants people to see his/her creation, but does not automatically give visitors the right to archive and retransmit the works.

Re:Opting out -- of publicly available HTTP??? (2)

FreeUser (11483) | more than 12 years ago | (#3732472)

When you publish something on the web, it is publicly available via HTTP. End of story.

Exactly. By publishing online and publicly you've already opted-in.

This is just another example of how incompatible copyright is with any kind of normalcy vis-a-vis individual freedom and, in this particular case, the freedom to archive information and hold someone accountable if they try to change it retroactively (and on the sly). Unless we want Orwellian-style changing of the facts post facto, copyright must lose to the right of archivists to preserve information from being lost. Any other policy would be disastrous.

Re:Opting out -- of publicly available HTTP??? (2)

sckeener (137243) | more than 12 years ago | (#3732493)

When you publish something on the web, it is publicly available via HTTP. End of story.

As a previous post pointed out, I don't think kids should have their remarks recorded forever. I doubt I would have made it as far as I have if my BBS quotes were still around...

Talk about a time machine... (3, Interesting)

wompser (165008) | more than 12 years ago | (#3732300)

Went back and looked at the site for the .com I used to work for, very nostalgic. The wayback machine is a good resource for people who create content on someone's site (a.k.a. me), and then lose access to it because the company goes under. Now I'm able to add my old content to my portfolio, now that the company who once owned it is gone.

Re:Talk about a time machine... (1)

Prof.Phreak (584152) | more than 12 years ago | (#3732498)

Totally agree. It's very nostalgic... brings back lots of memories. I'm actually kinda upset it doesn't go back farther in time. (my oldest site there is from 1997, kinda sad can't see the 'original' though).

And to the people who complain about copyrights: It's public content. If you don't want your "copyrighted" stuff on the internet, then simply don't put it there in the first place. Nobody is complaining about Google's cache, and this is something similar, except it goes back years.

I think this is a great thing! You can go back to see how the internet used to be. (go see how corny or looked in 1996 :-) The only bad thing is that it doesn't go back to the very beginning, other than that, it's one of the sites that will be on my favorites from now on.

Simple rule (1)

npsimons (32752) | more than 12 years ago | (#3732305)

There's a very simple rule to remember on the Internet: if you don't want it copied or linked to, don't put it online.

Come on people, wake up! First NPR, now this brain dead crack monkey who calls himself a "webmaster". Anyone who doesn't understand the simple rule stated above is not qualified to be a webmaster.

I can understand clueless users, but clueless sysadmins is something with which I will not put up.

Re:Simple rule (1)

galejt (259786) | more than 12 years ago | (#3732401)

Roger that!!!

Permission... (3, Insightful)

gorf (182301) | more than 12 years ago | (#3732307)

who gave them permission to make those copies?

The way I see it, you implicitly give people some limited form of permission by putting it up on the internet freely available to download in the first place. You put it up for people to download, print out and so forth (which amounts to copying), and therefore you've implied that people may do so.

Sure, you own copyright, and blatant plagiarism is something that clearly is wrong. But I see nothing wrong with taking an article that you published on the web and reproducing it, as long as it is taken in context and is clearly attributed (and it is made obvious that the copy isn't the original, but proper attribution would do this and therefore suffice).

Of course, this is republication and so the issue is not so clear and obviously subjective. That's just my opinion.

Re:Permission... (1)

rector (580924) | more than 12 years ago | (#3732443)

You put it up for people to download, print out and so forth (which amounts to copying), and therefore you've implied that people may do so.

Your argument is similar to the following:
If a program is shown on TV, why can't someone record it and sell copies?
Note that copying the content of a website is the same. And showing a banner on the website that contains the copy is the same as selling the content.

And some website owners explicitly prohibit even printing. See Bloomberg []

Re:Permission... (1)

JebusIsLord (566856) | more than 12 years ago | (#3732464)

No, he said absolutely nothing about "selling" it. Your example is flawed.

Re:Permission... phhtht. (0)

Anonymous Coward | more than 12 years ago | (#3732473)

And they can bite my shiny metal ass. Especially Bloomberg.

Friend or Foe? Hmmmm... (1)

Navaash Fenwylde (35067) | more than 12 years ago | (#3732309)

If I choose Friend, I can get half or none of the Wayback Machine's content...

but if I choose Foe, I can get all or none of its content?

Better choose Foe.

Legally you can stop them, but why? (3, Informative)

the_womble (580291) | more than 12 years ago | (#3732310)

If you own the copyright they cannot archive it without your permission. Legally, that is all there is to it.

Of course, in practice you have to pursue this and ask them to remove it.

If you really object, I suggest sending a list of every site you have or have had, with dates, and a request to remove everything. Then you only need to notify them when you put up a new site that should also be excluded. That would not be such a nuisance, would it?

That said I think they are providing a service that is interesting so unless you are harmed by it, why object?

I am interested in knowing how they had such old versions of your site though. Do search engines keep archives?

The story should read 'since 1996' (2)

forged (206127) | more than 12 years ago | (#3732313), 1 page (1996), 5 pages (1996), 7 pages (1996)

This is in the FAQ [] .

robots.txt (-1)

Anal Cocks (557998) | more than 12 years ago | (#3732314)

It's called robots.txt, you ignorant idiots.

As an creator... (2)

Bonker (243350) | more than 12 years ago | (#3732315)

As someone who makes lots of free sellable [] and unsellable content, I think The Wayback Machine is an invaluable resource. I can look back and see how big a dork I was and still am. I've also found stuff of mine that I've lost over time, amazed that anyone ever bothered to hold on to it.

Re:As an creator... (2)

rknop (240417) | more than 12 years ago | (#3732359)

I've also found stuff of mine that I've lost over time, amazed that anyone ever bothered to hold on to it.

Yes, I've used it for this too. I'm a volunteer webmaster for a site ( where we have a "monthly spotlight", but foolishly I wasn't keeping track of past spotlights. Eventually I wanted to put together a list of past spotlights, and realized that I hadn't kept that list. I felt stupid. The Wayback Machine (mostly) came to my rescue there.



Re:As an creator... (1)

Paradoxish (545066) | more than 12 years ago | (#3732385)

I agree that the Wayback Machine is pretty cool, although I don't think I'd call it "invaluable" or even a "resource". But it is very nice to go back and see old websites of mine stored there. Websites that I had taken down a looong time ago but are still being preserved in one way or another. Ultimately, if the Wayback Machine manages to last for long enough it'll be a way for anyone that has ever put content on the web to have something they created stored and available.

Ah, Gee! (2, Funny)

Dark Paladin (116525) | more than 12 years ago | (#3732318)

Sherman: Mr. Peabody, I want to go back in time!

Mr. Peabody: Be quite, Sherman. This new Wayback Machine is now accessable via a browser. Be happy with that.

Sherman: But I wanted to go back in time and watch Cleopatra taking one of those milk baths again.

Mr. Peabody: .... Damn it, boy, fire up the Wayback machine. And fetch me my chew toy.

Re:Ah, Gee! (0)

Anonymous Coward | more than 12 years ago | (#3732495)

Maybe Mr. Peabody can buy Sherman a spellchecker.

Even better.... (1)

sheepab (461960) | more than 12 years ago | (#3732320)

This is just....mind blowing. Look at Ebay from 1997 [] .

Who DOES have permission to copy your site? (3, Insightful)

allism (457899) | more than 12 years ago | (#3732326)

Do I have permission to copy the content of your site to my browser history directory, and if so, how long do I have permission to keep it? Can I show a copy of an html document that is stored in my browser history to my mother? What about my neighbor? Or the dude in another country I happen to be chatting with online?

IANAL blah blah blah, but once you open your files up to being downloaded and stored by a browser, you've pretty much given up the right to tell people they can't be re-distributed--I would think the best you could hope for is that people would re-distribute them, in whole, the way you originally released them.

I like it but... (4, Insightful)

rknop (240417) | more than 12 years ago | (#3732330)

When I first discovered it, it was a lot of fun. Much nostalgia; it was fun seeing earlier versions of my webpages. Some go back quite a number of years.

On the other hand, I was horrified when I realized that there was full archiving of If you visit that site, you will see that there are a large number of scripts (as in plays), many of which have restrictions on use. Over the years, we've had people request that scripts be removed from the site; of course, we did so. However, they weren't necessarily removed from the archive, and an archive keeps them forever. Specifically with the wayback machine, I was able to submit stuff that removed the specific directories I was worried about (they don't archive the scripts from, just the "front page" stuff which is all part of the fun), and keep them from doing it again.

I like the idea of archives; it preserves history. The web is a transient medium, but not entirely. Yes, much of the content is dynamic and should only be dynamic. Some of it, though, is like the front page of a newspaper. Each day, what's on "today's front page" is different-- but there is value and use in seeing what was on the front page in any day in history.

But sometimes you need to delete something and make sure it really is no longer available. When you don't completely control your site (i.e. somebody else archives it, rather than just mirrors it), that becomes impossible.


(Incremental backups can have a similar issue. If you only back up files which are "newer than the last backup", your backup doesn't have the information about files which have been *deleted* since the last backup. When you restore, you might find some files there you thought shouldn't exist any more.)
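The incremental-backup pitfall described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function `incremental_backup`, the file names, and the integer timestamps are all hypothetical, and real backup tools differ in how (or whether) they record deletions.

```python
def incremental_backup(live, backup, last_run):
    """Copy only files modified after last_run into the backup.

    Nothing here records deletions, which is the weakness being shown.
    """
    for name, (mtime, data) in live.items():
        if mtime > last_run:
            backup[name] = (mtime, data)
    return backup


# Hypothetical site state: file name -> (modification time, contents).
live = {
    "index.html": (1, "front page"),
    "script.txt": (1, "restricted play"),
}
backup = incremental_backup(live, {}, last_run=0)

# The restricted file is deleted from the live site and the index is edited.
del live["script.txt"]
live["index.html"] = (2, "front page v2")
backup = incremental_backup(live, backup, last_run=1)

# Restoring from the backup brings the deleted file back.
restored = dict(backup)
print(sorted(restored))  # ['index.html', 'script.txt']
```

An archive of a web site behaves much like `restored` here: it keeps whatever it once saw unless someone explicitly tells it about the deletion.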

( has changed so that it's not straightforward to get directly to the scripts any more. META tags tell the search engines to leave the actual scripts alone, and you can only get the text itself via CGI. Yes, it's easy to subvert if you put your mind to it, but at least you do have to put your mind to it, and automated search engines or archivers won't. 90% of the security for 1% of the effort.)
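As a rough sketch of how a well-behaved crawler might honor the kind of META tag mentioned above, the snippet below looks for a robots META tag in a page and decides whether archiving is permitted. The class `RobotsMetaParser`, the function `may_archive`, and the sample page are hypothetical; the exact tags used on the site discussed are not given in the comment, so the `noindex, noarchive` value is an assumption.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def may_archive(html):
    """Return False if the page asks robots not to index or archive it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noindex", "noarchive", "none"} & parser.directives)


# Assumed example page; the real site's markup is not shown in the thread.
page = (
    '<html><head><meta name="robots" content="noindex, noarchive">'
    "</head><body>script text</body></html>"
)
print(may_archive(page))  # False
```

As the comment says, this only keeps out crawlers that choose to check; it is a convention, not an access control.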


its a good thing... (1)

negativethirsty (555244) | more than 12 years ago | (#3732333)

" Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one."

If you don't have a record of what something was before, how do you know it's changed?

Personally I love seeing older versions of previous work and watching the trends in web development as they progress.

I love it. (3, Informative)

gripdamage (529664) | more than 12 years ago | (#3732334)

What's the problem?

If you do something illegal on your website, you won't be held responsible more than once just because the data persists on the Wayback machine. If you remove the offensive material from your site, that's all you can do. The Wayback machine can deal with their own lawsuit threats. And I'm sure they'll remove material if you are the site owner and ask nicely.

As far as outdated information, anyone reading pages on the wayback machine and expecting them to be current would have to be crazy. It's an archive after all.

It's easy to opt out. Google provides instructions in their webmaster FAQ [] which points out "There is a standard for robot exclusion at [] ."

boo fucking hoo (-1)

CmdrTaco (troll) (578383) | more than 12 years ago | (#3732336)

I bet you are crying yourself a river about your copyrights being violated at the same time you are filling your hard drive with mp3s.

As a webmaster of various sites... (5, Insightful)

schon (31600) | more than 12 years ago | (#3732338)

As a webmaster of various sites, I have no problem with archives.. if I didn't want people to see my stuff, I wouldn't have put it on the internet in the first place.

where did they get such old copies of my websites, and who gave them permission to make those copies?

They probably got the copies the same way everybody else did - by surfing. You (implicitly) gave them permission to cache your sites by not including an appropriate entry in your robots.txt.

The way I see it, archives are much like SPAM; I never opted in, why should it be my responsibility to opt out?

Archives are nothing like spam. Spam is primarily harassment. These guys aren't harassing you. They did ask your permission (by way of checking your robots.txt). If you've since changed your mind, it's your responsibility to notify them.

Google caches material too - do you consider them to be spam as well?

Archive sites provide a valuable resource to the rest of the 'net. If you don't like it, put an appropriate entry in your robots.txt file, and be done with it.
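For anyone unsure what "an appropriate entry" looks like, here is a minimal sketch that checks a robots.txt against two crawlers using Python's standard `urllib.robotparser`. The user-agent name `ia_archiver` for the archive's crawler, the `example.com` URLs, and the `/private/` path are assumptions for illustration; check the archive's own FAQ for the name it actually honors.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: shut out the archive crawler entirely and
# keep every other robot out of /private/ only.
robots_txt = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

archiver_allowed = parser.can_fetch("ia_archiver", "http://example.com/index.html")
other_allowed = parser.can_fetch("SomeOtherBot", "http://example.com/index.html")
other_private = parser.can_fetch("SomeOtherBot", "http://example.com/private/notes.html")

print(archiver_allowed, other_allowed, other_private)  # False True False
```

The file only expresses a request. It works because crawlers such as the ones discussed here choose to read it before fetching anything else.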

The web is a public medium! (2)

Steveftoth (78419) | more than 12 years ago | (#3732429)

The parent post said almost everything I was going to, but one thing I wanted to add: if a spider is able to get to a page at all (even a spider that doesn't follow the robots protocol, which the Wayback Machine does), it is only seeing a public page that anyone with an internet connection can get to.

Otherwise you have bad control over your content and need to update your web server to not serve that content. If you don't want people to be able to copy your information then don't give it to them. Or only give it to them in a signed format that cannot be easily duplicated.

It's like being surprised that someone has forwarded an email that you sent them.

Re:As a webmaster of various sites... (0)

Anonymous Coward | more than 12 years ago | (#3732477)

You (implicitly) gave them permission to cache your sites by not including an appropriate entry in your robots.txt.

Yes, and I specifically denied them permission to redistribute my intellectual property when I wrote "copyright XXXX, by YYYY. All rights reserved."

Can libraries keep old newspapers? (2)

cperciva (102828) | more than 12 years ago | (#3732339)

The submitter states that he never gave the Internet Archive permission to replicate his work. He is wrong.

By placing material on the web, one is implicitly granting permission for it to be read. If I put a poster up in my window, I lose the right to complain if someone walking by on the street reads it.

Equally, I lose the right to complain if someone walks by and takes a photograph of the front of my house, including the poster. The fact that someone might then be able to read the poster ten years from now is irrelevant.

If the Internet Archive were required to seek permission before archiving freely and publicly available material, then the same argument would require libraries to seek permission prior to archiving (free) newspapers.

Timeshifting is fair use, and it applies to web pages just as well as TV signals.

But!!! (2) (142825) | more than 12 years ago | (#3732504)

A person may take a picture of the front of your house and of you and your painting for personal use.

Now, when that person redistributes it, then it becomes an issue of fair use, copyright and license.

Quite simply, without Google ... (2)

Vicegrip (82853) | more than 12 years ago | (#3732342)

I would never have visited countless sites I regularly surf to. Google has definitely been a major gateway to the internet for me.

I think making an issue of the caching is a moot point, as about 99% of the time I go to the website for the content, since the source is always better than the cache. I use the cache only in cases when the content has disappeared or when the website itself is gone.

This is a valuable service Google is providing-- and webmasters get it for free.

Preserving information is important. (5, Insightful)

Chiasmus_ (171285) | more than 12 years ago | (#3732344)

I doubt that I'm alone in my belief that it is always tragic when any piece of information--no matter how trivial--is lost forever.

If a person has offered that information for free at any point, to the extent that an automated script could access it, then I believe that information can be safely considered public domain. I doubt that there's any mechanism by which Richard M. Stallman could lose his mind and "rein in" all copies of GNU, or by which Stephen King could recall all his novels and refund the purchase price; once something is offered to the public, it no longer belongs exclusively to the publisher.

In my opinion, the value of archives in the future immeasurably outweighs occasional inconveniences of having information stick around longer than the author would have wished.

Re:Preserving information is important. (2)

quintessent (197518) | more than 12 years ago | (#3732499)

Throwing out so much dated information would mean discarding a critical part of our written history. Did you notice how the multitude of Y2K disaster sites changed from 1999 to 2000? That is history.

If the courts are going to outlaw archives of the Internet, I suggest they do a complete job of suppression and order the burning of all books, newspapers, and magazines more than a year old.

Then authors will be free to rewrite history as they wish.

It has its uses. (1)

Helmholtz Coil (581131) | more than 12 years ago | (#3732346)

I like it...I'm just the latest in a long line of webmasters for the site I run, my boss ran it before me. I will gleefully pull out his work for him anytime he gripes about the current incarnation. :)

err okay... (2)

NanoGator (522640) | more than 12 years ago | (#3732349)

"I can't help but wonder: where did they get such old copies of my websites, and who gave them permission to make those copies?"

You sound like Television broadcasters when you say something like that. "We'll broadcast content over the airwaves, but you better not capture it!"

Well, let me make it simple for you: When you make something public you cannot expect to bottle it up later. That's the whole reason that the internet is in existence: Extreme redundancy so that data is never lost. The original idea was to build a data network that could survive a nuclear attack.

I don't think anybody should ever post stuff on the web without expecting it to last forever in some form or another, regardless of whether permission is granted.

How so? (2)

SkyLeach (188871) | more than 12 years ago | (#3732350)

"Of course, the issue that may bug many content providers is how to opt-out of such services, since some see it as a copyright violation."

So I need to burn all my old comics? Or perhaps I don't need to ever allow anybody to look at them?

Caches aren't republishing information, they are archiving it. That's what libraries do too. Hell, they can even charge for the service if they want and still be in the moral right.

Excellent idea (1)

synthox (229949) | more than 12 years ago | (#3732356)

I myself am a fan of the Wayback Machine. I really like to see snapshots of how my sites and some of my favorite websites have evolved over the years. I would also like to think that I could actually show my grandchildren what the internet was like in my prime instead of saying "back in my day we read Slashdot and we liked it, now pass me my teeth".

Fork over your caches (3, Funny)

Eponymous, Showered (73818) | more than 12 years ago | (#3732358)

I browsed all of your sites (even the abandoned ones) and since my browser cache is set to 782TB (and I'm still running Netscape 1.0N), your sites are still there. And my cache is publicly accessible via my webserver. Yet another way you're being violated. Ah, the risks and perils of publishing on a public network.

awwwwww (0)

red_five_standing_by (582037) | more than 12 years ago | (#3732363)

how slashdot...

Archives need to be made (4, Insightful)

Waffle Iron (339739) | more than 12 years ago | (#3732366)

If the courts determine that it is technically illegal to make archives of electronic content, then the copyright laws should be changed to explicitly allow archiving. Otherwise, we could eventually lose track of history. The only written record of large portions of our civilization would be relegated to a few rusting web server hard drives buried in landfills.

If you read 1984, you might remember that the government tightly controlled all old copies of documents so that they could manipulate history as they wished. We might get into a similar situation by accident if we don't allow independent archives of electronic information.

With traditional media, you publish something on paper, but you don't get to control who puts the paper copies in which archives. That has served us well for keeping track of history, and an equivalent system needs to be maintained for electronic content.

A Real World Example/Question (2, Insightful) (84577) | more than 12 years ago | (#3732373)

Do libraries have to get permission to save and allow browsing of copies of newspapers (both physical and microfiche)?

Copyright must die! There is no such right (0, Troll)

WetCat (558132) | more than 12 years ago | (#3732375)

We have the right to live, feed, have children, work, be under cover. We have no right to copy. Copying is free! And don't restrict the rights of others to access information, please!
(yes I know about to copy and copyright)

Re:Copyright must die! There is no such right (2)

Chiasmus_ (171285) | more than 12 years ago | (#3732452)

According to Locke, the "natural rights" of man are life, liberty, and the ability to own property; when you enter into a society, you turn over all those rights to the State in return for whatever rights it deems fit to grant you.

Thus, no one has the right to eat, have children, work, or be sheltered, unless their government sees fit to grant those rights. Certainly, America does not acknowledge a right to be employed or to eat; in fact, it's been known to blacklist people in the hope that they'll do neither.

And no, no society I'm aware of has ever given its citizens the right to copy information indiscriminately. Personally, I would love to see a society do so, because I suspect that such a society would actually probably end up richer in technology and culture. Both sides of the argument make some sense, but only one is actually tried, and it's apparent that excessively restrictive copyright laws actually retard cultural and economic growth. But, no, as it stands, society has deemed that the exclusive right to copy a piece of work is something a government can hand out.

And what to do when info must die? (2, Insightful)

Nf1nk (443791) | more than 12 years ago | (#3732379)

For the most part I don't have a problem with them archiving my sites (after all they can show me what a site used to look like faster than digging out my backups), but recently one of my customers told me to remove all traces of a product from their site (something about nasty litigation). I pulled the info off our servers quickly, but three hours later I got a nasty phone call from the customer saying he could still see the product on the site. It seems it was hung up in some proxy server between here and there.

Back to the point: how do you deal with an archive when you need to get rid of information that is a liability to you now? Maybe we are better off without them in some cases.

Re:And what to do when info must die? (1)

crath (80215) | more than 12 years ago | (#3732485)

how do you deal with an archive when you need to get rid of information that is a liability to you now

The first rule of email is: write every email assuming that everyone in the world will eventually read it.

The first rule of web-posting is not too dissimilar: post every document/picture/program/you-name-it assuming that it will always be readable by everyone.

Anyone who ignores these rules will suffer well deserved consequences. If you don't want your content cached, copied, archived, printed, et al then don't post it in the first place.

Re:And what to do when info must die? (0)

Anonymous Coward | more than 12 years ago | (#3732497)

Use No-Cache meta tags if you have to. Also, any decent proxy runs a "has this page changed" request to the server even when serving a cached copy.
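The "has this page changed" request mentioned above is, in HTTP terms, a conditional GET. The sketch below shows the server-side decision only, under stated assumptions: the function `respond`, the dates, and the use of a `Cache-Control: no-cache` response header (rather than a META tag) are illustrative choices, not a description of any particular server or proxy.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, if_modified_since=None):
    """Return (status, headers) for a page, honoring If-Modified-Since.

    last_modified must be a timezone-aware UTC datetime.
    """
    headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        # Ask caches to revalidate with the server before reusing a copy.
        "Cache-Control": "no-cache",
    }
    if if_modified_since is not None:
        cached_at = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_at:
            return 304, headers  # Not Modified: the cached copy is current
    return 200, headers  # send the full page


# Hypothetical timeline: a proxy fetches the page, then revalidates it.
changed_at = datetime(2002, 6, 17, 12, 0, 0, tzinfo=timezone.utc)
status_first, headers = respond(changed_at)
status_again, _ = respond(changed_at, headers["Last-Modified"])

# After the page is edited, the same revalidation gets the new content.
edited_at = datetime(2002, 6, 18, 9, 0, 0, tzinfo=timezone.utc)
status_after_edit, _ = respond(edited_at, headers["Last-Modified"])

print(status_first, status_again, status_after_edit)  # 200 304 200
```

This only helps with caches that revalidate. A proxy that serves a stale copy without asking, as in the story above, never reaches this code path.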

it's even... (1)

gabvalois (256651) | more than 12 years ago | (#3732383)

It's even as slow as it was back then!

Windows XP much easier/better than Linux - 0002 (-1, Offtopic)

Anonymous Coward | more than 12 years ago | (#3732386)

Buy computer from walmart with mandrake installed.
Buy another computer from walmart with Windows XP installed.
Buy a nice inkjet printer, perhaps a HP.

Now, plug in printer to Windows XP box. Xp finds new hardware and installs drivers for you and you can print whatever you want.
Now, plug in printer to Linux Box. Nothing Happens. Try to print. Good Luck

Windows XP once again much easier to use than Linux.


Friend to Hosting Companies (5, Funny)

Da J Rob (469571) | more than 12 years ago | (#3732387)

I was talking to this guy who works for a web hosting company [] , and he says a fourth of his sales calls are people calling him up because they're pissed that their last hosting company 'lost' their site. (In reality, most of the time it's later found out that the guy deleted it himself or renamed index.html to index2.html, etc.) He says that for 90% of the sites he can find a copy on the Wayback Machine. He'll then start to quote the website's contents to the guy on the phone and usually will have the amazed (and dumbfounded) customer signing a hosting contract by the end of the day.

Hah! (0)

Anonymous Coward | more than 12 years ago | (#3732390)

Move it to Sealand or something like that, or some other country where copyrights are meaningless.

actually, Fan or Freak. (0)

sulli (195030) | more than 12 years ago | (#3732391)

depending on whether the site you had up when you were scanned is/was any good!

caching proxy servers (1)

bigpat (158134) | more than 12 years ago | (#3732400)

Caching of web pages on the internet is considered fair use and is central to the Web. Isn't this like a time-delayed caching server? This is just caching for a different purpose... and they aren't making money off of other people's content.

lawsuit (-1)

Anonymous Pancake (458864) | more than 12 years ago | (#3732404)

I'm currently in a class action lawsuit against the Wayback Machine for copying our copyrighted site; we are also planning on suing Slashdot for linking to it.

Uh, robots.txt! (2)

Tom7 (102298) | more than 12 years ago | (#3732421)

Use robots.txt, stupido. It lets you prevent search engines from indexing and archiving your property. However, if you're that concerned about people copying your pages, you might try avoiding the internet.

I personally love the internet archive and google's cache.

The old days... (-1, Troll)

Anonymous Coward | more than 12 years ago | (#3732422)

Fertile trolling grounds [] ...

robots.txt won't work (0)

tps12 (105590) | more than 12 years ago | (#3732434)

I know everyone is going to say, "just make a robots.txt file and everything will be okay." Sadly, that is naive and incorrect. What makes you think that the people who send out 'bots looking for content (rather than create their own or use hyperlinks!) would honor such a noble convention?

This is like trying to solve music piracy by putting a "No Napster" sticker on the jewel box. Nice thought, but it's a dead-end.

it's a good thing (1)

red_five_standing_by (582037) | more than 12 years ago | (#3732436)

someone backed up the Internet to floppy.

Euro friendly :) (1, Interesting)

Anonymous Coward | more than 12 years ago | (#3732438)

Well, the wayback machine helped me in confronting some companies for raising their prices when we changed to the euro :)

Especially dominio's pizza. They raised their prices more than 12%. I printed out the page and got a 15% discount :)

robots.txt DUUUUUUUUHHHHHHH!!!!!!!! (2)

jsimon12 (207119) | more than 12 years ago | (#3732441)

For such a "webMASTER" this guy doesn't seem to know a lot about the Internet; he seems more concerned with keeping his "Intellectual Property" safe than with actually understanding the way things work.

People like this ruin the concept of the Internet, the free exchange of knowledge. I hope other people on /. feel the same.

Copyright and websites. (3, Interesting) (142825) | more than 12 years ago | (#3732449)

It could be argued that the site is publicly available and thus anyone can copy it. There is also the issue of fair use. That is why many people place terms of use and robots.txt files on their sites. It could even be a DMCA violation where an IP (or range) has been blocked, so people from that IP use the Google cache to bypass the block.

I don't mind that my site is being added to indexes that the public have use of for free. I have a problem where a company uses my site to make a profit, with no public benefit.

There is case law where unauthorized access to a website is a copyright violation.

I am trying to use copyright law against some of the spammers who scrape my site for email addresses. Then, go after the spam software companies for contributory infringement (let the napster rulings serve some good).

Get Used to It, please (2)

pyrrho (167252) | more than 12 years ago | (#3732453)

I understand the concerns, but I think it's a part of the net, a good part, that we have to wrap our minds around.

Especially when you mention Usenet archives, which are (ok, get ready to laugh) historically important. I'm not kidding! There is a little signal in there, it's a cultural brain dump, and that's of historic interest.

I think the rub is, if the archive presents the data exactly as you presented it (that is, it doesn't play with your content, present it in a frame or otherwise embed it as their own content), then it is a fair archive, a ghost of your site still walking the internet. There is no taking it back once you post it.

TV Broadcast analogy (4, Interesting)

rknop (240417) | more than 12 years ago | (#3732456)

Some have already drawn analogies to TV broadcasts, saying hey, it was broadcast, you get to keep a copy. You can't bitch now if people still have that copy, unless you're Jack Valenti.

You can spin this how you want. Here's one valid way to think about it though: a TV network broadcasts a show. You make a private copy on a VCR tape. Jack Valenti aside, you can watch that copy again as often as you like, and it's no big deal. However, you do not have the right to rebroadcast your copy of that show to the public without the permission of the original copyright holder. (I have my B5 tapes. I'm watching them through again now, showing them to my wife. I'm sure nobody is upset about this. But I'd be in deep doo-doo if I managed to broadcast them on a local access station, or uploaded them to a public website.)

If you are inclined to be negative about the Wayback Machine, you could view it this way. While the page existed on the original site, it was broadcast to the public. If somebody made a personal copy, they have it and will always have it, even if the site goes down. However, when the site goes down, individuals do not necessarily have the right to then "rebroadcast" (i.e. post) themselves the content they downloaded and kept. This, however, is what the WayBack machine is doing.

Mind you, except for the issue with that I noted above (and which I fixed long ago), I like the WayBack machine, and am happy that they archived the content which was implicitly copyrighted to me. I would have opted in if I had wanted to. But, of course, I didn't know about it back in 1996 to opt in.

I don't have a good answer to the questions. Just thought.


best thing since sliced bread (2, Insightful)

John Sokol (109591) | more than 12 years ago | (#3732469)

There is nothing worse than revisionist history. I can't stand seeing sites that post something, only to have it vanish forever a bit later or be altered to remove the very thing I was interested in.
There are several GPL'ed Open Source software packages that I have copies of that have vanished, along with all references to them, and are no longer available on the net. Also, a number of great sites have come and gone for lack of either cash or time. I think if someone open sources something it should stay that way.

Also if it's open on the net for public viewing, then it should be fair game. Especially if the original author is credited and it is in the original context, like the Wayback Machine is. I know there are always special cases where something was put up that the webmaster was not entitled to like a copyrighted book or something, but for most stuff this is invaluable and a great service to humanity.

Also think of all those users whose web site was lost without a backup. Now they can get that data back.

The Wayback Machine is one of the few web services I'd be willing to pay for.


used to be part of a "web helper" app (-1)

kahuna720 (56586) | more than 12 years ago | (#3732471)

Called Alexa. I remember seeing their bot spiders
going through all the time. Alexa was (is?) another
menu bar placed in the browser that would get cached
copies of the page if not available at the time,
stuff like that.

Glad it has been released to the public. I remember
when I lost some site pages a few years back, and
contacted Alexa about retrieving it from their archives,
but no joy as they were not interested. Now they've
put it all up there as, excellent!

p2p (1)

mephist01 (122565) | more than 12 years ago | (#3732476)

This reminds me a lot of the old opt-in/opt-out p2p debate.

Don't publish a website ... (0)

Anonymous Coward | more than 12 years ago | (#3732478)

Don't publish a website available to anyone on the Internet if you don't want a "snapshot" taken. I'm personally very comfortable with my work and writings being available to anyone, forever. If I wasn't I wouldn't have put them online.

Wayback machine = free backups! (1)

FamousLongAgo (257744) | more than 12 years ago | (#3732480)

I like to think of the Wayback Machine as my personal backup server.

I just put all my most vital files in a web folder, and their crawlers take care of the rest.

And for encryption? Two words, baby:


Library archives are given broader copyright uses (5, Informative)

tiltowait (306189) | more than 12 years ago | (#3732486)

.... and wayback is sponsored, amongst others, by the Library of Congress. The archive itself is a 501(c)(3) public nonprofit. See 17 U.S.C. SECTION 108(a)(3) [] for more information.

Strange that such a complaint would appear within a group espousing that "information wants to be free." :)

For what it's worth... (2)

Reality Master 101 (179095) | more than 12 years ago | (#3732491)

What particularly interests me is the fact that the Machine is a relatively new animal, yet it contains snapshots from my sites dating back to 1998.

Interestingly, if you look at Slashdot's earliest entry [] (man, that page was ugly back then!), and then look at the bottom of the page, it shows the domain that was used to pull the page: "Welcome User From". appears to be some web search ("powered by Google") toolbar thingy. I can't determine if they are the same people as the wayback machine or not.

Purist? Pure what? (5, Insightful)

American AC in Paris (230456) | more than 12 years ago | (#3732492)

Perhaps I'm too much of a purist, but I've always seen the internet as an ever-changing medium, not a permanent one. Archives have bothered me ever since the fledgling days of DejaNews.

I'd say it makes you more of a control freak than a purist, personally.

Seriously, how did you ever get it into your head that a medium that serves documents to the general public on demand would be somehow exempt from archiving?

Would it bother you if John Q. Savant could recite the contents of your web pages from memory ten years after you'd taken them down?

Would it bother you to learn that stock prices, perhaps the most "ever-changing" thing out there, are permanently archived by a variety of services?

Or are you just jittery at the thought that your spouse/boss/Friendly Neighborhood Representative of The Man/kids may be able to someday look at the shite you plastered all over the web in your younger days? ("Ech, that stupid Netscape 2 animated title hack--honey, you actually -did- that?")
