
Integrating Wikipedia With a Local Intranet Wiki

samzenpus posted more than 5 years ago | from the mix-and-match dept.

The Internet 121

An anonymous reader writes "I work for a large company, taking a preliminary look at developing an honest-to-goodness wiki. We have tried to launch a company-wide wiki before, but with little success. The technical domains of each part of the company are different, thus each article needs a good deal of background to be useful. Of course, due to the proprietary nature of our work, we cannot share our articles outside of the intranet. What we would like to do is leverage existing wikis by augmenting our internal wiki with an external wiki. When a user accesses Wikipedia from inside our intranet, they receive the Wikipedia content plus the local domain-specific information. For example, links to company-specific wiki pages would be available in Wikipedia pages. Has anyone else tried to do something like this? I know it sounds like a logistical nightmare; are there any thoughts on how to make this successful?"


URLs (2, Funny)

BadAnalogyGuy (945258) | more than 5 years ago | (#28713331)

URLs. Look into it.


Re:URLs (2, Insightful)

smallfries (601545) | more than 5 years ago | (#28713705)


It's a good place to bury the signal.


Re:URLs (0, Offtopic)

CarpetShark (865376) | more than 5 years ago | (#28713991)

Hahhah, 100% accurate, and nicely put :D

Re:URLs (2, Interesting)

S77IM (1371931) | more than 5 years ago | (#28715755)

Said in a crude way, but to the OP: this guy is right. The most brain-dead-simple way to make this work is to just set up your own wiki and pepper it liberally with links to relevant Wikipedia pages. As someone below points out, there's even a feature in MediaWiki to make this linking easier (look up "InterWiki" in the MediaWiki help).

You may even be able to set up #REDIRECTS using InterWiki links so that people can still see the page names you want in your search and category listing, and then be taken straight to Wikipedia. If you want to get fancy, you can create a Template that opens the Wikipedia page in an IFRAME or does some DHTML to embed the content, so that the surrounding trim (edit link and search box) is for your wiki. The key here is to make sure people understand which content is your own wiki and which is Wikipedia -- whenever they edit or add a page, it goes into your wiki; treat Wikipedia as read-only. (If someone wants to make a genuine contribution to Wikipedia, they can go do that the normal way.)
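For reference, the MediaWiki markup involved might look something like this (a hedged sketch: the "wikipedia:" interwiki prefix is a stock default, but whether hard redirects to interwiki targets actually resolve depends on your MediaWiki configuration):

```
[[wikipedia:Capacitor|Capacitor (on Wikipedia)]]

#REDIRECT [[wikipedia:Capacitor]]
```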

I think that's the tightest integration you can get -- easy access to Wikipedia info, plus your own wiki for company-specific stuff. On the other hand this right here sounds like a recipe for disaster:

When a user accesses Wikipedia from inside our intranet, they receive the wikipedia content, plus the local domain specific information. For example, links to company-specific wiki pages would be available in Wikipedia pages.

That's not a logistical nightmare; that's a major development effort, just for a freakin' wiki. Requirements like this are why so many software projects fail. Abandon the pie-in-the-sky vision and go with something simple to start.

Like URLs.

Re:URLs (1)

TheRaven64 (641858) | more than 5 years ago | (#28717033)

If you want a simpler solution and have a few tens of GBs of space to spare, then you can just download a snapshot of Wikipedia and use that as the base for your wiki. You won't get any future articles, but you'll get the current ones.

On the other hand, I don't really see the point. Is it really so hard to read both the Wikipedia page and the local page?

cron + rsync + tar (1)

itomato (91092) | more than 5 years ago | (#28718091)

Every organization needs its own, up-to-date version of . []

But seriously: process the SQL dump when you retrieve a monthly (quarterly?) update. Generate a set of strings relevant to your organization, and strip articles that don't match.

Someone can always visit the upstream site, or you can use the interwiki facilities, as mentioned elsewhere.
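The filtering step could be a predicate as simple as this (a JavaScript sketch; the keyword list and matching rules are entirely yours to define):

```javascript
// Sketch: decide whether a dumped article is worth keeping, given a list of
// organization-specific keywords. Case-insensitive substring match; a real
// filter might use word boundaries or the article's category links instead.
function isRelevant(articleText, keywords) {
  var text = articleText.toLowerCase();
  return keywords.some(function (k) {
    return text.indexOf(k.toLowerCase()) !== -1;
  });
}
```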

bad idea (5, Interesting)

uepuejq (1095319) | more than 5 years ago | (#28713343)

Create a Firefox addon that downloads a master list of Wikipedia URLs to add an intranet link to. You can use regular expressions to parse the Wikipedia source so that your link is consistently placed. The master list can be updated at will, and could probably be filled the first time with a simple database request. Or something.
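A Greasemonkey-style sketch of that idea (everything named here is an assumption: the intranet host, the master list, and Wikipedia's "content" element):

```javascript
// Master list of titles that have internal counterparts; in practice this
// would be fetched from an intranet server and cached.
var internalTitles = { "Capacitor": true, "Widget_frobnication": true };

// Build the intranet URL for a given article title (host name is made up).
function intranetUrlFor(title) {
  return "http://wiki.intranet.example/index.php/" + encodeURIComponent(title);
}

// Pull the article title out of a Wikipedia path like "/wiki/Capacitor".
function titleFromPath(path) {
  var m = /^\/wiki\/([^?#]+)/.exec(path);
  return m ? decodeURIComponent(m[1]) : null;
}

// Browser-only part: prepend a link when the current article has a local twin.
if (typeof document !== "undefined" && typeof location !== "undefined") {
  var title = titleFromPath(location.pathname);
  var content = document.getElementById("content");
  if (title && internalTitles[title] && content) {
    var a = document.createElement("a");
    a.href = intranetUrlFor(title);
    a.textContent = "See also: internal wiki page";
    content.insertBefore(a, content.firstChild);
  }
}
```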

Re:bad idea (1)

ArbiterShadow (1222388) | more than 5 years ago | (#28713491)

I also think this is actually a decent idea. The only problem is that it takes a bit of user configuring, and will also make changes to the page on Wikipedia as well as the "internal" version. You could also just keep a server running that mirrors Wikipedia for the pages that you want, and uses said regexes to insert the internal links then. Transforming at compile time rather than runtime, if you will.

Re:bad idea (1)

Max Romantschuk (132276) | more than 5 years ago | (#28714145)

I wholeheartedly agree with the parent. Your best bet is to do this as dynamically as possible. Scraping web pages is a huge pain. Building an extension to detect when you're visiting Wikipedia and inject something into the page is a hell of a lot simpler.

Another poster suggested greasemonkey. I haven't used it myself, but I suspect it would make sense to develop a prototype with greasemonkey first. It might well be that a custom extension is not needed at all.

Also, Firebug is your friend.

Re:bad idea (1)

Canazza (1428553) | more than 5 years ago | (#28714381)

A well-written JavaScript bookmarklet will do the job too. You likely don't even need Greasemonkey, and it can be made cross-browser.

Re:bad idea (1)

Jane Q. Public (1010737) | more than 5 years ago | (#28714439)

Scraping web pages is not so bad... I have been doing it for years. But in this case it is entirely unnecessary.

I know of at least two ways this could be done, neither of which is nearly as much work as this would seem at first. First, did you know that the entirety of the content of Wikipedia is downloadable, in different formats? You can get everything, or just the current articles without the history (much smaller), and there are other options as well. While there is a lot of data, it is really not that difficult to take a snapshot of Wikipedia and load it into a local database. I have done it before myself. It is a really huge database... but it is doable, and I managed to make it work in MySQL on my local development machine, not even a fancy server. Though I admit it was not fast that way.

Probably a better solution, however, would be to use Wikipedia's API. Get a back-end programmer to write a program that compares users' search requests against the content of your local wiki. If there is no match, forward the request to Wikipedia via the Wikipedia API. This is not a terribly complicated task. The hardest part would be determining whether a search matched something local. But that is the kind of task that has to be done internally.
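A sketch of that fallback flow; the Wikipedia endpoint and its list=search module are the standard MediaWiki web API, while the local lookup is entirely hypothetical:

```javascript
// Build a standard MediaWiki API search URL for en.wikipedia.org.
function wikipediaSearchUrl(term) {
  return "https://en.wikipedia.org/w/api.php" +
    "?action=query&list=search&format=json" +
    "&srsearch=" + encodeURIComponent(term);
}

// Hypothetical local lookup; assume it resolves to null when nothing matches.
async function searchLocalWiki(term) {
  return null; // placeholder: query your intranet wiki here
}

// Prefer local content; fall back to the public Wikipedia API.
async function search(term) {
  var local = await searchLocalWiki(term);
  if (local !== null) return local;
  var res = await fetch(wikipediaSearchUrl(term));
  return res.json();
}
```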

Re:bad idea (1)

Canazza (1428553) | more than 5 years ago | (#28714579)

If we're talking about redirects, it would be quite easy to generate a 404 page that would redirect you to the Wikipedia page, either through a link or as a straight redirect. Or, if you can, use .htaccess and set up redirect rules there (it's the way wikipedia works anyway AFAIK, it just means adding more rules to your existing one)
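A hedged mod_rewrite sketch of the .htaccess variant (the paths and conditions depend entirely on how your local wiki is laid out):

```apache
# Sketch: if the requested article does not exist as a local file,
# send the visitor to the same title on Wikipedia.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^wiki/(.*)$ https://en.wikipedia.org/wiki/$1 [R=302,L]
```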

Re:bad idea (1)

uepuejq (1095319) | more than 5 years ago | (#28715273)

I think you get into too much development and design effort when such a simple task can be accomplished with resources that are already available. Wikipedia already exists, and they already have servers to handle the search requests. I still say develop a small application (whether it's some JavaScript, a new add-on, or something that already exists) to add links to Wikipedia entries that have alternate internal wiki entries, since according to the original question the employees are going to be using Wikipedia a lot anyway. You can make the link apparent so that employees can take a quick glance to see whether they need to do any more internal research. You don't have to waste time downloading, restoring, and implementing the Wikipedia database internally, and you don't have to waste time updating it, writing scripts to do so automatically, or finding scripts that will.

As for the API approach, I think that would be fun, but it seems like a waste of server power to dedicate a machine to something that could be done dynamically on client computers with a relatively minute bit of JavaScript, and would be just as functional and ultimately less effort to maintain. For future changes to the API you may have to alter your entire application, but if Wikipedia just changes the HTML layout, all you have to do is modify a regular expression and the rest of the code is still usable.

Re:bad idea (1)

uepuejq (1095319) | more than 5 years ago | (#28715331)

jesus i hate when i start sentences, find something else to think about, then never oh look a flashing light

Re:bad idea (3, Funny)

dyingtolive (1393037) | more than 5 years ago | (#28715777)

i'd mod you funny
you didnt read like
e.e. cummings

Re:bad idea (0)

Anonymous Coward | more than 5 years ago | (#28717435)

my son actually does this, at 2 I expect it but am still amazed every single time to see it happen sincerely.

Re:bad idea (1)

fedxone-v86 (1080801) | more than 5 years ago | (#28716191)

Also, Firebug is your friend.

Have you just woken up from cryostasis*, too?

And you know the craziest thing I've heard about Firebug? Allegedly, people also use it to debug web applications written in JavaScript! Applications... On the web... In JavaScript...

What's next? Apple running on Intel? Bill Gates becoming a humanitarian?


*) I was frozen just before GWB started WWIII and thawed after the Blacks won :D

**) Thank you, thank you. I'll be here all night.

redirect wikipedia traffic to an internal web app (0)

Anonymous Coward | more than 5 years ago | (#28713381)

Build a web application that merges Wikipedia content with internal content (iframes, maybe).
Then set up a DNS alias to redirect the Wikipedia traffic to this web app.

Download it (2, Informative)

Anonymous Coward | more than 5 years ago | (#28713397)

Download their database, put it into your system, and you're set.

Re:Download it (0)

Anonymous Coward | more than 5 years ago | (#28714373)

2.8 TBs eh?

By the time I download it, I'll be telling people to get off my artificial lawn... in a nursing home... drooling all over myself... IN SPACE!

Re:Download it (1)

paulatz (744216) | more than 5 years ago | (#28714413)

It does not include images, nor all the integration with Wikimedia Commons, Wiktionary, and the other projects. Last but not least, it does not update as Wikipedia is edited.

Solution (4, Informative)

Z34107 (925136) | more than 5 years ago | (#28713399)

Perhaps the easiest thing to do would be start with a complete dump of Wikipedia and add your own stuff to it. Their database dump page is here [] .

It is 2.8TB, however. They allude to a "Wikipedia API" for working on a "random subset" of Wikipedia; maybe that would be helpful too.

Re:Solution (0)

Anonymous Coward | more than 5 years ago | (#28713419)

It's not 2.8TB, it's only 3.2GB. You need enwiki-20080103-pages-articles.xml.bz2, from

Re:Solution (2, Informative)

negge (1392513) | more than 5 years ago | (#28713447)

Why use a dump from early last year when you can have yesterday's? (

Re:Solution (2, Funny)

BadAnalogyGuy (945258) | more than 5 years ago | (#28713557)

Have you *seen* the latest?

I'd much rather have something that's been vetted a couple


Re:Solution (1, Insightful)

MaskedSlacker (911878) | more than 5 years ago | (#28713769)

Your karma must be shit, BadAnalogyGuy.

Why would any one commit be less vetted than any other commit? The old commits don't get new edits merged into them. A commit from a year ago is no less likely to have vandalism present than the commit from yesterday. It will just be different vandalism.


Re:Solution (1)

MaskedSlacker (911878) | more than 5 years ago | (#28713789)

That's the compressed version. The meta-history file (compressed:17GB) decompresses to 2.8TB on its own. Assuming the same compression ratio (likely not a valid assumption) the articles file would decompress to 500GB, give or take.

Re:Solution (4, Interesting)

mcrbids (148650) | more than 5 years ago | (#28713563)

Dumps go stale, Wikipedia is updated all the time. I'd suggest something a bit more dynamic.

I did something similar (conceptually) as a dynamic help system for our web-based application, and had content in a wiki based on the URL of the page where the help message was to apply. In my case, clicking the "help" button on a page would make a proxy call to a private wiki to get the help menu content. If none was found, an email was sent to support desk and the end-user was given a web-chat prompt to tech support (with the URL prepended so that tech support could jump in, answer the questions, and write the help menu in one fell swoop)

In your case, start with your local wiki. Presumably you have some stuff in there already. Rename the articles as necessary to match URLs from Wikipedia.

Then, build a simple proxy server that rewrites wikipedia content to include a header of your local content. Probably 100 lines (or so) of glue code, and anywhere from a few man-hours to a few man-days coding.
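The injection half of that glue code could be a single pure function along these lines (a sketch: the wrapper markup is made up, and real pages need care with encoding and relative links):

```javascript
// Given the HTML fetched from Wikipedia and a block of local-wiki HTML,
// splice the local block in immediately after the <body> tag.
function injectLocalHeader(wikipediaHtml, localHtml) {
  var i = wikipediaHtml.indexOf("<body");
  if (i === -1) return wikipediaHtml;      // not HTML we understand; pass through
  var close = wikipediaHtml.indexOf(">", i);
  if (close === -1) return wikipediaHtml;
  return wikipediaHtml.slice(0, close + 1) +
    '<div class="local-wiki-header">' + localHtml + "</div>" +
    wikipediaHtml.slice(close + 1);
}
```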

The rest is all training.

intrawiki (-1, Flamebait)

Anonymous Coward | more than 5 years ago | (#28713405)

you fucking dolt

Do it the other way around? (0)

Anonymous Coward | more than 5 years ago | (#28713423)

Have you considered taking a recent Wikipedia snapshot [] and using that as a foundation to seed your internal wiki?

Your internal users can then add their own revisions to this as required to customise it where necessary.

Of course, you'll lose the ability to pick up new changes/revisions to the original WP pages, but it might be a simpler way to go.

IFRAME? Intelligent proxy/page modification? (2, Insightful)

seifried (12921) | more than 5 years ago | (#28713437)

I assume you want up-to-date content, clearly separated from what is yours. Why not enclose the content within an IFRAME? Seriously, it's stupid-simple but might be all you need. Alternatively, you could use some form of intelligent proxy/page modifier, either as a MediaWiki plugin or whatever floats your boat (i.e. every time a page is loaded, also try to get the Wikipedia stuff).
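A minimal sketch of the IFRAME option, with the surrounding markup invented for illustration:

```html
<!-- Local page chrome around live Wikipedia content. -->
<div class="local-notes">Company-specific notes go here (editable locally).</div>
<iframe src="https://en.wikipedia.org/wiki/Capacitor"
        width="100%" height="600" title="Wikipedia (read-only)"></iframe>
```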

Re:IFRAME? Intelligent proxy/page modification? (1)

foniksonik (573572) | more than 5 years ago | (#28716753)

If you want to get fancy, use AJAX to grab the Wikipedia content, stuff it into a hidden div, then DOM-select the contents of the article and set a visible div's HTML to the wiki content:

var wikiSource = jQuery.get("", function (wikiHtml) { // Wikipedia article URL goes here
    setContent(wikiHtml);
});

function setContent(wikiHtml) {
    jQuery("#hiddenDiv").html(wikiHtml);
    var wikiContent = jQuery("#hiddenDiv #content").html();
    jQuery("#visibleDiv").html(wikiContent);
}



Re:IFRAME? Intelligent proxy/page modification? (1)

truthsearch (249536) | more than 5 years ago | (#28717215)

Except you can't currently make off-domain AJAX calls. It's blocked for security reasons. There's a proposed standard for whitelisting domains, but it doesn't appear to be implemented in any browsers yet.

Business Talk is Stupid Talk (4, Insightful)

rm999 (775449) | more than 5 years ago | (#28713475)

"What we would like to do is leverage existing wikis by augmenting our internal wiki with an external wiki"

What does that even mean? If you want to design something, you'll have to use more precise language. And for god's sake, stop using the word leverage without thinking about it. You used it backwards - if you are augmenting your internal wiki with external wikis, you are leveraging your internal wiki with the external wikis. You leverage a boulder with a lever, but you don't leverage a lever with a boulder.

Re:Business Talk is Stupid Talk (0)

Anonymous Coward | more than 5 years ago | (#28713959)

C'mon. Let the man leverage the whole breadth of language and augment words with new meanings taking advantage of social media phenomenon.

Re:Business Talk is Stupid Talk (0)

Anonymous Coward | more than 5 years ago | (#28714035)

if you are augmenting your internal wiki with external wikis, you are leveraging your internal wiki with the external wikis. You leverage a boulder with a lever, but you don't leverage a lever with a boulder.

Wow, you totally shifted my paradigm.

Re:Business Talk is Stupid Talk (0)

Anonymous Coward | more than 5 years ago | (#28714319)

Whilst I agree with your overall sentiment...

The word leverage has meanings beyond the mechanical sense, and despite the sentence in question being, in my opinion, quite clumsy, its meaning is not lost nor backwards.

One may leverage a resource in order to gain some kind of advantage.

In any case, I agree with your general point ('ffs use some clarity, this is /. not'), but the use of the word here was hardly pivotal.

Re:Business Talk is Stupid Talk (5, Funny)

MrMr (219533) | more than 5 years ago | (#28714379)

As a non-native speaker I find a dictionary quite convenient in these cases, so I'll do some back-and-forth translation for you here:

leverage (v.) -> opkrikken -> fuck up
augment -> duurder maken -> make more expensive
internal wiki -> krabbel zonder net -> off-line blurb
external wiki -> krabbel met net -> on-line blurb
existing -> nog bestaand -> not yet deleted

So the English-to-English translation is: "What we would like to do is fuck up not yet deleted blurbs by making our off-line blurbs more expensive with on-line blurbs".
Now that I can understand.

Re:Business Talk is Stupid Talk (0)

Anonymous Coward | more than 5 years ago | (#28714619)

What does that even mean?

That he's a dickhead? I think that he communicated that quite effectively.

Re:Business Talk is Stupid Talk (1)

PMBjornerud (947233) | more than 5 years ago | (#28714665)

"What we would like to do is leverage existing wikis by augmenting our internal wiki with an external wiki"

What does that even mean? If you want to design something, you'll have to use more precise language.

His example is much clearer:
For example, links to company-specific wiki pages would be available in Wikipedia pages.

One solution could be a Firefox greasemonkey script, as someone above already suggested.

Re:Business Talk is Stupid Talk (1)

Hognoxious (631665) | more than 5 years ago | (#28714679)

I have no idea what he means, but when he ran it up the flagpole I saluted.

Re:Business Talk is Stupid Talk (1)

Lumpy (12016) | more than 5 years ago | (#28715581)

Give him some slack; he's been in meetings all week with PHBs and executives who throw the terms around like candy without even knowing what they mean.

"It will bring us a whole new dynamic by leveraging our skill-set when applied to the future latitude and positions."

Everyone knows the suits in the corner offices talk only to hear themselves talk. It's either that or Business Administration degrees have a "ramble on like you are educated" class requirement.

Management speak ? (-1, Redundant)

Anonymous Coward | more than 5 years ago | (#28713479)

leverage existing wikis by augmenting our internal wiki with an external wiki

Congratulations, you have taken your first step into PHB'dom....
By starting out on a pointless project and using buzzwords to make it sound like you are doing something, you are doomed to failure (or at least mediocrity).

Browser overlay (1)

Phroggy (441) | more than 5 years ago | (#28713531)

It seems to me I've seen a browser extension somewhere that lets users add their own comments to any arbitrary web page, and those comments can be made public so anyone else running the same browser extension will see them when they load the same page. I bet you could use something like that, with all your users having a browser plugin that pulls URL-based content from an internal server.

Friendly MITM attack (1)

RiotingPacifist (1228016) | more than 5 years ago | (#28713539)

Sounds like a weird setup, so you'll probably need to do most of it yourself. Perhaps the easiest way is:
1) Set up a normal local wiki, taking care to name pages the same as the relevant Wikipedia page [I'm guessing you know how to do this]
2) Use DNS redirects or similar tricks to get all Wikipedia requests to go to a proxy
3a) Do HTML injection on the page and stick your stuff at the bottom [MITM attack using ettercap or something like that]. This is probably a pretty bad solution, but is going to be the easiest to research as it's textbook hacking.
3b) Host dynamic pages that mash up the two wikis (Python, PHP, something like that). This is probably the closest to the right way to do it, no HTML injection, just a DNS redirect, but it will require server-side processing for every page.
3c) Use injection, but only inject a bit of JavaScript/an iframe that tacks your wiki stuff on at the end (when available). This doesn't require much to be done server-side; just inject the same HTML on all pages.

Whatever you do, you will probably spend more time reading hacking tutorials than wiki howtos.
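For option 3c, the injected payload could be as small as one iframe pointing at the matching internal page (the intranet host name below is a placeholder):

```javascript
// Build the HTML fragment a proxy would append to every Wikipedia article.
function localFrameHtml(title) {
  return '<iframe src="http://wiki.intranet.example/index.php/' +
    encodeURIComponent(title) + '"></iframe>';
}
```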

Re:Friendly MITM attack (1)

readthemall (1531267) | more than 5 years ago | (#28713669)

Consider also that Wiki pages are mainly intended to be updated. You should decide if your users should be able to modify the content they are viewing or not. If yes, make sure they can modify only the local content and not the content borrowed from Wikipedia.

Re:Friendly MITM attack (1)

JWSmythe (446288) | more than 5 years ago | (#28714183)

That wouldn't work so well when it came time to update from Wikipedia. I would assume they'd update fairly regularly (say once every month or so), but is it really necessary to suck down their whole database when, on a small network (say less than 10,000 users), only a handful of pages will ever be read?

Ah, what happened to the good ol' days, when the whole Internet fit on that one AOL disk. :)

free legal advice (-1, Offtopic)

Anonymous Coward | more than 5 years ago | (#28713603)

In this article you can find the aspects relevant to the licence used by Wikipedia and your wish to integrate Wikipedia into your intranet:
Can TPMs Create a Commons? Looking at Whether and How TPMs and Creative Commons Licenses Can Work Together.
Good luck!

Doinitwrong (5, Insightful)

Anonymous Coward | more than 5 years ago | (#28713613)

Agreed. Appending to Wikipedia is the ass-backwards way to do it. Everyone suggesting Greasemonkey and other addons is just enabling your backassery.

What you do is create an internal wiki, and wherever relevant you link to the Wikipedia article. Or an external doc. Or nothing at all, and expect your employees to look it up on their own.

Re:Doinitwrong (1)

karstux (681641) | more than 5 years ago | (#28713905)

That's what I'd have suggested as well. Least amount of work, efficient, usable, no questionable hacks. It's common sense.

Re:Doinitwrong (1)

tomhudson (43916) | more than 5 years ago | (#28713969)

Of course it's common sense - which is why it won't be done that way. It's the "OMG you expect users to figure this out?" shit.

Sounds like someone never heard of 'target="new"' to force the external link to open in a new tab so that the user doesn't go "where did my f*ing internal wiki page go to?"

... which explains why they never succeeded before - dumb users, dumb "implementors", and not even a basic understanding of how things work.

Re:Doinitwrong (1)

Lumpy (12016) | more than 5 years ago | (#28715609)

which explains why they never succeeded before - dumb users, and dump "implementors", and not even the basic understanding of how things work.

Welcome to corporate America. Like what we did to the economy?

Re:Doinitwrong (1)

geminidomino (614729) | more than 5 years ago | (#28715843)

Welcome to corporate America. Like what we did to the economy?

Should have gone with Art Deco. The whole "Early Mongolian Clusterfuck" theme clashes.

interwiki (4, Interesting)

MadFarmAnimalz (460972) | more than 5 years ago | (#28713619)

You probably want interwiki [] .

Re:interwiki (0)

Anonymous Coward | more than 5 years ago | (#28713723)

Wikipedia should be available via the wikipedia: interwiki prefix by default. You can then add an extension to check whether a Wikipedia page for the topic exists and extract the content.

Re:interwiki (-1)

Anonymous Coward | more than 5 years ago | (#28713847)

You probably want interwiki [] .

Yes, but I have a wiki fetish. Due to the open nature of most of them, I will be copulating with wikis until 2029, at least.

google wave (0, Offtopic)

linhares (1241614) | more than 5 years ago | (#28713657)

sorry folks, it's all over and google has won. Google wave, coupled with an internal dump of wikipedia, seems to me perfect for your needs.

Watch the 1h20 demo here and see for yourself []. This monster will change email, chat, wikis, and forums. I'd be worried if I were a Slashdot overlord. In fact, an idea for an extension to Google Wave would be to implement Slashdot's moderation system in it.

Maybe I drank too much of the kool-aid, but I think wikis and forums will all have to rapidly adapt, or adopt the coming plague from Mountain View.

Don't (5, Interesting)

pfafrich (647460) | more than 5 years ago | (#28713675)

Merging Wikipedia with your company wiki is a bad idea:
  • The Wikipedia content will always be out of date
  • Changes made to Wikipedia content don't get fed back into Wikipedia
  • Creates confusion as to what is and is not company information
  • Trying to load the Wikipedia DB locally is a headache due to its sheer size

Re:Don't (3, Interesting)

korpique (807933) | more than 5 years ago | (#28713857)

I agree (would mod up, but gave up modding way back). However, this is an interesting and probably recurring problem: extending the wealth of public net wisdom with precision data from a local context (organisational or task-centric rather than geolocational).

A proxy adding local content into pages loaded from outside, as suggested in Re:Solution by mcrbids [], would solve some of the problems you mention:

        * The Wikipedia content will always be out of date
                * it's fetched from the real sources in real time
        * Changes made to Wikipedia content don't get fed back into Wikipedia
                * this changes into the risk of unintentionally publishing private information - how hilarious!
        * Trying to load the Wikipedia DB locally is a headache due to its sheer size
                * not done; the same solution could instead cover the whole of the outside web.

This problem remains:

        * Creates confusion as to what is and is not company information

I guess you'd be best off injecting a (user-hidable) "widget" layer that would contain all the local information needed, thus providing clear separation of local and global content. Least breakage of existing layout that way, I hope.

I assume here that we restrict our proxy to embedding HTML (possibly including JavaScript) into well-parsing HTML pages only, so as to avoid breaking things as much as possible - inevitably it will happen sometimes anyway.

Updating the contents of another window based on browsed content would require either

        * a single sign-on solution to target references to the correct user's desktop (seen in updaters of multiple application views in medical systems, for instance) or
        * a browser-specific local hack to study each page's URL and content and fetch related information from a local database based on those.

OT: adding such meshing into Google Wave would probably prove an interesting challenge :) Think of doing it Right (tm), with private additions to documents and discussions getting saved and tracked on local servers while the public parts would be passed on to public servers.


Re:Don't (1)

FlyingBishop (1293238) | more than 5 years ago | (#28717021)

Also, strictly speaking, what the poster wants to do is illegal according to the CC-BY-SA and the GFDL.

See []

Copyleft/Share Alike
        If you make modifications or additions to the page you re-use, you must license them under the Creative Commons Attribution-Share-Alike License 3.0 or later.

I'm not sure he's planning on modifying, but it still sounds like a pretty clear-cut copyright violation.

Hyperlinks? (1)

Malibee (1215790) | more than 5 years ago | (#28713703)

Maybe I'm missing something, but why not just have an external links section on your internal wiki, or a "Required Reading" section? Seems like the solution you're proposing is a little bit heavyweight for the described problem.

Legitimate use for this hack (2, Interesting)

biduxe (541904) | more than 5 years ago | (#28713731)

Am I the only one who cannot see any legitimate use for this hack?

Why lure your users into thinking the content is on Wikipedia if it is on your network?
Can't your users use Wikipedia _and_ your wiki?

Sincerely, I think the goal of this hack is luring users into thinking they're reading/editing Wikipedia for someone's profit.

Re:Legitimate use for this hack (1)

tomhudson (43916) | more than 5 years ago | (#28714147)

Why lure your users into thinking the content is on Wikipedia if it is on your network?
Can't your users use Wikipedia _and_ your wiki?

Obvious answer: if they're as clueless as the person posting the question ...

Seriously, if the user can't figure out how to open 2 sites in 2 tabs, a "merged wiki" should be low on your list of priorities.

Your content: how proprietary is it? (1)

tumbleweedsi (904869) | more than 5 years ago | (#28713747)

You need to make sure there is a clear demarcation between your content and the Wikipedia content, and this will limit your integration. The last thing you want is for one of your users to upload confidential information to Wikipedia in the mistaken belief that they are putting it on the in-house wiki.

Maybe... (1)

denmarkw00t (892627) | more than 5 years ago | (#28713785)

Open a page on the intranet for... say, capacitor.

A script grabs the Wikipedia article, strips out the header, sidebar, etc., and fills in the remaining links/images with proper URLs to Wikipedia (so they work).

It stores the result in a database for diffing and updating later, dumps the remaining Wikipedia content at the bottom with a good ol' <hr>, and you're off!
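The "fix the links so they work" step is the mechanical part; a sketch (regex-based, which is brittle if Wikipedia's markup changes):

```javascript
// Turn Wikipedia's relative article links into absolute ones so a locally
// stored copy still points back at the live site.
function absolutizeLinks(html) {
  return html.replace(/href="\/wiki\//g, 'href="https://en.wikipedia.org/wiki/');
}
```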

What? (2, Interesting)

madcow_ucsb (222054) | more than 5 years ago | (#28713787)

Why? Can't you just link to wikipedia pages where appropriate? OK, my company has an internal server we link through to sanitize referrer info so our internal wiki titles don't get all over teh interwebs. But if the wiki users can't figure out "hey, this article is too specific - maybe wikipedia has more general information that would help me," you've got bigger problems than your wiki management.

CMS Federated Search (0)

Anonymous Coward | more than 5 years ago | (#28713817)

Nowadays, any content management system worth anything has a built-in wiki and most allow direct linking and searching between the local wiki and wikipedia.

For example Documentum [] and Sharepoint [] both have federated search [] providers for Wikipedia.

Plus, because the OP works for a "large company" they probably already have DCTM or MOSS installed somewhere.
Why reinvent the wheel when you've already bought a better one? (job security?)

part of an Intelligent Book (3, Interesting)

williamhb (758070) | more than 5 years ago | (#28713837)

A very small part of my PhD [] looked at this (but with "collaborative textbooks" rather than wikis) -- see Chapter 4. Adding a very simple metadata-based navigation layer over the top of the wiki is pretty easy, clean (it doesn't confuse users), and seems to do the trick. The wiki itself shows in an embedded frame. Of course, I had to go further and let students do difficult number-theory proofs backed by machine reasoning systems within the book, but you won't have to solve that problem!

I'm (gradually) putting this fairly simple but useful part of the software into an online resource at [] , though it's in my spare time and the system is down at the moment. I'll put my contact details back up there shortly in case the question-asker wants to discuss it technically.

don't mix security contexts. (0)

Anonymous Coward | more than 5 years ago | (#28713849)

What happens when a user doesn't understand that this isn't a local copy, and edits a wikipedia page with private information?

This is a bad idea, period.

Two approaches - browser extension or proxy (0)

Anonymous Coward | more than 5 years ago | (#28713861)

With a browser extension (probably relatively easy with Firefox or Opera), you can modify the HTML DOM and include an iframe with company-specific information. This should probably be unobtrusive on the Wikipedia page, but it should be clearly marked as internal to your company. Users aren't always the brightest, and there's always the possibility of them editing the Wikipedia page itself to "correct" local content which should never be published on the Internet. It might also be possible to force the Wikipedia page into a frame, and have the company content clearly identifiable in another frame.

With a proxy, you would add some JavaScript near the end of the HTML page, which does pretty much the same thing. You will be limited to the security settings of the browser, though.

Also interesting are the extensions that allow you to comment on any public webpage and share those comments with other people. Most of these use a public server, but you could probably modify an existing Firefox extension to talk to a local server (which you then need to script). I think there's even an open protocol for this.

Of course, if you're going the browser-extension path with Firefox, why modify the HTML at all? Modify the user interface, so that the company's wiki becomes part of the browser. Somebody has a site they want to bookmark for the wiki? Have a button for it. They want to create a new topic based on this page's content? Have a button for it. They want to see all related internal pages? Have a sidebar which updates with info from the local server. Standardising on Firefox in an organisation isn't a bad idea at all, especially if you can bring such benefits to the company. :-)

Simple. Two Tabs (1)

Phoe6 (705194) | more than 5 years ago | (#28713897)

One tab for your internal wiki, another for Wikipedia.
You can also highlight a particular word in your internal wiki, right-click, and search Wikipedia (if your search is set up that way). The search term automatically opens the Wikipedia content in a new tab. Amazing, isn't it?

Am I the only one wondering how this article ever made it to /.?

Frames (0)

Anonymous Coward | more than 5 years ago | (#28713921)

Frames! Enough said.

Learn from mistakes. (1)

JamesR404 (1546869) | more than 5 years ago | (#28713989)

I think this is a very interesting story. Aside from the technical question raised, I am wondering why the first corporate wiki wasn't successful. If it failed the first time because the culture isn't right or there wasn't any management support, a second wiki tool, no matter how seamlessly integrated, won't succeed either. In a company with many different technical domains, there's all the more reason to share information. And an article shouldn't try to be comprehensible to everyone on its own: you could write a parent page describing the concept, with subpages specialised for the different domains. I'd love to discuss this further.

Not without merit. (1)

asdfndsagse (1528701) | more than 5 years ago | (#28713995)

This is something the Google Wave protocol and platform [] completely anticipates.

It's based on a tree structure and source-code-style change management. People who edit from the synergized wiki could add to either the private or the public version, and patches to public versions or additional documents could be changed and maintained internally.

Re:Not without merit. (1)

asdfndsagse (1528701) | more than 5 years ago | (#28714011)

That would essentially be the way it would happen. You would pull down the MediaWiki source, apply local changes, and locally render pages with active diffs. You would also have pages that exist only locally. Due to limitations in the platform, you would have to custom-design any way for people's changes to go to either the public or the private system; this would be difficult under the current constraints, where the document's structure is not kept track of.

Dokuwiki (0)

Anonymous Coward | more than 5 years ago | (#28714217)

links to wikipedia: [[wp>subject]]
internal links: [[subject]]

you can give the links different colours with CSS, e.g. wp link = blue, internal links = green

The simplest solution I can think of is.. (1)

kikito (971480) | more than 5 years ago | (#28714237)

1. On your own wiki server, keep a copy of each Wikipedia page you want to modify, and add whatever you want to those copies.

2. Run a modified HTTP proxy on the intranet that detects queries to Wikipedia for items you have on the server and re-routes them.

For example, let's say you want custom information on [] . You copy it to http://yourintranetserver/wiki/Socks and make your changes.

Then, if someone from inside your network tries to get [] , they get yours instead.

At the same time, the proxy needs to be intelligent enough to fall back to the Wikipedia page if your server doesn't have a local copy. Ideally the HTTP redirect rules should be put in place automatically whenever a new page is added to yourintranetserver.
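The routing decision such a proxy would make can be sketched like this; the intranet URL and page set are illustrative assumptions, and a real proxy would consult the internal wiki (or a cached list of its pages) rather than a hard-coded set:

```python
from urllib.parse import urlparse

# Illustrative values, not real endpoints.
INTRANET_WIKI = "http://yourintranetserver/wiki"
LOCAL_PAGES = {"Socks"}  # titles that have a local override

def route(url: str) -> str:
    """Send a Wikipedia article request to the local copy when one exists."""
    parsed = urlparse(url)
    if (parsed.hostname or "").endswith("wikipedia.org") \
            and parsed.path.startswith("/wiki/"):
        title = parsed.path[len("/wiki/"):]
        if title in LOCAL_PAGES:
            return f"{INTRANET_WIKI}/{title}"
    return url  # no local version: pass the request through untouched
```

The fall-back behaviour comes for free: any title without a local copy (and any non-Wikipedia URL) passes straight through.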

TwistedPants (847858) | more than 5 years ago | (#28714287)

Go and look at Freebase: []
They provide an API to obtain articles and structured data from them, and they handle all of the Wikipedia import.
Additionally, you can do much more with the structured data there.
For instance: Olympic Cyclists and the Way They Died. [] Try doing that with Wikipedia.

Done this before (1)

MarkH (8415) | more than 5 years ago | (#28714343)

1) Install the MediaWiki software (what Wikipedia runs) locally and use this for any locally created articles.

2) The web server running it simply proxies the request out to Wikipedia if the article is not available in the local version. The easiest way to do this is with Apache + rewrite rules.

This means that users can reach local articles and Wikipedia articles through the same address.

You then need to consider the following:

1) A search request needs to go to the local version first, then the external one, and the results need to be concatenated together - a small proxy script should be more than capable of doing this.

2) You may want to create a reference table which maps external Wikipedia articles to related internal ones. Again, a small script could insert these into the external Wikipedia articles during rendering.
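Point 1 might look something like this in Python. The intranet URL is made up; `opensearch` is MediaWiki's real search endpoint, which returns a JSON array of the form `[query, titles, descriptions, urls]`:

```python
import json
import urllib.parse
import urllib.request

def concatenate(local: list[str], external: list[str]) -> list[str]:
    """Local hits first, then external titles not already present."""
    seen = set(local)
    return local + [t for t in external if t not in seen]

def opensearch(api_base: str, query: str) -> list[str]:
    """Titles matching a query, via MediaWiki's opensearch endpoint."""
    url = (f"{api_base}/api.php?action=opensearch&format=json"
           f"&search={urllib.parse.quote(query)}")
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)[1]  # element 1 of the response is the title list

def merged_search(query: str) -> list[str]:
    # "yourintranetserver" is an assumed hostname for the local wiki.
    return concatenate(opensearch("http://yourintranetserver/w", query),
                       opensearch("https://en.wikipedia.org/w", query))
```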

Why not (1)

Vahokif (1292866) | more than 5 years ago | (#28714363)

Why not run MediaWiki on your intranet and use InterWiki links to Wikipedia in your own articles?
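For what it's worth, MediaWiki's default interwiki table already includes a `wikipedia:` prefix, so (assuming that table hasn't been trimmed from your install) an internal article can mix both kinds of link with ordinary syntax. The page names below are invented for illustration:

```
Our boards use [[wikipedia:Tantalum capacitor|tantalum capacitors]];
local sourcing rules live on the [[Approved vendors]] page.
```

The first link goes out to en.wikipedia.org; the second stays inside the intranet wiki, and the two can be styled differently so readers always know which is which.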

howabout (1)

bball99 (232214) | more than 5 years ago | (#28714419)

they were pretty good at page-hijacking, IIRC :-)

Seriously though, perhaps I misread the question? Are you looking for automated tools to do the hyperlinks?

There's lots of value in a compound wiki (3, Interesting)

davide marney (231845) | more than 5 years ago | (#28714479)

Ignore the naysayers. Of course there is a lot of value in aggregating content and creating a compound page that blends your internal content with other sources.

From a usability and authority-of-source perspective, however, I think it would be best to list each source in a separate section on the page, starting with your internal content at the top. You can get to the other content either by embedding links into your internal content, or by collecting the links in a separate section.

Wikipedia itself uses the embedded technique. When composing or editing an article, the author can embed markup for external references. On display, this markup is turned into a footnote link at the point of embedding, and a footnote at the bottom of the page. I don't see why you couldn't do something similar; in this case, however, you would be embedding references to Wikipedia articles.

In your internal wiki templates, have custom markup for embedding Wikipedia queries related to the article. On display, either turn this markup into embedded links to footnotes, resolve the queries and deposit the results at the bottom of the page, or toss them into iframes and let the user sort it out.

The other technique is to have a custom form in your internal wiki template where you collect the cross-references. On display, turn these queries into links or resolve them into content.

In any event, why limit yourself to Wikipedia? Include cross-references to patent search engines and other domain-specific sources.

A big word of caution, of course, is owed to the legal angle. Make sure you follow the law whenever reusing anyone else's content, even if it's just a link. Have your legal department sign off on your reuse policy. Don't distract them with technical aspects of what you want to do. They're lawyers; they only care about the law. Ask them a specific legal question, such as, "what is our legal exposure if we republish (links to or actual content from) Wikipedia on our internal wiki?".

WikiSlurp... (1)

bagsta (1562275) | more than 5 years ago | (#28714727)

I stumbled across this site [] . I haven't tested it and I don't know the results. I think it's in the early stages of development.

Greasemonkey (1)

Steneub (1070216) | more than 5 years ago | (#28715577)

My first thought was to use Greasemonkey (or a Greasemonkey analogue) to add whatever you want to pages on Wikipedia. The way it could be integrated: give each internal wiki page the same title as its Wikipedia counterpart; when a page on Wikipedia is loaded, the script appends the internal wiki content onto the Wikipedia page.

Others' concerns about Wikipedia being out of date or contradictory are valid, though. You would probably do better to either:

  1. Don't!
  2. At least make it very clear to end-users what is external and what is internal.

Extra points earned if, when someone clicks to edit an entry that has internal information, the process is seamless and feels like editing the Wikipedia page itself.

leach (0)

Anonymous Coward | more than 5 years ago | (#28715903)

So basically you want to benefit from the community effort that has built Wikipedia but not give back? Do I have that correct?

Think about what Wikipedia would (not) be if everyone had your attitude -- to keep their contributions private.

See Also: (0)

Anonymous Coward | more than 5 years ago | (#28716165)

Scan each internal page and see which Wikipedia pages it links to. Store that info in a database. Make a Firefox plugin that works as follows:
when a user is at a Wikipedia page, query the database to see which internal pages link to it, and add those links to the "See also" section of the Wikipedia page.

No, you wouldn't get inline links, but it sounds much easier.
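The indexing half of this could be as small as the sketch below: scan internal pages' HTML for Wikipedia article links and record the reverse mapping in SQLite, so the plugin can ask "which internal pages link here?". Fetching the pages from the intranet wiki is left out; input here is (page URL, HTML) pairs, and all names are illustrative:

```python
import re
import sqlite3

# Matches Wikipedia article links and captures the article title.
LINK_RE = re.compile(r'href="https?://en\.wikipedia\.org/wiki/([^"#?]+)"')

def build_index(pages: list[tuple[str, str]]) -> sqlite3.Connection:
    """Record which internal pages link to which Wikipedia titles."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE links (internal_page TEXT, wiki_title TEXT)")
    for page_url, html in pages:
        for title in LINK_RE.findall(html):
            db.execute("INSERT INTO links VALUES (?, ?)", (page_url, title))
    return db

def internal_pages_linking_to(db: sqlite3.Connection, title: str) -> list[str]:
    """What the browser plugin would ask when it lands on a Wikipedia page."""
    rows = db.execute(
        "SELECT DISTINCT internal_page FROM links WHERE wiki_title = ?",
        (title,))
    return [row[0] for row in rows]
```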

interwiki (0)

Anonymous Coward | more than 5 years ago | (#28716185)

Use MediaWiki as your wiki and add the interwiki plugin.

Is this really worthwhile? (1)

gertam (1019200) | more than 5 years ago | (#28716201)

I don't get it. Are people in your company using Wikipedia so much in their daily work that this would really be useful? Just set up your internal wiki; it is your focal point. Why try to integrate the two beyond just linking to Wikipedia? Using MediaWiki, you can even use interwiki links to easily link outside of your internal wiki.

Try the other way round (1)

RogL (608926) | more than 5 years ago | (#28716329)

Why not try it the other way round:

Create your wiki, add pages, and add links from your wiki pages (which you have full control over) to the relevant Wikipedia pages.

Much simpler, and it should still produce the desired effect.

The Real issue is Social. (1)

Electrawn (321224) | more than 5 years ago | (#28716755)

You are trying to force a technical solution on a social problem. It's probably not going to work. Your best bet for success is to install a WYSIWYG editor for MediaWiki; there are several out there. Wiki markup, underneath, is just a programming language. It requires training people, no matter how much it is designed to be "easy." Make it easier.

Consider Sharepoint. As much as /. is anti-Microsoft, if your users are used to Exchange and Windows then Sharepoint is worth paying for.

I've worked for Larry Sanger's Citizendium.

Extensions (1)

Jjeff1 (636051) | more than 5 years ago | (#28717207)

I wrote a very simple extension for my own MediaWiki site that pulled in external pages as an iframe within a wiki page. I'd imagine you can do the same: build your own wiki, with the Wikipedia pages included below your own content.

Tearline Wiki (1)

ijones (83977) | more than 5 years ago | (#28717265)

The experimental Tearline Wiki [] system we've developed at Galois [] might suit your needs. Inside the firewall, you use MediaWiki with the Tearline system and get a combined view of your internal wiki(s) (possibly different wikis on different subnets), and you can integrate it with Wikipedia or other Internet-based wikis to get the global context of an article.

As others have said, integrating your content with other people's content can be a legal issue.

Contact me if you want more information on Tearline :)



Wikipedia is X rated (1)

cellurl (906920) | more than 5 years ago | (#28717487)

Just keep them separate.
I work for a huge corporation and we have our own thing called etipedia.
Also, don't forget, Wikipedia is X-rated. []