Mining Neologisms from Wikipedia
holy_calamity writes "Natual Language Programming researchers have developed a tool called Zeitgeist that can discover the meaning of new words for itself using Wikipedia. It looks for entries for words not in the WordNet database and works out their meaning by looking for known words linked to them. Development of the tool is focusing on using it to understand what bloggers (using slang and neologisms) are saying about companies' products."
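For anyone wondering what "looking for known words linked to them" might amount to in practice, here is a minimal sketch. It is not Zeitgeist's actual code: the function name `gloss_unknown_terms`, the toy link data, and the small `KNOWN` set (standing in for a WordNet lookup) are all assumptions made up for illustration. The idea is just to skip titles already in the lexicon and describe each remaining title by the known words its page links to most often.

```python
from collections import Counter


def gloss_unknown_terms(link_graph, known_words, top_n=3):
    """Map each title missing from known_words to its most-linked known words.

    link_graph: dict of page title -> list of linked terms (hypothetical input).
    known_words: set of lowercase words standing in for a lexicon such as WordNet.
    """
    glosses = {}
    for title, linked in link_graph.items():
        if title.lower() in known_words:
            continue  # already in the lexicon, nothing to learn
        # Count only linked terms that the lexicon already knows.
        counts = Counter(w.lower() for w in linked if w.lower() in known_words)
        # Highest count first; ties broken alphabetically for stable output.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        glosses[title] = [word for word, _ in ranked[:top_n]]
    return glosses


# Toy data, invented for illustration only.
KNOWN = {"website", "traffic", "server", "news"}
LINKS = {
    "slashdotting": ["website", "traffic", "server", "traffic", "Beowulf"],
    "news": ["website"],
}

print(gloss_unknown_terms(LINKS, KNOWN))
```

A real system would presumably need to weight links, handle multi-word titles, and cope with vandalised or very new pages, none of which this sketch attempts.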
Garbage collection (Score:1, Funny)
Re: (Score:1)
> works out their meaning by looking for known words linked to them
I suspect some bugs need to be worked out. For example, it came up with this definition:
slashdot: v To surf for pictures of pretty girls (e.g. Natalie Portman) for the purpose of satisfying unrelieved sexual frustration owing to social retardation, using powerful network-enabled computers (e.g. Beowulf clusters).
I'd love to see what it came up with... (Score:5, Funny)
"ass-hat" and "tard" could take on a whole new meaning
Re:I'd love to see what it came up with... (Score:5, Funny)
Re: (Score:2)
Re: (Score:3, Funny)
Damn! Now so will truthsearch! Son of a...
Re: (Score:2)
Re: (Score:2)
slashdotting (n., neolog.) (Score:3, Informative)
Re: (Score:2)
Re: (Score:1)
We'll use your example of "Feminazi". To some people, any feminist is a "Feminazi". To others (take some feminists), it's a feminist who is irrational in his/her ways and seeks power over equal or fair treatment and expectations.
Another example would be the word "Jew". It can be said or used in such a way to give insu
Re: (Score:3, Funny)
By the way, I have an odd problem with the word neology. Why? Because in my 7th grade Latin Class, one of our assignments was to be a neologist, using latin roots to make up a new word. So the word neology makes me think of 7th g
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Temporal flash crowding (Score:1)
To extend, the lack of huge crowds of time-travelling tourists at events such as the WTC collapse is the best evidence that time travel from the future into our present cannot happen. Either that, or time travellers are required to cloak themselves. Otherwise, the streets and skies of NYC would have been packed with tricked-out DeLoreans on 9-11-01.
Stephen Hawking thinks this may be because the furthest you can travel back in time is to the invention of the time machine (and
Re: (Score:2)
You make two ends of a wormhole and carry them to wherever you want. It obviously takes a very long time at ~cee to do this. Your wormhole is now a fixed-length (space-)time machine on the order of how much time you spent transporting the ends.
Just imagine... (Score:3, Funny)
Re: (Score:2, Funny)
That's just part of his strategery to get people to misunderestimate him.
Re: (Score:2)
All too truthy.
Marketing research on the net (Score:5, Insightful)
Re: (Score:1)
Blog/online feedback research is different in that it focuses on what people consider worth saying or writing about a certain product. Bias is less likely because of the transparency.
Re: (Score:1)
I don't know, but I know those snakflabbing IBM products really zorf me right in the snurls. . .
Re: (Score:1)
Re: (Score:2)
say hello to dictionary bombing (Score:4, Funny)
n.
1. 43rd president of the United States.
2. miserable failure.
But Wikipedia seeks to avoid Neologisms! (Score:5, Informative)
http://en.wikipedia.org/wiki/WP:Neologism
Many articles about neologisms *do* get created in violation of this policy - but they are generally put up for deletion via the Wikipedia process for deleting inappropriate material - so they only exist briefly.
So, for example, the article entitled "Windows Rot" is being debated today. Although it looks like this one will be merged into an existing article, it won't survive as the name of an article - so Zeitgeist presumably won't be able to find it.
It may be that enough of these kinds of articles slip through the system to be useful to Zeitgeist but that is not by design - so coverage will be patchy at best.
A further consequence of this is that the articles that Zeitgeist does find will most likely be so new that only one person will have worked on them - which will make for poor quality.
Also, it is very common for people such as bloggers who come up with what they consider to be clever new words to try to wedge them into common usage by writing about the word in Wikipedia. This 'vanity word' problem is one of the main reasons that Wikipedia seeks to avoid articles on neologisms.
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
(BTW, am I the only one who has added 'cromulent' to his spellchecker's list of good words?)
this is not at all true... (Score:2)
For some reason, someone decided to redefine acronym and make up a new word to cover what acronym covered before. And Wikipedia uses it constantly, despite the pointlessness of it and the fact that the word hasn't caught on widely, thus making it a protologism. Although protologism isn't a word that has caught on widely either, thus making it a protologism itself at best, more likely a vanity word.
Re:this is not at all true... (off topic) (Score:1)
If a homological adjective is one that is true of itself, e.g., "polysyllabic", and a heterological adjective is one which is not true of itself, e.g., "bisyllabic", then what about "heterological"? Is it heterological or not?
- Grelling's Paradox
Re: (Score:2)
Ironically 'protologism' does seem to be a neologism - there is a definition for it in The Urban Dictionary from 2003 - so it's at least 3 years old.
that doesn't mean it's caught on... (Score:2)
Re: (Score:2)
Logism = Word
Once it's not new, it's not a neologism anymore.
if only urbandictionary were so restrained (Score:1)
For slang, it is useless without a context (Score:2, Informative)
I wish them good luck...
Re: (Score:2, Funny)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Well, for example, when you refer to a friend you envy for a precise in-context reason, calling him a bastard would somehow be what the GP is talking about. But that would also work for other insults, such as enculé, and undoubtedly even in other languages.
Example:
"-Dude, I just had sex with the Olsen twins!
-You bastard!"
omg it reads L33t? (Score:2, Funny)
Re: (Score:2)
Urban dictionary (Score:2)
Re: (Score:1)
What if it went in to a loop (Score:5, Funny)
Re: (Score:3, Funny)
Gazomplat. Wow! I remember that word from the mid 1970's. Bear with me a moment...
I was learning to program in FORTRAN in my high school math class. Our teacher (who didn't know how to program either) was trying to teach us by the age-old process of reading the book one chapter ahead of the class she was teaching. As a consequence, she was no better at it than the rest of us and we ended up debugging her code about as often as she helped with debugging
Slashdot Font Confusion (Score:2)
If you needed any more proof that the slashdot font sucks, here you go.
It's a sad day when
is mistaken for
Next thing you know, pom enthusiasts stray into the wrong conversation, and you can never go back from that.
Re: (Score:2)
Re: (Score:1)
Re:What if it went in to a loop loop loop loop (Score:1)
Re: (Score:1)
Re: (Score:2)
Shmorkle!
Re: (Score:2)
Re: (Score:1)
Acme-sucks.com locator (Score:1, Offtopic)
Corporate censorship. Now Automated with "Zeitgeist".
Think I'm a nut?
Call me back in 5 years...
chance (Score:3, Funny)
Re: (Score:2, Funny)
Step One is Complete (Score:5, Funny)
Re: (Score:2)
That should be "giving a bzzzzt!" or "live-wiring their butts" I think.
Re: (Score:2)
A better name (Score:1)
OK, why not the term DefMiner? Then get an old guy to be the site mascot? On second thought, never mind. Just don't be surprised when people get you confused with another product.
Re: (Score:1)
Re: (Score:1)
Re: (Score:2)
Santorum! (Score:5, Funny)
That's how I learn new words in languages (Score:2)
This usually only works in languages I know fairly well. If there are two or three unknown terms in a paragraph I'll have less success in understanding them.
Hello? (Score:5, Interesting)
You do not need a fancy program to do this. I can do it for you, without even reading the blogs in question.
Watch.
They are saying your products suck, and that your customer support is worthless.
See how easy that was? Now, you might be wondering how I know this. Simple. They don't use made up words to say good things about you. I'm not sure why (maybe they aren't worried about being sued for saying good things?), but the pattern is very consistent. If somebody goes to the trouble of writing about you in their blog using made up words, they don't like you or the horse you rode in on.
Likewise, if you are a journalist, they call you funny names (Steno Sue, Laura Dildo, Kneepads Miller, "Dollar a Word" Armstrong, etc.) because they've noticed that you consistently write to favour a certain party, position, politician, company, or lifestyle, even when this requires ignoring a pile of facts the size of Paraguay, any one of which would shred your position.
And if you're a politician, it means that someone noticed that what you say in speeches is so unconnected to what you do with the office you hold that the only link between them is the way in which they combine to mollify your nominal constituents while maximizing the benefit to your corporate sponsors.
If you are an industry association, they are saying they hate you, period, and that you are evil incarnate.
See how easy this is? If you still don't get it, I am willing to come out of retirement as a consultant to explain it to you, provided the price is right.
--MarkusQ
Re: (Score:3, Funny)
Re: (Score:1)
Re: (Score:1)
Comparison of Wordnet to Current Hutter Prize (Score:2)
paq8hp3 [binet.com.ua] is the current Hutter Prize lead contender [hutter1.net] and has compressed the first 100M of Wikipedia to just over 17M. Wordnet's .exe file is just over 17M. One wonders what would happen if the "cream" of Wordnet's vocabulary were compressed using paq8hp3 and then incorporated into paq8hp3 to be a better
"Natual"? (Score:1)
Urban Dictionary? (Score:1)
Shouldn't they also crawl through something like the urban dictionary which will have ten times more slang definitions?
When turned on mentions of itself, (Score:1)
Searching for the meaning of words? (Score:1)
I am leaving now, but I shall return interfrastically.
(5 points to whoever places the origin of this bastardized quote)