It may disappoint you, but you lost to a post titled "Frothy piss". If getting a "first post" is really that important to you, I suggest you take this as a lesson and re-evaluate your priorities in life. Being nearly as good as "Frothy piss" is a sad, sad life achievement.

Re:first, finally (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#18145730)

not according to some tastes. but thanks fella:-)

How is this not a radix sort? (5, Informative)

Anonymous Coward | more than 6 years ago | (#18145112)

4. Logic of BitFast
BitFast works with numbers at the bit level. It applies masks to cut the numbers so that it uses only the desired number of bits (based on the programmer/problem specifications) on every run. It always starts with the LSBs (Least Significant Bits), ignoring any other bits, and on every run it moves to the next higher block of bits until the whole list is sorted. So if we want to sort 8 bits per run, the first time it will sort bits 0-7 (least significant), then bits 8-15, then bits 16-23, and lastly bits 24-31 (most significant), at which point the entire list of 32-bit numbers is sorted.

Congratulations, you have reinvented radix sort [wikipedia.org] .

Making it in place is just an implementation detail, just as most other sorts can be done either in place or not (a typical example is quicksort, where most naive implementations tend to copy, while typical C implementations tend to be in place).

It's still in the O(N log N) time class. Mergesort has minimal preparation overhead and only O(N) memory overhead. Do we have that for this algorithm?
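For reference, the 8-bits-per-pass scheme quoted above is textbook LSD radix sort; a minimal array-based sketch in C (the function name is mine, not from TFA):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* LSD radix sort on 32-bit keys, 8 bits per pass (4 passes total).
   Each pass is a stable counting sort on one byte of the key. */
void radix_sort_u32(uint32_t *a, size_t n)
{
    uint32_t *buf = malloc(n * sizeof *buf);
    for (int shift = 0; shift < 32; shift += 8) {
        size_t count[256] = {0};
        for (size_t i = 0; i < n; i++)           /* histogram this byte */
            count[(a[i] >> shift) & 0xFF]++;
        size_t pos[256], total = 0;
        for (int d = 0; d < 256; d++) {          /* prefix sums -> start offsets */
            pos[d] = total;
            total += count[d];
        }
        for (size_t i = 0; i < n; i++)           /* stable scatter */
            buf[pos[(a[i] >> shift) & 0xFF]++] = a[i];
        memcpy(a, buf, n * sizeof *buf);
    }
    free(buf);
}
```

Each pass touches every element once, so the total work is O(N * W/r) for W-bit keys at r bits per pass, i.e. linear in N for fixed-width keys.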

Re:Nice, but... (0)

Anonymous Coward | more than 6 years ago | (#18145244)

Radix sort exploits the representation of the keys to be compared. It is not general purpose and its running time cannot be measured purely as a function of 'N' like for comparison-based sorts.

Re:Nice, but... (0)

Anonymous Coward | more than 6 years ago | (#18145278)

Mergesort on a linked list requires constant space.

I guess it technically *is* O(N * log(N)) in the number of items, but this is misleading. It's actually O(N).

As pretty much everyone has pointed out, it's just radix sort. The time taken by radix sort scales linearly with the number of keys, and with the log of the maximum key that can be held by the container.

If we're dealing entirely with unique keys, this is of course >= log(N), by the pigeonhole principle and all that. If we may have duplicate keys, however, there may be more keys than container space, and radix *does* scale linearly with the keys.

Time to prepare and overhead of this algorithm are negligible in any context large enough for us to care about the O() of the algorithms. For some things, this *is* better than mergesort. I'm just not sure why it was posted, as it's also not new - it was invented over 50 years ago.

This isn't sorting linked lists, it's sorting linked lists of NUMBERS (or, more specifically, N-bit strings.) Doing that fast is hardly rocket science, and this bitfast is nothing new... That's basically radixsort [wikipedia.org] with radix = 2^16.

10x? (-1, Troll)

Anonymous Coward | more than 6 years ago | (#18145142)

NO WAI. RLY?

Re:10x? (-1, Troll)

Anonymous Coward | more than 6 years ago | (#18145214)

Not only integers, more than likely only POSITIVE integers. If you are using 2's complement that would mean that the first bit in all negative numbers would be a 1. Unless I missed something, this algorithm does not take that into account and would sort them as being larger than all positive numbers.
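The fix for that is cheap, though: for two's-complement integers, flipping the sign bit turns signed order into unsigned order, so keys can be remapped before radix sorting (the helper name is mine, not from the article):

```c
#include <stdint.h>

/* Map a signed 32-bit int to an unsigned key with the same ordering:
   flipping the sign bit sends INT32_MIN..INT32_MAX to 0..UINT32_MAX. */
static inline uint32_t signed_key(int32_t x)
{
    return (uint32_t)x ^ 0x80000000u;
}
```

Binning on signed_key(x) instead of x makes negatives sort before all positives, as expected.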

It's a bit of an apples vs. oranges comparison to put this up against mergesort - mergesort is a comparison-based sort, while Papadopoulos' bitfast is a radix sort and thus O(N*W) where N is the number of elements and W is the width of each element in bits. (hint - try sorting 1000-byte strings with bitfast, and see which is fastest) And no, it doesn't have anything to do with hashing.

However, it's definitely a clever way of implementing radix sort with linked lists, which may make it useful in some cases (e.g. OS internals) where you don't want to allocate space for a big directly-addressable array.

Re:radix sort vs. comparison sort (3, Interesting)

I ditto this. Mergesort is a true comparison sort, which means it can sort anything so long as a comparison between two values is defined, e.g. strings (in lexicographical order), floating point numbers, 1000-digit numbers... you get the drift. BitFast is a radix sort and only works in cases where the list elements obey certain limitations: for instance, if each list element can be expressed in 32 bits (an int).
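That generality is visible even in C's standard library, where qsort takes nothing but a comparison callback; for example, ordering strings lexicographically (illustrative snippet, not from TFA):

```c
#include <stdlib.h>
#include <string.h>

/* A comparison sort needs only a predicate: here qsort orders an
   array of C strings lexicographically via strcmp. */
static int by_strcmp(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

void sort_strings(const char **words, size_t n)
{
    qsort(words, n, sizeof *words, by_strcmp);
}
```

Swap the comparator and the same sort handles dates, records, or anything else with a defined ordering, which is exactly what a radix sort cannot do.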

First, it's just a radix sort [well that's my take]. Second there are too many google ads on that page. Third, merge sort is O(n log n) time, hard to beat for random data. Fourth, most people use variations of the QUICK SORT not merge sort.

Tom

Re:wtf? seriously. (0)

Anonymous Coward | more than 6 years ago | (#18145256)

The point is not that this is some revolutionary way to sort, but that it is a fast way of sorting LINKED LISTS. Since linked lists do not have random access, you cannot perform a quicksort on them.

Quick sort does not require random access in order to be E[O(n*log(n))] on a shuffled linked list. Here [mtu.edu] is a good example. Even if you want to use the middle element as a pivot (good for near-sorted data), you can scan the linked list linearly for that element, and it will amortize to the same complexity.

Now, there is the question of whether you want to do it, but you most certainly can. For my unsigned integer linked-list sorting needs, I've found radix sort with a radix of 6 to work pretty well in practice.

Actually, you can perform quicksort on linked lists. But without random access, many of the optimizations you can use to make quicksort really "quick" can't be done, so there isn't much reason to prefer quicksort over a good mergesort implementation for linked lists.

Sixth, the headline "10X faster" is incorrect, as they differ asymptotically, not by a constant factor. (Run different data sets...vary by size of a single element, and by number of elements in the list. The ratio will change.)

Actually sorting linked lists is useful in many cases, and can be done fairly efficiently with a well written quicksort (without the step of converting to array that this code does), since quicksort is conceptually just recursive partition around a pivot element - partitioning a linked list is cheap and can sometimes be faster than using an array if the objects being sorted are complex and stored by value (i.e. in any case where moving the linked list pointers around would be faster than copying the objects). A variety of other sorts are also easily adaptable to linked lists.
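A sketch of that idea: quicksort on a singly linked list by splicing nodes into "less" and "greater-or-equal" sublists around a head pivot, with no random access and no element copies (node type and names are hypothetical):

```c
#include <stddef.h>

struct node { int key; struct node *next; };

/* Append list b after list a and return the head. */
static struct node *concat(struct node *a, struct node *b)
{
    if (!a) return b;
    struct node *t = a;
    while (t->next) t = t->next;
    t->next = b;
    return a;
}

/* Quicksort on a singly linked list: partition around the head node
   as pivot by relinking, recurse on each side, then splice together. */
struct node *qsort_list(struct node *head)
{
    if (!head || !head->next) return head;
    struct node *pivot = head, *lo = NULL, *hi = NULL;
    for (struct node *p = head->next, *nx; p; p = nx) {
        nx = p->next;
        if (p->key < pivot->key) { p->next = lo; lo = p; }
        else                     { p->next = hi; hi = p; }
    }
    pivot->next = qsort_list(hi);
    return concat(qsort_list(lo), pivot);
}
```

Using the head as pivot keeps the sketch short, but it degrades to O(n^2) on already-sorted input; scanning for a middle pivot, as suggested above, avoids that.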

Fifth, the author tries to "GPL" the algorithm, which is utter nonsense. GPL deals with copyright, so the most he can do is GPL his implementation of his bucket-/radix sort. Anyone is free to re-implement the algorithm, GPL or not.

It would require a software patent to restrict the use of the algorithm to GPL programs.

(And sixth, a quick look in a text book would have clued the author in)

Get NoScript [mozilla.org] for Firefox. It's annoying the first few days as you build a white-list, but now I don't deal with any crap. Google-analytics is forbidden and that page is now ad-free.

Third, merge sort is O(n log n) time, hard to beat for random data.

Provably impossible, in fact.

Fourth, most people use variations of the QUICK SORT not merge sort.

I'm not sure that's a fair generalisation. There are several good O(n log n) sorting algorithms, heap sort being another obvious one, and typically things like the data structures involved, need for stability, or ease of parallelisation are the deciding factors in practice. In fact some of the best performing algorithms in real terms are theoretically O(n^2) in the worst case, hence the creation of hybrids like introsort, a quicksort variant designed to avoid the worst case by switching to heap sort after a certain recursion depth is reached.

Anyway, I'm afraid this is another article by a 21-year-old student who is well-meaning, but not nearly experienced enough yet to appreciate that his suggestion is just rehashing (no pun intended) a lot of long-established ideas. I think the giveaway sign is that his CV on the site still lists just about every programming language and web development skill in existence. :-)

Third, merge sort is O(n log n) time, hard to beat for random data.

Provably impossible, in fact.

Not quite. O(n log n) is only the lower bound for comparison-based sorting. That is, if all you have is a less-than (or greater-than) predicate, the best algorithm you can build is O(n log n). If your algorithm is not comparison-based, you can beat O(n log n). The most obvious example is bucket sort, where you have k possible values. You pass over your list once, placing each value in the correct bucket, which gives you O(n) performance, at the cost of requiring k buckets' worth of storage space (you can create the buckets on the fly, but then you add an O(log n) search into each step of the algorithm, dropping it back to O(n log n)).
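The k-bucket case described there is just counting sort; a minimal sketch (names mine):

```c
#include <stddef.h>

/* Counting sort for integer keys in [0, k): one pass to histogram,
   one pass to rewrite in order -- O(n + k) time, k counters of space. */
void counting_sort(int *a, size_t n, int k, size_t *count /* k zeroed slots */)
{
    for (size_t i = 0; i < n; i++)
        count[a[i]]++;
    size_t out = 0;
    for (int v = 0; v < k; v++)
        for (size_t c = count[v]; c > 0; c--)
            a[out++] = v;
}
```

The O(n) bound only holds while k stays proportional to n; for sparse keys the counter array dominates, which is the space trade-off mentioned above.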

Re:wtf? seriously. (0)

Anonymous Coward | more than 6 years ago | (#18145638)

Seeing skills on his CV like "Visual Basic ***Excellent***" really engender my confidence in his abilities as a developer. You're just being unfair!

> the best performing algorithms in real terms are theoretically O(n^2)

Agreed. For example, in a microcode example at work, the cost of searching for a given string was totally dominated by memory access time, so it really didn't matter which algorithm we used. So we used brute force. Pretty easy to debug, as well.

First, it's just a radix sort [well that's my take]. Second there are too many google ads on that page. Third, merge sort is O(n log n) time, hard to beat for random data. Fourth, most people use variations of the QUICK SORT not merge sort.

Of course, radix sort is not a comparison sort, but has its own limitations as others have noted.

And Quicksort technically has a O(n^2) worst case (reversed list) but is more often picked over Merge Sort because on avg it is nLogn and doesn't require the extra memory merge sort requires.

It's useless to talk about the speed of a sort algorithm without discussing its complexity.

Linear improvements are completely uninteresting. Take algorithm X in Perl, then a version in optimized assembly, and there you go, a 10X improvement without having done anything groundbreaking.

A 10X faster bubble sort is just as useless for large enough values of n, and a 10x slower merge sort would still have a very good performance in comparison.

Since the article has nothing, I wanted to ask: has anyone implemented merge sort where you merge from both the front and back using two simultaneous threads ? I haven't seen this anywhere, but it's an obvious improvement for multicore machines. Does anyone know how to further parallelize merging (or for that matter, heaps) ?

Nevermind, I figured it out. You need to estimate the median (like quicksort) to parallelize merging to four cores. If anyone still has suggestions for heaps, I'd like to hear it though.

Re:multithreaded merge (0)

Anonymous Coward | more than 6 years ago | (#18145308)

The parallelization of merge sort is nothing new, and it is rather simple to do because of the recursive nature of the algorithm. I've seen parallel versions of merge sort with load balancing used to sort really large quantities of data.

Re:multithreaded merge (0)

Anonymous Coward | more than 6 years ago | (#18145322)

I've never seen it done before, but I don't imagine it would be particularly difficult to pull off. I'd say just add another parameter that tells the function how deep in the call stack it is. If it's less than x layers deep, fork with the child process going on to the other processor. The parent process sorts the first half of the list, and the child process sorts the second half. When the parent process finishes, it waits for the child process to finish (if it hasn't already), then collects the child's sorted list and merges as usual. The reason you'd want to track how deep you are is so that you don't wind up forking when there aren't any more processors available. For instance, if you were running on a two processor machine, you'd only want to fork once, since forking again would only add overhead without any gain. You could probably generate x on the fly based on the number of processors in the user's computer, too.

The only problem I foresee is the case where the added overhead exceeds the gain. However, I imagine this would only be the case when you are sorting very small lists. Perhaps one could check the size of the list first and then decide whether or not forking is worth it.
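That depth-limited forking scheme can be sketched with POSIX threads roughly like this (a toy version on int arrays; names and structure are mine):

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Depth-limited parallel mergesort: hand one half to a new thread and
   sort the other half ourselves, until `depth` levels have forked. */
struct job { int *a, *tmp; size_t n; int depth; };

static void merge(int *a, int *tmp, size_t n1, size_t n)
{
    size_t i = 0, j = n1, k = 0;
    while (i < n1 && j < n) tmp[k++] = a[i] <= a[j] ? a[i++] : a[j++];
    while (i < n1) tmp[k++] = a[i++];
    while (j < n)  tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *a);
}

static void *msort(void *arg)
{
    struct job *jb = arg;
    if (jb->n < 2) return NULL;
    size_t n1 = jb->n / 2;
    struct job left  = { jb->a,      jb->tmp,      n1,         jb->depth - 1 };
    struct job right = { jb->a + n1, jb->tmp + n1, jb->n - n1, jb->depth - 1 };
    if (jb->depth > 0) {             /* still have a processor to hand work to */
        pthread_t t;
        pthread_create(&t, NULL, msort, &left);
        msort(&right);
        pthread_join(t, NULL);
    } else {                         /* deep enough: go sequential */
        msort(&left);
        msort(&right);
    }
    merge(jb->a, jb->tmp, n1, jb->n);
    return NULL;
}

void parallel_msort(int *a, size_t n, int depth)
{
    int *tmp = malloc(n * sizeof *a);
    struct job jb = { a, tmp, n, depth };
    msort(&jb);
    free(tmp);
}
```

Generating the depth from the processor count (roughly log2 of the number of cores) matches the parent's suggestion of computing x on the fly.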

Since the article has nothing, I wanted to ask: has anyone implemented merge sort where you merge from both the front and back using two simultaneous threads ? I haven't seen this anywhere, but it's an obvious improvement for multicore machines. Does anyone know how to further parallelize merging (or for that matter, heaps) ?

There are entire classes of sorting algorithms for multiprocessor machines, including algorithms optimized for shared versus non-uniform memory architectures, message passing, sorting networks, and much much more. In general for m processors, sorting becomes an O(n log n / m) problem just as you'd expect (or hope).

To go even one step better you can use a sorting network and get O(lg n) complexity. Of course you need a network of size O(n lg n). Given a mesh-based computer you can sort n^2 items in O(n) time. But for now most of us are stuck with serial computers with small attempts at parallelism.

First of all, algorithm running times and complexities are not compared as 3x or 5x faster. Please give me running time in Big O notation [wikipedia.org] . Saying 10 times faster than merge sort makes no sense.

MergeSort was the state of the art

How about QuickSort [wikipedia.org] ?
AFAIK, quicksort (as well as mergesort) is an optimal algorithm for comparison-based sorting, which means you cannot find anything asymptotically better than that.

And how about submitting it to academic journals if you come up with something better?

> First of all, algorithm running times and complexities are not compared as 3x or 5x faster. Please give me running time in Big O notation. Saying 10 times faster than merge sort makes no sense.

But I came up with a sorting algorithm yesterday that's 12 times faster than bubblesort! Click here to see my advertisements....

Seriously, mergesort (but NOT quicksort) is optimal, O(n log n) in one model. But there are sub-n log n sorting algorithms in other models. Some of them are based upon radix sort, surprise, surprise....

Re:oh really? (0)

Anonymous Coward | more than 6 years ago | (#18145534)

Dude, O(N log N) is not the best case for sorting. There are known algorithms for doing a little better than that. But not a lot better so using quicksort still makes sense.

Well, consider the source. It was written by a 21 y/o student who has listed his primary language as Visual Basic. Obviously he isn't trained in theory. That said, a nice discovery for him, but not Slashdot news, and definitely not academic journal material.

Quicksort may have the best expected sorting time, but it still has an O(n^2) worst case. I think the STL takes a pass at the list to determine how sorted the list appears to be and then picks a sorting algorithm based on that... but it's been awhile since I looked at it.

Please give me running time in Big O notation [wikipedia.org].

Oh, but he says he didn't want to go into a bunch of theory. In other words, he doesn't understand the theory. In addition to theory, add copyrights and patents to the list of things he doesn't understand:

The algorithm is being released under the GPL ( General Public License ). The algorithm belongs to PhoenixBit and VirusFree but you may use/modify it freely.

1. You release the code under the GPL, not the algorithm. 2. You don't 'own' the algorithm. You might patent it, but you haven't (and obviously won't since it appears to be radix). Additionally, to GPL the code, you'd have to provide an unlimited redistribution license to the patent if it existed.

Merge sort and heap sort are both O(n log n) for arbitrary input -- they'll always take about the same amount of time no matter what the input data looks like. And O(n log n) is the best you can possibly do on average.

Quicksort, in contrast, is highly suboptimal -- O(n^2) -- in the case where the data are already in order. It only approaches O(n log n) behavior when the data are completely randomized. In general, of course, it depends on what data you're looking at, but quicksort performance will be somewhere in between these two cases.

People USE quicksort because it's optimal for many situations that people typically work with. That's not because of its asymptotic behavior, but because of its small constant factor.

Remember, our computers are not Turing machines. We never actually work with the average case. And n is a lot smaller than infinity.

Against my better judgment, I skimmed the article, code.

Two observations:

Why no analysis of impact on memory usage? 10x speed-up, but at what cost? Omitting this detail makes it difficult to weigh the utility of the algorithm.

Not an Apples to Apples comparison. In the source, the "BitFast" sort converts the entire link-list into an array before performing the sort. Without even considering the merits of the actual algorithms, the Mergesort has no chance of being faster given the more complicated data structure used.

As others pointed out, this sounds like the Radix sort, but I was too disgusted to step through the source in detail and even verify this. Maybe someone else will have a stronger constitution.

Here is a good Mergesort implementation (1, Interesting)

Anonymous Coward | more than 6 years ago | (#18145344)

The poster has just implemented a radix sort. One can make them quite a bit faster, but of course, they are basically restricted to sorting numbers, or at least bit strings with a reasonable diversity in the distribution of prefixes.

For a pretty efficient mergesort implementation in C/C++, try the following (it was at one time offered to libg++, but not taken up). It sorts, as an example, stdin line by line to stdout. The whole recursion stuff has been flattened into loops and binary arithmetic.

/* Copyright (c) 2007 David Kastrup dak@gnu.org */
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

template <class T, class V> class listsort : V {
public:
    listsort(const V &arg) : V(arg) { }
    T * &sort(T* &headin, size_t n) const
    {
        size_t underpow, n1, n2, bitrev;
        int lev, kmax;
        T **head = &headin;
        T **headstack[CHAR_BIT*sizeof(size_t)];
        T **tail;
        T **tail1;
        T *p1;
        T *p2;
        int sp = 0;
        switch (n)
        {
        case 0:
            return *head;
        case 1:
            return getlink(*head);
        }
        for (underpow = 1, kmax = 0; underpow <= (n-1)/4;) {
            underpow *= 2;
            ++kmax;
        }
        for (lev = kmax, bitrev = underpow, n -= 2*underpow;;) {
            p1 = *head;
            tail = &getlink(p1);
            p2 = *tail;
            if (listleq(p1, p2)) {
                tail = &getlink(p2);
            } else {
                *tail = getlink(p2);
                getlink(p2) = p1;
                *head = p1 = p2;
            }
            switch ((bitrev+n)>>lev) {
            case 2:
                p2 = *tail;
                if (listleq(p1, p2)) {
                    tail1 = &getlink(p1);
                    p1 = *tail1;
                    if (listleq(p1, p2)) {
                        tail = &getlink(p2);
                        break;
                    }
                    *tail1 = p2;
                } else
                    *head = p2;
                *tail = getlink(p2);
                getlink(p2) = p1;
                break;
            case 3:
                headstack[sp++] = head;
                head = tail;
                ++lev;
                continue;
            }
            for (;;) {
                if (lev == 0)
                    return *tail;
                if (!((bitrev>>--lev)&1))
                    break;
                n2 = ((bitrev+n) >> lev) + 1;
                n1 = n2/2;
                n2 -= n1;
                tail1 = head;
                head = headstack[--sp];
                p2 = *tail1;
                do {
                    p1 = *head;
                    if (!listleq(p1, p2)) {
                        *head = p2;
                        do {
                            head = &getlink(p2);
                            p2 = *head;
                            if (!--n2) {
                                *tail1 = p2;
                                *head = p1;
                                tail = tail1;
                                goto continuemerge;
                            }
                        } while (!listleq(p1, p2));
                        *head = p1;
                    }
                    head = &getlink(p1);
                } while (--n1);
                *head = p2;
            continuemerge:
                head = headstack[sp];
            }
            headstack[sp++] = head;
            head = tail;
            bitrev -= underpow - (3<<lev);
            lev = kmax;
        }
    }
};

Ideally, I'd like a tag for the clump of "Computer science by people who haven't done the background reading" posts we get. I haven't seen "my algorithm compresses everything, even compressed files!" or "I've proved that P = NP!" posts in a while. But we should be prepared.

I've done a bit of scientific programming, but I'm not a computer-science-oriented person by any stretch. Thus, I'm a little confused by this. Under what license are things like Quicksort and Stupidsort (for another, very bad example) "released"? Why would a person feel it necessary to release an algorithm under any kind of license at all, in fact? No textbook would want to include GPL-ed code--wouldn't that make the entire book GPL by nature of the license?

Please correct me if I'm mistaken, but this seems like a fairly bad precedent to set.

You can't copyright an algorithm, so there is no license to worry about. This is a well meaning but utterly clueless person who reimplemented a well known algorithm and then made a strange constant-factor instead of asymptotic analysis.

As far as I understand, source code can be placed under the GPL, but an algorithm cannot be. An algorithm is just a concept - it can be patented of course, but that's beside the point here. Unless I'm mistaken, you could write your own implementation of BitFast and use it for commercial purposes, closed-source projects, whatever you want, as long as you don't use their source code.

Algorithms aren't protected... a particular implementation can be. If you create an "original work of authorship," and fix it in a "tangible medium of expression", you've got a copyright. Therefore, the need for a license. Assuming that the source code was sufficient to meet the copyright standards, if the "author" hadn't said anything, it would have been protected and use would have been unlicensed.

IAAL, but I didn't RTFA, so I won't tell you whether it did actually meet those requirements.

Did KD set this guy up for ridicule on purpose? (5, Informative)

- Message : "Don't ever let school, stop you from learning!"

School could, learn you some grammar.

The algorithm is being released under the GPL ( General Public License ). The algorithm belongs to PhoenixBit and VirusFree
but you may use/modify it freely.
*** DO NOT COPY ANYTHING FROM THIS PAGE TO ANY OTHER PAGE. IF YOU
WANT SOMEONE TO READ THIS THEN LINK TO THIS PAGE ***

In addition to trying to apply copyright to an algorithm, doesn't a restriction on copying defeat the purpose of releasing something under the GPL? Or does text in all caps trump previous text not in all caps?

Feel free to add to this. If there are some clips of this guy with lightsabers, pleas post them.

kdawson is an idiot (0)

Anonymous Coward | more than 6 years ago | (#18145668)

That's why he put it up. This is just some huckster trying to (and succeeding in) making money off of idiot Slashdot moderators and viewers.

Re:Did KD set this guy up for ridicule on purpose? (-1)

I think it's almost mandatory to have grammar or spelling errors in any grammar nazi post. I noticed the "pleas post" right after posting. As a newly found subset of Murphy's Law, someone should name it and become famous.

The algorithm belongs to PhoenixBit and VirusFree (0)

Anonymous Coward | more than 6 years ago | (#18145536)

The algorithm belongs to PhoenixBit and VirusFree but you may use/modify it freely.

Huh? You can copyright the implementation you have (what allows you to put it under the GPL). That does nothing to control the algorithm... you would have to attempt to get a patent to do that.

This is a variant of RadixSort, which is well known to be faster than any comparison sort such as MergeSort. The problem with non-comparison sorts is that as such they are restricted to sorting items representable as unstructured bit fields, which means, essentially, integers. A large part of the time, the real problem in sorting is (a) extracting the fields that you want to use as keys (since it is not generally the case that you want to sort on the entire record) and (b) arranging for each pair of records to compare as you need them to, which involves both recognizing the internal structure of keys (consider the case of dates) and imposing suitable weights for the individual components. In other words, in many situations the bulk of the code and time are devoted to parsing and transformation of records. So long as you are not using a really bad algorithm, the time devoted to the sort itself is likely to be a small percentage of the total time.

For example, I have written a sort utility [billposer.org] that differs from most others in its ability to parse the input into records and records into fields and in the transformations it can apply so as to use unusual sort orders and sort on things like dates, month names, and numbers in non-Western number systems. It was originally written for alphabetizing dictionaries of "exotic" languages.
It is frequently the case that the time devoted to the actual sort is less than 1% of the run time.

In sum, non-comparison sorts have a niche but are of limited utility because they get their speed from making use of additional information that is only available for a limited set of datatypes. For the great majority of applications, only comparison sorts are flexible enough to be of use.

Re:solution in search of a problem (0)

Anonymous Coward | more than 6 years ago | (#18145826)

For example, I have written a sort utility that differs from most others in its ability to parse the input into records and records into fields and in the transformations it can apply so as to use unusual sort orders and sort on things like dates, month names, and numbers in non-Western number systems.

I do something very similar.
It's called a database.

No, it isn't a database. It doesn't support queries or allow for the insertion or modification of data. A true statement would be that the sort functions of database management and query programs are usually closer in flexibility to msort than are stand-alone sort utilities.

While radix sorts are limited, binned sorts, which are a superset of radix sorts, are more powerful. Binned sorts do work on a broad set of data types, but you have to use statistical techniques to dynamically pick a good distribution among the bins. SyncSort [syncsort.com] developed this technique in the 1960s, and had the first software patent, now long-expired. Mainframe batch systems have used such techniques for decades.

The general idea is that if you're sorting, say, text strings, you watch them go by for a while, accumulating statistics, then compute a set of bin division points which will divide them into bins of roughly equal size. The bins have to be watched, and if some get too big or too small, the bin division points have to be adjusted. It's a tree-rebalancing problem.
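The bin-point computation can be sketched simply: sort a sample of the keys seen so far and take every (sample_n/k)-th element as a division point (illustrative only, not SyncSort's actual code):

```c
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    return (*(const int *)a > *(const int *)b) - (*(const int *)a < *(const int *)b);
}

/* Pick k-1 bin division points from a sample of keys so that each bin
   receives roughly sample_n/k of the sampled keys -- the "watch them
   go by for a while" idea from the parent post. */
void pick_splitters(int *sample, size_t sample_n, int k, int *splitters /* k-1 */)
{
    qsort(sample, sample_n, sizeof *sample, cmp_int);
    for (int i = 1; i < k; i++)
        splitters[i - 1] = sample[i * sample_n / k];
}
```

Rebalancing when bins drift just means re-running this on a fresh sample, which is the tree-rebalancing problem mentioned above.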

If your sort is too big for RAM, and you have to go out to disk (or, in early systems, tape) this is a huge win. It lets you beat O(N log N), both in theory and in practice. SyncSort times grow at worse than O(N), but better than O(N log N).

Few people bother for in-memory sorts, though.

I am sure (0)

Anonymous Coward | more than 6 years ago | (#18145608)

I am sure Microsoft has patented it by now.

int main (0)

Anonymous Coward | more than 6 years ago | (#18145656)

Retard who thinks you can GPL an algorithm manages to reinvent an algorithm first implemented as a computer program in the 1950s, and used way before then outside of digital computers.

Also manages to not know about memset and write C code that assumes sizeof(long)==4 and puts non-inline code in header files. I don't dare look at the C++ code since there's a large probability it will cause permanent brain damage.

> Also manages to not know about memset and write C code that assumes sizeof(long)==4 and puts non-inline code in header files. I don't dare look at the C++ code since there's a large probability it will cause permanent brain damage.

On 32-bit machines, sizeof(long) is 4. On 64-bit machines it depends on the data model: 8 under LP64 (most Unix-like systems), but still 4 under LLP64 (64-bit Windows).

Anonymous Coward | more than 6 years ago | (#18145836)

Bit level comparison of floating point numbers won't work in general, and more specifically, it doesn't work in the code you posted. Also, after translating your time functions for unix, your algorithm does not terminate unless compiler optimizations are turned on (gcc -o moo main.c -O3).

Barring the fact that floating point is broken, comparing your radix sort to another array-based sorting algorithm might be more productive. O(N log N) is the lower bound for non-parallel comparison sorting, and much of the battle is coming up with lower multiplicative constants, as you are trying to do. Again, your comparison

You state that you need a constant amount of memory, in your case this is sizeof(long) * 2 * 2^16 = 2^19 bytes, which is half a meg.

Testing the long version doesn't seem to get equivalently sorted lists either. For one thing you seem to be sorting in the opposite direction as mergesort.

Ignoring correctness, you only have a speedup of ~1.3 over qsort.

Perhaps coming from a math background, my priorities differ from those with a CS background, but...

Where's the proof?

Is it peer reviewed? Has it been published? Does it exist?
Although the lack of a proof does not mean your algorithm is incorrect, I'd be suspicious. In fact, I would be unwilling to use any algorithm in my code without being sure that a good proof of its correctness exists somewhere.
Proofs are more than formalities, they're essential: without a good proof, you're just a crank selling snake oil.

## FROTHY PISS (-1)

## Anonymous Coward | more than 6 years ago | (#18145098)

## Re:FROTHY PISS (5, Funny)

## Anonymous Coward | more than 6 years ago | (#18145226)

You are an "HTML programmer".

An oxymoron and a moron, simultaneously.

Defenestrate yourself.

## I CODE HTML IN C++ (-1, Troll)

## Anonymous Coward | more than 6 years ago | (#18145818)

Reason: Don't use so many caps. It's like YELLING.

## first, finally (-1)

## Anonymous Coward | more than 6 years ago | (#18145106)

hello world

## Re:first, finally (0, Offtopic)

## grimJester (890090) | more than 6 years ago | (#18145364)

:-)hello world

## That doesn't matter! (5, Funny)

## Anonymous Coward | more than 6 years ago | (#18145254)

## Re:How is this not a radix sort? (1)

## truthsearch (249536) | more than 6 years ago | (#18145288)

## Re:How is this not a radix sort? (1)

## Anthony (4077) | more than 6 years ago | (#18145316)

## Re:How is this not a radix sort? (0)

## Anonymous Coward | more than 6 years ago | (#18145374)

Could someone volunteer to improve the writing in that article? At least for that atrocious first paragraph?

## Gentle readers.... (1)

## Anonymous Coward | more than 6 years ago | (#18145720)

## Re:How is this not a radix sort? (4, Funny)

## Sillygates (967271) | more than 6 years ago | (#18145778)

Sorts in O(1) and uses no memory!

## It's radix sort. (5, Informative)

## Nick Mathewson (11078) | more than 6 years ago | (#18145114)

## Re:It's radix sort. (4, Insightful)

## abradsn (542213) | more than 6 years ago | (#18145208)

radix is not

I agree though that the concepts for both algorithms are very similar.

## Re:It's radix sort. (5, Informative)

## vidarh (309115) | more than 6 years ago | (#18145296)

## Re:It's radix sort. (1)

## lordsid (629982) | more than 6 years ago | (#18145642)

## Re:It's radix sort. (1, Funny)

## Anonymous Coward | more than 6 years ago | (#18145746)

## Nice, but... (1, Informative)

## i kan reed (749298) | more than 6 years ago | (#18145116)

## Re:Nice, but... (0)

## Anonymous Coward | more than 6 years ago | (#18145244)

## Re:Nice, but... (0)

## Anonymous Coward | more than 6 years ago | (#18145278)

Mergesort *on a linked list* requires constant space.

## Re:Nice, but... (1)

## i kan reed (749298) | more than 6 years ago | (#18145422)

## Re:Nice, but... (5, Informative)

## dlthomas (762960) | more than 6 years ago | (#18145468)

As pretty much everyone has pointed out, it's just radix sort. The time taken by radix sort scales linearly with the number of keys, and with the log of the maximum key that can be held by the container.

If we're dealing entirely with unique keys, this is of course >= log(N), by the pigeonhole principle and all that. If we may have duplicate keys, however, there may be more keys than container space, and radix *does* scale linearly with the keys.

Time to prepare and overhead of this algorithm are negligible in any context large enough for us to care about the O() of the algorithms. For some things, this *is* better than mergesort. I'm just not sure why it was posted, as it's also not new - it was invented over 50 years ago.
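The scaling dlthomas describes is easy to see in a conventional array-based LSD radix sort. The sketch below is illustrative only (not the article's code); it uses 8-bit digits, so a 32-bit key always takes exactly four linear passes:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One counting pass per 8-bit digit, least significant digit first.
// Each pass is O(n), and the pass count is fixed by the key width
// (4 passes for 32-bit keys), so total work is linear in the keys.
void radix_sort_lsd(std::vector<std::uint32_t> &a) {
    std::vector<std::uint32_t> tmp(a.size());
    for (int shift = 0; shift < 32; shift += 8) {
        std::size_t count[257] = {0};
        for (std::uint32_t x : a)              // histogram of this digit
            ++count[((x >> shift) & 0xFF) + 1];
        for (int i = 0; i < 256; ++i)          // prefix sums -> bucket offsets
            count[i + 1] += count[i];
        for (std::uint32_t x : a)              // stable scatter into tmp
            tmp[count[(x >> shift) & 0xFF]++] = x;
        a.swap(tmp);
    }
}
```

Duplicate keys cost nothing extra here, which is exactly the case dlthomas mentions where the key count can exceed the key space.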

## Re:Nice, but... (1)

## quanticle (843097) | more than 6 years ago | (#18145636)

## Radix Sort (0, Redundant)

## doopokko (677503) | more than 6 years ago | (#18145122)

## Re:Radix Sort (5, Funny)

## Anonymous Coward | more than 6 years ago | (#18145144)

Marketing.

## Almost accurate. (0, Redundant)

## imbaczek (690596) | more than 6 years ago | (#18145138)

## 10x? (-1, Troll)

## Anonymous Coward | more than 6 years ago | (#18145142)

## Re:10x? (-1, Troll)

## Anonymous Coward | more than 6 years ago | (#18145214)

## Only lists of integers (2, Informative)

## iamacat (583406) | more than 6 years ago | (#18145148)

## Re:Only lists of integers (3, Interesting)

## antifoidulus (807088) | more than 6 years ago | (#18145412)

## radix sort vs. comparison sort (5, Informative)

## Peter Desnoyers (11115) | more than 6 years ago | (#18145158)

However, it's definitely a clever way of implementing radix sort with linked lists, which may make it useful in some cases (e.g. OS internals) where you don't want to allocate space for a big directly-addressable array.
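A linked-list variant along those lines can be sketched as follows (`Node` and `radix_sort_list` are invented names for illustration): each pass needs only a fixed 256-entry bucket table, never a per-element array, no matter how long the list is.

```cpp
#include <cassert>
#include <cstdint>

struct Node { std::uint32_t key; Node *next; };

// LSD radix sort over a singly linked list: each pass distributes nodes
// into 256 buckets by one byte of the key, then splices the buckets back
// together in order. The bucket table is the only extra storage.
// (A sketch of the idea, not the article's code.)
Node *radix_sort_list(Node *head) {
    for (int shift = 0; shift < 32; shift += 8) {
        Node *bhead[256] = {nullptr};
        Node *btail[256] = {nullptr};
        for (Node *p = head; p; p = p->next) {   // distribute by one byte
            unsigned d = (p->key >> shift) & 0xFF;
            if (btail[d]) btail[d]->next = p; else bhead[d] = p;
            btail[d] = p;
        }
        Node **tail = &head;
        for (int d = 0; d < 256; ++d)
            if (bhead[d]) {                      // splice buckets in order
                *tail = bhead[d];
                tail = &btail[d]->next;
            }
        *tail = nullptr;                         // terminate the new list
    }
    return head;
}
```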

## Re:radix sort vs. comparison sort (3, Interesting)

## Ignorant Aardvark (632408) | more than 6 years ago | (#18145388)

## wtf? seriously. (4, Informative)

## tomstdenis (446163) | more than 6 years ago | (#18145164)

Tom

## Re:wtf? seriously. (0)

## Anonymous Coward | more than 6 years ago | (#18145256)

dom

## Re:wtf? seriously. (1)

## SnowZero (92219) | more than 6 years ago | (#18145694)

Now, there is the question of whether you *want* to do it, but you most certainly *can*. For my unsigned integer linked-list sorting needs, I've found radix sort with a radix of 6 to work pretty well in practice.

## Re:wtf? seriously. (1)

## GileadGreene (539584) | more than 6 years ago | (#18145760)

You *can* perform quicksort on linked lists. But without random access, many of the optimizations you can use to make quicksort really "quick" can't be done, so there isn't much reason to prefer quicksort over a good mergesort implementation for linked lists.

## Re:wtf? seriously. (4, Insightful)

## slamb (119285) | more than 6 years ago | (#18145266)

Sixth, the headline "10X faster" is incorrect, as they differ asymptotically, not by a constant factor. (Run different data sets...vary by size of a single element, and by number of elements in the list. The ratio will change.)
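A back-of-the-envelope operation count shows why the ratio cannot be a constant. Assuming (purely for illustration) that a 4-pass radix sort does about 4n digit operations while a comparison sort does about n log2 n comparisons:

```cpp
#include <cassert>
#include <cmath>

// Rough operation counts only: the "speedup" of a 4-pass radix sort
// over an n*log2(n) comparison sort works out to log2(n)/4 -- a
// function of n, not a fixed "10X". Real per-operation constants
// differ, which is exactly the point being made above.
double speedup_estimate(double n) {
    return (n * std::log2(n)) / (4.0 * n);  // simplifies to log2(n) / 4
}
```

At n = 2^10 the estimated ratio is 2.5; at n = 2^20 it is 5. Whatever the true constants per operation are, the ratio moves with n.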

## Re:wtf? seriously. (2, Insightful)

## vidarh (309115) | more than 6 years ago | (#18145516)

## Re:wtf? seriously. (5, Insightful)

## phazer (9089) | more than 6 years ago | (#18145300)

It would require a software patent to restrict the use of the algorithm to GPL programs.

(And sixth, a quick look in a text book would have clued the author in)

## Re:wtf? seriously. (1)

## Alaria Phrozen (975601) | more than 6 years ago | (#18145336)

Get NoScript [mozilla.org] for Firefox. It's annoying the first few days as you build a white-list, but now I don't deal with any crap. Google-analytics is forbidden and that page is now ad-free.

## Re:wtf? seriously. (3, Informative)

## Anonymous Brave Guy (457657) | more than 6 years ago | (#18145380)

Provably impossible, in fact.

I'm not sure that's a fair generalisation. There are several good O(n log n) sorting algorithms, heap sort being another obvious one, and typically things like the data structures involved, need for stability, or ease of parallelisation are the deciding factors in practice. In fact some of the best performing algorithms in real terms are theoretically O(n^2) in the worst case, hence the creation of hybrids like introsort, a quicksort variant designed to avoid the worst case by switching to heap sort after a certain recursion depth is reached.

Anyway, I'm afraid this is another article by a 21-year-old student who is well-meaning, but not nearly experienced enough yet to appreciate that his suggestion is just rehashing (no pun intended) a lot of long-established ideas. I think the giveaway sign is that his CV on the site still lists just about every programming language and web development skill in existence. :-)

## Re:wtf? seriously. (2, Insightful)

## TheRaven64 (641858) | more than 6 years ago | (#18145564)

## Re:wtf? seriously. (0)

## Anonymous Coward | more than 6 years ago | (#18145638)

"***Excellent***" really engenders my *confidence* in his abilities as a developer. You're just being unfair!

## Re:wtf? seriously. (1)

## repruhsent (672799) | more than 6 years ago | (#18145802)

Do Until 1 = 2

AddHobbie ("Computers");

Loop

That part made me take him very seriously as a developer. He should come give a lecture in my algorithms class.

## Re:wtf? seriously. (2, Interesting)

## Ivan the Terrible (115742) | more than 6 years ago | (#18145882)

Agreed. For example, in a microcode example at work, the cost of searching for a given string was totally dominated by memory access time, so it really didn't matter which algorithm we used. So we used brute force. Pretty easy to debug, as well.

## Re:wtf? seriously. (1)

## Brian_Ellenberger (308720) | more than 6 years ago | (#18145852)

First, it's just a radix sort [well that's my take]. Second, there are too many google ads on that page. Third, merge sort is O(n log n) time, hard to beat for random data. Fourth, most people use variations of the QUICK SORT not merge sort.

For comparison sorts, O(n log n) is not just hard to beat, it is *impossible* to beat. See http://www.ics.uci.edu/~eppstein/161/960116.html [uci.edu].

Of course, radix sort is not a comparison sort, but has its own limitations as others have noted.

And quicksort technically has an O(n^2) worst case (e.g. a reversed list with a naive pivot choice) but is more often picked over merge sort because on average it is O(n log n) and doesn't require the extra memory merge sort requires.
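That worst case is easy to demonstrate. A naive quicksort with a first-element pivot, instrumented here (for illustration) with a comparison counter, does exactly n(n-1)/2 comparisons on input that is already sorted:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Naive quicksort, first-element pivot, Lomuto-style partition. On
// already-sorted input every partition is maximally lopsided, so the
// comparison count grows quadratically. The counter is illustration
// only, not part of any real implementation.
static std::size_t comparisons = 0;

void naive_quicksort(std::vector<int> &a, int lo, int hi) {
    if (lo >= hi) return;
    int pivot = a[lo], i = lo;
    for (int j = lo + 1; j <= hi; ++j) {
        ++comparisons;
        if (a[j] < pivot) std::swap(a[++i], a[j]);
    }
    std::swap(a[lo], a[i]);                    // pivot into final place
    naive_quicksort(a, lo, i - 1);
    naive_quicksort(a, i + 1, hi);
}
```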

## 10X speedup? Compared to what? (2, Insightful)

## DaleGlass (1068434) | more than 6 years ago | (#18145180)

Linear improvements are completely uninteresting. Take algorithm X in Perl, then a version in optimized assembly, and there you go, a 10X improvement without having done anything groundbreaking.

A 10X faster bubble sort is just as useless for large enough values of n, and a 10x slower merge sort would still have a very good performance in comparison.

## Nice try, b ut... (4, Informative)

## Ancient_Hacker (751168) | more than 6 years ago | (#18145186)

It was used by 80-col card sorters since circa 1925.

See Knuth, Volume 3

## Re:Nice try, b ut... (1)

## RevMike (632002) | more than 6 years ago | (#18145398)

## multithreaded merge (1)

## Zork the Almighty (599344) | more than 6 years ago | (#18145204)

## Re:multithreaded merge (1)

## Zork the Almighty (599344) | more than 6 years ago | (#18145270)

## Re:multithreaded merge (0)

## Anonymous Coward | more than 6 years ago | (#18145308)

## Re:multithreaded merge (0)

## Anonymous Coward | more than 6 years ago | (#18145322)

http://www-unix.mcs.anl.gov/dbpp/text/node123.htm

http://www-unix.mcs.anl.gov/dbpp/text/node127.htm

## Re:multithreaded merge (2, Interesting)

## SuperMog2002 (702837) | more than 6 years ago | (#18145352)

The only problem I foresee is the case where the added overhead exceeds the gain. However, I imagine this would only be the case when you are sorting very small lists. Perhaps one could check the size of the list first and then decide whether or not forking is worth it.

## Re:multithreaded merge (4, Informative)

## TERdON (862570) | more than 6 years ago | (#18145440)

## Re:multithreaded merge (1)

## DamnStupidElf (649844) | more than 6 years ago | (#18145556)

Since the article has nothing, I wanted to ask: has anyone implemented merge sort where you merge from both the front and back using two simultaneous threads? I haven't seen this anywhere, but it's an obvious improvement for multicore machines. Does anyone know how to further parallelize merging (or for that matter, heaps)?

There are entire classes of sorting algorithms for multiprocessor machines, including algorithms optimized for shared versus non-uniform memory architectures, message passing, sorting networks, and much much more. In general for m processors, sorting becomes an O(n log n / m) problem just as you'd expect (or hope).
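The simplest concrete version of the two-thread idea raised above is to sort the halves concurrently and keep the merge itself sequential (a sketch assuming C++11 `std::thread` and `std::inplace_merge`; parallelizing the merge step itself is the genuinely hard part):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Sort each half on its own thread, then do one sequential merge:
// the obvious two-core split of mergesort. The merge remains the
// serial bottleneck, which is exactly what the question is about.
void two_thread_mergesort(std::vector<int> &a) {
    std::size_t mid = a.size() / 2;
    std::thread t([&] { std::sort(a.begin(), a.begin() + mid); });
    std::sort(a.begin() + mid, a.end());       // this half on our thread
    t.join();
    std::inplace_merge(a.begin(), a.begin() + mid, a.end());
}
```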

## Re:multithreaded merge (1)

## Helios1182 (629010) | more than 6 years ago | (#18145736)

## animation (2, Funny)

## hansamurai (907719) | more than 6 years ago | (#18145206)

## Re:animation (1)

## Ossifer (703813) | more than 6 years ago | (#18145436)

And just as unscientific...

## Apropos (1)

## heli_flyer (614850) | more than 6 years ago | (#18145238)

## oh really? (1, Insightful)

## atamyrat (980611) | more than 6 years ago | (#18145290)

AFAIK, quicksort (as well as mergesort) is an optimal algorithm for comparison-based sorting, which means you cannot find anything asymptotically better.

And how about submitting it to academic journals if you come up with something better?

## Re:oh really? (2, Funny)

## Ossifer (703813) | more than 6 years ago | (#18145482)

But I came up with a sorting algorithm yesterday that's 12 times faster than bubblesort! Click here to see my advertisements....

Seriously, mergesort (but NOT quicksort) is optimal, O(n log n) in one model. But there are sub-n log n sorting algorithms in other models. Some of them are based upon radix sort, surprise, surprise....

## Re:oh really? (0)

## Anonymous Coward | more than 6 years ago | (#18145534)

## Re:oh really? (2, Insightful)

## Stalus (646102) | more than 6 years ago | (#18145644)

Quicksort may have the best expected sorting time, but it still has an O(n^2) worst case. I think the STL takes a pass at the list to determine how sorted the list appears to be and then picks a sorting algorithm based on that... but it's been a while since I looked at it.

## Re:oh really? (1)

## Mr. Underbridge (666784) | more than 6 years ago | (#18145776)

Please give me running time in Big O notation [wikipedia.org].

Oh, but he says he didn't want to go into a bunch of theory. In other words, he doesn't understand the theory. In addition to theory, add copyrights and patents to the list of things he doesn't understand:

1. You release the code under the GPL, not the algorithm.

2. You don't 'own' the algorithm. You might patent it, but you haven't (and obviously won't since it appears to be radix). Additionally, to GPL the code, you'd have to provide an unlimited redistribution license to the patent if it existed.

## Re:oh really? (1)

## Purple Screws (1017084) | more than 6 years ago | (#18145850)

Merge sort and heap sort are both O(n log n) for arbitrary input -- they'll always take about the same amount of time no matter what the input data looks like. And O(n log n) is the best any comparison sort can possibly do on average.

Quicksort, in contrast, is highly suboptimal -- O(n^2) -- when a naive pivot choice meets data that are already in order. It approaches O(n log n) behavior on randomized data. In general, of course, it depends on what data you're looking at, but quicksort performance will be somewhere in between these two cases.

People USE quicksort because it's optimal for many situations that people typically work with. That's not because of its asymptotic behavior, but because of its small constant factor.

Remember, our computers are not Turing machines. We never actually work with the average case. And n is a lot smaller than infinity.

## Can't believe I wasted my time (1)

## defile (1059) | more than 6 years ago | (#18145318)

Against my better judgment, I skimmed the article and code.

Two observations:

Why no analysis of impact on memory usage? 10x speed-up, but at what cost? Omitting this detail makes it difficult to weigh the utility of the algorithm.

Not an apples-to-apples comparison. In the source, the "BitFast" sort converts the entire linked list into an array before performing the sort. Without even considering the merits of the actual algorithms, the mergesort has no chance of being faster given the more complicated data structure it must use.

As others pointed out, this sounds like the Radix sort, but I was too disgusted to step through the source in detail and even verify this. Maybe someone else will have a stronger constitution.

## Here is a good Mergesort implementation (1, Interesting)

## Anonymous Coward | more than 6 years ago | (#18145344)

For a pretty efficient mergesort implementation in C++, try the following (it was at one time offered to libg++, but not taken up). As an example, it sorts stdin line by line to stdout. The whole recursion stuff has been flattened into loops and binary arithmetic.

```cpp
#include <cstddef>
#include <climits>   // for CHAR_BIT
#include <iostream>
#include <string>

using namespace std;

template <class T, class V> class listsort : V {
public:
    listsort(const V &arg) : V(arg) { }

    T *&sort(T *&headin, size_t n) const
    {
        size_t underpow, n1, n2, bitrev;
        int lev, kmax;
        T **head = &headin;
        T **headstack[CHAR_BIT * sizeof(size_t)];
        T **tail, **tail1;
        T *p1, *p2;
        int sp = 0;

        switch (n) {
        case 0:
            return *head;
        case 1:
            return getlink(*head);
        }
        for (underpow = 1, kmax = 0; underpow <= (n - 1) / 4;) {
            underpow *= 2;
            ++kmax;
        }
        for (lev = kmax, bitrev = underpow, n -= 2 * underpow;;) {
            p1 = *head;
            tail = &getlink(p1);
            p2 = *tail;
            if (listleq(p1, p2)) {
                tail = &getlink(p2);
            } else {
                *tail = getlink(p2);
                getlink(p2) = p1;
                *head = p1 = p2;
            }
            switch ((bitrev + n) >> lev) {
            case 2:
                p2 = *tail;
                if (listleq(p1, p2)) {
                    tail1 = &getlink(p1);
                    p1 = *tail1;
                    if (listleq(p1, p2)) {
                        tail = &getlink(p2);
                        break;
                    }
                    *tail1 = p2;
                } else
                    *head = p2;
                *tail = getlink(p2);
                getlink(p2) = p1;
                break;
            case 3:
                headstack[sp++] = head;
                head = tail;
                ++lev;
                continue;
            }
            for (;;) {
                if (lev == 0)
                    return *tail;
                if (!((bitrev >> --lev) & 1))
                    break;
                n2 = ((bitrev + n) >> lev) + 1;
                n1 = n2 / 2;
                n2 -= n1;
                tail1 = head;
                head = headstack[--sp];
                p2 = *tail1;
                do {
                    p1 = *head;
                    if (!listleq(p1, p2)) {
                        *head = p2;
                        do {
                            head = &getlink(p2);
                            p2 = *head;
                            if (!--n2) {
                                *tail1 = p2;
                                *head = p1;
                                tail = tail1;
                                goto continuemerge;
                            }
                        } while (!listleq(p1, p2));
                        *head = p1;
                    }
                    head = &getlink(p1);
                } while (--n1);
                *head = p2;
            continuemerge:
                head = headstack[sp];
            }
            headstack[sp++] = head;
            head = tail;
            bitrev -= underpow - (3 << lev);
            lev = kmax;
        }
    }
};

template <class T> T *&sort(T *&headin, size_t n, T *T::*Link,
                            bool compare(const T &, const T &));

template <class T> class sorter {
    T *T::*const Link;
    bool (*const compare)(const T &, const T &);
    sorter(T *T::*Link, bool compare(const T &, const T &))
        : Link(Link), compare(compare) { }
    friend T *&sort<T>(T *&headin, size_t n, T *T::*Link,
                       bool compare(const T &, const T &));
protected:
    bool listleq(const T *a, const T *b) const {
        return compare(*a, *b);
    }
    T *&getlink(T *p) const {
        return p->*Link;
    }
};

template <class T>
T *&sort(T *&headin, size_t n, T *T::*Link, bool compare(const T &, const T &))
{
    return listsort<T, sorter<T> >(sorter<T>(Link, compare)).sort(headin, n);
}

template <class T, T *T::*Link>
T *&sort(T *&headin, size_t n);

template <class T, T *T::*Link> class sorter2 {
    sorter2() { }
    friend T *&sort<T, Link>(T *&headin, size_t n);
protected:
    static bool listleq(const T *a, const T *b) {
        return *a <= *b;
    }
    static T *&getlink(T *p) {
        return p->*Link;
    }
};

template <class T, T *T::*Link>
T *&sort(T *&headin, size_t n)
{
    return listsort<T, sorter2<T, Link> >(sorter2<T, Link>()).sort(headin, n);
}

struct xxx {
    string str;
    xxx *next;
} *head, **tail = &head;

inline bool operator<=(const xxx &a, const xxx &b)
{
    return (a.str <= b.str);
}

int main(int ac, char **av)
{
    string buf;
    size_t n = 0;
    while (getline(cin, buf)) {
        ++n;
        *tail = new xxx;
        (*tail)->str = buf;
        tail = &(*tail)->next;
    }
    sort<xxx, &xxx::next>(head, n) = 0;
    while (head) {
        cout << head->str << endl;
        head = head->next;
    }
    return 0;
}
```

## Re:Here is a good Mergesort implementation (1)

## Shivetya (243324) | more than 6 years ago | (#18145458)

## You're making the baby Knuth cry (5, Funny)

## aprosumer.slashdot (545227) | more than 6 years ago | (#18145372)

## Re:You're making the baby Knuth cry (1)

## vadim_t (324782) | more than 6 years ago | (#18145448)

*really* scary ones. My condolences.

## Time for the "reinventthewheel" tag? (1)

## jfengel (409917) | more than 6 years ago | (#18145574)

How about "badcomputerscience"?

## Re:Time for the "reinventthewheel" tag? (1)

## Asztal_ (914605) | more than 6 years ago | (#18145846)

## GPL-ed algorithm? (2, Interesting)

## Wilson_6500 (896824) | more than 6 years ago | (#18145400)

Please correct me if I'm mistaken, but this seems like a fairly bad precedent to set.

## Re:GPL-ed algorithm? (4, Informative)

## Helios1182 (629010) | more than 6 years ago | (#18145494)

## Re:GPL-ed algorithm? (1)

## Hootenanny (966459) | more than 6 years ago | (#18145506)

## Re:GPL-ed algorithm? (1)

## reebmmm (939463) | more than 6 years ago | (#18145594)

IAAL, but I didn't RTFA, so I won't tell you whether it did actually meet those requirements.

## Did KD set this guy up for ridicule on purpose? (5, Informative)

## grimJester (890090) | more than 6 years ago | (#18145522)

- Programming Skills : Visual Basic ***Excellent***

Yes, that certainly is... excellent.

- Message : "Don't ever let school, stop you from learning!"

School could, learn you some grammar.

The algorithm is being released under the GPL ( General Public License ). The algorithm belongs to PhoenixBit and VirusFree but you may use/modify it freely. *** DO NOT COPY ANYTHING FROM THIS PAGE TO ANY OTHER PAGE. IF YOU WANT SOMEONE TO READ THIS THEN LINK TO THIS PAGE ***

In addition to trying to apply copyright to an algorithm, doesn't a restriction on copying defeat the purpose of releasing something under the GPL? Or does text in all caps trump previous text not in all caps?

Feel free to add to this. If there are some clips of this guy with lightsabers, please post them.

## kdawson is an idiot (0)

## Anonymous Coward | more than 6 years ago | (#18145668)

## Re:Did KD set this guy up for ridicule on purpose? (-1)

## quanticle (843097) | more than 6 years ago | (#18145672)

## Re:Did KD set this guy up for ridicule on purpose? (3, Informative)

## Brandybuck (704397) | more than 6 years ago | (#18145692)

## Re:Did KD set this guy up for ridicule on purpose? (2, Funny)

## remahl (698283) | more than 6 years ago | (#18145706)

## Re:Did KD set this guy up for ridicule on purpose? (2, Interesting)

## grimJester (890090) | more than 6 years ago | (#18145748)

## The algorithm belongs to PhoenixBit and VirusFree (0)

## Anonymous Coward | more than 6 years ago | (#18145536)

Huh? You can copyright the implementation you have (which is what allows you to put it under the GPL). That does nothing to control the algorithm; you would have to attempt to get a patent to do that.

## solution in search of a problem (4, Insightful)

## belmolis (702863) | more than 6 years ago | (#18145560)

This is a variant of RadixSort, which is well known to be faster than any comparison sort such as MergeSort. The problem with non-comparison sorts is that as such they are restricted to sorting items representable as unstructured bit fields, which means, essentially, integers. A large part of the time, the real problem in sorting is (a) extracting the fields that you want to use as keys (since it is not generally the case that you want to sort on the entire record) and (b) arranging for each pair of records to compare as you need them to, which involves both recognizing the internal structure of keys (consider the case of dates) and imposing suitable weights for the individual components. In other words, in many situations the bulk of the code and time are devoted to parsing and transformation of records. So long as you are not using a really bad algorithm, the time devoted to the sort itself is likely to be a small percentage of the total time.

For example, I have written a sort utility [billposer.org] that differs from most others in its ability to parse the input into records and records into fields and in the transformations it can apply so as to use unusual sort orders and sort on things like dates, month names, and numbers in non-Western number systems. It was originally written for alphabetizing dictionaries of "exotic" languages. It is frequently the case that the time devoted to the actual sort is less than 1% of the run time.

In sum, non-comparison sorts have a niche but are of limited utility because they get their speed from making use of additional information that is only available for a limited set of datatypes. For the great majority of applications, only comparison sorts are flexible enough to be of use.
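A tiny example of the flexibility being described: sorting on an extracted, structured key by comparing its components in priority order. The `Record` type and its field names are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// A record sorted by a parsed date key, most significant component
// first. The comparator encodes key structure and component weights
// directly -- the part of the job a raw bit-field sort cannot express
// until you pack the components yourself.
struct Record {
    std::string name;
    int year, month, day;
};

bool by_date(const Record &a, const Record &b) {
    if (a.year != b.year) return a.year < b.year;
    if (a.month != b.month) return a.month < b.month;
    return a.day < b.day;
}
```

With `std::sort(v.begin(), v.end(), by_date)` the records order by date regardless of how the raw record is laid out in memory.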

## Re:solution in search of a problem (0)

## Anonymous Coward | more than 6 years ago | (#18145826)

I do something very similar.

It's called a database.

## Re:solution in search of a problem (1)

## belmolis (702863) | more than 6 years ago | (#18145874)

No, it isn't a database. It doesn't support queries or allow for the insertion or modification of data. A true statement would be that the sort functions of database management and query programs are usually closer in flexibility to msort than are stand-alone sort utilities.

## Re:solution in search of a problem (1)

## Animats (122034) | more than 6 years ago | (#18145828)

While radix sorts are limited, binned sorts, which are a superset of radix sorts, are more powerful. Binned sorts do work on a broad set of data types, but you have to use statistical techniques to dynamically pick a good distribution among the bins. SyncSort [syncsort.com] developed this technique in the 1960s, and had the first software patent, now long-expired. Mainframe batch systems have used such techniques for decades.

The general idea is that if you're sorting, say, text strings, you watch them go by for a while, accumulating statistics, then compute a set of bin division points which will divide them into bins of roughly equal size. The bins have to be watched, and if some get too big or too small, the bin division points have to be adjusted. It's a tree-rebalancing problem.

If your sort is too big for RAM, and you have to go out to disk (or, in early systems, tape) this is a huge win. It lets you beat O(N log N), both in theory and in practice. SyncSort times grow at worse than O(N), but better than O(N log N).

Few people bother for in-memory sorts, though.
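An in-memory toy of the binned-sort mechanism might look like this (invented function name; a real external sort samples a prefix of the stream and rebalances bins on the fly, as described above):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Pick bin boundaries from a sorted sample so the bins come out roughly
// equal-sized, distribute, then sort each bin and concatenate. This toy
// uses the whole input as its "sample"; the interesting engineering in
// a real system is sampling a stream and adjusting the boundaries.
std::vector<int> binned_sort(const std::vector<int> &data, int nbins) {
    std::vector<int> sample(data);
    std::sort(sample.begin(), sample.end());
    std::vector<int> bounds;                   // nbins-1 division points
    for (int b = 1; b < nbins; ++b)
        bounds.push_back(sample[sample.size() * b / nbins]);

    std::vector<std::vector<int>> bins(nbins);
    for (int x : data) {                       // bin = #boundaries <= x
        std::size_t b = std::upper_bound(bounds.begin(), bounds.end(), x)
                        - bounds.begin();
        bins[b].push_back(x);
    }
    std::vector<int> out;
    for (std::vector<int> &bin : bins) {       // sort bins, concatenate
        std::sort(bin.begin(), bin.end());
        out.insert(out.end(), bin.begin(), bin.end());
    }
    return out;
}
```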

## I am sure (0)

## Anonymous Coward | more than 6 years ago | (#18145608)

## int main (0)

## Anonymous Coward | more than 6 years ago | (#18145656)

## Re:int main (2, Interesting)

## Srdjant (650988) | more than 6 years ago | (#18145822)

Messy and inconsistent indenting (check c/ and c++/ files for differences)

Difference between c/* and c++/* is that c++/ has cout instead of printf()... well done...

This stuff should go to worsethanfailure.com (was TheDailyWTF.com)

## This guy (1)

## ronnybrendel (1027526) | more than 6 years ago | (#18145664)

## His algorithm :::is::: optimal (1)

## shoemakc (448730) | more than 6 years ago | (#18145688)

## Shock horror, stop the presses (2, Informative)

## sholden (12227) | more than 6 years ago | (#18145764)

Also manages to not know about memset and write C code that assumes sizeof(long)==4 and puts non-inline code in header files. I don't dare look at the C++ code since there's a large probability it will cause permanent brain damage.

## Re:Shock horror, stop the presses (1)

## Srdjant (650988) | more than 6 years ago | (#18145856)

> puts non-inline code in header files. I don't dare look at the C++ code since there's a large probability it will cause permanent brain damage.

On 32-bit machines, sizeof(long) is 4, though I think on 64-bit machines this is 8 (but I'm not sure).

Please don't look at the C++ code..
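For the record: `long` is 4 bytes on ILP32 systems and on 64-bit Windows (LLP64), but 8 bytes on typical 64-bit Unix (LP64). Code that depends on an exact width, like the article's bit masking, can pin it down with the `<cstdint>` types (a small sketch):

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Unlike long, the fixed-width typedefs are exactly the stated width
// on every platform that provides them, so masks and shifts written
// against them behave identically everywhere.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32,
              "uint32_t is exactly 32 bits");
static_assert(sizeof(std::uint64_t) * CHAR_BIT == 64,
              "uint64_t is exactly 64 bits");
```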

## wtf (1)

## mpoloks (1062844) | more than 6 years ago | (#18145770)

sigh

## sh1t (-1, Redundant)

## Anonymous Coward | more than 6 years ago | (#18145774)

## Correctness (0)

## Anonymous Coward | more than 6 years ago | (#18145836)

Also, after translating your time functions for unix, your algorithm does not terminate unless compiler optimizations are turned on (gcc -o moo main.c -O3).

Barring the fact that floating point is broken, comparing your radix sort to another array-based sorting algorithm might be more productive. O(N log N) is the lower bound on non-parallel comparison sorting, and much of the battle is coming up with lower multiplicative constants, as you are trying to do. Again, your comparison

You state that you need a constant amount of memory, in your case this is sizeof(long) * 2 * 2^16 = 2^19 bytes, which is half a meg.

Testing the long version doesn't seem to get equivalently sorted lists either. For one thing you seem to be sorting in the opposite direction as mergesort.

Ignoring correctness, you only have a speedup of ~1.3 over qsort.

## Where's the proof? (2, Insightful)

## frostilicus2 (889524) | more than 6 years ago | (#18145890)

Where's the proof? Is it peer reviewed? Has it been published? Does it exist?

Although the lack of a proof does not make your algorithm incorrect, I'd be suspicious. In fact, I would be unwilling to use *any* algorithm in my code without being sure that a good proof of its correctness exists somewhere.

Proofs are more than formalities, they're essential: without a good proof, you're just a crank selling snake oil.