severoon's Journal: Big-O Notation
In Indirect Thinking, I used an order-of-magnitude check as an example of an indirect way to verify a calculation. Is 24*29 equal to 14,984,334? Most people won't have to do the multiplication to know that the suggested result is way too large (24*29 is roughly 20*30, or about 600, nowhere near the millions)...they have performed an order-of-magnitude check to make this determination.
In computer science, we are taught to think in terms of order of magnitude when judging the performance of algorithms. The result of that analysis is often recorded using what is called Big-O notation, in which the total time the algorithm takes to run is related to some independent variable of the problem. For instance, suppose you have a list of names (a telephone book, for example) and you want to know whether the name Gummercinda Lipschitz is in that list. How long does your algorithm take?
Well, there are a number of algorithms that can be applied here. The simplest and most straightforward is to go through the list from beginning to end, comparing each name as you go. The variable in this problem is the number of elements that happen to be in the list--note that this value is independent of the algorithm itself...the algorithm will execute over a list of any size. Of course, the longer the list, the longer it takes.
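To make this concrete, here is a rough sketch of that straight-through scan in Python (the function and variable names are purely illustrative, not from any particular library):

    # Straight-through search: compare each name in turn until a match is
    # found or the list is exhausted. In the worst case every element is
    # checked, so the work grows in step with the list's length.
    def name_in_list(names, target):
        for name in names:
            if name == target:
                return True
        return False

    phone_book = ["Aaron Aardvark", "Betty Botter", "Zelda Zonk"]
    print(name_in_list(phone_book, "Gummercinda Lipschitz"))  # False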
When doing an order-of-magnitude analysis, we say the time T for this algorithm to run will be directly proportional to the number of elements n in the list: T=a*n, where a is some scaling factor. The actual value of this scaling factor depends on all sorts of things: what kind of computer is running the algorithm, how many other processes are running simultaneously, how much memory is available to the algorithm, etc., etc. This is a very convenient way of mathematically representing our analysis because it binds together all of these unknowns into the one little variable a, so we can focus simply on the interaction between the number of elements and the performance of the algorithm. Certainly, all of that other stuff is important in terms of actually running the algorithm and getting an actual time...still, there is value in understanding the fundamental relationship between the other two variables.
So, using Big-O notation, instead of the above formula we obscure the proportionality constant altogether...this focuses the mind on the relationship that's under study: O(n)=n, we would write. For the sake of comparison, let's consider an algorithm that tells us whether or not the first character of an input string is a capital letter (it simply returns true or false). If n, in this case, is the number of characters in the input string, we can see that the algorithm need only consider the first character. Hence, no matter how long the input string, the algorithm will run in the same amount of time, as it will simply ignore whatever comes after the first character. Represented in Big-O notation, this would be: O(n)=1. Expanded into standard mathematical notation, we would write T=a*1, or simply T=a...all of these represent the same relationship. This algorithm is said to run in "constant time" with respect to the length of the input string because, for a particular machine with some fixed amount of memory, other processes running, etc., it will always return a result in the same constant amount of time regardless of the length of that input string.
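A rough sketch of that constant-time check might look like this (again, the names are just illustrative):

    # Only the first character is ever examined, so the amount of work is
    # the same whether the input is one character long or a million.
    def starts_with_capital(s):
        return len(s) > 0 and s[0].isupper()

    print(starts_with_capital("Gummercinda"))           # True
    print(starts_with_capital("gummercinda" * 100000))  # False, same amount of work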
In actuality, there is one more notational contraction applied in Big-O notation...the formula on the right-hand side is usually represented directly in the O() function instead of explicitly listing the independent variable there--it is understood by convention that n is the independent variable. So O(n)=n and O(n)=1 become simply O(n) and O(1), respectively. Also, it is sometimes unnecessary to roll all scaling factors into the unseen proportionality constant. For instance, if my name search algorithm were designed only to tell me whether the name was present in the first third of the list, it would not be wrong to represent this algorithm's "order" by writing O(n/3). This 1/3 factor represents a fundamental part of the problem under consideration because it's related directly to the interaction between the performance of the algorithm and the independent variable, so it's not wrong to include it. On the other hand, for most purposes for which Big-O analyses are done, this factor would be of little consequence, so it would be excluded anyway.
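For what it's worth, that "first third only" variant could be sketched as follows (purely illustrative); the work still grows linearly with n, just with a 1/3 factor attached:

    # Search only the first third of the list; still a linear scan,
    # but over n/3 elements rather than n.
    def name_in_first_third(names, target):
        return target in names[: len(names) // 3]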
In the previous example of the name list, we might be able to use a different algorithm based on additional knowledge we have. For example, let's say the list of names accessible to our algorithm has certain features: it is always sorted alphabetically, and we can always find out before we begin the search how many total elements it contains. In this case, we can use the divide-and-conquer technique. We skip to the middle of the list and see whether Gummercinda's name falls in the first half or the second, then we recursively repeat this process on the half that might contain the name until we get to the position it would alphabetically occupy. The running time of this approach does not vary directly with the number of elements in the list; rather, it varies directly with the logarithm of that number. (This is because each time I double the size of the list, only one more iteration is required for my algorithm.) This equates to O(log n).
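Here is a sketch of that divide-and-conquer search on a sorted list (an iterative rendering of the recursive idea described above; names are illustrative):

    # Halve the remaining range at each step; the number of steps grows
    # with the base-2 logarithm of the list's length.
    def name_in_sorted_list(names, target):
        lo, hi = 0, len(names) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if names[mid] == target:
                return True
            elif names[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return False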
So, what is the point of all this? Big-O is actually a very good way of distilling order-of-magnitude calculations into a reductionist form...everything but that which is absolutely necessary is rolled into an unseen constant. So, the basic character of two algorithms can be compared simply by comparing the graphs of their Big-O expressions. Even more complex information can be teased from these...for instance, if I have two algorithms that solve a particular problem, one associated with O(n^10) and the other O(2^n), I can tell right away that the former will, for large n, be more performant. But I might be working with a situation where n will never exceed a hundred, and I might wish to know for what n the second algorithm will begin to run slower than the first. With a bit of calculating, it's easy to see that the second overtakes the first in terms of running time around n=60.
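If you don't feel like doing the algebra, the crossover point can be checked with a few lines of brute force (a quick sketch, nothing more):

    # Find the first n (beyond the trivial small cases) at which 2^n passes n^10.
    n = 2
    while n ** 10 >= 2 ** n:
        n += 1
    print(n)  # prints 59, which squares with the "around n=60" estimate above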
This shows that any one of the quick calculating methods introduced in the previous essay Indirect Thinking can be developed further. It's true that the more developed version may not be as quick to check off the cuff, but it can yield a perspective and information about the problem that might otherwise lie undiscovered altogether.