Null References, the Billion Dollar Mistake

timothy posted more than 5 years ago | from the these-are-just-rough-numbers dept.

Programming 612

jonr writes "'I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. In recent years, a number of program analysers like PREfix and PREfast in Microsoft have been used to check references, and give warnings if there is a risk they may be null. More recent programming languages like Spec# have introduced declarations for non-null references. This is the solution, which I rejected in 1965.' This is the abstract of Tony Hoare's presentation at QCon. I was raised on C-style programming languages and have always used null pointers/references, but I am having trouble grokking a null-reference-free language. Is there any good reading out there that explains this?"

612 comments

null or not null, that is the question (5, Interesting)

alain94040 (785132) | more than 5 years ago | (#27051389)

It's hard to imagine life without the null pointer! That being said, the author is not really responsible for billions of dollars of mistakes, the programmers are.

If there is one thing I'll complain about, it's the choice of the value 0. It's almost impossible to trace it. When we do hardware debug of chips, we prefer to use a much more visible value such as 0xdeadbeef for instance. Otherwise a bad pointer will blend in too much with all the uninitialized values out there.

In assembly, null has no particular meaning. If you dereference an address, you can do it in any range you like. It's just that 0 on most machines was not a good place to store anything, since it would typically be used to boot the OS or some other critical IO function that you don't want to mess with. Thus null was born.

Re:null or not null, that is the question (4, Insightful)

CTalkobt (81900) | more than 5 years ago | (#27051471)

When debugging at the hardware level it's fairly common to fill uninitialized memory (or newly allocated in a debug version of the malloc libraries) with a value that will either cause the computer to execute a system level break ( eg: TRAP / BRK etc) or something fairly obvious such as ($BA).

If you don't like the 0's, then replace your memory allocation library.
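
A minimal JavaScript sketch of the debug-fill idea discussed above, assuming a toy allocator over a typed array. The names `debugAlloc`, `POISON`, and `findUninitialized` are hypothetical, and a real debug malloc works on raw memory, not on typed arrays.

```javascript
// Hypothetical poison value; 0xDEADBEEF is the pattern mentioned in the thread.
const POISON = 0xdeadbeef;

// Toy "debug allocator": returns n 32-bit words pre-filled with the poison
// pattern instead of zeros, so untouched slots stand out in a dump.
function debugAlloc(n) {
  return new Uint32Array(n).fill(POISON);
}

// Report the indices that still hold the poison pattern, i.e. slots the
// program never wrote to.
function findUninitialized(words) {
  const untouched = [];
  words.forEach((w, i) => {
    if (w === POISON) untouched.push(i);
  });
  return untouched;
}

const block = debugAlloc(4);
block[0] = 0;  // a legitimate zero; under zero-fill it would look "unset"
block[2] = 42;
console.log(findUninitialized(block)); // reports indices 1 and 3
```

With zero-fill, slot 0 above could not be told apart from slots 1 and 3; with the poison pattern it can.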

Re:null or not null, that is the question (2, Informative)

LiquidCoooled (634315) | more than 5 years ago | (#27051705)

It's not the memory allocation library that is at fault.
It's the expectation of the app developer to instinctively do

if(!ptr){ ... }

You have to change the fundamental way the compiler works and alter boolean logic to account for existing code which works like this, to then accept 0xdeadbeef under some conditions and not others.

Re:null or not null, that is the question (1, Interesting)

Chrisq (894406) | more than 5 years ago | (#27051925)

Though if you were designing a language you could disallow the use of ptr as a boolean or numeric value, and force operations like

if (! valid ptr) {
}

Re:null or not null, that is the question (1)

RightSaidFred99 (874576) | more than 5 years ago | (#27051981)

Most decent modern languages have exactly this (well, "if (ptr != null)" is required and you can't cast a pointer to a boolean).

Re:null or not null, that is the question (2, Interesting)

fishbowl (7759) | more than 5 years ago | (#27051997)

The C specification already requires the compiler to deal with that, and it's been the case since K&R. No matter what the implementation defines as NULL, comparing or assigning 0 in a pointer context always works.

http://c-faq.com/null/ptrtest.html [c-faq.com]

Re:null or not null, that is the question (3, Interesting)

cant_get_a_good_nick (172131) | more than 5 years ago | (#27052067)

RE: malloc pattern initializer

What's a good one for x86 and AMD64 chips? While spelunking flags for valgrind, I remembered the thought process for 68k chips: use an A-line trap, which is unimplemented, so execution would stop. Also, make it odd, so a dereference would trigger a bus error.

What are the best values for x86 debugging?

Re:null or not null, that is the question (1)

gr8_phk (621180) | more than 5 years ago | (#27051701)

I'd like it if there was a "prefetch" instruction to fill cache, but that ignored references to address zero. This way you could prefetch all pointers unconditionally to increase performance. Compilers could then insert these prefetches automatically.

Re:null or not null, that is the question (0)

Anonymous Coward | more than 5 years ago | (#27052367)

Often, "prefetch" instructions will ignore invalid addresses, null included. Check your architecture manuals.

Re:null or not null, that is the question (3, Insightful)

jeremyp (130771) | more than 5 years ago | (#27051755)

That's all very well, but in a production environment when dereferencing a NULL pointer you'd probably rather have the program crash than carry on merrily with bad data. With a zero null value, you can easily arrange for this to happen by protecting the bottom page of memory from reads and writes. That way, even an assembly language program can't dereference a null pointer.

Re:null or not null, that is the question (0)

Anonymous Coward | more than 5 years ago | (#27051781)

It's hard to imagine life without the null pointer! That being said, the author is not really responsible for billions of dollars of mistakes, the programmers are.

While it's correct, things should be designed around stupidity.

If the software running my car were to fail at a critical moment, it would matter little to me whether the mistake was due to programmer error or not.

Re:null or not null, that is the question (1, Insightful)

tritonman (998572) | more than 5 years ago | (#27051827)

Please understand there is a difference between a null pointer and a null reference. The difference is very important in C++.

Re:null or not null, that is the question (1, Informative)

Chrisq (894406) | more than 5 years ago | (#27052005)

Please understand there is a difference between a null pointer and a null reference. The difference is very important in C++.

In C++ the difference is purely syntactical, and nothing that a * or a & won't fix.

Re:null or not null, that is the question (2, Funny)

Anonymous Coward | more than 5 years ago | (#27052047)

When we do hardware debug of chips, we prefer to use a much more visible value such as 0xdeadbeef for instance.

I've recently seen that one of our developers is using 0xfeedface 0xb00bf00d, which is nice and inventive.

Re:null or not null, that is the question (1)

rwrife (712064) | more than 5 years ago | (#27052115)

No, he is at fault for the mistake....at least that's what I'm going to put in our bug tracking system at work.

Re:null or not null, that is the question (0)

Anonymous Coward | more than 5 years ago | (#27052229)

It's hard to imagine life without the null pointer!

No, it is not. In fact you don't even have to imagine it, there are plenty of languages out there which do not allow null pointers. You are right that Hoare is not responsible for the financial loss, in a strictly legal sense, which is why he is not getting sued for it. Morally, however, he is responsible.

Re:null or not null, that is the question (1)

locofungus (179280) | more than 5 years ago | (#27052279)

If there is one thing I'll complain about, it's the choice of the value 0

The choice of the value 0 in the computer is by the compiler, not the C language.

In fact, on AS400 the null pointer is not all bits zero - which makes a mess of code that attempts to memset structures to zero and then tries to test for null pointers. A pointer with all bits zero is NOT null on that platform.

Note that if(!ptr) works fine as a test for null, just that:

memset(&ptr, 0, sizeof ptr); if(!ptr) printf("Ptr is null\n"); else { dereference ptr } will crash.

ptr=0; if(!ptr) ... is fine.

Tim.

20 second explanation (4, Interesting)

AKAImBatman (238306) | more than 5 years ago | (#27051411)

I am having trouble grokking a null-reference-free language.

If you're familiar with SQL, then a simple "MyColumn NOT NULL" definition should explain it. Basically, the value can never be set to a null value. Attempting to do so is an error condition itself.

In fact, DB design is a pretty good analogy for the concept as databases often are forced to wrestle with this issue.

Consider for a moment how you would design a database that has absolutely NO null references. Not a one. Zip, zero, nada. Obviously the best way of accomplishing such a database is to denormalize any value that might be null. So if Address2 is optional, you would want to split Address into its own table with a parent key pointing back to the user entry. If the user has an Address2 value, there will be a row. If the user does NOT have an Address2, the row will be missing. In that way, empty result sets take the place of null values.

In terms of programming languages, there are a variety of ways to map such a concept. Collections are a 1:1 mapping to result sets that can work. If you don't have any values in your collection, then you know that you don't have a value. Very easy. Similarly, you can be sure that none of the values passed to a function or method will ever contain a null value. Cases where you might want to pass some of the values but not all can be handled either by method overloading (e.g. Java) or by allowing a variable number of parameters (e.g. C).

Some pieces of programming would become slightly more difficult. For example, 'if(hashmap.get("myvalue") != null)' would not be a valid construct. You'd need to perform a check like this: 'if(hashmap.exists("myvalue"))'

Of course, the latter is the "correct" check anyway, so the theory goes that the software will be more robust and reliable.
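
The two checks can be contrasted in a short JavaScript sketch. `Map` stands in for the `hashmap` above, with its `has` method playing the role of the `exists` call in the comment; `lookup` is a hypothetical helper that returns a collection instead of a nullable value.

```javascript
const settings = new Map([
  ["myvalue", "42"],
  ["empty", null], // a key that is present but maps to null
]);

// Null-based check: cannot tell "missing key" from "key stored as null".
const viaNullCheck = (key) => settings.get(key) != null;

// Existence check: asks the question directly.
const viaExists = (key) => settings.has(key);

// Collection-as-result-set: zero or one element, never null.
function lookup(map, key) {
  return map.has(key) ? [map.get(key)] : [];
}

console.log(viaNullCheck("empty"), viaExists("empty")); // false true
for (const v of lookup(settings, "missing")) {
  console.log(v); // never runs: the empty array means "no value"
}
```

Iterating over the result of `lookup` needs no null check at all, which is the point of mapping result sets to collections.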

Re:20 second explanation (3, Insightful)

Anonymous Coward | more than 5 years ago | (#27051445)

doesn't NULL in SQL represent "unknown", which is something entirely different than a NULL reference, which in the context of programming languages is a discrete value?

Re:20 second explanation (4, Informative)

AKAImBatman (238306) | more than 5 years ago | (#27051767)

doesn't NULL in SQL represent "unknown", which is something entirely different than a NULL reference

No. NULL in SQL represents an absence of data. Which is occasionally used to cover for unknown values. However, NULL is a piece of data that says there is an absence of data. Which is incorrect. Absence of data means that it doesn't exist. Therefore, nothing should exist in its place.

Normalizing the database can create a situation where the NULL is unnecessary. Therefore, the concept is not needed by computer science. The problem is that real-world considerations often override the ivory tower of comp-sci. And one of those considerations was the fact that RDBMSes have traditionally been organized according to a fixed column model. The inflexibility of the model is driven by the on-disk data structures which are optimized for fast access. OODBMSes (which are really fancy RDBMSes with many "pure" relational features that work around the traditional weaknesses of RDBMSes) attempt to solve this issue by introducing concepts like table-less storage, columns that may or may not exist on a per-row basis, and a dynamic typing system that potentially allows for any data type to show up in a particular column. (Note that columns are often handled more as key-value pairs than what we normally think of as columns. This does not undo the theoretical foundation of the Relational model, only results in a different view on it.)

Re:20 second explanation (2, Interesting)

BigHungryJoe (737554) | more than 5 years ago | (#27052171)

Ok, I'm far from an expert on SQL, but if NULL doesn't represent "unknown" in SQL, then why does

select 1 from dual where 1 not in (2,3,NULL);

return an empty set?
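
A hedged sketch of why that query comes back empty, modelling SQL's three-valued logic in JavaScript. Here `null` stands both for a NULL operand and for the UNKNOWN truth value, and `sqlEquals`, `sqlNot`, `sqlAnd`, `notIn`, and `keepsRow` are hypothetical helpers, not any database's API. The sketch assumes standard SQL semantics: `x NOT IN (a, b, c)` expands to `x <> a AND x <> b AND x <> c`, a comparison with NULL yields UNKNOWN, and a WHERE clause keeps only rows whose predicate is TRUE.

```javascript
// Three truth values: true, false, and null (UNKNOWN).
function sqlEquals(a, b) {
  if (a === null || b === null) return null; // comparing with NULL is UNKNOWN
  return a === b;
}

function sqlNot(p) {
  return p === null ? null : !p;
}

function sqlAnd(p, q) {
  if (p === false || q === false) return false; // FALSE dominates
  if (p === null || q === null) return null;    // otherwise UNKNOWN spreads
  return true;
}

// x NOT IN (list) is the AND over every (x <> item).
function notIn(x, list) {
  return list.reduce((acc, item) => sqlAnd(acc, sqlNot(sqlEquals(x, item))), true);
}

// WHERE keeps a row only when the predicate is exactly TRUE.
const keepsRow = (p) => p === true;

console.log(notIn(1, [2, 3, null]));           // null (UNKNOWN), so no row
console.log(keepsRow(notIn(1, [2, 3, null]))); // false
console.log(keepsRow(notIn(1, [2, 3])));       // true
```

The single NULL in the list turns the whole predicate into UNKNOWN, so the row is filtered out even though 1 is clearly not 2 or 3.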

Re:20 second explanation (1)

Vellmont (569020) | more than 5 years ago | (#27051799)


doesn't NULL in SQL represent "unknown",

Sorta. From an operational perspective it represents an un-initialized state. If you don't write anything to a particular column, it's null. From a set-theory perspective it represents "nothing".

which is something entirely different than a NULL reference, which in the context of programming languages is a discrete value?

No. I'd say that NULL in a programming language is largely the same concept. Doesn't exist, nothing, etc. It's perhaps slightly more broad, since programming languages aren't just sets.

Re:20 second explanation (5, Informative)

MattRog (527508) | more than 5 years ago | (#27051511)

"Obviously the best way of accomplishing such a database is to denormalize any value that might be null"

That's normalizing -- the table in this example is de-normalized

Re:20 second explanation (1)

Sockatume (732728) | more than 5 years ago | (#27051591)

You lost me at "simple". Sorry. I'm afraid I don't grok what a null reference is to begin with, which may be an issue.

Re:20 second explanation (1)

pi_rules (123171) | more than 5 years ago | (#27051871)

Never thought I'd have to explain this on Slashdot of all places.

Let's see if this makes more sense:
String tmp = null;
if (tmp.length() > 0) /* <-- we blow up right here. */
{
    // Do something.
}
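
For readers more at home in JavaScript, roughly the same failure can be sketched as follows. `lengthOf` and `safeLength` are hypothetical helpers, not part of any library.

```javascript
// Dereferencing null: property access on null throws a TypeError at runtime.
function lengthOf(s) {
  return s.length; // throws if s is null or undefined
}

// Hypothetical guarded variant: the caller must decide what "no string" means.
function safeLength(s) {
  return s === null || s === undefined ? 0 : s.length;
}

let tmp = null;
try {
  lengthOf(tmp); // <-- we blow up right here
} catch (e) {
  console.log(e instanceof TypeError); // true
}
console.log(safeLength(tmp)); // 0
```

The guard only works if every caller remembers to use it, which is the weakness the story is about.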

Re:20 second explanation (4, Insightful)

AKAImBatman (238306) | more than 5 years ago | (#27051877)

Consider the situation of apples. If you have an apple, then something is in your possession. If you don't have an apple, what do you have? Do you have some sort of object that depicts your lack of an apple? Obviously not. Yet in the world of computers, we have this special piece of data that shows our lack of data. It's a bit like getting a certificate that you have no apples. The certificate accomplishes nothing except to fill a space that does not need to be filled.

Re:20 second explanation (0)

Anonymous Coward | more than 5 years ago | (#27052369)

Exceptional explanation! Mod parent up!

Re:20 second explanation (1)

morgan_greywolf (835522) | more than 5 years ago | (#27051949)

Sorry. I'm afraid I don't grok what a null reference is to begin with, which may be an issue.

A pointer in C/C++ contains a memory address where some data or code starts. For instance, there is really no string type in C. In C, a string is a pointer to the character where the string begins in memory. A value of 0 signals the end of the string.

A null pointer in C/C++ (or just about any other language with pointers) is a pointer which points to nothing, hence, null.

A null reference is what you get when you dereference a null pointer.

Re:20 second explanation (1)

Omnifarious (11933) | more than 5 years ago | (#27051771)

My problem is that null references are typically used to signal the ends of lists or the place where the tree ends.

I could see using a variant type for this. Instead of pointing to null, the next to the last list element would point to a value that had the type 'last list element' and no pointer inside it. And there would be four varieties of tree node, leaf, left filled, right filled and both filled.

Can you think of any better ways than that to handle the lack of a null reference when building data structures? That solution seems sort of ridiculously complex on non-OO languages, and a pain even for OO languages.
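
One way to picture the variant idea is a tagged list where the end is its own kind of node and carries no pointer at all. This is only a sketch in JavaScript with hypothetical names (`NIL`, `cons`, `listLength`, `toArray`); JavaScript cannot enforce the tags statically, which a language with real variant types would do at compile time.

```javascript
// Two variants: a "cons" cell with a value and a rest, and a "nil" terminator
// that has no fields to dereference.
const NIL = Object.freeze({ tag: "nil" });
const cons = (head, tail) => Object.freeze({ tag: "cons", head, tail });

// Every consumer dispatches on the tag; there is no null check anywhere.
function listLength(list) {
  let n = 0;
  for (let node = list; node.tag === "cons"; node = node.tail) n++;
  return n;
}

function toArray(list) {
  const out = [];
  for (let node = list; node.tag === "cons"; node = node.tail) out.push(node.head);
  return out;
}

const xs = cons(1, cons(2, cons(3, NIL)));
console.log(listLength(xs), toArray(xs)); // length 3, elements 1, 2, 3
console.log(listLength(NIL));             // 0
```

A tree would follow the same pattern with a leaf variant and a branch variant, rather than four node shapes.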

Re:20 second explanation (0)

AKAImBatman (238306) | more than 5 years ago | (#27052083)

My problem is that null references are typically used to signal the ends of lists or the place where the tree ends.

From the perspective of comp-sci, is that a correct solution? The answer is "no". An absence of data should simply be an absence of data, not a piece of data that represents the absence of data.

In other words, you need to imagine a language where the second reference is potentially non-existent. This is quite easy to represent in languages where maps are not closely related to 'null' values. Javascript comes to mind as an example:

var end = {};
var start = {next: end};

end.prev = start;

// A property that was never set reads as undefined, so no null is ever stored.
if(!start.prev) alert("The reference 'prev' does not exist at the start of the list.");
if(!end.next) alert("The reference 'next' does not exist at the end of the list.");

alert("There are "+count(start)+" items in the list.");

// Walk the 'next' references until one is missing.
function count(list)
{
  var n = 0;

  while(list)
  {
      n++;
      list = list.next;
  }

  return n;
}

That solution seems sort of ridiculously complex on non-OO languages, and a pain even for OO languages.

This issue is caused by the structure of modern languages, nearly all of which assume that "null" is a valid value.

Re:20 second explanation (1)

Omnifarious (11933) | more than 5 years ago | (#27052105)

Oh, you have a special 'null instance' of any data type. That's just dumb. As someone else pointed out, it's just as easy to forget to check for it as it is to forget to check for null. And then your program ends up in some strange unpredictable behavior instead of generating a nice obvious segmentation fault when the reference is de-referenced.

Re:20 second explanation (1)

sunking2 (521698) | more than 5 years ago | (#27051791)

Please don't try to explain the behavior of an actual language with SQL. It's demeaning.

Re:20 second explanation (1)

Mr Z (6791) | more than 5 years ago | (#27052263)

Ok, so you have a solution for databases. Now describe how you would implement common data structures such as linked lists and binary trees. Keep in mind that in those contexts, NULL is just a sentinel value, so converting NULL to "a magic copy of the structure" isn't really eliminating NULL, since the contents of the structure are still meaningless.

I imagine your solution ends up looking like Pascal's variant record, where you have a boolean tag that says "has next element" or "has child element", and a conditionally present field that holds that pointer. Hurray for wasting space.

Decrying NULL pointers is very much like railing against sentinel values. NULL is just a sentinel value of "reference" or "pointer" type.

There was a bigger mistake: (2, Insightful)

teknopurge (199509) | more than 5 years ago | (#27051429)

Null-terminated strings. The bane of modern computing.

Re:There was a bigger mistake: (3, Informative)

RetroGeek (206522) | more than 5 years ago | (#27051525)

A null terminated String is a misnomer. It is actually an array of chars that uses a special character to signify its upper boundary, so that a second variable is not needed to hold the upper boundary. Zero was chosen by K&R.

In some languages, a String is an object, and the object holds the upper boundary, so a terminator flag is not required.

Re:There was a bigger mistake: (0)

Anonymous Coward | more than 5 years ago | (#27051631)

In some languages (C++'s std::string), a string is an object, and the end of the string is marked by a 0 character. How is it a misnomer to call this a null-terminated string? Even C's char*, though not an object is still conceptually a string, and its end is marked by a null. A null-terminated string.

Re:There was a bigger mistake: (2, Informative)

Anonymous Coward | more than 5 years ago | (#27051785)

False*. In fact, you have to call c_str() to obtain a null-terminated string.
What happens inside is opaque and, taken with a grain of salt, most std::string implementations are probably of the Pascal kind (a memory allocation and a separate character counter).

*Depending on your std implementor.

Re:There was a bigger mistake: (1)

hobbit (5915) | more than 5 years ago | (#27051695)

A null terminated String is a misnomer.

True. It should be "NUL-terminated string".

But the use of the word "string" is correct:

5. A series [answers.com] of similar or related acts, events, or items arranged or falling in or as if in a line. See synonyms at series.

Re:There was a bigger mistake: (1)

morgan_greywolf (835522) | more than 5 years ago | (#27051573)

Auto-generated code documentation. Causes programmer laziness, with the result that things that should be documented often don't get documented.

Re:There was a bigger mistake: (5, Insightful)

Rik Sweeney (471717) | more than 5 years ago | (#27051579)

Null-terminated strings. The bane of modern computing.

Yeah! Let's abolish them, life would be much simplerasdjkaRGfl$!jaekrbFt6634i2u23Q0CCA;DMF ASDJFERR

Re:There was a bigger mistake: (4, Funny)

Anonymous Coward | more than 5 years ago | (#27051681)

I agree.ï½ï½ï½ï½ï½ï½ï½cï½ï½A
5ï½)ï½"ï½ï½ï½lï½3åï½ï½ï½SLï½4ï½54Vï½iï½ï½ï½D.O%N|ï½ï½ï½Tï½2nï½ì'iï½ï½ï½;ï½
                                                  ï½,ï½ï½(85ï½Iï½{ï½ï½ï½ï½)ï½Oï½Æ¼ï½%Cï½iwï½ï½ï½ï½ï½ï½I!,.ï½Õ'ï½ï½ï½ï½!ï½òfsQï½ï½zï½ï½Gï½ï½ï½aï½zï½-@ï½ yï½Ë+ï½ï½ï½Xï½ï½ï½ï½"ï½cï½âï½ï½ï½ï½ï½ï½ï½ï½ï½ï½dï½nbÕoeï½ï½ï½ï½lï½ï½ï½ï½ï½;hmï½ï½

Re:There was a bigger mistake: (1, Troll)

Anthony_Cargile (1336739) | more than 5 years ago | (#27051585)

Null-terminated strings. The bane of modern computing.

Maybe I'm feeding a troll, but what else would you terminate it with without using something the string may contain? Keep in mind that null-terminated strings were, err, "invented" around the time ASCII was really the only fully widespread character standard, and something was needed to mark the end of a string for detection by software.

The mistakes you speak of are made by programmers that don't know how to securely utilize this in certain environments. Mainly in buffers, but recall the lkml thread [kerneltrap.org] about the license macro in kernel modules being abused with '\0'.

Re:There was a bigger mistake: (1)

aspoon (794081) | more than 5 years ago | (#27051651)

I'm all for null-terminated strings, but just for the sake of giving a counter-example, look at Microsoft's implementation of BSTR. It basically starts the string with a string length, then the rest of the string. So technically speaking, you don't need the null since you already know where the string would end. (Correct me if I'm wrong... it's been quite a while since I last played with BSTR)

Re:There was a bigger mistake: (4, Informative)

Panaflex (13191) | more than 5 years ago | (#27051735)

Which comes from Pascal - which has always had the length at the beginning. Hence why pascal strings always had limits.
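
To make the two layouts concrete, here is a rough JavaScript sketch over byte arrays. It assumes ASCII-only text and a one-byte length prefix in the Pascal style; `toCString`, `toPascalString`, `readCString`, and `readPascalString` are hypothetical helper names.

```javascript
// NUL-terminated: payload bytes followed by a single 0 byte.
function toCString(text) {
  const bytes = new Uint8Array(text.length + 1); // zero-filled, so the last byte is NUL
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i);
  return bytes;
}

// Length-prefixed: one length byte, then the payload. The prefix width caps
// the length, here at 255.
function toPascalString(text) {
  if (text.length > 255) throw new RangeError("one-byte length prefix holds at most 255");
  const bytes = new Uint8Array(text.length + 1);
  bytes[0] = text.length;
  for (let i = 0; i < text.length; i++) bytes[i + 1] = text.charCodeAt(i);
  return bytes;
}

// Reading a C string means scanning for the terminator, and an embedded
// 0 byte silently truncates it.
function readCString(bytes) {
  let end = 0;
  while (end < bytes.length && bytes[end] !== 0) end++;
  return String.fromCharCode(...bytes.subarray(0, end));
}

// Reading a Pascal string trusts the prefix; no scan is needed.
function readPascalString(bytes) {
  return String.fromCharCode(...bytes.subarray(1, 1 + bytes[0]));
}

console.log(readCString(toCString("hello")));           // hello
console.log(readPascalString(toPascalString("hello"))); // hello
console.log(readCString(toCString("ab\0cd")));          // ab
```

The `RangeError` branch is the "limit" the comment refers to: a wider prefix raises the cap but costs more bytes per string.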

Re:There was a bigger mistake: (1)

Chrisq (894406) | more than 5 years ago | (#27052043)

Which comes from Pascal - which has always had the length at the beginning. Hence why pascal strings always had limits.

And originally from Cobol, where strings were fixed length (says he with 90% certainty)

Re:There was a bigger mistake: (1)

Bill, Shooter of Bul (629286) | more than 5 years ago | (#27052151)

Oh they did have limits, but you could still break them. With Turbo Pascal 7 you could read a line of input from a file into a fixed-length string that went over the limit you had set for the string, allowing you to write into areas of memory you weren't supposed to have access to.

Re:There was a bigger mistake: (1)

Exitar (809068) | more than 5 years ago | (#27052185)

But since you use C to write more optimized code, using one byte for the terminator uses less space than using N bytes to store the actual string length, unless you're fine with strings with a max length of 255.

Re:There was a bigger mistake: (1)

kLaNk (82409) | more than 5 years ago | (#27052091)

Minor nit for those not familiar with BSTRs: In a BSTR the length technically comes before the start of the string (pBSTR[0] actually points to the first character in the array).

Technically, having a BSTR be null terminated wouldn't be required if it weren't for the fact that the whole purpose of a BSTR was to allow these strings to be passed around to existing functions which only expected simple null terminated WCHAR* strings.

Re:There was a bigger mistake: (1)

morgan_greywolf (835522) | more than 5 years ago | (#27052239)

Why presume that one would care? Not all languages are like C. In Python and Java, I believe that the way strings are represented is left up to the language implementation. The main rule in Python is that strings are immutable -- so the storage requirements are fixed and known at runtime.

The mistake was actually not having a standard (4, Insightful)

Nicolas MONNET (4727) | more than 5 years ago | (#27051769)

for Pascal type strings in C. The fact that null-terminated strings existed wasn't the problem, they make some sense in some respects, such as when you want to pass text of arbitrary length. But the real problem, the real bug was not having a standard way of doing real strings in C. Everybody had to do it himself, poorly. Had there been a standard, no matter how poor, it would have been a starting point to do something better if needed, and would have been better anyway for many uses than C strings. It would have avoided MANY vulnerabilities from common software.

Re:The mistake was actually not having a standard (3, Interesting)

Vanders (110092) | more than 5 years ago | (#27052141)

The problem with Pascal strings is that it's easy for a short-sighted implementer to paint themselves into a corner. It's all very well and good to say "The first two bytes in a string are used to indicate the length of the string" but then what do you do a decade from now when a 16-bit length is laughably small? The benefit of NUL terminated strings is that their length is only limited by the memory available to you, and yet they are forward and backward compatible by decades.

How else would you terminate them? (1)

wiredog (43288) | more than 5 years ago | (#27051821)

In a low-level language like C or assembly, anyway? The only workable alternative I ever saw was to store the length in (or with) the string, which can be very wasteful of memory.

Re:How else would you terminate them? (1)

Talchas (954795) | more than 5 years ago | (#27051939)

On a 32-bit platform it adds three additional bytes to the string over a null terminator. If you have thousands of very short strings, it could be wasteful of memory. Most times you have much longer strings, and in even the case of say a 20 character string, 21 bytes vs 24 bytes is generally pretty insignificant.
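
The arithmetic behind that comparison, as a tiny sketch. It assumes one byte per character, a one-byte terminator, and a four-byte length field with no terminator; `nulTerminatedSize` and `lengthPrefixedSize` are hypothetical names.

```javascript
// Payload plus a single NUL byte.
const nulTerminatedSize = (chars) => chars + 1;

// Payload plus a length field; 4 bytes models a 32-bit length.
const lengthPrefixedSize = (chars, prefixBytes = 4) => chars + prefixBytes;

for (const n of [1, 20, 1000]) {
  // The difference is always 3 bytes with a 4-byte prefix.
  const extra = lengthPrefixedSize(n) - nulTerminatedSize(n);
  console.log(n, nulTerminatedSize(n), lengthPrefixedSize(n), extra);
}
```

For n = 20 this gives 21 bytes against 24 bytes, the figures quoted above. With a one-byte prefix the overhead matches the terminator, at the cost of a 255-character cap.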

Re:How else would you terminate them? (1)

camperdave (969942) | more than 5 years ago | (#27052359)

If you prefix the string with a byte containing the length of the string, you are no worse off than postfixing the string with a zero byte. Besides, memory is cheap. It has been for decades. Face* was right. C needed a string type, and since it wasn't there, people implemented it badly.

* Face is what I call people when I can't remember (or can't be bothered to look up) someone's name. It's a cross between "What's his face" and the A-Team character.

Re:How else would you terminate them? (1)

kLaNk (82409) | more than 5 years ago | (#27052147)

In a low-level language like C or assembly, anyway? The only workable alternative I ever saw was to store the length in (or with) the string, which can be very wasteful of memory.

How is storing the length wasteful of memory? 99% of the time I'd guess as much space would be used in storing a null character as would be consumed storing the length of the string itself.

Re:There was a bigger mistake: (1)

Hal_Porter (817932) | more than 5 years ago | (#27051975)

Have you seen that picture "\0 RLY"? It's an O RLY owl with no eyes or beak, just feathers.

Having truncated strings with zero bytes for various hacks, that really makes me laugh. Unfortunately Google image search doesn't let you search for "\0 RLY".

Re:There was a bigger mistake: (1)

cant_get_a_good_nick (172131) | more than 5 years ago | (#27052183)

PEDANT ALERT.

NULL is a special pointer value, which is 0 in source code, but may or may not be 0 in object code. The compiler sets it to whatever the ABI defines the special flag pointer to be. The size would be whatever a pointer size is on your platform.

NUL byte, a single byte of 0x00 in both source and object code. In C-style strings, it's a marker that terminates the string.

Not the same thing.

Null is just a value (1)

bytesex (112972) | more than 5 years ago | (#27051473)

Yeah, but wouldn't the first thing you'd do in the system API design of any non-null language be, the creation of a singleton object instance of the superclass of all objects, named 'null' ?

Also, apart from 'null' there are loads of parameters that can have illegal ranges and must be checked to be proper.

Thirdly, a similar rant can be had against non-range checking of enums in C (but then warning against it in switches (WTF?)).

Re:Null is NOT just a value (1, Interesting)

AKAImBatman (238306) | more than 5 years ago | (#27051583)

wouldn't the first thing you'd do in the system API design of any non-null language be, the creation of a singleton object instance of the superclass of all objects, named 'null' ?

Umm... no? The first thing done is usually a superclass called "Object". If you don't extend anything else, you extend Object. Depending on the language, the superclass of Object would either be self-referential or the option to obtain a superclass wouldn't exist. (The latter being the "correct" solution. See my next statement for why.)

Null is just a value

That's actually a problem. Null is a piece of data that represents the absence of data. The paradox here should be obvious. If the data doesn't exist, why do we create data about it not existing? If I have no apples, do I have an object that represents my lack of apples? No, I simply have no apples. At best, I might have a special container for apples. If it's empty, then I can infer that I have no apples. Just as a program can infer the absence of data through an empty collection.

Thirdly, a similar rant can be had against non-range checking of enums in C (but then warning against it in switches (WTF?)).

There's a lot of things wrong with C as a language. Don't try to use those as arguments. (Remember, C is more or less high-level assembly. On the scale of comp-sci it barely even rates. Its popularity stems from the excruciating slowness of computers in days gone by.)

Re:Null is just a value (2, Insightful)

Sneftel (15416) | more than 5 years ago | (#27051623)

Actually, if you were defining a "null" value, you'd make it a bottom type, meaning it would be a subclass of all other types. Otherwise you couldn't set an arbitrary reference to point to null, because null would be insufficiently derived.

Re:Null is just a value (1)

Garse Janacek (554329) | more than 5 years ago | (#27051641)

Yeah, but wouldn't the first thing you'd do in the system API design of any non-null language be, the creation of a singleton object instance of the superclass of all objects, named 'null' ?

No. That doesn't really make sense even in a lot of OO languages, anyway -- if my class Foo extends Object, and my function expects a Foo, then in a strongly-typed language you can't pass me an Object.

In languages where this would be possible, it would nonetheless be very evil to start with a language that is designed to guarantee the presence of a valid reference wherever one is expected, and then impose conventions that require runtime type checking substituted for null-checking every time we access any value.

Also, apart from 'null' there are loads of parameters that can have illegal ranges and must be checked to be proper.

Of course the claim isn't that removing null would avoid the need for all range checking, or eliminate all resulting errors. But I think a pretty good case can be made that null pointer/reference errors have historically been the majority of such errors -- and if not, certainly the plurality. Same answer for your C enum example -- they may be terrible and may cause a lot of errors, but I think null caused even more...

Re:Null is just a value (0)

Anonymous Coward | more than 5 years ago | (#27051691)

If that was the first thing you did, the second thing you'd do is wonder why your code doesn't compile. eg in Java

Object nul = new Object(); // doesn't work, Object is abstract, and even if it wasn't...
Integer foo = nul; // ... won't even compile due to the type mismatch

It would work in weakly-typed languages, but that would miss his point entirely (he was talking about static type checking)

Re:Null is just a value (0)

Anonymous Coward | more than 5 years ago | (#27052197)

If that was the first thing you did, the second thing you'd do is wonder why your code doesn't compile. eg in Java

Object nul = new Object(); // doesn't work, Object is abstract, and even if it wasn't...
Integer foo = nul; // ... won't even compile due to the type mismatch

It would work in weakly-typed languages, but that would miss his point entirely (he was talking about static type checking)

Just a fix... in Java:
Object nul = new Object(); // does work, Object is not abstract

Integer foo = nul; // ... won't even compile due to the type mismatch, but it is easy to get past the compiler by casting:

Integer foo = (Integer) nul; // compiles, but throws ClassCastException at runtime, because nul is a plain Object

Wouldn't help (5, Insightful)

corporate zombie (218482) | more than 5 years ago | (#27051503)

Fine. No null references. So I create the same thing by having a reference to some unique structure (probably named Null) and I still *fail to check for it*.

Null references don't kill programs. Programmers do.

    -CZ

Re:Wouldn't help (1)

Tridus (79566) | more than 5 years ago | (#27051787)

When the same mistake is repeated over, and over, and over, and over, and over again for decades, it's only natural to wonder if maybe letting it happen was itself a mistake.

I mean, if I design a road and one car crashes, it's probably the driver. If there's crashes every day for 15 years? Either every driver is bad, or something is wrong with the road design.

Re:Wouldn't help (1)

Chirs (87576) | more than 5 years ago | (#27052211)

Given that in the US there are approximately 40000 fatalities/year from car accidents, what conclusion do you draw?

Re:Wouldn't help (0)

Anonymous Coward | more than 5 years ago | (#27051809)

The difference is that you keep going instead of crashing. For example, iterating over a linked-list and failing to check for the Null sentinel will give you Node1, Node2, Node3, Null, Null, Null, Null, ...

So it's turning a segmentation fault into an infinite loop, thereby saving BILLIONS of dollars! ...maybe.
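
A minimal Java sketch of the behaviour described above, assuming a hypothetical `Node` class whose end-of-list marker is a self-referential sentinel object rather than `null`. The names `SentinelList`, `Node`, `NULL_NODE` and `firstNames` are made up for illustration. Walking past the end never faults; it just keeps yielding the sentinel, so the walk is capped at a fixed number of steps to keep the example finite.

```java
import java.util.ArrayList;
import java.util.List;

final class SentinelList {
    /** Hypothetical list node; the tail points at NULL_NODE instead of null. */
    static final class Node {
        final String name;
        Node next;

        Node(String name, Node next) {
            this.name = name;
            this.next = next;
        }
    }

    /** Sentinel "null object": its next link points back at itself. */
    static final Node NULL_NODE = new Node("Null", null);
    static {
        NULL_NODE.next = NULL_NODE;
    }

    /** Walks `steps` links from `head` WITHOUT checking for the sentinel. */
    static List<String> firstNames(Node head, int steps) {
        List<String> names = new ArrayList<>();
        Node current = head;
        for (int i = 0; i < steps; i++) {
            names.add(current.name);
            current = current.next; // never throws, even past the end of the list
        }
        return names;
    }

    public static void main(String[] args) {
        Node head = new Node("Node1", new Node("Node2", new Node("Node3", NULL_NODE)));
        System.out.println(firstNames(head, 6)); // prints [Node1, Node2, Node3, Null, Null, Null]
    }
}
```

Without the step cap, a `while (true)` walk over this structure would simply never terminate, which is the trade the comment is joking about.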

Re:Wouldn't help (4, Interesting)

nuttycom (1016165) | more than 5 years ago | (#27052019)

If you use a sane class for references that could possibly be null (like Option [scala-lang.org], aka Maybe in Haskell), then your compiler will *force* you to handle the null case.

This is where null went wrong, at least in statically typed languages: it's a hole in the type system that errors fall through into your program. When coding in Java, I make an explicit point to never return null from a method; if I have a situation where no reasonable return value might exist, I use the Option class from functionaljava.org [functionaljava.org] and thus force the client to handle the possibility of the method not returning sensible data. Since Option obeys the monad laws [blogspot.com], it's easy to chain together multiple things that might fail (with the bind or flatMap operations.)
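
As a rough sketch of that style, here is the same idea using `java.util.Optional`, which was added to the JDK in Java 8 and is used here as a stand-in for functionaljava's `Option` (an assumption, not the library the comment refers to). The class `OptionalLookup`, the lookup tables, and the methods `findUserId`, `findEmail` and `emailFor` are hypothetical names for illustration.

```java
import java.util.Map;
import java.util.Optional;

final class OptionalLookup {
    // Hypothetical in-memory data, standing in for any lookups that can fail.
    static final Map<String, Integer> USER_IDS = Map.of("alice", 1, "bob", 2);
    static final Map<Integer, String> EMAILS = Map.of(1, "alice@example.com");

    /** Returns the id for a login name, or an empty Optional instead of null. */
    static Optional<Integer> findUserId(String login) {
        return Optional.ofNullable(USER_IDS.get(login));
    }

    /** Returns the e-mail address for an id, or an empty Optional. */
    static Optional<String> findEmail(int userId) {
        return Optional.ofNullable(EMAILS.get(userId));
    }

    /** Chains two lookups that may each fail; flatMap skips the second if the first is empty. */
    static String emailFor(String login) {
        return findUserId(login)
                .flatMap(OptionalLookup::findEmail)
                .orElse("<no email on record>");
    }

    public static void main(String[] args) {
        System.out.println(emailFor("alice")); // both lookups succeed
        System.out.println(emailFor("bob"));   // id found, but no e-mail
        System.out.println(emailFor("carol")); // no such login
    }
}
```

The caller cannot reach the `String` without going through `orElse` (or a similar method), so the "nothing there" case has to be spelled out at the call site rather than discovered at runtime.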

Re:Wouldn't help (1)

Seakip18 (1106315) | more than 5 years ago | (#27052267)

Amen!

Sometimes, you need to see the program killed to realize you're doing it wrong.

I recently discovered a horrible lapse of auditing in our program where user_ids were set to "". To the program, that's a valid ID. Had the original programmers made judicious use of null, this error would have popped up during testing and someone would have realized, "hey, we're not initializing this value!"

Re:Wouldn't help (1)

ACMENEWSLLC (940904) | more than 5 years ago | (#27052345)

It's when I see stories like this that I am glad I program mostly on an i5 with RPGILE. While NULL is possible, it is usually only seen when interfacing with a PC application or when you bind to C/C++ programs or APIs. When I declare @variable I will inz() it, usually with *blanks or *hival (xFF). If I don't, it could hold whatever was in memory before, but it still isn't null.

Algebraic data types (4, Informative)

Sneftel (15416) | more than 5 years ago | (#27051603)

The concept of "no null references" would be very limiting in a language without algebraic datatypes [wikipedia.org]. You can think of null references as a sort of teeny limited braindead algebraic data type, actually. I get the feeling that much of the incredulity here stems from the posters not being familiar with languages that support them. If this describes you, check out Haskell and OCaml! They're the sort of languages that make you a better programmer no matter what language you're using.

Re:Algebraic data types (0)

Anonymous Coward | more than 5 years ago | (#27052057)

Because forcing inherently procedural/algorithmic code into a functional paradigm makes for readable code, AMIRITE? Stop with the "functional languages are a panacea" bullshit already.

Re:Algebraic data types (1)

Chrisq (894406) | more than 5 years ago | (#27052101)

The concept of "no null references" would be very limiting in a language without algebraic datatypes [wikipedia.org].

Not necessarily. You could mandate default constructors that would be invoked every time that an unreferenced object occurred, so Strings unless explicitly initialised would refer to "", user types to whatever the default constructor produced, and so on.

Arrogance? (1, Informative)

Anonymous Coward | more than 5 years ago | (#27051647)

Is it not a tad arrogant to claim that he single-handedly is responsible for a billion dollars in mistakes? First, as an earlier poster remarked, the programmers themselves, or at least the current business context of programming, are perhaps more responsible. Second, while I'm aware that Tony Hoare is largely responsible for defining Algol -- though it was a committee effort -- it may just be on the edge of possibility that a high performance language such as C would have still included null references even if Algol did not. And Hoare's reference to Microsoft's work is nothing but PR; MSR is his current employer. Plenty of other earlier efforts have addressed null pointer dereferences; and of course certain classes of languages avoid the problem entirely.

Re:Arrogance? (1)

Chrisq (894406) | more than 5 years ago | (#27052133)

Not really, I will certainly print out his admission and have it ready for every project I work on in future ;-)

Re:Arrogance? (0)

Anonymous Coward | more than 5 years ago | (#27052281)

I am more inclined to feel lenient towards a claim that someone is responsible for a billion dollar loss, than if they claim to have produced a billion dollar benefit.

Pass by reference (4, Informative)

hobbit (5915) | more than 5 years ago | (#27051851)

I'm raised on C-style programming languages, and have always used null pointers/references, but I am having trouble of grokking null-reference free language.

Take a look at C++, in which you can declare methods to be "pass by reference" rather than "pass by pointer". Although the former is actually really just passing a pointer too, the semantics of the construct make it impossible to pass NULL.

Re:Pass by reference (2, Informative)

johannesg (664142) | more than 5 years ago | (#27052149)

... the semantics of the construct make it impossible to pass NULL.

#include <cstddef>  // for NULL

void bar (int &intref)
{
    intref++;
}

void foo ()
{
    int *intptr = NULL;

    bar (*intptr);  // compiles, but dereferencing a null pointer is undefined behavior
} // learn something new every day!

Re:Pass by reference (0)

Anonymous Coward | more than 5 years ago | (#27052167)

the semantics of the construct make it impossible to pass NULL.

Emphasis mine. Wrong, actually.

/* g++ -W -Wall -Werror -pedantic wrong.cpp */
#include <iostream>
void possible (int& ref) { std::cout << &ref << std::endl; }
int main (void) {
  possible (*(int*)NULL);
  return 0;
}

Note: Bad things happen if you try to read or write the value of the reference when you pass NULL. The example above typically doesn't crash because it only outputs the address of the reference, although binding a reference to a dereferenced null pointer is still undefined behavior.

Should have been patented! (1)

fprintf (82740) | more than 5 years ago | (#27051875)

The concept of the null value should have been patented. If so, it would have validated that patents in software can be a good thing by stopping the destructive spread of bad ideas in the same way they stop the spread of good ones.

Either that, or whoever invented the concept would be far richer than Bill Gates, Larry Ellison, and Steve Ballmer combined!

K&R's null-terminated string in C (1)

peter303 (12292) | more than 5 years ago | (#27051945)

They should be shot for that one :-) This has led to so many costly buffer-overflow virus attacks. Early languages like FORTRAN and COBOL had safer strings, but not as elegant as C's. You had to pre-declare string storage size in early compilers.

Re:K&R's null-terminated string in C (0)

Anonymous Coward | more than 5 years ago | (#27052081)

Uh. In C you also pre-declare string storage size. You have a problem only if you use gets instead of fgets, sprintf instead of snprintf, or strcpy(dest, src) instead of strncpy(dest, src, n); dest[n-1] = '\0', etc.

It's not that NULL pointers are a problem (1, Informative)

wiredog (43288) | more than 5 years ago | (#27051965)

It's uninitialized pointers (and, for that matter, other variables) that are the problem. At least in assembly and C/C++. I don't think I ever had cause to use pointers in Perl or Python. Or C#. Null pointers or zero values in other variables are easy to test for anyway. It's the uninitialized variables that bite you in the ass.

Re:It's not that NULL pointers are a problem (1)

Abcd1234 (188840) | more than 5 years ago | (#27052085)

I don't think I ever had cause to use pointers in Perl or Python. Or C#.

Umm... what? Every single one of those languages has the concept of a pointer/reference that is virtually inescapable, and every one has a concept of undef/nil/null. Or have you never used a class in Perl (which is just a blessed reference), or a non-value-type in C# (which is stored and passed as a reference to the actual object)?

Honestly, do you even know what a pointer is, conceptually??

Trouble is that even if you remove NULL-refs (1)

Kjella (173770) | more than 5 years ago | (#27051969)

You'll just have developers replace it like:

$foo = NULL;
getRef( $foo );
if ( $foo != NULL ) {
        doSomething( $foo );
}

with

$foo = "dummy";
getRef( $foo );
if ( $foo != "dummy" ) {
        doSomething( $foo );
}

Basically, you can write any null code as non-null code, just like you can hammer a square peg into a round hole. All you'd have is that instead of missed null checks you'd have missed dummy checks, and it'd be even less sane and understandable. Compared to every other way of enforcing error flagging, null references are the KISS solution. Though I much prefer the object-oriented way, where "isNull()" is the opposite of "isValid()" and the object tries to behave "nicely" (usually by doing nothing and returning error values) when something calls it, instead of killing the application.

An was an even Bigger mistake: (5, Funny)

Wargames (91725) | more than 5 years ago | (#27052017)

Zero. The bane of all. It was the gateway math to all modern problems. It would be so much simpler with just countables. Surely the current crisis, measured in trillions would look so much better without all those zeros.
Whoever it was who invented zero should take responsibility for all the world's problems, ex nihilo.

Re:An was an even Bigger mistake: (2, Informative)

AKAImBatman (238306) | more than 5 years ago | (#27052295)

Null predates zero in the western world. The Romans had no number for zero, but they did represent the concept of nothing with the word 'nulla'. Thus if I had IIII denarii and spent all IIII, I would have nulla remaining. i.e. "nothing".

As an aside, the numbering is correct. The subtractive form of IV for four is a more modern construct that was not in common use during the Roman empire.

If you're still hell-bent on finding who defined zero as a legitimate numerical value, you'd need to look to 9th century India. Their mathematics had evolved far enough to where they stopped asking the philosophical question of "is nothing a number?" and simply used it to get math completed.

Re:An was an even Bigger mistake: (1)

JustNiz (692889) | more than 5 years ago | (#27052311)

The ancient Babylonians, Mayans and Hindus all independently invented/discovered the concept of 0. Seriously.

Null as a concept (5, Interesting)

JustNiz (692889) | more than 5 years ago | (#27052193)

Stroustrup's "C++ Programming Language" book introduces a concept called "resource acquisition is initialisation" that was eye-opening enough to me that it forever changed the way I think about code, and also seems relevant to your point.

The basic idea is that an object is always meant to represent something tangible. As an example, consider the design of a file object that abstracts file I/O operations. As a developer, I've come across this one several times. It is normal for such objects to have open and close methods, but that puts the design in contradiction with Stroustrup's concept: because open/close are provided as methods rather than being called only in the constructor/destructor, the object may exist in a state where it is not associated with an open file. You basically have to grok that having a file object around that doesn't directly map to an open file just adds overhead to the system, and is bad OO design in the sense that such an object is meaningless.
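
Stroustrup's idiom relies on C++ constructors and destructors. As a rough Java approximation only, assuming a hypothetical `OpenFile` wrapper: the constructor opens the file, so no instance is ever created without one, and try-with-resources stands in for the destructor. Unlike the C++ original, Java cannot stop code from holding on to the object after `close()`, so this sketch covers only the "no unopened file object" half of the idea.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical wrapper: there is no open() method, so no "not yet opened" state. */
final class OpenFile implements AutoCloseable {
    private final BufferedReader reader;

    /** Acquisition is initialisation: construction fails if the file cannot be opened. */
    OpenFile(Path path) throws IOException {
        this.reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
    }

    /** Counts the lines remaining in the file. */
    long countLines() throws IOException {
        long count = 0;
        while (reader.readLine() != null) {
            count++;
        }
        return count;
    }

    /** Release happens here; try-with-resources calls it automatically. */
    @Override
    public void close() throws IOException {
        reader.close();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("raii-demo", ".txt");
        Files.write(tmp, "one\ntwo\nthree\n".getBytes(StandardCharsets.UTF_8));
        try (OpenFile file = new OpenFile(tmp)) {
            System.out.println(file.countLines()); // prints 3
        } finally {
            Files.delete(tmp);
        }
    }
}
```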

Apply the same concept to a reference and you have your answer. If a reference is pointing at nothing, then what is its purpose? The only thing a NULL reference is good for is when the software design ascribes a special meaning to the value NULL. Instead of just meaning address location 0, it gets subverted to mean "variable unassigned" or the "tail node of list" or somesuch. Ascribing multiple meanings to a variable value (especially pointers/references that are only ever meant to hold memory addresses) is one example of bad programming practice known as programming by side-effect which most people agree should be avoided.

Another point is that in most OO languages, references have the extra benefit of being more strongly typed than pointers, meaning that a reference is guaranteed to only ever point at an instantiated object of its specific type. That guarantee also gets broken when a reference can be NULL.

"reference to nothing" is natural (1)

Todd Knarr (15451) | more than 5 years ago | (#27052195)

The reason it's hard to grok null-reference-free languages is because "a reference to nothing" is a natural concept. For instance, you want to find an object in a list. What's the result when the object you want isn't in the list? A language that can't express that concept leaves the programmer scratching their head.

The problem I run into's usually two-fold. First, programmers who don't really think about the failure case. They go looking for something, and skip the check for whether they found it. Sometimes it's just that they're lazy, sometimes it's that handling that case will be really hard, and sometimes it's because they've been told what they're looking for has to always exist so the operation can't fail. Second, compilers often treat null references/pointers as valid. Combined with the "initialize everything, always" coding style it yields nasty failures. The compiler doesn't gripe about using uninitialized variables because the variable was initialized, and neither the compiler nor the run-time gripe about using a null reference/pointer because it's considered valid. Solving those problems doesn't involve eliminating the null reference, though.
