## Wednesday, October 18, 2006

### Math.abs Returns a Negative Number

Math.abs (Math.Abs in C# :-) is one of those methods that sould be pretty simple, right? Just take the value, if it's negative, return a positive version, duh! Well, it's not that simple.

There are 232 possible values of int, each of which is positive, negative or zero. There's only one integer that is zero, namely 0x00000000. That means that either there is one bit pattern of ints that doesn't represent an integer, or there is not a 1-1 mapping between positive and negative numbers. It turns out, the second is the case. The odd number is int.MinValue, which is equal to 0x80000000 in binary.

What happens when you negate this number? In two's complement -x=~x+1. ~x = 0x7fffffff. ~x+1=0x80000000. That means -int.MinValue == int.MinValue. Uh oh!

Enter Math.abs. What's the function to do when passed int.MinValue. Well, in C#, Math.Abs throws an OverflowException. Java, on the other hand, happily returns int.MinValue. Either way, the caller of the function probably wasn't expecting this.

I discovered this oddity when reading Effective Java (kindly provided to all Google Engineers). The book talks about the code: Math.abs (new Random ().nextInt ()) % n as a way to generate random numbers between 0 and n. This code usually works, except when Random happens to return int.MinValue. In this case, the abs returns a negative value, causing the modulus operator to return a negative number. Eeeek!

At this point, I realized that for the data structures course I TA, many students used a similar piece of code in a hashtable they wrote. I wrote a quick script to check how many people screwed up on int.MinValue. Most students did.

I wondered how much this occurred in production code. Luckily, I had one of the best test cases at my fingertips. Google has an internal "grep" tool (the inspiration for the new Code Search) utility. I made a quick regex to find where this occurred. There were many instances.

Now that Google has released an external version of this tool, you can see some of the places this anti-pattern is used in the real world:

Each of these snippits is a time bomb. One in 232 executions, the function will do something strange.

Having found this issue in Google's source code, I emailed somebody inside Google who had been running FindBugs internally. He in turn talked to Bill Pugh, who added a detector. I added a test case to the test suite for the homework assignment at CMU, so students will have to handle this case. All in all, a very obscure bug

So, what should you do rather than Math.abs? I've seen primarially two buggy patterns:

• Hashtables People want to find "which bucket should I put an object with hashcode x in". The best way to do this is (o.GetHashCode () & 0x7fffffff) % table.Length. This has the advantage of being faster than Math.abs (no branches).
• Random Numbers People want to say Math.Abs (new Random ().Next ()) % N. Not only is this buggy in the rare case, it can also cause a bad distribution of numbers for large N. Use the Next (int, int) method which allows you to specify bounds

Anonymous said...

Awesome. I wonder what other langs/libs do.

Anonymous said...

Why didn't people just remember that the range for n-bit signed integers is really -2^(n-1) to 2^(n-1)-1?

erik said...

To sun's credit, they have documented this anomaly in the documentation for Math.abs():

if the argument is equal to the value of Integer.MIN_VALUE, the most negative representable int value, the result is that same value, which is negative.

I still wouldn't expect a negative number though :-)

Viraptor said...

Libc does the same for INT_MIN. So most c/c++ projects do the same, but you shouldn't really expect any abs(rand()) code there.
It will happen less often, but still - I don't know of anyone who thought of this up to now.
Now... they're panicing ;)

Axel Liljencrantz said...

'abs(rand())%n' is broken, just as you say. However 'abs(rand()%n)' isn't, and the bit about the lower bits of output from random number generators being less random than the high bits are bogus on all modern systems. rand() implementations old enough to have such issues are so fundamentally broken that trying to 'fix' things by using high order bits won't really help either.

RichB said...

What happens on a 64bit CLR, or 64bit JVM? Do 32bit CPU instructions get used, or do 64bit instructions get used and the result truncated like with Java/strictfp?

Ben Maurer said...

'abs(rand()%n)' is still broken. Let's say n is MaxInt*2/3. Then 2/3rds of random numbers will be < 1/3, and 1/3rd will be > 1/3. Clearly broken

Anonymous said...

public static int abs(int a) {
return (a < 0) ? -a : a;
}

(0 - Integer.MIN_VALUE) equals Integer.MIN_VALUE

The anomaly here is that in stead of throwing an exception, the JVM lets positive numbers overflow into negative and vice-verca. But everybody knows that :-)

mauhiz said...

a solution could be to have Math.abs(int) return a long

Tim said...

Good Job! :)

Anonymous said...