I hate integers!

Did I catch your attention? I guess I did if you are reading this. Just to be clear, I don’t hate integers; I hate using integers (a.k.a. signed numbers) where they don’t make sense. For example, take the standard intrinsic routine Length (in the System unit). It returns the length of an array (or string) as a signed 32-bit integer. While I doubt anyone will use 2 gigs of memory for an array, this still breaks the prettiness of the code. Almost all routines that should take an unsigned integer still require a signed one. And this is not limited to Delphi: most of the .NET libraries use “int” where they should use “uint” (I’m not even going to mention Java here! Last time I checked there was no concept of “unsigned” in it).

Anyway, being as I am, I always try to use the Cardinal type when I do not need negative values, for instance when iterating in a FOR loop from 0 (zero) to the length of an array. This doesn’t benefit me in any way, aside from personal satisfaction. It even tends to “bite me in the ass” sometimes. Let’s take this example:

function MakeString(const AChar: Char; 
  const ACount: Cardinal): String;
var
  I: Cardinal;
begin
  Result := '';

  for I := 0 to ACount - 1 do
    Result := Result + AChar;
end;

begin
  WriteLn('Str1 = ', MakeString('A', 2));
  WriteLn('Str2 = ', MakeString('B', 0));

  ReadLn;
end.

What will this program write on the console? I bet you thought it would be “Str1 = AA” and “Str2 = ”. Think again, but this time harder.

Now let me explain what is really going on:

  1. I pass 0 as the ACount parameter to the MakeString function.
  2. I use a Cardinal data type for my FOR loop.
  3. I start at 0 and go to ACount – 1.
  4. ACount being 0, the upper bound of the FOR loop becomes (0 – 1), which wraps around to 4294967295 for a Cardinal.
  5. The FOR loop continues for a lot of iterations.
  6. You will either run out of memory or get bored waiting for a result…
  7. If I had been declared as an Integer, the loop would have worked flawlessly.
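
One way to keep the Cardinal types and still survive a zero count is to count from 1 instead of 0, so that no subtraction happens at all. The program below is only a sketch of that idea, not the one true fix; the program name SafeMakeString and the FPC mode switch are my own additions so that it also builds with Free Pascal.

```pascal
program SafeMakeString;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // assumption: lets Free Pascal accept Delphi syntax
{$APPTYPE CONSOLE}

function MakeString(const AChar: Char; const ACount: Cardinal): string;
var
  I: Cardinal;
begin
  Result := '';

  // Counting from 1 to ACount avoids the "ACount - 1" subtraction,
  // so a count of 0 gives the empty range 1..0 and the body never runs.
  for I := 1 to ACount do
    Result := Result + AChar;
end;

begin
  WriteLn('Str1 = ', MakeString('A', 2)); // Str1 = AA
  WriteLn('Str2 = ', MakeString('B', 0)); // Str2 = (empty string)
end.
```

The loop counter is never used as an index here, so shifting the range costs nothing. When the counter is an index into a zero-based array, the same trick needs a Pred(I) at every use, which is a trade-off of its own.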

“I hate integers …”

About the Author: Alexandru Ciobanu

16 Comments

  1. I think you should hate cardinals, and compilers that don’t check the bounds of integer types.

    Cardinal (and uint) were introduced when 16-bit integers were too short for some common uses. In a 32-bit (and 64-bit) world, there is no reason to take on all this trouble for 1 more bit.

    I exclusively use signed ints, and implement checks via asserts, “if x<0 then”, or x=max(0,x) where I need to make sure ints are non-negative.

  2. Unsigned integers only make sense in two cases (count’em):

    1) when you don’t have a choice, and I mean really, really don’t have a choice
    2) when you want to test if a signed integer is in a signed range with a single comparison

    Both cases boil down to optimization.

  3. Maybe you can use something like “type TPosInt = 0..maxint;” to have range checking without introducing unexpected problems.
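
    A minimal sketch of how that could look, assuming Delphi (or Free Pascal in Delphi mode) with range checking switched on. The program name PosIntDemo is made up for illustration, and MakeString is borrowed from the post:

    ```pascal
    program PosIntDemo;

    {$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // assumption: lets Free Pascal accept Delphi syntax
    {$APPTYPE CONSOLE}
    {$R+} // range checking on

    type
      // Non-negative values only, but still based on the signed Integer type.
      TPosInt = 0..MaxInt;

    function MakeString(const AChar: Char; const ACount: TPosInt): string;
    var
      I: Integer;
    begin
      Result := '';

      // ACount - 1 is computed as a signed Integer, so for ACount = 0
      // the upper bound is -1 and the loop body is skipped.
      for I := 0 to ACount - 1 do
        Result := Result + AChar;
    end;

    begin
      WriteLn('Str1 = ', MakeString('A', 2)); // Str1 = AA
      WriteLn('Str2 = ', MakeString('B', 0)); // Str2 = (empty string)
    end.
    ```

    With {$R+} in effect, a negative value passed at run time should be caught by the range check instead of silently turning into a huge count.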

  4. I tried the same quest a while back and failed for the same reasons.

    It just isn’t worth the grief. In fact, if you try to remove all the warnings from your code, it can be a real nightmare.

    If you need the extra range, it is frequently easier to just use int64.

    And btw, since Delphi is intimately tied to 32 bit windows, you never need to worry about using a data structure larger than 2gb -> It can’t happen.

    Once we all move to 64bit OSes, it may eventually be more of an issue, but probably not for a good many years (63 bits of address space is a **LOT** of ram).

    @Giel – Range checking? Who runs with range checking? I think you could break most apps (components and the VCL) just by turning it on.

  5. @ Xepol:

    > Who runs with range checking?

    A programmer who takes pre/postconditions seriously perhaps?
    If function parameters and results have suitable ranges you don’t have to add asserts for that.

  6. And, of course, many CPUs are optimized for signed integers, and your code will run slightly faster with integers than with cardinals and words.

  7. I couldn’t agree more with the original post. Another thing I dislike is that indices are counted from 0 – this leads to the exact problem described here…

    Indices should be from 1..Count and not from 0..Count-1

    What makes more logical sense:

    FOR I:=1 TO Count

    or

    FOR I:=0 TO Count-1

    ?

    I know, that in my head, if I need to iterate over 100 items, I count from 1 to 100, not from 0 to 99…

    This is also why I always code my loops as:

    FOR I:=1 TO SL.Count DO IF SL[PRED(I)]='' THEN …

    This way I don’t have to cater specifically for the case where there’s no items in the String List…

  8. @Eric:

    Unsigned integers make sense EVERYWHERE where a negative value is simply impossible. F.ex. in LENGTH, SizeOf, FileSize, DiskFree, DiskSize

    Just think back to a time where 2Gb was a HUGE file. Think of how many lazy programmers were caught off guard when HDD sizes crossed that magic boundary and you were told that there wasn’t enough space available on your drive when you had 3.5 Gb free space.

    YOU HAD TO CREATE A FILE OF 2 GB IN SIZE TO GET THE AVAILABLE DISK SPACE DOWN BELOW 2 GB JUST BECAUSE A PROGRAMMER WAS LAZY AND USED “int” INSTEAD OF “unsigned int”.

    Since a disk NEVER can have a negative amount of available space, it makes no sense to have a DiskFree function return a SIGNED integer. NONE AT ALL…

  9. @Keld:

    FOR I:=1 TO SL.Count DO IF SL[PRED(I)]='' THEN …

    This way I don’t have to cater specifically for the case
    where there’s no items in the String List…

    I don’t understand. You don’t have to cater for empty lists using the more efficient, and more normal, approach either:

    for i := 0 to Pred(sl.Count) do
      if sl[i] = '' then …

    This works exactly the same way for sl.Count = 0 as your approach, but has the added benefit of restricting the need for Pred() to the upper limit of the loop. Otherwise you have to remember to Pred() every reference to the loop counter variable, which is less efficient and just adds opportunities to forget.

  10. There was a time when the concept of “zero” didn’t exist, I guess there were long philosophical discussions back then, that it didn’t make sense to care about what didn’t exist ^_^

    >Unsigned integers make sense EVERYWHERE where a negative value is simply
    >impossible. F.ex. in LENGTH, SizeOf, FileSize, DiskFree, DiskSize

    You’re looking at it from the wrong side: the result type of a function is not (and never was) intended to be limited to the possible range of its result values. It’s always a greater container, and thankfully so; I can’t even begin to think of the mess we would be in if the results of all functions were constrained to the strictest datatype that could hold them.

    No, the result of a function is intended to be… *used*. gulp.

    And it’s a bit like in the non programming world out there: negative values were invented for a good mathematical reason, and they make sense even when the real-world possible “results” are only strictly positive.

    >YOU HAD TO CREATE A FILE OF 2 GB IN SIZE TO GET THE AVAILABLE DISK
    >SPACE DOWN BELOW 2 GB JUST BECAUSE A PROGRAMMER WAS LAZY AND
    >USED “int” INSTEAD OF “unsigned int”.

    If he had used “unsigned int” you could have screamed “YOU HAD TO USE A FILE of 4 GB IN SIZE […] BECAUSE A PROGRAMMER WAS LAZY AND USED 32bit INTEGERS INSTEAD OF 64 bit ONES”

    >Since a disk NEVER can have a negative amount of available space, it makes
    > no sense to have a DiskFree function return a SIGNED integer. NONE AT ALL…

    It makes all the sense in the world because that number is one you’re bound to use in arithmetic (subtract to or from it, compare, etc.), and it makes no sense to do arithmetic on unsigned numbers (unless you enjoy aiming loaded guns at feet).
    So better to be clean and pass an integer value you can do arithmetic on than one involving an unsafe conversion or an optional step to a higher-precision representation (which, if not done, would result in a gun at foot).

    Unsigned numbers are for binary masking, shifting and other special case situations. Heck, it would probably be safer if you could only access them in ASM 😉

  11. @Eric:

    > You’re looking at it from the wrong side, the result of a function is not (and
    > never was) intended to be limited to the possible ranges of a result value, it’s
    > always a greater container

    You are trying to impose the underlying limitations of the compiler/CPU upon the programmer’s logical thinking, which is not the way to do things. The compiler should be at the service of the programmer, not the other way around.

    If there’s no possibility for a function to ever return negative values, then it simply makes no logical sense for a function to limit its return value options just because the underlying compiler/CPU makes it “more efficient”. A good programming language is as far removed from the underlying CPU as possible, so that the programmer doesn’t have to think in those terms or operate within those restrictions…

    There have been numerous occasions in the past where a simple word, “unsigned”, would have meant that the problem never existed.

    > If he had used “unsigned int” you could have screamed “YOU HAD TO USE A
    > FILE of 4 GB IN SIZE […] BECAUSE A PROGRAMMER WAS LAZY AND USED
    > 32bit INTEGERS INSTEAD OF 64 bit ONES”

    Yes – the problem returned at a later date, but my point is that – at the time – there was no compiler (for the PC) that HAD a 64-bit integer (signed OR unsigned), so that issue could only have been solved back then by using the COMP type (a 64-bit integer handled by the FPU in its 80-bit registers, and an FPU didn’t necessarily exist in the machines of the time), whereas a simple “unsigned” already WAS available in the compilers of the day. So there was no excuse for the laziness of programmers who used signed integers as the return types of such functions.

    Restricting the possible return values of a function to allow values that can NEVER be returned is ILLOGICAL – plain and simple. And it turns the problems upside down with respect to the human/computer interface. The programmer should not be thinking about issues like these – that’s the compiler’s job.

    > I don’t understand. You don’t have to cater for empty lists using the more
    > efficient, and more normal, approach either:
    >
    > for i := 0 to Pred(sl.Count) do if sl[i] = '' then …
    >
    > This works exactly the same way for sl.Count = 0 as your approach,

    Nope. Because since I KNOW that there’s never a negative index needed, I then – of course – declare my variable to be an UNSIGNED integer, and then the compiler makes a fatal mistake and calculates that PRED(0) = 4Gb-1 (as described in the original post), which leads to a “String index out of range” exception in your IF statement…

    If the compiler was smart enough to recognize this flaw, then the problem would be smaller, but it isn’t, and this is a good example of the compiler/CPU enforcing its view on the world on the programmer, which it never should (in an ideal world), or at least should stay away from as much as possible.

    > It makes all the sense in the world because that number is one number
    > you’re bound to use in arithmetic (subtract to or from it, compare, etc.), and
    > it makes no sense to do arithmetic on unsigned numbers (unless you enjoy
    > aiming loaded guns at feet).

    It makes perfect sense. How about this:

    FUNCTION DiskSpaceUsed(Drive : CHAR) : Cardinal;
    BEGIN
    Result:=DiskSize(Drive)-DiskFree(Drive)
    END;

    This is a perfect example of a (simplistic) function where negative values should never come into play, since there’s NEVER a possibility of ANY of the values to become negative, and NEVER a possibility of DiskFree being bigger than DiskSize.

    Or what about this:

    GetMem(Buffer,LENGTH(S)+SizeOf(Structure))

    Once again, negative values have nothing whatsoever to do here, as they can never come into play.

    So plain unsigned arithmetic makes perfect sense in many cases…

  12. For me, unsigned integers “simply make sense” ™. Of course you must take care if you want to use them in any algorithm. I have had problems with everything from quick sort and binary search to plain loops.

    And yes, turning on the range checking would help a lot … but who does that?

  13. My €0.02:

    In the age of powerful computers and powerful compilers, I use it all: warnings on, range and overflow checks on, and using variable types that match their intended use:
    If I know a number is positive, it will be declared as a word; if I know it will only count to e.g. 100, it will be declared as a byte.
    That way, if I make a mistake, I have the compiler alert me to it.

    The only concession here, which bites (bytes?) me occasionally is the empty for loop. But these usually are local and short-lived variables, so declaring these as int is no big deal.

    Jan

  14. Yes, you are right. In some cases integers are a better choice, though if the whole class structure is designed properly there are not many IFs anyway.
