“in” is dead, long live “CharInSet” (or maybe not)

I’ve had this question come up a few times, so …

Since Delphi 2009 you have probably noticed this warning: “[DCC Warning] Unit1.pas(27): W1050 WideChar reduced to byte char in set expressions. Consider using ‘CharInSet’ function in ‘SysUtils’ unit.”, which suggests using the CharInSet function instead. But CharInSet still requires a plain set of AnsiChars and fails (returns False) if the code of the passed character is greater than 255, which makes it useless for Unicode characters.
Put simply: replacing the “a in b” expression with CharInSet(a, b) will simply silence the compiler. Nothing more.
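For reference, the WideChar overload of CharInSet boils down to something like this (a paraphrase of the idea, not the exact RTL source; MyCharInSet is a made-up name so it doesn’t clash with SysUtils):

```pascal
uses
  SysUtils;

function MyCharInSet(C: WideChar; const CharSet: TSysCharSet): Boolean;
begin
  // TSysCharSet is "set of AnsiChar", so only code points 0..255 can
  // possibly be members; anything above that simply yields False.
  Result := (Ord(C) <= 255) and (AnsiChar(C) in CharSet);
end;
```

So it still works for, say, ‘é’ (code point 233), but for any character outside the 0..255 range it is guaranteed to return False no matter what the set contains.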
You should really consider using the Character unit if you want to check whether a character is a letter, a digit, or part of another “standard set of chars”.

Take this example:

uses Character;

if C in ['A'..'Z'] then // <-- OLD WAY
  WriteLn('The character is an upper-cased letter.');

if IsUpper(C) then // <-- UNICODE WAY
  WriteLn('The character is an upper-cased letter.');


  1. The old way was never correct unless you handled English-only texts. And even then, names like André would have failed.

    But what drives me crazy is that you can’t check for (well-defined) symbols ['.', ',', '?', '/', '\', …] anymore without a performance hit.
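    One way to test a fixed set of well-defined symbols without going through CharInSet is a plain case statement – a sketch, with IsKnownSymbol being a made-up helper name:

```pascal
function IsKnownSymbol(C: Char): Boolean;
begin
  // A case over a (Wide)Char works for the full Unicode range and
  // typically compiles to an efficient range/jump test.
  case C of
    '.', ',', '?', '/', '\':
      Result := True;
  else
    Result := False;
  end;
end;
```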

  2. Yes, CharInSet seems utterly useless to me. And a thing like this works pretty well:

    w: Word;

    w := 100;

    if w in [0..99, 101..200] then

    Why they decided to introduce CharInSet … no idea.
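    The reason the plain in test above works at all is that Delphi set element types are limited to ordinal values 0..255 – a rough sketch of the boundary, assuming a Delphi console app:

```pascal
program SetDemo;
{$APPTYPE CONSOLE}
var
  w: Word;
begin
  w := 100;
  WriteLn(w in [0..99, 101..200]); // FALSE - 100 falls in the gap
  w := 150;
  WriteLn(w in [0..99, 101..200]); // TRUE
  // [0..255] is already the largest range such a set can cover:
  // a set's base type may not have more than 256 possible values.
end.
```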

  3. Let’s face it. The whole “CharInSet” thing is a failure on CodeGear’s part to write new, proper set evaluation logic. They are sticking with code that was written in the days of TURBO PASCAL, for goodness’ sake.

    It is absurd. Still, claiming that mechanically substituting CharInSet for “X in [char set]” does nothing but silence the compiler is wrong. First, it improves code compatibility, and secondly, it makes the poor logic and the performance hit more obvious.

    Perhaps, had they done that, CodeGear would have figured out that they should rewrite the set evaluation logic from scratch and finally remove the absurd artificial limits that have existed since DOS days.

    When they talk about abandoning some old language features in the hope of creating some true innovation in the language, perhaps they can throw out the old set logic and just write something that works properly.

    Let’s face it – many of us have had to write set logic routines when the current routines hit their limits, so it is possible to generalize. Perhaps a more general solution will not be as efficient as a size-limited set of binary flags, but it would be a hell of a lot more flexible AND keep the code cleaner than CharInSet allows.

    Now, if you want a readable CharInSet, you might well be better off with Pos(‘x’, ‘abcdefg’) > 0.

  4. I do agree with Xepol on 2 points (at least).

    – CharInSet and its old-fashioned logic are, technically, nothing less than a disappointment.

    – it’s up to the developer to work around this issue by writing “not generalized” code.

    On the other side:

    – nobody should be naive and expect a “perfect” language/IDE: CodeGear needs this kind of incentive to improve things 😉 In other words: our present remarks are constructive criticism – at least CodeGear now knows what to look at.

    – There’s a further alternative, even if it’s not universal:
    do design the application/database/whatever in a manner that avoids “wrong input” (input char limitation), or at least run the validation at an opportune moment (check input when OnExit and similar events are triggered), thus reducing the impact of the low efficiency.

  5. So what does the WideChar overload CharInSet(C: WideChar; const CharSet: TSysCharSet) actually do? I’d expect it to verify that 0 <= C <= 127 – does it?

    (I don’t have D2009 yet, so I can’t check the source….)
