Posts tagged Unicode

DeHL 0.8.2 is out


I’ve just released version 0.8.2 of DeHL. The downloads can be found on this page and the changelog on this page.
Again, this is a minor release with a few bugs fixed and a new feature: TString (as requested in this comment).
As you might have guessed already, TString is a wrapper record modeled on .NET’s System.String class. Unfortunately I was unable to use most of the RTL’s string functionality, so the “wrapper” grew well beyond my original expectations.

But … enough talk, here are some usage scenarios that someone may find useful:

var
  LStr: TString;
begin
  { Overloaded operators and a special function "U" }
  LStr := U('Hello World for the ') + 10 + 'th time!';

  { Do some random operations }
  if (LStr.ToUpper().Contains('HELLO')) and
     (LStr.Contains('HeLLo', scLocaleIgnoreCase)) then
    { ... };

  { Now let's select all the distinct chars from the string }
  { ... }
end;

TString overloads all sane operators (Equality, Inequality, Implicit conversions, Addition, Subtraction) and offers functions to convert to and from UTF8 and UCS4 (via the RTL, of course). I also need to iron out a few things about Enex integration for the next minor release.
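
For reference, here is a minimal sketch of the kind of RTL conversions such functions can build on. It uses only the System unit of Delphi 2009 or later and does not touch DeHL at all; which RTL routines TString actually calls is an assumption on my part, and the program and variable names are made up for the example.

```delphi
program EncodingSketch;
{$APPTYPE CONSOLE}

var
  LText: string;
  LUtf8: UTF8String;
  LUcs4: UCS4String;
begin
  LText := 'Hello';

  { UTF-16 string -> UTF-8 encoded bytes }
  LUtf8 := UTF8Encode(LText);

  { UTF-16 string -> array of 32-bit code points (zero terminated) }
  LUcs4 := UnicodeStringToUCS4String(LText);

  { Round-trip both encodings back to a plain string and compare }
  WriteLn(UTF8ToString(LUtf8) = LText);
  WriteLn(UCS4StringToUnicodeString(LUcs4) = LText);
end.
```

Both comparisons should print TRUE, since the text survives the round trip unchanged.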

The other small improvement that I added relates to the collection package. All simple collections (not the Key/Value pair ones) implement a sort of “where T is the_class, select it as such” operation. Check out this example:

var
  LList: TList<TObject>;
  LBuilder: TStringBuilder;
  LObject: TObject;
begin
  LList := TList<TObject>.Create;

  { Populate the list with some random objects }

  { Now select the objects we're interested in (string builders) }
  for LBuilder in LList.Op.Select<TStringBuilder> do
    WriteLn(LBuilder.ClassName); // Do stuff

  { Or select everything (not actually required - an example) }
  for LObject in LList.Op.Select<TObject> do
    WriteLn(LObject.ClassName); // Do stuff
end;

If it’s still not clear what this operation does, let me explain. It basically consists of two operations: Where and Select. First, each object is checked to be of a given class, and then the object is cast to that class, so a FOR .. IN loop iterates directly over only the objects you want. Of course doing that for TObject makes no sense (as in the example) … but well … that was an example.
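
To make the two steps concrete, here is a hand-written equivalent of the check-then-cast loop. This is only a sketch: it uses the RTL's Generics.Collections instead of DeHL's own collections, and the program name and sample objects are invented for the illustration.

```delphi
program SelectSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, Generics.Collections;

var
  LList: TObjectList<TObject>;
  LObject: TObject;
  LBuilder: TStringBuilder;
begin
  { The list owns its objects and frees them when it is destroyed }
  LList := TObjectList<TObject>.Create(True);
  try
    LList.Add(TStringBuilder.Create('one'));
    LList.Add(TStringList.Create);
    LList.Add(TStringBuilder.Create('two'));

    for LObject in LList do
      { "Where": keep only the objects of the requested class }
      if LObject is TStringBuilder then
      begin
        { "Select": cast the object to that class }
        LBuilder := TStringBuilder(LObject);
        WriteLn(LBuilder.ToString);
      end;
  finally
    LList.Free;
  end;
end.
```

The loop prints "one" and "two" and skips the TStringList, which is the filtering that Op.Select<TStringBuilder> hides behind a single call.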

Well, that’s all for today,
Have Fun!

“in” is dead, long live “CharInSet” (or maybe not)

I’ve seen this question come up a few times, so …

Since Delphi 2009 you have probably noticed this warning: “[DCC Warning] Unit1.pas(27): W1050 WideChar reduced to byte char in set expressions.  Consider using ‘CharInSet’ function in ‘SysUtils’ unit.”, which suggests using the CharInSet function instead. CharInSet still requires a plain set of AnsiChars and fails (returns False) if the code of the passed character is greater than 255, which makes it useless for Unicode characters.

Put simply: replacing the “a in b” statement with CharInSet(a, b) will simply silence the compiler. Nothing more.

You should really consider using the Character unit if you want to check whether a character is a letter, a digit, or part of another “standard set of chars”.
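
A small sketch of that limitation, assuming Delphi 2009 or later. The program name is made up, and the Cyrillic letter is written as a character code so the example does not depend on the source file encoding.

```delphi
program CharInSetSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  LChar: Char;
begin
  { A plain ASCII letter: CharInSet behaves just like the old "in" }
  LChar := 'Z';
  WriteLn(CharInSet(LChar, ['A'..'Z'])); // TRUE

  { U+0416, CYRILLIC CAPITAL LETTER ZHE: its code is above 255, }
  { so CharInSet returns False whatever the set contains }
  LChar := #$0416;
  WriteLn(CharInSet(LChar, ['A'..'Z'])); // FALSE
end.
```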

Take this example:

uses Character;
  if C in ['A'..'Z'] then // <-- OLD WAY
    WriteLn('The character is an upper-cased letter.');

  if IsUpper(C) then // <-- UNICODE WAY
    WriteLn('The character is an upper-cased letter.');
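
Here is a slightly fuller sketch of the same idea with a character outside the ASCII range. It assumes the global routines of the Character unit as shipped with Delphi 2009; in later versions the unit is named System.Character and the record helper methods on Char are the preferred spelling, so adjust accordingly. The program name is made up.

```delphi
program IsUpperSketch;
{$APPTYPE CONSOLE}

uses
  Character;

var
  LChar: Char;
begin
  { U+0416, CYRILLIC CAPITAL LETTER ZHE: an upper-cased letter }
  { that a set such as ['A'..'Z'] can never contain }
  LChar := #$0416;
  WriteLn(IsUpper(LChar));  // TRUE
  WriteLn(IsLetter(LChar)); // TRUE

  { The same unit covers the other standard classes, e.g. digits }
  LChar := '7';
  WriteLn(IsDigit(LChar));  // TRUE
end.
```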