Posts tagged OOP

DeHL, Delphi 2010 and Serialization

DeHLA few months have passed and I did not release a new version of DeHL yet. No, it’s not dead. I’ve just been busy with a delicate new feature — Serialization. This post will demonstrate the new capabilities of DeHL it’s advantages and and shortcomings.

But first — since the new releases will focus mostly on serialization and related stuff, I decided to drop Delphi 2009 support. It made no sense to support 2009 for future versions since no essential changes are made to the prior code. You can still use 0.7 release in Delphi 2009.

Back to serialization. The following list describes the changes that went into the new version:

  • In order to support serialization, DeHL’s type system was extended to support Serialize and Deserialize methods. Each type class (that describes a type in Delphi) now knows how to serialize values of the type it manages.
  • A new unit named DeHL.Serialization was added. It contains the base definitions of types used by the type system for serialization.
  • TPointerType, TRecordType<T>, TArrayType<T>, etc. were added for simplified type handling. The old method was a mix-up of Delphi 2009 and Delphi 2010 RTTI specifics (which have some essential differences in my case).
  • All classes can now implement ISerializable interface. The TClassType<T> detects whether this interface is implemented by the object and uses it for serialization (no, reference counting is not touched).
  • DeHL.Serialization.Abstract contains the semi-implementation of a “serializer” and it’s context. It is used by specific serializers.
  • DeHL.Serialization.XML defines the TXMLSerializer<T> which can be used to serialize/deserialize into XML nodes (uses TXMLDocument). Supports it’s own set of attributes (such as XmlRoot, XmlElement, etc.).
  • DeHL.Serialization.Ini defines the TIniSerializer<T> that you can use to serialize/deserialize type into Ini files or registry (through RTL’s TRegIniFile).
  • Most DeHL types (such as Nullable<T>TFixedArray<T>, BigInteger, etc.) provide their own serialization and deserialization methods.
  • All Enex collections (except a few that can’t actually) can be serialized and deserialized. They implement a custom serialization and deserialization technique through ISerializable.

Enough talk, a mandatory example:

type
  [XmlRoot('Testing', 'http://test.namespace.com')]
  TTest = class
    { Pointer to self }
    [XmlElement('PointerToSelf')]
    FSelf: TObject; 
    {A set of format settings }
    FFormatSettings: TFormatSettings;

    { And internal record }
    FInternal: record
      { Force the field to be an attribute of FInternal }
      [XmlAttribute('Value')]
      FOne: Integer;

      { Force this element to have same name but other namespace }
      [XmlElement('Value', 'http://other.namespace.com')]
      FTwo: String;
    end;

    FListOfDoubles: TList<Double>;
  end;
var
  LDocument: IXMLDocument;
  LXMLSerializer: TXMLSerializer<TTest>;
  LOutInst, LInInst: TTest;
begin
  CoInitializeEx(nil, 0);

  { Initialize the test object }
  LOutInst := TTest.Create;
  LOutInst.FSelf := LOutInst;
  GetLocaleFormatSettings(GetThreadLocale(), LOutInst.FFormatSettings);
  LOutInst.FInternal.FOne := 1;
  LOutInst.FInternal.FTwo := '2 - Two';
  LOutInst.FListOfDoubles := TList<Double>.Create();
  LOutInst.FListOfDoubles.Add(0.55);
  LOutInst.FListOfDoubles.Add(0.122);
  LOutInst.FListOfDoubles.Add(122.23);

  { Create the serializer and an XML document }
  LXMLSerializer := TXMLSerializer<TTest>.Create();
  LDocument := TXMLDocument.Create(nil);

  { Set the options }
  LDocument.Active := true;
  LDocument.Options := LDocument.Options + [doNodeAutoIndent];

  { Force fields to elements by default }
  LXMLSerializer.DefaultFieldsToTags := true;

  { Serialize the structure }
  LXMLSerializer.Serialize(LOutInst, LDocument.Node);

  { Serialize the structure }
  LXMLSerializer.Deserialize(LInInst, LDocument.Node);

  { Cleanup }
  LDocument.SaveToFile('c:\test.xml');
  LXMLSerializer.Free;
end.

The XML file generated by this code looks like this (INI looks uglier):

<Testing xmlns="http://test.namespace.com" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:DeHL="http://alex.ciobanu.org/DeHL.Serialization.XML" xmlns:NS1="http://other.namespace.com">
  <PointerToSelf DeHL:ref="Testing"/>
  <FFormatSettings>
    <CurrencyString>$</CurrencyString>
    <CurrencyFormat>0</CurrencyFormat>
    <CurrencyDecimals>2</CurrencyDecimals>
    <DateSeparator>/</DateSeparator>
    <TimeSeparator>:</TimeSeparator>
    <ListSeparator>,</ListSeparator>
    <ShortDateFormat>M/d/yyyy</ShortDateFormat>
    <LongDateFormat>dddd, MMMM dd, yyyy</LongDateFormat>
    <TimeAMString>AM</TimeAMString>
    <TimePMString>PM</TimePMString>
    <ShortTimeFormat>h:mm AMPM</ShortTimeFormat>
    <LongTimeFormat>h:mm:ss AMPM</LongTimeFormat>
    <ShortMonthNames>
      <string>Jan</string>
      <string>Feb</string>
      <string>Mar</string>
      <string>Apr</string>
      <string>May</string>
      <string>Jun</string>
      <string>Jul</string>
      <string>Aug</string>
      <string>Sep</string>
      <string>Oct</string>
      <string>Nov</string>
      <string>Dec</string>
    </ShortMonthNames>
    <LongMonthNames>
      <string>January</string>
      <string>February</string>
      <string>March</string>
      <string>April</string>
      <string>May</string>
      <string>June</string>
      <string>July</string>
      <string>August</string>
      <string>September</string>
      <string>October</string>
      <string>November</string>
      <string>December</string>
    </LongMonthNames>
    <ShortDayNames>
      <string>Sun</string>
      <string>Mon</string>
      <string>Tue</string>
      <string>Wed</string>
      <string>Thu</string>
      <string>Fri</string>
      <string>Sat</string>
    </ShortDayNames>
    <LongDayNames>
      <string>Sunday</string>
      <string>Monday</string>
      <string>Tuesday</string>
      <string>Wednesday</string>
      <string>Thursday</string>
      <string>Friday</string>
      <string>Saturday</string>
    </LongDayNames>
    <ThousandSeparator>,</ThousandSeparator>
    <DecimalSeparator>.</DecimalSeparator>
    <TwoDigitYearCenturyWindow>50</TwoDigitYearCenturyWindow>
    <NegCurrFormat>0</NegCurrFormat>
  </FFormatSettings>
  <FInternal Value="1">
    <NS1:Value>2 - Two</NS1:Value>
  </FInternal>
  <FListOfDoubles>
    <Elements>
      <Double>0.55</Double>
      <Double>0.122</Double>
      <Double>122.23</Double>
    </Elements>
  </FListOfDoubles>
</Testing>

On the first serialized/deserialized value, serializers build up a sort of an internal “object graph” and gathers all information about the data being serialized. The next uses of the same serializer instance yield an 10x performance gain since there is no need to rebuild all the information from scratch. I am still working on more optimizations that could give greater speed boost.

P.S. I can’t show the contents of the deserialized object here so you’ll have to take my word for it.

Note. This is just a preview of what is going on in the trunk. No version is released since I have to iron out the last problems and write the missing unit tests.



TypeInfo workaround

This is going to be a short one. Just wanted to share a simple and elegant work-around for this QC issue:

type
  TypeOf<T> = record
    class function TypeInfo: PTypeInfo; static;
    class function Name: string; static;
    class function Kind: TTypeKind; static;
  end;

{ TypeOf<T> }

class function TypeOf<T>.Kind: TTypeKind;
var
  LTypeInfo: PTypeInfo;
begin
  LTypeInfo := TypeInfo;

  if LTypeInfo <> nil then
    Result := LTypeInfo^.Kind
  else
    Result := tkUnknown;
end;

class function TypeOf<T>.Name: string;
var
  LTypeInfo: PTypeInfo;
begin
  LTypeInfo := TypeInfo;

  if LTypeInfo <> nil then
    Result := GetTypeName(LTypeInfo)
  else
    Result := '';
end;

class function TypeOf<T>.TypeInfo: PTypeInfo;
begin
  Result := System.TypeInfo(T);
end;

Now, you can obtain the type information for any type by simply using TypeOf<T>.TypeInfo:

type
  TRecord<T> = record
    FVal: T;
  end;

  TIntRecord = TRecord<Integer>;

begin
  // TypeInfo(TIntRecord); // Fails with compile-time error;
  WriteLn(GetTypeName(TypeOf<TIntRecord>.TypeInfo));

  ReadLn;
end.

You can also use this construct to obtain the type information for generic types from within themselves:

type
  TSomeRec<T> = record
    function GetMyName: String;
  end;

function TSomeRec<T>.GetMyName(): String;
begin
  // Result := GetTypeName(TypeInfo(TSomeRec<T>)); // Fails
  Result := TypeOf<TSomeRec<T>>.Name;
end;

The only thing to notice is that compile time [DCC Error] E2134 Type ‘XXX’ has no type info errors will not be triggered for types having no type info. Instead, the TypeOf<T>.TypeInfo method will return nil.

DeHL 0.7 is up

DeHL

After a few months of no releases, I finally decided to throw one out — so here it is, DeHL 0.7. This release is adding three more collection, new types and fixes some internal limitations of the library.

For the people that never tried DeHL – it is a collection of types and classes that use (and even abuse) the newer generation Delphi compilers into obtaining some features that were impossible in the past releases.

The newly added collection classes are:

  • TPriorityQueue<TPriority, TValue> which implements the priority queue.
  • TDistinctMultiMap<TKey, TValue> that is similar to TMultiMap but uses sets instead of lists to store the values. This ensures that all values for a key are distinct. As usual 2 more flavors of this class exist – TSortedDistinctMultiMap<TKey, TValue> and TDoubleSortedDistinctMultiMap<TKey, TValue>.
  • TBidiMap<TKey, TValue> implements the concept of bi-directional map in which there is a bi-directional relation between keys and values (both are actually keys). TSortedBidiMap<TKey, TValue> and TDoubleSortedBidiMap<TKey, TValue> are the other two flavors of this class.
  • TStringList<T> and TWideStringList<T> collections are actually generic variants of the normal TStringList and TWideStringList (which are used as base classes).

Other enhancements include:

  • Singleton<T> class. Which can be used to access the same instance of a class across all application.
  • TWideCharSet (this was inspired on other implementations out there but use dynamic arrays to save memory and speed in certain circumstances).
  • TType speed and reliability increases.
  • A global DeHL.Defines.inc files for all $IFDEF needs.
  • Types such as Nullable<T>, Scoped<T> and TKeyValuePair<TKey, TValue> now have their own type classes.

For a full list of changes see the changelog.

This release also makes use of some new improvements of the Delphi 2010 compiler (such as class constructors, or some fixed problems in the generics handling) which makes some features unavailable on Delphi 2009. I plan to keep Delphi 2009 supported for as long as possible, but many new planned features are not going to be available on that platform.

Have Fun!

DeHL 0.6 available

DeHL

Yes I know I have skipped 0.5. The reality is that 0.5 was due a long time ago, but I did not have enough free time on my hands to complete the unit testing for all new features added. I always try to add as many unit tests as possible to test all possible scenarios.

This release features a general reorganization of some key concepts, new type extension mechanics, boxing, a lot more collections, better support for associative collections through specific Enex extensions … and many more. I am not going to lay out all the changes here since there are quite a few of them. You can view the change log here.

One important note, check out the DeHL.Base.pas unit. There are a number of TOGGLE_XXXXX constants. Toggle them to true or false (if you are using Weaver FT builds) to enable or disable specific functionality. In Tiburon (Delphi 2009) all the features are disabled by default (to be on the safe side).

Get the latest build here. If you stumble upon some bugs please report them here.

Have Fun!

Enumerating over a directory structure

Me again, and again with enumeration techniques. In this post I will try to coven a very common problem all programmers have to face one time or another: enumerating all files recursively in a directory.

Yesterday I had to do it again, and again following the standard FindFirst … FindNext and FindClose pattern. So I decided to make my life easier and use enumerators for that. Behold the results:

var
  S: String;
begin
  for S in TDirectory.Entries('I:', true) do
    WriteLn(S);
end;

That’s all you have to do to traverse the directory structure for the I:\ drive — simple and clean.

So how does this work and what are the advantages:

  • The method Entries is a static method of TDirectory structure. There are two more methods: Files, which returns only files and Directories which returns only directories.
  • The exposes static methods return an IEnumerable<String> interface which exposes the GetEnumerator() method. This is an important aspect of the implementation since exposing an interface helps you pass the lifetime management of the enumerable object to the compiler.
  • The for .. in loop then extracts a IEnumerator<String> interface and starts iterating over the directory tree.
  • The TDirectory.TEnumerator object does not use recursion internally to traverse the tree. It stores the TSearchRec at each level in a TStack<> instance (well it’s kinda like recursion).
  • You can “break” off the loop at any moment. Simple and easy.
  • It executes more instruction per iteration but I think it’s a manageable trade-off for its ease of use.

Again, th unit can be found here:  File system enumeration (2318).

Have fun!

Arbitrary precision integers

I have decided to extract the BigInteger and BigCardinal data types from DeHL and make a bundle with them separately. Someone may be interested in these types and not so much in DeHL. The attached unit should compile on D2007 and D2006 (not sure for earlier versions).

What can you learn:

  • Shows you how to develop “value types” with predictable states in Delphi using records, operator overloading and “managed types” (in this case a dynamic array).
  • How to develop a proxy class that plugs your type into the Variant manager.
  • Demonstrates how to use operator overloading (and I have overloaded a lot of operators there).
  • The reliance of the predictable state of the managed types to check whether a value is initialized or not.
  • Shows how the use an in-place BCD-izer algorithm that helps converting a big number into a string. It took me a quite some time and tweaking to come up with it so it may be interesting to someone.

Where can you use it:

  • Everywhere, except FOR loops of course.
  • The types are transparent and should be used the exact way as normal integer do.
  • Use it in applications that require very big integer values.
  • … you’ll come up with something else.

A small example of course:

var
  a, b: BigInteger;
begin
  a := StrToBigInteger('8942374632463284632623837643468589' + 
        '26367423509458904389575645349857');
  b := a * a;
  WriteLn(IntToStr(b));
end.

… or as variants …

var
  a, b: Variant;
begin
  a := StrToBigInteger('8942374632463284632623837643468589' + 
        '26367423509458904389575645349857');
  b := a * a;
  WriteLn(b);
end.

Final note: the code is tested since it’s a part of DeHL (in which I try to test everything I can) so it’s safe to use. BUT if you find any bugs, please drop me a line.

Finally! The code! You can find it here: Arbitrary precision Integers (1616)

DeHL documentation

Since I had a few requests, I have started putting together some documentation on DeHL. It’s pretty basic and only includes some conceptual stuff, not API documentation. You can find it here.

Version 0.5 is shaping up quite well with tons of changes, and new stuff coming in — but that’s a another story.

Not much in this post,  Cheers!

What’s new in DeHL 0.4

In the previous post I have mostly talked about Enex (Enumerable Extensions) I have enabled for all DeHL’s collection classes. Now I will list the other changes that got into this release that may be of interest to you:

  • Enex (Enumerable Extensions) in DeHL.Collections.Enex. Support for chained queries, predicates and selectors.
  • THashTree (A tree that uses hashes to access the child nodes). in DeHL.Collections.HashTree.
  • TConverter can be created with given IType‘s as opposed from the default ones.
  • TSuppressedWrapperType which can be used to wrap another IType. It’s main function is to forward all calls to the enclosed IType except the Cleanup() and Management() ones. This way you can use temporary collections inside your collection without the “cleanup” problem.
  • Fixed 2 bugs in Resize() method of THashSet and TDictionary which resulted in corrupted heap in some cases.
  • TLinkedList properties called First and Last were renamed to FirstNode and LastNode to avoid naming conflicts with Enex’s First() and Last() methods.
  • Fixed a bug in TQueue‘s enumerator object. It enumerated the queue’s elements quite wrong.
  • Renamed TScopedPtr to a better name: Scoped.
  • Added a new type: Nullable. It can be used to represent a NULL value for non-object types such as Integer.
  • New life-time extensions for TRefCountedObject. Each ref-counted object can now “keep-alive” other ref-counted objects. This was an useful addition in order to support query-chains in Enex.

How to use the new Nullable<> type:

function GetTheSum(const AList: TList<Integer>): 
    Nullable<Integer>;
begin
  { Check if the list is non-null and has elements }
  if (AList = nil) or (AList.Count = 0) then
    Result.MakeNull();
  else
    Result := AList.Sum();
end;

var
  LSum: Nullable<Integer>;
begin
  { ... }
  LSum := GetTheSum(...);
  if not LSum.IsNull then
    WriteLn(LSum.Value);
end;

I agree, this is not the best example, but it is OK to get you started. Nullable values are useful when you function may return or even accept a normal (let’s say Integer) value or a NULL.

More about enumerables

In the last post I have described how enumeration works in Delphi. Now I will try to expand the subject a bit and make a more general description of what enumerability actually means and how it can solve some basic problems and patterns.

An enumrebale is not necessarily a collection“. You should keep in mind that enumerability doesn’t apply only to collections. It is a more abstract concept. Enumeration can apply to any kind of sequence — abstract, mathematical or a collection. As an example let’s talk about streams — streams are also enumerables, in some cases these behave exactly like enumerable collections: For streams that do not have a defined size (like for example downloading a file over a HTTP stream that doesn’t say the file size). All you can do in these cases is to read bytes from the stream until you read the bottom and receive an EOF notification.

Abstract vs concrete sequences“. An abstract sequence in my definition is one that doesn’t occupy space and each new element is generated on the fly by the enumerable. A concrete sequence is more akin to a collection which already has its elements stored somewhere and all it does is to fetch them when enumerating.

In the next example I took a known mathematical sequence which I am sure all of you are acquainted with: the Fibonacci sequence; and created an abstract enumerable which at each iteration calculates the next number:

type
  TFibonacciEnumerator = record
  private
    FCurrent, FPrev,
      FCount, FMaxCount: Cardinal;

    function GetCurrent: Cardinal;
  public
    { Move to the next calculation }
    function MoveNext(): Boolean;

    { Reads the current number }
    property Current: Cardinal read GetCurrent;
  end;

  Fibonacci = record
  private
    FLimit: Cardinal;

  public
    { Returns the enumerator object }
    function GetEnumerator(): TFibonacciEnumerator;

    { Static function that }
    class function OfLength(const ALength: Cardinal): Fibonacci; static;
  end;

{ TFibonacciEnumerator }

function TFibonacciEnumerator.GetCurrent: Cardinal;
begin
  Result := FCurrent;
end;

function TFibonacciEnumerator.MoveNext: Boolean;
var
  LTemp: Cardinal;
begin
  { Check if the end of the chain }
  if FCount >= FMaxCount then
    Exit(false);

  Result := true;

  { Make the next calculation }
  if FCount <= 1 then
    FCurrent := FCount
  else
  begin
    LTemp := FCurrent;
    FCurrent := FCurrent + FPrev;
    FPrev := LTemp;
  end;

  Inc(FCount);
end;

{ Fibonacci }

function Fibonacci.GetEnumerator: TFibonacciEnumerator;
begin
  Result.FCurrent := 0;
  Result.FPrev := 0;
  Result.FCount := 0;
  Result.FMaxCount := FLimit;
end;

class function Fibonacci.OfLength(const ALength: Cardinal): Fibonacci;
begin
  if ALength = 0 then
    raise EArgumentOutOfRangeException.Create('ALength should be > 0');

  Result.FLimit := ALength;
end;

var
  Number: Cardinal;
begin
  { Show the first 100 Fibonacci numbers }
  for Number in Fibonacci.OfLength(100) do
    WriteLn(Number);

  ReadLn;
end.

What is cool about this example is the fact that you do not occupy any memory with the calculated numbers, those are made on-the-fly.

This example did not demonstrate another important aspect of enumerables and exactly: “materializing” abstract sequences. If my Fibonacci record was actually a class derived from TEnumerable<Cardinal>, I could have written this:

var
  List: TList<Cardinal>;
begin
  List := TList<Cardinal>.Create(Fibonnaci.OfLength(100));
end.

This would have “materialized” the abstract sequence generated by the Fibonacci object and stored each value in a concrete sequence (the List collection).

Unfortunately Delphi’s generics support is at it’s infancy so not many features are yet available in the standard classes to exploit the power of Enumerables. I predict this will change over time and more cool stuff will appear either directly in the RTL or in form of 3rd-party libraries.

Enumerables in Delphi

It is not a surprise that most programming languages in our time have the built-in support for enumerators/iterators. This is a mandatory feature since enumerators and enumerable collections simplify the development of applications and make the code cleaner. Languages such as C#, Java or Delphi have built-in syntax constructs that allow simplified and clean ways of enumerating over a collection by the means of  “foreach” or special “for” loops.

In Delphi, enumerators have been added to the language for some time now in form of a special “for” loop (you can read in the Help about it more). It allows enumerating elements of an array, string and in most collection classes. While in the case of intrinsic types, the compiler simply transforms the loop into an ordinary “for” loop, in the case of collection classes things become more interesting. But let’s take each case separately:

Enumerating over a String:

var
  LString: String;
  LChar: Char;
begin
  LString := 'Hello World';

  for LChar in LString do
    Write(c);
end;

It’s simple as that! It does not have such a negative impact on the speed and make your code look much cleaner, it also helps avoid using an index variable. The same method can be applied for AnsiString, WideString and ShortString.

Enumerating over an Array:

var
  LArr: TBytes;
  LByte: Byte;
begin
  LArr := TBytes.Create(1, 2, 3, 4, 5);

  for LByte in LArr do
    Write(LByte);
end.

Looks simple. And indeed it is! There are more use cases for different types of arrays, but you can find that in the Help.

Enumerating over a collection:

var
  List: TList<Integer>;
  I: Integer;
begin
  List := TList<Integer>.Create();
  List.AddRange([1, 2, 3, 4, 5]);

  for I in List do
    WriteLn(I);
end.

In this case the overhead for the “for” loop is higher since a call to List.GetEnumerator() is made to obtain an enumerator object. Then at each iteration the MoveNext() and Current() methods are called on the enumerator to move to the next element in the list and retrieve its value.

There are in fact only a few rules that you must abide in order to support enumeration in your collections:

  1. You must have either a class, interface or record type.
  2. Your class, interface or record must expose a GetEnumerator() function that returns an enumerator.
  3. The enumerator can be either a class, interface or a record type.
  4. The enumerator must expose a MoveNext() function that returns a Boolean and a Current property that returns the current element.
  5. When the enumerator is created there is no current element selected. Only after the first call to MoveNext() your enumerator must select the first element in the collection.
  6. MoveNext() must return true if the next element was selected or false if the collection is finished.

Extreme case — enumerating over a record using a record enumerator:

type
  { The enumerator record }
  TRecordEnumerator = record
  private
    FArray: TBytes;
    FIndex: Integer;

    function GetCurrent: Byte;
  public
    function MoveNext(): Boolean;
    property Current: Byte read GetCurrent;
  end;

  { The record/collection that will be enumerated }
  TRecordCollection = record
  private
    FArray: TBytes;
  public
    function GetEnumerator(): TRecordEnumerator;
  end;

{ TRecordCollection }

function TRecordCollection.GetEnumerator: TRecordEnumerator;
begin
  Result.FArray := FArray;
  Result.FIndex := -1;
end;

{ TRecordEnumerator }

function TRecordEnumerator.GetCurrent: Byte;
begin
  Result := FArray[FIndex];
end;

function TRecordEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);

  if FIndex >= Length(FArray) then
    Exit(false);

  Exit(true);
end;

var
  LColl: TRecordCollection;
  B: Byte;

begin
  LColl.FArray := TBytes.Create(1, 2, 3, 4, 5, 6);

  for B in LColl do
    WriteLn(B);

  ReadLn;
end.

The compiler doesn’t really care what it is enumerating and what it uses to do that. It simply must find the required methods exposed in the enumerated collection and in the enumerator.