Update for FSEnum

June 15th, 2009 alex 5 comments

Since some people asked me to improve FSEnum unit (see this post). I decided to give it another go and add some features and improve the others.

So what’s new?

  • Optional Boolean parameter for all methods that tells the enumerator whether to return the relative paths (like it did before) or absolute ones.
  • There are two levels of “mask matching” (if the enumeration is recursive). The first level allows everything to pass since you need to recurse into all directories. The real mask matching is done at the second level using the TMask (unit Masks).
  • Two new methods (overloaded): FileSystemEntries which returns an enumerable over TFileSystemEntry structure.
  • A new TFileSystemEntry structure that contains more information about a file/directory.

The next example shows the use of TDirectory.FileSystemEntries method:

var
  Entry: TFileSystemEntry;

begin
  { Show all read-only executable files on drive I: bigger than 1MB }
  for Entry in TDirectory.FileSystemEntries('i:', '*.exe', true) do
    if (Entry.Size > 1000000) and Entry.IsReadOnly then WriteLn(Entry.FullName);

  ReadLn;
end.

The unit is the same (has been updated!)  and can be found here: File system enumeration (1054)

Categories: Programming

Extending TObject with data at runtime

June 13th, 2009 alex 8 comments

In a recent comment on this blog Patrick van Logchem suggested a way of extending an existing object instance with custom data at run-time (originally the idea of Thorsten Engler). The main idea is to be able to “assign” to an arbitrary object (whose sources you cannot change) some other object at run-time. This may prove to be useful in different scenarios when you need some additional data to be carried by an object.

The suggested idea was to use the monitor field which all TObject instances have. If you are not familiar with monitors in Delphi, let me explain: All TObject instances carry a new “hidden” field which is a pointer to a TMonitor (see System unit) structure. This referenced TMonitor structure is actually a kind of synchronization object that lets you solve some common threading tasks easier. You can read more about this in other posts, since this post is about using that field to store data.

The restrictions:

  • The solution must not be too simplistic. Simply rewriting the pointer in TObject instances is not allowed.
  • Normal monitor routines must function as they did before. This means that we must store a real monitor there alongside with the data.
  • No custom functions to simulate monitor support, and no class helpers. See previous item.
  • The attached data must be disposed when the object is disposed so to avoid memory and resource leaks.

The features:

  • Possibility to extend an object with another object: ExtendObject(Object, Extension);
  • Possibility to query an object for an extension: GetObjectExtension(Object): Extension;
  • Possibility to remove an extension from an object: RemoveObjectExtension(Object): Extension;
  • Object is any type of object in Delphi! No restrictions, no common ancestor; just plain TObject!
  • Extension is also simply a TObject value. It’s user-defined in it’s implementation and purpose.

The first implementation I came up with is non-intrusive. I wanted to avoid patching the System exposed functions at run-time. The following list enumerates the design and restrictions of the first implementation:

  • The unit must me USED directly after the inclusion of SysUtils in the main source file. This is an inconvenience of course, but it is a required one. Note that the unit must be included AFTER and not BEFORE SysUtils.
  • The unit overrides the values in the System.MonitorSupport variable and inserts it’s own custom routines used to obtain and release synchronization objects.
  • The unit uses the old System.MonitorSupport routines to do the real job. These routines are normally provided by the SysUtils unit — thus the dependency on SysUtils.
  • For each synchronization object requested, my custom routines return a fake handle which is actually a pointer to a structure containing a real handle and a TObject value.
  • This method does not use the monitor field per se; rather, it uses a field in the monitor itself.
  • Class helpers are used in implementation section to obtain access to internal method in the TMonitor structure.
  • While this method is non-intrusive at assembly level, it is surely more complex and uses more CPU cycles.

The second implementation is intrusive! It patches some functions in System unit so that my handler are executed in certain scenarios. The following list enumerates what’s going on in this one:

  • The unit should be USED after or before SysUtils. This restriction comes from the fact that monitors initialized before this unit is included have a different format. So there may be (or maybe not) problems.
  • Two functions are patched in System unit: TMonitor.Destroy(TObject) and TMonitor.Create(). First one is executed when a monitor is destroyed – normally at object death; and the second one is called when a monitor value is needed for the first time.
  • The two injected functions do basically the same thing as the System versions, with a slight turn – a TMonitor + Pointer value is created/destroyed. This bonus Pointer value hold the extension object.
  • Class helpers are used internally to gain access to some monitor support routines.
  • This method does not incur any overhead on normal monitor operations, so it is the preferred one.

And now to some code:

uses
  SysUtils,
  ObjectExtensions_Intrusive;

type
  TStrExtenstion = class
  end;

  TIntExtension = class
  end;

procedure DoStuff(const A: TObject);
var
  Extension: TObject;
begin
  if A = nil then
    Exit;

  Extension := GetObjectExtension(A);

  if Extension = nil then
    Exit;

  WriteLn(Extension.ClassName);
end;

var
  a, b, c: TObject;
begin
  a := TObject.Create;
  b := TObject.Create;
  c := TObject.Create;

  ExtendObject(a, TStrExtenstion.Create);
  ExtendObject(b, TIntExtension.Create);

  DoStuff(a);
  DoStuff(b);
  DoStuff(c);

  ReadLn;
end.

It’s not hard to imagine that this method can be used is different circumstances – the ideas are all yours!

MEGA WARNING: The attached code is barely tested, possibly unstable and even worse – maybe destructive. This is just a fun and proof of concept code and not something that can be used in applications.

The code can be found in this archive: Object Extensions (629)

Mentioned the authors of the idea.

Categories: Programming

Enumerating over a directory structure

June 12th, 2009 alex 13 comments

Me again, and again with enumeration techniques. In this post I will try to coven a very common problem all programmers have to face one time or another: enumerating all files recursively in a directory.

Yesterday I had to do it again, and again following the standard FindFirst … FindNext and FindClose pattern. So I decided to make my life easier and use enumerators for that. Behold the results:

var
  S: String;
begin
  for S in TDirectory.Entries('I:', true) do
    WriteLn(S);
end;

That’s all you have to do to traverse the directory structure for the I:\ drive — simple and clean.

So how does this work and what are the advantages:

  • The method Entries is a static method of TDirectory structure. There are two more methods: Files, which returns only files and Directories which returns only directories.
  • The exposes static methods return an IEnumerable<String> interface which exposes the GetEnumerator() method. This is an important aspect of the implementation since exposing an interface helps you pass the lifetime management of the enumerable object to the compiler.
  • The for .. in loop then extracts a IEnumerator<String> interface and starts iterating over the directory tree.
  • The TDirectory.TEnumerator object does not use recursion internally to traverse the tree. It stores the TSearchRec at each level in a TStack<> instance (well it’s kinda like recursion).
  • You can “break” off the loop at any moment. Simple and easy.
  • It executes more instruction per iteration but I think it’s a manageable trade-off for its ease of use.

Again, th unit can be found here:  File system enumeration (1054).

Have fun!

Categories: Programming

Arbitrary precision integers

June 10th, 2009 alex 7 comments

I have decided to extract the BigInteger and BigCardinal data types from DeHL and make a bundle with them separately. Someone may be interested in these types and not so much in DeHL. The attached unit should compile on D2007 and D2006 (not sure for earlier versions).

What can you learn:

  • Shows you how to develop “value types” with predictable states in Delphi using records, operator overloading and “managed types” (in this case a dynamic array).
  • How to develop a proxy class that plugs your type into the Variant manager.
  • Demonstrates how to use operator overloading (and I have overloaded a lot of operators there).
  • The reliance of the predictable state of the managed types to check whether a value is initialized or not.
  • Shows how the use an in-place BCD-izer algorithm that helps converting a big number into a string. It took me a quite some time and tweaking to come up with it so it may be interesting to someone.

Where can you use it:

  • Everywhere, except FOR loops of course.
  • The types are transparent and should be used the exact way as normal integer do.
  • Use it in applications that require very big integer values.
  • … you’ll come up with something else.

A small example of course:

var
  a, b: BigInteger;
begin
  a := StrToBigInteger('8942374632463284632623837643468589' +
        '26367423509458904389575645349857');
  b := a * a;
  WriteLn(IntToStr(b));
end.

… or as variants …

var
  a, b: Variant;
begin
  a := StrToBigInteger('8942374632463284632623837643468589' +
        '26367423509458904389575645349857');
  b := a * a;
  WriteLn(b);
end.

Final note: the code is tested since it’s a part of DeHL (in which I try to test everything I can) so it’s safe to use. BUT if you find any bugs, please drop me a line.

Finally! The code! You can find it here: Arbitrary precision Integers (574)

Categories: Programming

Playing with Windows 7

June 10th, 2009 alex 20 comments

I was playing around with some (default Windows) programs these days and noticed that taskbar items for those programs show some cool effects whenever there is some progress going on So I decided to write a little unit to take advantage of these effects in one of my Delphi applications.

To take advantage of the new Windows 7 TaskBar API take a look at the ITaskBarList3 interface (an probably 4 for that matter). I have translated the interface to Delphi and added some initialization code and came up with a little unit that exposes a class helper for TCustomForm.

The final user-friendly code looks like this:

procedure TForm1.Button1Click(Sender: TObject);
begin
  SetProgressState(tbpsError);
  SetProgressValue(50, 100);
end;

The SetProgressState and SetProgressValue methods are inserted by the class helper. The example makes the taskbar icon for the window look like a 50% complete red progress bar (indicating an error):

error_task_bar

or something like this for a “indefinite” state (of course the animation is missing so you can’t see the effects):

ind_task_item

If you want to play with the unit, take a look here: Windows 7 TaskBar helper for Delphi (1016)

Categories: Programming

DeHL documentation

June 9th, 2009 alex No comments

Since I had a few requests, I have started putting together some documentation on DeHL. It’s pretty basic and only includes some conceptual stuff, not API documentation. You can find it here.

Version 0.5 is shaping up quite well with tons of changes, and new stuff coming in — but that’s a another story.

Not much in this post,  Cheers!

Categories: Programming

What’s new in DeHL 0.4

May 13th, 2009 alex 4 comments

In the previous post I have mostly talked about Enex (Enumerable Extensions) I have enabled for all DeHL’s collection classes. Now I will list the other changes that got into this release that may be of interest to you:

  • Enex (Enumerable Extensions) in DeHL.Collections.Enex. Support for chained queries, predicates and selectors.
  • THashTree (A tree that uses hashes to access the child nodes). in DeHL.Collections.HashTree.
  • TConverter can be created with given IType‘s as opposed from the default ones.
  • TSuppressedWrapperType which can be used to wrap another IType. It’s main function is to forward all calls to the enclosed IType except the Cleanup() and Management() ones. This way you can use temporary collections inside your collection without the “cleanup” problem.
  • Fixed 2 bugs in Resize() method of THashSet and TDictionary which resulted in corrupted heap in some cases.
  • TLinkedList properties called First and Last were renamed to FirstNode and LastNode to avoid naming conflicts with Enex’s First() and Last() methods.
  • Fixed a bug in TQueue‘s enumerator object. It enumerated the queue’s elements quite wrong.
  • Renamed TScopedPtr to a better name: Scoped.
  • Added a new type: Nullable. It can be used to represent a NULL value for non-object types such as Integer.
  • New life-time extensions for TRefCountedObject. Each ref-counted object can now “keep-alive” other ref-counted objects. This was an useful addition in order to support query-chains in Enex.

How to use the new Nullable<> type:

function GetTheSum(const AList: TList<Integer>):
    Nullable<Integer>;
begin
  { Check if the list is non-null and has elements }
  if (AList = nil) or (AList.Count = 0) then
    Result.MakeNull();
  else
    Result := AList.Sum();
end;

var
  LSum: Nullable<Integer>;
begin
  { ... }
  LSum := GetTheSum(...);
  if not LSum.IsNull then
    WriteLn(LSum.Value);
end;

I agree, this is not the best example, but it is OK to get you started. Nullable values are useful when you function may return or even accept a normal (let’s say Integer) value or a NULL.

Categories: Programming

DeHL 0.4: Linq-ifying Delphi

May 12th, 2009 alex 3 comments

Since I have started working on DeHL, one of the main purposes was to get to the point where I could use Linq-like extensions on collection classes. Of course I could not call the new functionality Linq since it’s not language-integrated, so I decided to give it another name: Enex (aka Enumerable Extensions). All collection classes in DeHL now support a wide set of methods that allow querying and selecting elements from the starting collection. A lot of work has been put into this release, ranging from extensions to type system, object life-time management, optimizations and bug-fixes and of course a lot of testing.

Now, back to Enex, let’s visit some cool examples of what the new functionality can do:

Let’s check a very simple example. Assuming you have a list of integers and you want to find the sum of all elements that are greater than the list average. In the “times before” you wold write something like this:

  { Calculate the average of the list }
  Sum := 0;

  for I := 0 to List.Count - 1 do
    Sum := Sum + List[I];

  Avg := Sum div List.Count;

  { Calculate the sum of the elements above average }
  Sum := 0;

  for I := 0 to List.Count - 1 do
    if List[I] > Avg then
      Sum := Sum + List[I];

… Now it is as simple as this:

  { ... }
  Sum := List.WhereGreater(List.Average()).Sum();
  { ... }

Now, let’s see another more complex example. Considering that you have two collections of integers. You have to determine the maximum value common to both collections. Check it out:

  { ... }
  Max := AColl1.Intersect(AColl2).Max();
  { ... }

Next: what is the sum of all odd numbers in interval [0..100]?

  Sum := TEnexEnumerable<Integer>
    .Interval(0, 100)
    .Where(function(Arg1: Integer): Boolean
      begin
        Exit(Odd(Arg1);
      end)
    .Sum();

The last one: You want to take all integers from a list with their absolute value and only once; and you want to write them on the screen directly:

  for I in ARandomList
    .Select(function(Arg1: Integer): Integer
      begin
        Exit(Abs(Arg1));
      end)
    .Distinct()
  do
    WriteLn(I);

What’s working and what’s not:

  • Some problems in the compiler prevent the use of generic function Select<TOut> and Cast<TOut> for now. These two could have been used to transform each element of the collection to some other type.
  • Everything is based in interfaces to make use of automatical garbage collection. the only restriction is that the first enumerable (from which the chain starts) should be either an interface or must be destroyed explicitly.
  • All collections have overriden versions of some functions. For example in a list, the First() method is implemeted directly so that you benefit from full speed whenever possible.
  • No lambdas … that makes writing selectors or predicates rather unpleasant.
  • Sum() and Average() are only supported for types registered as integers or reals (including BigCardinal and BigInteger).
  • All selector and predicate-based selection is done on-the fly so there is no additional memory used!

There are of course much more possible use cases for Enex-enabled collections, I simply do not have the time to list all of them.

You can download the latest version of DeHL here.

Categories: Programming

More about enumerables

May 12th, 2009 alex No comments

In the last post I have described how enumeration works in Delphi. Now I will try to expand the subject a bit and make a more general description of what enumerability actually means and how it can solve some basic problems and patterns.

An enumrebale is not necessarily a collection“. You should keep in mind that enumerability doesn’t apply only to collections. It is a more abstract concept. Enumeration can apply to any kind of sequence — abstract, mathematical or a collection. As an example let’s talk about streams — streams are also enumerables, in some cases these behave exactly like enumerable collections: For streams that do not have a defined size (like for example downloading a file over a HTTP stream that doesn’t say the file size). All you can do in these cases is to read bytes from the stream until you read the bottom and receive an EOF notification.

Abstract vs concrete sequences“. An abstract sequence in my definition is one that doesn’t occupy space and each new element is generated on the fly by the enumerable. A concrete sequence is more akin to a collection which already has its elements stored somewhere and all it does is to fetch them when enumerating.

In the next example I took a known mathematical sequence which I am sure all of you are acquainted with: the Fibonacci sequence; and created an abstract enumerable which at each iteration calculates the next number:

type
  TFibonacciEnumerator = record
  private
    FCurrent, FPrev,
      FCount, FMaxCount: Cardinal;

    function GetCurrent: Cardinal;
  public
    { Move to the next calculation }
    function MoveNext(): Boolean;

    { Reads the current number }
    property Current: Cardinal read GetCurrent;
  end;

  Fibonacci = record
  private
    FLimit: Cardinal;

  public
    { Returns the enumerator object }
    function GetEnumerator(): TFibonacciEnumerator;

    { Static function that }
    class function OfLength(const ALength: Cardinal): Fibonacci; static;
  end;

{ TFibonacciEnumerator }

function TFibonacciEnumerator.GetCurrent: Cardinal;
begin
  Result := FCurrent;
end;

function TFibonacciEnumerator.MoveNext: Boolean;
var
  LTemp: Cardinal;
begin
  { Check if the end of the chain }
  if FCount >= FMaxCount then
    Exit(false);

  Result := true;

  { Make the next calculation }
  if FCount <= 1 then
    FCurrent := FCount
  else
  begin
    LTemp := FCurrent;
    FCurrent := FCurrent + FPrev;
    FPrev := LTemp;
  end;

  Inc(FCount);
end;

{ Fibonacci }

function Fibonacci.GetEnumerator: TFibonacciEnumerator;
begin
  Result.FCurrent := 0;
  Result.FPrev := 0;
  Result.FCount := 0;
  Result.FMaxCount := FLimit;
end;

class function Fibonacci.OfLength(const ALength: Cardinal): Fibonacci;
begin
  if ALength = 0 then
    raise EArgumentOutOfRangeException.Create('ALength should be > 0');

  Result.FLimit := ALength;
end;

var
  Number: Cardinal;
begin
  { Show the first 100 Fibonacci numbers }
  for Number in Fibonacci.OfLength(100) do
    WriteLn(Number);

  ReadLn;
end.

What is cool about this example is the fact that you do not occupy any memory with the calculated numbers, those are made on-the-fly.

This example did not demonstrate another important aspect of enumerables and exactly: “materializing” abstract sequences. If my Fibonacci record was actually a class derived from TEnumerable<Cardinal>, I could have written this:

var
  List: TList<Cardinal>;
begin
  List := TList<Cardinal>.Create(Fibonnaci.OfLength(100));
end.

This would have “materialized” the abstract sequence generated by the Fibonacci object and stored each value in a concrete sequence (the List collection).

Unfortunately Delphi’s generics support is at it’s infancy so not many features are yet available in the standard classes to exploit the power of Enumerables. I predict this will change over time and more cool stuff will appear either directly in the RTL or in form of 3rd-party libraries.

Categories: Programming

Enumerables in Delphi

May 12th, 2009 alex 9 comments

It is not a surprise that most programming languages in our time have the built-in support for enumerators/iterators. This is a mandatory feature since enumerators and enumerable collections simplify the development of applications and make the code cleaner. Languages such as C#, Java or Delphi have built-in syntax constructs that allow simplified and clean ways of enumerating over a collection by the means of  “foreach” or special “for” loops.

In Delphi, enumerators have been added to the language for some time now in form of a special “for” loop (you can read in the Help about it more). It allows enumerating elements of an array, string and in most collection classes. While in the case of intrinsic types, the compiler simply transforms the loop into an ordinary “for” loop, in the case of collection classes things become more interesting. But let’s take each case separately:

Enumerating over a String:

var
  LString: String;
  LChar: Char;
begin
  LString := 'Hello World';

  for LChar in LString do
    Write(c);
end;

It’s simple as that! It does not have such a negative impact on the speed and make your code look much cleaner, it also helps avoid using an index variable. The same method can be applied for AnsiString, WideString and ShortString.

Enumerating over an Array:

var
  LArr: TBytes;
  LByte: Byte;
begin
  LArr := TBytes.Create(1, 2, 3, 4, 5);

  for LByte in LArr do
    Write(LByte);
end.

Looks simple. And indeed it is! There are more use cases for different types of arrays, but you can find that in the Help.

Enumerating over a collection:

var
  List: TList<Integer>;
  I: Integer;
begin
  List := TList<Integer>.Create();
  List.AddRange([1, 2, 3, 4, 5]);

  for I in List do
    WriteLn(I);
end.

In this case the overhead for the “for” loop is higher since a call to List.GetEnumerator() is made to obtain an enumerator object. Then at each iteration the MoveNext() and Current() methods are called on the enumerator to move to the next element in the list and retrieve its value.

There are in fact only a few rules that you must abide in order to support enumeration in your collections:

  1. You must have either a class, interface or record type.
  2. Your class, interface or record must expose a GetEnumerator() function that returns an enumerator.
  3. The enumerator can be either a class, interface or a record type.
  4. The enumerator must expose a MoveNext() function that returns a Boolean and a Current property that returns the current element.
  5. When the enumerator is created there is no current element selected. Only after the first call to MoveNext() your enumerator must select the first element in the collection.
  6. MoveNext() must return true if the next element was selected or false if the collection is finished.

Extreme case — enumerating over a record using a record enumerator:

type
  { The enumerator record }
  TRecordEnumerator = record
  private
    FArray: TBytes;
    FIndex: Integer;

    function GetCurrent: Byte;
  public
    function MoveNext(): Boolean;
    property Current: Byte read GetCurrent;
  end;

  { The record/collection that will be enumerated }
  TRecordCollection = record
  private
    FArray: TBytes;
  public
    function GetEnumerator(): TRecordEnumerator;
  end;

{ TRecordCollection }

function TRecordCollection.GetEnumerator: TRecordEnumerator;
begin
  Result.FArray := FArray;
  Result.FIndex := -1;
end;

{ TRecordEnumerator }

function TRecordEnumerator.GetCurrent: Byte;
begin
  Result := FArray[FIndex];
end;

function TRecordEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);

  if FIndex >= Length(FArray) then
    Exit(false);

  Exit(true);
end;

var
  LColl: TRecordCollection;
  B: Byte;

begin
  LColl.FArray := TBytes.Create(1, 2, 3, 4, 5, 6);

  for B in LColl do
    WriteLn(B);

  ReadLn;
end.

The compiler doesn’t really care what it is enumerating and what it uses to do that. It simply must find the required methods exposed in the enumerated collection and in the enumerator.

Categories: Programming