Generics + System.Move = Kaboom!

It’s probably not news anymore that the Move procedure (in the System unit) should not be used to move chunks of memory that contain references to managed types (like String, interfaces or dynamic arrays). By moving only the reference to another memory block you’re not incrementing the reference count of that object, which results in a big Kaboom later on.

A simple example of this would be:

type
  { Declare a data type which uses a ref counted 
     object - String }
  TMyData = record
    FStr: String;
  end;

var
  A, B, C: TMyData;
begin
  { Initialize the initial string }
  A.FStr := 'Hello World!';

  { Move A to B (no compiler magic involved) }
  Move(A, B, SizeOf(A));

  { Move A to C (with compiler magic involved) }
  C := A;

  { Change the string stored in A and in C }
  A.FStr[1] := '_';
  C.FStr[1] := '+';

  WriteLn('A = ', A.FStr);
  WriteLn('B = ', B.FStr);
  WriteLn('C = ', C.FStr);

  ReadLn;
end.

The result is not surprising: A = _ello World!, B = +ello World!, C = +ello World!. While A and C have the correct values, B certainly doesn’t: it should still read Hello World!. This happens for an obvious reason: Move copied only the string pointer into B, so B refers to the same string data as A (and later C) without holding a reference count on it.

So what does this have to do with generics? Simple: while implementing generic collections you might be tempted to use the Move procedure to copy data from an internal array to some external one (the ToArray() method of a generic list class, for example). This is indeed a good idea, but only if your generic type is not a managed type! Moving an array of integers to another array of integers is safe, while moving an array of strings is not. This also means that you would have to fall back on the most generic way of moving, copying element by element, which slows the collection down needlessly when the type is Integer, for example.
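The element-by-element fallback mentioned above can be sketched roughly as follows. This is only an illustration: TElementCopier<T> and CopyElements are hypothetical names made up for this example, not part of the RTL or of the code in this post.

```pascal
program ElementCopyDemo;

{$APPTYPE CONSOLE}

type
  { Hypothetical helper, invented for this sketch }
  TElementCopier<T> = class
  public
    class procedure CopyElements(const Source: array of T;
      var Dest: array of T; const Count: Integer); static;
  end;

class procedure TElementCopier<T>.CopyElements(const Source: array of T;
  var Dest: array of T; const Count: Integer);
var
  I: Integer;
begin
  { Plain assignment lets the compiler maintain reference counts
    for managed types; no range checking is done here }
  for I := 0 to Count - 1 do
    Dest[I] := Source[I];
end;

var
  Src, Dst: array of string;
begin
  SetLength(Src, 2);
  Src[0] := 'alpha';
  Src[1] := 'beta';
  SetLength(Dst, Length(Src));

  { Safe for strings, but slower than one System.Move for integers }
  TElementCopier<string>.CopyElements(Src, Dst, Length(Src));

  WriteLn(Dst[0], ' ', Dst[1]);
end.
```

This is always safe, whatever T is, but it pays the per-element assignment cost even when T is a plain value type.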

Below is a class designed to be as fast as possible depending on the actual type argument of the generic class:

type
  { Our mover class }
  TArrayMover<T> = class sealed
  private
    FIsManagedType: Boolean;

  public
    constructor Create();
    procedure Move(var Source, Dest: array of T; 
      const SourceIndex, DestIndex, Count: Cardinal);
  end;

{ TArrayMover<T> }
constructor TArrayMover<T>.Create;
const
  { Declare unsafe types which need 
    element-by-element copy }
  UnsafeTypes = [tkLString, tkWString, tkUString, 
    tkVariant, tkArray, tkInterface, tkRecord, tkDynArray];
var
  PInfo: PTypeInfo;
begin
  { Find out the type of the element }
  PInfo := PTypeInfo(TypeInfo(T));

  { Managed kinds must be copied element by element }
  FIsManagedType := (PInfo <> nil) and
    (PInfo^.Kind in UnsafeTypes);
end;

procedure TArrayMover<T>.Move(var Source, Dest: array of T; 
  const SourceIndex, DestIndex, Count: Cardinal);
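var
  I: Cardinal;
begin
  { No range checking! }

  { Count is a Cardinal, so "Count - 1" would wrap around for 0 }
  if Count = 0 then
    Exit;

  if FIsManagedType then
  begin
    { Assignment keeps the reference counts correct }
    for I := 0 to Count - 1 do
      Dest[I + DestIndex] := Source[I + SourceIndex];
  end else
    System.Move(Source[SourceIndex], Dest[DestIndex], 
      Count * SizeOf(T));
end;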

To use it, you first create an instance of TArrayMover<T>. In its constructor it decides whether the data being operated on is unsafe to copy directly. The Move method then uses that decision to select the appropriate copy method.
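A usage sketch might look like the one below. It assumes the TArrayMover<T> declaration and implementation from above are pasted where indicated (the program will not compile without them); the program name and variable names are made up for this example.

```pascal
program MoverDemo;

{$APPTYPE CONSOLE}

uses
  TypInfo; { PTypeInfo and the tkXXX kinds used by TArrayMover<T> }

{ Assumption: the TArrayMover<T> declaration and implementation
  shown above go here }

var
  StrMover: TArrayMover<string>;
  IntMover: TArrayMover<Integer>;
  StrSrc, StrDst: array of string;
  IntSrc, IntDst: array of Integer;
begin
  SetLength(StrSrc, 2);
  StrSrc[0] := 'first';
  StrSrc[1] := 'second';
  SetLength(StrDst, Length(StrSrc));

  SetLength(IntSrc, 2);
  IntSrc[0] := 10;
  IntSrc[1] := 20;
  SetLength(IntDst, Length(IntSrc));

  { string is a managed type: copied element by element }
  StrMover := TArrayMover<string>.Create;
  try
    StrMover.Move(StrSrc, StrDst, 0, 0, Length(StrSrc));
  finally
    StrMover.Free;
  end;

  { Integer is not managed: copied with a single System.Move }
  IntMover := TArrayMover<Integer>.Create;
  try
    IntMover.Move(IntSrc, IntDst, 0, 0, Length(IntSrc));
  finally
    IntMover.Free;
  end;

  WriteLn(StrDst[0], ' ', StrDst[1]);
  WriteLn(IntDst[0], ' ', IntDst[1]);
end.
```

In a real collection you would probably create the mover once, in the collection's constructor, and reuse it, so the type check is not repeated for every copy.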

P.S. Have to fix my code now 🙂

6 Comments

  1. Correction, looking again, the values are exactly what they should be.

    The sizeOf(A) is probably 4, the size of a pointer.

    A is a pointer , you move that pointer to B.
    Hello World ref count 1

    you assign a ref counted version to C.
    Hello World ref count 2

    you change A which creates a new string with _ in front.
    Hello World ref count 1
    _ello World ref count 1

    you change C, which changes the string it is pointing to by putting a + in front.
    +ello World ref count 1 (originally the Hello World)

    Since B was not reference counted and was pointing to the original Hello World, it has the same value as C.

    So the results are exactly as you coded it.

  2. I suspect the larger lesson here is to never use true low level calls unless you have a true low level understanding of how things work.

    And ya, move is a very dangerous function because it does work so low level.

    Btw, nothing in the output is technically ‘wrong’ or surprising. Copy on write semantics force A to generate a new instance when you modify it and the reference count is higher than 1 (which it is thanks to the copy in C). That leaves B & C pointing at the original string which now has a reference count of 1. When you modify C’s string data, it does not have to reallocate so it just modifies the data directly. Since B points to the same string as C, the result is inevitable.

  3. The whole point of the example was to show that by using Move you’re not going to get the expected result: B = “Hello World!” (as if you had assigned A to B directly).

    Wanted to make an example in which an AV would actually show but that would have been a long one, so I decided to go with the simplest version.

  4. Can’t read the code because it displays so tiny in Firefox.

    But you’ve raised an interesting point: in a managed environment, a lot of low-level stuff may be off-limits.

    Another reason for no GC in native Delphi.
