Enumerating over a directory structure

Me again, and again with enumeration techniques. In this post I will try to cover a very common problem that all programmers face at one time or another: enumerating all files recursively in a directory.

Yesterday I had to do it again, and again following the standard FindFirst … FindNext and FindClose pattern. So I decided to make my life easier and use enumerators for that. Behold the results:

var
  S: String;
begin
  for S in TDirectory.Entries('I:', true) do
    WriteLn(S);
end;

That’s all you have to do to traverse the directory structure for the I:\ drive — simple and clean.
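For comparison, this is roughly what the classic FindFirst … FindNext … FindClose pattern that the loop above replaces looks like. It is a minimal sketch and not code from the unit: the name WalkTree, the choice to print full paths, and the returned entry count are my own assumptions for illustration.

```pascal
program ClassicWalk;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: prints every entry below Root and returns how many
// entries were printed. Uses plain recursion, one TSearchRec per level.
function WalkTree(const Root: string): Integer;
var
  SR: TSearchRec;
  Dir: string;
begin
  Result := 0;
  Dir := IncludeTrailingPathDelimiter(Root);
  // Nothing to enumerate if the directory is missing or unreadable.
  if FindFirst(Dir + '*', faAnyFile, SR) <> 0 then
    Exit;
  try
    repeat
      // Skip the "." and ".." pseudo-entries, otherwise we would loop forever.
      if (SR.Name <> '.') and (SR.Name <> '..') then
      begin
        WriteLn(Dir + SR.Name);
        Inc(Result);
        // Descend into subdirectories (links are not treated specially here).
        if (SR.Attr and faDirectory) <> 0 then
          Inc(Result, WalkTree(Dir + SR.Name));
      end;
    until FindNext(SR) <> 0;
  finally
    // The search handle must be released once FindFirst has succeeded.
    FindClose(SR);
  end;
end;

begin
  WriteLn(WalkTree('I:\'), ' entries');
end.
```

Every caller has to repeat the try/finally bookkeeping and the "." / ".." filtering, which is exactly the noise the enumerator hides.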

So how does this work and what are the advantages:

  • The method Entries is a static method of the TDirectory structure. There are two more methods: Files, which returns only files, and Directories, which returns only directories.
  • The exposed static methods return an IEnumerable<String> interface, which exposes the GetEnumerator() method. This is an important aspect of the implementation, since exposing an interface lets you pass the lifetime management of the enumerable object to the compiler.
  • The for .. in loop then extracts an IEnumerator<String> interface and starts iterating over the directory tree.
  • The TDirectory.TEnumerator object does not use recursion internally to traverse the tree. It stores the TSearchRec at each level in a TStack<> instance (well it’s kinda like recursion).
  • You can “break” off the loop at any moment. Simple and easy.
  • It executes more instructions per iteration, but I think that’s a manageable trade-off for the ease of use.
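The stack-based traversal mentioned in the list can be sketched roughly as follows. This is not the code of TDirectory.TEnumerator, only an illustration of the idea under my own assumptions: the TLevel record and the WalkTreeIterative function are hypothetical names, the sketch prints full paths and returns a count, and it does not close the suspended search handles if an exception escapes the loop.

```pascal
program StackWalk;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Generics.Collections;

type
  // Hypothetical helper record: one suspended directory scan.
  TLevel = record
    Dir: string;      // directory being scanned, with trailing delimiter
    Rec: TSearchRec;  // open search positioned on the current entry
  end;

// Prints every entry below Root without recursion; returns the entry count.
function WalkTreeIterative(const Root: string): Integer;
var
  Stack: TStack<TLevel>;
  Level, Child: TLevel;
  HasEntry, Done: Boolean;
begin
  Result := 0;
  Level.Dir := IncludeTrailingPathDelimiter(Root);
  if FindFirst(Level.Dir + '*', faAnyFile, Level.Rec) <> 0 then
    Exit;
  Stack := TStack<TLevel>.Create;
  try
    HasEntry := True;
    repeat
      while HasEntry do
      begin
        if (Level.Rec.Name <> '.') and (Level.Rec.Name <> '..') then
        begin
          WriteLn(Level.Dir + Level.Rec.Name);
          Inc(Result);
          if (Level.Rec.Attr and faDirectory) <> 0 then
          begin
            Child.Dir := Level.Dir + Level.Rec.Name + PathDelim;
            if FindFirst(Child.Dir + '*', faAnyFile, Child.Rec) = 0 then
            begin
              // Suspend the parent scan and continue with the child's
              // first entry; the parent resumes when the child is done.
              Stack.Push(Level);
              Level := Child;
              Continue;
            end;
          end;
        end;
        HasEntry := FindNext(Level.Rec) = 0;
      end;
      // Current level is exhausted: release it and resume the parent, if any.
      FindClose(Level.Rec);
      Done := Stack.Count = 0;
      if not Done then
      begin
        Level := Stack.Pop;
        HasEntry := FindNext(Level.Rec) = 0;
      end;
    until Done;
  finally
    Stack.Free;
  end;
end;

begin
  WriteLn(WalkTreeIterative('I:\'), ' entries');
end.
```

Because all the state lives in the stack rather than in nested calls, the same logic can be split into a MoveNext-style step, which is what makes it usable behind a for .. in loop that can be broken out of at any point.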

Again, the unit can be found here: [download#40].

Have fun!

13 Comments

  1. > It executes more instruction per iteration but I think it’s a manageable trade-off for its ease of use.

    I bet it’s not even measurable since the file system is the real bottleneck.

    Great stuff, thanks! I guess I’ll find a use for that in my current project 🙂

  2. I have a similar enumerator myself, but I expose the TSearchRec itself directly.

    I find that most of the time, it seriously matters to me whether something is a directory or not, what size it is, etc.

    Of course, mine does not search sub trees. It is an interesting idea I might want to add tho.

  3. This is one area that Linq like features would make even better in my mind.

    In fact, the directory tree diving would be a major benefit there as well.

    From SearchTree(‘I:\’,True) Select Filename,Size Where ((Attribute and faDirectory)=0) And Size>1024*10 Into MyFileSet Order By Filename;

    Would be really nice to be able to type in.

    I do keep hoping with all the new emphasis on R&D at Embarawhastzit that they’ll get into Linq sooner than later.

  4. There are many technical difficulties in implementing LINQ in Delphi. I would say a lot! It would require a complete redesign of the compiler to generate all the AST stuff required by the LINQ-to-SQL or whatever.

  5. @Alex -> That is just syntactic glue. A complete redesign of the compiler is just FUD. Once you find the keyword that indicates a LINQ syntax statement, you just run down a different parse tree.

    The harder part is the underlying libraries that truly make up LINQ. The syntax just translates to a series of library classes that use lambda expressions, type inference and nullable types.

    The libraries need to be written (and/or possibly licensed), and type inference added to the language, which would be a plus in many ways (yes, some will abuse it, but that is the case for any technology feature), as would nullable types (variants are nice, but slow).

    Sorry, I appear to have hijacked your post to a different topic. I just really like what Linq has to offer and directory searching has always been one of my ideal cases for arguing that.

  6. LINQ in C# is not only syntactic glue. Not to mention you need anonymous types, nullables, lambdas (both in code and in the generated AST in the EXE). And not to forget you need to clean up the intermediary resources. While I was able to use interfaces a lot in DeHL to obtain this chaining with auto cleanup, there may be problems that could not be solved this way.

    Not saying it’s impossible but is it really worth it? I would settle for proper lambdas and optionally anonymous types … the rest can be done through library calls.

  7. Hello, Alex!
    Thank you for your wonderful work!
    Regarding FSEnum, I would like to suggest replacing
    function TDirectory.TEnumerator.GetCurrent: String;
    begin
      Result := FCurrent;
    end;
    with
    function TDirectory.TEnumerator.GetCurrent: String;
    begin
      Result := FPath + FCurrent;
    end;
    to return the full path.

    Sincerely, Dmitry.
    P.S. Sorry for the bad english, because I used Google Translate.

  8. Hello, Alex!

    How do I get a list of all files with the extension xml?
    When the code
    for S in TDirectory.Entries('C:', '*.xml', True) do
    is called, nothing is returned, because no folder named '*.xml' exists.

    Is this a bug?

    Maybe use two sets of masks, one for folders and one for files?

    P.S. Again, sorry.

  9. @Dmitry, yes, if you filter by *.xml you will miss some files. I will update the code.
    As for the full path, I don’t think it is required since you can build it in the loop:

    for S in TDirectory.Entries('C:', '*.xml', True) do
      FileName := 'C:' + S;

    … this way the function is more generic in its functionality. Returning the full path may not be welcome by everybody. But I’ll think about it.

    @Ted: I can’t, it would break the actual beauty, since you would have to enumerate using some structure:

    var
      a: TEnumStruct;
    begin
      for a in TDirectory.Entries('c:') do
        WriteLn(a.FileSize);
    end;

    … which is not a bad idea but requires another special function + enumerator.
