With new users purchasing Delphi every single day, it’s not uncommon for me to meet users that are new to the Object Pascal language. One such new user contacted me recently with questions about reading and writing structured data to files on disk.

In actual fact, this customer was quite specific about the file formats of interest.

- Flat files of fixed length records with fixed length fields.
- Variable length fields / records, where the file contains the size of a field and then its data.
- Character delimited files such as CSV (comma separated values).

*A warning to advanced readers, this post is not for you.

None of these file formats is especially common anymore. Modern applications tend to use a known standard such as XML or JSON, for which classes are provided with Delphi. I can still see the value in using the older file types however, for example for interoperability with older systems. There are also a few lessons to be learnt about file handling which have merit. So let’s take a look at a solution to each of these file types.

Flat files of fixed length records.

In order to answer this question, I turned to Delphi Basics: http://www.delphibasics.co.uk/

Delphi Basics is an excellent resource for users new to Object Pascal. It functions as a great reference to the fundamental syntax features and available units and classes. While it is not a tutorial website, I would recommend every new user bookmark it in their browser!

This article http://www.delphibasics.co.uk/Article.asp?Name=Files contains a section entitled “Reading and writing to typed binary files” which contains an example of working with flat files of fixed length records. I modified the sample slightly to run in the command-line:

    program structuredbinary;

    {$APPTYPE CONSOLE}

    {$R *.res}

    uses
      System.SysUtils;

    type
      TCustomer = record
        name: string[20];
        age: Integer;
        male: Boolean;
      end;

    var
      myFile: File of TCustomer; // A file of customer records
      customer: TCustomer;       // A customer record variable

    begin
      // Try to open the Test.cus binary file for writing to
      AssignFile(myFile, 'Test.cus');
      ReWrite(myFile);

      // Write a couple of customer records to the file
      customer.name := 'Fred Bloggs';
      customer.age := 21;
      customer.male := true;
      Write(myFile, customer);

      customer.name := 'Jane Turner';
      customer.age := 45;
      customer.male := false;
      Write(myFile, customer);

      // Close the file
      CloseFile(myFile);

      // Reopen the file in read only mode
      FileMode := fmOpenRead;
      Reset(myFile);

      // Display the file contents
      while not Eof(myFile) do
      begin
        Read(myFile, customer);
        if customer.male then
        begin
          Writeln('Man with name ' + customer.name + ' is ' + IntToStr(customer.age));
        end
        else
        begin
          Writeln('Lady with name ' + customer.name + ' is ' + IntToStr(customer.age));
        end;
      end;

      // Close the file for the last time
      CloseFile(myFile);
      Readln;
    end.

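In this example you can see that the file ‘myFile’ uses the datatype ‘File of TCustomer’ where ‘TCustomer’ is a record with a fixed number of bytes. The ‘name’ field is declared as ‘string[20]’, which is a short string: it always occupies twenty-one bytes, one length byte followed by twenty single-byte characters. Short strings are not UTF-16, even in modern Delphi. This is followed by a 32-bit integer for the field ‘age’ and a single byte for the boolean field ‘male’ to represent gender. With the compiler’s default alignment, padding bytes may be added between and after the fields, so each record on disk can be larger than the sum of its fields.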
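If you want to check the field and record sizes on your own compiler, rather than take them on trust, a small sketch like the following asks the compiler directly with SizeOf(). The ‘TPackedCustomer’ type is my own hypothetical addition for comparison, it is not part of the original sample; ‘packed’ tells the compiler not to insert padding.

```pascal
program recordsize;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Same fields as the TCustomer record above, default alignment.
  TCustomer = record
    name: string[20];
    age: Integer;
    male: Boolean;
  end;

  // Hypothetical variant for comparison only: 'packed' removes padding.
  TPackedCustomer = packed record
    name: string[20];
    age: Integer;
    male: Boolean;
  end;

var
  c: TCustomer;

begin
  Writeln('name field      : ', SizeOf(c.name), ' bytes');
  Writeln('age field       : ', SizeOf(c.age), ' bytes');
  Writeln('male field      : ', SizeOf(c.male), ' bytes');
  // The default-aligned size depends on compiler alignment settings.
  Writeln('TCustomer       : ', SizeOf(TCustomer), ' bytes');
  // Packed size is simply the sum of the fields: 21 + 4 + 1.
  Writeln('TPackedCustomer : ', SizeOf(TPackedCustomer), ' bytes');
end.
```

Whichever layout you choose, the program that writes the file and the program that reads it must agree on it, otherwise the records will not line up.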

When using the ‘File of…’ data types, the compiler will assume you are referring to a flat binary file containing nothing but repetitions of the data type which you specify. This is convenient, and particularly useful for records of fixed length which are to be read sequentially.
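Because every record has the same size, a typed file can also be read out of order. As a rough sketch (the file name ‘Seek.cus’ and the helpers ‘WriteSampleFile’ and ‘ReadCustomerAt’ are names I made up for illustration), the standard Seek() procedure moves to a record by its zero-based index:

```pascal
program seekrecord;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  TCustomer = record
    name: string[20];
    age: Integer;
    male: Boolean;
  end;

  TCustomerFile = File of TCustomer;

// Hypothetical helper: writes Count numbered sample records to FileName.
procedure WriteSampleFile(const FileName: string; Count: Integer);
var
  F: TCustomerFile;
  C: TCustomer;
  I: Integer;
begin
  AssignFile(F, FileName);
  Rewrite(F);
  try
    for I := 0 to Count - 1 do
    begin
      C.name := ShortString('Customer ' + IntToStr(I));
      C.age := 20 + I;
      C.male := Odd(I);
      Write(F, C);
    end;
  finally
    CloseFile(F);
  end;
end;

// Hypothetical helper: reads the record at zero-based position Index.
function ReadCustomerAt(const FileName: string; Index: Integer): TCustomer;
var
  F: TCustomerFile;
begin
  AssignFile(F, FileName);
  Reset(F);
  try
    Seek(F, Index); // for typed files the position counts records, not bytes
    Read(F, Result);
  finally
    CloseFile(F);
  end;
end;

var
  C: TCustomer;

begin
  WriteSampleFile('Seek.cus', 5);
  C := ReadCustomerAt('Seek.cus', 3);
  Writeln(string(C.name), ' is ', C.age);
end.
```

There is no bounds checking in this sketch; in real code you would compare the index against FileSize(), which for a typed file returns the number of records.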

Files with variable length fields.

The second type of file of interest is a file with variable length fields. This gives us an opportunity to look at a more modern method of storing data to files, using streams. I took the example from the first file type above, and rewrote it as follows…

    program structuredbinarystream;

    {$APPTYPE CONSOLE}

    {$R *.res}

    uses
      classes, System.SysUtils;

    type
      TCustomer = record
        name: string;
        age: Integer;
        male: Boolean;
      end;

    procedure WriteCustomerToStream(customer: TCustomer; FS: TStream);
    var
      strLength: integer;
      idx: integer;
      ch: char;
    begin
      // get the length of the name field.
      strLength := Length(customer.name);
      // write the length
      FS.Write(strLength, sizeof(strLength));
      // write the string a character at a time
      for idx := 1 to strLength do
      begin
        ch := customer.name[idx];
        FS.Write(ch, sizeof(ch));
      end;
      // write the age and gender
      FS.Write(customer.age, sizeof(customer.age));
      FS.Write(customer.male, sizeof(customer.male));
    end;

    // Takes a TStream (not a TFileStream) so that any stream class can be passed in.
    procedure ReadCustomerFromStream(var customer: TCustomer; FS: TStream);
    var
      strLength: integer;
      idx: integer;
      ch: char;
    begin
      // read length of name field.
      FS.Read(strLength, sizeof(strLength));
      // reading back string a character at a time...
      customer.name := '';
      for idx := 1 to strLength do
      begin
        FS.Read(ch, sizeof(ch));
        customer.name := customer.name + ch;
      end;
      // reading back age and gender.
      FS.Read(customer.age, sizeof(customer.age));
      FS.Read(customer.male, sizeof(customer.male));
    end;

    var
      FS: TFileStream;
      customer: TCustomer; // A customer record variable

    begin
      // Try to open the Test.cus binary file for writing to
      FS := TFileStream.Create('Test.cus', fmCreate);
      try
        // Write a couple of customer records to the file
        customer.name := 'Fred Bloggs';
        customer.age := 21;
        customer.male := true;
        WriteCustomerToStream(customer, FS);

        customer.name := 'Jane Turner';
        customer.age := 45;
        customer.male := false;
        WriteCustomerToStream(customer, FS);
      finally
        FS.Free;
      end;

      // Reopen the file in read only mode
      FS := TFileStream.Create('Test.cus', fmOpenRead);
      try
        while FS.Position < FS.Size do
        begin
          ReadCustomerFromStream(customer, FS);
          if customer.male then
          begin
            Writeln('Man with name ' + customer.name + ' is ' + IntToStr(customer.age));
          end
          else
          begin
            Writeln('Lady with name ' + customer.name + ' is ' + IntToStr(customer.age));
          end;
        end;
      finally
        FS.Free;
      end;

      // key to finish
      Readln;
    end.

In this program I’m using the ‘TFileStream’ class to write to, and then read from the file sequentially. The ‘TCustomer’ data type now has a variable length string field for ‘name’. I’ve added two procedures, one for writing a ‘TCustomer’ record to the file, and another to read a ‘TCustomer’ from a file. In each of them, the name field is handled using a loop to read or write one character (two bytes) at a time.

In the WriteCustomerToStream() procedure, I first measure the length of the string (in characters) and write that value to the stream, followed immediately by each individual character. In ReadCustomerFromStream() I am reading the number of characters back from the stream first, and then immediately loading that number of characters from the stream. This is how we allow for the varying length of data for this field.
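The character-at-a-time loop is easy to follow, but the same length-then-data layout can be written in a single call per string. The following is only a sketch of that idea: the helper names ‘WriteStringToStream’, ‘ReadStringFromStream’ and ‘RoundTripThroughFile’, and the file name ‘Name.bin’, are my own inventions. It uses WriteBuffer() and ReadBuffer(), which raise an exception if the requested number of bytes cannot be transferred, rather than silently returning a short count as Write() and Read() do.

```pascal
program bulkstring;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: writes the length, then all characters in one call.
procedure WriteStringToStream(const S: string; Stream: TStream);
var
  Len: Integer;
begin
  Len := Length(S);
  Stream.WriteBuffer(Len, SizeOf(Len));
  if Len > 0 then
    Stream.WriteBuffer(Pointer(S)^, Len * SizeOf(Char));
end;

// Hypothetical helper: reads the length, sizes the string, then fills it.
function ReadStringFromStream(Stream: TStream): string;
var
  Len: Integer;
begin
  Stream.ReadBuffer(Len, SizeOf(Len));
  if Len < 0 then
    raise EReadError.Create('Negative string length in stream');
  SetLength(Result, Len);
  if Len > 0 then
    Stream.ReadBuffer(Pointer(Result)^, Len * SizeOf(Char));
end;

// Hypothetical helper: writes S to a file and reads it straight back.
function RoundTripThroughFile(const S, FileName: string): string;
var
  FS: TFileStream;
begin
  FS := TFileStream.Create(FileName, fmCreate);
  try
    WriteStringToStream(S, FS);
  finally
    FS.Free;
  end;

  FS := TFileStream.Create(FileName, fmOpenRead);
  try
    Result := ReadStringFromStream(FS);
  finally
    FS.Free;
  end;
end;

begin
  Writeln(RoundTripThroughFile('Fred Bloggs', 'Name.bin'));
end.
```

The bytes on disk are the same as those produced by the loop version, so files written by one can be read by the other, assuming both run on a compiler where Char is two bytes.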

Using streams to read and write data is a good modern way to handle reading and writing files. Here are some of the reasons why you *should* use streams:

- TFileStream is descended from TStream. In my example code above you’ll notice that the procedures WriteCustomerToStream() and ReadCustomerFromStream() take a TStream parameter, not a TFileStream. This allows any descendant of TStream to be used. Instead of writing data to a file, what if you wanted to write it to a database blob field using a TBlobStream class? Well, because those procedures work on the base class TStream, you can simply pass your blob stream class to them. Similarly you might send the data over a network using a network stream class.

- The TFileStream class abstracts you from the underlying operating system calls for reading and writing files. This code is therefore portable to other platforms without change (provided the correct implementation of TFileStream is available for that platform).

- In the example the TCustomer record could have been a class, and the WriteCustomerToStream() and ReadCustomerFromStream() procedures could have been methods of that class. In fact, renaming these to SaveToStream() and LoadFromStream() respectively, and then adding these methods to a base class, permits some great structured data nesting options. A similar system is used by the Delphi IDE to save forms to and load forms from files, in processes named ‘serialization’ (structured data to stream) and ‘deserialization’ (structured data from stream).
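To make the class idea a little more concrete, here is a minimal sketch of a TCustomer class with SaveToStream() and LoadFromStream() methods, exercised with a TMemoryStream instead of a file. Two things here are my own assumptions rather than anything from the samples above: the name is stored as UTF-8 bytes (so the layout differs from the earlier file format), and the ‘CloneViaStream’ helper exists only to demonstrate a round trip.

```pascal
program customerclass;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

type
  TCustomer = class
  public
    Name: string;
    Age: Integer;
    Male: Boolean;
    procedure SaveToStream(Stream: TStream);
    procedure LoadFromStream(Stream: TStream);
  end;

procedure TCustomer.SaveToStream(Stream: TStream);
var
  Bytes: TBytes;
  Len: Integer;
begin
  // Assumption: the name is stored as a byte count followed by UTF-8 bytes.
  Bytes := TEncoding.UTF8.GetBytes(Name);
  Len := Length(Bytes);
  Stream.WriteBuffer(Len, SizeOf(Len));
  if Len > 0 then
    Stream.WriteBuffer(Bytes[0], Len);
  Stream.WriteBuffer(Age, SizeOf(Age));
  Stream.WriteBuffer(Male, SizeOf(Male));
end;

procedure TCustomer.LoadFromStream(Stream: TStream);
var
  Bytes: TBytes;
  Len: Integer;
begin
  Stream.ReadBuffer(Len, SizeOf(Len));
  if Len < 0 then
    raise EReadError.Create('Negative name length in stream');
  SetLength(Bytes, Len);
  if Len > 0 then
    Stream.ReadBuffer(Bytes[0], Len);
  Name := TEncoding.UTF8.GetString(Bytes);
  Stream.ReadBuffer(Age, SizeOf(Age));
  Stream.ReadBuffer(Male, SizeOf(Male));
end;

// Hypothetical helper: copies a customer by saving to and loading from memory.
function CloneViaStream(Source: TCustomer): TCustomer;
var
  MS: TMemoryStream;
begin
  Result := TCustomer.Create;
  MS := TMemoryStream.Create;
  try
    Source.SaveToStream(MS);
    MS.Position := 0; // rewind before reading back
    Result.LoadFromStream(MS);
  finally
    MS.Free;
  end;
end;

var
  Original, Copy: TCustomer;

begin
  Original := TCustomer.Create;
  try
    Original.Name := 'Jane Turner';
    Original.Age := 45;
    Original.Male := False;
    Copy := CloneViaStream(Original);
    try
      Writeln(Copy.Name, ' is ', Copy.Age);
    finally
      Copy.Free;
    end;
  finally
    Original.Free;
  end;
end.
```

Nothing in the two methods knows or cares that the stream lives in memory. Passing a TFileStream instead would write the same bytes to disk.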

CSV files

Handling CSV files correctly should be done using streams as in the above example, combined with a simple parser to ensure the CSV format is adhered to. For example, many CSV formats permit commas inside content data under the provision that the content data is surrounded by quotation characters. Some intelligence in the form of a parser is necessary to handle such situations. Having already provided the streaming example above however, parsing the data structure really is another exercise. So for this file I provided the following ‘hack’ method (of course, explaining that it is such)…
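To give a feel for what the ‘simple parser’ mentioned above might involve, here is a rough sketch of a function that splits a single line into fields while respecting quotation characters. The function name ‘SplitCsvLine’ is hypothetical, and the rules are assumptions on my part: fields may be wrapped in double quotes, a doubled quote inside a quoted field stands for one literal quote, and a field never contains a line break. Real CSV dialects vary, so treat this as a starting point only.

```pascal
program csvsplit;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: splits one CSV line into its fields.
function SplitCsvLine(const Line: string): TArray<string>;
var
  Fields: TArray<string>;
  Field: string;
  InQuotes: Boolean;
  I: Integer;

  // Appends the field collected so far and starts a new one.
  procedure FinishField;
  begin
    SetLength(Fields, Length(Fields) + 1);
    Fields[High(Fields)] := Field;
    Field := '';
  end;

begin
  Field := '';
  InQuotes := False;
  I := 1;
  while I <= Length(Line) do
  begin
    if InQuotes then
    begin
      if Line[I] = '"' then
      begin
        if (I < Length(Line)) and (Line[I + 1] = '"') then
        begin
          Field := Field + '"'; // doubled quote means one literal quote
          Inc(I);
        end
        else
          InQuotes := False;    // closing quote
      end
      else
        Field := Field + Line[I]; // commas in here are ordinary text
    end
    else if Line[I] = '"' then
      InQuotes := True
    else if Line[I] = ',' then
      FinishField
    else
      Field := Field + Line[I];
    Inc(I);
  end;
  FinishField; // the last field has no trailing comma
  Result := Fields;
end;

var
  Item: string;

begin
  for Item in SplitCsvLine('Fred,"Bloggs, Esq.",21') do
    Writeln('[', Item, ']');
end.
```

The comma inside the quoted second field is kept as part of the field instead of being treated as a separator, which is exactly the situation a naive split on commas gets wrong.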

    program stringlists;

    {$APPTYPE CONSOLE}

    {$R *.res}

    uses
      classes, System.SysUtils;

    const
      CRLF = #13 + #10; //- CR and LF characters, ASCII 13, 10 in decimal
      TAB = #09;        // TAB character

      // content of the file...
      cFileContent = 'a,b,c' + CRLF +
                     '1,2,3' + CRLF +
                     '4,5,6' + CRLF +
                     '7,8,9' + CRLF;

    var
      FileContent: TStringList;
      Fields: TStringList;
      idx: longint;
      idy: longint;

    begin
      // First we'll save the CSV content from cFileContent into a file 'testfile.csv'
      FileContent := TStringList.Create;
      try
        FileContent.Text := cFileContent;
        FileContent.SaveToFile('testfile.csv');
      finally
        FileContent.Free;
      end;

      // Now load the file back into memory...
      FileContent := TStringList.Create;
      try
        FileContent.LoadFromFile('testfile.csv');
        // Now lets parse the field content of each line of the file...
        for idx := 0 to pred(FileContent.Count) do
        begin
          Fields := TStringList.Create;
          try
            Fields.Delimiter := ',';                       // <--- We're using a comma to delimit fields.
            Fields.DelimitedText := FileContent.Strings[idx]; // <-- parse one line of file.
            for idy := 0 to pred(Fields.Count) do
            begin
              Write(Fields[idy]);
              // if it's not the last field, add a comma..
              if idy < pred(Fields.Count) then
              begin
                Write(TAB + ',' + TAB);
              end;
            end;
            Writeln; // new line
          finally
            Fields.Free;
          end;
        end;
      finally
        FileContent.Free;
      end;

      // key to finish
      Readln;
    end.

This method really isn’t good code!

What I’ve done in this program is to use the properties and methods of the TStringList class to handle saving data to a file, and loading it back. I’ve also used the TStringList class to parse each record using the ‘DelimitedText’ property, which will separate the string by a ‘Delimiter’, in this case a comma. The reason why I call this bad code is that it is not a real CSV parser. Unless ‘StrictDelimiter’ is set to true, ‘DelimitedText’ treats spaces as delimiters as well as the comma, and because the file is split into lines before any field is parsed, a quoted field that contains a line break will be broken apart. That being said, if you have a very simple CSV format file such as the one used in this sample, this quick-trick method can save you some time doing the parsing yourself.
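One behaviour of ‘DelimitedText’ that regularly catches people out is how it treats spaces. The sketch below (the ‘CountFields’ helper is a name of my own, purely for illustration) counts how many fields TStringList finds in a line with the ‘StrictDelimiter’ property switched off and on.

```pascal
program strictdelimiter;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: counts the fields TStringList finds in Line.
function CountFields(const Line: string; UseStrict: Boolean): Integer;
var
  Fields: TStringList;
begin
  Fields := TStringList.Create;
  try
    Fields.Delimiter := ',';
    Fields.QuoteChar := '"';
    Fields.StrictDelimiter := UseStrict;
    Fields.DelimitedText := Line;
    Result := Fields.Count;
  finally
    Fields.Free;
  end;
end;

begin
  // Without StrictDelimiter the space in the name also splits the line.
  Writeln('non-strict: ', CountFields('Fred Bloggs,21', False));
  // With StrictDelimiter only the comma separates fields.
  Writeln('strict    : ', CountFields('Fred Bloggs,21', True));
  // A comma inside a quoted field is kept as part of that field.
  Writeln('quoted    : ', CountFields('"Bloggs, Fred",21', True));
end.
```

With the property off, ‘Fred Bloggs,21’ comes back as three fields because the space is treated as a separator too; with it on you get the two fields you would expect. If you do use the quick-trick method, setting ‘StrictDelimiter’ to true is usually what you want.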

For beginners to Object Pascal, the above samples should work if you copy and paste the code into a new “Command-Line” project. I didn’t go into every detail, and leave it as an exercise for you to try out, and to study the examples. *hint* Be sure to check out Delphi Basics as a reference! http://www.delphibasics.co.uk/

Thanks for reading!