Web applications often prompt the user to select a file, typically to upload it to a server. Unless the web application makes use of a plugin, file selection occurs through an HTML input element of the form <input type="file"/>. Firefox 3.6 now supports much of the W3C File API, which specifies the ability to asynchronously read the selected file into memory and to perform operations on the file data within the web application (for example, to display a thumbnail preview of an image before it is uploaded, to look for ID3 tags within an MP3 file, or to look for EXIF data in JPEG files, all on the client side). This is a new API, and it replaces the file API that was introduced in Firefox 3.

It is important to note that even before the W3C File API draft (which only became a Working Draft in November 2009), Firefox 3 and later provided the ability to read files into memory synchronously. That capability should be considered deprecated in favor of the asynchronous File API implemented in Firefox 3.6. The deprecated API allowed you to access a file synchronously:

// After obtaining a handle to a file,
// access the file data
var dataURL = file.getAsDataURL();
img.src = dataURL;

While Firefox 3.6 will continue to support code usage of the sort above, it should be considered deprecated since it reads files synchronously on the main thread. For large files, this could result in blocking on the result of the read, which isn’t desirable. Moreover, the file object itself provides a method to read from it, rather than having a separate reader object. These considerations informed the technical direction of the new File API in Firefox 3.6 (and the direction of the specification). The rest of this article is about the newly introduced File API.
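For comparison, the deprecated snippet above might be rewritten against the asynchronous API roughly as follows. This is only a sketch: loadAsDataURL and its ReaderCtor parameter are hypothetical names, the latter added purely so the function can be exercised outside a browser. In a page you would omit it and rely on the built-in FileReader.

```javascript
// Hypothetical helper: read a file as a Data URL without blocking the main thread.
// ReaderCtor is an assumption added for illustration; when omitted, the
// browser's FileReader is used.
function loadAsDataURL(file, onDone, onFail, ReaderCtor) {
  var Reader = ReaderCtor || FileReader;
  var reader = new Reader();
  reader.onload = function () {
    // reader.result holds the Data URL once the read has completed
    onDone(reader.result);
  };
  reader.onerror = function () {
    if (onFail) {
      onFail(reader.error);
    }
  };
  reader.readAsDataURL(file);
  return reader;
}

// In a page (img is assumed to be an <img> element):
// loadAsDataURL(file, function (dataURL) { img.src = dataURL; });
```

The call returns immediately; the image source is only assigned later, when the load event fires.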

Accessing file selections

Firefox 3.6 supports multiple file selections on an input element, and returns all the files selected using the FileList interface. Previous versions of Firefox only supported one selection of a file using the input element. Additionally, the FileList interface is also exposed to the HTML5 Drag and Drop API as a property of the DataTransfer interface. Users can drag and drop multiple files to a drop target within a web page as well.

The following HTML spawns the standard file picker, with which you can select multiple files:

<input id="inputFiles" type="file" multiple />

Note that if you don’t use the multiple attribute, you only enable single file selection.

You can work with all the selected files obtained either through the file picker (using the input element) or through the DataTransfer object by iterating through the FileList:

var files = document.getElementById("inputFiles").files;
// or, for a drag event e:
// var dt = e.dataTransfer; var files = dt.files
for (var i = 0; i < files.length; i++) {
  var file = files[i];
  handleFile(file);
}
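The two sources of a FileList mentioned above (the input element and the DataTransfer object) can be handled by one small function. The sketch below assumes a hypothetical helper named filesFromEvent; the dropZone element in the usage comment is likewise an assumption, not something the article defines.

```javascript
// Hypothetical helper: return the selected files as a plain array, whether the
// event came from a drop (evt.dataTransfer.files) or from a change on an
// <input type="file"> element (evt.target.files).
function filesFromEvent(evt) {
  var list = null;
  if (evt.dataTransfer && evt.dataTransfer.files) {
    list = evt.dataTransfer.files;
  } else if (evt.target && evt.target.files) {
    list = evt.target.files;
  }
  var result = [];
  if (list) {
    // A FileList is array-like (length plus indexed access), not a true Array
    for (var i = 0; i < list.length; i++) {
      result.push(list[i]);
    }
  }
  return result;
}

// In a page (dropZone is an assumed element):
// dropZone.addEventListener("dragover", function (e) { e.preventDefault(); }, false);
// dropZone.addEventListener("drop", function (e) {
//   e.preventDefault();
//   filesFromEvent(e).forEach(function (file) { handleFile(file); });
// }, false);
```

Copying the FileList into an array is a design choice that makes array methods such as forEach available; iterating the FileList directly with a for loop works just as well.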

Properties of files

Once you obtain a reference to an individually selected file from a FileList, you get a File object, which has name, type, and size properties. Continuing with the code snippet above:

// maxSize and dataGrid are assumed to be defined elsewhere in the page
function handleFile(file) {
  // RegExp for JPEG mime type (the slash must be escaped)
  var imageType = /image\/jpeg/;

  // Check if match
  if (!file.type.match(imageType)) {
    return false;
  }

  // Check if the picture exceeds set limit
  if (file.size > maxSize) {
    alert("Choose a smaller photo!");
    return false;
  }

  // Add file name to page
  var picData = document.createTextNode(file.name);
  dataGrid.appendChild(picData);
  return true;
}

The size attribute is the file's size, in bytes. The name attribute is the file's name, without path information. The type attribute is an ASCII-encoded string in lower case representing the media type of the file, expressed as an RFC 2046 MIME type. The type attribute is particularly useful for sniffing the file type, as in the example above, where the script determines whether the file in question is a JPEG file. If Firefox 3.6 cannot determine the file's type, it will return the empty string.
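As a small illustration of these three properties, including the empty-string case for type, a page might build a description line like this. The function name describeFile and the choice of units are assumptions made for the example.

```javascript
// Hypothetical helper: build a one-line description from a File's
// name, type, and size properties.
function describeFile(file) {
  // An empty type string means the browser could not determine the media type
  var type = file.type === "" ? "unknown type" : file.type;
  var size;
  if (file.size < 1024) {
    size = file.size + " bytes";
  } else if (file.size < 1024 * 1024) {
    size = (file.size / 1024).toFixed(1) + " KB";
  } else {
    size = (file.size / (1024 * 1024)).toFixed(1) + " MB";
  }
  return file.name + " (" + type + ", " + size + ")";
}

// describeFile({ name: "photo.jpg", type: "image/jpeg", size: 2048 })
// returns "photo.jpg (image/jpeg, 2.0 KB)"
```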

Reading Files

Firefox 3.6 and beyond support the FileReader object to read file data asynchronously into memory, using event callbacks to mark progress. The object is instantiated in the standard way:

var binaryReader = new FileReader();

Event handler attributes are used to work with the result of the file read operation. For very large files, it is possible to watch for progress events as the file is being read into memory (using the onprogress event handler attribute to set the event handler function). This is useful in scenarios where the drives in question may not be local to the hardware, or if the file in question is particularly big.

The FileReader object supports three methods to read files into memory. Each allows programmatic access to the file's data in a different format, though in practice only one read method should be called on a given FileReader object:

filereader.readAsBinaryString(file); will asynchronously return a binary string with each byte represented by an integer in the range [0..255]. This is useful for binary manipulations of a file's data, for example to look for ID3 tags in an MP3 file, or to look for EXIF data in a JPEG image.

filereader.readAsText(file, encoding); will asynchronously return a string in the format solicited by the encoding parameter (for example encoding = "UTF-8"). This is useful for working with a text file, for example to parse an XML file.

filereader.readAsDataURL(file); will asynchronously return a Data URL. Firefox 3.6 allows large URLs, and so this feature is particularly useful when a URL could help display media content in a web page, for example for image data, video data, or audio data.
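One way to choose among the three read methods is to key off the file's type property. The mapping below is only an illustrative assumption (pickReadMethod is a hypothetical name, and nothing in the File API prescribes these rules). A script that wants to inspect EXIF data would call readAsBinaryString on a JPEG regardless.

```javascript
// Hypothetical helper: suggest a FileReader method name for a MIME type.
// The mapping is an illustrative assumption, not part of the File API.
function pickReadMethod(mimeType) {
  if (/^text\//.test(mimeType) || /[\/+]xml$/.test(mimeType)) {
    return "readAsText";         // textual content, e.g. XML to parse
  }
  if (/^(image|audio|video)\//.test(mimeType)) {
    return "readAsDataURL";      // media to be displayed through a URL
  }
  return "readAsBinaryString";   // everything else: inspect the raw bytes
}

// In a page:
// var reader = new FileReader();
// reader.onload = function () { /* use reader.result */ };
// var method = pickReadMethod(file.type);
// if (method === "readAsText") {
//   reader.readAsText(file, "UTF-8");
// } else {
//   reader[method](file);
// }
```

Note that an empty type string, which the browser returns when it cannot determine the type, falls through to readAsBinaryString.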

An example helps tie this all together:

if (files.length > 0) {
  if (!handleFile(files[0])) {
    invalid.style.visibility = "visible";
    invalid.msg = "Select a JPEG Image";
  }
}

var binaryReader = new FileReader();
binaryReader.onload = function() {
  var exif = findEXIFInJPG(binaryReader.result);
  if (!exif) {
    // ...set up conditions for lack of data
  } else {
    // ...write out exif data
  }
};
binaryReader.onprogress = updateProgress;
binaryReader.onerror = errorHandler;

// Start the asynchronous read of the first selected file
binaryReader.readAsBinaryString(files[0]);

function updateProgress(evt) {
  // use lengthComputable, loaded, and total on ProgressEvent
  if (evt.lengthComputable) {
    var loaded = (evt.loaded / evt.total);
    if (loaded < 1) {
      // update progress meter
      progMeter.style.width = (loaded * 200) + "px";
    }
  }
}

function errorHandler(evt) {
  if (evt.target.error.code == evt.target.error.NOT_FOUND_ERR) {
    alert("File Not Found!");
  }
}

In order to work with binary data, the charCodeAt function exposed on strings is particularly useful. For instance, a utility of the sort:

function getByteAt(file, idx) {
  return file.charCodeAt(idx);
}

allows extraction of the Unicode value of the character at the given index.
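Building on a getByteAt utility of this kind, a script could check the first bytes of a binary string for well-known signatures before attempting any deeper parsing. The sketch below repeats getByteAt so that it is self-contained, and adds a defensive mask that the original does not have. The names getShortAt, looksLikeJPEG, and hasID3v2Tag are hypothetical.

```javascript
// Read one byte from a binary string. The mask is a defensive addition: it
// guards against code units above 255 in strings that did not come from
// readAsBinaryString.
function getByteAt(data, idx) {
  return data.charCodeAt(idx) & 0xFF;
}

// Read a big-endian 16-bit unsigned value starting at idx.
function getShortAt(data, idx) {
  return (getByteAt(data, idx) << 8) | getByteAt(data, idx + 1);
}

// A JPEG stream begins with the SOI marker 0xFFD8.
function looksLikeJPEG(data) {
  return data.length >= 2 && getShortAt(data, 0) === 0xFFD8;
}

// An ID3v2 tag begins with the ASCII characters "ID3".
function hasID3v2Tag(data) {
  return data.substring(0, 3) === "ID3";
}

// In a page, inside the onload handler of a reader started with
// readAsBinaryString:
// if (looksLikeJPEG(binaryReader.result)) { /* go on to look for EXIF data */ }
```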

An example of similar code in action in Firefox 3.6, including use of the readAsDataURL method to render an image, as well as binary analysis of a JPEG for EXIF detection (using the readAsBinaryString method), can be found in Paul Rouget's great demo of the File API.

A word on the specification

The existence of a W3C public working draft of the File API holds the promise of other browsers implementing it shortly. Firefox 3.6's implementation is fairly complete, but is missing some of the technology mentioned in the specification. Notably, the urn feature on the File object isn't yet implemented, and neither is the ability to extract byte-ranges of files using the slice method. A synchronous way to read files isn't yet implemented as part of Worker Threads. These features will come in future versions of Firefox.

References