Reading Excel files with PHP can be tricky, but fortunately there is a great library that makes this task a lot easier: PHPExcel. In the following article I will show you how to use it to convert the excel sheets into PHP arrays and use the data in PHP.

Installing PHPExcel

The first step in working with Excel documents in PHP will be to install the PHPExcel library. I will do that by using composer. If you are not familiar, you should definitely have a look at it and use it for your project’s dependency management. The composer.json file is very simple:

{ "require": { "phpoffice/phpexcel": "dev-develop" } }

Then you need to install the dependencies with composer, which will take a few seconds to complete:

composer.phar install

You will notice that the library was downloaded and installed in the project_root/vendor/phpexcel directory alongside with the project_root/vendor/composer directory and a vendor/autoload.php file which was automatically generated to take care of autoloading the classes for the library.

Step 1 – Load the file in PHPExcel

There are 2 basic ways to load the file into PHPExcel. You can specify one of the supported file types manually or you can let the library determine the file type automatically based on the file that you supply. Here is the required code for both and a list of the supported file types that you can choose from if you decide to explicitly define the file type:

<?php require_once "vendor/autoload.php"; $fileName = "sample.xlsx"; /** automatically detect the correct reader to load for this file type */ $excelReader = PHPExcel_IOFactory::createReaderForFile($fileName); /** Create a reader by explicitly setting the file type. // $inputFileType = 'Excel5'; // $inputFileType = 'Excel2007'; // $inputFileType = 'Excel2003XML'; // $inputFileType = 'OOCalc'; // $inputFileType = 'SYLK'; // $inputFileType = 'Gnumeric'; // $inputFileType = 'CSV'; $excelReader = PHPExcel_IOFactory::createReader($inputFileType); */

Step 2 – Define the reader options

I won’t go into much details about the different options you can use, because they are quite a few, but I will point out the most common and useful ones.

//if we dont need any formatting on the data $excelReader->setReadDataOnly(); //load only certain sheets from the file $loadSheets = array('Sheet1', 'Sheet2'); $excelReader->setLoadSheetsOnly($loadSheets); //the default behavior is to load all sheets $excelReader->setLoadAllSheets();

setReadDataOnly

These are pretty straightforward. Make sure you only load the sheets that you need to use since the library can use a lot of memory especially for large files. Also themethod helps speed things up a bit by only loading the data from the cells without any special excel formatting. You have to be aware though that any date formatting will be lost if you use this option and dates will be loaded as numbers with no formatting.

A more interesting feature is the ability to add filters to the reader object. This allows to only load certain columns and rows or load the excel data in chunks, which is especially useful if you are doing some sort of a database import. I will provide a basic example of both filters. The following read filter will instruct the reader to only load the data in rows 1 to 10 from columns A to C. You can of course put much more advanced logic in the filters, making them pull different rows and columns from different sheets.

<?php class SampleReadFilter implements PHPExcel_Reader_IReadFilter { public function readCell($column, $row, $worksheetName = '') { // Read rows 1 to 10 and columns A to C only if ($row >= 1 && $row <= 7) { if (in_array($column,range('A','C'))) { return true; } } return false; } }

After we have created the filter, we need to pass it to the reader object:

$sampleFilter = new SampleReadFilter(); $objReader->setReadFilter($chunkFilter);

Step 3 – Load and display the data

The final step is to load the excel data into PHP.

$excelObj = $excelReader->load($fileName);

This will produce a PHPExcel object, but in order to easily modify and transform the data to suit our needs its better to convert it to an array.

$excelObj->getActiveSheet()->toArray(null, true,true,true);

Keep in mind that PHPExcel will only display the information on the currently active sheet, which is the last loaded one. You can however manually switch between sheets and get their contents or you can automate the process and get all sheets as one array:

//get all sheet names from the file $worksheetNames = $excelObj->getSheetNames($fileName); $return = array(); foreach($worksheetNames as $key => $sheetName){ //set the current active worksheet by name $excelObj->setActiveSheetIndexByName($sheetName); //create an assoc array with the sheet name as key and the sheet contents array as value $return[$sheetName] = $excelObj->getActiveSheet()->toArray(null, true,true,true); } //show the final array var_dump($return);

Loading data in chunks

The approach outlined above is fine in most cases, but if you need to load some information for example in a database and you dont need the entire excel file loaded at once it will be much faster to split it in chunks and work with each individual chunk at a time. To do that we need a simple read filter:

<?php class ChunkReadFilter implements PHPExcel_Reader_IReadFilter { private $_startRow = 0; private $_endRow = 0; /** Set the list of rows that we want to read */ public function setRows($startRow, $chunkSize) { $this->_startRow = $startRow; $this->_endRow = $startRow + $chunkSize; } public function readCell($column, $row, $worksheetName = '') { // Only read the heading row, and the configured rows if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) { return true; } return false; } } /** Define how many rows we want to read for each "chunk" **/ $chunkSize = 2048; /** Create a new Instance of our Read Filter **/ $chunkFilter = new chunkReadFilter(); /** Tell the Reader that we want to use the Read Filter **/ $excelReader->setReadFilter($chunkFilter); /** Loop to read our worksheet in "chunk size" blocks **/ for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) { /** Tell the Read Filter which rows we want this iteration **/ $chunkFilter->setRows($startRow,$chunkSize); /** Load only the rows that match our filter **/ $excelObj = $excelReader->load($inputFileName); $data = $excelObj->getActiveSheet()->toArray(null, true,true,true); // Do some processing here - the $data variable will contain an array which is always limited to 2048 elements regardless of the size of the entire sheet }