PLR Part 1: Up and Running with PL/R (PLR) in PostgreSQL: An almost Idiot's Guide




R is both a language and an environment for statistical analysis. R is available as Free Software under the GPL. For those familiar with environments such as S, MATLAB, and SAS, R serves the same purpose. It has powerful constructs for manipulating arrays, and packages for importing data from various data sources such as relational databases, CSV files, DBF files, and spreadsheets. In addition, it has an environment for graphical plotting and for outputting the results to screen, printer, and file.

For more information on R, check out R Project for Statistical Computing.

What is PL/R?

PL/R is a PostgreSQL language extension that allows you to write PostgreSQL functions and aggregate functions in the R statistical computing language.

With the R language you can write such things as an aggregate function for median, which doesn't exist natively in PostgreSQL and exists natively in only a few relational databases I can think of (e.g. Oracle). Even in Oracle the function didn't appear until version 10.

Another popular use of R is for doing Voronoi diagrams using the Delaunay Triangulation and Dirichlet (Voronoi) Tessellation (deldir) package from the Comprehensive R Archive Network (CRAN). When you combine this with PostGIS you have an extremely powerful environment for doing such things as nearest neighbor searches and facility planning.

In the past, PL/R was only supported on PostgreSQL Unix/Linux/Mac OS X environments. Recently that has changed, and PL/R can now be run on PostgreSQL Windows installs. For most of this exercise we will focus on Windows installs, but will provide links to instructions for Linux/Unix/Mac OS X installs.

Installing R and PL/R

In order to use PL/R, you must first have the R language environment installed on the server that runs PostgreSQL. In the next couple of sections, we'll provide step-by-step instructions.

Installing PostgreSQL and PostGIS

It goes without saying: if you don't have PostgreSQL already, please install it, preferably with PostGIS support. Check out Getting started with PostGIS.

Installing R

Next install the R language:

1. Pick a CRAN mirror from http://cran.r-project.org/mirrors.html
2. In the Download and Install R section, pick your OS from the Precompiled binary section. In my case I am picking Windows (95 and later). Note there are binary installs for Linux, Mac OS X and Windows.
3. If you are given a choice between base and contrib, choose base. This will give you an install containing the base R packages. Once you are up and running with R, you can get additional packages by using the built-in package installer in R or by downloading them from the web, which we will do later.
4. Run the install package. As of this writing the latest version of R is 2.10.1. The Windows install file is named R-2.10.1-win32.exe
5. Once you have installed the package, open up the RGUI. NOTE: For Windows users, this is located on the Start menu: Start -> Programs -> R -> R 2.10.1. If for some reason you don't find it on the Start menu, it should be located at "C:\Program Files\R\R-2.10.1\bin\Rgui.exe". If you are on a 64-bit system this will be in C:\Program Files (x86)\R\R-2.10.1
6. Run the following command at the R GUI console.

update.packages()

Running the above command should pop up a dialog asking for a CRAN mirror. Pick the one closest to you and then click OK. A sequence of prompts will then follow asking whether you would like to replace existing packages. Go ahead and type y to each one. After that you will be running the latest version of the currently installed packages.

Installing PL/R

Now that you have both PostgreSQL and R installed, you are ready to install the PL/R procedural language for PostgreSQL.

Go to http://www.joeconway.com/plr/

For non-Windows users, follow the instructions here: http://www.joeconway.com/plr/doc/plr-install.html.



For Windows users:

1. Download the installation file from step 6 of http://www.joeconway.com/web/guest/pl/r
2. As of this writing, there is no install setup for PostgreSQL 8.3/8.4/9* for Windows, so you need to copy plr.dll into your PostgreSQL/(8.3/8.4/9.1/9.2/9.3)/lib folder. If you are installing on PostgreSQL 9.1+, make sure to also copy the .control and .sql files to the share/extension folder.
3. Set the environment variables (you get there by going to Control Panel -> System -> Advanced -> Environment Variables):
   Add an R_HOME system variable and set it to the location of your R install. If you are on a 64-bit system and running a 32-bit install, it will be installed in Program Files (x86). On a 32-bit system (or a 64-bit system running a 64-bit install) it will be installed in Program Files.
   Edit the Path system variable and add the R bin folder to the end of it. Do not remove existing entries; just add this to the end.
4. Restart your PostgreSQL service from Control Panel -> Services. In rare circumstances, you may need to restart the computer for the changes to take effect.

Loading PL/R functionality into a database

NOTE: If you are running R version 2.12 or above on Windows, the R bin folder has changed. Instead of bin it's bin\i386 or bin\x64. Also, if you install the newer version, you'll need to use the binaries and manually register the paths and R_HOME yourself, since the installer will not install. You can still use the plr.dll etc. See our other Quick Intro to PL/R for more details and examples.

In order to start using PL/R in a database, you need to load the helper functions into the database. To do so, do the following:

Using PgAdmin III, select the database you want to enable with PL/R and then click the SQL icon to get to the query window.

For users running PostgreSQL 9.1+, install by typing in the SQL window:

CREATE EXTENSION plr;
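If you want to confirm that the extension actually registered, a quick check against the system catalogs can help. This is only a sketch: it assumes PostgreSQL 9.1 or later (where the pg_extension catalog exists) and that the extension and language are both named plr, as in the command above.

```sql
-- Confirm that the plr extension is registered in the current database.
-- Assumes PostgreSQL 9.1+; returns one row if CREATE EXTENSION succeeded.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'plr';

-- Confirm that the plr language is available for CREATE FUNCTION.
SELECT lanname
FROM pg_language
WHERE lanname = 'plr';
```

If either query returns no rows, the extension files are probably not where PostgreSQL expects them.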

If you are running PostgreSQL 9.0 or lower, you have to install using the plr.sql file.

Choose File -> Open -> path/to/PostgreSQL/8.4/contrib/plr.sql (NOTE: on Windows the default location is C:\Program Files\PostgreSQL\8.4\contrib\plr.sql)
Click the green arrow to execute.

Testing out PL/R

Next run the following commands from PgAdmin III or psql to test out R:

SELECT * FROM plr_environ();
SELECT load_r_typenames();
SELECT * FROM r_typenames();
SELECT plr_array_accum('{23,35}', 42);

Next try to create a helper function (this was copied from http://www.joeconway.com/plr/doc/plr-pgsql-support-funcs.html) and test it with the following:

CREATE OR REPLACE FUNCTION plr_array (text, text)
RETURNS text[]
AS '$libdir/plr','plr_array'
LANGUAGE 'C' WITH (isstrict);

select plr_array('hello','world');

Using R in PostgreSQL

Creating Median Function in PostgreSQL using R

Below is a link on creating a median aggregate function. This basically creates a stub aggregate function that calls the median function in R. http://www.joeconway.com/plr/doc/plr-aggregate-funcs.html

NOTE: I ran into a problem installing median from plr-aggregate-funcs via PgAdmin. It gave an R parse error when I tried to use the function. I had to install the median function by removing all the carriage returns (\r\n), so put the whole median function body on a single line like below to be safe. Evidently, when copying from IE, IE puts in carriage returns instead of Unix line breaks. When creating PL/R functions, make sure to use Unix line breaks instead of Windows carriage returns by using an editor such as Notepad++ that lets you specify Unix line breaks.

create or replace function r_median(_float8) returns float as 'median(arg1)' language 'plr';

CREATE AGGREGATE median (
  sfunc = plr_array_accum,
  basetype = float8,
  stype = _float8,
  finalfunc = r_median
);

create table foo(f0 int, f1 text, f2 float8);
insert into foo values(1,'cat1',1.21);
insert into foo values(2,'cat1',1.24);
insert into foo values(3,'cat1',1.18);
insert into foo values(4,'cat1',1.26);
insert into foo values(5,'cat1',1.15);
insert into foo values(6,'cat2',1.15);
insert into foo values(7,'cat2',1.26);
insert into foo values(8,'cat2',1.32);
insert into foo values(9,'cat2',1.30);

select f1, median(f2) from foo group by f1 order by f1;

In the next part of this series, we will cover using PL/R in conjunction with PostGIS.
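As a sanity check on the median example above, one option is to compare the PL/R result against an independent calculation. The sketch below assumes the foo table and the median aggregate already exist, and that you are on PostgreSQL 9.4 or later, which is newer than the versions this article targets. It uses the built-in percentile_cont ordered-set aggregate purely as a cross-check, not as part of PL/R.

```sql
-- Assumes the foo table and the PL/R median aggregate defined earlier.
-- percentile_cont is built into PostgreSQL 9.4 and later; at fraction 0.5
-- it interpolates between the two middle values, as R's median() does.
SELECT f1,
       median(f2) AS plr_median,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY f2) AS builtin_median
FROM foo
GROUP BY f1
ORDER BY f1;
```

The two columns should agree for each category, up to floating-point rounding.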






