Connecting R to a PostgreSQL Database
We’ve an amazing collection of data stored in a PostgreSQL database. I use R for help visualizing and analyzing some of this data. In the past I’ve used ODBC for general database connectivity and RODBC from within R.
I’ve found setup of ODBC with PostgreSQL driver quick and easy in Ubuntu Linux. Something along the lines of this (untested) block of code:
sudo apt-get install odbc-postgresql
sudo sh -c 'echo "[PostgreSQL]
Description = PostgreSQL driver for Linux & Windows
Driver = /usr/lib/odbc/psqlodbcw.so
Setup = /usr/lib/odbc/libodbcpsqlS.so" > /etc/odbcinst.ini'
sudo sh -c 'echo "[ODBC Data Sources]
demodb = The Data Base
[demodb]
Driver = /usr/lib/odbc/psqlodbcw.so
Database = demo
Servername = localhost
Port = 5432
Protocol = 9.0.5
ReadOnly = Yes
[ODBC]
InstallDir = /usr/lib" >> /etc/odbc.ini'
Continue by installing the RODBC library within R and everything is good to go. And then I got a Mac…
Problems with ODBC in OSX
I tried to use a similar ODBC setup to communicate with the PostgreSQL database on my MacBook, but I failed to succeed in doing this.
First attempt used the iODBC framework configured and the psqlodbc driver configured using advice on setting up linker flags from this email. Attempts to connect using RODBC segfaulted.
Second attempt tried using unixODBC installed via port and a reconfigured psqlodbc driver. Attempts to connect using RODBC segfaulted again.
I poked a bit outside of R to see if I could spot the problem. Testing outside of R seemed to show that ODBC was unable to connect to my server, but didn’t spend very much time getting a solid diagnosis. I wanted a solution that worked, rather than one that required non-trivial compiling, configuring, and testing. Having burnt my allocated time on documentation and forums I decided to try solutions outside of ODBC.
Solution with RPostgreSQL with DBI
A bit of forum searching brought me upon this email describing how to connect to a remote PostgreSQL database from R on mac osx. I was happy to see that the external dependancies I’m used to with ODBC did not exist. Installed the DBI and RPostgreSQL R libraries, then fired up R.
$ R
> dbname <- "demo"
> dbuser <- "demo_uer"
> dbpass <- "demo_password"
> dbhost <- "localhost"
> dbport <- 5432
> drv <- dbDriver("PostgreSQL")
> con <- dbConnect(drv, host=dbhost, port=dbport, dbname=dbname,
user=dbuser, password=dbpass)
> dbListTables(con)
All good! I guess that moving forward I’ll transition myself over to using RPostgreSQL with DBI within R.