Monday, October 6, 2014

Setting up H2 (MySQL alternative) database for Ubuntu 14, Part 2

With squirrel-sql started as given in the last post we will create a H2 Database with the following via the GUI:

1. Click on the 'Drivers' side tab
2. Select H2 Embedded
3. Click on the Extra Class Path tab
4. Click on Add
5. Select the H2 driver jar that you extracted from ~/H2.
6. Click OK

To make a new Alias to the database, do the following

1. In the Name field enter 'testdb'
2. In the Driver drop down select H2 Embedded
3. For the User and Password fields enter 'sa' and leave the password blank
4. *For the URL field replace <db-name> with /home/<your home dir>/var

* before doing this go to the terminal and type

mkdir -p ~/var/


Now, here's the kicker. There are easy ways to load text formatted data in the H2 database it gets a little more difficult if you're working with binary files. You could write a converter in Java but the advantage of using Python is that you can use a variety of Python libraries to interface with scientific formats and dump them into the database. Our goal is efficiently create a table and dump data into it to the degree possible (I'm just one guy, not a professional sys admin).

There are several options for doing this: A) Install a PostGreSQL driver along with a PostGre Python module [1], B) Use the H2 TCP server to dump data over a socket and C) Attempt to interact with the H2 Shell via subprocess.

At first glance, option A provides compiled library access to the H2 database which should be the fastest option. Option B requires extra sockets and  overhead from socket teardown/set up. It is probably the slowest option. Option C is a hack but avoids the socket overhead and might work with H2 Embedded directly (H2 Embedded is really fast), I'm not sure. Reasons for using H2 include unbounded number of rows/columns, parallel access and being a lightweight database. Because H2 is written in Java I can't pass it as an ODBC driver (as far as I know).

So, my next entry will cover what I come up with for accessing the H2 database. If options A, B and C are too complex I will investigate alternatives.

[1] https://wiki.postgresql.org/wiki/Python




No comments:

Post a Comment