Online Labbooks for Scientific Research

Over at my workplace, the Institute of Gravitational Research at the University of Glasgow, we undertake lots of experimental research (primarily physics) and produce lots of data. For the most part, this data is saved on individuals’ computers (centrally backed up but only accessible to that user), on a shared hard drive on the (Windows) network, or written in paper labbooks (though these do have serial numbers for archival and accountability).

In my group, Stefan Hild’s European Research Council funded Speed-Meter experiment, we were slightly ahead of the curve. For many years, my interferometry colleagues and I used a simple SVN-based labbook: each post was a directory named with the current date, containing a text file called “Notes.txt” and any images associated with the post. This was then parsed by a simple web interface to provide an online labbook that could be viewed by anyone on the network with the URL. It worked quite well, and was certainly better than a paper-based labbook in most regards, but it lacked a few key features:

  • Ability to link to previous labbook posts directly (via URIs/URLs): we instead had to quote the labbook title so that the user, if they were interested, could find the referenced labbook post.
  • Users: we used a single login for the SVN to add posts, all of which were marked as having been uploaded by “jifadmin”. It was only possible to tell who made a post by context.
  • Categories: the ability to group posts together in logical sets.
  • Versioning: otherwise known as “keeping track of edits”. We more or less hacked this together using SVN, but rolling back or viewing a diff of two files required a separate system (command line or GUI).
  • Image/media upload: display of in-line images in posts, where they would logically go. Instead, our images always appeared at the bottom of the post and we would have to reference them by file name in the appropriate part of the post.
  • Centralised user credentials: this was part of a wider issue in the group which was gradually getting better. We would require a login for the SVN to add content to the labbook, but also a different login for other services like email, WiFi and the shared calendars.
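
To make the directory-per-date layout concrete, here is a minimal Python sketch of how such folders could be turned into a list of entries. It is an illustration only, not the original web interface: the YYYY-MM-DD folder naming, the image extensions and the function name load_entries are all assumptions.

```python
from datetime import date
from pathlib import Path

# Assumed image types; the original interface may have accepted others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}


def load_entries(root):
    """Return labbook entries found under root, newest first."""
    entries = []
    for folder in Path(root).iterdir():
        notes = folder / "Notes.txt"
        if not folder.is_dir() or not notes.is_file():
            continue  # not a labbook entry
        try:
            # Assumption: folders are named like 2014-03-05.
            day = date.fromisoformat(folder.name)
        except ValueError:
            continue  # skip folders that are not named after a date
        images = sorted(
            p.name for p in folder.iterdir()
            if p.suffix.lower() in IMAGE_SUFFIXES
        )
        entries.append({
            "date": day,
            "text": notes.read_text(encoding="utf-8"),
            "images": images,
        })
    return sorted(entries, key=lambda e: e["date"], reverse=True)
```

Note that the images come back as a flat list per entry, which mirrors the limitation above: nothing in this layout records where in the text an image belongs.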

We put up with this system for a number of years, but eventually when we started a major project we ditched it and turned to some software called ‘OSLogbook’.

Open Source Logbook

Imaginatively named ‘OSLogbook’, this software is used in the Virgo and LIGO collaborations, which are scientific organisations that we are members of and frequently collaborate with. This brought a number of new features to the table:

  • LDAP login: centralised login system using centralised credentials, via the LDAP protocol.
  • Categories and tasks: each category can have subtasks, and each post can be assigned a single task within a category.
  • Authors: each user has their own login and can post their own content that is marked as their own.
  • HTML posts: a basic WYSIWYG editor allowed us to add headings, tables and coloured text to our entries.

We used this for a couple of years and found it useful, and a vast improvement over the SVN-based labbook. However, it still annoyed us in a few places and had some quirks which were not intuitive. For instance, posts could only be assigned to one category. Our daily lab work often touches several aspects of the same experiment at once, such as the organisation of power supplies to various corners of the lab, or temperature and humidity changes affecting sensitive measurements, so a single post frequently belonged in more than one category.

The labbook also allowed us to assign multiple authors to each post, but in the form of a text field. That sounds useful at first, but it’s actually a pretty bad way of implementing this feature. If a person makes a labbook entry and puts their name before another collaborator, and the next day the other collaborator puts their name before the first collaborator, that author list is then a different set of characters despite containing the same names! Searching for posts by a certain author was therefore difficult because you don’t know in which order the names appear. Furthermore, it was left to the whim of the author of each post to correctly specify the names of their collaborators, meaning that occasionally a nickname or misspelling found its way into this field.
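
The ordering problem is easy to show in a few lines of Python. This is only a sketch: the names are made up, and parse_authors is a hypothetical helper, not anything OSLogbook provides.

```python
def parse_authors(field):
    """Split a free-text author field into a set of normalised names."""
    return frozenset(
        name.strip().lower() for name in field.split(",") if name.strip()
    )


# Two posts by the same (made-up) pair of people, typed in a different order.
post_a = "Alice Smith, Bob Jones"
post_b = "Bob Jones, Alice Smith"

print(post_a == post_b)                                # False
print(parse_authors(post_a) == parse_authors(post_b))  # True
```

Treating the authors as a set fixes the ordering issue, but not nicknames or misspellings. Those only go away when authors are picked from a list of registered users rather than typed freehand.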

Perhaps the weirdest feature was that comments made on each labbook entry actually showed up as their own labbook entry. The list of labbook posts therefore contained in chronological order the comments made on each post as well as their respective posts – meaning the homepage was a cluttered mess of crossing conversations.

Not good.

The sum total of a number of small niggling issues with OSLogbook led me to investigate an alternative. We wanted our alternative to have the following features, in no particular order of importance:

  • Centralised login system: using our already-established LDAP system to authenticate users.
  • Permission system: we wanted our summer students and undergraduates to contribute to our labbook, but not be able to mischievously edit or delete other users’ posts.
  • Comments: labbook entries in an experiment with dozens of members more often than not require organised discussion. A comments system allows each user to contribute their thoughts in a coherent manner, without making a new labbook entry as a follow-up.
  • Media upload: beyond images, but also for zip files, analysis and plotting scripts, PDF documents, etc.
  • Rich text editing: we wanted to be able to add tables of data, links, coloured text, lists, special characters, centrally-aligned text… every little detail. Don’t constrain the user in making a post as clear as possible.
  • Multiple authors: the ability to assign a post to multiple authors, and allow any of those authors to further modify the content, and to have all collaborative posts an author is part of show up in a search of that author’s posts.
  • Reference pages: the ability to create static pages with reference information that does not fit in a labbook post. For instance, a list of procedures for ordering equipment or links to useful intranet/internet pages.
  • Ability to add new features: ideally, rather than a black-box, monolithic piece of software like OSLogbook, we wanted to be able to modify the software to fit our needs. There are a few programmers in the group, so with a nice API we would be able to tailor the labbook the way we want it.
  • Looks nice: people don’t read ugly websites unless they really have to. Let’s not make it difficult!

The obvious choice: WordPress!

Almost all of these features can be found on full-blown news sites, blogs and other content mills. It turned out that the popular blog software WordPress fit the bill… with the help of some plugins. Its vibrant community and extensive theme and plugin support made it a no-brainer. However, it lacked a few of the desired features that would not necessarily be useful on full-blown sites, but useful to a scientific organisation, like multiple authors and LDAP support. Before taking the plunge and moving my colleagues over to the system, I had to research whether there were plugins that could help us meet our list of requirements. In the end, the huge WordPress community delivered. There were plugins or native WordPress settings for all of our desired features!

Implementing Desirable Features in WordPress

This section is a guide to the plugins required to implement cool features, listed below, in our labbook software. The same plugins should work for anyone wishing to build a scientific labbook.

Multiple authors

As described above, we wanted to be able to assign multiple authors to individual posts, and allow these posts to show up in searches of any of the individual authors of that post. This can be achieved with the Co-Authors Plus plugin – and it does exactly what we want! After a single author finishes writing the post, they can start typing the names of their collaborators. As they type, other registered users with names matching the text spring up and can be assigned authorship. If a user is named as an author on a post, they can then edit it. This lets users create and edit any post they are named as an author on, but not necessarily edit other posts should you not wish them to.
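
The behaviour we were after can be summarised in a small Python model. To be clear, this is not how Co-Authors Plus or WordPress implement it (both are written in PHP). The Post class, the can_edit and posts_by functions and the user names are all hypothetical, and only illustrate the rules described above.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    title: str
    authors: set = field(default_factory=set)  # registered user names


def can_edit(user, post, editors=frozenset()):
    """A user may edit a post if they are one of its authors,
    or if they hold a (hypothetical) site-wide editor role."""
    return user in post.authors or user in editors


def posts_by(user, posts):
    """All posts on which the user is named as an author."""
    return [p for p in posts if user in p.authors]


posts = [
    Post("Laser alignment", {"alice", "bob"}),
    Post("Humidity log", {"carol"}),
]

print(can_edit("bob", posts[0]))                  # True
print(can_edit("bob", posts[1]))                  # False
print([p.title for p in posts_by("bob", posts)])  # ['Laser alignment']
```

Because authorship is a set of registered users rather than a text field, the same data answers both questions: who may edit a post, and which posts show up in a search for a given author.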

Mathematical markup in posts

As a scientific group, we frequently share mathematical equations with one another. The plugin MathJax-LaTeX lets us type LaTeX equations in our posts and displays them in-line. It also utilises MathJax rendering, which displays in modern browsers nicely and allows the corresponding LaTeX code to be copied-and-pasted.

Email alerts for new posts and comments

We wanted to have the option of receiving an email alert when a new post or comment was made. Somewhat surprisingly, this is not a standard feature of WordPress. A Google search will probably point you towards JetPack, but that plugin is too bloated for what we require: it is focused on commercial sites and making money, not on a simple task like notifying registered users via email.

Instead, we used a feature already built into WordPress: RSS (“Rich Site Summary” or “Really Simple Syndication”), which lets an external feed-reading service download a copy of the recent posts on the site and send a digest to a user’s email address. Perfect, right? Not quite. Our labbook was to be private, since scientifically sensitive measurements were under discussion, so external feed readers would not be able to access the RSS feeds by default. This hurdle was overcome by the plugin Private Feed Keys, which lets each user create a special URL through which an external reader can download the RSS feeds. The URL doesn’t require a login, but it is unique to each user and can be given to a trusted feed-reading service (most of us use Blogtrottr).
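
The idea behind a per-user feed URL can be sketched in Python. This is a generic illustration of the technique, not the plugin’s actual code: the in-memory feed_keys store, the feedkey query parameter, the example domain and the function names are all assumptions.

```python
import hmac
import secrets

# Hypothetical in-memory store mapping user names to their feed keys.
feed_keys = {}


def create_feed_key(username):
    """Generate a fresh, hard-to-guess key for one user."""
    key = secrets.token_urlsafe(32)
    feed_keys[username] = key
    return key


def feed_url(base_url, username):
    """Build the private feed URL for a user (parameter name is made up)."""
    return f"{base_url}/feed/?feedkey={feed_keys[username]}"


def user_for_key(key):
    """Return the user who owns this key, or None if it is unknown."""
    for username, stored in feed_keys.items():
        # Constant-time comparison, so timing does not leak the key.
        if hmac.compare_digest(stored.encode(), key.encode()):
            return username
    return None


key = create_feed_key("alice")
print(feed_url("https://labbook.example.org", "alice"))
print(user_for_key(key))
```

Since every key belongs to exactly one user, a key that leaks can be revoked by replacing it, without affecting anyone else’s feed.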

With our plugins sorted, we still had to do something about the old labbook content. Since we had used our old labbook for two years, we had plenty of content already on it that we would not want to lose. We couldn’t just delete it, so I had to investigate a way to transfer the old content across.

Importing posts from old labbook

The number of posts (roughly 250, including comments) was small enough that hacking together an import script was feasible. WordPress also has a nice API that allows a PHP script to create posts directly in the WordPress database, so I was able to write a script to extract the old posts from the OSLogbook database and inject them into the new labbook. Comments were a little trickier, as were image and file attachments, but I managed it in the end with a mix of crafty PHP scripts and some elbow grease!

Styling the new labbook

The last job was to make the labbook look great. As I said earlier, people don’t like to use nasty looking sites with bad user interfaces. Luckily, WordPress is of course designed for ease-of-use, so we already had a great default theme. To allow the site to operate more like its intended labbook and less like a blog, we adopted the theme Simple Life and modified it slightly for our particular purposes. This led us to create a child theme (see it on GitHub) and a custom plugin or two to provide specific functionality that we wanted:

  • A recent edits widget to show any posts that were recently modified, to keep track of changes across all content (GitHub page)
  • A widget containing a list of comments and links to recent SVN commits (GitHub page)

Our awesome labbook

This is our labbook:

Our Speedmeter labbook software in use.

Pretty cool, huh?

Since we made the labbook, it has garnered the attention of other colleagues in our group who have since adopted similar blogs. The IT manager simply uses WordPress’s network administration tool to create a new site in seconds and clicks a few buttons to enable the plugins we use.

I hope this guide will be of use to the wider academic community and the science industry. You don’t have to suffer with poor labbooks!

Hattrick Player Tracker

One of my current projects is the building and running of a player tracker for Hattrick. Hattrick is an online football management game where players can start teams, enter leagues, buy and sell players and eventually try to win some trophies. There are a few things about it that I think are nice:

  • The games take place in real time, so league games are weekly. No need to be online all the time to be successful.
  • A decent community of fellow football fans, and special forums for fans of particular clubs or nations to come together.
  • A national team aspect, allowing you to take part in the ups and downs of your club’s country’s national team.

It’s this last aspect that in particular is exciting. Every 6 months, the players in a nation can vote for an individual to become the national team manager. This person then has to call up players from clubs around the world in order to try to win national team games. This process is often tricky, because the manager does not see the skills a particular player has until they are called up. The manager can’t call up more than a handful of players at a time, and every time they call someone up the squad’s morale takes a hit. This means that it is very much advantageous to have a network of scouts working for the national team, keeping track of player skills wherever they can get their hands on them.

The scouting aspect is where I come in. I thought it would be fun to get involved in my national team in Hattrick (Scotland), but I’m by no means the most tactically astute person in the country. My talents are better served in making an automatic tracking system, which is what I’ve done. Around this time last year, I started working on Hattrick Scotland, a website where users can register to have their players’ skills tracked. After registering, a special access token is retrieved from Hattrick to allow the website to access Hattrick on their behalf. The user may then leave, and need never come back. With the token, I can use scripts to systematically and periodically collect the data that the national team scouts require.

With a database of player skills available, scouts can then access the site to view this data privately. Scouts are assigned by the national manager and so the site can identify them and let them access the database accordingly. It was a lot of work to put this together, but I found it fun learning the reasonably new aspects of PHP 5. The last time I used PHP on a large scale was when I was in high school, hacking together forum software for my friends and me. I learned it the way that most kids do, by online examples. It turns out that PHP is very much a victim of its own ease of use, and a sizeable chunk of its users are oblivious to basic programming paradigms. For a long time, the examples available for PHP were hacky and didn’t follow good programming practice. This matters, because programs are increasingly being reused rather than rewritten as they become more complex, and powerful new features in modern languages often require programs to be organised in a certain fashion. The consequence was that, coming into the task of programming a player tracker, my experience with PHP was biased towards the ‘get it done quickly’ attitude.

The initial site I made was a reasonably badly written pile of code. After a little while I started to realise that I had made some fundamental mistakes in the design of the tracker, and went back to the drawing board. The design I eventually ended up using turns out to be quite similar to Django and other web frameworks. This was a happy accident, and it meant that I could add new features quickly towards the end of summer last year, after weekends and evenings spent here and there on the project. Now the tracker is more or less complete in its initial purpose, and additions I’ve made since then have been along the lines of ease of use and look and feel rather than functionality.

Running a large site like this has some interesting challenges. Although there are only around 80,000 Scottish players to track in Hattrick, which is a small number for the relational databases that power most large websites, there are plenty of other things that also need to be tracked. Each player’s skills are of course the primary information to track, but it’s also useful to track this across time: is a player improving, or are they being neglected? This is important information that the national team manager must be able to find out. Details of the matches the players have played in are also important, so the manager and scouts can work out whether the player has been getting the game time they require to improve. All of this leads to a database with many millions of rows of data, each needing to be organised and stored. It’s certainly been a good learning exercise.
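
As a rough illustration of tracking skills across time, here is a small Python sketch using SQLite. It is not the tracker’s real schema (the site itself is written in PHP): the table and column names, the helper functions, the player and the skill levels are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- One row per player, per skill, per date on which skills were fetched.
CREATE TABLE skill_snapshots (
    player_id INTEGER NOT NULL REFERENCES players(id),
    taken_on  TEXT    NOT NULL,  -- ISO date, so text order is date order
    skill     TEXT    NOT NULL,
    level     INTEGER NOT NULL,
    PRIMARY KEY (player_id, taken_on, skill)
);
""")


def record_snapshot(player_id, taken_on, skills):
    """Store one dated set of skill levels for a player."""
    conn.executemany(
        "INSERT INTO skill_snapshots VALUES (?, ?, ?, ?)",
        [(player_id, taken_on, skill, level) for skill, level in skills.items()],
    )


def skill_change(player_id, skill):
    """Difference between the latest and earliest recorded level."""
    rows = conn.execute(
        "SELECT level FROM skill_snapshots "
        "WHERE player_id = ? AND skill = ? ORDER BY taken_on",
        (player_id, skill),
    ).fetchall()
    return rows[-1][0] - rows[0][0] if rows else None


conn.execute("INSERT INTO players (id, name) VALUES (1, 'Example Player')")
record_snapshot(1, "2014-01-06", {"passing": 6, "playmaking": 5})
record_snapshot(1, "2014-01-13", {"passing": 7, "playmaking": 5})

print(skill_change(1, "passing"))     # 1 (improving)
print(skill_change(1, "playmaking"))  # 0 (no change)
```

Storing a new row for every snapshot rather than overwriting the previous value is what makes the row count grow into the millions, but it is also what lets a scout see whether a player is improving.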

Since I am quite proud of some of the code I’ve written, I think I will eventually release this project as an open source program, letting other Hattrick countries adopt their own trackers. I have not yet done this because I want Scotland to gain an advantage from having a shiny, new and featureful tracker, at least for a while. My work appears to have inspired other countries to start developing similar trackers, so when the advantage diminishes I will open up the code. I am sure there are plenty of improvements that could be made by other members of the community.