Sean Leavey: Documentation

I got a Sonos kit a little while back through an employee offer at my old workplace. For anyone who doesn't know, this is an easy-to-use way to play audio from local computers, internet radio, Spotify, Napster and so on. You buy a Sonos speaker and base station, and connect the latter to your internet connection. It finds network-attached media servers and can interface with your Spotify account (or other online music service). Then, you download the free app to control what the speaker plays.

The really nice thing about the system is that it "just works". You can add a new speaker to the mix by plugging it in and holding down two buttons. It then appears in your app, and you can control what is played on it. You can play different songs in different rooms, or synchronise one audio stream across all of the players. Really cool.

I would love to purchase a second Sonos to make use of the multi-room synchronisation features. However, the Sonos system has an Apple-esque price tag and, as a nerd and tinkerer, I find myself wondering how much effort it would take to make a homebrew version.

The obvious device to start playing with is a Raspberry Pi. Whereas a Sonos system costs many hundreds of pounds, a Raspberry Pi starts at £25. Even once you add a few extras for playing music, I don't think this will come close to breaking the bank.

There are, however, numerous reasons why a homebrew setup might still be difficult:

The Raspberry Pi's analogue audio output is not good quality, and is therefore of little use for a Sonos-rivalling music system. Instead I expect a solution would either involve the Pi's HDMI output, GPIO (general purpose input/output) ports or a USB sound card.
Synchronisation across distributed units will be difficult, as standard TCP/UDP protocols do not guarantee timely arrival. Perhaps Real-time Transport Protocol will help here.
Latency can cause all sorts of issues, especially for synchronised audio. The Raspberry Pi has limited CPU power, and connecting it to a WiFi network or congested wired network will no doubt impact performance.
Interfacing with music services like Spotify and Napster together with files on a home server or connected laptop is a whole project in itself.
Controlling the various distributed units will require some topology thoughts. Should there be a web interface to a single server (not necessarily a Raspberry Pi but a computer on the same network), which can control the connected Pis? Or a mesh network where each Pi is both a client and server, passing around packets to each other? How should the server know about each connected unit? Should it send a message to the whole network to find them, or should the user need to register the IP address of each device in an admin panel or configuration file?
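
To make the "send a message to the whole network" option a little more concrete, here is a minimal Python sketch of discovery by UDP broadcast. Everything specific in it is hypothetical and chosen only for illustration: the port number 50000, the JSON message format and the function names are not part of Sonos, Mopidy or PulseAudio.

```python
import json
import socket

DISCOVERY_PORT = 50000  # hypothetical port, chosen only for this sketch
MAGIC = "homebrew-audio-discover"  # hypothetical message tag


def build_probe() -> bytes:
    """Encode the probe a controller would broadcast to find players."""
    return json.dumps({"type": MAGIC}).encode("utf-8")


def build_reply(name: str) -> bytes:
    """Encode the reply a player would send back to the controller."""
    return json.dumps({"type": MAGIC + "-reply", "name": name}).encode("utf-8")


def parse_reply(data: bytes):
    """Return the player's name, or None if the datagram is not a valid reply."""
    try:
        message = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return None
    if not isinstance(message, dict) or message.get("type") != MAGIC + "-reply":
        return None
    return message.get("name")


def discover(timeout: float = 2.0) -> dict:
    """Broadcast a probe and collect {ip_address: name}.

    Stops once no datagram has arrived for `timeout` seconds.
    """
    found = {}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)
        sock.sendto(build_probe(), ("255.255.255.255", DISCOVERY_PORT))
        while True:
            try:
                data, (address, _port) = sock.recvfrom(1024)
            except socket.timeout:
                break
            name = parse_reply(data)
            if name is not None:
                found[address] = name
    return found
```

Each Pi would need a small listener on the same port that answers a probe with build_reply(). The alternative of registering IP addresses by hand avoids this code entirely, at the cost of more work for the user.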

Sonos has addressed some of these issues by creating its own dedicated mesh network. The base station that the Sonos system requires creates a wireless network of its own for the speakers to communicate over, so other network traffic does not affect the system. It also, handily for the shareholders, means that only Sonos speakers work with other Sonos speakers. Once you've shelled out £250 for the first kit, adding more means shelling out for further Sonos speakers.

There are various homebrew setups of Sonos-like systems described in blogs, but there doesn't appear to be a system that covers the whole Sonos stack from app to speaker to Spotify. Technologies do exist for the individual layers of the "Sonos stack", as explained next.

Real-time Transport Protocol

Real-time Transport Protocol, or RTP, is a network protocol for delivering media such as audio in a timely fashion. It usually runs over UDP and stamps each packet with a sequence number and a timestamp, so that a receiver can reorder packets and play them at the right moment. It often forms part of VoIP packages a la Skype. With audio, we can tolerate some packet loss in favour of keeping everything timely.
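
To show what that timing information looks like on the wire, here is a minimal Python sketch that packs and unpacks the fixed 12-byte RTP header described in RFC 3550. The function names are my own, and the values used in the usage note (payload type 96, which falls in the dynamic range, plus the sequence number, timestamp and SSRC) are arbitrary examples rather than anything PulseAudio is known to send.

```python
import struct

RTP_VERSION = 2
# Two single bytes, a 16-bit sequence number, a 32-bit timestamp and a
# 32-bit SSRC, all in network byte order: 12 bytes in total.
HEADER = struct.Struct("!BBHII")


def pack_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build a fixed RTP header with no padding, extension or CSRC entries."""
    first = RTP_VERSION << 6
    second = (int(marker) << 7) | (payload_type & 0x7F)
    return HEADER.pack(first, second, sequence & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_header(packet):
    """Decode the fixed header from the first 12 bytes of a packet."""
    first, second, sequence, timestamp, ssrc = HEADER.unpack(packet[:12])
    return {
        "version": first >> 6,
        "marker": bool(second >> 7),
        "payload_type": second & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

For example, unpack_header(pack_header(96, 1, 160, 0x1234)) returns the same four values back along with version 2. A receiver can use the sequence number to spot lost or reordered packets and the timestamp to decide when each chunk of audio should be played.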

PulseAudio

PulseAudio is a piece of software that acts as an interface between audio sources (programs, games, microphones, etc.) and audio 'sinks'. A 'sink' is a place where audio goes, be it a sound card, an audio editor or a network stream. PulseAudio supports streaming via RTP.

Mopidy

Mopidy is a music server that speaks the Music Player Daemon (MPD) protocol, hence the name. It supports a subset of the standard features of MPD, but most of the important ones. Staying close to MPD means that Mopidy can be controlled with one of the many MPD clients available. For instance, during my testing I was using ncmpcpp, a terminal client.

Setup

So, it seems like it can be done, with some time and effort. Others appear to have managed to get the synchronised audio part working properly, but I don't know if anyone has managed to get Spotify to work too.

Since I currently only have two Raspberry Pis, and one is spending its life forwarding electricity and temperature data to another server, I can only set up one Pi as a receiver. This is a good opportunity to test the concept of the Mopidy-PulseAudio-RTP audio stack.

In terms of hardware...

I need a server to host Mopidy and stream its audio via PulseAudio-RTP. I have a Philips Living Room PC Core 2 Duo server that's about 5 years old that I use for this kind of thing, running, as of yesterday, Ubuntu 14.04 LTS Server Edition.
I have a Raspberry Pi, of course, with a 16 GB memory card running Raspbian. The 16 GB is only there because I don't have any other SD cards larger than 2 GB (a standard Raspbian install needs about 3 GB). Mopidy and PulseAudio won't need anywhere near 16 GB of storage, and I'll probably eventually strip out all the extra stuff in Raspbian I don't use and copy the whole card to a 2 GB one. I am using headphones on the Pi to listen to music. The output jack won't drive anything more powerful, so you'll need an amplifier if you want to play music through speakers. It's also possible to output audio via the HDMI port, so that might work for others.
A laptop to control what's playing. Not strictly necessary, if the server is also a useable desktop computer. You can also use one of the many MPD clients available for Android and iOS to control Mopidy, if you want.
A router running DHCP or similar, with every device connected to it. Again, strictly, the DHCP server can be the same as the music server, but I use my home router as the DHCP server so I'm using that. The Raspberry Pi and server are connected via wire to a switch which in turn is connected to the router. The room I've got this stuff set up in has only one ethernet wired connection, and I split it into two with the switch. The laptop is connected wirelessly.

In terms of software, this is what I've used:

Server: Ubuntu 14.04, PulseAudio, Mopidy
Raspberry Pi: Raspbian, PulseAudio
Laptop: Ubuntu 12.04, ncmpcpp terminal client

Instructions

PulseAudio setup on both the server and the Pi

server~$ sudo apt-get install pulseaudio
pi~$ sudo apt-get install pulseaudio

That should work for both Ubuntu and Raspbian.

Next, set up an RTP sink on the server. Change directory and edit /etc/pulse/default.pa:

server~$ cd /etc/pulse/
server~$ sudo nano default.pa

Fill it with the following content:

#! /usr/bin/pulseaudio -nF

load-module module-native-protocol-unix auth-anonymous=1
load-module module-suspend-on-idle timeout=1

load-module module-null-sink sink_name=rtp format=s16be channels=2 rate=16000
load-module module-rtp-send source=rtp.monitor

This will set the audio sample rate to 16 kHz. This is low, but I found that setting it to 44.1 kHz or 48 kHz congested the network to the extent that it became unusable. I need to tweak this later and get it to work, perhaps using a separate network altogether.
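
For a rough sense of scale, this Python sketch works out the raw payload bit rate of uncompressed 16-bit stereo audio at the three sample rates mentioned. It ignores RTP, UDP and IP header overhead, so the real figures on the wire will be somewhat higher, and the arithmetic alone cannot say whether bit rate is the whole cause of the congestion.

```python
def pcm_bitrate(sample_rate_hz, channels=2, bits_per_sample=16):
    """Raw payload bit rate, in bits per second, of uncompressed PCM audio."""
    return sample_rate_hz * channels * bits_per_sample


for rate in (16000, 44100, 48000):
    print(f"{rate} Hz: {pcm_bitrate(rate) / 1e6:.3f} Mbit/s")
```

That comes to roughly 0.5 Mbit/s at 16 kHz, against about 1.4 Mbit/s at 44.1 kHz and 1.5 Mbit/s at 48 kHz.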

Next, edit daemon.conf:

server~$ sudo nano daemon.conf

Leave everything in there, but uncomment and edit the line exit-idle-time to be:

exit-idle-time = -1

And uncomment and edit the line default-sample-format to:

default-sample-format = s16le

On the Raspberry Pi, configure the /etc/pulse/default.pa file:

pi~$ cd /etc/pulse
pi~$ sudo nano default.pa

Uncomment the line that says #load-module module-rtp-recv.

Next, edit daemon.conf on the Pi:

pi~$ sudo nano daemon.conf

Add at the bottom:

default-sample-rate = 16000
resample-method = src-sinc-fastest
default-fragments = 10
default-fragment-size-msec = 10

Mopidy and Spotify on the Server

Following the installation instructions from Mopidy, on the server, add the repository key:

server~$ wget -q -O - http://apt.mopidy.com/mopidy.gpg | sudo apt-key add -

Add the repository sources as a file:

server~$ cd /etc/apt/sources.list.d/
server~$ sudo nano mopidy.list

Add to the file:

# Mopidy APT archive
deb http://apt.mopidy.com/ stable main contrib non-free
deb-src http://apt.mopidy.com/ stable main contrib non-free

Update the repositories and download the packages:

server~$ sudo apt-get update
server~$ sudo apt-get install mopidy mopidy-spotify

Run Mopidy once to make it generate the configuration file:

server~$ mopidy

Press Ctrl+C to kill the server. Edit the file ~/.config/mopidy/mopidy.conf:

server~$ nano ~/.config/mopidy/mopidy.conf

Edit the configuration options under [audio] and [mpd] to look like this (you can leave the commented values already there):

[audio]
output = pulsesink device=rtp

[mpd]
enabled = true
hostname = ::
port = 9999

Set the port in [mpd] to whatever you want. I found that the default MPD port (6600) was already in use, even on the default Ubuntu 14.04 install, so I set it to 9999.
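
If you want to check a port before settling on it, a small Python sketch like the one below tries to bind a TCP socket to it. The function name is my own. Note that it tests IPv4 only, whereas the hostname :: above makes Mopidy listen on IPv6 as well, and that it reports False for any bind failure, including a lack of permission on ports below 1024.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

Calling port_is_free(6600) and port_is_free(9999) on the server before starting Mopidy would show which of the two is available.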

For Spotify, edit the values under [spotify]:

[spotify]
enabled = true
username = [username]
password = [password]

Obviously, replace [username] and [password] with your own details. I found that this pair was all I needed to get Mopidy to access my playlists successfully. It's also possible to specify bitrates and timeouts, but I left these commented.
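
Since a typo in this file is an easy way to end up with silence, here is a hedged Python sketch that reads the [audio] and [mpd] values back using the standard library's configparser. It assumes the file parses as plain INI, which holds for the snippet above but is not something I have verified against every Mopidy option. The function name and the sample text are my own.

```python
import configparser

# Sample text mirroring the [audio] and [mpd] settings shown above.
SAMPLE = """\
[audio]
output = pulsesink device=rtp

[mpd]
enabled = true
hostname = ::
port = 9999
"""


def read_settings(text):
    """Parse INI-style text and return the settings this setup relies on."""
    # Only "=" separates keys from values, so "::" survives as a value.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    return {
        "output": parser.get("audio", "output"),
        "mpd_enabled": parser.getboolean("mpd", "enabled"),
        "mpd_hostname": parser.get("mpd", "hostname"),
        "mpd_port": parser.getint("mpd", "port"),
    }


print(read_settings(SAMPLE))
```

To check the real file, pass in the contents of ~/.config/mopidy/mopidy.conf instead of SAMPLE. The sketch deliberately leaves the [spotify] section alone so that the password is never printed.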

(Optional) ncmpcpp on Controller

Follow the instructions on the ncmpcpp website. I didn't want to add the unstable Debian repository so I compiled it from source, but there are alternatives.

Getting it to Work

Everything is configured, so it should then be possible to start everything up and play music on the Pi. On the server, start PulseAudio as a daemon and Mopidy in userspace:

server~$ pulseaudio -D
server~$ mopidy &

Pay attention to /var/log/syslog if you have any trouble.

On the Pi, start PulseAudio:

pi~$ pulseaudio -D

Again, check the syslog if you run into any trouble.

Now, play something on the server using the laptop or other controller:

laptop~$ ncmpcpp -h 192.168.1.10 -p 9999

The IP address 192.168.1.10 corresponded to my server. The port is set to the one I specified in the configuration.
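
If you don't have an MPD client to hand, the protocol is simple enough to poke at directly. The Python sketch below connects, checks the "OK MPD" greeting, sends one command and collects the "key: value" lines that come back before the closing "OK". The function names are my own, and it assumes that the status command is among those Mopidy supports.

```python
import socket


def parse_response(text):
    """Turn 'key: value' lines into a dict, stopping at the final 'OK'."""
    result = {}
    for line in text.splitlines():
        if line == "OK":
            break
        if line.startswith("ACK"):
            raise RuntimeError(line)  # the server reports errors as ACK lines
        key, _, value = line.partition(": ")
        result[key] = value
    return result


def mpd_command(host, port, command, timeout=5.0):
    """Send a single command to an MPD-compatible server and parse the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rw", encoding="utf-8", newline="\n") as stream:
            greeting = stream.readline()
            if not greeting.startswith("OK MPD"):
                raise RuntimeError(f"unexpected greeting: {greeting!r}")
            stream.write(command + "\n")
            stream.flush()
            lines = []
            for line in stream:
                lines.append(line.rstrip("\n"))
                if line == "OK\n" or line.startswith("ACK"):
                    break
    return parse_response("\n".join(lines))
```

With the address and port from my setup, mpd_command("192.168.1.10", 9999, "status") should return a dictionary describing the player state.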

In ncmpcpp, I can look at my Spotify playlists and play them, and I hear the tunes through my Raspberry Pi's headphones. The music quality isn't great, because I set the sample rate quite low. I will need to investigate network issues to see how to improve audio quality while keeping network congestion minimal. PulseAudio's RTP module sends its stream by multicast, which can put a heavy load on a network. For instance, reports all over the web say it can floor a WiFi connection with ease. Sonos uses a separate wireless network, presumably because quality and throughput can't be guaranteed on a regular network.

So, I have the ability to play music across the network. Next up, I will get another Raspberry Pi and set it up to listen to the same stream, to test how multicast works with two clients. Hopefully the timing provided by RTP will mean the music streams will be in sync.

Credit to ianmacs' posts on this Raspberry Pi forum thread, and this blog post from rozzin.