Sonos-Like Synchronised Streaming, Part 2

In my last post I outlined my intention to make a Sonos-like audio streaming system. Today I managed to get a single Raspberry Pi playing music from Spotify through a server on my home network. I’ll explain how I did it, and what I’m going to do next.

As I hinted at in my previous post, I don’t particularly fancy playing about with a Perl behemoth like Logitech Media Server. It also just seems a bit of a dead end to me – the server’s code is open source, but it’s hosted by Logitech, and it was open-sourced after they mothballed their Squeezebox range. Who can say how long Logitech will keep hosting and maintaining the code? Surely they will lose interest, without a product range to back up their investment of time and money. Additionally, the software seems so featureful that the community aren’t forking it and actively adding new features or refactoring the code. This state of play led me to try to find another solution, and it seems I have found it in Mopidy.

Mopidy

Mopidy is a Music Player Daemon (MPD) service, hence the name. It supports only a subset of MPD’s standard features, but that subset covers most of the important ones. Staying close to MPD means that Mopidy can be controlled with one of the many MPD clients available. For instance, during my testing I was using ncmpcpp, a terminal client.

The feature I love about Mopidy is that it is written in Python. I can open it up and work out what’s going on, with enough effort, and I can add extensions if I am so inclined. It also supports PulseAudio, which brings me on to the next part of the setup…

PulseAudio

PulseAudio is a piece of software that acts as an interface between audio sources (programs, games, microphones, etc.) and audio ‘sinks’. A ‘sink’ is a place where audio goes, be it a sound card, an audio editor or a network stream. PulseAudio supports streaming via RTP, which brings me on to the next part…

Real-time Transport Protocol

Real-time Transport Protocol, or RTP, is a network transport protocol designed to deliver data in a timely fashion. It is intended for streaming media such as audio, and often forms part of VoIP packages a la Skype. With audio we can tolerate some packet loss in favour of keeping everything timely.

Setup

Since I currently only have two Raspberry Pis, and one is spending its life forwarding electricity and temperature data to another server, I can only set up one Pi as a receiver. This is a good opportunity to test the concept of the Mopidy-PulseAudio-RTP audio stack.

In terms of hardware…

  • I need a server to host Mopidy and stream its audio via PulseAudio-RTP. I have a Philips Living Room PC Core 2 Duo server that’s about 5 years old that I use for this kind of thing, running, as of yesterday, Ubuntu 14.04 LTS Server Edition.
  • I have a Raspberry Pi, of course, with a 16 GB memory card running Raspbian. The 16 GB is only there because I don’t have any other SD cards larger than 2 GB (a standard Raspbian install needs about 3 GB). Mopidy and PulseAudio won’t need anywhere near 16 GB of storage, and I’ll probably eventually strip out all the extra stuff in Raspbian I don’t use and copy the whole card to a 2 GB one. I am using headphones on the Pi to listen to music. The output jack won’t drive anything more powerful, so you’ll need an amplifier if you want to play music through speakers. It’s also possible to output audio via the HDMI port, so that might work for others.
  • A laptop to control what’s playing. Not strictly necessary, if the server is also a useable desktop computer. You can also use one of the many MPD clients available for Android and iOS to control Mopidy, if you want.
  • A router running DHCP or similar, with every device connected to it. Strictly speaking, the DHCP server could be the same machine as the music server, but my home router already does that job so I’m using that. The Raspberry Pi and server are connected via wire to a switch which in turn is connected to the router. The room I’ve got this stuff set up in has only one wired ethernet connection, and I split it into two with the switch. The laptop is connected wirelessly.

In terms of software, this is what I’ve used:

  • Server: Ubuntu 14.04, PulseAudio, Mopidy
  • Raspberry Pi: Raspbian, PulseAudio
  • Laptop: Ubuntu 12.04, ncmpcpp terminal client

Instructions

Here’s how to do it:

PulseAudio setup on both the server and the Pi

server~$ sudo apt-get install pulseaudio
pi~$ sudo apt-get install pulseaudio

That should work for both Ubuntu and Raspbian.

Next, set up an RTP sink on the server. Change directory and edit /etc/pulse/default.pa:

server~$ cd /etc/pulse/
server~$ sudo nano default.pa

Fill it with the following content:

#! /usr/bin/pulseaudio -nF

load-module module-native-protocol-unix auth-anonymous=1
load-module module-suspend-on-idle timeout=1

load-module module-null-sink sink_name=rtp format=s16be channels=2 rate=16000
load-module module-rtp-send source=rtp.monitor

This will set the audio sample rate to 16 kHz. This is low, but I found that setting it to 44.1 kHz or 48 kHz congested the network to the extent that it became unusable. I need to tweak this later and get the higher rates to work, perhaps using a separate network altogether.
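To put rough numbers on those sample rates: the raw bitrate of an uncompressed PCM stream is simply sample rate × channels × bits per sample. The sketch below does that arithmetic. The helper name pcm_kbps is made up for illustration, and the figures ignore RTP/UDP/IP header overhead, so real traffic will be a little higher.

```shell
# Raw PCM bitrate in kbit/s: sample rate (Hz) x channels x bits per sample.
# pcm_kbps is a hypothetical helper, not part of PulseAudio.
# Packet header overhead is ignored, so actual network usage is slightly higher.
pcm_kbps() {
    rate=$1; channels=$2; bits=$3
    echo $(( rate * channels * bits / 1000 ))
}

pcm_kbps 16000 2 16   # the rate set on the null sink above
pcm_kbps 44100 2 16   # CD-quality stereo
pcm_kbps 48000 2 16
```

That works out at 512 kbit/s for the 16 kHz stream, against roughly 1.4 Mbit/s and 1.5 Mbit/s for 44.1 kHz and 48 kHz.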

Next, edit daemon.conf:

server~$ sudo nano daemon.conf

Leave everything in there, but uncomment and edit the line exit-idle-time to be:

exit-idle-time = -1

And uncomment and edit the line default-sample-format to:

default-sample-format = s16le
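If you would rather script these two edits than make them by hand in nano, something like the following might do. It is only a sketch: set_conf_key is a hypothetical helper, it assumes GNU sed (for the -i flag, as shipped with Ubuntu and Raspbian), and it assumes disabled lines in daemon.conf start with ';' or '#'.

```shell
# set_conf_key FILE KEY VALUE
# Hypothetical helper: uncomment (or append) "KEY = VALUE" in a config file
# whose disabled lines start with ';' or '#'.
# Assumes GNU sed for in-place editing with -i.
set_conf_key() {
    file=$1; key=$2; value=$3
    if grep -q "^[;#]\{0,1\}[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Key already present (commented or not): rewrite that line.
        sed -i "s|^[;#]\{0,1\}[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        # Key missing entirely: add it at the bottom.
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example (run as root, since /etc/pulse is root-owned):
#   set_conf_key /etc/pulse/daemon.conf exit-idle-time -1
#   set_conf_key /etc/pulse/daemon.conf default-sample-format s16le
```

It is worth taking a copy of daemon.conf first, since the helper rewrites the file in place.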

On the Raspberry Pi, configure the /etc/pulse/default.pa file:

pi~$ cd /etc/pulse
pi~$ sudo nano default.pa

Uncomment the line that says #load-module module-rtp-recv.

Next, edit daemon.conf on the Pi:

pi~$ sudo nano daemon.conf

Add at the bottom:

default-sample-rate = 16000
resample-method = src-sinc-fastest
default-fragments = 10
default-fragment-size-msec = 10
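For a feel of what the two fragment settings add up to, here is a small worked calculation. It treats the total buffer as number of fragments × fragment length, and assumes 2 channels of 16-bit (2-byte) samples at the 16 kHz rate set above. The helper name buffer_size is made up for illustration.

```shell
# Hypothetical helper: total buffer implied by default-fragments and
# default-fragment-size-msec, in milliseconds and in bytes.
# Arguments: fragments, fragment length (ms), sample rate (Hz),
#            channels, bytes per sample.
buffer_size() {
    fragments=$1; frag_msec=$2; rate=$3; channels=$4; bytes_per_sample=$5
    total_msec=$(( fragments * frag_msec ))
    total_bytes=$(( rate * channels * bytes_per_sample * total_msec / 1000 ))
    echo "${total_msec} ms, ${total_bytes} bytes"
}

buffer_size 10 10 16000 2 2
```

With the values above that comes to 100 ms of audio, or 6400 bytes.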

Mopidy and Spotify on the Server

Following the installation instructions from Mopidy, on the server, add the repository key:

server~$ wget -q -O - http://apt.mopidy.com/mopidy.gpg | sudo apt-key add -

Add the repository sources as a file:

server~$ cd /etc/apt/sources.list.d/
server~$ sudo nano mopidy.list

Add to the file:

# Mopidy APT archive
deb http://apt.mopidy.com/ stable main contrib non-free
deb-src http://apt.mopidy.com/ stable main contrib non-free

Update the repositories and download the packages:

server~$ sudo apt-get update
server~$ sudo apt-get install mopidy mopidy-spotify

Run Mopidy once to make it generate the configuration file:

server~$ mopidy

Press Ctrl+C to kill the server. Edit the file ~/.config/mopidy/mopidy.conf:

server~$ nano ~/.config/mopidy/mopidy.conf

Edit the configuration options under [audio] and [mpd] to look like this (you can leave the commented values already there):

[audio]
output = pulsesink device=rtp
[mpd]
enabled = true
hostname = ::
port = 9999

Set the port in [mpd] to whatever you want. I found the default MPD port (6600) was already in use, even on the default Ubuntu 14.04 install, so I set it to 9999.
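A port clash like that can be checked for before settling on a number. The sketch below assumes the ss tool from iproute2 is available on the server and that its -ltn output keeps the local address in the fourth column. The helper name port_in_use is made up for illustration.

```shell
# port_in_use PORT
# Reads `ss -ltn` style output on stdin and succeeds if some listening
# socket's local address ends in :PORT.
# Hypothetical helper; assumes the local address is the 4th column.
port_in_use() {
    awk -v port="$1" '
        $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Example:
#   ss -ltn | port_in_use 6600 && echo "6600 is already taken"
#   ss -ltn | port_in_use 9999 || echo "9999 looks free"
```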

For Spotify, edit the values under [spotify]:

[spotify]
enabled = true
username = [username]
password = [password]

Replace [username] and [password] with your own Spotify credentials. I found that this pair was all I needed to get Mopidy to access my playlists successfully. It’s also possible to specify bitrates and timeouts, but I left these commented.

(Optional) ncmpcpp on Controller

Follow the instructions on the ncmpcpp website. I didn’t want to add the unstable Debian repository so I compiled it from source, but there are alternatives.

Getting it to Work

Everything is configured, so it should now be possible to start everything up and play music on the Pi. On the server, start PulseAudio as a daemon and Mopidy as a background process under your own user account:

server~$ pulseaudio -D
server~$ mopidy &

Pay attention to /var/log/syslog if you have any trouble.

On the Pi, start PulseAudio:

pi~$ pulseaudio -D

Again, check the syslog if you have any trouble.
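Besides the syslog, pactl can show whether the pieces are actually in place. This is a sketch only: it assumes the tab-separated layout of `pactl list short`, with the name in the second column, and has_name is a made-up helper.

```shell
# has_name NAME
# Reads `pactl list short ...` output on stdin (assumed tab-separated, with
# the sink or module name in the 2nd column) and succeeds if NAME is listed.
# Hypothetical helper for illustration.
has_name() {
    awk -F '\t' -v name="$1" '
        $2 == name { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# On the server: is the null sink called "rtp" there?
#   pactl list short sinks | has_name rtp && echo "rtp sink present"
# On the Pi: is the RTP receiver module loaded?
#   pactl list short modules | has_name module-rtp-recv && echo "receiver loaded"
```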

Now, play something on the server using the laptop or other controller:

laptop~$ ncmpcpp -h 192.168.1.10 -p 9999

The IP address 192.168.1.10 corresponded to my server. The port is set to the one I specified in the configuration.
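If ncmpcpp refuses to connect, it can help to check that something MPD-like is answering on that port at all. An MPD-compatible server greets each new connection with a line starting "OK MPD" followed by a protocol version, so a quick test is to look for that banner. The sketch below assumes netcat (nc) is installed on the laptop, and is_mpd_greeting is a made-up helper name.

```shell
# is_mpd_greeting LINE
# Succeeds if LINE looks like the banner an MPD-compatible server sends
# when a client connects ("OK MPD <protocol version>").
# Hypothetical helper for illustration.
is_mpd_greeting() {
    case $1 in
        "OK MPD "*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example, using the address and port from above (nc behaviour varies a
# little between netcat variants, so treat this as a starting point):
#   greeting=$(printf 'close\n' | nc 192.168.1.10 9999 | head -n 1)
#   is_mpd_greeting "$greeting" && echo "Mopidy is answering on port 9999"
```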

In ncmpcpp, I can look at my Spotify playlists and play them, and I hear the tunes through my Raspberry Pi’s headphones. The music quality isn’t great, because I set the sample rate quite low. I will need to investigate network issues to see how to improve audio quality while keeping network congestion minimal. PulseAudio’s RTP sender uses multicast by default, which in general puts a heavy load on a network. For instance, reports all over the web say it can floor a WiFi connection with ease. Sonos uses a separate wireless network, and presumably this is because quality and throughput can’t be guaranteed on a regular network.

So, I have the ability to play music across the network. Next up, I will get another Raspberry Pi and set it up to listen to the same stream, to test how multicast works with two clients. Hopefully the timing provided by RTP will mean the music streams will be in sync.

Credit to ianmacs’ posts on this Raspberry Pi forum thread, and this blog post from rozzin.

Sonos-Like Synchronised Streaming, Part 1

I got a Sonos kit a little while back through an employee offer at my old workplace. For anyone that doesn’t know, this is an easy-to-use solution to playing audio from local computers, internet radio, Spotify, Napster and so on. You buy a Sonos speaker and base station, and connect the latter to your internet connection. It finds network attached media servers and can interface with your Spotify account (or other online music service). Then, you either buy the ridiculously expensive Sonos controller or download the free app to control what the speaker plays.

I’ve used a controller at a friend’s house and it is nice, but the app is just as featureful. It combines searches across the different sources quite nicely, though this is only a recent feature. You can create queues, random shuffles and so on, just like any good media player.

The really nice thing about the system is the fact that it works seamlessly. You can add a new speaker to the mix by plugging it in and holding down two buttons. It then appears in your app, and you can control what is played on it. You can play different songs in different rooms, or synchronise one audio stream across all of the players. Really cool.

I can completely understand the problem the Sonos system tries to solve. It has the Apple-esque “it just works” feeling about it. I would love to purchase a second Sonos to make use of the multi-room synchronisation features. However, the Sonos system has an Apple-esque pricetag, and, as a massive nerd and tinkerer, I find myself wondering how much effort it would take to make a homebrew version.

The obvious way to do this is with a Raspberry Pi. This credit card sized computer is great for hacking together web-connected controllers. It has its own native Linux distribution and there are loads of tutorials on the web concerning most features of the device. What’s more, they are cheap. Whereas a Sonos system is many hundreds of pounds, a Raspberry Pi starts at £25. Once you add a few extras for playing music, I don’t think this will come close to breaking the bank.

There are numerous reasons why a homebrew setup might be difficult:

  • The Raspberry Pi’s analogue audio output is not so brilliant in terms of quality, and therefore of little use for a Sonos-rivalling music system. Instead I expect a solution would either involve the Pi’s HDMI output or a USB sound card.
  • Synchronisation across distributed units is potentially difficult, and might involve something like Real-time Transport Protocol.
  • Interfacing with music services like Spotify and Napster together with files on a home server or connected laptop is a whole project in itself.
  • Latency can cause all sorts of issues, especially for synchronised audio. The Raspberry Pi has limited CPU power, and connecting it to a WiFi network or congested wired network will no doubt impact performance.
  • Controlling the various distributed units will require some thought about topology. Should there be a web interface to a single server (not necessarily a Raspberry Pi but a computer on the same network), which can control the connected Pis? Or a mesh network where each Pi is both a client and server, passing around packets to each other? How should the server know about each connected unit? Should it send a message to the whole network to find them, or should the user need to register the IP address of each device in an admin panel or configuration file?

Sonos has addressed some of these issues by insisting that it has its own dedicated network. The base station that is a requirement of the Sonos system creates a wireless network of its own which the networked speakers can listen to. Other network traffic therefore does not impact the system. It also, handily for the shareholders, means that only Sonos speakers work with other Sonos speakers. Once you’ve shelled out £250 for the first kit, if you want to add any more you need to shell out more for further Sonos speakers.

The open-source community excels at providing alternatives to systems exactly like Sonos. The gap in the market that Sonos fills is to take these ideas and create a product. Open-source implementations often involve many different pieces of independent software, and lots of configuration to get them to talk to each other. This is the part I would like to try to improve. There are various homebrew setups of Sonos-like systems described in blogs, but there doesn’t appear to be a system that covers the whole Sonos stack from app to speaker to Spotify.

Technologies exist for the individual layers on the Sonos stack:

  • Synchronisation of audio streams across different devices: Real-time Transport Protocol
  • Playing of audio from network: various Linux technologies such as PulseAudio or ALSA
  • Streaming audio from a combination of local and remote sources: Mopidy
  • Control via web interface or Android/iOS app: Moped or similar

It can be done. It will take time to do though. Others appear to have managed to get the synchronised audio part working properly, but I don’t know if anyone has managed to get Spotify to work too. Various solutions also involve Logitech Media Server, which is (somewhat weirdly) open-source and written in Perl (eugh). That might be an option, but I doubt I’d want to make any code modifications in Perl. Happily, Mopidy is written in Python, a glorious language, so I would be happy to dive in and start tinkering.

I’m going to give this a go at some point. I’ll keep you posted.

Update: view the second post in this series.