Music and network theory: a plan to improve Shuffle

published Jul 12, 2004, last modified Jun 26, 2013

This article just went live at OSNews.com

Update: answering to comments in OSNews: Yes, it does strongly resemble Markov chains. I just went through Numerical Methods at college, and though I didn't pass, some of it stuck.

The 25% magical number (read the article) is just a hint. I did consider making the relationship a function of next song's play time. But the function needs to be in parts. Say, f(x) = 0 , x < 10% of playtime, f(x) = x (or x squared?) x >= 10% of playtime. And even this has a bug: it should be proportional to the playtime percentage, otherwise long songs would get unfairly higher correlation with songs that played before them.

One of the ultimate goals of music is to provide enjoyment. Be it through association of memories, stimulation, rhythm or melodic messages, music fans all over the world enjoy listening to music.

Digital media, particularly the MP3 format (and later improvements), has empowered you and me in a way imagined never before. Owning and managing a large digital song collection is much easier than owning a sizable record collection. I honestly have songs I haven't ever heard.

The problem: listening to lots of music is harder than it seems

But actually listening to so much music is harder than managing it. Let me explain why.

There are (at least) two different strategies for listening to music. I can either:

  • choose and queue some songs up in my favorite player (Rhythmbox or XMMS, at the moment)
  • queue my entire music collection and turn Shuffle on in the music player.

Here's strategy 1 in a nutshell: When I know I want to listen to Robert Miles, I queue it up and listen to it.

But, in practice, strategy 1 has problems:

  • Selecting and queueing songs is a pain
  • I often do not precisely know what I want to listen to
  • I have to actually select a number of songs in advance, or interrupt my work every two to three minutes to queue a new song
  • I do not get to discover songs I haven't heard

I bet it happens to you too.

So, I resort to strategy 2: Shuffle play. And so, this minute Robert Miles (dream dance) is up, and the next minute perhaps UB40 (reggae) will be singing on my computer. Or Juan Luis Guerra (merengue).

You see, Shuffle is very good at helping you find previously unheard songs, but it's rather bad at selecting the most appropriate song for your current mood. Shuffle can do wonders with a small list of songs, but small lists of songs means you do not get to experience songs which aren't on the playlist. This means you have to either create lists (strategy 1, anyone?), or to interrupt your business every two or three minutes, maybe seven out of ten times.

I bet you have noticed this as well. Lots of music gives you wide variety, but does not guarantee enjoyment.

Unfortunately, there is no practical way to have the computer analyze all songs and tell it to play only "uptempo", or "romantic", or "party" type songs, other than making playlists yourself (cumbersome) or sorting by genre (useless due to the differences in song tempoes). Or is there?

How we choose what we want to listen

A party DJ keeps thousands of songs, and several playlists. But the playlists are just a small part of his equation. His brain performs, in real time, a much more complicated set of calculations: he must mix, match and weigh several factors which ultimately lead to a decision on "what to play next" (that's the mission of the DJ). At any given minute, he factors:

  • requests from people
  • a good sense to determine people's tastes in relation to music
  • a good sense to determine "what to play next" based on what's currently playing and what's your audience's target mood

We (you and me) perform the exact same analysis when playing music, unconsciously. Sure, we may or may not regard others' requests, and your tastes probably differ considerably from mine, but we already silently carry around in our brains what we need:

  • a list of favorite songs, divided into "mood playlists" (we decide on a whim whether a song we hear is "wanted" in our current mood)
  • a huge mass of data on the relationships between songs we've already listened to (if asked, we can quickly know if a song would be perfect to listen to, after this one)

A mathematical foundation for the solution: network theory

Our neural network stores a song relationship network, which is intertwined with a set of playlists and preferences. Modern day players (such as Rhythmbox) already give you a way to rate songs, so preferences are mostly covered. By the way, I wish for two things in that field: Rhythmbox should build a less unobtrusive way to rate songs, and it should automatically calculate (both overall and seasonal) preferences based on what I have chosen to listen.

Song relationship networks have been explored less. There are projects which seek to mine personal networks for benefit. Audioscrobbler and AudioGalaxy have collected massive information on listeners' preferences, and can successfully tell you that if you liked Vanessa Amorosi, you're probably going to like Ace of Base. But they cannot find out whether it is "right" to listen to Ravine from Ace of Base after Beautiful life. Only you can.

What I'm proposing here is that Rhythmbox (or any other music player) should remember "what I've heard next" and use that as a reference/suggestion to choose songs in Mood mode (a Shuffle mode on steroids). All you would need to do is turn Mood on, and every once in a while hit Next until the player learns enough.

This sounds extremely difficult to implement, but it is probably easier than it seems to.

Network theory applied to music

Say you have 28 songs (A to Z) you generally like to listen to. When in a romantic mood, you like to listen to A through E, while in an uptempo mood you usually to listen to W to Z. This means A, B, C, D and E are related to each other. The same applies to W, X, Y and Z.

Relationships between songs matter just as much as personal taste. Probably Z sounds just about right after W, or X, but it may not be "wanted" after A, E or F.

It follows that the number of times you've heard Z after W is higher than you've heard it after A. Z is more strongly related to W than to A. In practical terms, the mutual relationship between W and Z goes up if you choose Z or let Z play when it comes up after W.

Songs don't form isolated groups, though. Maybe song Q sounds great after Z and after A as well (Q has a high relationship between Z and A). Song Q forms a bridge between the two islands A-E and W-Z (this allows for smooth drifts of mood).

Mathematically speaking, the probability that Z gets queued up after W is the number of times Z played after W, divided by the the number of times songs played after W.

This system is simple. Mood mode should choose the next song to play, among a list of candidates that historically played after the current song, then assign probabilities to each candidate using a probability distribution calculated from the relationship strengths between the current and the candidate songs. If Z played 250 times after W, and X played 125 times after W, then Z has double the chance of X being queued up after W.

P(Z|W) = N(Z|W) / Sum of N(any,W)

The clash with reality: how to make this system work fine

However, this system, if implemented as explained, would not be useful (are you surprised?). You'd need to manually select songs until the computer has learned, or use Shuffle to teach the computer. For ten thousand songs, that is just not practical. It would take ages to seed the database with enough relationships for it to be useful and start "digging your tastes and moods".

It is also not robust. What would happen if you were listening to a song for two minutes, and decide to change it to another song appropriate for the current mood? Should this event be tallied up as making the relationship stronger, or weaker?

Thus, I propose several incremental improvements for Mood mode.

Use two-way (mutual) instead of one-way (directional) relationships between songs

Whether you have heard W after Z, but not Z after W, should not matter. W and Z should still be related. That means everytime you choose Z (or let Z play) after W, the relationship should grow stronger between both Z and W.

Remember what has played in the current session

Songs that already played before should not be queued up again. This cancels out a flaw introduced by 1.

Do not increase the relationship between two songs right away

The player should expect that the user may change the song that just came up, in order to better suit his tastes. Say Z was played last, and B was chosen by the player (be it Mood mode or Shuffle mode) to play. B starts playing, and the user reacts this is not what I want to hear, let's play X instead. He changes to X, and lets X play to completion.

The player has to be smart enough to recognize this, perhaps by waiting until at least 25% of B has played before increasing the relationship strength between Z and B. It should then actually increase the relationship between Z and X (evidently, after 25% of X has played).

25% is off the top of my head. I don't know the magic number. Either trial and error or statistical methods can be used to determine this.

Consider not just directly related songs, but indirectly related songs

While a friend can grant you a favor, you can also request your friend to ask another friend of his to make you one. The probability that a friend of a friend would grant you a favor is, naturally, smaller than of your friend granting a favor directly to you, but it is not zero.

M is related to N is related to P. It follows that P should generally be a better a listening choice after M than a random song, although N is better.

If the list of candidates is short (say, below 3), it should be inspected one level deeper. Once a candidate has been chosen, the player should make a random choice between the candidate (with probability = 0.5) or any of its related songs (with sum of their probabilities = 0.5).

This will help you find relatively more likable alternatives to listen after each song.

Applying this principle one level deeper may be marginally more useful, but would be much more computationally intensive. Or it may not be useful at all.

Constrain the number of candidates

Over time, the list of candidates for any song may grow very big. Constraining the list of candidates to, say, the top 5, will help by penalizing weak relationships.

Weaken relationships with time

New songs are at a disadvantage in Mood mode. Allowing the relationship network to adapt by weakening old relationships gradually would help new songs find their way into candidate lists, and would help the system in adapting to your changing tastes.

This weakening should be controlled, so new songs can have the opportunity to appear on candidate lists, but at the same time don't make old, repeated choices be "overwhelmed".

Allow for discovery of new songs

This system would only let you hear songs you've heard before. That's not the goal - the goal is to listen to new songs as well. The user should have a slider type control for how much "Shuffleness" he wants (between "Be totally random" to "Strictly follow my past choices").

Ideally, the default setting for this control would be to choose songs from the list of candidates with a probability of 0.8, and choose a random song with a probability of 0.2. A "Strictly follow my past choices" setting would be ideal for hands-off hosting of a party: two clicks on a dance music track, and the rest of the party would be dance type or compatible songs. "Be totally random" would allow you to be more explorative and create new relationships.

Numbers are off the top off my head, again.

Conclusion

Exploiting network theory in music players is just one of the interesting applications of mathematics in general. I'm sure someone else can point out examples of this being used in other fields. Let's put this to use now. We'll innovate.

Pseudo-code for song selection

candidates = topFive(currentSong.getRelationships())
candidate = randomWeightedChoice(candidates)

if candidates.length() < 3
    coinToss = randomChoiceHeadsOrTails()
        if coinToss = heads
            candidates = topFive(candidate.getRelationships())
            candidate = randomWeightedChoice(candidates)
        else if coinToss = tails
            pass

playCandidate = playCandidateOrRandomSong()
if playCandidate = true
    player.queue (candidate)
else
    player.queue (trulyRandomSong())