What makes an album?
September 02, 2014 in bliss by Dan Gravell
In the world of computer audio, what makes an album an album? To some this might seem a trivial question; but that's because you have internalised the concept of an album as a collection of music. What I mean is, given a set of music files, how can you formally partition these into albums in a deterministic manner?
This matters because it underlies how most music software works. In music players, many people want to browse and play by album. Music organisers like bliss need to query for album information, not just song or track information, because people want to classify their music by album and attach album-related metadata to it (album artwork, for instance).
How bliss recognises albums
So here's how bliss recognises albums. The two basic approaches to identifying a distinct album are:
- Tracks with the same album name tag and deduced artist name (see below) are clustered and considered an 'album'
- Tracks with the same album name tag and in the same folder are clustered together, with an artist deduced from the tracks' artist tags
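To make the folder-based approach concrete, here is a minimal Python sketch. It is not bliss's actual code: the track dictionaries and their `path` and `album` keys are assumptions made purely for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def cluster_by_folder_and_album(tracks):
    """Group tracks that share both a parent folder and an album name tag.

    Each track is a hypothetical dict with "path" and "album" keys;
    this shape is an assumption, not bliss's real data model.
    """
    clusters = defaultdict(list)
    for track in tracks:
        folder = str(PurePosixPath(track["path"]).parent)
        clusters[(folder, track["album"])].append(track)
    return dict(clusters)
```

With this grouping, two folders that each hold tracks tagged "Greatest Hits" come out as two separate albums, which is usually what you want for such a common album name.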
Artist name deduction is the second level of the heuristics. This follows an ordered set of rules. The first that matches wins:
- If there is one distinct album artist tag then this is used.
- Take all the track artist tags. If there is one used in over half of the tracks and there are at least seven tracks, use that.
- If there's an album artist used in over half of the tracks, and there are at least seven tracks, use that.
- Finally, if nothing could be used given the above, use Various
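The ordered rules above could be sketched in Python roughly as follows. This is an illustration under assumptions, not bliss's implementation: the `artist` and `album_artist` keys are hypothetical, and tracks with a missing album artist tag are simply ignored when checking the first rule, which the rules as stated do not specify.

```python
from collections import Counter

MIN_TRACKS = 7       # "at least seven tracks"
VARIOUS = "Various"  # fallback artist name


def _majority(values, total):
    """Return the value present on more than half of `total` tracks, else None."""
    counts = Counter(v for v in values if v)
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    return value if count * 2 > total else None


def deduce_artist(tracks):
    """Apply the rules in order; the first that matches wins."""
    total = len(tracks)
    album_artists = [t.get("album_artist") for t in tracks]
    track_artists = [t.get("artist") for t in tracks]

    # Rule 1: exactly one distinct album artist tag.
    # Assumption: tracks without the tag are ignored here.
    distinct = {a for a in album_artists if a}
    if len(distinct) == 1:
        return next(iter(distinct))

    if total >= MIN_TRACKS:
        # Rule 2: a track artist used on over half of the tracks.
        artist = _majority(track_artists, total)
        if artist:
            return artist
        # Rule 3: an album artist used on over half of the tracks.
        artist = _majority(album_artists, total)
        if artist:
            return artist

    # Rule 4: nothing matched.
    return VARIOUS
```

The seven-track threshold is what stops a short EP with one dominant artist from being attributed to that artist; under these rules such a cluster falls through to Various unless it has a single album artist tag.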
One thing to consider is that this is just the way bliss does it. Different software recognises albums in different ways, which means the total count of albums can vary between software. An extreme example of this is iTunes, which is pretty unsophisticated when it comes to interpreting track artists.
I'd be interested in people's thoughts about how bliss's album identification heuristics could be refined.
Thanks to xlibber for the image above.