Corey Mwamba

Bandcamp - some FLAC [but perhaps not the one you first gave]

Some files of mine had gone missing. They were [as is the way with all files that go missing] definitely in that folder the last time I looked, but they certainly weren't there now.

But it didn't matter, because I'd uploaded them to Bandcamp. Being a Linux chap, I store all my files as FLAC, so I just downloaded the FLACs I needed, then stuck them on to listen...

But there was crackling. Now, I know there wasn't crackling on the audio before I uploaded it: I'd downloaded Everybody's Reading and the sound was taken from the video, untreated. So I decided to ask the small section of the Internet called Twitter.

Does Bandcamp offer the same FLAC files you upload? It would appear not... Crackles that weren't apparent on original file now there.

The ears of Han-earl Park and Paul Jones pricked up, and Paul wondered about Soundcloud's transcoding process. Soundcloud transcodes any file format to MP3 for its Flash-based player. I have no idea what it does for the HTML5-based player, but in any case the transcoding does not affect the file you can download: Soundcloud gives you the file that the maker originally uploaded, no matter what format it was. It's then up to you as the downloader to convert the file to the format of your choice. This may not be convenient, but I see it as good: it means you get exactly what the maker intended.

Bandcamp is supposed to do exactly the same thing for the uploaded formats: here's a quote from the FAQ:

Aside from actually encoding into the various formats, we don't do anything to your upload: no EQ, boosting or multi-band companding, and definitely no two-pole Butterworth band-pass/band-reject filtering (so tempting) [...] the lossless formats are exactly as you uploaded them.

But I didn't imagine the crackle. It was definitely there. So I ran a number of quick tests on the same audio—2-all from the recording with Dave Kane and Alex Hawkins [it's on Soundcloud and Bandcamp]. I'm using Linux. Don't ask me how to do it on Windows—I really don't know.

  1. md5 checksums

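    this was a speedy thing to do, although not 100% fool-proof: a checksum covers the whole file, tags and pictures included, so it can say that two files differ but not whether the audio differs. But it shows one thing: the Bandcamp file [2allbc.flac] is different.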

    [corey@corlap testsweb]$ md5sum *.flac
    50d737755f90c97345e06c33cf2be24e  2allbc.flac
    608c46e9391d32b5870df4dda7e5c9e5  2all-orig.flac
    608c46e9391d32b5870df4dda7e5c9e5  2allsc.flac
    
  2. file information

    there are various—seriously, VARIOUS—ways of getting information about an audio file on a Linux system: I just plumped for sndfile's sndfile-info. As you can read from the data [ 1,2,3 ], the Bandcamp file is bigger. This is probably because of the picture metadata.

  3. audio identity—first attempt

    I then imported all the files into Audacity and plotted a spectrum of one minute's audio at the 18s mark for each file [in Audacity go to Analyze... Plot Spectrum]. The spectrum is a graph of audio power in decibels over audio frequency in Hertz.

    In identical files, the plot will be the same. Audacity exports the plot as a table in a text file, so on a UNIX-based system you can see if there's a difference between the text files using diff. UNIX commands tend to say nothing when there is nothing to report, so if there's no difference, diff will not output anything.

    The first time I did this, I got differences between Bandcamp and the original file: the original file was louder than the Bandcamp one. But the second time I ran it they were identical. So I'd say this was inconclusive, although I have not had the time to find out why. I needed to look for a better method.

  4. audio identity—second attempt

    flac is a command-line tool which handily comes with its own analysis function. With the -a flag plus --residual-gnuplot, flac will write a gnuplot file of the residuals for each subframe of the file. In two identical files these should be the same.

    To test this I created a 30-second sine wave in Audacity and saved it as a FLAC. I then copied the file and renamed the copy. I then ran flac -a --residual-gnuplot on each file: here's a zip of the GNUplots of the first, and this is the second.

    [corey@corlap a440-test]$ diff -s a440-2/f000080.s0.gp a440-1/f000080.s0.gp 
        Files a440-2/f000080.s0.gp and a440-1/f000080.s0.gp are identical

    diff -s reports when two files are identical, as it does above for the two copies of the sine wave. And here we see that the Soundcloud plots and the original plots are identical.

    [corey@corlap testsweb]$ diff orig-gnuplot/f000080.s0.gp sc-gnuplot/f000080.s0.gp -s
        Files orig-gnuplot/f000080.s0.gp and sc-gnuplot/f000080.s0.gp are identical
    [corey@corlap testsweb]$ diff orig-gnuplot/f003280.s0.gp sc-gnuplot/f003280.s0.gp -s
        Files orig-gnuplot/f003280.s0.gp and sc-gnuplot/f003280.s0.gp are identical

    But the Bandcamp plots are not, and there are also fewer of them. This time I used diff -q to just report a difference rather than show where the differences are:

    [corey@corlap testsweb]$ diff orig-gnuplot/f000080.s0.gp bc-gnuplot/f000080.s0.gp -q
        Files orig-gnuplot/f000080.s0.gp and bc-gnuplot/f000080.s0.gp differ
    [corey@corlap testsweb]$ diff orig-gnuplot/f003280.s0.gp bc-gnuplot/f003280.s0.gp -q
        Files orig-gnuplot/f003280.s0.gp and bc-gnuplot/f003280.s0.gp differ
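
Checking the plots one pair at a time gets tedious, and it doesn't show which plots are missing from one side. A minimal sketch of comparing the two directories in one go, assuming the layout used above [orig-gnuplot, bc-gnuplot]; the function name summarise_plots is made up for illustration and is not part of flac or diff:

```shell
# summarise_plots DIR_A DIR_B
# Counts the plot files that differ and the ones present in only one directory.
# The name is hypothetical; it only wraps the standard diff -r -q.
summarise_plots() {
    # LC_ALL=C keeps diff's messages in English so the patterns below match
    report=$(LC_ALL=C diff -r -q "$1" "$2" || true)
    differing=$(printf '%s\n' "$report" | grep -c ' differ$' || true)
    unmatched=$(printf '%s\n' "$report" | grep -c '^Only in ' || true)
    echo "differing: $differing, unmatched: $unmatched"
}
```

Something like summarise_plots orig-gnuplot bc-gnuplot would then give both counts at once; running plain diff -r -q on the same two directories lists the individual files.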

Why?

For a start, Bandcamp makes you enter metadata on its site. This is a user-friendly step: it's annoying to download an audio file with no metadata. But if you're conscientious and tag your files before uploading, Bandcamp either overwrites the metadata in your original file or creates a new copy of the file with the metadata you entered. I do not know which. But since the metadata is part of the FLAC file, changing the tags will change the file, so the claim that the files are exactly as you uploaded them cannot be literally true once the tags have changed. It's convenient but it may not be good.
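
One way to separate a change of tags from a change of audio: every FLAC file carries, in its STREAMINFO header, an MD5 signature of the unencoded audio, which metaflac --show-md5sum prints. If only the tags were rewritten, that signature should still match the original's even though md5sum of the whole file does not. The sketch below reads the same 16 bytes using only dd and od, assuming the file starts with the fLaC marker [no ID3 tag stuck on the front]; the function name flac_audio_md5 is made up for illustration.

```shell
# flac_audio_md5 FILE
# Prints the MD5 of the unencoded audio as stored in the STREAMINFO block.
# Hypothetical helper; metaflac --show-md5sum is the usual way to get this.
flac_audio_md5() {
    # the stream must begin with the 4-byte marker "fLaC"
    [ "$(dd if="$1" bs=1 count=4 2>/dev/null)" = "fLaC" ] || return 1
    # 4 bytes marker + 4 bytes block header + 18 bytes of STREAMINFO fields
    # = 26 bytes to skip; the next 16 bytes are the signature
    dd if="$1" bs=1 skip=26 count=16 2>/dev/null | od -An -tx1 | tr -d ' \n'
    echo
}
```

If flac_audio_md5 2all-orig.flac and flac_audio_md5 2allbc.flac printed the same value, that would point to a tag-only change; different values would mean the audio itself was altered. An all-zero value means the encoder did not store a signature, in which case the check says nothing.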

This of course does not explain the crackles I heard, for which I can find no real explanation. Nor does it explain why, in terms of the audio itself, the plots of the residuals would be different.
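
One possible reading of the residual plots, offered as an assumption rather than a finding: FLAC residuals are what is left after the encoder subtracts its prediction from each sample, so they depend on the predictor and block size the encoder chose as well as on the audio. If an upload is re-encoded with different settings, the plots [and the number of them] could change even if the decoded audio did not. The toy sketch below applies FLAC's fixed predictors of order 0, 1 and 2 to the same made-up samples; the function name residuals is hypothetical, and real subframes also store warm-up samples, which are skipped here.

```shell
# residuals ORDER   (reads one integer sample per line on stdin)
# Prints the residuals left by FLAC's fixed predictor of the given order.
# Toy illustration only: the first ORDER samples (warm-up) are not printed.
residuals() {
    awk -v order="$1" '
        { x[NR] = $1 }
        END {
            for (n = order + 1; n <= NR; n++) {
                if (order == 0)      r = x[n]                        # no prediction
                else if (order == 1) r = x[n] - x[n-1]               # previous sample
                else                 r = x[n] - 2*x[n-1] + x[n-2]    # straight-line guess
                printf "%s%d", (n > order + 1 ? " " : ""), r
            }
            print ""
        }'
}

# the same five samples, two different sets of residuals
printf '0\n3\n6\n9\n12\n' | residuals 1    # prints: 3 3 3 3
printf '0\n3\n6\n9\n12\n' | residuals 2    # prints: 0 0 0
```

So differing residual plots show that the two files were encoded differently, but on their own they do not show that the audio differs; decoding both files and comparing the results would settle that.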

Suggestions

I think Bandcamp needs to start reading the tags already embedded in uploaded files [ID3v1/2, or Vorbis comments in the case of FLAC] and not overwrite them; and be a bit clearer as to the transcoding process.

And to be clear, it is a good service—this is just an analysis. Any comments, suggestions, criticisms of method would be welcomed.