Corey Mwamba

Bandcamp - some FLAC [but perhaps not the one you first gave]

Some files of mine had gone missing. They were [as is the way with all files that go missing] definitely in that folder the last time I looked, but they certainly weren't there now.

But it didn't matter, because I'd uploaded them to Bandcamp. Being a Linux chap, all my files are stored as FLAC, so I just downloaded the FLACs I needed: then stuck them on to listen...

But there was crackling. Now, I know there wasn't crackling on the audio before I uploaded it—I'd downloaded Everybody's Reading and sound was taken from the video, untreated. So I decided to ask the small section of the Internet called Twitter.

Does Bandcamp offer the same FLAC files you upload? It would appear not... Crackles that weren't apparent on original file now there.

The ears of Han-earl Park and Paul Jones pricked up; and Paul wondered about Soundcloud's transcoding process. Soundcloud transcodes any file format to MP3 for its Flash-based player: I have no idea what it does for the HTML5-based player, but in any case the transcoding does not affect the file you can download: Soundcloud gives you the file that the maker originally uploaded, no matter what format it was. It's then up to you as the downloader to convert the file to the format of your choice. This may not be convenient, but I see it as good: it means you get exactly what the maker intended.

Bandcamp is supposed to do exactly the same thing for the uploaded formats: here's a quote from the FAQ:

Aside from actually encoding into the various formats, we don't do anything to your upload: no EQ, boosting or multi-band companding, and definitely no two-pole Butterworth band-pass/band-reject filtering (so tempting) [...] the lossless formats are exactly as you uploaded them.

But I didn't imagine the crackle. It was definitely there. So I ran a number of quick tests on the same audio—2-all from the recording with Dave Kane and Alex Hawkins [it's on Soundcloud and Bandcamp]. I'm using Linux. Don't ask me how to do it on Windows—I really don't know.

  1. md5 checksums

    this was a speedy thing to do, although not 100% fool-proof. But it shows one thing: the Bandcamp file [2allbc.flac] is different.

    [corey@corlap testsweb]$ md5sum *.flac
    50d737755f90c97345e06c33cf2be24e  2allbc.flac
    608c46e9391d32b5870df4dda7e5c9e5  2all-orig.flac
    608c46e9391d32b5870df4dda7e5c9e5  2allsc.flac
    
  2. file information

    there are various—seriously, VARIOUS—ways of getting information about an audio file on a Linux system: I just plumped for sndfile's sndfile-info. As you can read from the data [ 1,2,3 ], the Bandcamp file is bigger. This is probably because of the picture metadata.

  3. audio identity—first attempt

    I then imported all the files into Audacity and plotted a spectrum of one minute's audio at the 18s mark for each file [in Audacity go to Analyze... Plot Spectrum]. The spectrum is a graph of audio power in decibels over audio frequency in Hertz.

    In identical files, the plot will be the same. Audacity exports the plot as a table in a text file, so on a UNIX-based system you can see if there's a difference between the text files using diff. UNIX commands tend to stay silent when they have nothing to report, so if there's no difference, the command will not output anything.

    The first time I did this, I got differences between Bandcamp and the original file—the original file was louder than the Bandcamp one, but the second time I ran it they were identical. So I'd say this was inconclusive—although I have not had the time to find out why. I needed to look for a better method.

  4. audio identity—second attempt

    flac is a command-line tool which handily comes with its own analysis function. With the -a flag flac analyses the file, and adding --residual-gnuplot makes it write a gnuplot file of the residuals for each subframe. In two identical files these should be the same.

    To test this I created a 30 second sine in Audacity and saved it as a FLAC. I then copied the file and renamed the copy. I then ran flac -a --residual-gnuplot on each file: here's a zip of the GNUplots of the first, and this is the second.

    [corey@corlap a440-test]$ diff -s a440-2/f000080.s0.gp a440-1/f000080.s0.gp 
        Files a440-2/f000080.s0.gp and a440-1/f000080.s0.gp are identical

    Using diff -s checks for identical files: and here we see that the Soundcloud plots and the original plots are identical.

    [corey@corlap testsweb]$ diff orig-gnuplot/f000080.s0.gp sc-gnuplot/f000080.s0.gp -s
        Files orig-gnuplot/f000080.s0.gp and sc-gnuplot/f000080.s0.gp are identical
    [corey@corlap testsweb]$ diff orig-gnuplot/f003280.s0.gp sc-gnuplot/f003280.s0.gp -s
        Files orig-gnuplot/f003280.s0.gp and sc-gnuplot/f003280.s0.gp are identical

    But the Bandcamp one is not—and there are also fewer plots. This time I used diff -q to just report a difference rather than show where the differences are:

    [corey@corlap testsweb]$ diff orig-gnuplot/f000080.s0.gp bc-gnuplot/f000080.s0.gp -q
        Files orig-gnuplot/f000080.s0.gp and bc-gnuplot/f000080.s0.gp differ
    [corey@corlap testsweb]$ diff orig-gnuplot/f003280.s0.gp bc-gnuplot/f003280.s0.gp -q
        Files orig-gnuplot/f003280.s0.gp and bc-gnuplot/f003280.s0.gp differ
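
Checking two frames by hand is only a spot check. As a rough sketch, the same diff -q test can be looped over every plot file in a directory. The function name compare_plots is made up for this illustration, and it assumes both directories hold the .gp files written by flac -a --residual-gnuplot.

```shell
#!/bin/sh
# Sketch: count plot files that differ between two directories.
# compare_plots is a hypothetical helper, not part of flac or diff.

# compare_plots DIR_A DIR_B
# Prints one line per .gp file of DIR_A that is missing from, or different
# in, DIR_B, then a summary line. Returns 0 only if every file matched.
compare_plots() {
    a=$1
    b=$2
    total=0
    bad=0
    for f in "$a"/*.gp; do
        [ -e "$f" ] || continue            # DIR_A had no .gp files at all
        name=${f##*/}                      # strip the directory part
        total=$((total + 1))
        if [ ! -e "$b/$name" ]; then
            echo "missing in $b: $name"
            bad=$((bad + 1))
        elif ! diff -q "$f" "$b/$name" >/dev/null 2>&1; then
            # diff exits with 0 when the files match and 1 when they differ
            echo "differs: $name"
            bad=$((bad + 1))
        fi
    done
    echo "$bad of $total files differ or are missing"
    [ "$bad" -eq 0 ]
}

# Run only when two directories are given on the command line.
if [ "$#" -eq 2 ]; then
    compare_plots "$1" "$2"
fi
```

Called as compare_plots orig-gnuplot bc-gnuplot, this would also report the plots that exist for the original but not for the Bandcamp file.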

Why?

For a start, Bandcamp makes you enter metadata on its site. This is a user-friendly step: it's annoying to download an audio file with no metadata. But if you're conscientious and have already tagged the file as part of your own process, Bandcamp either overwrites the metadata in your original file or creates a new copy of the original file with the entered metadata. I do not know which. But since the metadata is part of the FLAC file, changing the tags will change the file, so the claim that the files are exactly as you uploaded them will always be false. It's convenient but it may not be good.

This of course does not explain the crackles I heard, for which I can find no real explanation. Nor does it explain why, in terms of the audio itself, the plots of the residuals would be different.
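
The metadata point can be shown with a toy example. This is only a sketch: the "tag lines followed by audio bytes" layout and all the file names are invented for illustration and are not the real FLAC container. It shows that a checksum of the whole file changes when the tags change, even though the audio bytes do not.

```shell
#!/bin/sh
# Toy illustration only: a made-up "tag lines + audio bytes" layout,
# NOT the real FLAC container. All file names here are hypothetical.
work=$(mktemp -d)

# One stand-in audio payload, shared by both files.
printf 'PRETEND-AUDIO-SAMPLES' > "$work/audio.bin"
size=$(( $(wc -c < "$work/audio.bin") ))

# Same payload, different tag text in front of it.
{ printf 'TITLE=2-all\n'; cat "$work/audio.bin"; } > "$work/uploaded.bin"
{ printf 'TITLE=2-all\nALBUM=tests\n'; cat "$work/audio.bin"; } > "$work/retagged.bin"

# Checksum of the whole file: the tags are included.
whole_a=$(md5sum < "$work/uploaded.bin" | cut -d ' ' -f 1)
whole_b=$(md5sum < "$work/retagged.bin" | cut -d ' ' -f 1)

# Checksum of the last $size bytes only: the payload without the tags.
audio_a=$(tail -c "$size" "$work/uploaded.bin" | md5sum | cut -d ' ' -f 1)
audio_b=$(tail -c "$size" "$work/retagged.bin" | md5sum | cut -d ' ' -f 1)

if [ "$whole_a" = "$whole_b" ]; then echo "whole files: same"; else echo "whole files: differ"; fi
if [ "$audio_a" = "$audio_b" ]; then echo "audio bytes: same"; else echo "audio bytes: differ"; fi

rm -r "$work"
```

Here the whole files differ while the audio bytes are the same. For real FLAC files, the STREAMINFO header normally stores an MD5 of the unencoded audio, which metaflac --show-md5sum prints, so that value may be a better thing to compare than md5sum of the file.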

Suggestions

I think Bandcamp needs to start reading ID3v1/2 tags or metadata from files uploaded and not overwrite them; and be a bit clearer as to the transcoding process.

And to be clear, it is a good service—this is just an analysis. Any comments, suggestions, criticisms of method would be welcomed.

comments (13)

Alex Fiennes

23rd Apr 2012 | 8:38am

Some comments, grouped by section of the original article -

  1. MD5 Checksums

    I am 99% sure that bandcamp will generate the different formats from the non-compressed source audio with the meta-data stored in its database. It is therefore quite likely that the files will have different checksums. Remember as well that all the different --compression-level settings will give different files and you have no idea what encoding options bandcamp are using.

  2. File Information

    -

  3. Audio Identity—First Attempt

    I agree that any test that gives different results on successive runs is inconclusive.

    However, I would also say that you want to be testing the audio, not the encoding of the audio. Flac provides no guarantee that if you compress the same audio twice with different parameters (or even, to be pedantic, with the same parameters, although it probably does) it will generate the same file.

    It only states that the uncompressed file should be bit-wise identical.

    Therefore I think that any analysis should start with a wav file generated. I would be tempted to run the decoded wav files through the same analysis that you are using on the flac files.

  4. Audio Identity—Second Attempt

    The flac -a flag is testing the way that the data is encoded and the structural integrity of the file. If the files are encoded at different compression levels then it will give you a different analysis. See http://www.hydrogenaudio.org/forums/index.php?showtopic=31134&mode=threaded&pid=269842 for more-

    Like I said for 3—work on the wavs not the encodings.
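
A minimal sketch of "work on the wavs": decode each FLAC to a WAV stream and checksum that, rather than the encoded file. It assumes the reference flac command-line tool is installed; decoded_md5 is a name made up for this example.

```shell
#!/bin/sh
# Sketch: hash the decoded audio of each FLAC file, not the encoded file.
# Assumes the reference `flac` tool is on the PATH.
# decoded_md5 is a hypothetical helper name.

# Print "<md5>  <file>" for the WAV stream decoded from one FLAC file.
decoded_md5() {
    # -d decode, -c write the result to stdout, -s no progress output
    sum=$(flac -d -c -s "$1" | md5sum | cut -d ' ' -f 1)
    printf '%s  %s\n' "$sum" "$1"
}

if command -v flac >/dev/null 2>&1; then
    # One line per file given on the command line.
    for f in "$@"; do
        decoded_md5 "$f"
    done
else
    echo "flac not found, nothing decoded" >&2
fi
```

Run with 2allbc.flac 2all-orig.flac 2allsc.flac as arguments, matching sums would suggest the audio is the same even where md5sum of the .flac files differs.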

Conclusions

This is where it gets slightly tricky because I can't see any links to the actual files and therefore I am not in a position to be able to listen to anything myself. However if there are audio differences between an uploaded flac and a downloaded flac then there is a bug at some point in the system.

I don't have a problem with bandcamp dropping everything down to lossless wav and then re-encoding it with data from their meta-data database added as appropriate for the target file. This gives a guaranteed level of predictability to system behaviour and a more consistent product and predictable system performance—ie the upload may be variable depending on how long it takes to get it into lossless but the download will be predictable in time depending on whether or not the format you asked for has been previously calculated and cached.

To be able to look further into the system, I would like access to:

  • the original flac file as uploaded to bandcamp along with the flags that were used to encode it
  • the bandcamp URL for the track where the downloaded track has crackles.

If crackles are getting introduced into the system then this means that either:

  • bandcamp is failing to decode the flac file to a wav file correctly on upload
  • bandcamp is failing to be able to encode the wav file back into a flac on download
  • the flac file that is generated is not cleanly decodable on HEP or PJ or CM systems (unlikely with this number).

Now obviously we don't have the source files (because they were deleted). However, if you download other formats of lossless files then do you have the same crackles? If you do then it suggests that the upload is not getting cleanly decoded.

If it is possible to get a reproducible flac->bandcamp->crackly flac then I would say that the next thing is to do wav->bandcamp->flac and see whether or not you still have the crackles. If you do then it suggests that either bandcamp is unable to invoke the flac library correctly (unlikely) or that they are running a patched version of flac in some way (???) or that there is a bug in flac that is triggered by some unusual audio configuration of the original wav (in which case I think that the flac team would very much like to know about it).

However, at the risk of being contentious, I suspect that if flac->bandcamp->flac was not clean then, with the number of users they have, they would have heard about it. flac is a statistically smaller lossless file user base than wav or aiff but there must be enough people doing this that audible crackles would have been reported. Are we sure that there were not crackles in the encoded version of the source file that was uploaded?

Unfortunately without the original files, or without another set of files that can consistently reproduce the error, it is very hard to move forwards. However, if wav -> (flac|aiff|wav) -> bandcamp -> (flac|aiff|wav) -> wav doesn't result in a bit perfect copy then their FAQ is incorrect and opening an issue with bandcamp is required.

Corey Mwamba

23rd Apr 2012 | 8:41am

It's also worth saying—while I'm still awake and able to do so [feeling quite ill!!]—that this is not proof; but evidence of a hypothesis. There's no need to throw out the scientific principle, so if you do link to this please do not call it proof.

Corey Mwamba

23rd Apr 2012 | 8:53am | replying to Alex Fiennes

Alex—thank you very much for all that.

As you say, it's impossible to tell with the deleted file—however what I can do is give links to the test files.

So:

  1. here's the Soundcloud file

  2. this is the link to Bandcamp

  3. and I have uploaded the original [compression level is 5, 16-bit, 44.1kHz]

Alex Fiennes

23rd Apr 2012 | 9:25am

OK. I'm listening to sections II-IV in flac downloaded from bandcamp.

I don't hear any obvious crackles yet.

However, the flac file is 48/16 not 44.1/16. I'm just waiting for the other files to download to double check what they are encoded as, but I suspect that possibly crap SRC on playback might be an issue?

I'm listening to it using Audirvana Plus on Snow Leopard through an Audient Centro DAC. Audirvana automatically reconfigures core audio into the sample rate of whatever you are listening to so you don't get any real-time SRC, however this is not default at least for OSX, and I'm not sure about windows or linux.
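
One hedged way to double check the sample rate, once a file has been decoded to WAV, is to read it straight from the header. This sketch assumes a canonical PCM WAV header in which the "fmt " chunk comes first, so the rate sits at byte offset 24 as a 4-byte little-endian number; wav_sample_rate is a made-up helper name.

```shell
#!/bin/sh
# Sketch: read the sample rate from a canonical PCM WAV header.
# Assumes the "fmt " chunk comes first, so the rate is at byte offset 24.
# wav_sample_rate is a hypothetical helper name.

wav_sample_rate() {
    # Read bytes 24..27 as four unsigned decimal numbers.
    set -- $(od -An -t u1 -j 24 -N 4 "$1")
    [ "$#" -eq 4 ] || return 1             # file too short to hold a header
    # Combine the four bytes, least significant first (little-endian).
    echo $(( $1 + $2 * 256 + $3 * 65536 + $4 * 16777216 ))
}

# Print the rate of every WAV file given on the command line.
for f in "$@"; do
    printf '%s: %s Hz\n' "$f" "$(wav_sample_rate "$f")"
done
```

For the FLAC files themselves, metaflac --show-sample-rate should give the same information without decoding.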

Can you give me a timestamp of a point where I should be hearing obvious crackles and I will focus on that point?

Corey Mwamba

23rd Apr 2012 | 9:26am

So, I converted all the files to WAV.

md5sums:

 aa6612038e562db0c78e65e7a86f9214  2allbc.wav
 aa6612038e562db0c78e65e7a86f9214  2all-orig.wav
 aa6612038e562db0c78e65e7a86f9214  2allsc.wav

**they are identical.**

Corey Mwamba

23rd Apr 2012 | 9:29am

Okay Alex—I'll get back on that, but I think it's on the intro+section I.

Alex Fiennes

23rd Apr 2012 | 9:39am | replying to Corey Mwamba

aha. you beat me to it on the wav conversions—must have a higher speed internet connection than me.

However, I get the same md5 checksums as you.

What this means is that the core audio data is identical even though bandcamp (might have / has probably) decoded and re-encoded it. Which is a relief because there are some things in life that are reassuring if they are true and flac being right is one of them...

So I'm going to concentrate on the 48 vs 44.1 problem. Let me know what your playback chain is for listening to non-44.1 files.

I am also now listening to the intro+section with my correctly clocked DAC and I don't yet hear any crackles at 1m30 in. Can you describe the crackles to me so that I have a better idea of what I should (or rather shouldn't) be hearing?

Alex Fiennes

23rd Apr 2012 | 9:49am

I've listened to most of intro+section now and it seems fine on the bandcamp download.

There is a little bit of distortion here and there, but that is because the audio is clipped a wee bit on the louder bits, which I assume was from clipping on the video recorder when the file was recorded. There is nothing that I would describe as "crackling".

Corey Mwamba

23rd Apr 2012 | 9:52am | replying to Alex Fiennes

That is a relief! Consistent popping...

Corey Mwamba

23rd Apr 2012 | 9:54am

So perhaps the issue is at my end, which is a relief to be honest. I'll investigate further.

Alex Fiennes

23rd Apr 2012 | 9:55am | replying to Corey Mwamba

when you say consistent then do you mean regular? if so then at what frequency? and pops like the single sample drops that you get when you have a bad clock or more of a "multi-sample physical sound"?

Alex Fiennes

23rd Apr 2012 | 9:57am

And I think that in general, uploading 48 tracks might be a bad idea unless you are explicitly marketing them as "audiofile" uploads because I don't think that people's systems will play them natively by default.

Much better to do some controlled high quality offline SRC to 44.1 before you upload them and have predictability of playback.

However, what I think is also worth noting is that you uploaded it as 16bit. I think that 24bit is much more important than higher sample rates so I wouldn't downsample the files (assuming that the video wasn't recording in 16bit) if you can help it and the bandcamp files will sound much nicer...

Corey Mwamba

23rd Apr 2012 | 10:30am

It wasn't sound drops—it was more noise. Firing up JACK and [perhaps more importantly] using a different sound card does seem to have cured it, however!

Still, in some ways this was useful, if only for Alex's advice and looking at a process, practical pitfalls to that process and a way to diagnose fairly quickly.

I would still maintain that Bandcamp should find a way of reading metadata from the uploaded file [with an option to amend/add details as required]...
