The algorithm doesn't work with very old recordings.

The algorithm doesn't work with very old recordings.

huckfinn
Hi,
I am new to this program, so this might be cockpit error. But I have a lot of old jazz and blues files from the 30s and 40s. Low-fidelity stuff. Even when ripped from CD at 320 kbps, they show up as 128 kbps.

This is not shocking. The spectra of old recordings are going to have squat for high frequencies.

Ideally,  you would choose a reference spectrum for the music you are testing.   (Maybe this program allows something like this in a way that I don't see yet.)

The program behaving this way is not much of a problem. The sound of old music depends very little on the bit rate; the fidelity is fully captured at 128 kbps. Sound quality is all about the analog source and the remastering.

I have a hunch your algorithm works very broadly for post-1950s recordings,  so I'm not sweating this.    But I would be interested in any comments or confirmation.  Thanks.    Superb program.

Re: The algorithm doesn't work with very old recordings.

Fake No Funk
Administrator
Could you please post one or two sample spectrums for such a file?

Re: The algorithm doesn't work with very old recordings.

huckfinn
Sure, will look into that later today.

BTW, back in the 1990s, I was the computer programmer for an audiology research group. We were sending sounds to subjects that teased out non-linearities and collecting the brain wave responses. The goal was to improve digital hearing aids. So I learned signal processing, plus got an intuitive feel for the limits of human hearing. (I think the CD format is more than adequate for anything people can hear, except maybe for Neil Young and cats, both of whom require 96 kHz sample rates.)
After that, I worked in database development.

Please accept my application to be your flunky.    
 

Re: The algorithm doesn't work with very old recordings.

huckfinn
I had to dig deep to remember which files I'm 100% sure I ripped myself from CD. I don't know if these are representative examples, but I grabbed one file each from the 30s, 40s and 50s, and the analysis got better.

1930s



1940s



1950s

Re: The algorithm doesn't work with very old recordings.

huckfinn
BTW, tiny bug:
When you select a folder to store spectrums in, it also resets the default directory for loading audio files to that same folder. I have to keep switching back and forth.

Re: The algorithm doesn't work with very old recordings.

Fake No Funk
Administrator
Are you really sure you have ripped them with the same encoder settings?
Which encoder did you use?

The first one from 1930 looks 100% like a 128 kbps rip.
The 2nd one from 1940 is more likely 192 kbps.
The last one from 1950 could be a Fraunhofer codec, which - especially in the aggressive mode - is sometimes detected as fake.

Maybe another reason is that the "loudness war" was not present in those days. So it looks like there is nothing above e.g. 18 kHz, but it's just because it is too quiet. That's hard to distinguish, of course...

But apart from this: 1+2 really don't look like a 320 kbps rip at all :-)
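
The point about quiet high frequencies looking like a missing band can be illustrated with a small sketch. This is not the program's actual algorithm, which is not described in this thread; it is a generic "highest bin above a noise floor" cutoff estimate. The function names, the 256-sample frame and the -60 dB floor are all assumptions chosen for illustration.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes (bins 0..N/2) of a Hann-windowed frame."""
    n = len(frame)
    # Periodic Hann window to limit spectral leakage.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        mags.append(abs(acc))
    return mags


def estimate_cutoff_hz(frame, sample_rate, floor_db=-60.0):
    """Highest frequency whose magnitude is within floor_db of the peak.

    floor_db is a hypothetical threshold, not a value taken from any tool.
    """
    mags = magnitude_spectrum(frame)
    peak = max(mags)
    if peak == 0:
        return 0.0
    threshold = peak * 10 ** (floor_db / 20)
    highest = max(k for k, m in enumerate(mags) if m >= threshold)
    return highest * sample_rate / len(frame)


def two_tone(n, low_bin, high_bin, high_amp):
    """Test signal: a full-scale low tone plus a high tone at high_amp."""
    return [math.sin(2 * math.pi * low_bin * i / n)
            + high_amp * math.sin(2 * math.pi * high_bin * i / n)
            for i in range(n)]


SAMPLE_RATE = 44100
N = 256
LOW_BIN, HIGH_BIN = 12, 104  # roughly 2.1 kHz and 17.9 kHz

loud = two_tone(N, LOW_BIN, HIGH_BIN, high_amp=0.1)    # high tone at -20 dB
quiet = two_tone(N, LOW_BIN, HIGH_BIN, high_amp=1e-4)  # high tone at -80 dB

print(estimate_cutoff_hz(loud, SAMPLE_RATE))
print(estimate_cutoff_hz(quiet, SAMPLE_RATE))
```

Both test signals contain the same 17.9 kHz tone. With the tone at -20 dB the estimate lands near 18 kHz; with it at -80 dB the tone drops under the assumed floor and the estimate falls back to about 2 kHz, even though nothing was filtered out.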

Re: The algorithm doesn't work with very old recordings.

huckfinn
Yes, you appear to be right that somehow the problem is with encoding.

I just tried re-encoding that 1930s  .wav with EAC  (LAME encoder) at 320.   Your software detects it as a legitimate 320.


I have used other programs and encoders, especially the AVS4YOU software.   One of those programs is not doing the encoding as expected.
Lots of the old "320" songs that I've downloaded show up as 128 fakes.

The world has needed your program for many years.


Re: The algorithm doesn't work with very old recordings.

huckfinn
OK, I encoded the .wav as a 320 kbps MP3 using the AVS4YOU software, and your software detects that as a FAKE @ 128.
The files look very similar in an editor, and my MediaInfo utility tells me they are both 320 kbps.

Here are the two spectrograms.

The EAC-encoded 320 that your program recognizes as a legitimate 320



The AVS-encoded 320 that your program calls a  FAKE @ 128



Re: The algorithm doesn't work with very old recordings.

Fake No Funk
Administrator
Well, I don't know AVS4you at all... maybe there are some special settings, e.g. a lowpass filter or something like that.
Since Exact Audio Copy does it pretty well, it must have something to do with the processing within AVS...
And frequencies around 16 kHz are definitely something that you can hear with average equipment :-)
Would be interesting to A/B test those two versions :-)))

Re: The algorithm doesn't work with very old recordings.

huckfinn
Human hearing is often roughly described as 20 Hz to 20 kHz, but my experience in the audiology lab shows a different picture. Your ability to hear high frequencies peaks at about age 8. After the age of 40, you aren't hearing much above 10 kHz.
Even for a younger adult, you can play a 10 kHz pure tone at a volume that would be very loud at 3 kHz, and they hear a soft buzz.

I know music is subtle. But I am confident that my old ears hear nothing above 8 kHz in my music, regardless of sound system. I speculate nobody over age 40 hears anything around 16 kHz in music, even on the best sound system.

Which is to say that the information above 10 kHz in an MP3 rarely matters. People with superb sound systems aren't using MP3, and they are too old to hear the difference by the time they can afford it. :)

I will do a little more avs4you testing....

Re: The algorithm doesn't work with very old recordings.

huckfinn
The AVS encoder does have an option to impose a low pass filter, but I have it turned off. I think you are right that they apply a low pass filter at around 15 kHz by default. But this is a reasonable choice, not a problem. MP3s don't need information above 15 kHz. I get that it makes life difficult for you, but I hope you will seek to make your program more universal.

I tested 6 AVS-encoded 320 kbps MP3s, and your software called half of them 128.

If you look at the spectrogram of the 320 that AVS encodes, it looks very much like the 320 from EAC below 15 kHz. And they both are easily distinguishable visually from a 128 encode. So there has to be some way to correctly identify the AVS file as a 320.

I also notice that your program seems to identify files brought over from Apple m4a as very low fidelity, 192 or even 128. Sometimes that might be right, but I think there are too many low guesses. As you know, m4a uses a newer, smarter codec that uses psychoacoustic research to eliminate parts of the signal that humans can't hear, for instance sounds masked by loud signals. Maybe they also lowpass filter the high frequencies out like AVS.

Your software makes too many false Fakes with my mp3 collection.   But it is a start, it helpfully identifies lots of genuine 320s.

I hope your program is successful enough to allow for more R&D.   What you have now is very impressive,  I can appreciate how hard it was to get the algorithm to work as well as it does.    

The critical MP3 information is below 10 kHz. All the vocals are contained in the 300 Hz to 3500 Hz band. It seems that the very high frequency range, 12 kHz to 20 kHz, is critical to your algorithm, but it really doesn't assess the sound quality. Maybe you need a second, new algorithm to look at lower frequencies. Easier said than done.

Re: The algorithm doesn't work with very old recordings.

huckfinn
I did more testing of mp3 from mp4  using several different encoders, and I was wrong that mp4 presents a special challenge.
It is the choice of encoder that is linked to false FAKEs.

Re: The algorithm doesn't work with very old recordings.

huckfinn
I meant to say "m4a", not "mp4", for Apple audio files. Also, by "128 mp3" I meant "128 upsampled to 320".

Here is an interesting article on human voice/singing.
http://www.seaindia.in/blog/human-voice-frequency-range/

The highest pure tone that anybody can hear is at 12 kHz. That is about what I saw in the audiology lab. People over 40 barely hear 10 kHz, if at all.

Nearly all vocal energy is below 3.5 kHz. But male vocal harmonics are measured up to 8 kHz. Female harmonics are measurable all the way to 17 kHz, but that doesn't mean they are heard.

It is important for me to remember that your program isn't measuring sound quality, just the chances that the file was upsampled. Where the encoder removed the high frequencies, that job must be tough. But this reminds me of the days of building web sites when the browsers were unstable. The website builder can complain about the shitty browsers that mess up the site, but it is the website's responsibility to account for the shitty browsers. In the same way, you are stuck dealing with the shitty encoders, somehow. If you don't do it, some bigger company will build on your good work and figure out a more broadly useful detector.







Re: The algorithm doesn't work with very old recordings.

Fake No Funk
Administrator
Sorry, but I have to say that the article is not true in some points:

I (age: 46) can definitely hear 16 kHz, and it is also absolutely no problem to hear 50 Hz, even 30 Hz (okay, maybe this is more "feeling" than "hearing" ^^)

And I can definitely tell you that a song that is lowpassed at 16 kHz sounds awful.

I must admit that I can't hear anything above 17 kHz, so for me any cutoffs above 17 kHz are technically meaningless. BUT normally, if a track claims to be high quality but has cutoffs @17kHz, it has been upscaled  / filtered / maltreated and you CAN HEAR that when A/B testing...

That's the reason why it is useful to take care about the higher frequencies, even if you can't really hear them...
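
For anyone who wants to check their own upper limit rather than argue about it, a pure test tone is easy to generate. The sketch below uses only Python's standard `wave`, `struct` and `math` modules; the helper names, the 16 kHz frequency and the low 0.2 amplitude are assumptions for illustration. Whether the tone is actually reproduced depends on the DAC, amplifier and speakers or headphones in use.

```python
import io
import math
import struct
import wave


def sine_samples(freq_hz, seconds, sample_rate=44100, amplitude=0.2):
    """16-bit PCM samples of a pure tone; amplitude is a fraction of full scale."""
    count = int(seconds * sample_rate)
    peak = int(amplitude * 32767)
    return [int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
            for i in range(count)]


def write_wav(target, samples, sample_rate=44100):
    """Write mono 16-bit PCM to a file path or a file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


# Written to memory here; pass a path such as "tone_16k.wav" to get a file.
buffer = io.BytesIO()
write_wav(buffer, sine_samples(16_000, seconds=1.0))
```

Keep the playback volume low: a tone you cannot hear can still be loud.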

Re: The algorithm doesn't work with very old recordings.

huckfinn
I have a friend who is an audiophile, violinist, and self-taught electrical engineer.   He built his own amplifiers for years.  He owns a  $50,000 stereo system that sounds amazing.

We have a 20-year running argument on high frequencies and digitization, and he more or less takes your side. So we will never agree. He claims high frequencies, even above 20 kHz, matter somehow, but he can't explain it. Testing of humans indicates very little hearing going on above 10 kHz. Some can detect something.

I've wondered if there is something like digital aliasing going on, bringing high frequencies down into audible range.
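
For reference, the frequency an unfiltered tone would fold down to when sampled can be worked out directly. This is only the textbook folding arithmetic, not a claim about any particular converter: properly built converters and resamplers apply an anti-alias filter before sampling, so the case below is what would happen without one. The function name is made up for this sketch.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling with no anti-alias filter."""
    folded = tone_hz % sample_rate_hz
    # Anything above half the sample rate mirrors back below it.
    return min(folded, sample_rate_hz - folded)


for tone in (10_000, 25_000, 30_000):
    print(tone, "->", alias_frequency(tone, 44_100))
# 10000 -> 10000
# 25000 -> 19100
# 30000 -> 14100
```

So at the CD rate of 44.1 kHz, a 30 kHz component that escaped filtering would show up at 14.1 kHz, inside the nominally audible range.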

BTW, his sound system sounds BEAUTIFUL to me, and I am not hearing much above 8K.    So high end audio is not primarily about preserving high frequency information, IMO.

If you say filtering at 16K matters to your ears, I believe you.  Just don't know how.

My friend also says CD cannot reproduce analog music. Except I know from audiology research that 44 kHz is a spectacularly high sample rate for research on human hearing, and the quantization noise of 16 bits is meaningless.

P.S. I get that the claimed value of 24-bit digital audio is reducing the accumulated error in processing. I think that is just marketing nonsense.

Re: The algorithm doesn't work with very old recordings.

huckfinn
BTW, on the subject of old recordings, the technology only captured a limited frequency band:
1877-1925 acoustic: 250 Hz - 2.5 kHz
1925-1945 electrical: 60 Hz - 6 kHz
https://en.wikipedia.org/wiki/History_of_sound_recording

1945-1975 magnetic: hard to find specific bandwidth numbers anywhere, and I'm sure quality varied widely. But I see some guy looked at the frequency response of studio-grade magnetic recording devices from the past, and they were generally usefully flat only out to 10 kHz.
http://www.endino.com/graphs/
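
The era figures quoted above could serve as the "reference spectrum" idea from the first post: before calling a file fake, check whether the source could have contained the missing band at all. A minimal sketch, assuming the recording year is known. The table simply restates the numbers from this post; the function names and the 1.5x safety margin are hypothetical, and the program discussed in this thread is not known to do anything like this.

```python
# Figures as quoted in the post above. The magnetic-era value is the poster's
# rough reading of the endino.com graphs, not a specification.
ERA_BANDWIDTH_HZ = [
    # (first_year, last_year_exclusive, era, low_hz, high_hz)
    (1877, 1925, "acoustic", 250, 2_500),
    (1925, 1945, "electrical", 60, 6_000),
    (1945, 1975, "magnetic", None, 10_000),
]


def expected_bandwidth(year):
    """Return (era, low_hz, high_hz) for a recording year, or None if unknown."""
    for first, last, era, low_hz, high_hz in ERA_BANDWIDTH_HZ:
        if first <= year < last:
            return era, low_hz, high_hz
    return None


def cutoff_explained_by_era(year, measured_cutoff_hz, margin=1.5):
    """True if a low spectral cutoff is plausibly the source's own limit.

    margin is a hypothetical allowance for noise and remastering.
    """
    band = expected_bandwidth(year)
    if band is None:
        return False
    _, _, high_hz = band
    return measured_cutoff_hz <= high_hz * margin


print(expected_bandwidth(1935))
print(cutoff_explained_by_era(1935, 8_000))
print(cutoff_explained_by_era(1935, 16_000))
```

A 1935 recording with nothing above 8 kHz is consistent with the table, so the missing highs say little about the encoder. A 16 kHz edge on the same recording lies well above anything the table predicts, so it more likely comes from noise, remastering or an encoder lowpass than from the original source.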

Re: The algorithm doesn't work with very old recordings.

huckfinn
The useful dynamic range for a human - from the quietest sound they can hear up to the threshold of pain - is about 60 dB. Audiology research equipment used to use 12 bits, which was more than adequate with a 72 dB dynamic range. 16 bits to represent music is way overkill. Nobody could hear those lower 4 bits even in a properly mastered CD using the full range. 24-bit processing is silly.
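
The 72 dB figure for 12 bits follows from the usual rule of thumb of about 6 dB per bit, i.e. 20·log10(2^bits) for the ratio between full scale and one quantization step. A quick check of the bit depths mentioned in this thread (the function name is made up for this sketch, and dither and noise shaping are ignored):

```python
import math


def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)


for bits in (12, 16, 24):
    print(f"{bits} bits: {dynamic_range_db(bits):.1f} dB")
# 12 bits: 72.2 dB
# 16 bits: 96.3 dB
# 24 bits: 144.5 dB
```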

Re: The algorithm doesn't work with very old recordings.

huckfinn
I've been experimenting with different MP3 encoders, and found that 2 out of 6 imposed low pass filters that screwed up detection about half the time.

After getting some experience looking at various spectrograms, I'm beginning to understand that detecting up-sampling when the high frequency information is absent would be brutally hard, if not impossible.

And then there is another issue: some uncompressed information on CD has no high frequency information.  So of course your program misreports all rips.  Here is a spectrogram of a perfectly decent sounding .wav file:

 

I really am sorry for being a pain in your ass, but I am going to push back more on the relevance of very high frequencies in MP3 listening .........

Re: The algorithm doesn't work with very old recordings.

huckfinn
I know you have focused on sound quality through high end speakers, but I want to first talk about the more typical MP3 listening experience through headphones.    I was looking at the frequency responses of good headphones at this site:   https://www.innerfidelity.com    (I didn't look at the  top of the line.)

Many headphones do a poor job from 6K to 10K, some have surprisingly deep notches.    After 10K, most roll off pretty quickly.   Even if the headphone were perfectly fitted to your ear, the speakers aren't reproducing music above 10K in a meaningful sense.

But it is far worse than this:  the characteristic curve of headphones is measured with the headphone sitting perfectly on a measuring transducer, airtight seal is made, and an exact force is applied.    I used to calibrate headphones in the audiology lab.      You can't just put a headset on  a person's head and reliably deliver high frequency sounds above 10K.   You would need fantastic headphones and a very tight, steady seal.  

I'm very confident that high frequencies are completely irrelevant to all headphone listening. Maybe you agree. But if you tell me that lowpass filtering of music at 15 kHz makes music sound terrible, then all music heard through headphones would be terrible. That's not true.

I don't know how to resolve this contradiction. But maybe, just maybe, there are nonlinearities in human hearing that are causing you to hear something different on high-end equipment when high frequencies are removed. I seriously doubt you could hear a 15 kHz pure tone played through those speakers; or it would be extremely faint. But music is not pure tones.

I have to insist that the high frequency info  (above 10K)  in mp3 music is irrelevant.      MP3 is not used for high fidelity anyway.

OK, I will stop irritating you.