Page 1 of 3
Results 1 to 20 of 54

Thread: MP3 and audio data compression

  1. #1
    Join Date
    Sep 2009
    Location
    Malaysia
    Posts
    508

    MP3 and audio data compression

    For so many years I was thinking that MP3 sounds inferior because it was missing frequencies above a certain level. This view seems to be wrong in light of what Grant Bridgeman said in his article. I quote: "A good mp3 achieves a compression ratio of about 10:1 compared to an original 44.1kHz/16-bit audio file, and although some of the information has been lost, the audio quality is still acceptable to most people. This is because the algorithms are written specifically to take how we hear into account, removing audio data that our ears do not perceive as being of importance.

    However, we can still hear the difference, as mp3s can sound hollow, thin and lacking sparkle. This is because that ‘un-perceivable’ information has been removed and our brains can no longer filter the sound in the same way as we would with the original recording."

    So is Mr. Bridgeman suggesting that even though we do not perceive the sound, it is still important?

    ST
    Attached Files

  2. #2
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,274

    MP3 and audio data compression

    For so many years I was thinking that Mp3 sounds inferior because it was missing frequencies above certain level
    No, this is not true. In fact the opposite is true: the tones that go missing are those below a certain (loudness) level. Is that what you meant to say? Even so, that's a huge simplification. Do you know the fundamental principle behind audio compression? Patents here. Note that the fundamental patents are from 1987 .... four years after the CD was launched.

    A good mp3 achieves a compression ratio of about 10:1 compared to an original 44.1kHz/16-bit audio file
    True.

    For example, if an audio compression system is specified as having 10:1 compression, that means that for every ten units of audio input to the encoder, nine units are discarded and not encoded at all. As an approximation, if an uncompressed WAV file (from an audio CD) is 330kB and it is passed through a 10:1 audio compression system, the resulting output file is 33kB.
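    The arithmetic in the paragraph above can be sketched in a few lines of Python. This only illustrates the N:1 approximation as the post states it, treating "discarded" as a simple share of the input; the function names `compressed_size` and `fraction_discarded` are invented for this sketch and do not come from any MP3 tool.

```python
def compressed_size(original_size: float, ratio: float) -> float:
    """Return the output size for an N:1 compression ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return original_size / ratio


def fraction_discarded(ratio: float) -> float:
    """Return the share of the input that does not reach the output."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 - 1.0 / ratio


# 330kB WAV file, as in the example; 10:1 and 100:1 are the ratios discussed.
for ratio in (10, 100):
    size_kb = compressed_size(330, ratio)
    share = fraction_discarded(ratio)
    print(f"{ratio}:1 -> {size_kb:.1f}kB output, {share:.0%} not carried through")
```

    The same two functions cover the 100:1 case mentioned later in this post, where only one unit in a hundred survives.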

    This is because the algorithms are written specifically to take how we hear into account, removing audio data that our ears do not perceive as being of importance.
    True.

    as mp3s can sound hollow, thin and lacking sparkle
    Yes, but they can also sound fantastic - utterly indistinguishable from the source WAV file. The secret is how the encoder* is set up to do its job. The MP3 encoder is not a rigid 'one setting suits all' tool. It has lots of user-adjustable controls (see attached), and how it's set depends entirely on what the requirement is. If it's to make a super-small MP3 to use on a mobile phone, then maybe the setting is not 10:1 compression but 100:1 (that means, from the source file, ninety-nine units of audio are dumped and only one passed through to the MP3) and, as you can imagine, the perceived quality must and does drop dramatically. You've probably heard that horrible swooshing, phasey sound from highly compressed (100:1?) MP3s. That's nothing more than the fact that when you ignore 99% of the audio, the fine detail of reverberant decay between words and notes is erased. But you can't blame MP3 overall for that - that was how the engineer wanted to process the audio in that clip. He was driven by file size, not audio quality objectives.

    This is because that ‘un-perceivable’ information has been removed and our brains can no longer filter the sound in the same way as we would with the original recording."
    A bold statement but sorry, completely impenetrable to me. Cannot be an accurate quote from an audiologist.

    The attached screen-snap shows some of the settings available to the encoder if you pay the royalty and receive a licence (as I have done) that allows you to tweak the encode routines. There is no fundamental reason why MP3 should sound bad.

    -------------------------------------------------
    *The business model behind MP3 is fascinating. The decoders (in our consumer goods and software) may or may not be royalty free to the manufacturer (I'm not sure of the latest position) but the charges are small. The encoder is comprehensively patented and money has to be paid to the patent holder. All the clever work is in the encoder. The consumer's decoder is dumb, with no controls, and can be fabricated on silicon very cheaply indeed. That's why it is the sole responsibility of the person doing the encoding to set the quality parameters, and there is nothing at all that the consumer with his dumb decoder can do to improve upon the quality of the encoded MP3 file he receives.

    These are subjects that I can write about and willingly share my knowledge. But not when I'm wasting valuable time justifying company sales policy.
    Attached Images
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  3. #3
    Join Date
    Dec 2007
    Location
    Suffolk, UK
    Posts
    334

    320 kb/s MP3

    I understand that VBR at or near 320 kb/s is pretty well indistinguishable from the master file.

    In any case, with the memory of personal stereos getting bigger and download speeds becoming faster, I wouldn't have thought low bitrates are needed these days.

    The major problem as I see it is the severe audio compression applied to many tracks. Surely the monitors used in the mastering can reproduce this properly?

  4. #4
    Join Date
    Sep 2009
    Location
    Malaysia
    Posts
    508

    How does MP3 actually work?

    Quote Originally Posted by A.S. View Post
    No this is not true. In fact the opposite is true. The tones are missing below a certain (loudness) level. Is that what you meant to say?..
    I may be wrong here, but my understanding of audio compression is that it works by discarding frequencies outside our hearing range. Am I correct to say that?

    ST

  5. #5
    Join Date
    Jun 2009
    Location
    Canada
    Posts
    904

    Sound of MP3 and bitrate

    I wonder if one factor that might affect the perceived quality of an MP3 encoding has nothing to do with the encoding algorithm itself, but the nature and dynamics of the recording.

    I have been using Apple Lossless (ALAC) to rip CDs to my hard drive for a couple of years now. As the name implies, it is a "lossless" compression system that packs the data down into a smaller space, but does not discard data.

    One of the things iTunes lets you do is to see the bit rate at which any audio track was compressed, post-compression. As we know, the CD is always 1411 kbps. After running ALAC, I get anything from 227 kbps at the low end (an old Django Reinhardt recording) to 1123 kbps at the high end (a pop recording by a Canadian band called Stars). Most music is in the middle, in the 500 - 700 kbps range, but there's a surprising amount at over 1000.
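    For reference, the 1411 kbps figure follows directly from the CD format (44.1 kHz sample rate, 16 bits per sample, two channels). The sketch below derives it and expresses the two ALAC rates quoted above as a share of the CD rate; the helper names are illustrative only.

```python
def pcm_bitrate_kbps(sample_rate_hz: int = 44_100,
                     bits_per_sample: int = 16,
                     channels: int = 2) -> float:
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


CD_KBPS = pcm_bitrate_kbps()  # 44100 * 16 * 2 bits per second


def share_of_cd_rate(kbps: float) -> float:
    """How large a given bit rate is relative to the CD rate."""
    return kbps / CD_KBPS


print(f"CD rate: {CD_KBPS:.1f} kbps")
for label, kbps in (("lowest ALAC rate quoted", 227),
                    ("highest ALAC rate quoted", 1123)):
    print(f"{label}: {kbps} kbps = {share_of_cd_rate(kbps):.0%} of the CD rate")
```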

    Would it not be the case that, if you took two pieces of music and encoded them in MP3 (or any lossy compression scheme, such as Apple's AAC), where one piece could be losslessly encoded at 400 kbps and the other required 1100 kbps, the one that could be packed more efficiently would sound "better" - i.e. closer to the original - than the one that needed about 2.75 times the space as a lossless file, because with the more efficient file there'd be less data that needed to be thrown away? I've never read anything about this, but logically it seems to make sense, and if it's right, then it might be one reason why there are variable opinions about MP3 sound quality.

  6. #6
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,274

    How does MP3 actually work ..... more info please

    Quote Originally Posted by STHLS5 View Post
    I...my understanding of audio compression is it works by discarding frequencies outside our hearing range ...
    OK, before I jump in, I think I need some more input from you. Can you explain please with perhaps a described example of how you think MP3 works along the frequency reduction lines you suggest?

    If you have a view, I'm fairly certain that it will be held by others too. The better I understand what that prevailing view is, the better I can explain how it actually works.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  7. #7
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,274

    Lossless and lossy

    Quote Originally Posted by EricW View Post
    ...Would it not be the case that, if you took two pieces of music and encoded them in MP3 (or any lossy compression scheme) but one piece could be losslessly encoding at 400 kbps and the other required 1100 kbps, then the one that could be packed more efficiently would sound "better" - i.e. closer to the original...
    Whoah there! You're racing ahead! Let's take a step backwards, and clarify some terminology.

    1) "Lossy" compression: means something must have been thrown away in the sound file. It may not be obvious on listening or even on examining the waveform technically but some part of the wave must have been discarded. No escape from that fact.

    2) "Lossless" compression: means the audio waveform coming out of the encode-then-decode process will be exactly the same in every detail (in theory anyway) as the audio waveform that arrived at the encoder's input.

    So how can the file size be usefully reduced for the person receiving the file in a lossless system? There are a number of approaches we can discuss later but you can think of it like this. Imagine that you write a book. You print out the entire manuscript, staple it together in the correct page order and then you put it into a large cardboard box and post it to your publisher. The manuscript rattles about in the box. You can think of a WAV file like this loose-fitting box. But if you take the same manuscript and you put it in a tight-fitting JiffyBag you can reduce the size of the package greatly, but the contents are completely unaltered. That's what lossless compression does. So regardless of which method you use, as far as the receiver (your book publisher) is concerned, the valuable contents, word by word and page by page, are unaltered. But the package - the data stream size - is reduced.
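    The JiffyBag idea can be tried out with a small sketch. It uses Python's general-purpose `zlib` compressor purely as a stand-in (it is an assumption of this example, not how ALAC or any audio codec works internally) and two synthetic one-second signals. The point is only that the decoded bytes come back identical, while the packed size depends on how repetitive the content is.

```python
import math
import random
import struct
import zlib

RATE = 44_100  # samples per second, as on a CD (mono here for simplicity)


def to_pcm16(samples):
    """Pack integer samples as little-endian 16-bit PCM bytes."""
    return struct.pack(f"<{len(samples)}h", *samples)


def round_trip(samples):
    """Compress then decompress; return (raw, packed, restored) bytes."""
    raw = to_pcm16(samples)
    packed = zlib.compress(raw, 9)
    restored = zlib.decompress(packed)
    return raw, packed, restored


# One second of a steady 440 Hz tone: highly repetitive, so it packs tightly.
tone = [int(20000 * math.sin(2 * math.pi * ((440 * i) % RATE) / RATE))
        for i in range(RATE)]

# One second of random noise: almost nothing repeats, so it barely packs.
rng = random.Random(0)
noise = [rng.randint(-20000, 20000) for _ in range(RATE)]

tone_raw, tone_packed, tone_restored = round_trip(tone)
noise_raw, noise_packed, noise_restored = round_trip(noise)

for name, raw, packed, restored in (
    ("tone", tone_raw, tone_packed, tone_restored),
    ("noise", noise_raw, noise_packed, noise_restored),
):
    print(f"{name}: {len(raw)} -> {len(packed)} bytes, "
          f"identical after decode: {restored == raw}")
```

    Both signals survive the round trip bit for bit; only the size of the "package" differs, which is consistent with lossless bit rates varying from one recording to another.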

    If the original WAV file (uncompressed, straight off the CD) needs 1000kbits/second to describe every nuance of the waveform, then passing it through any lossy CODEC and seeing only a (say) 600kb/sec data stream on the outgoing side of that encode process can mean only one thing: audio data has been discarded. No other possibility. And that means under careful technical scrutiny (if you know where to look in the waveform) you will be able to see what has been erased by the lossy encoder. As far as I can see, MP3 does not have a 'lossless' setting (why would it need one?) ergo, something is always thrown away from the audio source even on the highest bitrate settings when passing through any MP3 encoder.

    In round numbers: if, as you say, the data rate that an audio CD delivers to the CD player is roughly 1200kb/sec, and the highest-quality MP3 encoder setting is 300kb/sec, then that means an encode ratio of 4:1 i.e. three out of four audio "blocks" will be thrown away in the encoder ..... you are listening to an audio signal with 75% of the data removed. Yet it can sound indistinguishable from the original WAV file - no surprise to anyone who knows how the ear works. Now the clever part that would have made you a rich man if you'd beaten the MP3 patent holders to the post: which 75% can you throw away without any noticeable degradation even by the most careful listener using the best equipment? To guarantee that, you absolutely must study the way we interpret sound. In other words, some science is needed, not guesswork.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  8. #8
    Join Date
    Jun 2009
    Location
    Canada
    Posts
    904

    Lossless data rate?

    First, to your point made to STHLS5, my understanding is that any lossy audio compression works primarily on the basis of the psychoacoustic principle of "masking".

    Masking, as I understand it, involves the idea that when a loud sound and a soft sound occur simultaneously, the ear will not pick up the softer sound (below a certain threshold), and provided it's far enough below the louder sound, it's safe to discard it, as the ear will not notice the difference. So I guess what's going on is that the encoder is constantly "looking" at the musical signal and performing an analysis of relative dynamic levels, and throwing away what it's safe to throw away. I realize this is probably a gross oversimplification, but I think that's the basic idea.
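    The masking idea described above can be illustrated with a deliberately crude toy. Everything here is an assumption made for the sketch: the `Tone` type, the 30 dB margin and the 500 Hz neighbourhood are invented numbers, and a real MP3 encoder uses a far more elaborate psychoacoustic model. It only shows the "loud sound hides a nearby soft sound" rule in code form.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tone:
    freq_hz: float   # frequency of the component
    level_db: float  # loudness of the component


def audible_tones(tones: List[Tone],
                  margin_db: float = 30.0,
                  neighbourhood_hz: float = 500.0) -> List[Tone]:
    """Drop any tone that has a much louder neighbour close in frequency.

    A tone counts as masked when another tone within `neighbourhood_hz`
    is at least `margin_db` louder. Both thresholds are illustrative.
    """
    kept = []
    for tone in tones:
        masked = any(
            other is not tone
            and abs(other.freq_hz - tone.freq_hz) <= neighbourhood_hz
            and other.level_db - tone.level_db >= margin_db
            for other in tones
        )
        if not masked:
            kept.append(tone)
    return kept


# One snapshot of a signal: a loud 1 kHz tone, a quiet tone right next to it,
# and an equally quiet tone far away in frequency.
frame = [Tone(1000, 80), Tone(1100, 40), Tone(5000, 40)]
kept = audible_tones(frame)
for tone in kept:
    print(f"keep {tone.freq_hz:.0f} Hz at {tone.level_db:.0f} dB")
```

    In this toy, the quiet 1100 Hz tone is dropped because the loud 1000 Hz tone sits right beside it, while the equally quiet 5000 Hz tone is kept because nothing loud is nearby.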

    As for my post, the point I was trying to make is that, even using a lossless encoder, the amount of data required to encode losslessly seems to vary considerably, if I can believe what iTunes is telling me. I understand the difference between lossless and lossy. So my question really was, to simplify: if the original WAV signal can be losslessly encoded at 320 kbps (for example), will running it through an MP3 encoder at the 320 setting actually involve any discarding of data?

  9. #9
    Join Date
    Aug 2009
    Location
    United States
    Posts
    73

    Quoting MP3 work ....

    Quote Originally Posted by A.S. View Post
    OK, before I jump in, I think I need some more input from you. Can you explain please with perhaps a described example of how you think MP3 works along the frequency reduction lines you suggest? If you have a view, I'm fairly certain that it will be held by others too. The better I understand what that prevailing view is, the better I can explain how it actually works.
    The use in MP3 of a lossy compression algorithm is designed to greatly reduce the amount of data required to represent the audio recording and still sound like a faithful reproduction of the original uncompressed audio for most listeners. An MP3 file that is created using the setting of 128 kbit/s will result in a file that is about 11 times smaller than the CD file created from the original audio source. An MP3 file can also be constructed at higher or lower bit rates, with higher or lower resulting quality.
    The compression works by reducing accuracy of certain parts of sound that are deemed beyond the auditory resolution ability of most people. This method is commonly referred to as perceptual coding. It uses psychoacoustic models to discard or reduce precision of components less audible to human hearing, and then records the remaining information in an efficient manner. MP3 uses a complex, precise masking model that is much more signal dependent than JPEG picture encoding. (Wiki)

  10. #10
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,274

    How MP3 works ...

    Quote Originally Posted by Supersnake View Post
    The use in MP3 of a lossy compressor...
    Thank you for the links but actually I was not interested in the facts but what STHLS5 thought were the facts. The facts are indisputable but I am curious about what may be a widely held opinion. Or could this just be a simple confusion on my part as to what he was saying? Either way, I'd like to hear from him.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  11. #11
    Join Date
    Jun 2009
    Location
    Canada
    Posts
    904

    MP3 and frequency response

    I have read that MP3 also somewhat truncates high frequency response, reducing or eliminating frequencies above 16Khz. Is this incorrect?

    {Moderator's comment: "k" as in kilohertz is always a small k, not a capital K, please.}

  12. #12
    Join Date
    Sep 2009
    Location
    Malaysia
    Posts
    508

    How MP3 works ....

    Quote Originally Posted by A.S. View Post
    ..but what STHLS5 thought were the facts...
    I agree with Supersnake. My reading of Wiki and various write-ups on MP3 suggests that, besides compression, it actually discards sound (frequencies) in the recording that is beyond human hearing. In Wiki, it was stated "(T)he compression works by reducing accuracy of certain parts of sound that are deemed beyond the auditory resolution ability of most people. This method is commonly referred to as perceptual coding.[13] It uses psychoacoustic models to discard or reduce precision of components less audible to human hearing, and then records the remaining information in an efficient manner." I do not have an MP3 collection and don't listen to them, and therefore I must admit I was not too keen to understand MP3 technology in detail.

    However, Wiki is a recent discovery. My views about limited frequencies were entrenched in my head long before that, and I wonder why I came to that conclusion. I believe many of the people I associate with share a similar opinion, because none so far has pointed out that my view was wrong.

    ST

  13. #13
    Join Date
    Sep 2009
    Location
    Malaysia
    Posts
    508

    How MP3 works ....

    Addendum

    Further reading on the subject has now helped me to understand better how MP3 works.

    This image is the current representation of the MP3 encoding principle. I hope this is the correct interpretation. http://awesomescreenshot.com/099bk7a2

    I am still unable to explain why I had thought that MP3 cuts off the higher frequencies. But, reading from the web, I may have relied on the early encoding technology of MP3, as shown here:

    http://awesomescreenshot.com/0b7bk643

    ST

  14. #14
    Join Date
    Sep 2009
    Location
    Malaysia
    Posts
    508

    MP3 high frequency cut-off ...

    I am still trying hard to put my finger on where and why I came to the conclusion that MP3 cuts frequencies above a certain frequency, but from reading this article, that may have been true in the early stages of MP3.

    "MP3 frequency comparison:

    Most of us at some time or other convert an audio recording to the more compact mp3 format for storage, or for play-back on mp3 players, or for sending audio files through web-space. Mp3 is very convenient for this but is what is known as a “lossy” format – which means the price you pay for a compressed file size is a loss of frequencies throughout the audio spectrum of the recording. You may not be aware of this frequency loss if you're listening on low-end audio equipment, but it is very definitely there.

    For my own interest I've done a little research into this effect. For my source I used a few seconds of a CD recording of the violinist Sophie Mutter playing part of Mozart's A-major violin concerto. I chose this because the harmonic content of her playing on her Guarnerius violin on occasion extends beyond 20kHz (about the top limit for a quality CD), and her playing is accompanied by a full classical orchestra. I copied the extract onto my computer using standard audio properties - bit rate 1411kbps, audio sample rate 44kHz, audio sample size 16 bit, and 2-channel stereo. Using CoolEdit's spectral frequency analyzer the frequency and harmonic content of the music was very obvious on screen.

    I made several mp3 versions of the sample at different bit rates, using eRightSoft's “SUPER ©” media converter, and compared the spectral frequencies of these mp3 files on screen using CoolEdit.


    Here are some of the results (the column tabulation may not show as it should):


    bit rate    cut-off frequency    compression

    1411kbps    >20kHz               1:1
    320kbps     19.5kHz              1:4.4
    192kbps     18kHz                1:7.3
    160kbps     17kHz                1:8.8
    128kbps     16kHz                1:11
    96kbps      15kHz                1:14.7
    64kbps      11kHz                1:22
    32kbps      5kHz                 1:44

    I think you'd have to have an extremely good ear and absolutely top-of-the-range audio equipment to notice any significant difference between the source (1411kbps) and 320kbps – and possibly as far down as 160kbps (iTunes suggests 160 or 192 for most purposes). When you get down to 128kbps (the “near-CD” quality beloved of mp3 file purveyors on the internet) even my old ears can tell the clear difference between that and 192 and higher. 128 and 96Kbps may be just about bearable if you're listening on low-end playback, but when you get down to 64-32kbps range you're definitely in the voice (radio and telephony) recording range – but even that low range can have an application in recording tunes – of which more below.


    When you do a mp3 conversion it's not only the top end frequency that goes (although a top cut-off of 16 or 17kHz is not significant for most adults), but frequencies well down in the audio range are lost or attenuated, the effect of which is some level of distortion. This distortion, as I've said above, is apparent when you get down to 128kbps or below, but can be lived with at 192 and above, and certainly with 320kbps. I did a couple of further experiments to explore this distortion effect. I digitally subtracted the 320kbps file from the 1411kbps file (the original) to generate a difference file consisting of the lost or attenuated frequencies. These were at a very low dB level and in practical terms would be inaudible. This was certainly not so when I subtracted the 128 file from the 320 – the lost frequencies, although still at a relatively low dB level, were certainly audible on playback and are the cause of the audible distortion on 128kbps mp3.


    I mentioned the 64-32kbps mp3 range. This can have an application for someone who records a tune and needs a little help in transcribing it from the recording. Convert the recording to 32kbps mp3 and you'll have a top cut-off frequency of 5kHz (just above the top note on a piano). This means you lose a lot of the clutter of hiss and high harmonics; the tune notes therefore sound more distinct and, if you use the spectral frequency option in a sound editor, are very obvious on screen.

    Summary: 192kbps and above gives best audio results; 128 is ok for web use and if you're not too worried about a little bit of distortion, and 32kbps can have an unexpected use in transcribing tunes."
    Source
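    The subtraction experiment described in the quoted article (original minus decoded MP3, then listening to or measuring what is left) can be sketched as follows. The sample values are made up for illustration, and the helper names `difference` and `rms_dbfs` are hypothetical. A real comparison would first need both files decoded to PCM and lined up sample for sample, since MP3 encoding and decoding normally add a small delay.

```python
import math


def difference(original, decoded):
    """Sample-by-sample residual: what the lossy copy got wrong."""
    if len(original) != len(decoded):
        raise ValueError("signals must be the same length and time-aligned")
    return [a - b for a, b in zip(original, decoded)]


def rms_dbfs(samples, full_scale=32768):
    """RMS level in dB relative to 16-bit full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # a perfect (lossless) copy leaves silence
    return 10 * math.log10(mean_square / full_scale ** 2)


# Made-up 16-bit sample values standing in for a short stretch of audio.
original = [1000, -2000, 3000, -4000]
decoded = [1001, -1998, 2999, -4003]

residual = difference(original, decoded)
print("residual samples:", residual)
print(f"residual level: {rms_dbfs(residual):.1f} dBFS")
print(f"original level: {rms_dbfs(original):.1f} dBFS")
```

    The lower the residual level relative to the original, the less the encoder has changed; a residual of exactly zero would mean the copy is lossless.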

    ST

  15. #15
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,274

    HF stripping in MP3

    It may or may not do. This HF curtailing is not strictly relevant to the discussion about how MP3 works because it's a by-product of the encode process, not its core technology. In other words, just discarding or ignoring high frequencies during encoding will only make the resulting encoded file a little smaller. Why? Because there is very little energy in those high frequencies or, put another way, these harmonics and transients make only a small contribution to the musical spectrum.

    How do we know this? Listen to music on AM radio and all high frequencies above about 4kHz are removed. Yet the bulk of the sound is still there, and the tune is unambiguous.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  16. #16
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,274

    Gobbledegook quotation - avoid!

    Quote Originally Posted by STHLS5 View Post
    ...In Wiki, it was stated "(T)he compression works by reducing accuracy of certain parts of sound that are deemed beyond the auditory resolution ability of most people...
    This quote, as I mentioned a few posts earlier, is gobbledegook. Completely meaningless to me. As I said it cannot be an accurate quote from someone who works in the field as a scientist. So it has no place here being quoted again.

    Perhaps you can decode it for us? I can't. It must mean something to you chaps because it's been quoted twice in this serious thread.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  17. #17
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,274

    Just how can MP3 discard so much information in the source audio file?

    Quote Originally Posted by STHLS5 View Post
    ...When you do a mp3 conversion it's not only the top end frequency that goes (although a top cut-off of 16 or 17kHz is not significant for most adults), but frequencies well down in the audio range are lost or attenuated, the effect of which is some level of distortion. ...
    Forget about the contraction of the high frequency range. It's a bit of a red herring i.e. it's not a fundamental part of the success of MP3. That in itself wouldn't be patentable. And MP3 technology is patented to the hilt. So, in simple language that we can all understand, what is the answer to my question about how the MP3 encoder can throw away so much of the music and still leave a high quality audio file? How could we demonstrate that simply and easily with just household goods? (I've already planned a video demo but I want you to work for your supper).

    I'm always conscious that most of our members here are not technical people, so whilst quoting published work is valuable to those with a technical mind, to the majority, it just adds confusion. So let's try and work this through using everyday language please. Ok?
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  18. #18
    Join Date
    Sep 2009
    Location
    Malaysia
    Posts
    508

    HF response of MP3

    Forget about the contraction of the high frequency range. It's a bit of a red herring i.e. its not a fundamental part of the success of MP3.
    I have just finished reading an article by K. Brandenburg of the Fraunhofer Institute, published in the EBU Technical Review, June 2000, in which he said "..it is reasonable technology to limit the frequency response of MP3 encoder to 16kHz."

    MP3 did limit the upper range of frequencies, didn't it?

    ST

    {Moderator's comment: Alan said this is not a fundamental issue. He said: "Forget about the contraction of the high frequency range............". Just saying that it's reasonable to limit HF does not mean they actually did it.}

  19. #19
    Join Date
    Aug 2009
    Location
    United States
    Posts
    73


    Quote Originally Posted by A.S. View Post
    This quote, as I mentioned a few posts earlier, is gobbledegook. Completely meaningless to me. As I said it cannot be an accurate quote from someone who works in the field as a scientist. So it has no place here being quoted again.

    Perhaps you can decode it for us? I can't. It must means something to you chaps because it's been quoted twice in this serious thread.
    Since I am the one who posted it, please permit me to be the one to 'decode' it.

    Only the term "perceptual coding" is in the professional paper. The professional paper does not define perceptual coding as being what Wikipedia says it is.

    The Wiki quote misleads the reader because that Citation 13 [the professional paper] only serves as a reference for the term "perceptual coding" and not any of the text that precedes the term: "The compression works by reducing accuracy of certain parts of sound that are deemed beyond the auditory resolution ability of most people". This method is commonly referred to as perceptual coding.[13]

    This link will download a pdf of the entire scholarly paper Signal compression based on models of human perception ...
    A screen shot of the Abstract is attached.
    Attached Images

  20. #20
    Join Date
    Jun 2009
    Location
    Canada
    Posts
    904

    And the answer is ...

    ... all right, the HF issue is peripheral.

    But if the right answer has already been presented, albeit non-technically (i.e. masking, psychoacoustic coding), then I think not addressing it just serves to confuse. Once you raise a question, I think you have to provide the answer within a reasonable time period, or people will just lose the thread.
