MP3 and audio data compression

This topic is closed.

  • MP3 and audio data compression

    For so many years I was thinking that Mp3 sounds inferior because it was missing frequencies above certain level. This view seems to be wrong in light of what Grant Bridgeman said in his article. I quote: "A good mp3 achieves a compression ratio of about 10:1 compared to an original 44.1kHz/16-bit audio file, and although some of the information has been lost, the audio quality is still acceptable to most people. This is because the algorithms are written specifically to take how we hear into account, removing audio data that our ears do not perceive as being of importance.

    However, we can still hear the difference, as mp3s can sound hollow, thin and lacking sparkle. This is because that ‘un-perceivable’ information has been removed and our brains can no longer filter the sound in the same way as we would with the original recording."

    So is Mr. Bridgeman suggesting that even though we do not perceive the sound, it is still important?

    ST

  • #2
    MP3 and audio data compression

    For so many years I was thinking that Mp3 sounds inferior because it was missing frequencies above certain level
    No, this is not true. In fact the opposite is true. The tones are missing below a certain (loudness) level. Is that what you meant to say? But even so, that's a huge simplification. Do you know the fundamental principle behind audio compression? Patents here. Note that the fundamental patents are from 1987 .... four years after the CD was launched.

    A good mp3 achieves a compression ratio of about 10:1 compared to an original 44.1kHz/16-bit audio file
    True.

    For example, if an audio compression system is specified as having 10:1 compression, that means that for every ten units of audio input to the encoder, nine units are discarded and not at all encoded. As an approximation, if the file size of an uncompressed WAV file (from an audio CD) is 330kB, if it is passed through a 10:1 audio compression system, the resulting output file is 33kB.
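
    As a rough sketch of the arithmetic in that example, assuming nothing more than "N:1 means one unit kept for every N units in" (the function names below are illustrative and not part of any encoder):

```python
def compressed_size(original_size: float, ratio: float) -> float:
    """Size after N:1 compression: one unit is kept for every `ratio` units in."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return original_size / ratio


def fraction_discarded(ratio: float) -> float:
    """Share of the input data volume that an N:1 system does not pass through."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 - 1.0 / ratio


wav_kb = 330.0  # the uncompressed WAV size used in the example above
print(compressed_size(wav_kb, 10))  # 33.0
print(fraction_discarded(10))       # nine units in ten discarded
```

    The same two lines of arithmetic apply to any ratio; only the numbers change.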

    This is because the algorithms are written specifically to take how we hear into account, removing audio data that our ears do not perceive as being of importance.
    True.

    as mp3s can sound hollow, thin and lacking sparkle
    Yes, but they can also sound fantastic - utterly indistinguishable from the source WAV file. The secret is how the encoder* is set up to do its job. The MP3 encoder is not a rigid 'one setting suits all' tool. It has lots of user-adjustable controls (see attached), and how it's set depends entirely on what the requirement is. If it's to make a super-small MP3 to use on a mobile phone then maybe the setting is not 10:1 compression but 100:1 (that means, from the source file, ninety-nine units of audio are dumped and only one passed through to the MP3) and as you can imagine, the perceived quality must and does drop dramatically. You've probably heard that horrible swooshing phasey sound from highly compressed (100:1?) MP3s. That's nothing more than the fact that when you ignore 99% of the audio, the fine detail of reverberant decay between words and notes is erased. But you can't blame MP3 overall for that - that was how the engineer wanted to process the audio in that clip. He was driven by file size, not audio quality objectives.

    This is because that ‘un-perceivable’ information has been removed and our brains can no longer filter the sound in the same way as we would with the original recording."
    A bold statement but sorry, completely impenetrable to me. Cannot be an accurate quote from an audiologist.

    The attached screen-snap shows some of the settings available to the encoder if you pay the royalty and receive a licence (as I have done) that allows you to tweak the encode routines. There is no fundamental reason why MP3 should sound bad.

    -------------------------------------------------
    *The business model behind MP3 is fascinating. The decoders (in our consumer goods and software) may or may not be royalty free to the manufacturer (I'm not sure of the latest position) but the charges are small. The encoder is comprehensively patented and money has to be paid to the patent holder. All the clever work is in the encoder. The consumer's decoder is dumb, with no controls, and can be fabricated on silicon very cheaply indeed. That's why it is the sole responsibility of the person doing the encoding to set the quality parameters, and there is nothing at all that the consumer with his dumb decoder can do to improve upon the quality of the encoded MP3 file he receives.

    These are subjects that I can write about and willingly share my knowledge. But not when I'm wasting valuable time justifying company sales policy.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK


    • #3
      320kb/s MP3

      I understand that VBR above 320kb/s is pretty well indistinguishable from the master file.

      In any case, with the memory of personal stereos getting bigger and download speeds becoming faster, I wouldn't have thought low bitrate is needed these days.

      The major problem as I see it is the severe audio compression applied to many tracks. Surely the monitors used in the mastering can reproduce this properly?


      • #4
        How does MP3 actually work?

        Originally posted by A.S.
        No this is not true. In fact the opposite is true. The tones are missing below a certain (loudness) level. Is that what you meant to say?..
        I may be wrong here, but my understanding of audio compression is that it works by discarding frequencies outside our hearing range. Am I correct to say that?

        ST


        • #5
          Sound of MP3 and bitrate

          I wonder if one factor that might affect the perceived quality of an MP3 encoding has nothing to do with the encoding algorithm itself, but the nature and dynamics of the recording.

          I have been using Apple Lossless (ALAC) to rip CDs to my hard drive for a couple of years now. As the name implies, it is a "lossless" compression system that packs the data down into a smaller space, but does not discard data.

          One of the things iTunes lets you do is to see the bit rate at which any audio track was compressed, post-compression. As we know, the CD is always 1411 kbps. After running ALAC, I get anything from 227 kbps at the low end (an old Django Reinhardt recording) to 1123 kbps at the high end (a pop recording by a Canadian band called Stars). Most music is in the middle, in the 500 - 700 kbps range, but there's a surprising amount at over 1000.
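
          For anyone wondering where the 1411 kbps figure comes from, here is a minimal sketch, assuming standard CD audio (44.1kHz, 16-bit, 2 channels). The function names are made up for illustration:

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Raw (uncompressed) PCM bit rate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def share_of_cd_rate(lossless_kbps: float, cd_kbps: float) -> float:
    """Lossless bit rate as a fraction of the raw CD rate, to two decimals."""
    return round(lossless_kbps / cd_kbps, 2)


cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
print(cd_kbps)                          # 1411.2
print(share_of_cd_rate(227, cd_kbps))   # 0.16
print(share_of_cd_rate(1123, cd_kbps))  # 0.8
```

          So the two extremes quoted above pack down to roughly 16% and 80% of the raw CD data rate respectively.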

          Would it not be the case that, if you took two pieces of music and encoded them in MP3 (or any lossy compression scheme, such as Apple's AAC), but one piece could be losslessly encoded at 400 kbps and the other required 1100 kbps, then the one that could be packed more efficiently would sound "better" - i.e. closer to the original - than the file that needed more than 2.5 times the space to encode as a lossless file, because with the more efficient file there'd be less data that needed to be thrown away? I've never read anything about this, but logically it seems to make sense, and if it's right, then it might be one reason why there are variable opinions about MP3 sound quality.


          • #6
            How does MP3 actually work ..... more info please

            Originally posted by STHLS5
            I...my understanding of audio compression is it works by discarding frequencies outside our hearing range ...
            OK, before I jump in, I think I need some more input from you. Can you explain please with perhaps a described example of how you think MP3 works along the frequency reduction lines you suggest?

            If you have a view, I'm fairly certain that it will be held by others too. The better I understand what that prevailing view is, the better I can explain how it actually works.
            Alan A. Shaw
            Designer, owner
            Harbeth Audio UK


            • #7
              Lossless and lossy

              Originally posted by EricW
              ...Would it not be the case that, if you took two pieces of music and encoded them in MP3 (or any lossy compression scheme) but one piece could be losslessly encoded at 400 kbps and the other required 1100 kbps, then the one that could be packed more efficiently would sound "better" - i.e. closer to the original...
              Whoa there! You're racing ahead! Let's take a step backwards, and clarify some terminology.

              1) "Lossy" compression: means something must have been thrown away in the sound file. It may not be obvious on listening or even on examining the waveform technically but some part of the wave must have been discarded. No escape from that fact.

              2) "Lossless" compression: means the audio waveform coming out of the encode then decode process will be exactly the same in every detail (in theory anyway) as the audio waveform that arrived at the encoder's input.

              So how can the file size be usefully reduced for the person receiving the file in a lossless system? There are a number of approaches we can discuss later but you can think of it like this. Imagine that you write a book. You print out the entire manuscript, staple it together in the correct page order and then you put it into a large cardboard box and post it to your publisher. The manuscript rattles about in the box. You can think of a WAV file like this loose-fitting box. But if you take the same manuscript and you put it in a tight-fitting JiffyBag you can reduce the size of the package greatly, but the contents are completely unaltered. That's what lossless compression does. So regardless of which method you use, as far as the receiver (your book publisher) is concerned, the valuable contents, word by word and page by page, are unaltered. But the package - the data stream size - is reduced.
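
              The 'tight-fitting bag' idea can be tried in a few lines. This is only a sketch: it uses Python's general-purpose zlib compressor as a stand-in, not an audio codec such as ALAC, and the repetitive byte string is an invented placeholder for real audio samples.

```python
import zlib

# A deliberately repetitive byte string standing in for audio data.
original = bytes(range(256)) * 64

packed = zlib.compress(original, 9)   # the tight-fitting bag
unpacked = zlib.decompress(packed)    # what the receiver unwraps

print(len(original), len(packed))     # the package got smaller
print(unpacked == original)           # True: contents unaltered, byte for byte
```

              Real music is far less repetitive than this test data, so the saving would be smaller, but the round trip is still exact.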

              If the original WAV file (uncompressed, straight off the CD) needs 1000kbits/second to describe every nuance of the waveform, then passing it through any lossy CODEC and seeing only a (say) 600kb/sec data stream on the outgoing side of that encode process can mean only one thing: audio data has been discarded. No other possibility. And that means under careful technical scrutiny (if you know where to look in the waveform) you will be able to see what has been erased by the lossy encoder. As far as I can see, MP3 does not have a 'lossless' setting (why would it need one?) ergo, something is always thrown away from the audio source even on the highest bitrate settings when passing through any MP3 encoder.

              In round numbers: if as you say the datarate that an audio CD is delivering to the CD player's read laser is roughly 1200kb/sec, and the highest-quality MP3 encoder setting is 300kb/sec, then that means an encode ratio of 4:1 i.e. three out of four audio "blocks" will be thrown away in the encoder ..... you are listening to an audio signal with 75% of the data removed. Yet it can sound indistinguishable from the original WAV file - no surprise to anyone who knows how the ear works. Now the clever part that would have made you a rich man if you'd beaten the MP3 patent holders to the post: which 75% can you throw away without any noticeable degradation even by the most careful listener using the best equipment? To guarantee that you absolutely must study the way we interpret sound. In other words, some science is needed, not guesswork.
              Alan A. Shaw
              Designer, owner
              Harbeth Audio UK


              • #8
                Lossless data rate?

                First, to your point made to STHLS5, my understanding is that any lossy audio compression works primarily on the basis of the psychoacoustic principle of "masking".

                Masking, as I understand it, involves the idea that when a loud sound and a soft sound occur simultaneously, the ear will not pick up the softer sound (below a certain threshold), and provided it's far enough below the louder sound, it's safe to discard it, as the ear will not notice the difference. So I guess what's going on is that the encoder is constantly "looking" at the musical signal and performing an analysis of relative dynamic levels, and throwing away what it's safe to throw away. I realize this is probably a gross oversimplification, but I think that's the basic idea.
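
                To make that concrete, here is a toy sketch of the idea as described in this post, and nothing more. The single fixed 30 dB margin, the function name and the sample levels are all assumptions for illustration; a real MP3 encoder uses a far more elaborate, frequency-dependent psychoacoustic model.

```python
def apply_toy_masking(components, margin_db=30.0):
    """Keep components within `margin_db` of the loudest one; drop the rest.

    `components` is a list of (frequency_hz, level_db) pairs for one short
    time slice. The fixed margin is an illustrative assumption, not how a
    real encoder decides what is masked.
    """
    if not components:
        return []
    loudest = max(level for _, level in components)
    return [(freq, level) for freq, level in components
            if loudest - level <= margin_db]


# One invented time slice: a loud tone plus progressively quieter ones.
frame = [(440, 80.0), (880, 62.0), (1200, 35.0), (5000, 20.0)]
kept = apply_toy_masking(frame)
print(kept)  # [(440, 80.0), (880, 62.0)]
```

                Note that what gets dropped here is decided by level relative to the loud sound, not by whether the frequency is high or low.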

                As for my post, the point I was trying to make is that, even using a lossless encoder, the amount of data required to encode losslessly seems to vary considerably, if I can believe what iTunes is telling me. I understand the difference between lossless and lossy. So my question really was, to simplify, if the original WAV signal can be losslessly encoded at 320 kbps (for example), will running it through an MP3 encoder at the 320 setting actually involve any discarding of data?


                • #9
                  Quoting MP3 work ....

                  Originally posted by A.S.
                  OK, before I jump in, I think I need some more input from you. Can you explain please with perhaps a described example of how you think MP3 works along the frequency reduction lines you suggest? If you have a view, I'm fairly certain that it will be held by others too. The better I understand what that prevailing view is, the better I can explain how it actually works.
                  The use in MP3 of a lossy compression algorithm is designed to greatly reduce the amount of data required to represent the audio recording and still sound like a faithful reproduction of the original uncompressed audio for most listeners. An MP3 file that is created using the setting of 128 kbit/s will result in a file that is about 11 times smaller than the CD file created from the original audio source. An MP3 file can also be constructed at higher or lower bit rates, with higher or lower resulting quality.
                  The compression works by reducing accuracy of certain parts of sound that are deemed beyond the auditory resolution ability of most people. This method is commonly referred to as perceptual coding. It uses psychoacoustic models to discard or reduce precision of components less audible to human hearing, and then records the remaining information in an efficient manner. MP3 uses a complex, precise masking model that is much more signal dependent than JPEG picture encoding. (Wiki)


                  • #10
                    How MP3 works ...

                    Originally posted by Supersnake
                    The use in MP3 of a lossy compressor...
                    Thank you for the links but actually I was not interested in the facts but what STHLS5 thought were the facts. The facts are indisputable but I am curious about what may be a widely held opinion. Or could this just be a simple confusion on my part as to what he was saying? Either way, I'd like to hear from him.
                    Alan A. Shaw
                    Designer, owner
                    Harbeth Audio UK


                    • #11
                      MP3 and frequency response

                      I have read that MP3 also somewhat truncates the high frequency response, reducing or eliminating frequencies above 16Khz. Is this incorrect?

                      {Moderator's comment: "k" as in kilo Hertz is always small k, not capital K, please.}


                      • #12
                        How MP3 works ....

                        Originally posted by A.S.
                        ..but what STHLS5 thought were the facts...
                        I agree with Supersnake. My reading of Wiki and various write-ups on MP3 suggests that, besides compressing, it actually discards sounds (frequencies) in the recording that are beyond human hearing. In Wiki, it was stated "(T)he compression works by reducing accuracy of certain parts of sound that are deemed beyond the auditory resolution ability of most people. This method is commonly referred to as perceptual coding. It uses psychoacoustic models to discard or reduce precision of components less audible to human hearing, and then records the remaining information in an efficient manner." I do not have an MP3 collection and don't listen to MP3s, and therefore I must admit I was not too keen to understand MP3 technology in detail.

                        However, Wiki is a recent discovery. My views about limited frequencies were entrenched in my head long before that, and I wonder why I came to that conclusion. I believe many of the people I associate with share a similar opinion, because none so far has pointed out that my view was wrong.

                        ST


                        • #13
                          How MP3 works ....

                          Addendum

                          Further reading on the subject has now helped me to understand better how MP3 works.

                          This image is the current representation of the MP3 encoding principle. I hope this is the correct interpretation. http://awesomescreenshot.com/099bk7a2

                          I am still unable to explain why I had thought that MP3 cuts off the higher frequencies. But from reading on the web, I may have relied on a description of early MP3 encoding technology like the one below:-

                          http://awesomescreenshot.com/0b7bk643

                          ST


                          • #14
                            MP3 high frequency cut-off ...

                            I am still trying hard to put my finger on where and why I came to the conclusion that MP3 cuts off everything above a certain frequency, but judging by this article, that may have been true in the early stages of MP3.

                            "MP3 frequency comparison:

                            Most of us at some time or other convert an audio recording to the more compact mp3 format for storage, or for play-back on mp3 players, or for sending audio files through web-space. Mp3 is very convenient for this but is what is known as a “lossy” format – which means the price you pay for a compressed file size is a loss of frequencies throughout the audio spectrum of the recording. You may not be aware of this frequency loss if you're listening on low-end audio equipment, but it is very definitely there.

                            For my own interest I've done a little research into this effect. For my source I used a few seconds of a CD recording of the violinist Sophie Mutter playing part of Mozart's A-major violin concerto. I chose this because the harmonic content of her playing on her Guarnerius violin on occasion extends beyond 20kHz (about the top limit for a quality CD), and her playing is accompanied by a full classical orchestra. I copied the extract onto my computer using standard audio properties - bit rate 1411kbps, audio sample rate 44kHz, audio sample size 16 bit, and 2-channel stereo. Using CoolEdit's spectral frequency analyzer the frequency and harmonic content of the music was very obvious on screen.

                            I made several mp3 versions of the sample at different bit rates, using eRightSoft's “SUPER ©” media converter, and compared the spectral frequencies of these mp3 files on screen using CoolEdit.


                            Here are some of the results (the column tabulation may not show as it should):


                            bit rate    cut-off frequency    compression

                            1411kbps    >20kHz               1:1
                            320kbps     19.5kHz              1:4.4
                            192kbps     18kHz                1:7.3
                            160kbps     17kHz                1:8.8
                            128kbps     16kHz                1:11
                            96kbps      15kHz                1:14.7
                            64kbps      11kHz                1:22
                            32kbps      5kHz                 1:44

                            I think you'd have to have an extremely good ear and absolutely top-of-the-range audio equipment to notice any significant difference between the source (1411kbps) and 320kbps – and possibly as far down as 160kbps (iTunes suggests 160 or 192 for most purposes). When you get down to 128kbps (the “near-CD” quality beloved of mp3 file purveyors on the internet) even my old ears can tell the clear difference between that and 192 and higher. 128 and 96Kbps may be just about bearable if you're listening on low-end playback, but when you get down to 64-32kbps range you're definitely in the voice (radio and telephony) recording range – but even that low range can have an application in recording tunes – of which more below.


                            When you do a mp3 conversion it's not only the top end frequency that goes (although a top cut-off of 16 or 17kHz is not significant for most adults), but frequencies well down in the audio range are lost or attenuated, the effect of which is some level of distortion. This distortion, as I've said above, is apparent when you get down to 128kbps or below, but can be lived with at 192 and above, and certainly with 320kbps. I did a couple of further experiments to explore this distortion effect. I digitally subtracted the 320kbps file from the 1411kbps file (the original) to generate a difference file consisting of the lost or attenuated frequencies. These were at a very low dB level and in practical terms would be inaudible. This was certainly not so when I subtracted the 128 file from the 320 – the lost frequencies, although still at a relatively low dB level, were certainly audible on playback and are the cause of the audible distortion on 128kbps mp3.


                            I mentioned the 64-32kbps mp3 range. This can have an application for someone who records a tune and needs a little help in transcribing it from the recording. Convert the recording to 32kbps mp3 and you'll have a top cut-off frequency of 5kHz (just above the top note on a piano). This means you lose a lot of the clutter of hiss and high harmonics; the tune notes therefore sound more distinct and, if you use the spectral frequency option in a sound editor, are very obvious on screen.

                            Summary: 192kbps and above gives best audio results; 128 is ok for web use and if you're not too worried about a little bit of distortion, and 32kbps can have an unexpected use in transcribing tunes."
                            Source

                            ST
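
                            The compression column in the quoted table appears to be simply the 1411kbps source rate divided by the MP3 bit rate. That is my assumption, as the article does not say how the column was worked out. A quick sketch to check it (the names are illustrative):

```python
CD_KBPS = 1411  # bit rate of the CD source, as given in the quoted table


def compression_ratio(mp3_kbps: int, source_kbps: int = CD_KBPS) -> float:
    """Return N in a 1:N ratio, rounded to one decimal place."""
    return round(source_kbps / mp3_kbps, 1)


for kbps in (320, 192, 160, 128, 96, 64, 32):
    print(f"{kbps}kbps -> 1:{compression_ratio(kbps):g}")
```

                            The results agree with the quoted figures to within rounding, which suggests the column reflects only the chosen bit rate and says nothing by itself about which parts of the audio were discarded.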


                            • #15
                              HF stripping in MP3

                              It may or may not do. This HF curtailing is not strictly relevant to the discussion about how MP3 works because it's a by-product of the encode process, not its core technology. In other words, just discarding or ignoring high frequencies during encoding will only make the resulting encoded file a little smaller. Why? Because there is very little energy in those high frequencies or, put another way, these harmonics and transients make only a small contribution to the musical spectrum.

                              How do we know this? Listen to music on AM radio and all high frequencies above about 4kHz are removed. Yet the bulk of the sound is still there, and the tune is unambiguous.
                              Alan A. Shaw
                              Designer, owner
                              Harbeth Audio UK
