Page 2 of 3
Results 21 to 40 of 54

Thread: MP3 and audio data compression

  1. #21
    Join Date
    Sep 2007
    Location
    England
    Posts
    242

    Default

    But if the right answer has already been presented, albeit non-technically (i.e. masking, psychoacoustic coding
    I'd say that was a deeply technical contribution; about as deep in the subject of sound as one can possibly go.

    "Masking, Psychoacoustic coding" etc. etc... With one or two exceptions, I wonder who reading this - members and non-members alike - have the remotest idea what that is all about. What sets this group apart is the demystification of ideas - at least, that's how I see it. It's all very well quoting from erudite sources - but that's not what we're here for. Leave that to the real boffins. We need earthy, basic ideas. Because only those stick in the mind.

    OK, I asked for some suggestions about how we could demonstrate this concept. Assuming that one knows what "masking, psychoacoustic coding" is all about then a simple demonstration would be worth a million words and pictures. Ideas? As I said, drawing not on lab equipment but everyday life.

  2. #22
    Join Date
    Jun 2009
    Location
    Canada
    Posts
    886

    Default OK, I see what you're after now

    If you're looking for a non-technical analogy that would be reasonably comprehensible to most people, one idea that occurs to me is to compare various bitrates of audio compression to another kind of compression that many are familiar with, i.e. JPEG compression in digital photography, based on the number of megapixels available to encode an image.

    I think it's correct to say that, at least at a modest image size, the eye is unlikely to detect much difference between, say, an image at 5 megapixels and another at 10 (camera and lens quality being otherwise equal), though one image contains twice as much data.

    Would this kind of visual analogy be useful, perhaps? The benefit being that the point would be made clear by people's ability to compare the two images side by side, which can't be done with sound. (If so, I can't volunteer to put it together as I have a busy week ahead, but if you think the idea has merit maybe someone else could?)

  3. #23
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,204

    Default

    Yes, exactly that sort of thing. We tried that some time ago here.

    But - this masking business is about sound. Wouldn't it be more useful to illustrate the point to use a sonic example? If we're quite clear in our own minds what the previously quoted 'masking' is all about then surely we can dream up an example or two? Eh?

    The internet is a curse in that it's all too easy to reel off other people's quotes without really comprehending what is being quoted. You as a lawyer will be familiar with the low technical knowledge of any court jury (a cross-section of society). What I'm asking for is to take the subject and make it accessible to the ordinary juror with apt, succinct, relevant examples. I can see a way but I'm exhausted having spent the whole afternoon on making another TechTalk video. It shouldn't be for me to educate! I'm just a humble speaker designer not a tutor!
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  4. #24
    Join Date
    Jun 2009
    Location
    Canada
    Posts
    886

    Default A sonic example of 'psychoacoustic masking' ...

    Okay, I'll take a stab.

    Imagine you're sitting in your living room watching TV at a moderate volume. It's raining steadily outside, and you can hear the rain if you focus on it, but the TV is louder. Suddenly, there's an enormous thunderclap outside your window, that lasts 3-4 seconds.

    With ordinary digital recording (such as CD's WAV file format), all three sonic events would be registered and encoded (i.e. turned into 1s and 0s by the digital recording process). However, the way the ear works is that while the sound of the very loud thunderclap is happening, you will not be hearing the soft steady sound of the rain, and if the thunder's loud enough, for the 3-4 seconds it's happening, you probably won't hear your TV either. So all the resolution and file size of a recording system devoted to capturing rain + TV + thunder is in a sense "wasted", because while the loud thunder is happening, that's all your ears are going to pick up.

    So what MP3 does (Apple's AAC system likewise), as I understand it, is to continually analyze the relative levels of various sounds that occur together in time, and if the lower-level sounds are below a certain threshold, it will "decide" that those low-level sounds are unnecessary because they will be "masked" by the louder sound the way the thunder masks the rain, and the TV as well. So the MP3 system won't encode (it will completely ignore) sounds below a certain loudness, provided that a sufficiently loud masking sound exists at the same time. This reduces file size, while creating very little if any audible degradation.
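
    The thunder-and-rain rule described above can be written down as a toy sketch. This is not how an MP3 or AAC encoder is actually implemented (real coders work band by band with a psychoacoustic model); the sound levels, the 30 dB margin and the function name are all made up purely for illustration.

```python
def audible_components(levels_db, margin_db=30.0):
    """Keep only the sounds within margin_db of the loudest simultaneous sound.

    levels_db maps a sound's name to its level in dB (hypothetical values).
    margin_db is an assumed masking margin, not a figure from any standard.
    """
    if not levels_db:
        return {}
    loudest = max(levels_db.values())
    # A sound is treated as "masked" (and dropped) when it sits more than
    # margin_db below the loudest sound happening at the same moment.
    return {name: level for name, level in levels_db.items()
            if loudest - level <= margin_db}


before_thunder = {"rain": 40.0, "tv": 60.0}
during_thunder = {"rain": 40.0, "tv": 60.0, "thunder": 100.0}

kept_before = audible_components(before_thunder)
kept_during = audible_components(during_thunder)
print("before thunder:", sorted(kept_before))
print("during thunder:", sorted(kept_during))
```

    With these invented numbers both the rain and the TV are kept before the thunderclap, and only the thunder is kept while it lasts.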

    Obviously, at some point the data reduction does become audible, and the threshold for audibility will depend on such things as the quality of the reproduction system, level of ambient noise in the environment, and so on. On iTunes, which is what I'm most familiar with, I found the old standard of 128 kbps subjectively wasn't good enough because too much audio information had been ignored by the MP3 coder to generate a small file size. They've since moved to 256 kbps (AAC), which is a lot closer to CD standard, with a bigger file size and certainly good enough for background listening (though I encode my own CDs in Lossless). But for fun, you can encode as low as 16 kbps, which means you're discarding between 98 and 99% of the original data. That's certainly audible, and I would never willingly encode music at that very low bitrate, but even then, what's surprising to me is how identifiable the basic elements of the music still are.
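
    The "98 to 99%" figure can be checked with a few lines of arithmetic. The sketch below assumes the original is standard CD audio (44.1 kHz, 16 bits, stereo); the function name is just for illustration.

```python
# CD audio: 44,100 samples/s x 16 bits x 2 channels, expressed in kbps.
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbps


def percent_discarded(encoded_kbps, source_kbps=CD_BITRATE_KBPS):
    """Percentage of the source data rate that the encoded stream leaves out."""
    return 100.0 * (1.0 - encoded_kbps / source_kbps)


for kbps in (256, 128, 16):
    print(f"{kbps:>3} kbps discards about {percent_discarded(kbps):.1f}% "
          f"of the CD data rate")
```

    On that assumption, 16 kbps keeps a little over 1% of the CD data rate, which is consistent with the figure quoted above.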

  5. #25
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,204

    Default Brilliant explanation of 'psychoacoustic masking'

    Quote Originally Posted by EricW View Post
    ...Imagine you're sitting in your living room watching TV at a moderate volume. It's raining steadily outside, and you can hear the rain if you focus on it, but the TV is louder. Suddenly, there's an enormous thunderclap outside your window, that lasts 3-4 seconds...
    Thanks for taking the trouble to reply with such a concise explanation of the phenomenon of 'psychoacoustic masking'. Good work. That's just as I understand it to be. There are a couple of points I'd like to make:

    • Isn't it much better when we convert what we read and discover into our own simple words rather than quote from others directly? Ideas and concepts are better fixed in our minds when they are expressed in simple ordinary language, preferably with some visual association - the rain, the TV, the thunder all do that admirably and unforgettably.


    • If we write here as you did as if you were trying to win-over a jury of ordinary non-technical people, a few simple words and images can have more impact than an encyclopaedia of erudite knowledge. Can we try and make this our 'style sheet' for technical discussion?

    OK, assuming that your explanation is now absorbed, we can move on to a vivid demonstration. OK, I'll make it, but please give me a day or two.

    ----------
    P.S. One important point: the discovery that human ears are susceptible to 'masking' cannot be assumed to be universal for all living creatures with ears. It may not apply to your cat or dog. It is a curious by-product of the evolution of our ears and brains. As with many patents based on the observation of a characteristic of human behaviour or life, the MP3 patent holders seized upon a business opportunity by finding a way to use the oddities of the human hearing system to their advantage to make money. Similarly, photographic pioneer George Eastman (Kodak) used his observation that the human eye was most sensitive to the colour yellow when trade-marking his company's logo. But remember, human yellow sensitivity may well not apply to other animals! We're dealing here in psychoacoustics with quirks of the human hearing system, not necessarily hearing in all species.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  6. #26
    Join Date
    Feb 2010
    Location
    England and Cyprus
    Posts
    370

    Default Opposite of 'masking'?

    The opposite of "masking" must be "focussing".

    I have read that a mother will, even in very difficult circumstances, hear the sound of her infant crying - even at modest volumes, it can wake her from a deep sleep. Is this principle used? I wonder if, when we hear a very compelling melody, we pay much less attention to the supporting background sound, the accompaniment?

  7. #27
    Join Date
    Mar 2010
    Location
    uk
    Posts
    141

    Default Perception, the brain and beliefs

    Having pondered this subject for a while (mainly in the middle of the night while waiting to see if our youngest really has gone back to sleep) some thoughts occur:

    Even in the most straightforward mic to tape recording set-up, the sound information that ends up on tape is unlikely to be the sound information that we 'heard'. Leaving aside the physical aspects of shape of head and ears and so on, the simple fact that we are processing what we hear and the mic isn't means that the information is different. I'd not listened to any auditory illusions before, but things based on the Shepard Scale here, are just as intriguing as the optical illusions that I am more familiar with.

    Having come to terms with the fact that we hear what we think we hear in the same way that we read what we think we read - it follows that our brains are doing a fair bit of filling in; we don't need all the letters in a word to read the word and, apparently, we don't need all the harmonics, or even the fundamental, to hear a note - the brain will fill in tones that it perceives to be missing. What the brain does seem to look for, though, are patterns and order and continuity; we want things to look and sound 'right' - based on culture and learning and whatever else has gone into forming our own brains.

    Listening to interrupted speech, bad mobile phone reception for example, aside from being inconvenient also seems (to me) to be 'fatiguing' to the senses. Similarly talking and listening over background noise is ultimately rather tiring; I worked for many years in a shop department that had an escalator running through it, though we weren't consciously aware of raising our voices or straining to hear, the sense of relief when it was turned off at the end of the day and conversation got back to normal was always noticeable.

    VINYL REPLAY:

    An issue I have come across with vinyl replay is that some set-ups do a much better job of allowing me to hear the surface noise as something separate from the music and thereby more readily ignore the noise. When the noise and music are less differentiated replay is less satisfactory. I assume that this is down to the brain having enough information to identify that the noise is just that, and therefore de-emphasise it. That extra information is not something I would be conscious of hearing though.

    MP3:

    To get back to mp3 and leaving out information then, I think I grasp the various forms of masking that go on and in a music recording we are talking about information that has been picked up by a microphone but that we would not have consciously heard - it would have been masked for one reason or another. This doesn't actually mean that our ears didn't pick it up, does it? Just that our perception didn't make it known to us. I'm assuming that we have a notion of 'sound permanence' in a similar way to 'object permanence' - it's only the very young who believe that when we hide behind a book we have actually disappeared. So if I could hear the TV before the thunder clap and hear it after the thunder clap then my perception would be that it carried on through it as well. I don't know how much use mp3 encoding makes of the brain's ability to fill in the gaps, to hear sound that isn't actually there, to hear what it thinks it should hear - but my guess is that it plays a part.

    Taking all of this together it seems that what allows lossy files to work is partly predicated on getting our brains to work harder; it's not just putting the manuscript in a smaller box but also leaving out letters from some of the words because we can still read them anyway. The fact that a high bitrate mp3 is all but indistinguishable from a wav must mean that either the extra information really was superfluous, or that our brains don't miss it because they can join the dots for themselves mustn't it?

    An encoding format which makes use of the brain's strengths (and weaknesses for that matter) is therefore 'shifting the workload' isn't it? By reducing the amount of information we are given (whilst not ignoring the fact that we can be given too much information at times) is it not increasing the rate at which the brain tires? I am of course assuming that the act of perception is in itself tiring, I find it to be so - but that doesn't necessarily mean that it actually is.

    A BROADCAST MONITOR SPEAKER:

    Why should this matter to a Harbeth user? One of the prerequisites for a broadcast monitor must be that it can be used all day without causing fatigue to the user; that the information it presents must be put across in such a way that the listener has to make as little effort as possible to hear it, even at / especially at low volume and with other noise around. (I hope I've got that right?)

  8. #28
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,204

    Default Data reduction

    Quote Originally Posted by weaver View Post
    ...An encoding format which makes use of the brain's strengths (and weaknesses for that matter) is therefore 'shifting the workload' isn't it? By reducing the amount of information we are given (whilst not ignoring the fact that we can be given too much information at times) is it not increasing the rate at which the brain tires? ...
    Your argument that data reduction (i.e. converting a .wav file to an MP3 file) discards much (actually most) of what the microphone collects is definitely true. And easy to prove by watching the byte count of a digitised (wav) audio file diminish as it is converted to MP3 format with ever greater compression. Even a very high quality MP3 throws away most of the fine detail on the file. But as you note, it is virtually - or I'd say actually - impossible (except under the most incredibly contrived controlled conditions with young trained ears, which precludes all of us here) to detect, say, a 256kb MP3 versus the original WAV file. That's my experience anyway and I'd put money on it.

    But as to your hypothesis that the brain is having to work overtime to 'make-up' or patch-in the missing data lost during reduction, no that is not the case. Whatever mysterious masking process is going on in (as I recall it) the hair cells of the ear that wiggle in response to impinging sound waves, it is they which mechanically define the way masking works, not some post-processing function in the brain. You can, crudely, think of the hair cells as a long row of green bottles sitting on a wall. The sound wave tickles the first bottle and passes along the chain, the bottles bending with the wave as it passes. As the bottles are quite fat, they can't discriminate very fine differences in tone of the sound wave, and this is the core of the masking concept. So the brain simply isn't receiving the fine-detail from the hair cells as sensory input. Ok, I could look this up - the man who did much of the research was Zwicker - but I'll leave that to you. And Zwicker's work is the core of masking and hence, the MP3 system.

    Incidentally, when DCC appeared (and I still use it - 256kb, 4:1 encoding, completely transparent) I approached Philips and asked them if I could somehow incorporate their compression technology in our active speakers. My concept was that if we were erasing and not feeding the speaker with all the fine detail that was masked in the ear anyway, this would reduce the strain on the speaker, distortion would diminish and the result would be a cleaner sound entirely due to data reduction. I still believe that is a valid argument. The easier we make the loudspeaker's task, surely the less distortion it will produce. And let's not forget, speakers produce oodles of distortion. Even good ones!

    Of course, data reduction is underpinned by intensive research (almost certainly the most heavily studied aspect of audio, ever) and it has to make assumptions based on the typical human ear plus a margin for error. It is entirely possible that there are young (that's mandated) listeners who can hear things we can't. But as they are not socio-demographically the consumers of quality audio I'd say - and the MP3 people would definitely say - that validates their MP3 masking concept.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  9. #29
    Join Date
    Mar 2010
    Location
    uk
    Posts
    141

    Default Spectral tonal components

    Thank you Alan.

    I think I had perhaps been looking at this issue from the wrong end. I had assumed that mp3 relied (in part) on the brain making up for what had been left out; having read a little more this morning it seems that it is more to do with the mp3 sweeping its mess under a rug such that the brain doesn't notice (or more technically "to minimize the audibility of the quantization noise.")

    I came across an article originally from the Journal of The Institute of Electrical and Electronics Engineers (from which the quote above is taken), much of which is very informative and rather more of which went straight over my head. But I would recommend that anyone interested reads it, this version is easier to read but omits the diagrams, this version is complete but a little less easy to navigate on screen.

    I found passages such as this:

    (to) Separate spectral values into tonal and non-tonal components. Both models identify and separate the tonal and noise-like components of the audio signal because the masking abilities of the two types of signal are different.
    from the section on 'The psychoacoustic model' to be of interest in that the algorithm is effectively making decisions about what is noise and what is music (as I understand it).

  10. #30
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,204

    Default How we think we hear and how we actually hear are very different ....

    To quote from your linked paper ...

    The MPEG/audio committee conducted extensive subjective listening tests during the development of the standard. The tests showed that even with a 6-to-1 compression ratio (stereo, 16 bits/sample, audio sampled at 48 kHz compressed to 256 kbits/sec) and under optimal listening conditions, expert listeners were unable to distinguish between coded and original audio clips with statistical significance. Furthermore, these clips were specially chosen because they are difficult to compress.
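
    The 6-to-1 figure in that quote follows directly from the numbers it gives (two channels, 16 bits per sample, 48 kHz, coded at 256 kbit/s). As a quick worked check, nothing more:

```latex
\[
  \underbrace{2}_{\text{channels}} \times
  \underbrace{16}_{\text{bits/sample}} \times
  \underbrace{48\,000}_{\text{samples/s}}
  = 1\,536\,000 \ \text{bit/s} = 1536 \ \text{kbit/s},
  \qquad
  \frac{1536 \ \text{kbit/s}}{256 \ \text{kbit/s}} = 6 .
\]
```
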
    I am highly sceptical about the talk surrounding cables etc. because, quite simply, anyone who has spent an afternoon or two just skimming the vast library of research papers covering our hearing system would have to conclude we just do not have the auditory acuity we think we have. And following that, if we are not actually hearing what we think we are hearing there is limitless potential for self delusion.

    Empirical results also show that the human auditory system has a limited, frequency dependent, resolution. This frequency dependency can be expressed in terms of critical band widths which are less than 100 Hz for the lowest audible frequencies and more than 4 kHz at the highest. The human auditory system blurs the various signal components within a critical band although this system's frequency selectivity is much finer than a critical band.
    Considering the second quote: we sense high frequencies in 'blocks' or bands that are up to 4000Hz wide. Yes, that's right: a whopping 4kHz wide. So, at the top end of the scale, 16-20kHz is sensed in one wide spectral block.
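
    To put rough numbers on those critical bands, here is a small sketch using the analytic approximation published by Zwicker and Terhardt. The formula is not taken from the paper quoted in this thread, so treat it as an outside assumption; the function name is hypothetical.

```python
def critical_bandwidth_hz(centre_hz):
    """Approximate critical bandwidth (Hz) around a centre frequency (Hz).

    Uses the Zwicker-Terhardt approximation:
        CB = 25 + 75 * (1 + 1.4 * f_kHz**2) ** 0.69
    It is a curve fit to listening data, not an exact law.
    """
    f_khz = centre_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69


for f in (100, 1000, 4000, 16000):
    print(f"around {f:>5} Hz the critical band is roughly "
          f"{critical_bandwidth_hz(f):.0f} Hz wide")
```

    The approximation gives a band of about 100 Hz at the bottom of the range and more than 4 kHz around 16 kHz, which is in line with the quoted passage.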

    When listening to so many modern loudspeakers I do wonder if their designers have even the vaguest notion about the ear. But they can't because the sound is so damned fatiguing, which in my book, dooms the design to the scrap heap. The natural, un-amplified human voice box simply cannot sound fatiguing because it is crafted from soft, warm, nourished, springy living tissue. That means it is perfectly damped. So if a loudspeaker reproduces voice with a degree of hardness (many do) it says to the listener's subconscious 'that is not a real voice because real voices don't sound like that .... so what is it? A threat?' And being on-edge in anticipation is the definition of listening fatigue.

    Simply: if you want to separate folk from their cash, you have to know how to take full advantage of their auditory fallibility.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  11. #31
    Join Date
    Jun 2009
    Location
    Canada
    Posts
    886

    Default Speaker distortion and MP3 encoding

    Quote Originally Posted by A.S. View Post

    Incidentally, when DCC appeared (and I still use it - 256kb, 4:1 encoding, completely transparent) I approached Philips and asked them if I could somehow incorporate their compression technology in our active speakers. My concept was that if we were erasing and not feeding the speaker with all the fine detail that was masked in the ear anyway, this would reduce the strain on the speaker, distortion would diminish and the result would be a cleaner sound entirely due to data reduction. I still believe that is a valid argument. The easier we make the loudspeaker's task, surely the less distortion it will produce. ...
    If this hypothesis is valid, wouldn't it be just as easily tested with a passive loudspeaker, by comparing distortion produced by a speaker fed a WAV file, and then comparing that to the distortion produced by the same speaker when fed a 256 kbps MP3 file that has been bit-reduced from the original WAV file? The latter would have the fine detail stripped out, so if the theory holds, the speaker should produce less distortion with the MP3 than with the WAV. Or am I missing something?

  12. #32
    Join Date
    Sep 2009
    Location
    Malaysia
    Posts
    508

    Default Data reduction and speaker stress?

    Quote Originally Posted by EricW View Post
    If this hypothesis is valid, wouldn't it be just as easily tested with a passive loudspeaker......
    No, I don't think that's the same. Alan is postulating a new idea about making the loudspeakers produce only the sound that would be heard by us. He is making the loudspeakers ignore all other frequencies that would have been masked in the ears. Clever indeed. It doesn't matter if the source is a CD, SACD or other format; the loudspeakers play the sound as we would hear it.

    My only concern is the loss of frequencies above 18 or 19kHz when the file is compressed, which defeats the purpose of SACD or other high resolution formats, which supposedly capture frequencies up to 50kHz. But then the whole argument in favour of MP3 is that humans do not need to hear frequencies above 16kHz for musical enjoyment.

    I wonder if wireless home theater systems for rear speakers already implement a similar concept.

    ST

  13. #33
    Join Date
    Mar 2010
    Location
    uk
    Posts
    141

    Default About time we had a paper from BBC R&D?

    This paper (link to pdf) actually addresses the issue of cascaded audio coding, ie what happens when several compression techniques are applied successively - as happens in broadcast situations.

    Its authors (Andrew Mason & David Marston) are I believe employed at Kingswood Warren and as such may either be known to A.S. and/or (it's just possible) call in here. That being the case, I ask that mods feel absolutely free to edit/comment/delete as they see fit.

    Introduction

    The production and broadcast of audio is a technically complex operation. The audio signal will typically pass through several distinct processes including recording, sending to the studio, postproduction and so on. Increasingly, people have been turning to bitrate reduction to reduce the cost, or to increase the speed, of these processes. In isolation, the impact on audio quality of a single application of bitrate reduction can appear negligible. However, the reality is that the cumulative effects of bitrate reduction throughout the broadcast chain is far from negligible. If each process removes all redundant audio information, or uses the signal to mask the noise being introduced, then the next process might have nothing left to remove, or will see previously introduced noise as signal to be used to mask more noise.
    I was interested in simply seeing how such a study is carried out, but if you scroll straight down to the conclusions you will find this:

    The results clearly show that the cumulative effect of cascaded audio coding can be highly detrimental to audio quality, even when each stage in the chain makes only a small reduction in quality.

  14. #34
    Join Date
    Jun 2009
    Location
    Canada
    Posts
    886

    Default

    Quote Originally Posted by STHLS5 View Post
    No, I don't think that's the same. Alan is postulating a new idea about making the loudspeakers produce only the sound that would be heard by us. He is making the loudspeakers ignore all other frequencies that would have been masked in the ears. Clever indeed. ...

    ST
    Yes, but isn't this exactly the same as what happens when you amplify an MP3 file and send it on to the (passive) loudspeaker? Either way, the "fine detail" has been stripped out of the signal and in theory the loudspeaker should have less work to do. If it's different, what exactly is the difference?

  15. #35
    Join Date
    Jan 2006
    Location
    South of England, UK
    Posts
    4,204

    Default

    Quote Originally Posted by weaver View Post
    https://www.ebu.ch/en/technical/trev/trev_304-cascading.pdf .... are I believe employed at Kingswood Warren....
    As we reported in a Newsletter earlier this year, BBC Kingswood Warren no longer exists and has been demolished for housing development.

    As far as I know, those (very) few remaining staff in the BBC's Research Dept. are engaged in non-audio work. As I understand it, blue sky audio research has ceased at the BBC. Sadly, that sort of erudite paper hasn't found a practical application in a world where digitally compressed audio is commoditised. The BBC does not have the remit or the resources to try and persuade the global audio industry to address issues that they don't see as a problem - especially if the solutions require more training, more equipment or changes in working practice.
    Alan A. Shaw
    Designer, owner
    Harbeth Audio UK

  16. #36
    Join Date
    Mar 2010
    Location
    uk
    Posts
    141

    Default Pitch and tones

    Quote Originally Posted by A.S. View Post
    To quote from your linked paper ...

    Empirical results also show that the human auditory system has a limited, frequency dependent, resolution. This frequency dependency can be expressed in terms of critical band widths which are less than 100 Hz for the lowest audible frequencies and more than 4 kHz at the highest. The human auditory system blurs the various signal components within a critical band although this system's frequency selectivity is much finer than a critical band.
    Considering the second quote: we sense high frequencies in 'blocks' or bands that are up to 4000Hz wide. Yes, that's right: a whopping 4kHz wide. So, at the top end of the scale, 16-20kHz is sensed in one wide spectral block.
    Ok, I may need help with the maths again here:

    At the bottom end of the piano scale 100 Hz is the equivalent of roughly two octaves: eg. C at 33 Hz to C at 133 Hz.

    In the 16-20kHz range that you are talking about, 4000 Hz is actually around a quarter of an octave, or two whole notes.

    [This is way off the top of the keyboard, but starting with C at 4kHz (4186 Hz) then roughly the next C would be 8 kHz the next 16 and the next 32. So between 16 and 32 it's 16 kHz to an octave, or 4 kHz is roughly two whole tones. Very approximate, and as I said: I may need help with the maths.]

    I'm not sure whether this agrees with (the sense of) what you are saying or not: to differentiate between two (imaginary piano) tones above 16 kHz they need to be 2 kHz apart - and a semi-tone would be 1000 Hz. We can therefore say that the ear senses a wide frequency range as a block.

    But, in terms of sensitivity to pitch as opposed to frequency the picture is different isn't it? 4000 Hz at the top of the audible range is a much smaller range of tones than 100 Hz is at the bottom.

    I just want to be clear that in musical terms we are talking about pitch, about tones and semi-tones; whereas in audio terms we talk about frequencies. So what may be a wide range in audio terms is actually a relatively small range in human, musical terms.
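
    For anyone else who wants help with the maths: the size of a musical interval depends on the ratio of the two frequencies, not on their difference in Hz. A minimal sketch, assuming equal temperament (12 semitones to the octave); the function name is only for illustration.

```python
import math


def interval_semitones(low_hz, high_hz):
    """Size of the interval between two frequencies, in equal-tempered semitones."""
    return 12.0 * math.log2(high_hz / low_hz)


for low, high in ((33, 133), (16_000, 20_000)):
    semis = interval_semitones(low, high)
    print(f"{low} Hz to {high} Hz: {semis:.1f} semitones "
          f"({semis / 12:.2f} octaves)")
```

    So the 100 Hz span at the bottom is about two octaves, while the 4 kHz span from 16 to 20 kHz is a little under four semitones, roughly the two whole tones estimated above.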

  17. #37
    Join Date
    Feb 2010
    Location
    England and Cyprus
    Posts
    370

    Default Frequencies, pitch etc.

    It's not your maths I have a problem with, Weaver, it's your reasoning.

    I think all we can say is the ear is much, much more discriminating around middle C (256Hz) and concert pitch A (440Hz) than it is at the top of the piano keyboard, counting in kiloHertz.

    And I don't think you can decouple pitch and frequency as you attempt; even if we understand that they are not as closely locked as simple science would predict - the upper octaves of the piano are "stretched" otherwise the intervals sound flat!

    http://www.blackstonepiano.com/tutor...techniques.htm

    Furthermore, music is not usually played with pure notes (sine waves) - each basic note (fundamental) comes with a family of higher notes, the harmonics, which even for a fundamental note below or in the low hundreds of cycles per second (Hz) will run up into the 1000s (kHz)

    Being able to differentiate these high harmonics helps us tell the difference between (say) a clarinet and an oboe.
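
    As a rough illustration of how quickly harmonics climb into the kHz region, the sketch below lists whole-number multiples of a low fundamental. The 110 Hz fundamental and the 20 kHz ceiling are arbitrary choices for the example, and real instruments do not produce every harmonic at equal strength.

```python
def harmonic_series_hz(fundamental_hz, ceiling_hz=20_000.0):
    """Whole-number multiples of the fundamental, up to ceiling_hz inclusive."""
    count = int(ceiling_hz // fundamental_hz)
    return [fundamental_hz * n for n in range(1, count + 1)]


series = harmonic_series_hz(110.0)
first_above_1k = next(f for f in series if f > 1000.0)
print("first few harmonics:", series[:6])
print("first harmonic above 1 kHz:", first_above_1k)
print("harmonics below 20 kHz:", len(series))
```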

    Where all of this gets us I don't know, but I do know I don't know enough!

    Nevertheless I am uneasy with decoupling pitch and frequency!

  18. #38
    Join Date
    Mar 2010
    Location
    uk
    Posts
    141

    Default MP3 coding bands ....

    Quote Originally Posted by Labarum View Post
    Nevertheless I am uneasy with decoupling pitch and frequency!
    My aim was not to de-couple them, but to re-state the point that the relationship between the two is not linear. Also, in audio discussions, I think it is all too easy to forget what the numbers actually mean - maybe that's just me though.

    If we are talking about a block 4kHz wide we need to remember where that block is located - the entire (fundamental) range of the piano is only 4kHz after all. I do appreciate what you are saying with respect to harmonics though.

    One of the issues with the mp3 codec is that the first thing it does is to

    divide(s) the audio signal into 32 equal-width frequency subbands*
    You might guess that this "equal width" approach is not without compromises.

    *from the same paper referred to above.
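
    To see why equal-width subbands are a compromise, it helps to work out how wide each one is. A minimal sketch, assuming the 48 kHz sample rate mentioned in the listening-test quote earlier in the thread; the function name is hypothetical.

```python
def subband_edges_hz(sample_rate_hz, n_subbands=32):
    """Lower and upper edge (Hz) of each equal-width subband up to half the sample rate."""
    width = sample_rate_hz / 2 / n_subbands
    return [(i * width, (i + 1) * width) for i in range(n_subbands)]


edges = subband_edges_hz(48_000)
print("lowest subband :", edges[0])
print("highest subband:", edges[-1])
```

    At 48 kHz every subband is 750 Hz wide. Since the same paper puts the critical bands at less than 100 Hz for the lowest frequencies, the bottom subband alone covers several critical bands, while near the top a single critical band spans several subbands.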

    ps I think I remember that you have a musical background, if that is the case then not only do you know far more about this than me, but what a given frequency actually means, ie what it sounds like will be built in to your thought process. Whenever Alan talks about a frequency range I have to look it up and put it in context first.

  19. #39
    Join Date
    Sep 2009
    Location
    Malaysia
    Posts
    508

    Default CD WAV v. MP3 - am I kidding myself? A test ....

    Quote Originally Posted by A.S. View Post
    .....So now, perhaps we can return to - not abandon as you suggested - your technical deconstruction of dynamic range compression. Can you imagine a better way of visually presenting your (undoubtedly true) observation that sound just isn't what it used to be? Thoughts?
    Not so sure I said that, nor am I technically competent to deconstruct the workings of MP3. But this topic is important to me because I see a bigger role for server-based media players and may have to convert my CD collection to other formats such as MP3, AAC and FLAC. Still not sure what to do with my SACD collection.

    Previously, Alan has said that MP3 format at 256kb is as good as CD quality. I agree that some MP3s do sound as good as CD, but on closer listening they fall short of my expectations: I feel they tend to be monotonous in their dynamics, especially at higher volume. This is my opinion and may be wrong since I am prejudiced against MP3. Not only that, it would be embarrassing to find that I have been lying to myself all these years in claiming that SACD sounds better to my ears than CD because the imperceptible high frequencies made a difference to my musical enjoyment.

    During this weekend, I am downloading from Linnrecords one track in Studio Master FLAC, CD-quality FLAC and MP3 file formats, to find out for myself whether I can tell the difference. Does anyone want to suggest which would be the best track for comparisons?

    ST

  20. #40
    Join Date
    Oct 2009
    Location
    Australia
    Posts
    349

    Default Selecting music for testing

    Quote Originally Posted by STHLS5 View Post
    Anyone want to suggest which would be the best track for comparisons?

    ST
    If you like you can also try Bluecoast records; just register. I recommend Alex De Grassi - Greensleeves. It's a fine tune and available in flac, mp3 and wav.

    {Mod's comment: before you select music tracks maybe you should anticipate the results of your test (what you hope it will show) and work backwards to the music that will best highlight whatever you expect? What do you think will be the result?}
