ryankrage77 2 minutes ago [-]
I decided to test for myself, downloaded Lacinato ABX and compared a 32-bit 353.7kHz FLAC I had lying around against the same file downsampled to 16-bit 44.1kHz. I couldn't tell any difference. Then I tried 192k mp3... still no difference. Couldn't reliably differentiate 128 or 64kbps mp3 either. I had to go down to 32k before I could be certain which was which, and even then I still had to listen carefully.
Think I need to get my ears checked. I know I can't hear much above 15-16KHz but I didn't think it was this bad.
ycui7 7 minutes ago [-]
24-bit was created so that microphone recordings could capture a large dynamic range without a gain-switching circuit.
96kHz was created to better reproduce high frequencies up to 20kHz, so the digital anti-aliasing/reconstruction filter does not need to be super sharp right at the Nyquist frequency.
Both were introduced for a sound technical reason. Beyond that, most are marketing nonsense to cheat consumers.
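As a rough illustration of the two claims above, the sketch below computes the ideal dynamic range for a given bit depth and the room a filter has between a 20 kHz passband edge and the Nyquist frequency. The function names and the 20 kHz passband edge are assumptions made for this example, and the roughly 6 dB per bit figure ignores real converter noise.

```python
import math

def ideal_dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits)

def transition_band_hz(sample_rate_hz, passband_edge_hz=20_000.0):
    """Width of the gap between the assumed passband edge and Nyquist."""
    return sample_rate_hz / 2 - passband_edge_hz

def transition_band_octaves(sample_rate_hz, passband_edge_hz=20_000.0):
    """Same gap expressed in octaves, which is what filter steepness depends on."""
    return math.log2((sample_rate_hz / 2) / passband_edge_hz)

for bits in (16, 24):
    print(f"{bits}-bit: {ideal_dynamic_range_db(bits):.1f} dB")
for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz: {transition_band_hz(rate):.0f} Hz wide, "
          f"{transition_band_octaves(rate):.2f} octaves")
```

At 44.1 kHz the filter has about 2 kHz to work with; at 96 kHz it has more than a full octave, which is the relaxation the comment describes.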
stego-tech 6 hours ago [-]
I cannot hear the difference between 16/44.1 (and by extension, 16/48) and High-Res Content generally, be they HDCD, SACD, or just straight-up Masters from Qobuz. This is on multiple sets of equipment, ranging from El Cheapo earbuds all the way to HD800 cans and full-fledged tower speakers being bi-amped.
That’s not why I go for High-Res stuff, though.
It’s all about archival, at least for me. With a 24/192 Master in FLAC or ALAC, I can downsample to whatever the destination form factor is. I can transcode to a 320kbps MP3, or a 16/48 WAV stream for a smart speaker, or a 24/96 stream for the theater. The point isn’t that I can hear the difference, it’s the fear that I might lose something irrecoverable by sticking with lower-quality files for bulk storage. Once data has been discarded, it cannot be retrieved, and that influences my preference for storage (and is also why my BD/UHD rips are into MKVs, no re-encoding).
Now that being said, I will absolutely hem and haw and ABX different releases to determine if I opt for the 16/44.1 CD rip of an album from the 80s or the new 202X remaster in 24/192 (spoiler: almost always the former), and I absolutely prefer anything with classic instruments (Jazz, Classical) in higher-quality formats because of a subjective perception of a wider, clearer sound stage, though this is almost certainly a psychological effect from performing in concert bands and orchestras rather than physical or objective in nature.
Like I tell newcomers: if it sounds better enough to you to warrant the purchase price, then that’s all that really matters. Enjoy the hobby.
saltcured 5 hours ago [-]
Decades ago, I was treated to an ABX test in my brother's recording studio. I easily recognized and preferred a 24/192 master he played versus the 16/44.1 down-mix. I honestly don't know whether there was something wrong with the down-mix, but qualitatively it did feel like it was "muffled" and coming from speakers, while the master really felt like live performance. He was surprised that I could tell them apart.
I also spent a lot of time ripping my old CDs to FLAC and trying different MP3 and AAC encoder settings to get playback that felt transparent enough to me. I could never tolerate Sirius/XM radio streaming due to the horrid compression I heard with every futile attempt. I still seem to have more sensitive hearing than most people around me, but in my 50s I know it isn't what it once was.
I never had huge budgets, but did strive for hi-fi in my limited ways. I used things like toslink and HDMI to send raw PCM data from Linux to my Yamaha A/V receiver's DACs + amplifier to drive somewhat nice Polk tower speakers. But then COVID-19 happened, and this stuff was packed up to move house.
Nowadays, music playback is streaming with mundane "subwoofer + satellite" PC speakers or MP3 playback with a mini-SD card permanently parked in my car's infotainment system.
vor_ 2 hours ago [-]
> Decades ago, I was treated to an ABX test in my brother's recording studio. I easily recognized and preferred a 24/192 master he played versus the 16/44.1 down-mix. I honestly don't know whether there was something wrong with the down-mix, but qualitatively it did feel like it was "muffled" and coming from speakers, while the master really felt like live performance. He was surprised that I could tell them apart.
As referenced in the article, a common explanation for those audible differences is that the high-resolution version of the album is sourced from a different master.
nullc 56 minutes ago [-]
This is an extremely hard comparison to do well. I'll give a few examples as to why:
Small differences in gain are ABX able much more readily than differences in noise at the 16 vs 24 bit level. So if the signal chain gives even a small difference in gain between the samples that's what you'll track. A reasonable conversion path to 16 bits for mastering will also apply dithering and some kind of brickwall limiting (you have to limit after the dither or as part of the dither as dither can change levels!), and this can result in gain changes. The DAC may behave differently or have outright bugs for some configurations too.
This is particularly true wrt reconstruction filters for sample rate differences. And if you were comparing 44.1k and 192k then the physical DAC itself was likely running at a different rate and its _analog_ filters are probably better optimized for one vs the other (this is less true for 48k vs 192k, as the hardware likely runs at the same rate for both). So one answer to this comparison can be "on this particular hardware this rate is better than that rate"-- but that's an implementation property, not a property of format choice.
You might think, "okay I'll use a mathematically perfect down and up conversion process and run the DAC in the exact same configuration for all cases". But even then you run into issues like after reconstruction the _inter sample_ peak levels will be higher than the levels of the samples, so you have to handle that and in a way that doesn't produce a gain difference between the two configurations. (probably by running your perfect process and finding the gain level that results in no limiting, then making the gain of the original match).
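The inter-sample peak issue can be seen with a textbook worst case: a sine at a quarter of the sample rate whose samples all miss the crests. The sketch below is only an illustration; it evaluates the known continuous sinusoid on a finer grid instead of implementing a real reconstruction filter, and the names and the 16x grid are assumptions of this example.

```python
import math

def sample_value(position):
    """Full-scale sine at fs/4 with a 45 degree phase offset; `position` is in samples."""
    return math.sin(math.pi / 2 * position + math.pi / 4)

SAMPLES = 64
OVERSAMPLE = 16  # points evaluated per sample period (assumed, for illustration)

# Peak seen by looking only at the stored samples.
sample_peak = max(abs(sample_value(n)) for n in range(SAMPLES))

# Peak of the underlying waveform between the samples.
true_peak = max(abs(sample_value(i / OVERSAMPLE)) for i in range(SAMPLES * OVERSAMPLE))

headroom_db = 20 * math.log10(true_peak / sample_peak)
print(f"sample peak {sample_peak:.4f}, true peak {true_peak:.4f}, "
      f"difference {headroom_db:.2f} dB")
```

A file normalized so its samples just reach full scale would, in this case, reconstruct about 3 dB over full scale, which is why the gain has to be chosen from the reconstructed signal and then matched between the two versions.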
And then for the high rate vs non-high rate you have to deal with the fact that most amplifiers are not particularly linear (compared to well constructed software at least!) and that any real speaker is very far from linear. This means that the presence or absence of ultrasonics will change the audio in the 0-20khz band. Before you think "well that could be a reason that high rate is better" observe that if there was some consistently good effect from the ultrasonics you could just bake it into the low rate sample.
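To make the nonlinearity point concrete, here is a small sketch: two ultrasonic tones pass through a made-up, mildly nonlinear stage and a difference tone appears at 1 kHz, well inside the audible band. The tone frequencies, the 0.1 second-order coefficient, and the function names are all assumptions chosen for illustration, not measurements of any real amplifier or speaker.

```python
import cmath
import math

FS = 192_000
N = 1_920  # 10 ms, so every tone of interest falls on an exact 100 Hz analysis bin

def tone_amplitude(signal, freq_hz, fs=FS):
    """Amplitude of one frequency component, found by correlating with a complex tone."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / fs)
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

# Two ultrasonic tones, 24 kHz and 25 kHz, each at half of full scale.
clean = [0.5 * math.sin(2 * math.pi * 24_000 * n / FS)
         + 0.5 * math.sin(2 * math.pi * 25_000 * n / FS)
         for n in range(N)]

# Hypothetical playback chain with a small second-order nonlinearity.
distorted = [x + 0.1 * x * x for x in clean]

print(f"1 kHz in clean signal:     {tone_amplitude(clean, 1_000):.6f}")
print(f"1 kHz in distorted signal: {tone_amplitude(distorted, 1_000):.6f}")
```

The 1 kHz product exists only because the ultrasonic content was present, so a 44.1k or 48k version of the same material would not excite it on the same hardware.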
> but in my 50s I know
Yeah if you're in your 50's you're absolutely not hearing differences way up above 20khz (especially if you're male), I bet you can't even hear CRT flybacks from 100 yards anymore. :P Most people have no idea how much their high frequency hearing degrades as they age because it plays approximately no role in your life, but it's real, dramatic, and as far as I know happens to everyone.
I don't mean to discount your experience: I don't really doubt that it was real. But answering the general question of the necessity of low vs high rate probably takes a team of experts, armed with test gear and the designs of the HW/SW in question, to vet the test configuration. Testing a _particular_ configuration without the ability to distinguish its implementation quirks from format-fundamentals is much easier and that's what most attempts to test this question are actually testing.
By testing in a recording studio you were doing far better than most such comparisons. Usually people try comparing different files and they're comparing entirely different mastering processes. Files made for the "high res" market will often have much less compression and limiting than files made for commercial radio play / casual listening... and truly do sound obviously much better. Some of my favorite recordings are rips from vinyl. Vinyl is an awful format from the perspective of audio fidelity, but it's also pretty intolerant of excessive compression and limiting because the record will skip if the needle is bouncing off the rails. And more recently I suppose they also avoid over compression there because of the difference in target listener/environment.
bigiain 18 minutes ago [-]
> Small differences in gain are ABX able much more readily than differences in noise at the 16 vs 24 bit level.
This was common knowledge at least as far back as the mid 80s, when every hifi shop and salesguy knew to ensure the bit of gear with the highest profit margin got played an almost imperceptible bit louder than the gear the customer came in to buy during back to back testing.
nullc 11 minutes ago [-]
It's also a reason why double-blind testing is important. If someone doing the setup is expecting one piece of kit to sound better, if it doesn't they'll check the configuration more, and difference in gain can come from many sources. So errors that result in higher gain in favor of the "better" candidate go uncorrected, while ones that favor the worse tend to be fixed.
Point being: it doesn't even require an unscrupulous sales person to get similar results to an unscrupulous sales person! :P
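Level matching is something that can be checked numerically before anyone listens. The following is a minimal sketch, assuming both versions are already available as lists of float samples; the helper names and the 0.2 dB offset are invented for the example.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(a, b):
    """Level of `a` relative to `b` in dB; positive means `a` is louder."""
    return 20 * math.log10(rms(a) / rms(b))

# One second of a 1 kHz tone at 48 kHz, and a copy boosted by 0.2 dB.
reference = [math.sin(2 * math.pi * 1_000 * n / 48_000) for n in range(48_000)]
louder = [s * 10 ** (0.2 / 20) for s in reference]

print(f"{level_difference_db(louder, reference):+.2f} dB")
```

An offset of this size is the kind of gain mismatch the comments above describe as easier to pick out in an ABX test than a noise-floor difference.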
empiricus 4 hours ago [-]
Even for PC, I recommend some cheap studio monitors.
I can't hear the difference between 128 kbps opus and FLAC.
nullc 25 minutes ago [-]
> I can't hear the difference between 128 kbps opus and FLAC.
A reasonable definition of transparency for high bitrate compressed audio is "Can the worst files be distinguished by a listener trained in what artifacts sound like". Maybe also add in having to use a high discrimination listening setup, including not running excessively loud (increases masking).
If that's not the test you're doing, it's unsurprising. At moderately high bitrates no one can reliably distinguish them on arbitrary samples: most inputs are easy.
If you test on known-difficult "killer samples" you'll probably easily distinguish them, even without first being shown what to look for, and certainly after.
During the development of Opus I created many 'trained listeners' and selected many killer samples, and I don't recall* ever encountering a tin ear that couldn't be taught to ABX any high rate samples, though some people are obviously much better at it.
I'm not sure I'd recommend it though: learning to identify artifacts has a frequent side effect of making low rate audio like the HE-AAC used in SiriusXM absolutely intolerable. I'm bothered by it even when I hear cars driving by using it. :)
[*] My memory for such things sucks, so I could be wrong-- but my point that it's not expected remains.
stego-tech 3 hours ago [-]
And that's fine! I've got a flatmate who loves 320kbps MP3s on studio monitors, I've got musician friends who swear by CD-audio and Sennheiser HD200s, and others who love how vinyl uniquely degrades over time on big speakers.
The takeaway from these sorts of posts, at least in my opinion, should be two-fold:
* Understand the physical limits of human senses and perceptions to help inoculate yourself against outright scams and grifts
* Liberate you from the "tech grind" and allow you to enjoy what you like, how you like it.
dspillett 1 hours ago [-]
> Understand the physical limits of human senses and perceptions to help inoculate yourself against outright scams and grifts
Also understand that while there is an upper limit, we are all different within that. I can hear the difference between 128Kbps and FLAC, at least for some content, but not 256Kbps, maybe not 192. For some content (spoken word etc.), 64Kbps, sometimes less, is perfectly acceptable (to me). There was a time I could hear the difference between some encoders, but that was decades ago and anything in active use is pretty damn good (and my ears are not what they used to be) unless you really crank the bitrate down or tweak other options daftly.
sgarland 9 minutes ago [-]
I’ve not tried encoding my own MP3s in at least a decade, but when I was doing so, 128 kbps was instantly distinguishable to me on anything with cymbals, especially hi-hat: it loses that shimmery sound. At 192 kbps I could tell if I really, really tried, but it was so minute I didn’t really care. I was never able to reliably tell the difference between 256 and 320 kbps rips.
PaulDavisThe1st 1 hours ago [-]
> I can hear the difference between 128Kbps and FLAC
You've established this with double-blind testing, correct?
z_open 7 hours ago [-]
As they say, most people listen to their music with equipment. Audiophiles listen to their equipment with music.
pimeys 1 hours ago [-]
I might be somewhere in the middle. Yes, I did spend a hefty 5000 euros on my headphone setup. And yes it sounds absolutely magical and every day I'm happy listening to music with it.
But I also have a large multi-terabyte music collection, I follow new music, go to concerts, go to parties, talk about music with my friends in signal group chats.
It's a hobby, and when you get a bit older and start having some savings, if you love music treating yourself with a better system is not that crazy.
UmYeahNo 1 hours ago [-]
When I got old enough to finally afford those toys I discovered I couldn't hear above 16khz anymore.
pimeys 50 minutes ago [-]
It is not only that. It's the spacing, how the bass sounds, separation of instruments. There's so many interesting headphones in the midrange to try out. Compare the Hifiman HE1000se to Heddphone 2 GT, or to Focal Clear MG and you'll understand.
Also with HEDD you get a handcrafted device made in Berlin. And if you go with nicer cables, they are very beautifully done and feel great. There is no difference in sound of course. Some people like jewelry, I can get similar enjoyment from beautiful audio equipment and cables.
az226 34 minutes ago [-]
What’s the quality of this trove? As in bitrate or similar.
pimeys 28 minutes ago [-]
Depends. I'm more into finding certain masters. And some of the albums are DSD tape transfers. DSD if that was the original recording format, if it was mixed and PCM was needed, DXD flac.
And so many CDs of course.
nntwozz 7 hours ago [-]
This is perfect, thank you this goes straight into my long-term memory bank.
On a tangent, whenever someone mentions LP sounding warmer or whatever I like to point out that I prefer wax cylinders (a.k.a. phonograph cylinders).
fecal_henge 5 hours ago [-]
You Edison shill.
mingus88 6 hours ago [-]
That’s true, but I consider myself a collector. Think of how a comic book collector operates.
If I have an option to get a 16bit version of a recording or a high-res version, I choose the highest quality version every time
Same with a physical copy. A limited edition, better quality vinyl LP is more attractive if you are going through the trouble of curating a collection.
I’ve been curating a music library of digital files since before the iPod was released and I will always go for the highest quality version out of principle. I can always downsample it to anything that makes sense.
rahimnathwani 7 hours ago [-]
The article says "I've run across a few articles and blog posts that declare the virtues of 24 bit or 96/192kHz by comparing a CD to an audio DVD (or SACD) of the 'same' recording. This comparison is invalid; the masters are usually different."
It may be simultaneously true that:
A) Humans cannot tell the difference between 44.1kHz/16-bit audio and any higher resolution, and
B) For a particular song, the best commercially available 44.1kHz/16-bit version may not be the best commercially available version
black_knight 31 minutes ago [-]
I usually A/B test the different versions before choosing my canonical one. I will listen to the same sections in each version, flipping back and forth to hear the differences. It is incredible how much finding the right master improves the experience of listening to a track. Often times that means I end up with a hi-res version, but not always.
zamadatix 6 hours ago [-]
While 100% true, I'd phrase B) as:
"The quality of the particular mastering can still make a noticeable difference, regardless of the ability for the digital sampling rates to perfectly represent it perceptually"
Just to be clear that the statement applies to any releases meeting the A) criteria, not just 44.1 kHz @ 16-bit ones.
Tsarp 7 hours ago [-]
This really is like driving a muscle/super car, or drinking expensive wine. In the end none of the specs or tests matter. It is a form of art. If it makes the listener feel better (even if it's just psychological) then it's probably worth it.
munchler 7 hours ago [-]
To expand on this a bit, I appreciate some audio overkill because, if I do hear sizzle or distortion, it eliminates one possible reason and helps me figure out what’s actually happening.
It’s like having gigabit internet to my house: I don’t actually need it, but when a website is slow, I know the problem isn’t in my internet connection.
ubercow13 1 hours ago [-]
Would 192khz audio result in less sizzle and distortion? Or more audible-band IMD from the sound >22khz?
smilekzs 6 hours ago [-]
Well, at least there are objective performance benchmarks on cars, and some of them are okay proxies of performance in motorsports.
Correct. I've paid for Tidal for a decade because I just like the peace of mind that it's closer to the original recording. I'm sure it's mostly placebo, but I like it.
handedness 35 minutes ago [-]
I tried Tidal nearly a decade ago, and the audible fluttering effect caused by their audio watermarking totally ruined certain types of music, like choral recordings, strings and such. It was obviously apparent on $20 ear buds driven by any device, far beyond the more stereotypical audiophile gripes.
I opened a support ticket but they never responded. After that it was difficult to take their lossless claims seriously when the labels were providing such garbage source material. Their whole value prop was totally hollowed out.
I don't know whether the labels still impose such horrible practices, but I largely gave up on streaming services after that experience and now focus on keeping good digital archives of my physical library.
PaulDavisThe1st 58 minutes ago [-]
The original recording of almost all music on Tidal was done with equipment that was very, very far from the 192kHz "fidelity" it claims.
yellowapple 6 hours ago [-]
It's also sort of an inverted “Van Halen demanding a bowl of M&Ms with the brown ones removed” thing for me, too. The vast majority of my Tidal listening happens over Bluetooth, so that 24bit/192kHz FLAC stream is just gonna get downsampled to 16bit/48kHz anyway because that's all any Bluetooth speaker or headset is capable of doing — but the fact that it's an option in the first place signals that other things are being done right, too (namely: that Tidal's whole “we're the streaming service that pays artists the most per listen” premise actually has some semblance of merit rather than being complete marketing bullshit; while recording quality ain't the strongest signal possible for that, it's certainly a good sign when musicians/publishers are willing to send over the highest-bitrate lossless recordings they've got and not just the same ol' compressed-to-shit MPEG audio you can yank off YouTube for free).
wat10000 7 hours ago [-]
I'd distinguish between differences that anyone can detect but some may not care about, and differences that may not be objectively detectable at all. Muscle cars, at least, are different in a way that anyone can see. Push that pedal to the floor and it feels different from a Honda Civic or whatever. Whether that difference is actually interesting or good is, of course, a matter of taste. Whereas audiophile nonsense is often indistinguishable even to the connoisseur and depends entirely on some form of self-deception. Still could be worth it, depending on what one considers worthy.
mock-possum 7 hours ago [-]
That’s actually a really good comparison, especially because - yes I can hear the difference between an excruciatingly lossless digitization of a piece of music that I’m intimately familiar with, played back on expertly configured hardware… but the difference is so little, that most of the time, I’m fine just listening to it at medium high quality streaming on a pair of <$50 headphones.
I’ve played with the nice toys, and they are nice, but for 100x the price, they barely deliver 1.5x the experience.
manoDev 1 hours ago [-]
They make sense for so called audiophiles who don’t understand Nyquist frequency theory.
It’s like photographers who are confused about the difference between raw and bitmap (jpeg), videographers confused about the difference between linear raw vs log vs gamma encoded, etc.
Just because a data format with higher bit depth/sampling frequency/whatever exists for editing purposes, doesn’t mean it’s “better” or makes sense as a consumption format for a finished work.
casion 1 hours ago [-]
They make sense for sound designers and derivative artists (e.g. sampling, which is a real artform).
Forms of manipulation bring inaudible content into the audible range.
Of course that doesn't mean audiophiles aren't being audiofooled by it, but there is legitimate usage.
geraldmcboing 1 hours ago [-]
The OP is a bit off with their description of why pro audio engineers work in higher bit depths and sample rates. We use 24bit to preserve low level sounds eg reverb, breaths etc and use 32bit float when recording as the headroom is so massive clipping is not an issue (other than of course still needing to avoid overloading microphones with max SPL - cleanly recorded distorted sound is still a fail). Unclipping 32bit float feels like voodoo - I did a test, recording fireworks & unclipping the 32bit float recordings.
I use microphones that can 'hear' up to 100kHz (Sanken CUX100K) and for film sound design playing 192kHz audio at half and quarter speed the results are very significant, and reveal there IS 'content' above human hearing. Irrelevant for general listening but very important for sound design.
PaulDavisThe1st 1 hours ago [-]
Have you ever actually checked the number of actual bits your ADC can use? Most 24 bit converters struggle to get to 18 bits.
Nobody uses 32 bit float for recording (to do so is just to capture at least 10 bits of noise, most of that being Brownian); it's strictly a format for mixing and processing. You don't get any more resolution from 32 bit floating point than you do from 24 bit integer formats, but the result of "clipping" is less dramatic, hence the appeal of the format.
While there is some evidence that non-auditory human sensory perception may be sensitive to ultrasonic acoustic waves, it's pretty weak right now, and somewhat in the "woo" zone. It may turn out to be significant, or it may not. I wouldn't base an audio production workflow that requires 4x the cpu power and 4x the disk space on such tentative claims, but you're welcome to.
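The resolution and clipping claims about 32 bit float can be checked directly. This is a minimal sketch using only the standard library; `to_float32` and `clip_int24` are helper names invented here, and full scale is assumed to be 1.0.

```python
import struct

def to_float32(x):
    """Round a value to the nearest 32-bit float and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

INT24_MAX = 2 ** 23 - 1

def clip_int24(x):
    """Store a sample (full scale = 1.0) as a 24-bit integer code, hard-clipping overs."""
    code = round(x * INT24_MAX)
    return max(-INT24_MAX - 1, min(INT24_MAX, code))

# float32 carries a 24-bit significand: 2**24 is exact, 2**24 + 1 is not.
print(to_float32(2 ** 24) == 2 ** 24)
print(to_float32(2 ** 24 + 1) == 2 ** 24 + 1)

# A sample at 4x full scale survives in float32 but is pinned at the top integer code.
print(to_float32(4.0), clip_int24(4.0) / INT24_MAX)
```

So the float format adds range above and below full scale instead of finer steps at any one level, which matches the description of why it is attractive for mixing and processing.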
geraldmcboing 17 minutes ago [-]
"Nobody uses 32 bit float for recording" - you are just displaying total ignorance here.
geraldmcboing 22 minutes ago [-]
Dude I've been doing sound design on films using these techniques for years. There is zero 'woo' involved, it is ALL practical evidence based use. I've been using 32bit float multitrack field recorder by Sound Devices MixPre10-II professionally for many years now. The recorder has three preamps per mic input, each gain staged to provide optimum signal to the 32bit float AD. Read this to clarify your thinking:
https://www.sounddevices.com/32-bit-float-files-explained/
Surely you understand a recording made at 48kHz has a max freq response of 24kHz and played at half speed that max freq is 12kHz and at quarter speed only 6kHz. You can very clearly hear the filter cut off due to Nyquist. Record at 192kHz with mics capable of 100kHz capture and when played at quarter speed, the sound is full spectrum because there is no truncated frequency response. And when I load a 192kHz recording to izotope RX I can literally see the harmonics going up to 96kHz. (not with every sound of course)
I repeat, I am not talking about 'normal' listening. I am talking about an industry you have no knowledge or lived experience with, so spare me the incorrect claims about what can & can't be heard.
PaulDavisThe1st 14 minutes ago [-]
> I am talking about an industry you have no knowledge or lived experience with
I'm the original/lead developer of Ardour, a cross-platform DAW, and have been working with digital audio for more than 25 years.
There are no 32 bit DACs - your SDD MixPre's are giving you (at best) 22 bits packaged as a 32 bit float value. The preamps make absolutely zero difference to the DA conversion (though they might sound real nice).
> Surely you understand a recording made at 48kHz has a max freq response of 24kHz and played at half speed that max freq is 12kHz
This is a very naive version of what "played at half speed" might actually mean. If properly and correctly resampled, this is not true.
> And when I load a 192kHz recording to izotope RX I can literally see the harmonics going up to 96kHz
Well, I'd certainly hope so! But the question is: what are the energy levels associated with the partials above Nyquist? If you recorded at 384kHz with sensitive enough equipment, you'd see partials above 96kHz - but at extremely low energies because ... well, that's just how physics works.
geraldmcboing 3 minutes ago [-]
I do not use the DACs in the MixPre. It's a recording device. The field recordings & studio recordings are transferred as data and used in a 32bit float 192kHz Protools session. So the recorder's DAC is completely irrelevant.
The sounds are then used as source material, for processing and manipulation at 192k, 96k and 48k. There is no debate to be had. This is how film sound designers work & have worked for years now.
The half speed you call naive is again just showing your ignorance. Sound editors have been using this technique since the days of recording on a Nagra at 15ips and literally replaying at 7.5ips half speed, and at 3.75ips for quarter speed. There is nothing naive about it, it is a very well-known technique. To be able to achieve the same result digitally with full spectrum has impacted every feature film you have experienced in recent years. Again decades of lived experience.
nullc 34 minutes ago [-]
> Nobody uses 32 bit float for recording
Yes they do, almost all high end field recorders used for film work are 32-bits now and have been for much of the last decade, often with some fancy preamp integration so that there is no expertise required for gain staging the recording. (I believe the implementations use a second matched 24bit ADC with 48 dB less gain in front of it).
The result obviously doesn't have a noise floor which is lower (as the noise of a room temperature _resistor_ gets in the way of that even at the 24-bit level) but they have more dynamic range so that your recording isn't ruined by hard clipping some unexpected loud sound.
It's a big improvement for practical usage, and also likely does improve SNR somewhat because you can run higher gains without as much fear that you'll ruin the recording. The reason it would pay off is that the SNR loss you get from splitting the signal is easily smaller than the SNR loss you would get from gain reduction to avoid clipping.
(maybe... capsule self noise is also limiting... at these levels, and usually people aren't using microphones designed for the lowest possible self noise unless they're doing something special)
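The two-converter idea mentioned above can be sketched as follows. This is a toy model under stated assumptions, not a description of any particular recorder: both converters are idealised 24-bit quantizers with full scale at 1.0, the second path sits behind a 48 dB pad (roughly 8 bits of extra headroom), and the switch-over rule is the simplest one that works.

```python
FULL_SCALE = 2 ** 23 - 1          # largest positive 24-bit code
PAD_DB = 48.0                     # assumed attenuation ahead of the second converter
PAD_GAIN = 10 ** (-PAD_DB / 20)   # about 0.004

def adc24(voltage):
    """Idealised 24-bit converter: full scale at +/-1.0, hard clip beyond that."""
    code = round(voltage * FULL_SCALE)
    return max(-FULL_SCALE - 1, min(FULL_SCALE, code))

def dual_path_sample(voltage):
    """Use the high-gain path unless it hit full scale, otherwise the padded path."""
    high = adc24(voltage)
    if abs(high) < FULL_SCALE:
        return high / FULL_SCALE
    low = adc24(voltage * PAD_GAIN)
    return low / FULL_SCALE / PAD_GAIN  # undo the pad; result can exceed 1.0

for v in (0.25, 100.0):
    print(f"input {v:7.2f}  single ADC {adc24(v) / FULL_SCALE:8.4f}  "
          f"dual path {dual_path_sample(v):8.4f}")
```

The loud input clips on the single converter but comes back at close to its true level from the padded path, stored as a float above 1.0. The noise floor of the quiet path is unchanged, which is consistent with the point that the gain is in headroom, not in a lower noise floor.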
PaulDavisThe1st 31 minutes ago [-]
There are precisely zero 32 bit ADCs in existence.
There are ADCs that will provide 32 bits per sample but that's entirely different.
Current technology limits the bit depth to 18-22 bits and going beyond that you'd be very quickly recording brownian (atomic) noise anyway.
The point about 32 bit float is that it is a useful format for mixing, editing and general processing, so it is widely used in digital audio tools. But it is not a format that ADCs generate "natively" via their electronics - almost all of them generate a 24 bit integer or fixed point value and then just supply that as a 32 bit float value because the software asked for it (the software could have done it all by itself).
[EDITED: DAC->ADC since that is what I meant and what this is all about]
nok22kon 21 minutes ago [-]
Rode NT1-A 5th gen microphone claims 32-bit float output, insisting it will not clip peaks
so maybe they do sample at 24 bit at a well chosen gain level and then convert to 32 bit float, with the max 24 bit value being above 1.0 float
or as GP said, use two separate ADCs at two different gains and combine their output
PaulDavisThe1st 10 minutes ago [-]
> Rode NT1-A 5th gen microphone claims 32-bit float output, insisting it will not clip peaks
Of course it does! And that's what it does, of course. But that has absolutely nothing to do with the AD process itself, which is chip-limited to 24 bits and likely physics-limited to somewhat less than that.
You can't beat the physical limit of a DA circuit by doubling them up at different gains.
And .. you don't want to. Going beyond 22 bits gets you into brownian noise pretty quickly, which is completely pointless.
The best you can do (or could do) is get a very, very, very good DA that can really do 22 bits (likely not commercially available because of the expense), and then get the samples from it in whatever format works best for your purpose (24 bit integer, some fixed point value, or 32 bit floating point).
nok22kon 4 minutes ago [-]
you have 22 bits for the typical audio voltage level, which you call 1.0 float
but what if you "allow" double that voltage and call it 2.0 float? a strong pressure into the microphone generates a stronger voltage
thermal noise limits you on the quiet signals, but not on the powerful ones
so 22 bit for -1.0 -> 1.0 range and you can add a few more bits on top of that for stronger audio pressures (voltages) which you would traditionally clip
geraldmcboing 21 minutes ago [-]
Why are you obsessed with DAC? It's the ADC that is WHY we capture 32/192.
PaulDavisThe1st 7 minutes ago [-]
If I said DAC, it was a mistyping. I am (in this context) always talking about the ADC.
nullc 16 minutes ago [-]
I didn't say anything about DACs! I'm correcting a specific claim you made
> Nobody uses 32 bit float for recording (to do so is just to capture at least 10 bits of noise, most of that being brownian);
This is not true and not true for a good and important reason!
One which has no bearing on the kind of DACs that exist.
Modern field recorders allow gains set at a 'reasonable' level that maximizes SNR for recordings but still won't clip when there are much louder peaks. Not so dissimilar to how a 6-digit multimeter can achieve its advertised performance both on a 0-5v range and a 0-300v range.
PaulDavisThe1st 8 minutes ago [-]
When I said "nobody uses 32 bit float for recording", I am referring to the result of the DA process that generates sample values used by a recorder.
Obviously, everyone and their mother uses 32 bit float as an internal sample format because of its fitness for purpose (except the folks who think they need 64 or 80 bit floating point, of course). But they are not using "32 bit floating point samples" - the samples come from an (at best) 18-22 bit integer conversion.
WarmWash 6 hours ago [-]
Foobar2000 has an extension that allows you to blindly test whether you can tell the difference between two tracks.[1] The prime use is to compare different encodings of the same song from the same lossless master.
It kind of changed me a bit when I ran through 20 lossless tracks I had re-encoded to various mp3 bitrates and realized that even on a fancy system, it can be really hard if not impossible to discern even moderate lossy from lossless.
If you are an audiophile geek, really think about if you want to try this, the reality check might crack your foundations.
Oh great. And here I thought that fantasy literature where forest elves could hear the screams of the plants they stepped on when they walked was just that -- fantasy.
SketchySeaBeast 6 hours ago [-]
Triffid music.
sholladay 4 hours ago [-]
Music producer here. High resolution audio is useful for editing and anywhere there might be downstream processing or format conversion that may or may not be high quality, let alone lossless. The article covers that pretty well.
However, the article claims that the final distribution doesn’t need to have a bit depth of more than 16. That does not match my experience. I can tell the difference between my renders that are 16 bit vs 24 bit. I cannot tell the difference between 44.1 kHz and higher sample rates, and that’s consistent with the math (Nyquist-Shannon), but bit depth is a different matter. Would be fun to participate in a double-blind test that includes my own tracks and others.
PaulDavisThe1st 57 minutes ago [-]
> I can tell the difference between my renders that are 16 bit vs 24 bit.
established using double blind testing, I assume?
nok22kon 1 hours ago [-]
thermal noise allows about 18-22 bits of real precision at audio level voltages, so it's plausible that 16 bit is somewhat limiting
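For scale, a quick sketch of what those bit counts mean in decibels, using the textbook figure for an ideal quantizer driven by a full-scale sine (roughly 6.02 dB per bit plus 1.76 dB). The function name is made up for illustration, and real converters fall short of the ideal.

```python
import math

def ideal_dynamic_range_db(bits: int) -> float:
    """SNR of an ideal N-bit quantizer for a full-scale sine: about 6.02*N + 1.76 dB."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

# 16-bit distribution versus the 18-22 bit range quoted for thermal noise, and 24-bit.
for bits in (16, 18, 22, 24):
    print(bits, "bits:", round(ideal_dynamic_range_db(bits), 1), "dB")
```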
PaulDavisThe1st 56 minutes ago [-]
16 bit may limit it on the input side, but the question is more about human hearing's sensitivity on the "output" side ...
codedokode 1 hours ago [-]
192 kHz vs 48 kHz can make a difference if you slow down the audio. If you pitch shift down 2 octaves, the ultrasonic range 20-80 kHz turns into 5-20 kHz and there will be large difference between 192 kHz and 48 kHz sources. However, I do not know if it would sound good because the mixing engineer cannot hear those frequencies and mix them properly, or the microphone might not catch it or some of the material could be recorded with lower quality.
Also, sadly consumers are getting used to low quality audio nowadays - they often listen to lossy compressed audio on social media (sometimes decompressed and re-compressed several times) which is then re-compressed to send to bluetooth headphones, or played back on awful smartphone speakers. Streaming services also use compressed audio.
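The pitch-shift arithmetic above can be checked with a few lines. This is only a sketch of the frequency mapping (each octave down halves the frequency); it ignores whether the microphone or mix actually contains anything useful up there. The function name is illustrative.

```python
def shift_down(freq_hz: float, octaves: float) -> float:
    """Frequency after pitching down by the given number of octaves."""
    return freq_hz / (2 ** octaves)

# Highest frequency each source rate can hold (its Nyquist frequency),
# and where that ends up after slowing down by two octaves.
for source_rate_hz in (48_000, 192_000):
    nyquist_hz = source_rate_hz / 2
    print(source_rate_hz, "Hz source tops out at", shift_down(nyquist_hz, 2), "Hz after the shift")
```

Under these assumptions a 48 kHz source has nothing above 6 kHz after the shift, while a 192 kHz source can still fill the band up to 24 kHz.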
flir 55 minutes ago [-]
I guess in much the same way that the best camera is the one you have with you, the best music playback device is the one in your pocket. Hence why AM transistor radios were so popular way back when. (Just musing. But it feels like for most people, convenience trumps fidelity).
cozzyd 7 hours ago [-]
What a human centric view. I like my music to scare neighbor's pets.
Just get one of those "hi fi" valve amplifiers from Amazon you see under $100. The valve already distorts the sound, so you don't need to bother paying more for low distortion anywhere else in the audio chain. Saved you thousands of dollars, done!
PaulDavisThe1st 54 minutes ago [-]
Distortion is why people love the sound of vinyl.
And it's all good! It's perfectly fine to say "I prefer the sound when the whole mix (or just that guitar) ends up being subject to interesting and possibly harmonically relevant distortion at low levels".
Just don't say "The version with the distortion is more accurate than the one without", because that's a lie.
hobonation 6 hours ago [-]
Counter: An ultra high bit rate solves the problem and you can stop worrying if it's the weakest link.
You can then focus on other things.
Example: I Bought the best skis possible. Now I know I need to just focus on my skills and not blame the equipment.
RijilV 4 hours ago [-]
I hate to be the one to break it to you, but high end skis make tradeoffs which are harmful to beginner or intermediate level skiers... also there's sorta no thing as "best ski". what you'd want for high speed bombing double blacks is going to be different from off piste or moguls or snow park fun.... double also, skis wear out. Depending on who you want to believe it's as low as 20-30 days. Which, granted the average skier is at something like 5 days a year. but if that's you... triple also?
As for how this relates to audio compression, in particular in the context of 2012. you are making a tradeoff of storage size and decompression cost. Maybe that doesn't matter to you, but maybe it either did in 2012 or still does.
hackingonempty 6 hours ago [-]
The point of this article and video is there is no problem with 16-bit 44.1 kHz PCM. It thoroughly covers the audible range and there is absolutely no need for more when distributing music for humans to listen to.
The problem is the people spreading myths and disinformation out of ignorance or to promote their enterprise.
The weak links are producers/mastering-engineers, speakers/headphones and the room when using speakers.
hgoel 1 hours ago [-]
I still insist on the higher bitrate stuff. I don't expect to notice the difference, I just think that music where the artists have bothered to prepare those files is probably recorded with more care than otherwise. I'm not generally listening to big artists where this can just be expected, and while I don't have any evidence to support my belief, I choose to continue believing it.
I'm not interested in finetuning everything in my life for efficiency.
hackingonempty 6 hours ago [-]
@xiphmont also made an amazing video response to the many responses he received to this article. Using analog equipment he busts a bunch of myths and demonstrates what really happens with digital audio.
Thank you for posting this. I thought I knew a bit about what was going on with audio sampling and reproduction, but I learned a surprising amount from this well presented introduction
dlcarrier 5 hours ago [-]
There is a good reason to distribute it though, and compressed it doesn't really change the file size.
There's multiple YouTube channels that I listen to as podcasts, that are professionally created and the creators presume that exported audio works like studio audio, so what you end up with is really quiet audio that can't be turned up without pre-processing.
If we distributed audio the same way we work with it in a studio, we could forgo a lot of problems.
Also, the human ear does have enough dynamic range to make 24 bits worthwhile, though that much dynamic range is rarely used in recordings, and that high of a bit depth provides no benefits within a small dynamic range. A 192 kHz sample rate, on the other hand, is always useless.
me551ah 6 hours ago [-]
Nobody downloads music these days and everybody just streams. Audio at 24 bit still takes a small fraction of the bandwidth that 1080p video takes, so I don’t understand the hate for it.
I use a DAC by focusrite which can do 24-bit, and if I want to listen to higher fidelity audio on my planar headphones then I should be able to. Why should I limit myself to 16-bit?
mingus88 6 hours ago [-]
Counterpoint: bandcamp is doing well. Vinyl sales are doing well.
If I like an artist that I find on streaming, I buy an LP and get a lossless download for free. I still have a music library and I will never rent my favorite music.
Artists prefer to connect directly with their fans and BC is probably the best platform for people who care to pay and support acts directly. They have high res downloads and I import them.
zamadatix 5 hours ago [-]
I don't think the hate is about people who know it doesn't actually sound different if the audio file is 16 bit or 24 bit or necessarily about receiving a few more bytes than they need, it's about the pushes by these types of streaming services/offerings or people insisting that it's supposed to be any better for listening when it's not.
Also the playback rate and the file rate are different topics. The former can get into scenarios more like the audio processing section of the article e.g. I had this one shitty headset for work which required me to set the volume to 1-2 (out of 100) on the computer and I could actually blind test tell when it was in 16 bit or 24 bit mode because it was cutting and boosting it so much it effectively lost precision in 16 bit mode.
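A back-of-the-envelope sketch of that headset case, assuming the volume setting maps linearly to gain (so "1 out of 100" is a gain of 0.01; real OS volume curves are often not linear, so treat the numbers as illustrative). Every halving of gain in the digital domain costs one bit of resolution.

```python
import math

def effective_bits(bit_depth: int, linear_gain: float) -> float:
    """Bits of resolution left after attenuating by linear_gain in the digital domain."""
    return bit_depth + math.log2(linear_gain)

for depth in (16, 24):
    print(depth, "bit output at 1% volume keeps about",
          round(effective_bits(depth, 0.01), 1), "bits")
```

With these assumptions the 16-bit path is left with roughly 9 bits, which is coarse enough to hear once the headset boosts it back up, while the 24-bit path still has about 17.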
pimeys 1 hours ago [-]
Wait, what? I do download everything I listen to. And Roon is quite popular in the music communities. How else can you make sure you have that correct mastering of your favorite album?
rz2k 5 hours ago [-]
My good enough amplifier and DAC combo claims up to 24bit/192kHz, I use a cheap optical interface from my computer that claims up to 32bit/192kHz, and the streaming service I use serves most albums at 24bit/44.1kHz.
It would have cost the same for the entire stack to be 16bit/44.1kHz at every step, but with excessive resolution I can control the volume anywhere. The bits right before the analog conversion at the end are essentially the same whether I turn down the volume in the software player, the operating system, or the DAC/amplifier.
PcChip 4 hours ago [-]
you might want to see if your DAC re-clocks incoming optical, if not then it's relying on the cheap clock generator from your computer
rz2k 3 hours ago [-]
Some people have claimed to hear an improvement with an external clock on a Wiim Ultra, but I do not think it is possible to re-clock the WiiM Amp Ultra with an outboard clock.
When I play from the computer, I'm not sure whether it is using the clock on my Mac, the clock on the optical interface, or the WiiM's clock. However, I do not notice any difference in fidelity when I use the Qobuz software player on my Mac or use Qobuz Connect to allow the player to directly stream from the source, so either it isn't a difference that I can hear, or the WiiM's internal clock is used for both sources.
PcChip 7 hours ago [-]
I'm curious if the audio was being sent bit-perfect to the DAC for all of these tests (ALSA direct), or if it was being run through the audio mixer and being resampled
I can always tell if my 44.1 songs are being resampled to 48 because they're being run through the OS mixer
dist-epoch 7 hours ago [-]
Proper audio resampling should not be identifiable. Of course, the OS mixer probably doesn't do proper (CPU expensive) resampling.
But a quality audio player should account for this and do its own.
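Part of why 44.1 to 48 kHz is the awkward case: the ratio is not a small integer. A sketch of the arithmetic, with a made-up helper name; a rational resampler would conceptually upsample by the first number, low-pass filter, then decimate by the second.

```python
from fractions import Fraction

def resample_ratio(src_hz: int, dst_hz: int) -> tuple:
    """Return (up, down) factors in lowest terms for converting src_hz to dst_hz."""
    ratio = Fraction(dst_hz, src_hz)
    return ratio.numerator, ratio.denominator

print(resample_ratio(44_100, 48_000))  # (160, 147)
print(resample_ratio(44_100, 88_200))  # (2, 1)
```

How well the filter in the middle is designed is what separates a transparent resampler from an audible one; the ratio itself is the same for every implementation.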
It is an incredible resource to see the quality of the resampling algorithms used by the actual production software likely used in any digital audio workflow.
You will see that while the best are indeed almost 100% transparent, many are not.
nok22kon 33 minutes ago [-]
I remember using Adobe Audition for resampling audio, this site shows I had good intuition
your software is among the best, but not pitch black best :)
PaulDavisThe1st 29 minutes ago [-]
Yeah, we use Secret Rabbit Code for ours, though we have access to the sox code now and that is "perfect". We might change to that as the default sometime this year.
PcChip 4 hours ago [-]
I'm also one of those audiophile crazies that obsesses over which metals to use in cabling, power filtering, swapping opamps, and builds their own DACs, amps, and speakers
rasz 6 hours ago [-]
"proper" resampling was expensive in 1997 when Intel was introducing fixed sampling AC'97, but was below noise floor of CPU load meter in 2007 when Microsoft released Vista killing hardware mixing.
LarsAlereon 5 hours ago [-]
The main benefit for me is that digital watermarking becomes completely inaudible with high-res audio, but I can sometimes clearly hear it in standard resolution.
speak_on 7 hours ago [-]
At a minimum, anything above 16/44.1 requires far more than just files: monitors, a treated room, listening position, DAC, etc... but most importantly - a trained ear. That last one is the most uncomfortable truth.
Blackthorn 7 hours ago [-]
Are you, perchance, a dog posting on the internet? Since 44.1khz sample rate is already past the range of the human ear, regardless of training.
speak_on 3 hours ago [-]
As I responded below, you are confusing math with physical reality. A true 44.1 kHz converter can't realistically capture frequencies ~18-20 kHz due to the limitations of filters used in the process. A perfect lowpass brick-wall filter just does not exist - they all introduce artifacts, which a trained ear can identify. You don't need to be a dog to hear the difference, just someone who does not assume that Nyquist theorem can be magically applied in the real world (and, ideally, someone who utilizes high quality converters with oversampling).
Blackthorn 47 minutes ago [-]
That extra 4.1 khz sample rate is for headroom for a low pass filter (and not necessarily a brick wall one). Leftovers or any such artifacts are below the noise floor, which is also an important part of the physical reality.
Would be happy to see an actual, real study to prove that humans can notice, but to my knowledge none exist that confirm they can. Not even any on teenagers or younger (the only group that can even hear close to 20 kHz).
vor_ 1 hours ago [-]
Is there evidence that a trained ear can reliably perceive these artifacts in a blind test of converters? I'd be interested in reading those links since converters typically oversample into the MHz range. At 11.29 MHz (256x 44.1 kHz), Nyquist will be at 5.64 MHz. Even the cheapest consumer converters are performing this type of oversampling.
To draw a design parallel: pixel-perfect design isn't something we are born with, noticing tiny details is a developed skill.
And yes, you are on point: oversampling is used extensively, but this just points at the exact issue: Nyquist theorem gave us a math algorithm, we still need to account for the electronic component imperfections. And then we are entering a different space of quality/precision/psychoacoustics/perception/etc. Meaning, not all converters, not all pre-amps, not all mics "sound" the same, even when they use same types of components on paper.
vor_ 24 minutes ago [-]
Oh, dear, that AES 2014 paper from Meridian (which was trying to push its controversial proprietary MQA audiophile system the same year) was widely criticized on audio forums when it came out, ranging from the rectangular dithering method to the use of a hard metal tweeter that could cause IM.
Do you have more convincing sources?
MertsA 6 hours ago [-]
You need a sample rate of at least twice the highest frequency in order to represent the original signal. That's slightly misleading though: it comes from the Nyquist-Shannon sampling theorem and it's a mathematical fact, but it is only true for exact numerical samples; once you add in quantization, that muddies the water a bit. Taken at the extreme, it's straightforward to see why a 1 bit quantization per sample at 44.1 kHz would not capture a perfect representation of some analog signal even if there's only a 1 kHz frequency component to the signal. If we instead decide to sample at 10 MHz but still one bit quantization, now that 1 kHz frequency component can be much more accurately represented even though we're still using the worst quantization possible. Don't think of quantization like a square wave or a step pattern, think of it as "the signal is closer to here than any other discrete value".
Now in terms of realistic audio encoding, 16 bit at 44.1 kHz is designed to be a faithful representation as far as human hearing is concerned. Can someone with a trained ear potentially tell the difference between that and 24 bit at 192 kHz? In a studio environment it's possible. Most audiophile claims are dubious and a blind A/B test catches them out on most of it but the Nyquist-Shannon sampling theorem does not directly apply to quantized samples, it's about exact samples and with quantization, sampling rate is intertwined somewhat with the quantization depth.
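The 1-bit thought experiment can be tried numerically. The sketch below is one way to do it, not the only one: it swaps in a first-order delta-sigma modulator as the 1-bit quantizer (a plain sign() quantizer would behave differently), uses 2.8224 MHz (64 x 44.1 kHz) as the "fast" rate instead of 10 MHz to keep the run short, and measures error after a crude moving-average filter. All names and parameters are illustrative assumptions.

```python
import math

def one_bit_error(sample_rate_hz, tone_hz=1000.0, duration_s=0.005, window_s=125e-6):
    """RMS error of a 1-bit delta-sigma stream versus its input, after block averaging."""
    n = int(sample_rate_hz * duration_s)
    window = max(1, int(sample_rate_hz * window_s))
    xs = [0.5 * math.sin(2 * math.pi * tone_hz * i / sample_rate_hz) for i in range(n)]

    # First-order delta-sigma modulator: integrate the difference between the
    # input and the previous 1-bit output, then output the integrator's sign.
    bits, integrator, y = [], 0.0, 0.0
    for x in xs:
        integrator += x - y
        y = 1.0 if integrator >= 0 else -1.0
        bits.append(y)

    # Compare the block means of the input and of the 1-bit stream.
    sq_err, count = 0.0, 0
    for start in range(0, n - window + 1, window):
        mean_x = sum(xs[start:start + window]) / window
        mean_y = sum(bits[start:start + window]) / window
        sq_err += (mean_x - mean_y) ** 2
        count += 1
    return math.sqrt(sq_err / count)

print("1 bit @ 44.1 kHz   RMS error:", one_bit_error(44_100))
print("1 bit @ 2.8224 MHz RMS error:", one_bit_error(2_822_400))
```

Under these assumptions the fast stream tracks the 1 kHz tone far more closely, which is the trade between sample rate and quantization depth described above. It does not by itself say anything about whether 16-bit 44.1 kHz is audibly insufficient.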
move-on-by 6 hours ago [-]
I don’t have great hearing, so I’m not sure I can really weigh in here (thanks punk concerts in my teens). I remember similar arguments around screens and 60Hz vs ‘the human eye’. I think a lot of people, myself included, can easily perceive the difference between 60Hz and something higher- given the right conditions. I would not be so quick to disregard claims of more sensitive hearing.
speak_on 3 hours ago [-]
(I commented on this topic above/below in more detail.) Even with not-so-great hearing you would still be able to identify the difference (ie artifacts are pushed down, not up). Look up articles on the practical limitations of AD/DA converters and why the seemingly counter-intuitive claim that the difference between 44.1 kHz and above is noticeable, is actually a fully industry-accepted practical reality: aliasing, AD/DA lowpass filters, etc.
labcomputer 3 hours ago [-]
I would. It’s really simple.
The human threshold-of-hearing curve intersects the threshold-of-pain curve at about 20 kHz.
Above that frequency (or thereabouts) the sound has to be so loud that it will literally instantly damage your hearing before you can hear it.
This has been replicated across many studies for more than 100 years.
Flicker threshold is completely different. You can’t damage your vision by increasing the FPS, and it has always been commercially desirable to use a lower frequency because that is cheaper.
speak_on 1 hours ago [-]
Would you agree that a trained human could identify artifacts produced by an imperfect conversion process? If you lean "yes", then that's your answer: AD/DA is not a Rust function perfectly implementing the Nyquist theorem, it's a collection of physical components many of which introduce artifacts into the audio path. This thread is not about the theory of human hearing, the electronic components are literally imperfect.
PaulDavisThe1st 48 minutes ago [-]
They're no more imperfect than the pickups on an electric guitar, the assembly inside the microphone, the circuit in the compressor and everything else in the analog signal chain that exists long before AD happens.
speak_on 37 minutes ago [-]
Absolutely! All these examples have imperfect audio paths - that is the point.
PaulDavisThe1st 34 minutes ago [-]
But the central point is that there's no reason to pick on the digital elements in any particular way. Recorded music in 2026 is a pretty good recreation of the original acoustic pressure waves when it is intended to be, but (a) not perfect, even in the pure analog domain and (b) it is frequently not intended to be.
speak_on 12 minutes ago [-]
The central point is that AD conversion can and will introduce artifacts. The DA process will introduce more artifacts. The "imperfect" is a huge range and AD/DA converters play a role in that. We are not talking about "golden cables" bs here, conversion does introduce measurable artifacts in the audio path. The more tracks you record the more artifacts you have. Can everyone hear them? Definitely no. Can they be heard - yes, I can hear the difference between an old Digidesign interface and Grace Design interface.
PaulDavisThe1st 3 minutes ago [-]
No, the central point is that the analog signal handling before AD introduces vastly more "artifacts" than the AD or DA does.
In addition, nobody cares about "measurable" artifacts (or rather, they should not). What matters are "audible" artifacts.
Artifacts do not sum linearly, because they do not originate from correlated sources (unless you're doing something rather unusual).
Glad you can hear the difference between two converters, but I trust you've tested it in a double blind setting?
ses1984 46 minutes ago [-]
Can you give any examples of people identifying these artifacts in a/b tests?
I know from my 20-ish year mixing experience that I can hear the difference when mixing. Is it good evidence? No. So we can agree to disagree then.
clawlor 6 hours ago [-]
Max representable frequency is half the sampling rate (nyquist-shannon theorem), which is still a bit above normal but IIRC the extra headroom has something to do with eliminating aliasing
Blackthorn 6 hours ago [-]
Indeed. And what is the max frequency that a human can hear?
speak_on 2 hours ago [-]
The artifacts produced by pure 44.1 kHz conversion are aliased back down to lower frequencies. It's not about a theoretical human ear, it's about the actual physics of AD/DA conversion.
PaulDavisThe1st 47 minutes ago [-]
But the energies of the signal present above the Nyquist frequency (22050Hz in this case) are almost always incredibly weak, and double blind testing rarely shows any indication that humans can actually hear the aliasing.
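For reference, where an out-of-band tone lands is simple to compute. This sketch assumes the worst case of no anti-alias filter at all (real converters filter, and usually oversample, before this point), and the function name is illustrative. It only gives the frequency of the alias, not its level, which is what the disagreement above is about.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency of a tone at f_hz after sampling at fs_hz with no filtering."""
    folded = f_hz % fs_hz
    return min(folded, fs_hz - folded)

for tone_hz in (20_000, 25_000, 30_000):
    print(tone_hz, "Hz sampled at 44.1 kHz appears at", alias_frequency(tone_hz, 44_100), "Hz")
```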
speak_on 33 minutes ago [-]
Mixing process often involves hundreds of tracks, and if each introduces aliasing, this can become a problem. Some engineers do swear by "the final mix is 16/44.1 so why mix at a different resolution?" mantra - that's fine too.
PaulDavisThe1st 30 minutes ago [-]
This is false. Aliasing is not additive in any meaningful way.
speak_on 8 minutes ago [-]
Ok dude, you obviously never recorded anything. Twelve mics on a drum kit, 60 tracks of rhythm guitars, several bass guitar layers, vocals, backing vocals, electric organ, percussions, saxophone solo. Do you think recording them at 44.1 somehow creates a shared "cloud-based" aliasing artifact that I store in S3?
Depends on age of the listener, on average, 30 to 50 year olds hear a maximum frequency of 14 to 16 kHz.
Blackthorn 6 hours ago [-]
Right. Which are quite below 1/2 of 44.1k!
OkayPhysicist 3 hours ago [-]
Sure, but those are averages. I'm 30-ish, and my hearing doesn't cut out until somewhere in the 21kHz range. When I was younger, it was even higher. One of my roommates in college had one of those anti-rodent high-frequency noise generators, we almost came to blows over it.
UtopiaPunk 6 hours ago [-]
If you want to hear the difference between an audio file recorded at 44.1 and 88.2 kHz, then you need to slow the audio playback down. Otherwise, a trained ear cannot physically hear the difference.
speak_on 3 hours ago [-]
44.1 is "enough" only in theory. This assumes a physically impossible steep filter. Realistically, frequencies around 20 kHz will create audible artifacts (aliasing). So yes, a trained ear can tell the difference between 44.1 and even 48 kHz. Like many other commenters in this thread, you are mixing up math theory with physical limitations of AD/DA converters. Oversampling is a common way to address this limitation, but strictly speaking 44.1 kHz is not as obviously "enough" as it seems.
PaulDavisThe1st 46 minutes ago [-]
> Realistically, frequencies around 20 kHz will create audible artifacts (aliasing)
The energy of the signal components above the Nyquist is generally very low, and very few double blind tests have given any indication that humans can detect the resulting aliasing (even though many people claim to be able to do, almost always in non-double-blind environments).
Badly written digital synthesis can generate high energy signal components above 22kHz, but that's because they're badly written, not because the theory is wrong.
speak_on 26 minutes ago [-]
Generally very low for a single track? What about 200 tracks? Badly written synthesis, or badly recorded live instruments, or bounced and re-bounced dozens of times... we are not talking about the quality-defining aspect here. You can produce an excellent mix on KRKs connected directly to a MacBook.
This space is not driven by a single precise formula. 48/96 kHz helps some engineers to produce better sounding mixes. Can everyone hear the extended range of Adam tweeters? Probably not. But some can, and they benefit from that. Even if there is no double-blind study to prove this in absolute terms.
PaulDavisThe1st 18 minutes ago [-]
If you recorded 200 tracks of the same instrument, so that the partials above Nyquist were all broadly the same, then sure, summing the tracks would include summing 200 copies of the aliasing results too.
But very little music is like that, and the energy profile above Nyquist will differ dramatically. Consequently, you're not summing a set of identical aliasing results, and in general, the results will still be undetectable to almost everyone.
Jacob Collier routinely works with 300+ tracks in Logic. He doesn't worry about this sort of thing, and neither do the Grammy voters who love what he does.
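The two cases in this exchange can be put in numbers with an idealized sketch. It assumes every track contributes an artifact of equal level, and that the artifacts are either perfectly correlated (amplitudes add) or fully uncorrelated (powers add). Real sessions sit somewhere in between, and the function name is made up.

```python
import math

def summed_level_db(n_tracks: int, correlated: bool) -> float:
    """Level of n equal-level components summed together, relative to one of them."""
    if correlated:
        return 20 * math.log10(n_tracks)  # amplitudes add
    return 10 * math.log10(n_tracks)      # powers add

for n in (12, 60, 200):
    print(n, "tracks: identical artifacts +%.1f dB, uncorrelated +%.1f dB"
          % (summed_level_db(n, True), summed_level_db(n, False)))
```

So 200 uncorrelated tracks raise the artifact floor by about 23 dB rather than 46 dB, and the wanted signal is being summed by the same mixer at the same time.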
vor_ 1 hours ago [-]
Do you have citations for this claim? The "golden ears" argument is often employed by audiophiles, but even the cheapest converters oversample by up to several hundred times as well as employ antialiasing filters.
scns 7 hours ago [-]
A treated room would be the most impactful, DACs the least.
speak_on 3 hours ago [-]
The most impactful for noticing the difference? Again, I would argue it's the trained ear. If you have plenty of mixing experience then all these details add up, and a treated room becomes the most critical - agree with that.
vor_ 12 minutes ago [-]
So far, there isn't sufficient evidence that anyone has such reliably golden ears.
yellowapple 6 hours ago [-]
The DAC is pretty impactful if it's outright incapable of outputting anything beyond the usual 48kHz :)
vor_ 14 minutes ago [-]
Even the cheapest consumer DACs oversample into the megahertz range.
dist-epoch 7 hours ago [-]
The whole audiophile industry is built on stuff which doesn't make any sense
My favourite: "audiophile-grade" audio players which allocate a single continuous buffer of RAM into which they load/decode the whole .WAV/.FLAC file, because supposedly the CPU "jumping" between "fragmented audio" causes audible "jitter".
Of course, they don't know that what looks like continuous memory to user-code is probably discontinuous in kernel/physical RAM.
Didn't check in many years, I wonder if they created kernel level players to account for that, to have "true continuous memory"
platinumrad 7 hours ago [-]
Don't forget: "most players use malloc to get memory while new is the c++ method and sounds better."[1]
audiophiles (https://forums.stevehoffman.tv/threads/turntables-with-pace....) also claim that turntables can be rated on "timing, rhythm, and pace" in which supposedly the timing of the music can be affected by the turntable's mass and other properties.
How this would occur without also producing grossly audible pitch distortion never seems to be discussed.
lmc 7 hours ago [-]
> My favourite: "audiophile-grade" audio players which allocate a single continuous buffer of RAM into which they load/decode the whole .WAV/.FLAC file, because supposedly the CPU "jumping" between "fragmented memory" causes audible "jitter".
Thanks for the laugh... this is absolutely bonkers. In case anyone is wondering, before sound hits our ears it has to go through a digital to analog conversion, which takes place on hardware independent of the CPU, operating with its own clock and buffers etc.
justsomehnguy 6 hours ago [-]
Am486DX/100 was enough to decode and listen to an MP3 at 22KHz (and maybe mono?) and was more than enough for listening to 44/16/2 PCM. It's 31 y.o. today.
Sohcahtoa82 2 hours ago [-]
I remember playing 44khz 16-bit stereo MP3s encoded at 128 kbit/sec on a 133 Mhz 486.
It gobbled like 90% of the CPU and I had to make sure I gave it a pretty large buffer so it didn't stutter when an app claimed CPU for more than a second, but it worked.
wat10000 7 hours ago [-]
In addition to that, while it is possible to hit a delay and run out of buffer because memory access is slow (the most obvious would be if the input got swapped to disk at an inopportune moment), the audible effect is really obvious. This isn't some subtle "oh my music sounds ineffably worse" effect, it's "my computer is glitching and my music is unlistenable."
billyjobob 6 hours ago [-]
I can tell when my CPU usage spikes because it causes a hum through my speakers, so this does not seem that far-fetched.
justsomehnguy 6 hours ago [-]
It just means you have a shitty audio path with not enough shielding. Move to SPDIF/TOSLINK.
codedokode 1 hours ago [-]
I have an external audio card, if I put it on a laptop I can hear the modem-like sounds. I wonder why it is so sensitive, shouldn't a DAC produce a strong signal that cannot be easily affected by radio waves?
Also my headphones are extremely sensitive. I can touch the ring and sleeve of a jack with a finger, and touch a metal bed frame with a tip and I hear quiet clicks as I move the tip along the metal. Sometimes I do not even need to touch the jack with a finger. It doesn't work with small objects like a knife though.
PaulDavisThe1st 43 minutes ago [-]
Bad grounding everywhere. This is insanely basic stuff.
nok22kon 52 minutes ago [-]
the radio waves could be interfering with the signal before it gets amplified
bellowsgulch 7 hours ago [-]
The latter is probably true, but the former does actually happen, and it's easy to accidentally do--lossless or not.
dijit 7 hours ago [-]
huh...
So I guess the programmer equivalent is distributing .pdb's (or, symbols)
Blackthorn 7 hours ago [-]
Pretty good analogy. Thing is though, the person who receives the 16-bit, 44.1khz music file can always upsample it to 192khz and not lose anything in the process (heck, lots of audio stuff oversamples internally to this level or beyond, for extra aliasing headroom!). I'm not sure about expansion from 16bit to 24bit though, downward expansion isn't necessarily perfect.
gizajob 7 hours ago [-]
You’d be adding 150khz and 8bits of nothing.
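On the bit-depth half of that question, a sketch assuming plain zero-padding (shifting each sample left by 8 bits, which is one common way to widen 16-bit PCM to 24-bit; the helper names are illustrative): the extra byte carries no information and the original comes back exactly.

```python
def pad_16_to_24(sample: int) -> int:
    """Widen a signed 16-bit sample to 24 bits by appending eight zero bits."""
    return sample << 8

def truncate_24_to_16(sample: int) -> int:
    """Drop the low eight bits of a signed 24-bit sample."""
    return sample >> 8

# Every possible 16-bit value survives the round trip unchanged.
lossless = all(truncate_24_to_16(pad_16_to_24(s)) == s for s in range(-32768, 32768))
print("16 -> 24 -> 16 round trip is lossless:", lossless)
```

Going the other way, from a genuine 24-bit file to 16 bits, is where information is discarded and where dither choices matter.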
viccis 7 hours ago [-]
If you try to use empiricism when it comes to certain groups of audiophiles, you are going to be sorely reminded that it's basically the equivalent of healing crystals for a different type of person. 24/192 is useful for mixing/mastering, but completely unnecessary for the end product to distribute for listening.
evo 7 hours ago [-]
24/192 is also great for digital synthesizers--if you're generating a waveform like a sawtooth that has theoretically instantaneous transitions, they can eat as much frequency as you can give them. Running at 44khz loses noticeable high-end content.
Most modern digital synths have already caught onto this and run internally at much higher sampling rates even if their output gets downsampled, but sometimes you run across a vintage plugin that runs at the host audio rate and working in a higher sampling rate is audible.
Blackthorn 7 hours ago [-]
You can generate perfect band-limited sawtooth waves at 44.1khz, there are multiple techniques for doing this and most production digital synthesizers use them.
Oversampling gives you headroom for aliases for the rest of the synth that is more vulnerable to it.
evo 7 hours ago [-]
Yeah, I was oversimplifying a blit, the raw waveforms are usually okay, but I distinctly remember old-school VSTs where you couldn't achieve a nice saw lead at 44.1.
Blackthorn 7 hours ago [-]
It's tough to tell without specific names, but I imagine a lot of particularly old* VSTs were written to use naive sawtooths rather than perfect band-limited ones, which would have terrible aliasing at 44.1 khz. Oversampling those would help a lot!
* Some people are still making this mistake, despite information on the (many) ways to do it the right way being widely and freely available!
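A minimal sketch of the difference, using additive synthesis as the band-limiting technique because it is the easiest to read. Production synths more commonly use cheaper methods such as wavetables or BLEP-style corrections, so this is not a claim about any particular plugin. Function names are illustrative.

```python
import math

def harmonic_count(freq_hz: float, sample_rate_hz: float) -> int:
    """Number of harmonics of freq_hz that lie strictly below the Nyquist frequency."""
    return math.ceil((sample_rate_hz / 2) / freq_hz) - 1

def bandlimited_saw(freq_hz: float, sample_rate_hz: float, n_samples: int) -> list:
    """Sawtooth built from its Fourier series, truncated below Nyquist so nothing aliases."""
    n_harmonics = harmonic_count(freq_hz, sample_rate_hz)
    out = []
    for n in range(n_samples):
        phase = 2 * math.pi * freq_hz * n / sample_rate_hz
        partials = sum((-1) ** (k + 1) * math.sin(k * phase) / k
                       for k in range(1, n_harmonics + 1))
        out.append(2 / math.pi * partials)
    return out

def naive_saw(freq_hz: float, sample_rate_hz: float, n_samples: int) -> list:
    """Ramp drawn directly in PCM; its harmonics above Nyquist fold back as aliasing."""
    return [2 * ((freq_hz * n / sample_rate_hz + 0.5) % 1.0) - 1 for n in range(n_samples)]

print(harmonic_count(1000, 44_100))  # 22 harmonics fit under 22.05 kHz
saw = bandlimited_saw(1000, 44_100, 441)
ramp = naive_saw(1000, 44_100, 441)
```

The naive ramp has infinitely many harmonics on paper, so raising the host sample rate only pushes the folding point up; truncating the series removes the problem at any rate.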
evo 6 hours ago [-]
I wonder if there's also distortion or ring modulation stages where some of the energy above hearing range might spill into audible sidebands if they're not nyquist-limited first.
Blackthorn 6 hours ago [-]
Yeah, that's the "rest of the synth" part that's more vulnerable to aliasing.
There's some ways to do band-limited distortion but...they aren't nearly as widespread, easy, or universal as band-limited oscillators.
Ring modulation is funny though because you'd ideally want the sidebands to modulate down by default rather than filter them out, that's why you're using it.
nullc 42 minutes ago [-]
> 24/192 is also great for digital synthesizers--if you're generating a waveform like a sawtooth that has theoretically instantaneous transitions, they can eat as much frequency as you can give them.
So if your synthesizers do not use proper band-limited oscillators then 192KHz is _FAR_ too slow. You'd want to be running at hundreds of KHz, perhaps a few MHz.
In reality synth software that doesn't sound like crap uses band limited oscillators and should work okay at 48KHz too. That said, even if the oscillators are band limited it may be the case the various modulations aren't band limited properly, as getting those wrong won't sound instantly wrong (in particular because you have to modulate to make it wrong, and the underlying change of the modulation may make it harder to tell it's wrong).
Though also in those cases if you're not counting on every step being properly band limited then 192KHz may be an improvement but you're still probably getting some meaningful aliasing. I think given how fast computers have become relative to digital audio there is probably a good case to just make any "modular synth" run at 32-bit 480KHz or even 4.8MHz through every stage that could process the audio.
Maybe 192KHz really is enough to suppress the aliasing artifacts but I think to be convinced of that I'd want to see a system that supported both and validate that the difference between a downsampled 48KHz output from the two modes was below -90dB or something.
Or otherwise you can just declare that the aliasing is part of the sound and then there are no right choices... 24khz sampling, 48k, 192k ... who cares, use what you like best. :)
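The validation proposed above (render at two internal rates, downsample both to 48 kHz, check the difference is below about -90 dB) could be measured with something like the sketch below. It assumes the two renders are already time-aligned, gain-matched and the same length; `residual_db` is a hypothetical helper name, not part of any audio tool.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def residual_db(reference, candidate):
    """Level of (candidate - reference) relative to reference, in dB."""
    diff = [c - r for r, c in zip(reference, candidate)]
    diff_rms, ref_rms = rms(diff), rms(reference)
    if diff_rms == 0.0:
        return float("-inf")  # bit-identical renders
    return 20.0 * math.log10(diff_rms / ref_rms)


if __name__ == "__main__":
    ref = [math.sin(2.0 * math.pi * 1000.0 * n / 48000.0) for n in range(4800)]
    # Stand-in for a second render that differs by one part in 100,000.
    other = [s * (1.0 + 1e-5) for s in ref]
    print(f"residual: {residual_db(ref, other):.1f} dB")
```

A residual well under the -90 dB figure would support "192KHz is enough" for that patch; it says nothing about other patches or other modulation depths.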
Applejinx 43 minutes ago [-]
Hydrasynth aliases like a mad thing. My flagship synth ended up being Summit, and its oscillators are digital but run at a crazy high sample rate. Did likewise with some Chord Organ modules: that Teensy board it was built on could do chord audio at 300k and over a megahertz if you were just generating one wave as simply as possible. The freedom from aliasing really helped the sound, for all that it's a 12 bit analog output. A squarewave is a 1 bit signal…
dist-epoch 7 hours ago [-]
No synth generates sawtooths by literally drawing a saw tooth in PCM. The distortion you get if you do that is not subtle at all.
colmmacc 7 hours ago [-]
32-bits are great for recording too because they do an incredible job of capturing the dynamic range without having to be precise on the preamp settings. It removes an entire job from the recording workflow.
192 for mixing and mastering can be useful especially if you're doing a lot of effects, especially anything that pitch shifts. But I've seen low quality phone-microphone recordings make it to the master; if you capture lightning in a bottle, it hardly matters what the settings were, what the microphone was, or anything else.
PaulDavisThe1st 41 minutes ago [-]
The limit on current DACs is 18-22 bits. The rest is just brownian noise. Literally.
Aldipower 7 hours ago [-]
Even with mixing/mastering 96khz is enough for persisting to files. But as another commenter said, 192 is useful, if you bend and stretch samples!
tshaddox 7 hours ago [-]
They literally sell actual crystals that you’re supposed to place on top of speakers and amplifiers to make them sound better.
Blackthorn 6 hours ago [-]
We had a really nice crystal decoration that I happened to put on top of one of my TV speakers and, wouldn't you know it, it had this resonant frequency somewhere around specific human speech frequencies that drove us absolutely bonkers until I figured out the cause and moved it.
teach 7 hours ago [-]
(2012)
lokar 7 hours ago [-]
I wonder how many people think that 24 bit audio encodes 50% “more”
recursive 7 hours ago [-]
It is 50% more headroom above the noise floor in logarithmic decibels.
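For anyone who wants the numbers behind that, a quick sketch using the textbook figure for an ideal quantizer driven by a full-scale sine (about 6.02 dB per bit plus 1.76 dB). Real converters fall short of this, so treat it as an upper bound; the function name is made up for the example.

```python
import math


def ideal_dynamic_range_db(bits):
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale sine."""
    return 20.0 * math.log10(2.0) * bits + 10.0 * math.log10(1.5)


if __name__ == "__main__":
    dr16 = ideal_dynamic_range_db(16)
    dr24 = ideal_dynamic_range_db(24)
    print(f"16-bit: {dr16:.1f} dB, 24-bit: {dr24:.1f} dB, ratio {dr24 / dr16:.2f}")
    print("distinct levels, 24-bit vs 16-bit:", 2**24 // 2**16, "times as many")
```

16 bits comes out near 98 dB and 24 bits near 146 dB, so the decibel figure grows by roughly half while the number of distinct levels grows by a factor of 256.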
Arodex 6 hours ago [-]
I completely accept that human audition has limits that are easy to determine by playing a pure sound. But is it the same with music, where multiple frequencies are played and interfere with each other? Aren't some harmonics or effects created by these "inaudible" frequencies?
To try to imagine something similar: the human eye is unable to see UV light, yet fluorescent paint has a visible quality of its own compared to "normal" pigments.
nok22kon 49 minutes ago [-]
when beams of ultrasound interact they can produce audible frequencies
24 bits is now ubiquitous and 32 bit is becoming the norm in recording studios.
evo 7 hours ago [-]
32-bit float has become popular in filmmaking/field recording equipment lately because, with a microphone preamp that supports it, you can capture the entire dynamic range of the microphone--there's no accidental clipping if you drive the gain stage too hard.
It's a bit redundant for a skilled technician, they're already used to setting the gain staging, inbound compression, and feathering the mics to avoid this in 24-bit, but if you're handing a boom mic to a novice and have a scene where e.g. someone's whispering and another person's screaming, it can be nice to not have to worry about it.
lysace 7 hours ago [-]
That use case is literally addressed in the first sentence.
metalman 7 hours ago [-]
sheeesh , measly 24-bit/192kHz
of course it makes no sense, unless it is downloaded through low-oxygen wire, which somehow and unfathomably must have been omitted or forgotten.
b3orn 7 hours ago [-]
If it has been transmitted via hollow-core fibres it will obviously sound hollow.
waffletower 6 hours ago [-]
For typical listening (though humans can perceive bone-conducted vibrations up to 100 kHz or even 120 kHz) 16-bit-fixed/44.1kHz is a high-fidelity transport format. As a DSP researcher, I prefer 32-bit-float/44.1kHz as a transport format. I often upsample to 32-bit-float/188.2kHz or even 32-bit-float/192kHz for signal processing applications such as high-fidelity reverberation via direct and FFT convolution. While the author advocates for the transport to ear use case, I would argue that 24-bit/192kHz provides greater fidelity and resolution for sound processing. I found the pedantic arrogance of the author to be annoying. But yes, the sampling theory is an important consideration -- but so is the quality of the actual digital filters used in the DAC->ADC pipeline. They are much more forgiving and less lossy at 192kHz.
Aldipower 7 hours ago [-]
[dead]
haunter 7 hours ago [-]
The more the bits the better the music, easy as one two three
Don't forget to buy the new low oxygen platinum plated HDMI cables for the better experience!
Like I tell newcomers: if it sounds better enough to you to warrant the purchase price, then that’s all that really matters. Enjoy the hobby.
I also spent a lot of time ripping my old CDs to FLAC and trying different MP3 and AAC encoder settings to get playback that felt transparent enough to me. I could never tolerate Sirius/XM radio streaming due to the horrid compression I heard with every futile attempt. I still seem to have more sensitive hearing than most people around me, but in my 50s I know it isn't what it once was.
I never had huge budgets, but did strive for hi-fi in my limited ways. I used things like toslink and HDMI to send raw PCM data from Linux to my Yamaha A/V receiver's DACs + amplifier to drive somewhat nice Polk tower speakers. But then COVID-19 happened, and this stuff was packed up to move house.
Nowadays, music playback is streaming with mundane "subwoofer + satellite" PC speakers or MP3 playback with a mini-SD card permanently parked in my car's infotainment system.
As referenced in the article, a common explanation for those audible differences is that the high-resolution version of the album is sourced from a different master.
Small differences in gain are ABX able much more readily than differences in noise at the 16 vs 24 bit level. So if the signal chain gives even a small difference in gain between the samples that's what you'll track. A reasonable conversion path to 16 bits for mastering will also apply dithering and some kind of brickwall limiting (you have to limit after the dither or as part of the dither as dither can change levels!), and this can result in gain changes. The DAC may behave differently or have outright bugs for some configurations too.
This is particularly true wrt reconstruction filters for sample rate differences. And if you were comparing 44.1k and 192k then the physical DAC itself was likely running at a different rate and its _analog_ filters are probably better optimized for one vs the other (this is less true for 48k vs 192k, as the hardware likely runs at the same rate for both). So one answer to this comparison can be "on this particular hardware this rate is better than that rate"-- but that's an implementation property, not a property of format choice.
You might think, "okay I'll use a mathematically perfect down and up conversion process and run the DAC in the exact same configuration for all cases". But even then you run into issues like after reconstruction the _inter sample_ peak levels will be higher than the levels of the samples, so you have to handle that and in a way that doesn't produce a gain difference between the two configurations. (probably by running your perfect process and finding the gain level that results in no limiting, then making the gain of the original match).
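The inter-sample peak issue is easy to reproduce with the classic worst case: a sine at a quarter of the sample rate, phased so every sample lands 45 degrees away from the crest. A minimal sketch, with illustrative function names:

```python
import math


def sample_sine(freq_ratio, phase, n):
    """n samples of sin(2*pi*freq_ratio*k + phase); freq_ratio is f/fs."""
    return [math.sin(2.0 * math.pi * freq_ratio * k + phase) for k in range(n)]


def peak_db_over_samples(samples, true_peak=1.0):
    """How far the continuous waveform's peak sits above the largest sample."""
    sample_peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(true_peak / sample_peak)


if __name__ == "__main__":
    samples = sample_sine(0.25, math.pi / 4.0, 64)
    print(f"largest sample: {max(abs(s) for s in samples):.4f}")
    print(f"reconstructed peak is {peak_db_over_samples(samples):.2f} dB higher")
```

Every sample is about 0.707 while the reconstructed sine peaks at 1.0, roughly 3 dB higher. Normalize those samples to full scale and the reconstruction overshoots, which is why the two paths in a comparison need their gain matched after reconstruction, not before.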
And then for the high rate vs non-high rate you have to deal with the fact that most amplifiers are not particularly linear (compared to well constructed software at least!) and that any real speaker is very far from linear. This means that the presence or absence of ultrasonics will change the audio in the 0-20khz band. Before you think "well that could be a reason that high rate is better" observe that if there was some consistently good effect from the ultrasonics you could just bake it into the low rate sample.
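As a rough illustration of how ultrasonics end up in the audible band through a nonlinear amplifier or speaker, this sketch lists the low-order intermodulation products of two tones. The 24 kHz and 27 kHz tones are arbitrary examples, and the 20 kHz audibility limit is a simplification.

```python
def intermod_products(f1_hz, f2_hz):
    """Second- and third-order products of two tones through a nonlinearity."""
    candidates = {
        "f2 - f1": f2_hz - f1_hz,
        "f1 + f2": f1_hz + f2_hz,
        "2*f1 - f2": 2 * f1_hz - f2_hz,
        "2*f2 - f1": 2 * f2_hz - f1_hz,
    }
    return {name: abs(f) for name, f in candidates.items()}


def audible(products, limit_hz=20000):
    """Keep only the products that fall below the assumed hearing limit."""
    return {name: f for name, f in products.items() if f < limit_hz}


if __name__ == "__main__":
    products = intermod_products(24000, 27000)
    print("all products:", products)
    print("in the audible band:", audible(products))
```

Two tones nobody can hear produce a 3 kHz difference tone that everybody can. How loud it is depends entirely on how nonlinear the playback chain is, which this sketch does not model.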
> but in my 50s I know
Yeah if you're in your 50's you're absolutely not hearing differences way up above 20khz (especially if you're male), I bet you can't even hear CRT flybacks from 100 yards anymore. :P Most people have no idea how much their high frequency hearing degrades as they age because it plays approximately no role in your life, but it's real, dramatic, and as far as I know happens to everyone.
I don't mean to discount your experience: I don't really doubt that it was real. But answering the general question of the necessity of low vs high rate probably takes a team of experts, armed with test gear and the designs of the HW/SW in question, to vet the test configuration. Testing a _particular_ configuration without the ability to distinguish its implementation quirks from format-fundamentals is much easier and that's what most attempts to test this question are actually testing.
By testing in a recording studio you were doing far better than most such comparisons. Usually people try comparing different files and they're comparing entirely different mastering processes. Files made for the "high res" market will often have much less compression and limiting than files made for commercial radio play / casual listening... and truly do sound obviously much better. Some of my favorite recordings are rips from vinyl. Vinyl is an awful format from the perspective of audio fidelity, but it's also pretty intolerant of excessive compression and limiting because the record will skip if the needle is bouncing off the rails. And more recently I suppose they also avoid over compression there because of the difference in target listener/environment.
This was common knowledge at least as far back as the mid 80s, when every hifi shop and salesguy knew to ensure the bit of gear with the highest profit margin got played an almost imperceptible bit louder than the gear the customer came in to buy during back to back testing.
Point being: it doesn't even require an unscrupulous sales person to get similar results to an unscrupulous sales person! :P
A reasonable definition of transparency for high bitrate compressed audio is "Can the worst files be distinguished by a listener trained in what artifacts sound like". Maybe also add in having to use a high discrimination listening setup, including not running excessively loud (increases masking).
If that's not the test you're doing, it's unsurprising. At moderately high bitrates no one can reliably distinguish them on arbitrary samples: most inputs are easy.
If you test on known-difficult "killer samples" you'll probably easily distinguish them, even without first being shown what to look for, and certainly after.
During the development of Opus I created many 'trained listeners' and selected many killer samples, and I don't recall* ever encountering a tin ear that couldn't be taught to ABX any high rate samples, though some people are obviously much better at it.
I'm not sure I'd recommend it though: learning to identify artifacts has a frequent side effect of making low rate audio like the HE-AAC used in SiriusXM absolutely intolerable. I'm bothered by it even when I hear cars driving by using it. :)
[*] My memory for such things sucks, so I could be wrong-- but my point that it's not expected remains.
The takeaway from these sorts of posts, at least in my opinion, should be two-fold:
* Understand the physical limits of human senses and perceptions to help inoculate yourself against outright scams and grifts
* Liberate you from the "tech grind" and allow you to enjoy what you like, how you like it.
Also understand that while there is an upper limit, we are all different within that. I can hear the difference between 128Kbps and FLAC, at least for some content, but not 256Kbps, maybe not 192. For some content (spoken word etc.), 64Kbps, sometimes less, is perfectly acceptable (to me). There was a time I could hear the difference between some encoders, but that was decades ago and anything in active use is pretty damn good (and my ears are not what they used to be) unless you really crank the bitrate down or tweak other options daftly.
You've established this with double blind testing, correct?
But I also have a large multi-terabyte music collection, I follow new music, go to concerts, go to parties, talk about music with my friends in signal group chats.
It's a hobby, and when you get a bit older and start having some savings, if you love music treating yourself with a better system is not that crazy.
Also with HEDD you get a handcrafted device made in Berlin. And if you go with nicer cables, they are very beautifully done and feel great. There is no difference in sound of course. Some people like jewelry, I can get similar enjoyment from beautiful audio equipment and cables.
And so many CDs of course.
On a tangent, whenever someone mentions LP sounding warmer or whatever I like to point out that I prefer wax cylinders (a.k.a. phonograph cylinders).
If I have an option to get a 16bit version of a recording or a high-res version, I choose the highest quality version every time.
Same with a physical copy. A limited edition, better quality vinyl LP is more attractive if you are going through the trouble of curating a collection.
I’ve been curating a music library of digital files since before the iPod was released and I will always go for the highest quality version out of principle. I can always downsample it to anything that makes sense.
It may be simultaneously true that:
A) Humans cannot tell the difference between 44.1kHz/16-bit audio and any higher resolution, and
B) For a particular song, the best commercially available 44.1kHz/16-bit version may not be the best commercially available version
"The quality of the particular mastering can still make a noticeable difference, regardless of the ability for the digital sampling rates to perfectly represent it perceptually"
Just to be clear that the statement applies to any releases meeting the A) criteria, not just 44.1 kHz @ 16-bit ones.
It’s like having gigabit internet to my house: I don’t actually need it, but when a website is slow, I know the problem isn’t in my internet connection.
https://www.carwow.co.uk/blog/carwow-quarter-mile-400-metre-...
https://en.wikipedia.org/wiki/List_of_N%C3%BCrburgring_Nords...
I opened a support ticket but they never responded. After that it was difficult to take their lossless claims seriously when the labels were providing such garbage source material. Their whole value prop was totally hollowed out.
I don't know whether the labels still impose such horrible practices, but I largely gave up on streaming services after that experience and now focus on keeping good digital archives of my physical library.
I’ve played with the nice toys, and they are nice, but for 100x the price, they barely deliver 1.5x the experience.
It’s like photographers who are confused about the difference between raw and bitmap (jpeg), videographers confused about the difference between linear raw vs log vs gamma encoded, etc.
Just because a data format with higher bit depth/sampling frequency/whatever exists for editing purposes, doesn’t mean it’s “better” or makes sense as a consumption format for a finished work.
Forms of manipulation bring inaudible content into the audible range.
Of course that doesn't mean audiophiles aren't being audiofooled by it, but there is legitimate usage.
I use microphones that can 'hear' up to 100kHz (Sanken CUX100K) and for film sound design playing 192kHz audio at half and quarter speed the results are very significant, and reveal there IS 'content' above human hearing. Irrelevant for general listening but very important for sound design.
Nobody uses 32 bit float for recording (to do so is just to capture at least 10 bits of noise, most of that being Brownian); it's strictly a format for mixing and processing. You don't get any more resolution from 32 bit floating point than you do from 24 bit integer formats, but the result of "clipping" is less dramatic, hence the appeal of the format.
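Both halves of that claim can be checked in a few lines of Python: IEEE 754 single precision carries a 24 bit significand, so it resolves no finer than a 24 bit integer near full scale, but it can hold values beyond 1.0 where a fixed-point path has to clip. The helper names and the hard-clip behaviour of `to_int24` are assumptions for the demo.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_int24(x):
    """Quantize a sample to signed 24-bit, hard clipping anything out of range."""
    scaled = round(x * 2**23)
    return max(-(2**23), min(2**23 - 1, scaled))


if __name__ == "__main__":
    # 2**24 + 1 needs 25 significant bits, so float32 cannot hold it exactly.
    print(to_float32(float(2**24 + 1)) == float(2**24))
    # A sample at twice full scale: float keeps it, 24-bit integer pins at the rail.
    print(to_float32(2.0), to_int24(2.0))
```

The first line prints `True` (the odd integer rounds to 2**24), and the second shows 2.0 surviving as a float while the 24 bit integer path is stuck at 8388607.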
While there is some evidence that non-auditory human sensory perception may be sensitive to ultrasonic acoustic waves, it's pretty weak right now, and somewhat in the "woo" zone. It may turn out to be significant, or it may not. I wouldn't base an audio production workflow that requires 4x the cpu power and 4x the disk space on such tentative claims, but you're welcome to.
Surely you understand a recording made at 48kHz has a max freq response of 24kHz and played at half speed that max freq is 12kHz and at quarter speed only 6kHz. You can very clearly hear the filter cut off due to Nyquist. Record at 192kHz with mics capable of 100kHz capture and when played at quarter speed, the sound is full spectrum because there is no truncated frequency response. And when I load a 192kHz recording to izotope RX I can literally see the harmonics going up to 96kHz. (not with every sound of course)
I repeat, I am not talking about 'normal' listening. I am talking about an industry you have no knowledge or lived experience with, so spare me the incorrect claims about what can & can't be heard.
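The half- and quarter-speed arithmetic in this comment is just Nyquist scaled by the playback speed. A tiny sketch, assuming tape-style varispeed where pitch and bandwidth drop together, and assuming the recording really does contain content up to its Nyquist limit (which depends on the microphones):

```python
def max_frequency_after_varispeed(sample_rate_hz, speed):
    """Highest frequency left when playback is slowed to `speed` (0 < speed <= 1)."""
    return (sample_rate_hz / 2.0) * speed


if __name__ == "__main__":
    for rate in (48000, 96000, 192000):
        for speed in (0.5, 0.25):
            top = max_frequency_after_varispeed(rate, speed)
            print(f"{rate} Hz at {speed}x -> content up to {top:.0f} Hz")
```

A 48 kHz recording tops out at 12 kHz at half speed and 6 kHz at quarter speed, while a 192 kHz recording at quarter speed still reaches 24 kHz.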
I'm the original/lead developer of Ardour, a cross-platform DAW, and have been working with digital audio for more than 25 years.
There are no 32 bit DACs - your SDD MixPre's are giving you (at best) 22 bits packaged as a 32 bit float value. The preamps make absolutely zero difference to the DA conversion (though they might sound real nice).
> Surely you understand a recording made at 48kHz has a max freq response of 24kHz and played at half speed that max freq is 12kHz
This is a very naive version of what "played at half speed" might actually mean. If properly and correctly resampled, this is not true.
> And when I load a 192kHz recording to izotope RX I can literally see the harmonics going up to 96kHz
Well, I'd certainly hope so! But the question is: what are the energy levels associated with the partials above Nyquist? If you recorded at 384kHz with sensitive enough equipment, you'd see partials above 96kHz - but at extremely low energies because ... well, that's just how physics works.
The half speed you call naive is again just showing your ignorance. Sound editors have been using this technique since the days of recording on a Nagra at 15ips and literally replaying at 7.5ips half speed, and at 3.75ips for quarter speed. There is nothing naive about it, it is a very well-known technique. To be able to achieve the same result digitally with full spectrum has impacted every feature film you have experienced in recent years. Again decades of lived experience.
Yes they do, almost all high end field recorders used for film work are 32-bits now and have been for much of the last decade, often with some fancy preamp integration so that there is no expertise required for gain staging the recording. (I believe the implementations use a second matched 24bit ADC with 48 dB less gain in front of it).
The result obviously doesn't have a noise floor which is lower (as the noise of a room temperature _resistor_ gets in the way of that even at the 24-bit level) but they have more dynamic range so that your recording isn't ruined by hard clipping some unexpected loud sound.
It's a big improvement for practical usage, and also likely does improve SNR somewhat because you can run higher gains without as much fear that you'll ruin the recording. The reason it would pay off is that the SNR loss you get from splitting the signal is easily smaller than the SNR loss you would get from gain reduction to avoid clipping.
(maybe... capsule self noise is also limiting... at these levels, and usually people aren't using microphones designed for the lowest possible self noise unless they're doing something special)
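Here is a toy model of the two-converter arrangement guessed at above: one 24 bit path at normal gain, a second padded down by 48 dB, and a switch to the padded path when the first one nears clipping. This is only a sketch of the idea as described in the comment; the switching threshold is an assumption, and real recorders presumably blend the paths rather than hard-switching.

```python
FULL_SCALE = 2**23 - 1          # positive full scale of a signed 24-bit converter
PAD_DB = 48.0                   # attenuation on the second path, per the comment
PAD = 10.0 ** (PAD_DB / 20.0)   # the same attenuation as a linear ratio


def adc24(v):
    """Idealized 24-bit converter: clip to +/-1.0, then quantize."""
    clipped = max(-1.0, min(1.0, v))
    return round(clipped * FULL_SCALE) / FULL_SCALE


def dual_path_sample(v, switch_threshold=0.99):
    """Use the high-gain path unless it is near clipping, else the padded path."""
    high_gain = adc24(v)
    low_gain = adc24(v / PAD)
    if abs(high_gain) < switch_threshold:
        return high_gain
    return low_gain * PAD  # undo the pad so levels line up


if __name__ == "__main__":
    for level in (0.001, 0.5, 10.0):
        print(f"input {level}: single ADC {adc24(level):.6f}, "
              f"dual path {dual_path_sample(level):.6f}")
```

An input at ten times full scale comes back as 1.0 from the single converter but close to 10.0 from the dual path, and that value only fits in the output file because the file is floating point.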
There are ADCs that will provide 32 bits per sample but that's entirely different.
Current technology limits the bit depth to 18-22 bits and going beyond that you'd be very quickly recording brownian (atomic) noise anyway.
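A back-of-envelope check on that 18-22 bit figure, using the Johnson-Nyquist formula for thermal noise in a resistor, v = sqrt(4kTRB). The 150 ohm source, 20 kHz bandwidth, room temperature and 1 V RMS full scale are all assumptions picked for illustration; change any of them and the answer moves.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K, exact SI value


def johnson_noise_vrms(resistance_ohm, bandwidth_hz, temperature_k=293.15):
    """RMS thermal noise voltage of a resistor over the given bandwidth."""
    return math.sqrt(4.0 * BOLTZMANN * temperature_k * resistance_ohm * bandwidth_hz)


def equivalent_bits(noise_vrms, full_scale_vrms=1.0):
    """Bit depth whose ideal quantization SNR matches this noise floor."""
    snr_db = 20.0 * math.log10(full_scale_vrms / noise_vrms)
    return (snr_db - 1.76) / 6.02


if __name__ == "__main__":
    noise = johnson_noise_vrms(150.0, 20000.0)
    print(f"thermal noise: {noise * 1e6:.3f} uV RMS")
    print(f"equivalent resolution: {equivalent_bits(noise):.1f} bits")
```

Under these assumptions the resistor alone contributes roughly 0.2 µV of noise, which lands at around 22 bits, so the last couple of bits of a 24 bit word are already below the thermal floor before any other noise source is counted.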
The point about 32 bit float is that it is a useful format for mixing, editing and general processing, so it is widely used in digital audio tools. But it is not a format that ADCs generate "natively" via their electronics - almost all of them generate a 24 bit integer or fixed point value and then just supply that as a 32 bit float value because the software asked for it (the software could have done it all by itself).
[EDITED: DAC->ADC since that is what I meant and what this is all about]
so maybe they do sample at 24 bit at a well chosen gain level and then convert to 32 bit float, with the max 24 bit value being above 1.0 float
or as GP said, use two separate ADCs at two different gains and combine their output
Of course it does! And that's what it does, of course. But that has absolutely nothing to do with the AD process itself, which is chip-limited to 24 bits and likely physics-limited to somewhat less than that.
You can't beat the physical limit of a DA circuit by doubling them up at different gains.
And .. you don't want to. Going beyond 22 bits gets you into brownian noise pretty quickly, which is completely pointless.
The best you can do (or could do) is get a very, very, very good DA that can really do 22 bits (likely not commercially available because of the expense), and then get the samples from it in whatever format works best for your purpose (24 bit integer, some fixed point value, or 32 bit floating point).
but what if you "allow" double that voltage and call it 2.0 float? a strong pressure into the microphone generates a stronger voltage
thermal noise limits you on the quiet signals, but not on the powerful ones
so 22 bit for -1.0 -> 1.0 range and you can add a few more bits on top of that for stronger audio pressures (voltages) which you would traditionally clip
> Nobody uses 32 bit float for recording (to do so is just to capture at least 10 bits of noise, most of that being brownian);
This is not true and not true for a good and important reason! One which has no bearing on the kind of DACs that exist.
Modern field recorders allow gains set at a 'reasonable' level that maximizes SNR for recordings but still won't clip when there are much louder peaks. Not so dissimilar to how a 6-digit multimeter can achieve its advertised performance both on a 0-5v range and a 0-300v range.
Obviously, everyone and their mother uses 32 bit float as an internal sample format because of its fitness for purpose (except the folks who think they need 64 or 80 bit floating point, of course). But they are not using "32 bit floating point samples" - the samples come from an (at best) 18-22 bit integer conversion.
It kind of changed me a bit when I ran through 20 lossless tracks I had re-encoded to various mp3 bitrates and realized that even on a fancy system, it can be really hard if not impossible to discern even moderate lossy from lossless.
If you are an audiophile geek, really think about if you want to try this, the reality check might crack your foundations.
[1]https://www.foobar2000.org/components/view/foo_abx
We store files in the highest quality because it gives us the option to encode the music without audible loss of quality.
[1]: https://www.cnn.com/2023/03/30/world/plants-make-sounds-scn
However, the article claims that the final distribution doesn’t need to have a bit depth of more than 16. That does not match my experience. I can tell the difference between my renders that are 16 bit vs 24 bit. I cannot tell the difference between 44.1 kHz and higher sample rates, and that’s consistent with the math (Nyquist-Shannon), but bit depth is a different matter. Would be fun to participate in a double-blind test that includes my own tracks and others.
established using double blind testing, I assume?
Also, sadly consumers are getting used to low quality audio nowadays - they often listen to lossily compressed audio on social media (sometimes decompressed and re-compressed several times) which is then re-compressed to send to bluetooth headphones, or played back on awful smartphone speakers. Streaming services also use compressed audio.
(2014) https://news.ycombinator.com/item?id=8689231 424 comments
(2015) https://news.ycombinator.com/item?id=10520639 228 comments
(2017) https://news.ycombinator.com/item?id=15127633 428 comments
(2019) https://news.ycombinator.com/item?id=19318898 314 comments
And it's all good! It's perfectly fine to say "I prefer the sound when the whole mix (or just that guitar) ends up being subject to interesting and possibly harmonically relevant distortion at low levels".
Just don't say "The version with the distortion is more accurate than the one without", because that's a lie.
You can then focus on other things.
Example: I bought the best skis possible. Now I know I need to just focus on my skills and not blame the equipment.
As for how this relates to audio compression, in particular in the context of 2012, you are making a tradeoff of storage size and decompression cost. Maybe that doesn't matter to you, but maybe it either did in 2012 or still does.
The problem is the people spreading myths and disinformation out of ignorance or to promote their enterprise.
The weak links are producers/mastering-engineers, speakers/headphones and the room when using speakers.
I'm not interested in finetuning everything in my life for efficiency.
https://video.xiph.org/vid2.shtml
or on YT if you can't play it https://www.youtube.com/watch?v=cIQ9IXSUzuM
There's multiple YouTube channels that I listen to as podcasts, that are professionally created and the creators presume that exported audio works like studio audio, so what you end up with is really quiet audio that can't be turned up without pre-processing.
If we distributed audio the same way we work with it in a studio, we could forgo a lot of problems.
Also, the human ear does have enough dynamic range to make 24 bits worthwhile, though that much dynamic range is rarely used in recordings, and that high of a bit depth provides no benefits within a small dynamic range. A 192 kHz sample rate, on the other hand, is always useless.
I use a DAC by Focusrite which can do 24-bit, and if I want to listen to higher fidelity audio on my planar headphones then I should be able to. Why should I limit myself to 16-bit?
If I like an artist that I find on streaming, I buy an LP and get a lossless download for free. I still have a music library and I will never rent my favorite music.
Artists prefer to connect directly with their fans and BC is probably the best platform for people who care to pay and support acts directly. They have high res downloads and I import them.
Also the playback rate and the file rate are different topics. The former can get into scenarios more like the audio processing section of the article, e.g. I had this one shitty headset for work which required me to set the volume to 1-2 (out of 100) on the computer, and I could actually tell in a blind test whether it was in 16 bit or 24 bit mode because it was cutting and boosting it so much it effectively lost precision in 16 bit mode.
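The precision loss in that headset story can be estimated directly: every halving of digital gain throws away one bit. The sketch below assumes the volume slider maps linearly to amplitude, which many operating systems do not do, so the exact figures are illustrative only.

```python
import math


def bits_after_attenuation(bit_depth, linear_gain):
    """Effective bits left after scaling samples by linear_gain (0 < gain <= 1)."""
    return bit_depth - math.log2(1.0 / linear_gain)


if __name__ == "__main__":
    for volume_percent in (100, 10, 2, 1):
        gain = volume_percent / 100.0
        print(f"volume {volume_percent:3d}: "
              f"16-bit path keeps {bits_after_attenuation(16, gain):.1f} bits, "
              f"24-bit path keeps {bits_after_attenuation(24, gain):.1f} bits")
```

At 1% of linear full scale a 16 bit path is left with a little over 9 bits, while a 24 bit path still has a little over 17, which fits with the difference being audible once the headset boosts the signal back up.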
It would have cost the same for the entire stack to be 16bit/44.1kHz at every step, but with excessive resolution I can control the volume anywhere. The bits right before the analog conversion at the end are essentially the same whether I turn down the volume in the software player, the operating system, or the DAC/amplifier.
When I play from the computer, I'm not sure whether it is using the clock on my Mac, the clock on the optical interface, or the WiiM's clock. However, I do not notice any difference in fidelity when I use the Qobuz software player on my Mac or use Qobuz Connect to allow the player to directly stream from the source, so either it isn't a difference that I can hear, or the WiiM's internal clock is used for both sources.
I can always tell if my 44.1 songs are being resampled to 48 because they're being run through the OS mixer
But a quality audio player should account for this and do its own.
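Part of why 44.1 to 48 kHz conversion is harder to do well than a simple doubling is the awkward ratio involved. A quick sketch with the standard library (the function name is made up):

```python
from fractions import Fraction


def resample_ratio(src_hz, dst_hz):
    """Exact output/input sample ratio in lowest terms."""
    return Fraction(dst_hz, src_hz)


if __name__ == "__main__":
    print("44.1k -> 48k:", resample_ratio(44100, 48000))
    print("48k -> 96k:", resample_ratio(48000, 96000))
```

44.1 kHz to 48 kHz reduces to 160/147, so a textbook rational resampler conceptually upsamples by 160 and decimates by 147, whereas 48 kHz to 96 kHz is a plain factor of 2. How cleanly a given OS mixer or player implements the former is an implementation detail, not something this sketch measures.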
It is an incredible resource to see the quality of the resampling algorithms used by the actual production software likely used in any digital audio workflow.
You will see that while the best are indeed almost 100% transparent, many are not.
your software is among the best, but not pitch black best :)
Would be happy to see an actual, real study to prove that humans can notice, but to my knowledge none exist that confirm they can. Not even any on teenagers or younger (the only group that can even hear close to 20 kHz).
A quick search returned this PDF with a nice diagram of what aliasing looks like: https://download.tek.com/document/76W_30631_0_HR_Letter.pdf
To draw a design parallel: pixel-perfect design isn't something we are born with, noticing tiny details is a developed skill.
And yes, you are on point: oversampling is used extensively, but this just points at the exact issue: Nyquist theorem gave us a math algorithm, we still need to account for the electronic component imperfections. And then we are entering a different space of quality/precision/psychoacoustics/perception/etc. Meaning, not all converters, not all pre-amps, not all mics "sound" the same, even when they use same types of components on paper.
Do you have more convincing sources?
Now in terms of realistic audio encoding, 16 bit at 44.1 kHz is designed to be a faithful representation as far as human hearing is concerned. Can someone with a trained ear potentially tell the difference between that and 24 bit at 192 kHz? In a studio environment it's possible. Most audiophile claims are dubious, and a blind A/B test catches most of them out. But the Nyquist-Shannon sampling theorem does not directly apply to quantized samples: it is about exact samples, and once you quantize, the sampling rate becomes somewhat intertwined with the quantization depth.
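The usual textbook numbers behind this, as a quick sketch. It assumes an ideal uniform quantizer, a full-scale sine, and quantization noise that is white and uncorrelated with the signal (which dither is normally used to ensure); the function names are mine.

```python
import math

def ideal_sqnr_db(bits):
    """Signal-to-quantization-noise ratio of a full-scale sine: ~6.02*N + 1.76 dB."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

def oversampling_gain_db(oversampling_ratio):
    """In-band noise reduction from plain oversampling (no noise shaping)."""
    return 10 * math.log10(oversampling_ratio)

for bits in (16, 24):
    print(f"{bits}-bit: {ideal_sqnr_db(bits):.2f} dB")
print(f"4x oversampling: +{oversampling_gain_db(4):.2f} dB in band")
```

With those assumptions 16 bits gives about 98 dB and 24 bits about 146 dB, and each 4x of plain oversampling buys roughly 6 dB (one bit) in the audio band, which is the sense in which rate and depth trade against each other.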
The human threshold-of-hearing curve intersects the threshold-of-pain curve at about 20 kHz.
Above that frequency (or thereabouts) the sound has to be so loud that it will literally instantly damage your hearing before you can hear it.
This has been replicated across many studies for more than 100 years.
Flicker threshold is completely different. You can’t damage your vision by increasing the FPS, and it has always been commercially desirable to use a lower frequency because that is cheaper.
In addition, nobody cares about "measurable" artifacts (or rather, they should not). What matters are "audible" artifacts.
Artifacts do not sum linearly, because they do not originate from correlated sources (unless you're doing something rather unusual).
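To put numbers on that, a small sketch under the idealized assumption of N sources of equal level (the function name is mine): fully correlated sources add in amplitude, 20*log10(N), while uncorrelated ones add in power, 10*log10(N).

```python
import math

def summed_level_db(level_db, n_sources, correlated):
    """Level of n equal-level sources after summing.

    Correlated sources add in amplitude; uncorrelated sources add in power.
    """
    factor = 20 if correlated else 10
    return level_db + factor * math.log10(n_sources)

artifact_db = -100.0   # assumed per-track artifact level, purely illustrative
tracks = 300
print(f"uncorrelated: {summed_level_db(artifact_db, tracks, False):.1f} dB")
print(f"correlated:   {summed_level_db(artifact_db, tracks, True):.1f} dB")
```

For 300 tracks that is the difference between roughly +25 dB and +50 dB over a single track's artifact level.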
Glad you can hear the difference between two converters, but I trust you've tested it in a double blind setting?
Who has the best ears? What can they detect?
I know from my 20-ish year mixing experience that I can hear the difference when mixing. Is it good evidence? No. So we can agree to disagree then.
https://ardour.org/ is my website.
The energy of the signal components above the Nyquist is generally very low, and very few double blind tests have given any indication that humans can detect the resulting aliasing (even though many people claim to be able to do so, almost always in non-double-blind environments).
Badly written digital synthesis can generate high energy signal components above 22kHz, but that's because they're badly written, not because the theory is wrong.
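As an illustration of the "badly written" case, here is a sketch of where the harmonics of a naive (non-band-limited) sawtooth land after sampling. It assumes ideal 1/k harmonic amplitudes and a 20 kHz audibility limit, and it ignores masking; the function names are hypothetical.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone at f_hz shows up after sampling at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

def strongest_audible_alias(f0_hz, fs_hz, audible_hz=20000):
    """First (loudest) saw harmonic above Nyquist that folds below audible_hz.

    Returns (harmonic number, level in dB relative to the fundamental,
    frequency it lands on). Assumes f0_hz is below audible_hz.
    """
    k = int(fs_hz / 2 // f0_hz) + 1          # first harmonic above Nyquist
    while alias_frequency(k * f0_hz, fs_hz) >= audible_hz:
        k += 1
    return k, -20 * math.log10(k), alias_frequency(k * f0_hz, fs_hz)

for fs in (48000, 192000):
    k, level_db, lands_at = strongest_audible_alias(1000, fs)
    print(f"fs={fs}: harmonic {k} lands at {lands_at} Hz, {level_db:.1f} dB")
```

For a 1 kHz naive saw at 48 kHz, the first harmonic to fold back under 20 kHz is the 29th, landing at 19 kHz only about 29 dB below the fundamental.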
This space is not driven by a single precise formula. 48/96 kHz helps some engineers to produce better sounding mixes. Can everyone hear the extended range of Adam tweeters? Probably not. But some can, and they benefit from that. Even if there is no double-blind study to prove this in absolute terms.
But very little music is like that, and the energy profile above Nyquist will differ dramatically. Consequently, you're not summing a set of identical aliasing results, and in general, the results will still be undetectable to almost everyone.
Jacob Collier routinely works with 300+ tracks in Logic. He doesn't worry about this sort of thing, and neither do the Grammy voters who love what he does.
My favourite: "audiophile-grade" audio players which allocate a single continuous buffer of RAM into which they load/decode the whole .WAV/.FLAC file, because supposedly the CPU "jumping" between "fragmented audio" causes audible "jitter".
Of course, they don't know that what looks like continuous memory to user-code is probably discontinuous in kernel/physical RAM.
I haven't checked in many years; I wonder if they created kernel-level players to account for that, to have "true continuous memory".
[1] https://www.audioasylum.com/messages/pcaudio/119979/
How this would occur without also producing grossly audible pitch distortion never seems to be discussed.
Thanks for the laugh... this is absolutely bonkers. In case anyone is wondering, before sound hits our ears it has to go through a digital to analog conversion, which takes place on hardware independent of the CPU, operating with its own clock and buffers etc.
It gobbled like 90% of the CPU and I had to make sure I gave it a pretty large buffer so it didn't stutter when an app claimed CPU for more than a second, but it worked.
Also my headphones are extremely sensitive. I can touch the ring and sleeve of a jack with a finger, and touch a metal bed frame with a tip and I hear quiet clicks as I move the tip along the metal. Sometimes I do not even need to touch the jack with a finger. It doesn't work with small objects like a knife though.
So I guess the programmer equivalent is distributing .pdb's (or, symbols)
Most modern digital synths have already caught onto this and run internally at much higher sampling rates even if their output gets downsampled, but sometimes you run across a vintage plugin that runs at the host audio rate and working in a higher sampling rate is audible.
Oversampling gives you headroom for aliases for the rest of the synth that is more vulnerable to it.
* Some people are still making this mistake, despite information on the (many) ways to do it the right way being widely and freely available!
There are some ways to do band-limited distortion but... they aren't nearly as widespread, easy, or universal as band-limited oscillators.
Ring modulation is funny though because you'd ideally want the sidebands to modulate down by default rather than filter them out, that's why you're using it.
So if your synthesizers do not use proper band-limited oscillators then 192KHz is _FAR_ too slow. You'd want to be running at hundreds of KHz, perhaps a few MHz.
In reality, synth software that doesn't sound like crap uses band-limited oscillators and should work okay at 48KHz too. That said, even if the oscillators are band limited, it may be the case that the various modulations aren't band limited properly, as getting those wrong won't sound instantly wrong (in particular because you have to modulate to make it wrong, and the underlying change of the modulation may make it harder to tell it's wrong).
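For reference, the simplest (and slowest) way to band-limit an oscillator is additive synthesis: only sum the harmonics that fit below Nyquist. This is a sketch for illustration, not how production synths do it (they typically use cheaper techniques such as BLEP or wavetables); the names are mine.

```python
import math

def harmonic_count(f0_hz, fs_hz):
    """Number of harmonics of f0_hz that lie strictly below Nyquist."""
    k = int(fs_hz / 2 // f0_hz)
    if k * f0_hz >= fs_hz / 2:    # exclude a harmonic sitting exactly on Nyquist
        k -= 1
    return k

def bandlimited_saw(f0_hz, fs_hz, n_samples):
    """Falling sawtooth built from its Fourier series, truncated at Nyquist."""
    k_max = harmonic_count(f0_hz, fs_hz)
    out = []
    for n in range(n_samples):
        phase = 2 * math.pi * f0_hz * n / fs_hz
        partials = sum(math.sin(k * phase) / k for k in range(1, k_max + 1))
        out.append(2 / math.pi * partials)
    return out

wave = bandlimited_saw(1000, 48000, 96)   # two periods of a 1 kHz saw
print(harmonic_count(1000, 48000), "harmonics,", len(wave), "samples")
```

Because nothing above Nyquist is ever generated, there is nothing to fold back; the cost is one sine per harmonic per sample, which is why real implementations reach for cheaper approximations.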
Though also in those cases if you're not counting on every step being properly band limited then 192KHz may be an improvement but you're still probably getting some meaningful aliasing. I think given how fast computers have become relative to digital audio there is probably a good case to just make any "modular synth" run at 32-bit 480KHz or even 4.8MHz through every stage that could process the audio.
Maybe 192KHz really is enough to suppress the aliasing artifacts but I think to be convinced of that I'd want to see a system that supported both and validate that the difference between a downsampled 48KHz output from the two modes was below -90dB or something.
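The comparison itself is simple enough to sketch: a null test that reports the residual between two renders relative to the reference. This assumes both renders are already time-aligned, gain-matched, and at the same sample rate, which is the hard part in practice; `difference_db` is a hypothetical helper.

```python
import math

def difference_db(reference, candidate):
    """Energy of (reference - candidate) relative to reference, in dB."""
    if len(reference) != len(candidate):
        raise ValueError("renders must have the same length")
    residual = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    energy = sum(r * r for r in reference)
    if energy == 0:
        raise ValueError("reference is silent")
    if residual == 0:
        return float("-inf")      # bit-identical renders null completely
    return 10 * math.log10(residual / energy)

# Toy data: a sine and a copy that is off by one part in 100,000.
ref = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(4800)]
alt = [s * (1 + 1e-5) for s in ref]
print(f"residual: {difference_db(ref, alt):.1f} dB")
```

Anything coming out below about -90 dB would meet the bar suggested above.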
Or otherwise you can just declare that the aliasing is part of the sound and then there are no right choices... 24khz sampling, 48k, 192k ... who cares, use what you like best. :)
192 for mixing and mastering can be useful especially if you're doing a lot of effects, especially anything that pitch shifts. But I've seen low quality phone-microphone recordings make it to the master; if you capture lightning in a bottle, it hardly matters what the settings were, what the microphone was, or anything else.
To try to imagine something similar: the human eye is unable to see UV light, yet fluorescent paint has a visible quality of its own compared to "normal" pigments.
this has practical applications
Some previous discussions:
2023 https://news.ycombinator.com/item?id=34698427
2022 https://news.ycombinator.com/item?id=30138561
2019 https://news.ycombinator.com/item?id=19318898
2017 https://news.ycombinator.com/item?id=15127633
2015 https://news.ycombinator.com/item?id=10520639
2014 https://news.ycombinator.com/item?id=8689231
2012 https://news.ycombinator.com/item?id=3668310
It's a bit redundant for a skilled technician, who is already used to setting the gain staging, inbound compression, and feathering the mics to avoid this in 24-bit, but if you're handing a boom mic to a novice and have a scene where e.g. someone's whispering and another person's screaming, it can be nice to not have to worry about it.
Don't forget to buy the new low oxygen platinum plated HDMI cables for the better experience!
/s