As I sit here listening to some big symphonic music playing on my ‘KEF’ DSP-based active crossover stereo system, I am struck by the thought: how could it be any better?

I sometimes read columns where people wonder about the future of audio, as though continuous progress is natural and inevitable – and as though we are accustomed to such progress. But it does occur to me that there is no reason why we cannot have reached the point of practical perfection already.

I think the desire for exotic improvements over what we have now has to be seen within the context of most people having not yet heard a good stereo system. They imagine that if the system they heard was expensive, it must therefore represent the state of the art, but in audio I think they could well be wrong. Some time ago, the audio industry and enthusiasts may even have subconsciously sensed that they were reaching a plateau and begun to stall or reverse progress just to make life more interesting for themselves.

At the science fiction level, people dream of systems that reproduce live events exactly, including the acoustics of the performance venue. Even if this were possible, would it be worth it without the corresponding visuals (and smells, temperature, humidity, etc.)?

Something like it could probably be achieved using the techniques of the computer games industry: synthesis of the acoustics from first principles, headphones with head tracking, or maybe even some system of printed transducer array wall coverings that could create the necessary sound fields in mid-air if there were no furniture in the room (and the audio industry, being what it is, would also supplement the system with some conventional subwoofers). My prediction is that you would try it a couple of times, find it a rather contrived, unnatural experience, and next time revert to your stereo system with two speakers.

On a more practical level, the increasing use of conventional DSP is predicted. We are now seeing the introduction of systems that aim to reduce the (supposedly) unwanted crosstalk that occurs with stereo speakers, where each ear also hears the speaker on the opposite side. The idea is to send out a slightly attenuated antiphase impulse from one speaker for every impulse from the other speaker, timed to cancel out the crosstalk at the ‘wrong ear’. The system then needs to send out an anti-antiphase impulse from the first speaker to cancel out that correction as it, in turn, reaches the other ear, and so on. My gut instinct is that this will only work perfectly at one precise location, and at all other locations there will be ‘residue’ possibly worse than the crosstalk. In fact we don’t seem bothered by the crosstalk from ordinary stereo – I am not convinced we hear it as “colouration”. Maybe it results in a narrowing of the width of the ‘scene’, but with the benefit of increasing its stability. (Hand-waving justification of the status quo, maybe, but I have tried ambiophonic demonstrations, and I was eventually happy to go back to ordinary stereo).
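For the curious, the chain of impulse and counter-impulse described above can be sketched as a toy model. This is only an illustration under heavy assumptions, not how any commercial system is implemented: each speaker is assumed to reach its near ear at full level with no delay, and the far ear attenuated by a factor g and delayed by d samples. The function names, the value of g, the delays and the number of correction terms are all made up for the example. Heads, rooms and frequency dependence are ignored entirely.

```python
def canceller_feeds(g, d, n_terms, length):
    """Speaker feeds intended to deliver one impulse to the left ear only.

    g       -- assumed crosstalk gain to the far ear (0 < g < 1)
    d       -- assumed extra delay to the far ear, in samples
    n_terms -- how many impulses in the correction chain (hypothetical cut-off)
    """
    left = [0.0] * length
    right = [0.0] * length
    for k in range(n_terms):
        t = k * d
        if t >= length:
            break
        amp = (-g) ** k          # 1, -g, +g^2, -g^3, ...
        if k % 2 == 0:
            left[t] += amp       # original impulse and the 'anti-antiphase' terms
        else:
            right[t] += amp      # the antiphase terms
    return left, right


def at_ears(left, right, g, d):
    """What each ear receives: near speaker directly, far speaker scaled by g and delayed by d."""
    n = len(left)
    ear_l = [0.0] * n
    ear_r = [0.0] * n
    for t in range(n):
        ear_l[t] = left[t] + (g * right[t - d] if t >= d else 0.0)
        ear_r[t] = right[t] + (g * left[t - d] if t >= d else 0.0)
    return ear_l, ear_r


def peak(signal):
    """Largest absolute sample value."""
    return max(abs(x) for x in signal)


# Feeds designed for a listener whose crosstalk delay is 3 samples.
G, D_DESIGN = 0.7, 3
left_feed, right_feed = canceller_feeds(G, D_DESIGN, n_terms=8, length=40)

# Listener exactly where the system expects them to be.
sweet_l, sweet_r = at_ears(left_feed, right_feed, G, D_DESIGN)

# Listener has moved, so the real crosstalk delay is now 4 samples.
moved_l, moved_r = at_ears(left_feed, right_feed, G, D_DESIGN + 1)

print("wrong-ear peak at the design position:", peak(sweet_r))
print("wrong-ear peak after moving:", peak(moved_r))
```

In this toy model the ‘wrong ear’ is silent at the design position, apart from rounding error. Once the delay no longer matches, the correction impulses stop lining up with the crosstalk they were meant to cancel, and both arrive as separate events. It proves nothing about real products, but it does show why the result might depend on sitting in one precise spot.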

Other predictions include the increasing use of automatic room correction, ultra-sophisticated tone controls and loudness profiles that allow the user to tailor every recording to their own preferences.

Tiny speakers will generate huge SPLs flat down to 20 Hz – the Devialet Phantom is the first example of this, along with the not-so-futuristic drawback of needing to burn huge amounts of energy to do it. Complete multi-channel surround envelopment will come from hidden speakers.

At the hardware fetish end, no doubt some people imagine that even higher resolution sample rates and bit depths must result in better audible quality. Some people probably think that miniaturised valves will transform the listening experience. High resolution vinyl is on the horizon. Who knows what metallurgical miracles await in the science of audio interconnects?

For the IT-oriented audiophile, what is left to do? Multi-room audio, streaming from the cloud, complete control from handheld devices are all here, to a level of sophistication and ease of use limited only by the ‘cognitive gap’ between computer people and normal human users that sometimes results in clunky user interfaces. The technology is not a limiting factor. Do you want the album artwork to dissolve as one track fades out and the new artwork to spiral in and a CGI gatefold sleeve to open as the new track fades in? The ability to talk to your device and search on artist, genre, label, composer, producer, key signature? Swipe with hand gestures like Minority Report? Trivial. There really is no limit to this sort of thing already.

In fact, for the real music lover, I don’t think there is anything left to do. Truth be told, we were most of the way there in 1968.

The basic test is: how much better do you want the experience of summoning invisible musicians to your living room to be? I can’t imagine many worthwhile improvements over what we have now. The sound achievable from a current neutral stereo system is already at ‘hologram’ level; the solidity of the phantom image is total – the speakers disappear. It isn’t a literal hologram that reproduces the acoustics in absolute terms, allowing you to walk around it, of course, but it is a plausible ‘hologram’ from any static listening position, allowing you to ‘walk around it’ in your mind, and it stays plausible as you turn your head.

It isn’t complete surround envelopment, but there is reverberation from your own room all around you, and it seems natural to sit down and face the music. You will hear fully-formed, discrete, musical parts emerging from an open, three dimensional space, with acoustics that may not be related to the space you are listening in. You have been transported to a different venue – if that is what the recording contains. In terms of volume and dynamics, a modern system can give you the same visceral dynamics as the real performance.

And all this is happening in your living room, but without any visuals of the performance – it is music that you are wanting to listen to after all. If the requirement is to experience a literal night at the opera, then short of a synthesised Star Trek type ‘holodeck’ experience you will be out of luck.

You could always watch a high resolution DVD of some performance or the BBC’s Proms programmes, for example, and such visuals may give you a different experience. They will, however, destroy the pure recreation of the acoustic space in front of you because, by necessity, the visuals jump around from location to location, scene to scene in order to maintain the interest level, and your attention will be split between the sound and the imagery. Anyway, a huge TV will cost you about £200 from Tescos these days so that aspect is pretty well covered, too.

The natural partner to a huge TV is multi-channel surround sound. Quadraphonic sound seemed like the next big thing in the 1970s, but didn’t take off at the time. We now have five or seven channel surround sound. Does this improve the musical experience? Some people say so, but that could just be the gimmick factor, or an inferior stereo system being jazzed up a bit. While the correlation between two good speakers produces an unambiguous ‘solution’ to the equations thereof, multiple sources referring to the same ‘impulse’ could result in no clear ‘solution’ – that is, a fuzzy and indistinct ‘hologram’ that our ears struggle to make sense of. Mr. Linkwitz surmises something similar in the case of the centre speaker, plus he finds it visually distracting; with just two speakers, the space between them becomes a virtual blank space in which it is easier to imagine the audio scene. Most recordings are stereo and are likely to remain that way with a large proportion of listeners using headphones. For these reasons, I am happy that stereo is the best way to carry on listening to music.

Can DSP improve the listening experience further? Hardly at all I would say. So-called ‘room correction’ cannot transform a terrible room into a great one, and it doesn’t even transform a so-so one into a slightly better one. It starts from a faulty assumption: that human hearing is just a frequency response analyser for which real acoustics (the room) are an error, rather than human hearing having a powerful acoustics interpreter at the front end. If you attempt to ‘fix’ the acoustics by changing the source you just end up with a strange-sounding source. At a pinch, the listener could listen in the near(er) field to get rid of the room, anyway.

I am convinced that the audiophile obsession with tailoring recordings to the listener’s exact requirements is a red herring: the listener doesn’t want total predictability, and a top notch system shouldn’t be messed about with. As a reviewer of the Kii Three said:

…the traditional kind of subjective analysis we speaker reviewers default to — describing the tonal balance and making a judgement about the competence of a monitor’s basic frequency response — is somehow rendered a little pointless with the Kii Three. It sounds so transparent and creates such fundamentally believable audio that thoughts of ‘dull’ or ‘bright’ seem somehow superfluous.

The user doesn’t have access to the individual elements of the recording. What can be done in terms of, say, reducing the volume of the hi-hats (or whatever) is crude and unnatural and bleeds over every other element of the recording. The only chance of reproducing a natural sound, maintaining the separation between fully-formed elements and reproducing a three dimensional ‘scene’, is for the system to be neutral. When this happens, the level of the hi-hats likely becomes just part of the performance. Audiophiles who, without any caveat, say they want DSP tone controls in order to fiddle about with recordings have already given up on that natural sound.

In summary, I see the way music was ‘consumed’ 40 or even 50 years ago as already pretty much at the pinnacle: two large speakers at one side or end of a comfortably-furnished living room, filling the space with beautiful sound – at once combining compatibility with domestic living and the ability to summon musicians to perform in the space in a comprehensible form that one or several people can enjoy without having to don special apparatus or sit in a super-critical location. And the fitted carpets of those times were great for the acoustics!

All that has happened in the meantime is just the ‘mopping up’ of the remaining niggles. We (can) now have better performance with respect to distortion, frequency response, dynamic range, and a more solid, holographic audio ‘scene’; no scratches and pops; instant selection of our choice of the world’s total music library. The incentives for the music lover to want anything more than this are surely extremely limited.