The Challenge of Immersive Sound


Tue, 09/17/2013 - 15:22 -- Michael Karagosian

By Michael Karagosian  

Immersive sound promises an enhanced experience in digital cinema.  But being the new kid on the block, it’s also the most misunderstood technology in cinema, leading to poor decisions by industry leaders.  The goal of this article is to educate the industry about the challenges that immersive sound presents and to present my arguments as to why the rushed actions of the National Association of Theatre Owners (NATO) and Digital Cinema Initiatives (DCI) have the potential to harm, rather than help, the first-release motion picture industry.

Immersive sound, in my definition, is a broad category describing any system designed to convey a soundfield that goes beyond a planar soundfield.  The everyday examples of planar soundfields are 5.1 and 7.1 sound, where a horizontal line of speakers is situated behind the screen, and another horizontal array of speakers surrounds the audience.  When one or more vertically placed speaker elements are added, we then are moving into so-called 3D sound, or what is now commonly called immersive sound.

There are several ways in which immersive sound can be stored in the distribution files received by the cinema.  The most direct way is to use channel-based sound, in which every speaker, or elemental speaker system, is directly coupled to a recorded channel of sound.  An example of an immersive but purely channel-based approach is that taken by Auro3D.  Dolby Atmos goes a step further and utilizes both object-based sound and channel-based sound.  Object-based sound is the technique of taking snippets of audio, called objects, and coupling those objects with metadata that instructs a rendering engine how to render each sound object in the auditorium using a specific speaker array.  Normally, the object is dynamically rendered, such that it moves across the intended speaker array in the manner that the sound mixer desires.  The value of incorporating object-based sound in an immersive sound format is that it supports a very large number of speaker channels without burdening the distribution with a very large number of recorded sound channels.  In presenting these two important approaches to immersive sound – channel-based and object-based – there is no intent on my part to assign a higher value to either Auro3D or Dolby Atmos.  They are simply different approaches to solving the many challenges evident with immersive sound.
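The object-plus-metadata idea described above can be sketched in a few lines of Python.  This is a minimal illustration only, not the rendering method of Atmos or any other product: the `AudioObject` class, the room coordinate convention, and the distance-based panning rule in `speaker_gains` are all assumptions made for the example.  It shows one snippet of audio carrying a position, and a renderer turning that position into one feed per speaker of a given array.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# (x, y, z) room coordinates; a hypothetical convention chosen for this sketch.
Point = Tuple[float, float, float]


@dataclass
class AudioObject:
    """A snippet of audio plus the position metadata that travels with it."""
    samples: List[float]
    position: Point


def speaker_gains(position: Point, speakers: Sequence[Point],
                  rolloff: float = 2.0) -> List[float]:
    """Assumed panning rule: nearer speakers get more gain.

    Gains are normalized so that their squares sum to 1 (constant power).
    """
    weights = [1.0 / (math.dist(position, spk) ** rolloff + 1e-6)
               for spk in speakers]
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]


def render(obj: AudioObject, speakers: Sequence[Point]) -> List[List[float]]:
    """Produce one feed per speaker from a single recorded object."""
    gains = speaker_gains(obj.position, speakers)
    return [[g * s for s in obj.samples] for g in gains]


# Two screen-level speakers and one overhead speaker (hypothetical layout).
speakers = [(-1.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
overhead_object = AudioObject(samples=[0.5, -0.5, 0.25],
                              position=(0.0, 0.0, 1.0))

gains = speaker_gains(overhead_object.position, speakers)
feeds = render(overhead_object, speakers)
```

Only one channel of audio is stored for the object, yet the renderer produces as many speaker feeds as the array has speakers.  That is the economy of object-based distribution, and it also shows why the result depends on both the rendering rule and the speaker array.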

There is another aspect to immersive sound that also must be considered, and that’s the preservation of creative intent.  By creative intent, I refer to the artistry of the movie director, editor, cinematographer, and sound designer.  I will argue that the primacy of first release in the cinema requires not only the preservation of the first release window, but also the preservation of creative intent.  The preservation of creative intent in the cinema is the driver that encourages the artist to incorporate one’s best efforts in the production of a movie.  Any dilution of the ability of the cinema to preserve creative intent simply devalues the cinema as the primary vehicle for first release movies.

I will also argue that creative intent is not preserved in any other form of release.  For example, in cinema, it would be sacrilegious to say that the exhibitor should be allowed to resize images, resize aspect ratios, sharpen images, and/or alter color and contrast to one’s liking.  However, such controls are readily available to the user on every TV set.  Similarly, it would be sinful to say that the exhibitor should be encouraged to adjust the spatiality of sound in the cinema auditorium at will.  Consumers can adjust such things at will on their home playback systems, but cinemas do things differently.  Cinemas engage in an industry-defined set-up process to deliver the creative intent of the artist to the audience in the best possible manner.  However, the recent and hasty actions of NATO and DCI have the dangerous potential of diluting the preservation of creative intent with immersive sound, and are unlikely to produce the intended results. There are two key issues to consider about immersive sound.

1. Lopsided availability of content.  This refers to the ability of exhibitors to obtain movies mixed in both Atmos and Auro3D.  As already explained, these systems are based on two fundamentally different types of immersive sound formats.  More often than not, a movie is distributed in one format, and not the other.  There are reasons for this.  To begin, exhibitors bought immersive sound systems without understanding the risks and economics associated with the availability of immersive content.  They only asked for standards in distribution after they realized their mistake.  The actions of the exhibition industry today are in stark contrast to how it dealt with the transition to digital cinema in 1999, when exhibitors recognized the need for standards ahead of making substantial investments in digital cinema equipment.  Second, pressure is placed on the distributor to provide audio mixes in both formats, but the ability to do so is not always in the control of the distributor.  Producers, for example, have been known to sign exclusive production deals that favor one format over another, tying the hands of the distributor.  Another important factor that skews uniform availability of formats is the lack of acceptable mixing tools that would reduce the cost of creating mixes for both formats.  I once asked a prominent sound mixing facility how it dealt with immersive sound mixes.  I was told that an A team worked on the 5.1 mix while a B team worked on the immersive mix, regardless of the fact that 5.1 down-mixing tools were readily available for each immersive format.  Evidently, these tools weren’t acceptable.  In the scenario described to me, a total of three mixes would be required to produce a 5.1 mix, an Auro3D mix, and an Atmos mix, placing a major strain on budgeted production times and costs.  This problem will not be addressed by a common distribution format, as described in my next point.

2. Rendering engines are not created equal.  One of the myths of immersive sound is that third-party rendering engines can accurately reproduce any object-based mix on any speaker system.  This is the core argument touted in support of a common distribution format incorporating object-based sound.  But the fact is that immersive sound cannot be rendered on just any speaker system and still preserve creative intent.

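As described previously, there are three components of object-based sound, all of which are tightly coupled: the object-based mix as represented in the distribution format, the rendering engine, and the speaker system.  This is illustrated below.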

Figure: The three tightly coupled components of an object-based immersive sound system.

At a minimum, an alteration in one block requires an alteration in at least one other block.  For example, a change in the layout and composition of an immersive speaker system requires, at a minimum, a change in the immersive rendering engine.  Likewise, a change in the way the object-based mix is represented in the distribution format requires, at a minimum, a change in either the speaker system or the rendering engine, or both.  The three elements of immersive sound must be tightly coupled for creative intent to be preserved.
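A small numeric sketch can make the coupling concrete.  It assumes a simple distance-based panning rule (`pan_gains`) and uses the energy-weighted average of speaker positions (`energy_centroid`) as a crude stand-in for where the audience hears the sound.  Both are illustrative assumptions, as are the two speaker layouts, and neither represents any commercial rendering engine.  The same object metadata is rendered, with the renderer unchanged, on an array with an overhead speaker and on a purely planar array.

```python
import math


def pan_gains(position, speakers, rolloff=2.0):
    """Assumed panning rule: gain falls with distance, constant-power normalized."""
    weights = [1.0 / (math.dist(position, s) ** rolloff + 1e-6) for s in speakers]
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]


def energy_centroid(gains, speakers):
    """Energy-weighted average speaker position: a rough proxy for the
    apparent location of the rendered sound."""
    energy = [g * g for g in gains]
    total = sum(energy)
    return tuple(
        sum(e * s[axis] for e, s in zip(energy, speakers)) / total
        for axis in range(3)
    )


# Hypothetical layouts: four speakers at ear level, with and without
# an additional speaker overhead at z = 1.0.
planar_layout = [(-1.0, 1.0, 0.0), (1.0, 1.0, 0.0),
                 (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0)]
height_layout = planar_layout + [(0.0, 0.0, 1.0)]

# The mixer's intended position for the object: nearly overhead.
object_position = (0.0, 0.0, 0.9)

planar_centroid = energy_centroid(
    pan_gains(object_position, planar_layout), planar_layout)
height_centroid = energy_centroid(
    pan_gains(object_position, height_layout), height_layout)

print("planar layout:", planar_centroid)
print("height layout:", height_centroid)
```

In this toy model, the planar array cannot place any energy above ear level, so the elevation written into the metadata is lost unless the renderer, the mix, or both are changed to suit that array.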

One of the arguments against strongly defined rendering algorithms and speaker systems for immersive sound is that precision soundfields are not delivered today:  real 5.1 sound systems do not always deliver the exact soundfield intended by the sound mixer.  This is the glass half empty, glass half full argument.  It ignores the industry’s past reliance on organizations such as THX to certify, and thus differentiate, auditoriums that offer the best ability to accurately deliver creative intent.  THX’s role may be diminished today, but only because the difference between a THX-approved and a non-THX-approved cinema has also diminished – a credit to the exhibition industry.  In the absence of cinema certification entities such as THX, the industry has relied on widely accepted, strongly defined loudspeaker layouts to ensure that creative intent is preserved.  Immersive sound, on the other hand, relies solely on branded systems, namely Auro3D and Atmos, to dictate strongly defined speaker layouts and preserve creative intent.

In response to the issues outlined, the industry reactions leave a lot to be desired.  NATO has taken the approach of lobbying for a single distribution format, investing resources towards the development of the DTS-proposed Multi-Dimensional Audio (MDA) format, which DTS would like to utilize for its introduction of immersive sound in the consumer market.  (DTS divested itself of all cinema-related assets in 2008.)  Given NATO’s strong position for the primacy of first release content, this has to be one of the oddest collaborations ever.  In addition, the six major Hollywood studios recently amended the DCI specification (yes, Martha, that’s amended, as opposed to issuing an erratum) to now require that DCI-compliant systems incorporate a standardized, optionally encrypted, immersive distribution format.  This would be a commendable step if the standardized immersive distribution format also embraced standards for rendering algorithms and strongly defined speaker systems.  But the amended DCI specification falls short of this requirement. 

There is a mechanism in digital cinema for allowing content to play only on systems that will preserve creative intent, and that’s encryption.  Rendering engines are too complicated to incorporate in the Integrated Media Blocks that are installed in projectors, so immersive sound systems with rendering engines incorporate an outboard media block.  To play the content on the system for which it is intended, a key must be available to unlock the immersive sound media block.  For branded immersive sound, the key would only be available for the sound system for which the content is intended, a simple matter to contractually require when licensing a particular brand or technology for a mix.  If the industry goes down this path – which it appears to be doing – then it would be the first time that content intended for general distribution would be locked to a particular brand of equipment.  It is one more reason why exhibitors who think that standardized distribution alone will allow a Brand A immersive mix to play on their Brand B immersive system should think again.

Sadly, NATO and DCI have jumped on the single distribution bandwagon for immersive sound without fully grasping the issues behind the preservation of creative intent.  There is no free lunch with immersive sound.  If a standardized immersive sound distribution format is introduced, even if free of standards-essential intellectual property, then the business models that support the preservation of creative intent will require locking the mix to the playout system.  If an immersive sound distribution standard is introduced that attempts to undermine such business models, then cinema will be no different than the Wild West of consumer experience, where creative intent is merely two words with little meaning.  There are smarter paths to travel.

About the Writer

Michael Karagosian is founder and president of MKPE, a consultancy for business development of cinema technologies.  Karagosian made his first contribution to cinema at Dolby Laboratories with the development of split surround for 70mm film, developed expressly for the 70mm release of Apocalypse Now in 1979, and a precursor to the 5.1 sound format.  He led the development of Dolby’s flagship CP200 Cinema Sound Processor, which became a cornerstone of the THX program.  In the 1990s, he was President of Cinema Group Ltd., creators of the CinemAcoustics product line for cinema sound, certified by THX.  He was a founding member and an original chair of the SMPTE standards effort for digital cinema, where he conceived and led the collaborative effort that defined the digital cinema package, and drove the creation of standards to enable the widespread use of accessible technologies in the cinema.  He led negotiations for VPFs in Ireland, the Philippines, and South America.  He has consulted to most major manufacturers and organizations in digital cinema, including the National Association of Theatre Owners, where he led early technology discussions with DCI, and drove the collaborative effort that produced NATO’s Digital Cinema Requirements.