I received a ton of feedback on my previous editorial regarding Audio Quality in Communications. I also received a ton of suggestions about products and solutions that people are using. The most important message is clearly that people are discovering the superior qualities of recent consumer products, including the latest headphones and earbuds, while also realizing the value of using proper professional communication solutions - which unfortunately not many could use at home. Evidently, superior Bluetooth 5 radio designs, the latest generation of built-in MEMS microphones, beamforming, and processing abilities all work toward improved communications, whether they are applied to consumer products - including those for gaming - or for "office" and professional use.
But my aim with the previous editorial was not - at all - to highlight any specific product or brand. I used a few familiar references as examples, and I certainly left out many others because I was not trying to write a guide or anything of the sort. Instead, as explained, my considerations were aimed purely at discussing the current audio quality problems we all have to face in these unpredictable circumstances.
I do need to correct an error in the caption of the first photo I used, which showed a Poly Voyager 4200 UC headset, and not the Blackwire 3300 Series model. Apologies to Poly.
And that reference to Poly (the company that resulted from the merger of Plantronics and Polycom) is a great introduction to the topic I intend to address this week. Because current "professional" Unified Communications and Collaboration (UC&C) products are a nice way to illustrate how the industry was focused on providing solutions to existing practices. And those solutions didn't always evolve the way they were supposed to.
The Bell Labs Picturephone, developed in the 1950s and brought to market in the 1960s, was clearly ahead of its time. Or was it?
Traditional communication systems work mainly based on a few existing protocols agreed upon by the International Telecommunication Union (ITU), which were largely challenged by the interests of mobile network operators worldwide, and the massive transition to a global system of interconnected Internet Protocol (IP) networks and Internet services. With mobile networks, the IP convergence, and consumers changing devices almost every year - all happening faster than any other transition in the history of telecommunications - we could even say that the surprise is that things even work at all!
With the fast pace of changes in standards and technology solutions, quality assurance on the connections was an unfortunate casualty. And profit-centered strategies also dictated that telecommunication systems set a relatively low bar for voice quality, even though technology progressed immensely on that front. Telecom companies mostly ignore progress and continue to offer a "good-enough" voice quality standard (GSM quality, basically). Even after the introduction of the 3GPP Enhanced Voice Services (EVS) audio codec for VoLTE, which allowed super wideband (SWB) audio quality for mobile phones, only a handful of telecom operators globally introduced the service - surprising, given that device support was in place, including from the world's best-sellers and main equipment brands.
Even before 3G cellular networks, when the world's communication lines were transitioning from "digital" (ISDN anyone?) to IP, the industry already had improved voice codecs that required no increased bandwidth and little additional infrastructure investment from telecom operators. Yet the operators remained stubbornly firm on their "no one is going to tell the difference anyway" attitude.
The Enhanced Voice Services (EVS) codec was a multi-industry effort, optimized for operation with voice and music/mixed content signals, with extended bandwidth, and was standardized in 2014 as part of the 3GPP project. Its potential remains to be discovered by consumers. Source: Fraunhofer IIS
While at least ISDN offered some quality of service (QoS) guarantees for conference applications, with the predominant IP-based solutions we use today there is much less chance of QoS. That's clearly visible in video issues, but it is much more disruptive in audio quality issues. Today, communications over the Internet have adapted to the variable network conditions. We are supposed to have more bandwidth than we need for voice communications - after all, we stream music on-demand in high-quality stereo on the same network. But since communications need to happen in real time, we use VoIP codecs that quickly adapt to variable bandwidth conditions, often resulting in lower quality.
In the early years of cellular networks, while people adopted mobile phones for the convenience, they never trusted the reliability and the quality of the connections enough to truly consider them the only option. But regular landline phones were gradually forgotten, first and foremost because some telecom companies insisted on maintaining the cost formula of fixed communications, and also because there was a "free" alternative with voice-over-IP software or Internet Telephony. In fact, the pressure that should have been placed on QoS and the adoption of high-quality voice communications was quickly replaced by the excitement of moving to IP services, even if at the cost of quality.
Skype, a company founded by a group of Nordic entrepreneurs and a group of Estonian programmers and software developers who had created Kazaa, a peer-to-peer file sharing application, was not even the first solution available. But Skype was certainly the first Internet-based communication solution to be massively adopted by everyone who had a computer and a broadband connection - and it certainly took the office world by storm.
The reality was that, by using available software for PCs and Macs, users were for the first time able to experience the extremely high-quality audio of modern voice codecs. The variables - not so network dependent - depended more on the microphone, the presence of bad quality speakers (beige Windows PCs with $30 plastic speakers, remember?) feeding back the signal, poor audio cards on old PCs, etc. That's why anyone using an Apple Mac could quickly feel the quality difference, because Macs were always a fully integrated solution with built-in microphones and speakers that work particularly well for VoIP - no headphones even required.
Skype and all its competitors also used a mix of existing technologies and protocols, combined with new and more or less proprietary solutions, including audio codecs. The difference with Skype was that it was built over a peer-to-peer model - which was the reason it was so successful, but also the reason why it later had to be revised for security concerns. With the subsequent acquisitions of Skype by eBay and later in 2011 by Microsoft, Skype quickly evolved its audio codecs from G.729 to SILK, which was developed in-house. SILK is now open-source and available royalty-free, and was included in Opus, another open-source effort. SILK and Opus are now widely used, and have become the voice-over-IP (VoIP) solution adopted by everyone, from WhatsApp to Zoom. (The story of these codecs would make for an interesting book that I hope will be published one day.)
Oh the wonders of telepresence over ISDN connections! Pictured is the HP Collaboration Studio. At least the room acoustics were supposed to be good.
Remember Videoconference?
It's important also to revisit how we got to the success of Skype, Zoom, and similar all-embracing communication tools. While consumers were playing with message boards and Internet forums, before social media, the professional world was betting everything on a concept that would enhance telecommunications from voice to video. Having already forgotten the lessons of the Bell Labs Picturephone, telecom companies wanted to sell the videophone again, and encouraged some hardware companies to develop videoconference solutions for governments, big multinationals, and others.
The videoconference market initially started with some naive concepts about creating "hyper-sophisticated" rooms using dedicated hardware and telecom-led infrastructure and services, where people would gather to have important group discussions, of course assuming that a dedicated staff at each end would coordinate the connections and make sure that everything was functional and ready to go when the "important people" entered the room(s). How did they establish the coordination to make sure those rooms were connected and functional? They exchanged emails and of course made regular phone calls to make sure the "other side" would turn on the equipment and accept the connection.
Governments, multinational corporations, and basically any organization where security is critical and a dedicated closed system is the best option (no matter the price or the inconvenience) did use those solutions - and still do. But soon people started using Skype directly from their desks and the corporate communications concept changed radically. Shortly after, regular folks had iPhones with FaceTime, laptops and tablets with Skype, and those million-dollar investments in "videoconference" systems started to look absurd.
And from dedicated videoconference terminals for the top management to Skype on every company PC, things changed very quickly.
Then, the videoconference industry was reborn under the designation of UC&C - a work-in-progress concept that not even the leading players can define with much certainty. Following an intense period of corporate recycling through mergers and acquisitions, UC&C migrated to technologies offered by the big Internet and IT giants. Cisco, Microsoft, Google, and others offered software and technology to support the concepts, which of course migrated fully to web-based and cloud services.
Very quickly, the corporate world created its own version of "office-oriented" tools (because security is always a great motivation) and the modern "meeting rooms" market was born, followed shortly after by the "huddle rooms" concept (the same thing, but basically happening in any corner and with little or no concern for acoustics, where people connect their own devices to a big screen, a camera, and a soundbar). With Internet and cloud companies also trying to get definitive control of those markets through services, they allowed hardware companies to sell and support the needed screens, webcams, microphones, and speakers, while they focused on "Unified Communications as a Service (UCaaS)".
Throughout the process, it was clear that progress was based on a distorted view of "communications" and "collaboration", which tried to ignore the potential for disruption coming from bring-your-own-devices (BYOD) and consumer convergence with web-based calls. And that will be the topic next week :)
HEAD acoustics is promoting webinars, has published a white paper, and has created a highly recommended website dedicated to the topic "How to Optimize Audio Conferencing Solutions to Improve Communication Quality."