VoIP QoE, Ken Lay [formerly] of Enron, and “Photographic Evidence”

Picture a normal day in the office. I was multitasking: a lively exchange of ideas on a conference call at low volume; a pile of unopened paper mail; a half-eaten sesame seed bagel with cream cheese and lox; a Polycom webinar, presented by Jeff Rodman, buried under eight other windows; and me flipping back and forth between research for my upcoming Unified Communications book, Wikipedia, and an occasionally beeping “URGENT MAIL” window. All of a sudden I thought I heard something about “the misunderstood CEO”. Was it the conference call or the webinar? It turned out to be the guy on the webinar. He was talking about Ken Lay, formerly of Enron. I turned down the sound on the conference call and turned it up on the “Demystifying Wideband Telephony” webinar.

On the left side of the screen was Ken Lay being led off in handcuffs. On the top was a quote from Mr. Lay: “But I told him to pass the books … not to pad them!”. The presenter made it clear that the quote was only in the interest of humor, but that it would be used to explain a very serious, and important, concept. The presenter had already explained that human speech is composed of sounds in the frequency range from about 100 to about 14,000 Hertz, or cycles per second, but that traditional Plain Old Telephone Service (POTS) only captures, encodes and transmits sounds in the 300-3,300 Hertz range. He explained, “POTS phones carry 1/4 of human speech”, “higher frequencies are critical for identifying the talker”, “consonants carry half of the speech information” and that “consonants occur largely above the POTS range”. Then he presented the evidence. On his slide was a screen shot from a spectrum analyzer, shown below.

On the left side is the spectrum graph for the phrase “pad the books” and on the right side the graph for the phrase “pass the books”. The portion of the spectrum analysis shown in the picture is the 300-3,300 Hertz range, which is the range that would be captured, coded and transmitted by traditional telephony or, for that matter, and maybe more importantly, by the VoIP phone systems that are being deployed right now, regardless of whether they are SIP, H.323, Cisco Skinny or whatever.
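The “1/4 of human speech” figure is easy to sanity-check from the two frequency ranges quoted in the webinar. What follows is only a rough sketch, not the presenter's own calculation: the function names are made up for illustration, and the octave-based measure is added here purely as a second point of comparison.

```python
import math

# Frequency ranges quoted in the webinar, in Hertz.
SPEECH_HZ = (100.0, 14_000.0)  # approximate range of human speech
POTS_HZ = (300.0, 3_300.0)     # range carried by traditional telephony


def linear_fraction(band, full):
    """Share of the full range covered by the band, measured in plain Hertz."""
    return (band[1] - band[0]) / (full[1] - full[0])


def octave_fraction(band, full):
    """Share of the full range covered by the band, measured in octaves."""
    return math.log2(band[1] / band[0]) / math.log2(full[1] / full[0])


lin = linear_fraction(POTS_HZ, SPEECH_HZ)
octv = octave_fraction(POTS_HZ, SPEECH_HZ)

print(f"POTS share of the speech range in Hz:      {lin:.1%}")
print(f"POTS share of the speech range in octaves: {octv:.1%}")
```

Measured in Hertz the POTS band comes out a little under a quarter of the speech range, which is in line with the quoted figure. Measured in octaves it is closer to half, so the number depends on how you choose to count.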

Then came the real kicker! I remember thinking “wow” as he flipped to his next slide. It contained the spectrum graph shown below. As you can see, it represents the same phrases but also shows speech above 3,300 Hz and below 300 Hz.

Without the context of the telephone call, it would not be possible to listen to a tape of a traditional telephone call made at the receiving end and know the difference between “pass the books” and “pad the books”. I have been studying, consulting on, teaching and writing about voice and video Quality of Experience (defined loosely as the user’s perceived “quality” of a call) for over a decade, and I now had the “photographic evidence” to demonstrate what is often the topic of class discussions.

I often tell a class a fact that is reasonably consistent across studies: females tend to give lower Mean Opinion Scores (a human listener’s opinion of voice quality on a 1-5 scale, 5 being best) when listening to traditional voice systems than males do. Interestingly, younger females tend to give lower scores than older females and females generally give lower scores when listening to other females than men do listening to females. “Why?” I ask. The answers range from “women are pickier” to a fascinating discussion of how girls are raised to listen well while boys are allowed to run amok to “men probably aren’t listening that closely during the test”! It turns out that researchers tend to agree that the higher frequency ranges are a more important part of female communication than of male communication though the higher frequency ranges figure prominently in both – see graph above. Interesting points, and now I had the evidence for all to see. Jeff from Polycom, Inc. had my undivided attention for the next 50 minutes – a rare situation, indeed!

Jeff went on to explain that we already have wideband sound – which is the name for the technology that includes a much richer spectrum of frequencies than traditional telephony or basic, vanilla VoIP – from Skype, iPods and MP3 players, traditional television audio, videoconferencing audio, FM, AM and satellite radio and increasingly from VoIP. The G.722 wideband codec standard was published in 1988, G.722.1 in 2000, and G.722.2 in 2002. When VoIP devices negotiate the codec they will use at the beginning of the call, the choice increasingly is a wideband codec.
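To make the negotiation step concrete, here is a minimal sketch of how an endpoint might settle on a wideband codec when both sides support one. It is an illustration only, not the behavior of any particular phone or SIP stack: `pick_codec` and the `WIDEBAND` set are hypothetical names, the codec labels are the usual RTP encoding names, and a real endpoint would normally also weigh the offerer's preference order and local policy.

```python
# RTP encoding names for the wideband codecs mentioned above
# (G.722, G.722.1 and G.722.2, the last also known as AMR-WB).
WIDEBAND = {"G722", "G7221", "AMR-WB"}


def pick_codec(offered, supported):
    """Pick one codec both sides share, preferring wideband.

    offered   -- codec names listed by the calling side, in its order
    supported -- set of codec names the answering side can handle
    Returns the chosen codec name, or None if nothing is shared.
    """
    common = [codec for codec in offered if codec in supported]
    if not common:
        return None
    for codec in common:
        if codec in WIDEBAND:
            return codec
    # No shared wideband codec: fall back to the first shared narrowband one.
    return common[0]


# The caller offers G.711 (PCMU/PCMA) and G.722; the answerer supports both
# PCMU and G722, so the wideband codec wins.
print(pick_codec(["PCMU", "G722", "PCMA"], {"PCMU", "G722"}))
```

If either side lacks wideband support, the same function quietly falls back to narrowband, which is why a call between a new wideband phone and an older endpoint still sounds like a traditional telephone call.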

Jeff summarized the webinar in five bullet points:

  • Wideband VoIP is better matched [to today’s QoE needs], because voice is wideband.
  • Open [wideband] VoIP standards give much higher sound quality [than their traditional counterparts].
  • Today’s wideband codecs use bit rates comparable to narrowband.
  • Most common wideband codecs are open standards.
  • Excellent standards-based wideband endpoints are openly available today [from a variety of manufacturers].
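The point about bit rates can be checked with a few lines of arithmetic. The rates below are the nominal codec bit rates from the ITU-T recommendations, not numbers taken from the webinar, and they leave out packet overhead (RTP, UDP and IP headers), so treat this as a rough comparison rather than a bandwidth plan. The dictionary and function names are made up for illustration.

```python
# Nominal codec bit rates in kbit/s, excluding packet overhead.
CODEC_KBPS = {
    "G.711 (narrowband)": [64.0],
    "G.722": [48.0, 56.0, 64.0],
    "G.722.1": [24.0, 32.0],
    "G.722.2 (AMR-WB)": [6.6, 23.85],  # lowest and highest modes only
}

NARROWBAND_KBPS = 64.0  # G.711, the traditional telephony baseline


def max_rate_ratio(rates):
    """Highest listed rate of a codec relative to the G.711 baseline."""
    return max(rates) / NARROWBAND_KBPS


for name, rates in CODEC_KBPS.items():
    print(f"{name:20s} up to {max(rates):6.2f} kbit/s "
          f"({max_rate_ratio(rates):.0%} of G.711)")
```

None of the wideband codecs listed needs more than the 64 kbit/s that G.711 uses, and the newer ones need considerably less.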

Bottom line: the next time I am going to pass anything I want to do it over a wideband VoIP system. I’ve been discussing this for years and now I’ve got the photographic evidence. And, thanks to this Eogogics newsletter article, so do you.

Editor’s note: The author, as witty as he is knowledgeable, teaches the Eogogics courses on VoIP.