Hello and welcome to network design for collaboration!
For a while now there’s been a big push for the unification of communication and network services. Mostly this comes from the stability and scalability of our networks today. We can now handle carrying the company’s voice traffic on our network backbone because we have enough bandwidth available and our uptime is significantly increasing.
With the addition of voice and video traffic onto our networks, there are some considerations and also some unique design elements with regard to the placement of the network devices and service applications which provide the unified communications.
With this being an ever more common direction for businesses to go, it’s become very important for network engineers and architects to be familiar with the components and considerations involved so we can design our networks to handle these unique traffic flows.
In this video we’ll be going through an overview of the major pieces involved with unified communications, a brief overview of deployment models, as well as considerations for the traffic itself depending on your situation and circumstance. So let’s get started!
- Grade of Service (GoS) – probability a call will be blocked due to congestion
- Erlang – measure of call time used per unit time. 2 calls for 60min during 1hr = 2 Erlangs
- Busy Hour – The most used hour during a work day for calls
- Busy Hour Traffic – measure of call time used during busy hour, in erlangs
Although collaboration includes voice and video, the CCDA will focus mainly on voice, though you’ll need to be aware of some video concepts. Here let’s run through some of the terminology used in voice network design. The Grade of Service is the probability a call will be blocked due to congestion. If I have a GoS value of 0.05, that would be a 5% probability that a call will be blocked, or about 1 in 20 will fail. An Erlang is a bit of a strange unit to wrap your head around, but it’s a measure of call time spent per unit time. So if I have 2 calls going on concurrently, each lasting 1 hour, this would equal 2 hours of call time per 1 hour, so the usage would be 2 erlangs.
The busy hour is simply the busiest hour for calling in the day. This is what is usually used for scoping, as you’ll want to be reasonably sure your design can handle the calls during the busy hour. The busy hour traffic is the amount of call traffic generated during the busy hour and is measured in erlangs.
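To tie these terms together, here is a minimal Python sketch that turns busy hour call volume into erlangs and then estimates the Grade of Service for a given number of trunks. It uses the classic Erlang B formula, which is not covered in this video and assumes calls arrive at random and a blocked call is simply dropped rather than retried. The function names are made up for illustration.

```python
def busy_hour_traffic(calls_in_busy_hour, avg_call_seconds):
    """Offered traffic in erlangs: total call time divided by the one-hour window."""
    return calls_in_busy_hour * avg_call_seconds / 3600.0


def erlang_b(offered_erlangs, trunks):
    """Blocking probability (GoS) from the Erlang B formula, computed iteratively."""
    blocking = 1.0  # with zero trunks every call is blocked
    for m in range(1, trunks + 1):
        blocking = (offered_erlangs * blocking) / (m + offered_erlangs * blocking)
    return blocking


def trunks_for_gos(offered_erlangs, target_gos):
    """Smallest trunk count whose blocking probability is at or below the target."""
    trunks = 0
    while erlang_b(offered_erlangs, trunks) > target_gos:
        trunks += 1
    return trunks


if __name__ == "__main__":
    # 2 calls of 60 minutes each during the busy hour = 2 erlangs
    traffic = busy_hour_traffic(calls_in_busy_hour=2, avg_call_seconds=3600)
    print(traffic)
    # Trunks needed to keep blocking at or below a GoS of 0.05
    print(trunks_for_gos(traffic, target_gos=0.05))
```

With 2 erlangs of busy hour traffic, this model says 5 trunks are needed to stay under a GoS of 0.05, which shows why you size for the busy hour rather than the daily average.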
So when we’re talking about IP telephony, there are 4 main components to the deployment. First of course you have your phones. Now the modern phones are not just a simple desk phone anymore. They can have integrated video, be a wireless IP phone, or be a telepresence system. Of course with Webex and similar services you can have collaboration traffic generated from the workstation as well, but we won’t be covering that here in the CCDA.
For the call processing you have CUCM, Cisco Unified Communications Manager. This is set up as a server in your server farm or data center, or it can be placed at your branch office locations. CUCM is set up in a redundant cluster of course, as without it your IP telephony system breaks down.
In a lot of Cisco’s example diagrams for IPT deployments you’ll see a Unity server as well. This is a service that essentially provides voicemail, but it also lets staff send voice and video messages to others within the enterprise.
Finally, you’ll see mention in documentation for voice-enabled infrastructure. This is just the routers and switches that carry the voice data. Since IP telephony is just IP traffic with voice data within, you generally don’t need special equipment outside of possibly power over ethernet switches to be able to support IPT.
Okay, so let’s switch gears a bit and get into the 10,000 ft view of the collaboration design. There are 2 models you can use: centralized and distributed. In the centralized model here you have a single large site, and you may have branches as well. You have a single CUCM cluster and Unity cluster. A note here to avoid confusion is that these connections to the PSTN, and that’s the public switched telephone network, are not there to connect the sites together, but to support the trunks for the call system to get out to the public telephone network. Your branches will have WAN links to get back to the main office to support your IP telephony and internal calling.
Now in the event of a WAN link failure, you can have SRST, that’s Survivable Remote Site Telephony, running on your ISR router at the branch. This allows the ISR to act as a PSTN gateway and allows the branch to still place calls to public telephone numbers if it loses its connection to the CUCM cluster. Of course extension to extension internal calls between sites would not be functional, but the branch wouldn’t be totally incapacitated when it comes to a WAN link failure.
So moving over to the distributed deployment model, here you’d have a CUCM cluster at multiple branches or even each branch. Now maybe your branch offices are really big and you don’t want all of that traffic traversing a WAN link, or maybe you just don’t trust your servers at all and want even more redundancy by having redundant CUCM clusters! In this model you don’t necessarily have a CUCM and Unity cluster at every branch, but you have more than one. These clusters can act in a failover model, so if your main office goes down, say there’s a cooling failure and the servers just burst into flames, you can have those phones and devices use the CUCM at one of the remote sites.
Obviously this is the most redundant design, and if your budget allows for it, then more is always better, but this does get pretty expensive pretty quick. A valid alternative to using a full CUCM deployment at the branches though is to run CME, which is Call Manager Express, from an ISR at the branch. This gives you functional call manager services, much more than SRST. CME only supports up to 450 phones, but this is likely more than enough for the vast majority of branch office networks out there. You’re talking a mighty large enterprise whose branch office has a need for more than 450 IPT devices.
Now we’ve likely all seen, and perhaps even configured, some basic quality of service before, but this is mentioned in the exam topics so I wanted to make sure we covered it. As a quick tidbit, Cisco calls their framework for configuring QoS the MQC, or the Modular QoS CLI. This works in 3 steps. First you define what the interesting traffic is with a class map, then you define what to do with that traffic with a policy map. This can be anything from tagging the traffic for upstream routers to properly handle, to setting the QoS queueing policy locally. From there you assign the policy map to an interface with a service policy, for either ingress or egress traffic. The way QoS works is it applies prioritization, queuing, and/or policing during times of congestion. So if there’s more traffic on the line than the available bandwidth, you can decide what traffic moves first and how the remaining traffic is handled.
Policing simply drops traffic that exceeds the maximum rate, whereas queueing will hold the traffic and just release it at a rate that you specify. As you can imagine, this induces latency because the traffic is having to wait to be sent out onto the line. We’ll get into the specifics shortly, but voice and video traffic really don’t like latency, especially changes in latency, which is called jitter. This is why you would want to choose an algorithm like Low Latency Queueing. What this does is create a priority queue for voice traffic, so the voice traffic is sent ahead of everything else and never waits behind data. That priority queue is policed, so you can specify the maximum rate voice traffic can go, just in case someone hijacks the system and starts tagging non-voice traffic as voice. The data traffic will be queued though, which is fine because we don’t care nearly as much about latency in data traffic, but just a little latency can drastically reduce call quality and make the staff show up at your door in angry mob fashion.
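To picture how a policed priority queue behaves, here is a toy Python model. It is only a sketch under simplifying assumptions: one scheduling interval at a time, byte budgets instead of real rates, and the voice cap always enforced, where a real router typically only polices the priority queue under congestion. The class name and the `voice_budget_bytes` parameter are invented for this example and are not Cisco terminology.

```python
from collections import deque


class LowLatencyQueue:
    """Toy model: a policed strict-priority voice queue in front of a data queue."""

    def __init__(self, voice_budget_bytes):
        self.voice_budget_bytes = voice_budget_bytes  # hypothetical per-interval cap
        self.voice = deque()
        self.data = deque()

    def enqueue(self, packet_class, size_bytes):
        queue = self.voice if packet_class == "voice" else self.data
        queue.append(size_bytes)

    def drain(self, link_bytes):
        """Send one interval's worth of traffic; return (sent, dropped) lists."""
        sent, dropped = [], []
        voice_used = 0
        # Voice goes first, but only up to its policed budget.
        while self.voice:
            size = self.voice.popleft()
            if voice_used + size <= self.voice_budget_bytes and size <= link_bytes:
                voice_used += size
                link_bytes -= size
                sent.append(("voice", size))
            else:
                dropped.append(("voice", size))  # excess priority traffic is policed
        # Data gets whatever capacity is left; the rest stays queued (adds latency).
        while self.data and self.data[0] <= link_bytes:
            size = self.data.popleft()
            link_bytes -= size
            sent.append(("data", size))
        return sent, dropped


if __name__ == "__main__":
    q = LowLatencyQueue(voice_budget_bytes=400)
    q.enqueue("data", 1000)
    for _ in range(3):
        q.enqueue("voice", 200)
    q.enqueue("data", 1000)
    print(q.drain(link_bytes=1500))
```

In this run the two voice packets that fit the budget go out before the data packet that arrived first, the third voice packet is dropped by the policer, and the second data packet stays in the queue for the next interval.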
When I mentioned someone hijacking the system, this would be when they violate the trust boundary. See, before IP phones we would look at anything connected to an access switch with suspicion. Someone could mark their traffic for expedited forwarding and just blast dummy traffic into your network, basically hosing everything in its path. This is why we would set a service policy to re-mark the traffic as it comes into the switch to make sure this wouldn’t happen, and it made sense because your switch was behind a locked door and you controlled it.
With phones however, you’re now expanding the trust boundary out to the phone, which sits on the user’s desk! The best way to do this is to only trust phones that identify themselves with CDP or LLDP. Even this isn’t super secure since those protocols can be emulated, but it’s a good start. You should also set a limit to the low latency queue bandwidth allowed, so even if someone emulated a phone they couldn’t hose your whole network.
Best practice is indeed to mark traffic as close to the source as you can though. So if you can have the phone mark its own traffic as expedited forwarding, that’s less processing load on the switch, because packet manipulation does take CPU processing power.
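To make marking at the source a bit more concrete: expedited forwarding is DSCP value 46, and the DSCP sits in the upper six bits of the IP header’s ToS byte, so the byte on the wire is 184, or 0xB8. Here is a small Python sketch of an endpoint requesting that marking on its own traffic. The helper names are made up for illustration, and whether the operating system honors `IP_TOS`, and whether the switch trusts the marking, depends on your platform and your trust boundary.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, binary 101110


def dscp_to_tos(dscp):
    """Shift a 6-bit DSCP value into the upper bits of the 8-bit ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def open_marked_udp_socket(dscp=DSCP_EF):
    """Open a UDP socket whose outgoing IPv4 packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print(hex(dscp_to_tos(DSCP_EF)))
```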
Now the reason why we mark our voice traffic with the DSCP, that’s differentiated services code point, of EF for expedited forwarding is because it’s very sensitive in comparison to data traffic. Video is actually much more sensitive than voice is, mostly when it comes to packet loss, and this is due to the delta encoding techniques that are used. For the most part, each whole frame is not actually sent to the other party. Only the difference between it and the previous frame is sent, with the whole frame sent every so often to account for possible packet loss. With voice traffic, packet loss is when someone starts sounding a bit garbled or even just cuts out for a period of time. High latency will simply cause your conversation to be delayed and out of sync, which most of us have experienced before, and it’s a very frustrating thing to deal with.
You’d be surprised at the level of intolerance people have for phone call quality problems. I mean, on a cell phone, it’s expected. If there’s an issue you tell them, hey give me a minute and I’ll call you back on a landline, because the landlines are known to be very, very reliable.
We can’t always predict when the network will be congested, which is why we need to make sure when there is congestion, that your important traffic, like voice, is treated with the appropriate priority level.
So we end up saying that voice needs to be handled with special care and expedited, but just how much of your bandwidth should you reserve? Well, this mostly depends upon what codec you end up using as they vary significantly with regards to the amount of bandwidth used per call. These also have other considerations like audio clarity and CPU processing load for the encoding, and latency induced by the encoding.
I’ve seen information related to the MOS, the mean opinion score, on the CCDA exam before so I wanted to touch on this specifically. A ‘toll quality’ call is the quality to be expected from a standard analog long distance call over the PSTN. This is the quality everyone expects from a phone call. Worse is suboptimal really, and better is what we’re shooting for. These scores are averages for these codecs, and they do go higher sometimes if the network conditions are optimal.
There are many codecs out there and these are just the common ones. I’d make sure you know these and their required bandwidths for the exam.
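If you want to see where a per-call bandwidth figure comes from, here is a rough Python sketch. It assumes 20 ms of audio per packet, which is a common default but still an assumption here, plus 40 bytes of IP, UDP, and RTP headers, and it leaves out layer 2 overhead unless you pass it in. The function name is made up for illustration.

```python
IP_UDP_RTP_HEADER_BYTES = 40  # 20 IP + 8 UDP + 12 RTP


def call_bandwidth_kbps(codec_kbps, packet_ms=20, l2_overhead_bytes=0):
    """Approximate one-way bandwidth for a single call, in kbps."""
    payload_bytes = codec_kbps * packet_ms / 8  # kbps * ms gives bits; /8 gives bytes
    packets_per_second = 1000 / packet_ms
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES + l2_overhead_bytes
    return packet_bytes * 8 * packets_per_second / 1000


if __name__ == "__main__":
    print(call_bandwidth_kbps(64))  # G.711, 64 kbps codec rate
    print(call_bandwidth_kbps(8))   # G.729, 8 kbps codec rate
```

Under those assumptions a 64 kbps G.711 call needs about 80 kbps on the wire and an 8 kbps G.729 call about 24 kbps, so the headers matter a lot more for the low bitrate codecs.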
Moving on to our last slide here, I wanted to go over the same type of information but for video. First, you should know for the exam that H.264 is by far the most popular video codec used today. This table can give you a solid sense for how you may need to provision your network if your company is doing a fair bit of video streaming or conferencing. The higher the bitrate, generally the better the image quality. Video is also very bursty. It only sends the full frame every now and then, with most frames just being the delta from that key frame. You can certainly see the difference here: a Telepresence 3000 uses 12 Mbps, where 720p YouTube, which isn’t bad video quality really, uses about 2 Mbps in comparison.