Designing for Collaboration

Hello and welcome to network design for collaboration!

For a while now there’s been a big push to unify communication and network services. Mostly this comes from the stability and scalability of our networks today. We can now carry the company’s voice traffic on our network backbone because we have enough bandwidth available and our uptime has improved significantly.

Adding voice and video traffic to our networks brings some considerations and some unique design elements regarding the placement of the network devices and service applications that provide unified communications.

With this being an increasingly common direction for businesses, it’s become very important for network engineers and architects to be familiar with the components and considerations involved so we can design our networks to handle these unique traffic flows.

In this video we’ll be going through an overview of the major pieces involved with unified communications, a brief overview of deployment models, as well as considerations for the traffic itself depending on your situation and circumstance. So let’s get started!


  • Grade of Service (GoS) – probability a call will be blocked due to congestion
  • Erlang – measure of call time used per unit time. 2 calls for 60min during 1hr = 2 Erlangs
  • Busy Hour – The most used hour during a work day for calls
  • Busy Hour Traffic – measure of call time used during busy hour, in erlangs

Although collaboration includes voice and video, the CCDA will focus mainly on voice, though you’ll need to be aware of some video concepts. Here let’s run through some of the terminology used in voice network design. The Grade of Service is the probability a call will be blocked due to congestion. If I have a GoS value of 0.05, that would be a 5% probability that a call will be blocked, or about 1 in 20 calls will fail. An Erlang is a bit of a strange unit to wrap your head around, but it’s a measure of call time spent per unit time. So if I have 2 calls going on concurrently, each lasting 1 hour, this would equal 2 hours of call time per 1 hour, so the usage would be 2 erlangs.

The busy hour is simply the busiest hour for calling in the day. This is what is usually used for scoping, as you’ll want to be reasonably sure your design can handle the calls during the busy hour. The busy hour traffic is the amount of call traffic generated during the busy hour and is measured in erlangs. 
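To make these terms a bit more concrete, here is a minimal sketch of how busy hour traffic and a target GoS can be turned into a trunk count. It uses the Erlang B formula, which isn't mentioned above and is an assumption on my part: it is a common way to relate the two, and it assumes blocked calls are simply dropped rather than retried or queued. The function names are made up for illustration.

```python
def erlangs(calls: int, avg_minutes: float, period_minutes: float = 60.0) -> float:
    """Call time used per unit time, e.g. 2 calls x 60 min in 1 hr = 2 erlangs."""
    return calls * avg_minutes / period_minutes


def erlang_b(traffic: float, trunks: int) -> float:
    """Blocking probability (GoS) for `traffic` erlangs offered to `trunks` lines.

    Uses the standard recursive form of Erlang B, starting from B(0) = 1.
    """
    blocking = 1.0
    for n in range(1, trunks + 1):
        blocking = traffic * blocking / (n + traffic * blocking)
    return blocking


def trunks_needed(traffic: float, target_gos: float) -> int:
    """Smallest trunk count whose blocking probability is at or below target_gos."""
    if not 0.0 < target_gos < 1.0:
        raise ValueError("target_gos must be between 0 and 1")
    trunks = 0
    while erlang_b(traffic, trunks) > target_gos:
        trunks += 1
    return trunks


busy_hour_traffic = erlangs(calls=2, avg_minutes=60)   # 2.0 erlangs
print(busy_hour_traffic)
print(trunks_needed(busy_hour_traffic, target_gos=0.05))  # 5
```

So under these assumptions, 2 erlangs of busy hour traffic at a GoS of 0.05 would call for 5 trunks, not 2, because calls don't arrive neatly one after another.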

Typical Cisco collaboration devices

So when we’re talking about IP telephony, there are 4 main components to the deployment. First of course you have your phones. Now the modern phones are not just simple desk phones anymore. They can have integrated video, be a wireless IP phone, or be a telepresence system. Of course with Webex and similar services you can have collaboration traffic generated from the workstation as well, but we won’t be covering that here in the CCDA.

For the call processing you have CUCM, Cisco Unified Communications Manager. This is set up as a server in your server farm or data center, and can also be placed at your branch office locations. CUCM is set up in a redundant cluster, of course, because without it your IP telephony system breaks down.

In a lot of Cisco’s example diagrams for IPT deployments you’ll see a Unity server as well. This essentially provides voicemail, but can also be used to send voice and video messages to other staff within the enterprise.

Finally, you’ll see mention in documentation of voice-enabled infrastructure. This is just the routers and switches that carry the voice data. Since IP telephony is just IP traffic with voice data within, you generally don’t need special equipment to support IPT, outside of possibly Power over Ethernet switches.

Centralized CUCM deployment model

Okay, so let’s switch gears a bit and get into the 10,000 ft view of the collaboration design. There are 2 models you can use: centralized and distributed. In the centralized model you have a single large site, and you may have branches as well. You have a single CUCM cluster and Unity cluster. A note here to avoid confusion: the connections to the PSTN, that’s the public switched telephone network, are not there to connect the sites together, but to support the trunks that let the call system reach the public telephone network. Your branches will have WAN links back to the main office to support your IP telephony and internal calling.

Now in the event of a WAN link failure, you can have SRST, that’s Survivable Remote Site Telephony, running on your ISR router at the branch. This allows the ISR to act as a PSTN gateway, so the branch can still place calls to public telephone numbers if it loses its connection to the CUCM cluster. Internal extension-to-extension calls to other sites across the WAN would not work, of course, but the branch wouldn’t be totally incapacitated by a WAN link failure.

Distributed CUCM deployment model

So moving over to the distributed deployment model, here you’d have a CUCM cluster at multiple branches or even each branch. Now maybe your branch offices are really big and you don’t want all of that traffic traversing a WAN link, or maybe you just don’t trust your servers at all and want even more redundancy by having redundant CUCM clusters! In this model you don’t necessarily have a CUCM and Unity cluster at every branch, but you have more than one. These clusters can act in a failover model, so if your main office goes down, say there’s a cooling failure and the servers just burst into flames, you can have those phones and devices use the CUCM at one of the remote sites.

Obviously this is the most redundant design, and if your budget allows for it, then more is always better, but this does get pretty expensive pretty quickly. A valid alternative to a full CUCM deployment at the branches, though, is to run CME, which is Call Manager Express, from an ISR at the branch. This gives you functional call manager services, much more than SRST. CME only supports up to 450 phones, but this is likely more than enough for the vast majority of branch office networks out there. You’re talking about a mighty large enterprise whose branch office needs more than 450 IPT devices.

Impact of policing vs. shaping on traffic flows

Now we’ve likely all seen, and perhaps even configured, some basic quality of service before, but this is mentioned in the exam topics so I wanted to make sure we covered it. As a quick tidbit, Cisco calls their framework for configuring QoS the MQC, or the Modular QoS CLI. This works in 3 steps. First you define what the interesting traffic is with a class map, then you define what to do with that traffic with a policy map. This can be anything from marking the traffic so upstream routers handle it properly, to setting the QoS queueing policy locally. From there you apply the policy map to an interface with the service-policy command, for either ingress or egress traffic. The way QoS works is it applies prioritization, queuing, and/or policing during times of congestion. So if there’s more traffic on the line than the available bandwidth, you can decide what traffic moves first and how the remaining traffic is handled.
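The three steps (classify the traffic, decide what to do with it, attach that decision to an interface) can be modeled in a few lines of Python. This is only an illustrative sketch of the structure, not Cisco configuration or any real API. The class names, the "VOICE" and "WAN-EDGE" labels, the interface name, and the use of DSCP 46 to spot voice are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Packet:
    dscp: int  # DSCP marking carried in the IP header


@dataclass
class ClassMap:
    """Step 1: define what the interesting traffic is."""
    name: str
    match: Callable[[Packet], bool]


@dataclass
class PolicyMap:
    """Step 2: define what to do with each class of traffic."""
    name: str
    entries: List[Tuple[ClassMap, str]] = field(default_factory=list)

    def action_for(self, packet: Packet) -> str:
        # First matching class wins; anything unmatched falls to the default.
        for class_map, action in self.entries:
            if class_map.match(packet):
                return action
        return "class-default"


# Step 1: voice is assumed here to be marked DSCP 46 (Expedited Forwarding).
voice = ClassMap("VOICE", lambda p: p.dscp == 46)

# Step 2: give the voice class priority treatment.
wan_edge = PolicyMap("WAN-EDGE", [(voice, "priority")])

# Step 3: attach the policy to an interface and a direction.
service_policies: Dict[Tuple[str, str], PolicyMap] = {}
service_policies[("GigabitEthernet0/1", "output")] = wan_edge

print(wan_edge.action_for(Packet(dscp=46)))  # priority
print(wan_edge.action_for(Packet(dscp=0)))   # class-default
```

The point of the modular split is reuse: the same class can appear in several policies, and the same policy can be attached to several interfaces.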

Policing simply drops traffic that exceeds the maximum rate, whereas shaping will hold the excess traffic in a queue and release it at a rate that you specify. As you can imagine, this induces latency because the traffic has to wait to be sent out onto the line. We’ll get into the specifics shortly, but voice and video traffic really don’t like latency, and especially changes in latency, which is called jitter. This is why you would want to choose an algorithm like Low Latency Queueing. What this does is create a priority queue for voice traffic that is policed rather than shaped. You can specify the maximum rate voice traffic can go, just in case someone hijacks the system and starts tagging non-voice traffic as voice, and within that rate the voice traffic is sent ahead of everything else instead of waiting in a queue. The data traffic will be queued though, which is fine because we don’t care nearly as much about latency in data traffic, but just a little latency can drastically reduce call quality and make the staff show up at your door in angry mob fashion.
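Here is a rough sketch of the difference between dropping and holding excess traffic. It is a deliberately simplified per-interval model, assuming a fixed rate with no burst allowance and an unlimited queue, so it is not how a real token bucket on a router behaves in detail. The function names and the traffic pattern are made up for illustration.

```python
from typing import List, Tuple


def police(arrivals: List[int], rate: int) -> Tuple[List[int], int]:
    """Forward up to `rate` packets per interval and drop the excess immediately."""
    sent, dropped = [], 0
    for offered in arrivals:
        forwarded = min(offered, rate)
        sent.append(forwarded)
        dropped += offered - forwarded
    return sent, dropped


def shape(arrivals: List[int], rate: int) -> Tuple[List[int], int]:
    """Forward up to `rate` packets per interval and hold the excess in a queue."""
    sent, backlog = [], 0
    for offered in arrivals:
        backlog += offered
        forwarded = min(backlog, rate)
        sent.append(forwarded)
        backlog -= forwarded
    return sent, backlog


burst = [5, 0, 0, 3]  # packets arriving in each interval

print(police(burst, rate=2))  # ([2, 0, 0, 2], 4) -> 4 packets dropped
print(shape(burst, rate=2))   # ([2, 2, 1, 2], 1) -> nothing dropped, 1 still waiting
```

With the same burst, the policer loses 4 packets but never delays anything, while the hold-and-release version delivers everything eventually at the cost of packets waiting one or more intervals. That waiting is exactly the latency and jitter that voice can't tolerate.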

When I mentioned someone hijacking the system, this would be when they violate the trust boundary. See, before IP phones we would look at anything connected to an access switch with suspicion. Someone could mark their traffic for expedited forwarding and just flood dummy traffic into your network, basically hosing everything in its path. This is why we would set a service policy to re-mark the traffic as it comes into the switch to make sure this couldn’t happen, and it made sense because your switch was behind a locked door and you controlled it.

With phones, however, you’re now expanding the trust boundary out to the phone that sits on the user’s desk! The best way to do this is to only trust phones that identify themselves with CDP or LLDP. Even this isn’t super secure since those protocols can be emulated, but it’s a good start. You should also set a limit on the bandwidth allowed in the low latency queue, so even if someone emulated a phone they couldn’t hose your whole network.
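The two safeguards here, conditional trust at the access port and a cap on the low latency queue, can be sketched as two small functions. This is a toy model under stated assumptions: the function names are hypothetical, untrusted traffic is assumed to be re-marked to DSCP 0 (best effort), and the cap is treated as a simple hard limit in kbps.

```python
from typing import Tuple

EF = 46  # DSCP Expedited Forwarding, assumed here to be the voice marking


def ingress_dscp(marked_dscp: int, phone_detected: bool) -> int:
    """Keep the sender's marking only if a phone was detected on the port.

    `phone_detected` stands in for the switch having seen the device
    identify itself as a phone (for example via CDP or LLDP).
    """
    return marked_dscp if phone_detected else 0


def priority_queue_admit(offered_kbps: float, cap_kbps: float) -> Tuple[float, float]:
    """Return (admitted, dropped) for a priority queue policed to cap_kbps."""
    admitted = min(offered_kbps, cap_kbps)
    return admitted, offered_kbps - admitted


# A workstation marking its own traffic as voice is reset to best effort.
print(ingress_dscp(EF, phone_detected=False))  # 0

# A real (or emulated) phone keeps its marking...
print(ingress_dscp(EF, phone_detected=True))   # 46

# ...but even 5 Mbps of fake "voice" only gets the capped share of the queue.
print(priority_queue_admit(offered_kbps=5000, cap_kbps=1000))  # (1000, 4000)
```

The second function is why the cap matters: it limits the damage when the first check is fooled.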

Acceptable limits of network disruption to voice and video
