A few weeks ago I was invited to join a joint customer call between a leading U.S. service provider and a leading network equipment vendor. What should have been a cooperative everyday discussion about their network functions virtualization (NFV) deployment turned into what is becoming commonplace – a stalemate. The vendor would not do what the service provider wanted and, more importantly, what the service provider needed in order to reliably deploy the vendor’s equipment. But this type of conflict need not be the case as service providers work to transform their networks to adapt to today’s hyper-connected needs.
Going back to the call, the service provider explained to the vendor how important traffic visibility was to them in order to deploy virtualized network function (VNF) versions of existing hardware functions, and how they could go only as far as the trial stage without comprehensive management, monitoring and traffic visibility down to the packet level. As the discussion advanced, it was clear that no deployment would take place as planned. Why? Without packet-level visibility, the service provider would have no fault-finding or problem-diagnosis capability, which essentially meant they would have no way to determine the cause if an outage occurred. The vendor explained that the carrier didn’t need to worry because packet-level visibility isn’t necessary and all the needed monitoring would be included in the product. The carrier dug their heels in and said “no.”
This real-life conversation validates much of what we know about the importance of network visibility and the pitfalls service providers and vendors encounter when it comes to working together to de-risk new technologies such as NFV. The operator was so adamant about packet-level visibility because the capability is critical to supporting their fault-finding requirements and enabling smoother deployment. When it comes to service chaining there are a number of challenges, the two most common being:
- Where traffic passes through virtualized servers – often with obfuscated interfaces – there is no easy way to access the traffic;
- Where service chaining occurs either within the same server or within the same virtual environment across servers, there is no easy way to access the traffic.
In a best-of-breed environment where there are several vendors each providing their own VNF(s), running on their own virtualization environment, and performing service chaining – how exactly do you debug the network to find out what’s not working, especially when you don’t have access to the traffic at the packet level? How do you find the areas where multi-vendor interoperability breaks down? How do you find the smoking gun to point the finger at one vendor versus another? How do you hold your vendors accountable and stop the inevitable finger-pointing exercise?
Pan-industry Testing Initiatives
Organizations, like the New IP Agency (NIA), have mandates to provide independent industry testing, which is an essential step in removing the risk and uncertainty inherent with new technology deployments. The NIA commissions leading test organizations to facilitate independent, multi-vendor testing that provides real-world data and transparency about interoperability between NFV elements. This is hugely important since there is currently no University of New Hampshire-type organization fulfilling this role in the virtualization space. With the mission to deliver independent industry testing and multi-party interoperability, the NIA has carried out three test phases with several more planned:
- Phase 1 – NFVi-VNF Interoperability Evaluation
- Phase 2 – VNF Management Interoperability Evaluation
- Phase 3 – Service Chaining Interoperability Evaluation & Live Demo
Importantly, the NIA removes the need for each vendor to perform interoperability testing with every other vendor and with each carrier separately. The test phases allow carriers to apply greater pressure on the vendor community to enforce multi-party interoperability. Although this is progress, it opens the door to another problem – what to do if there is no packet-level visibility built into the vendor products? You could be back to square one, but perhaps not.
The Tap-as-a-Service Project
One emerging approach to solving the packet-level visibility problem is Tap-as-a-Service (TaaS), an extension to the OpenStack network service (Neutron) that provides remote port mirroring capability for tenant virtual networks – offering a vendor-independent method for accessing data sitting inside a virtualized server environment. Port mirroring involves sending a copy of packets entering and/or leaving one port to another, which is usually different from the original destinations of the packets being mirrored.
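The port-mirroring idea described above can be illustrated with a short sketch. The classes and port names below are hypothetical, for illustration only – they are not the TaaS API – but they capture the essential behavior: every packet delivered to a monitored port is also copied to a mirror port, while the original packet still reaches its real destination.

```python
# Minimal sketch of port mirroring. All names here are hypothetical,
# for illustration only; they do not reflect actual TaaS interfaces.

class Port:
    """A simple network port that records the packets delivered to it."""
    def __init__(self, name):
        self.name = name
        self.received = []          # packets delivered to this port

    def deliver(self, packet):
        self.received.append(packet)

class MirroredPort(Port):
    """A port whose traffic is also copied to a separate mirror port."""
    def __init__(self, name, mirror):
        super().__init__(name)
        self.mirror = mirror        # where the monitoring tool listens

    def deliver(self, packet):
        self.mirror.deliver(packet)     # send a copy to the monitor
        super().deliver(packet)         # original reaches its destination

# A tap port that a packet-level analysis tool would read from
tap = Port("tap-service")
vm_port = MirroredPort("vm-eth0", mirror=tap)

for pkt in ["SYN", "SYN-ACK", "ACK"]:
    vm_port.deliver(pkt)

print(vm_port.received)   # original traffic, unaffected by mirroring
print(tap.received)       # identical copy, available for analysis
```

The key property – the one the operator on the call was insisting on – is that the monitoring path is a copy: analysis tools see every packet without altering the traffic itself.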
TaaS has been primarily designed to help tenants (or the cloud administrator) debug complex virtual networks and gain visibility into their virtual machines (VMs) by monitoring the network traffic associated with them. TaaS honors tenant boundaries and its mirror sessions are capable of spanning across multiple compute and network nodes. It serves as an essential infrastructure component that can be utilized for supplying data to a variety of network analytics and security applications.
Through this open source project, traffic is now accessible in the traditional manner as if a TAP or SPAN/Mirror port is sourcing the traffic from a legacy or physical network. With visibility restored, the operator could de-risk the rollout of new technology through the use of existing debugging, analytic and performance tools they have already come to rely on.
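In practice, a TaaS mirror session is set up in two steps: create a tap service bound to the Neutron port where a monitoring tool listens, then create a tap flow for each port whose traffic should be mirrored into it. The fragment below is a sketch using the OpenStack client commands provided by the TaaS plugin; exact command names and flags vary by release, and the port UUIDs are placeholders, not real identifiers.

```shell
# Create a tap service attached to the Neutron port where the
# monitoring/analysis tool is listening (port ID is a placeholder).
openstack tap service create --name monitor-tap \
    --port 11111111-aaaa-bbbb-cccc-000000000001

# Mirror both ingress and egress traffic of a VM's port into that
# tap service (source port ID is also a placeholder).
openstack tap flow create --name vm1-flow \
    --port 22222222-aaaa-bbbb-cccc-000000000002 \
    --tap-service monitor-tap \
    --direction BOTH
```

Because tap flows are created per source port, an operator can scope a mirror session to exactly the VNF ports under investigation without touching the rest of the tenant network.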
Rising Outage Costs
Most people accept the benefits of virtualization techniques, and NFV promises the same: capex savings from reduced spend on proprietary hardware, which translate into opex savings in the long run – as the same resources can be used to run multiple different vendor systems or be redeployed to other network areas and projects.
However, there is an area that is often overlooked. Outage costs, whether hard or reputation-based, are rising due to the extra time it takes to bring a system back up and working again after failure. These costs directly offset any savings NFV represents. Will a one-minute outage become a five-minute outage due to the lack of direct packet-level visibility? Will the extra cost and complexity of having multiple vendor management systems interface with the operator’s OSS/BSS offset the reduced complexity NFV promises?
Operators are encouraged to identify and de-risk NFV rollouts by supporting independent testing, whether through participation or observation. This way, vendors and service providers can agree upon packet-level visibility to improve fault-finding, interoperability and ease of diagnosis. At the same time, vendor equipment and new technologies will be de-risked so that carriers may deploy faster and achieve shorter time to revenue.
So, there are certainly ways to de-risk NFV deployments. But it’s up to the industry to define what the NFV acronym will finally stand for: Network Functions Virtualization or No Freaking Visibility?
Andy Huckridge is the Director of Service Provider Solutions at Gigamon with 20 years of Silicon Valley telecommunications industry experience. Prior to Gigamon, Huckridge served as an independent consultant to operators and carriers as well as to network and test equipment manufacturers.
Huckridge holds a Master of Science and a Bachelor of Engineering in Advanced Telecom and Spacecraft Engineering. He also holds a patent in VoIP, co-authored an IETF RFC and was an inaugural member of the “Top 100 Voices of IP Communications” list.