This use case is specifically about deploying the Perimeta Session Border Controller (SBC) Virtual Network Function (VNF) from Metaswitch Networks in OpenStack.
Perimeta, like other SBCs, sits on the edge of a service provider’s network and polices SIP and RTP (i.e. VoIP) control and media traffic passing over both

* the access network between end-users and the core network
* the trunk network between the core and another service provider.
    Access      +       SP A core        +    Trunk    +    SP B core
    network     |       network          |    network  |    network
                |                        |             |
+-------+     +-+--+    +---------+    +-+--+        +-+--+    +---------+
|User   |     |SBC |    |Network  |    |SBC |        |SBC |    |Network  |
|device |-----|    |----|function |----|    |--------|    |----|function |
+-------+     +-+--+    +---------+    +-+--+        +-+--+    +---------+
                |                        |             |
See the Glossary for a description of these terms.
In order to implement its security and admission control functions (e.g. DDoS protection), Perimeta must perform line-rate processing of received packets. For RTP streams, this equates to several million VoIP packets (each ~64-220 bytes depending on codec) per second per core. Perimeta must be able to guarantee this performance and offer SLAs.
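To put the packet-rate figure in context, the following is a minimal sketch of how many frames per second a link can carry at a given frame size. It assumes Ethernet framing (20 bytes of preamble and inter-frame gap per frame), a 10 Gbit/s port, and that the 64-220 byte range refers to the frame size on the wire; none of these assumptions come from the use case itself.

```python
# Per-frame overhead on the wire for Ethernet: preamble plus start-of-frame
# delimiter (8 bytes) and the minimum inter-frame gap (12 bytes).
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_bits_per_second: float, frame_bytes: int) -> float:
    """Maximum frames per second a link can carry at the given frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bits_per_second / bits_on_wire


if __name__ == "__main__":
    link = 10e9  # assumption: one 10 Gbit/s NIC port
    # 64 and 220 bracket the VoIP range quoted above; 1500 stands in for
    # a large packet typical of bulk data transfer.
    for size in (64, 220, 1500):
        mpps = packets_per_second(link, size) / 1e6
        print(f"{size:>5}-byte frames: {mpps:.2f} Mpps")
```

Under these assumptions a single 10 Gbit/s port carries roughly 14.9 million 64-byte frames per second, compared with well under one million at 1500 bytes.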
Perimeta must be fully HA, with no single points of failure, and service continuity over both software and hardware failures (i.e. all SIP sessions and RTP sessions must continue with minimal interruption over software or hardware failures).
Perimeta must be elastically scalable, enabling an NFV orchestrator to add and remove instances in response to applied load.
To apply different policies to traffic from different customers, Perimeta must be able to distinguish and separate traffic from different customers via VLANs or similar mechanism.
Perimeta must separate networks carrying live customer traffic from networks carrying management or other internal data.
Perimeta signaling instances must be able to support large numbers of concurrent TCP connections (hundreds of thousands) to cater for large numbers of clients using TCP.
Perimeta must be able to coexist on the same host with VMs that do not share these requirements, so long as the host can provide sufficient dedicated resources. For example, the fact that Perimeta may not require security group function does not mean that this function can be disabled at host scope, and the fact that Perimeta uses SR-IOV or DPDK does not mean that all VMs on that host must do so.
The Perimeta Session Border Controller from Metaswitch Networks is a telco-grade SBC designed to run either on generic PC hardware or virtualized on OpenStack and other clouds, providing high availability, high scale and high performance.
Although this user story is specifically about Perimeta, it is more generally representative of the issues involved in deploying in OpenStack any VNF utilising a fast data plane or high scale SIP. The use case focuses on those elements rather than more generic issues like orchestration and high availability (HA).
The problem statement above leads to the following requirements.
Achieving packets per second target - networking implications
A standard OpenStack/OpenvSwitch platform allows VMs to drive NICs to full bandwidth if using the large packet sizes typical of Web applications. What makes VoIP different is the small packet size, which means an order of magnitude more packets and permits only a few hundred CPU instructions per packet, nowhere near enough to drive a packet through the standard OpenStack networking stack from VM to NIC. Instead, this requires technologies such as SR-IOV (https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov - completed in 2014.2, though with some technical debt remaining) or a DPDK or similar poll-mode-based vSwitch in the host. Note that SR-IOV in particular imposes some limitations (e.g. it prevents live migration), so it may not be a desirable option for some SPs.
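The per-packet budget mentioned above follows from simple division. The sketch below assumes a 2.5 GHz core and a few illustrative per-core packet rates; both are assumptions chosen for the example, and clock cycles are only a rough proxy for instructions, since instructions retired per cycle vary with the workload.

```python
def cycles_per_packet(core_hz: float, packets_per_second: float) -> float:
    """Clock cycles available to process each packet on one core."""
    return core_hz / packets_per_second


if __name__ == "__main__":
    core_hz = 2.5e9  # assumption: a 2.5 GHz core
    for pps in (1e6, 2e6, 5e6):  # assumption: illustrative per-core rates
        budget = cycles_per_packet(core_hz, pps)
        print(f"{pps / 1e6:.0f} Mpps per core: {budget:.0f} cycles per packet")
```

At 5 million packets per second, for instance, this leaves 500 cycles per packet for everything from receive to transmit.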
Ideally the network would support and respect QoS rules on traffic priority and bandwidth limits.
Security - networking implications
Security groups must be disabled for network technologies where they are not bypassed completely.
The network should protect against ARP poisoning attacks from other VMs.
High scale TCP - networking implications
For ports with security group function disabled, it is desirable that host connection tracking function is disabled to avoid performance and occupancy hits for large numbers of TCP connections and the need to tune host parameters unnecessarily.
Achieving packets per second target - compute implications
HA
Perimeta must be deployable to provide a 5 9’s level of availability. If deployed in a single cloud instance, that instance must therefore itself be more than 5 9’s available. As that is hard to achieve with today’s state of the art, Perimeta is designed to be able to span multiple independent cloud instances, so that the failure of any one cloud has a minor impact. The requirements that creates are still being discussed and will be addressed in a future use case.
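As a rough illustration of why a single cloud struggles to meet this target, the sketch below converts an availability figure into downtime per year and shows how availability combines across clouds. It assumes cloud failures are independent and that service survives as long as at least one cloud is up; this is a simplified model for the arithmetic only, not a description of Perimeta's multi-cloud design, whose requirements are still under discussion.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(per_cloud: float, clouds: int) -> float:
    """Availability of a service that is down only when every cloud is down.

    Assumes independent failures and full redundancy across clouds.
    """
    return 1.0 - (1.0 - per_cloud) ** clouds


if __name__ == "__main__":
    print(f"5 9's allows {downtime_minutes_per_year(0.99999):.2f} min/year")
    # assumption: each cloud on its own is 99.9% available
    print(f"two 99.9% clouds: {combined_availability(0.999, 2):.6f}")
```

Under this model, 5 9's allows a little over five minutes of downtime per year, and two independent clouds at 99.9% each would combine to about 99.9999%.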
When deploying Perimeta within a single cloud instance, Perimeta uses an active/standby architecture with an internal heartbeat mechanism allowing the standby to take over within seconds of failure of the active, including taking over its IP address. Supporting these application-level HA mechanisms requires the ability to place the active and standby instances on separate hosts, and the ability for the standby to take over the IP address of the active.
The former is supported through standard anti-affinity nova scheduler rules, and the latter through the neutron allowed-address-pairs extension.
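The two mechanisms can be sketched as the JSON request bodies an orchestrator would send. The example below builds a body for Nova's server group creation (`POST /os-server-groups`, using the `policies` list form that predates compute API microversion 2.64) and a body for updating a Neutron port with an allowed address pair. The group name and the virtual IP address are hypothetical values chosen for illustration.

```python
import json


def server_group_body(name: str) -> dict:
    """Body for Nova POST /os-server-groups requesting anti-affinity."""
    return {"server_group": {"name": name, "policies": ["anti-affinity"]}}


def allowed_address_pairs_body(virtual_ip: str) -> dict:
    """Body for Neutron PUT /v2.0/ports/{port_id} permitting an extra IP."""
    return {"port": {"allowed_address_pairs": [{"ip_address": virtual_ip}]}}


if __name__ == "__main__":
    # Hypothetical names and addresses, for illustration only.
    print(json.dumps(server_group_body("perimeta-ha-pair"), indent=2))
    print(json.dumps(allowed_address_pairs_body("192.0.2.10"), indent=2))
```

Both the active and the standby port would carry the same allowed address pair, so that whichever instance is currently active can send and receive traffic on the shared address.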
If using SR-IOV, Perimeta does not need multiple SR-IOV ports, as application level redundancy copes with the failure of a single NIC. However, it can take advantage of local link redundancy using multiple SR-IOV vNICs. For this to be of any benefit requires the SR-IOV VFs forming a redundant pair to be allocated on separate PFs.
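The placement constraint on a redundant pair can be expressed as a simple check. The sketch below assumes the scheduler or an operator has a mapping from each VF to its parent PF; the mapping format and the PCI addresses are hypothetical.

```python
from typing import Mapping, Sequence


def on_separate_pfs(vf_to_pf: Mapping[str, str], pair: Sequence[str]) -> bool:
    """Return True if every VF in the redundant set has a different parent PF."""
    pfs = [vf_to_pf[vf] for vf in pair]
    return len(set(pfs)) == len(pfs)


if __name__ == "__main__":
    # Hypothetical PCI addresses: two VFs on one PF, one VF on another.
    vf_to_pf = {
        "0000:03:10.0": "0000:03:00.0",
        "0000:03:10.2": "0000:03:00.0",
        "0000:04:10.0": "0000:04:00.0",
    }
    print(on_separate_pfs(vf_to_pf, ["0000:03:10.0", "0000:04:10.0"]))
    print(on_separate_pfs(vf_to_pf, ["0000:03:10.0", "0000:03:10.2"]))
```

A pair that fails this check shares a physical port, so bonding the two VFs would add no protection against failure of that link.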
Additionally, it is clearly desirable that the underlying cloud instance is as available as possible e.g. no single points of failure (SPOFs) in the underlying network or storage infrastructure.
Elastic scaling
An NFV orchestrator must be able to rapidly launch or terminate new Perimeta instances in response to applied load and service responsiveness. This is basic OpenStack nova function.
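The scaling decision itself sits in the orchestrator rather than in OpenStack. A minimal sketch of such a decision is shown below, assuming the orchestrator knows the offered load and a per-instance capacity in packets per second; the function name, the 70% target utilisation and the minimum of two instances are all hypothetical choices made for this example.

```python
import math


def desired_instances(
    load_pps: float,
    capacity_pps_per_instance: float,
    target_utilisation: float = 0.7,  # assumption: leave 30% headroom
    minimum: int = 2,                 # assumption: never drop below two
) -> int:
    """Number of instances needed to carry the load at the target utilisation."""
    usable = capacity_pps_per_instance * target_utilisation
    return max(minimum, math.ceil(load_pps / usable))


if __name__ == "__main__":
    # Hypothetical figures: 10 Mpps offered load, 2 Mpps per instance.
    print(desired_instances(10e6, 2e6))
```

The orchestrator would compare this figure with the number of running instances and launch or terminate instances through nova accordingly.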
Support for a scalable mechanism to support multiple networks in a VM
There must be a scalable mechanism to present multiple networks to Perimeta, on the order of hundreds or thousands, far exceeding the number of vNICs that can be attached. Various mechanisms are possible; a common one, and the one that Perimeta supports, is for different customer networks to be presented over VLANs. This creates a guest requirement for VLAN trunking support.
There are multiple possible ways of mapping networks to these VLANs within OpenStack, for example, trunking external VLAN networks directly to the VMs with minimal OpenStack knowledge or configuration (already supported in Kilo) or configuring the mapping between OpenStack networks and VLANs as covered in VLAN aware VMs: https://blueprints.launchpad.net/neutron/+spec/vlan-aware-vms
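To illustrate what VLAN trunking means for the guest, the sketch below reads the IEEE 802.1Q tag from a raw Ethernet frame and maps the VLAN ID to a customer. The tag layout (TPID 0x8100 followed by a 16-bit TCI whose low 12 bits are the VLAN ID) is standard; the customer table and the sample frame are hypothetical.

```python
import struct
from typing import Optional

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tagged frame


def vlan_id(frame: bytes) -> Optional[int]:
    """Return the 802.1Q VLAN ID of an Ethernet frame, or None if untagged."""
    if len(frame) < 16:  # two MAC addresses (12 bytes) plus a 4-byte tag
        return None
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF  # low 12 bits of the TCI carry the VLAN ID


if __name__ == "__main__":
    customers = {100: "customer-a", 200: "customer-b"}  # hypothetical mapping
    macs = bytes(12)  # placeholder destination and source MAC addresses
    tagged = macs + struct.pack("!HHH", TPID_8021Q, 100, 0x0800) + b"payload"
    print(customers.get(vlan_id(tagged), "unknown"))
```

With trunking, one vNIC can carry frames for every customer VLAN and the guest separates them by tag, which is what allows the number of customer networks to exceed the number of vNICs.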
The above requirements currently suffer from these gaps.
SR-IOV
Initial support for this has been released, but there remains technical debt being tracked at https://etherpad.openstack.org/p/kilo_sriov_pci_passthrough which would improve the usability and robustness of an SR-IOV-based solution.
Making use of multiple bonded SR-IOV VFs requires the following blueprints:
Allowing link redundancy with SR-IOV requires the following blueprint:
Additional minor enhancements that would be useful:
Security protection
CPU pinning
https://blueprints.launchpad.net/nova/+spec/virt-driver-cpu-thread-pinning
VLAN aware VMs
The following are not yet addressed.
Removing redundant bridge
https://blueprints.launchpad.net/neutron/+spec/ovs-optimization-redundant-bridge
Disabling connection tracking in host
[No current blueprint]
HA
As above, to deliver 5 9’s service Perimeta expects to be deployed spanning multiple cloud instances, but if deployed in a single instance it is desirable for that cloud to be as available as possible.
This use case also implicitly places requirements on elements outside core OpenStack, such as the DPDK OVS mechanism driver (https://github.com/stackforge/networking-ovs-dpdk).
This obsoletes the TelcoWG vSBC story.
None.