Main Achievements at the end of the project

The main achievements of the project in the second reporting period are as follows:

1. Selection and detailed description of use cases/scenarios for the integrated project testbed

The project has selected 12 use cases to demonstrate most of the technology developed in the project, matched to the planned contributions of the partners according to the DoA. In deliverable D7.1, we have shown how the use cases relate to the technological blocks developed by the project.
We have then identified the "scenes" of a unified demonstration story, showing how the components needed to implement the use cases map onto the integrated project demonstrator.

2. Setup of two interconnected testbeds, elaboration of an integration plan, component deployment and integration

The integrated Superfluidity system is based on two interconnected testbeds: one at the BT premises, representing a central cloud infrastructure, and another at the Nokia FR premises, hosting the MEC/Edge Cloud/C-RAN infrastructure. The two testbeds are interconnected by means of a VPN, providing a single logical networking infrastructure. Several services and components, implemented with different technologies and having different requirements, had to be deployed and integrated in this distributed environment, which made the integration a challenge. This work (documented in deliverable D7.2) was the basis for the subsequent work on the validation and assessment of the unified project prototype. A total of 28 components were integrated and run together, demonstrating the 14 selected use cases/scenes.

2.1 Final integrated system up and running. Validation and assessment of the integrated system.

The project has followed the ambitious approach of integrating as many components as possible into the integrated Superfluidity system. We selected and implemented a sequence of use cases/scenes conceived to mimic the lifecycle of a real network, from the initial planning, design and configuration stages to network and service deployment and operation, including multiple alternative solutions in order to assess optimal performance. The validation and assessment of this unified prototype is reported in deliverable D7.3.

3. Kuryr: enabling the Neutron networking ecosystem for containers, regardless of whether they run on bare-metal servers or inside OpenStack VMs. The work done avoids double encapsulation in the nested case and speeds up container boot-up in Neutron networks.

Kuryr is an OpenStack project that aims to bring OpenStack networking capabilities to containers. The idea behind Kuryr is to leverage the abstraction and all the hard work that was put into Neutron, its plugins and its services, and use them to provide production-grade networking for container use cases. In Superfluidity, we have contributed to the Kuryr project by:

(1) Enabling nested containers, i.e., allowing containers to be created inside OpenStack VMs by making use of the VLAN-aware VMs Neutron capability. This enables containers to be attached to networks different from those of their containing VM, providing not only Neutron networking features but also an extra security layer, and avoiding double-encapsulation issues when running a Kubernetes cluster on top of OpenStack VMs.
(2) Adding pools of Neutron resources, in this case Neutron ports, to speed up container boot-up when using OpenStack networking. This approach minimizes the interaction with Neutron during container boot-up and makes more efficient use of the Neutron REST API by performing bulk requests instead of single ones (see the sketch below).
(3) Integrating Kuryr with Neutron load balancers, making both LBaaSv2 and Octavia load balancers available to containers, and even enabling load balancing between VMs and containers.

Note that these contributions are already available upstream and the community is already making use of them. They have even been integrated with other SDN solutions, such as Dragonflow and OpenDaylight.
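The port-pool idea in (2) can be illustrated with a minimal, hypothetical sketch: a pool of Neutron ports is pre-created with a single bulk request to the Neutron REST API, and a port is simply popped from the pool when a container starts. The endpoint URL, token handling and port attributes below are placeholders, not the actual Kuryr implementation.

```python
# Minimal sketch of the port-pool idea behind Kuryr (not the actual Kuryr code).
# Assumes a reachable Neutron endpoint and a valid Keystone token (placeholders).
import requests

NEUTRON_URL = "http://controller:9696/v2.0"   # placeholder endpoint
TOKEN = "<keystone-token>"                    # placeholder token
HEADERS = {"X-Auth-Token": TOKEN, "Content-Type": "application/json"}


def prefill_port_pool(network_id, pool_size=10):
    """Create `pool_size` ports with one bulk request instead of N single ones."""
    body = {"ports": [{"network_id": network_id,
                       "admin_state_up": True,
                       "name": f"pool-port-{i}"} for i in range(pool_size)]}
    resp = requests.post(f"{NEUTRON_URL}/ports", json=body, headers=HEADERS)
    resp.raise_for_status()
    return [p["id"] for p in resp.json()["ports"]]


def take_port(pool):
    """On container creation, reuse a pre-created port: no Neutron call needed."""
    return pool.pop() if pool else None


# Usage: fill the pool once, then serve container boot-ups from it.
pool = prefill_port_pool("11111111-2222-3333-4444-555555555555", pool_size=20)
port_id = take_port(pool)
```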

4. RDCL 3D tool for the design of NFV-based services on heterogeneous infrastructures.

The tool supports the modelling and design of: 1) services with nested RFBs, such as regular VMs and ClickOS Unikernels, deployed respectively on a traditional NFV virtualization infrastructure and on a Xen platform supporting ClickOS Unikernels; 2) services mixing VMs and containers; 3) softwarized C-RAN configurations.
The RDCL 3D tool is a web framework that can be adapted to support the modelling and design of different types of NFV services and the interaction with different orchestrators. It is modular and designed to facilitate the introduction and support of different models. The tool has been released as Open Source under the Apache 2.0 license.

4.1 The OSM GUI for the OSM lightweight build is based on the RDCL 3D tool.

It is part of OSM Release Four, to be released in May 2018.
The project has worked to improve the OSM orchestrator with a new web GUI. The work has been performed in close cooperation with the TID team leading the OSM development in the context of the ETSI OSM community. Thanks to the flexibility of the RDCL 3D tool, it has been relatively easy to support the functionality of the OSM lightweight build, providing a powerful GUI for accessing the OSM functionality.

5. Integration of a decomposed C-RAN solution prototyped with the RDCL 3D tool

RDCL 3D allows a high-level abstraction of any network or service to be deployed on a heterogeneous cloud infrastructure. The network or service is described as a graph in which the different RFBs are connected. Once the graph is built, an RFB descriptor file is generated, available in YAML and JSON format. In this way, RDCL 3D enables Infrastructure as Code. For the Cloud-RAN, three components are defined: the RRH, the BBU and the fronthaul. The C-RAN is then connected to a core EPC.
After building the graph, a single click triggers the deployment of the designed network. Behind the scenes, an orchestrator takes the generated file and translates it into the file or language used by the actual controller to run the deployment. For instance, to deploy the C-RAN components with Kubernetes, the RFB descriptor file is translated into a Kubernetes YAML manifest (a minimal sketch of such a translation is given below). The first elements to be deployed are a set of Open vSwitch bridges that form the fronthaul network spanning multiple hosts. Next, the EPC Docker container starts, followed by the RRH and finally the BBU.
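As an illustration only, the sketch below shows how a hypothetical, much simplified RFB descriptor could be turned into Kubernetes Pod manifests; the descriptor schema, field names and container images are assumptions and do not reflect the exact RDCL 3D output format or the project's translation logic.

```python
# Illustrative translation of a simplified, hypothetical RFB descriptor into
# Kubernetes Pod manifests (not the actual RDCL 3D / orchestrator code).
import yaml  # pip install pyyaml

# Hypothetical RFB descriptor, as it might be exported by the design tool.
rfb_descriptor = {
    "rfbs": [
        {"name": "epc", "image": "example/epc:latest"},
        {"name": "rrh", "image": "example/rrh:latest"},
        {"name": "bbu", "image": "example/bbu:latest"},
    ]
}


def rfb_to_pod(rfb):
    """Map one RFB entry to a minimal Kubernetes Pod manifest."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": rfb["name"]},
        "spec": {"containers": [{"name": rfb["name"], "image": rfb["image"]}]},
    }


# Emit one multi-document YAML file that could be fed to `kubectl apply -f -`.
manifests = [rfb_to_pod(rfb) for rfb in rfb_descriptor["rfbs"]]
print(yaml.dump_all(manifests, sort_keys=False))
```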

6. Unikernel technology: LightVM re-architects the Xen virtualization system and uses unikernels to achieve VM boot times of a few milliseconds for up to 8,000 guests (faster than containers).

Containers are in great demand because they are lightweight when compared to virtual machines. On the downside, containers offer weaker isolation than VMs, to the point where people run containers in virtual machines to achieve proper isolation. In this work, we examine whether there is indeed a strict trade-off between isolation (VMs) and efficiency (containers). We show that VMs can be as nimble as containers, as long as they are small and the toolstack is fast enough.
We achieve lightweight VMs by using unikernels for specialized applications and with Tinyx, a tool that enables creating tailor-made, trimmed-down Linux virtual machines. By themselves, lightweight virtual machines are not enough to ensure good performance, since the virtualization control plane (the toolstack) becomes the performance bottleneck. We present LightVM, a new virtualization solution based on Xen that is optimized to offer fast boot times regardless of the number of active VMs. LightVM features a complete redesign of Xen's control plane, transforming its centralized operation into a distributed one where interactions with the hypervisor are reduced to a minimum. LightVM can boot a VM in 2.3 ms, which is comparable to fork/exec on Linux (1 ms), and two orders of magnitude faster than Docker.

7. Unikernel orchestration: OpenVIM extensions to support Unikernels. Performance evaluation of different Virtual Infrastructure Manager extensions (OpenVIM, OpenStack and Nomad) to support Unikernel orchestration.

We have adapted and optimized three VIMs (Virtual Infrastructure Managers) to support the ClickOS Unikernel, namely OpenVIM, OpenStack and Nomad. In particular, we have extended OpenVIM to support the ClickOS Unikernel (and, more generally, Xen virtualization) in a backward-compatible way. We have also prototyped the support of Unikernels in OpenStack and Nomad and provided performance measurements of the orchestration time for the three considered platforms.

7.1 OpenVIM extensions for Unikernels merged in the mainstream OpenVIM.

Our work has been presented to the OSM community, which maintains OpenVIM. We received feedback and reacted accordingly; the revised code contribution was then submitted and merged upstream in April 2018.

8. Intent-based reprogrammable fronthaul network infrastructure

The fronthaul network provides the connectivity between the RRH (or Front-End Unit) and the edge cloud. The processing running in each location depends on the type of functional split applied, as discussed in D4.3.
To connect the antenna site (RRH) to the edge cloud, a CPRI link is normally used. Our objective was to build a fully re-programmable fronthaul solution that can select the routing path according to the latency and throughput requirements of each flow, following a declarative model.
For that purpose, an Ethernet-based solution was developed, using an overlay network composed of Open vSwitch instances to deliver the required flexibility. Using an SDN controller (in our case ONOS) and its intent-based programming, we succeeded in connecting the RRH and the edge cloud in software, making the fronthaul fully re-programmable.
The orchestrator instructs an ONOS controller to install host-to-host intents that create paths between the BBU and the RRH, and between the BBU and the EPC. The BBU then initiates communication with the RRH and the EPC and, once everything is set up, a UE can be attached.
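As an illustration of this step, the following sketch submits host-to-host intents through the ONOS REST API; the controller address, credentials and host IDs are placeholders, and the exact payload fields should be checked against the ONOS version in use.

```python
# Illustrative sketch: installing host-to-host intents via the ONOS REST API
# to connect BBU, RRH and EPC (addresses, credentials and host IDs are placeholders).
import requests
from requests.auth import HTTPBasicAuth

ONOS_URL = "http://onos-controller:8181/onos/v1"
AUTH = HTTPBasicAuth("onos", "rocks")      # default ONOS credentials, adjust as needed


def install_host_to_host_intent(host_a, host_b, app_id="org.onosproject.cli"):
    """Ask ONOS to compute and maintain a path between two hosts."""
    intent = {
        "type": "HostToHostIntent",
        "appId": app_id,
        "one": host_a,   # e.g. "00:00:00:00:00:01/None" (MAC/VLAN host id)
        "two": host_b,
    }
    resp = requests.post(f"{ONOS_URL}/intents", json=intent, auth=AUTH)
    resp.raise_for_status()
    return resp


# BBU <-> RRH and BBU <-> EPC paths, with placeholder host IDs.
install_host_to_host_intent("00:00:00:00:00:0B/None", "00:00:00:00:00:0A/None")  # BBU-RRH
install_host_to_host_intent("00:00:00:00:00:0B/None", "00:00:00:00:00:0C/None")  # BBU-EPC
```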

9. Algorithms and tools for flow/packet processing function allocation to processors in FastClick

FastClick allows a higher level of abstraction than Click, thanks to its automatic resource allocator. The user only needs to specify the logic of the network function, and FastClick will automatically handle low-level details such as core allocation.
FastClick has also been extended with support for flow processing in addition to packet processing. This extension makes it possible to write flow-aware middleboxes, such as an HTTP reverse proxy, directly inside FastClick in a simplified manner. We also improved service-chaining efficiency by sharing classification work, which is now done only once (using hardware offloading when possible), even if several components in the chain need classification.
Finally, using a machine-learning-based approach, we can now predict the performance (e.g. throughput) of some network functions based on information about the workload and allocated resources. This work will be used to guide resource allocation and function placement towards optimality.

10. Design of a Packet Manipulation Processor (PMP) based on a RISC architecture for extended Data Plane Programmability.

Programmable dataplanes are emerging as a disruptive technology to implement network function virtualization in an SDN environment. Starting from the original OpenFlow match/action abstraction, most of the work has so far focused on key improvements in matching flexibility. Conversely, the "action" part, i.e. the set of operations (such as encapsulation or header manipulation) performed on packets after the forwarding decision, has received far less attention. With the PMP, we move beyond the idea of "atomic", pre-implemented actions and aim at providing programmable dataplane actions while retaining high-speed, multi-Gbps operation. The Packet Manipulation Processor (PMP) is a domain-specific HW architecture able to efficiently support micro-programs implementing such actions.

11. Dissemination and demo of Citrix Hammer, the traffic generator used in WP4 and WP7 activities.

Hammer is a real-world, end-to-end network traffic simulator, capable of simulating complex and dynamic network, user and server behaviours. The focus of this tool is primarily to facilitate investigations related to product stability (for instance different aspects of capacity, longevity, memory leaks and core dumps) and to handle customer-content testing that reveals the behaviour of the device under test in realistic network conditions. Hammer has a modular design, which offers excellent scalability, rendering the platform capable of being installed on commodity hardware. In addition, it is designed in a resource-savvy manner, thus requiring limited computational resources to generate significant traffic loads. Instead of operating at the packet level, Hammer offers application-layer workload interaction along with inherent data-plane acceleration, delivering brisk performance with flexibility and ease of use. Our tests show that Hammer's performance scales linearly with the underlying hardware resources; even when the simulator is installed in a resource-bound environment, it can still deliver traffic loads that correspond to thousands of interconnected users, each with real-world behaviour per session. When installed on cutting-edge contemporary servers with latest-generation CPUs, Hammer performs significantly better, indicating the software's suitability for highly demanding tasks. Under the auspices of the project, we have partially presented and benchmarked Hammer, while demonstrating some of its capabilities by testing actual platforms deployed in production-like networking environments.

12. A novel approach to resource allocation in the 5G data plane that aims to extend the packet-switching principle to CPUs.

Software network processing in 5G relies on dedicated cores and hardware isolation to ensure appropriate throughput guarantees. Such isolation comes at the expense of low utilization in the average case and severely restricts the number of network processing functions one can execute on a server. We propose that multiple processing functions should simply share a CPU core, turning the CPU into a special type of "link". We use multiple NIC receive queues and the FastClick suite to test the feasibility of this approach. We find that, as expected, per-core throughput decreases when more processes are contending; however, the decrease is not dramatic: around a 10% drop with 10 processes and 50% in the worst case, where the processing is very cheap (bridging). We also find that the processor is not shared fairly when the different functions have different per-packet costs. Finally, we implement and test in simulation a solution that enables efficient CPU sharing by sending congestion signals proportional to the per-packet cost of each flow. This enables endpoint congestion control (e.g. TCP) to react appropriately and share the CPU fairly.
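A toy simulation (below) illustrates the unfairness and the proposed remedy: when functions with different per-packet costs share a core and are polled round-robin, CPU time is consumed in proportion to per-packet cost, and a congestion signal proportional to that cost can be fed back so that endpoints rebalance their rates. The cost values and the feedback rule are illustrative assumptions, not the evaluated mechanism.

```python
# Toy simulation of several packet-processing functions sharing one CPU core.
# Illustrative only: per-packet costs and the feedback rule are made-up numbers.

# Hypothetical network functions with different per-packet CPU costs (in microseconds).
functions = {"bridge": 1.0, "nat": 3.0, "firewall": 5.0}

CPU_BUDGET_US = 1_000_000          # one second of CPU time to share
cycles_used = {name: 0.0 for name in functions}
packets_done = {name: 0 for name in functions}

# Round-robin polling, one packet per turn: every function gets the same packet
# rate, but CPU time is consumed in proportion to per-packet cost, so the
# expensive functions hog the core (unfair CPU sharing).
budget = CPU_BUDGET_US
while budget > 0:
    for name, cost in functions.items():
        if budget < cost:
            budget = 0
            break
        budget -= cost
        cycles_used[name] += cost
        packets_done[name] += 1

# "Congestion signal" proportional to per-packet cost: expensive flows are asked
# to back off more, so that endpoint congestion control equalizes CPU shares.
total_cost = sum(functions.values())
signal = {name: cost / total_cost for name, cost in functions.items()}

for name in functions:
    print(f"{name:8s} packets={packets_done[name]:7d} "
          f"cpu_us={cycles_used[name]:9.0f} congestion_signal={signal[name]:.2f}")
```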

13. OSM adoption and integration in the MEC environment as the Mobile Edge Orchestrator (MEO) solution for application Life Cycle Management (instantiation and disposal).

Open Source MANO (OSM) is an ETSI-hosted project created to develop an open-source NFV Management and Orchestration (MANO) software stack aligned with ETSI NFV. As the NFV MANO functions are significantly similar to those of the MEO (Mobile/Multi-access Edge Orchestrator), a MEC component, we decided to use OSM after evaluating different MANO solutions.
The MEO is responsible for managing and orchestrating ME Applications (ME Apps), which are hosted at the edge (ME Host). Management features include lifecycle operations such as instantiation, scaling and termination of ME Apps in an ME Host. The orchestration functionality has a global view of all ME Hosts and is able to decide where a particular ME App should run, or whether a particular ME App should be migrated between two different ME Hosts, e.g. triggered by an end-user movement.
It is important to understand that the MEO, implemented using OSM, is part of the MEC solution and is devoted to managing MEC Apps. The MEC solution itself and other VNFs (e.g. the C-RAN) need to be deployed beforehand so that this operation can start; for this purpose, the project uses ManageIQ.
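As an illustration of how a MEO built on OSM could trigger an ME App instantiation and disposal, the sketch below calls the OSM north-bound interface over HTTPS; the endpoint paths and request fields follow the SOL005-style NBI of recent OSM releases and should be treated as assumptions to be checked against the deployed OSM version.

```python
# Illustrative sketch: instantiating and disposing of an ME App (packaged as an NS)
# through the OSM north-bound interface. Endpoints and fields are assumptions based
# on the SOL005-style NBI of recent OSM releases; adjust to the deployed version.
import requests

OSM_NBI = "https://osm-host:9999/osm"                       # placeholder NBI address
CREDENTIALS = {"username": "admin", "password": "admin"}    # placeholder credentials

session = requests.Session()
session.verify = False  # testbed NBIs often use a self-signed certificate

# 1. Obtain an authentication token.
token = session.post(f"{OSM_NBI}/admin/v1/tokens", json=CREDENTIALS).json()["id"]
session.headers.update({"Authorization": f"Bearer {token}"})

# 2. Instantiate the ME App from an on-boarded NS descriptor on the edge VIM.
ns_request = {
    "nsdId": "<nsd-uuid>",             # descriptor of the ME App (placeholder)
    "nsName": "me-app-instance-1",
    "vimAccountId": "<edge-vim-uuid>"  # VIM account of the ME Host (placeholder)
}
resp = session.post(f"{OSM_NBI}/nslcm/v1/ns_instances_content", json=ns_request)
resp.raise_for_status()
ns_id = resp.json()["id"]

# 3. Disposal: delete the NS instance when the ME App is no longer needed.
session.delete(f"{OSM_NBI}/nslcm/v1/ns_instances_content/{ns_id}")
```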

14. Platform-aware C-RAN baseband RFB adaptation and optimization using a dynamic task clustering and scheduling approach.

Typically, implementations of communication signal-processing algorithms employ static approaches for application definition, mapping and scheduling. In Superfluidity, we developed a dataflow runtime system and a reconfigurable, scalable baseband C-RAN application that enable dynamic application generation (TTI-specific task graphs), dynamic graph transformation using task clustering, and dynamic mapping to the underlying hardware. This generation-transformation-mapping approach allows the application structure to be optimised and adapted at runtime according to the hardware platform and the performance requirements, which makes it suitable for flexible deployment on various HW platforms.
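To give a flavour of what dynamic mapping of a task graph onto processing elements involves, the sketch below performs a simple list scheduling of a toy task graph onto a configurable number of cores; the graph, task costs and the earliest-finish-time heuristic are illustrative assumptions, not the runtime system developed in the project.

```python
# Toy list scheduler: map a small task graph (DAG) onto N cores using an
# earliest-finish-time heuristic. Purely illustrative; not the project runtime system.

# Hypothetical TTI task graph: task -> (processing cost, list of predecessors).
tasks = {
    "fft_ant0":   (4, []),
    "fft_ant1":   (4, []),
    "chest_ant0": (3, ["fft_ant0"]),
    "chest_ant1": (3, ["fft_ant1"]),
    "combine":    (2, ["chest_ant0", "chest_ant1"]),
    "decode":     (6, ["combine"]),
}


def list_schedule(tasks, num_cores=2):
    core_free = [0] * num_cores          # time at which each core becomes free
    finish = {}                          # finish time of each scheduled task
    placement = {}                       # task -> core index
    remaining = dict(tasks)
    while remaining:
        # Schedule every task whose predecessors have all finished.
        ready = [t for t, (_, preds) in remaining.items()
                 if all(p in finish for p in preds)]
        for t in sorted(ready):
            cost, preds = remaining.pop(t)
            earliest = max((finish[p] for p in preds), default=0)
            core = min(range(num_cores), key=lambda c: max(core_free[c], earliest))
            start = max(core_free[core], earliest)
            finish[t] = start + cost
            core_free[core] = finish[t]
            placement[t] = core
    return placement, max(finish.values())


placement, makespan = list_schedule(tasks, num_cores=2)
print(placement, "makespan =", makespan)
```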

15. Scalable and predictable 5G baseband architecture employing dynamic task scheduling and guaranteed service inter-task connection allocation.

The huge diversity of 5G application requirements and associated modem protocols imposes high demands on radio platforms in terms of scalable latency, reliability and computation performance. Within the Superfluidity project, we scope scalable multi-processor platforms that can handle a plurality of parallel, sliced wireless links. The challenge in this regard is to efficiently allocate and use shared computation resources according to the workload requirements of each slice, and to minimize inter-task congestion by provisioning adequate communication resources. We propose a dedicated core-manager unit to dynamically map, schedule and prioritize application tasks on the computation resources. Moreover, we propose a linear-complexity trellis-search path algorithm and an associated network-manager that dynamically finds an optimum network path between communicating tasks, providing guaranteed-service connections in terms of throughput and latency.
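The sketch below gives a simplified idea of a guaranteed-service path search: a shortest path is computed over a small interconnect graph while skipping links without enough residual bandwidth, and the bandwidth is then reserved along the chosen path. The topology, capacities and the use of plain Dijkstra instead of the trellis-search formulation are illustrative assumptions.

```python
# Simplified guaranteed-service path allocation on a small interconnect graph:
# find the lowest-latency path with enough residual bandwidth, then reserve it.
# Illustrative only; the project proposes a linear-complexity trellis-search algorithm.
import heapq

# edges: (node_a, node_b) -> {"lat": link latency, "bw": residual bandwidth}
edges = {
    ("t0", "r0"): {"lat": 1, "bw": 10}, ("r0", "r1"): {"lat": 2, "bw": 5},
    ("r0", "r2"): {"lat": 3, "bw": 10}, ("r1", "t1"): {"lat": 1, "bw": 5},
    ("r2", "t1"): {"lat": 1, "bw": 10},
}
graph = {}
for (a, b), attrs in edges.items():          # build an undirected adjacency map
    graph.setdefault(a, {})[b] = attrs
    graph.setdefault(b, {})[a] = attrs


def allocate_path(src, dst, bw_needed):
    """Dijkstra on latency, ignoring links with insufficient residual bandwidth."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue
        for nxt, attrs in graph[node].items():
            if attrs["bw"] < bw_needed:
                continue                      # link cannot guarantee the service
            nd = d + attrs["lat"]
            if nd < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = nd, node
                heapq.heappush(heap, (nd, nxt))
    if dst not in dist:
        return None
    path, node = [dst], dst                   # rebuild the path and reserve bandwidth
    while node != src:
        node = prev[node]
        path.append(node)
    path.reverse()
    for a, b in zip(path, path[1:]):
        graph[a][b]["bw"] -= bw_needed
    return path


print(allocate_path("t0", "t1", bw_needed=8))   # expected: ['t0', 'r0', 'r2', 't1']
```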

16. Evaluation of the cost of software switching for NFV.

An important enabler for the NFV paradigm is software switching, which should satisfy rigid network requirements such as high throughput and low latency. However, software switching comes with an extra cost in terms of computing resources that must be allocated specifically to software switching in order to steer the traffic through the running services (in addition to the computing resources required by the VNFs themselves). This cost depends primarily on the way the VNFs are internally chained, on the packet-processing requirements and on the acceleration technologies used (e.g. Intel DPDK).
The use of Fast Dataplane technology in network interface cards as an alternative to SR-IOV and DPDK was evaluated. The cost of software switching (in terms of consumed resources) and its expected performance are key inputs for Superfluidity orchestration and, in particular, for placement algorithms. The task evaluated the performance of software switching using different configurations, with a particular focus on two key acceleration technologies, namely OVS-DPDK and FD.io VPP. OVS-DPDK was benchmarked and its performance and resource usage compared against standard OVS as well as against SR-IOV. The task also evaluated the emerging Fast Dataplane input/output (FD.io) switch approach; the performance of FD.io was compared against that of an OVS-DPDK enabled soft switch.

17. MDP (Markov Decision Process)-Optimized VM scaling and load balancing mechanism for NFV.

Dynamically adjusting the amount of resources allocated to a VNF based on the demand (performance) is a highly important problem. We devised a decision mechanism that dynamically increases (via a scale-out operation) or decreases (via a scale-in operation) the number of VMs allocated to the VNF, according to the demand from the network service (e.g. the number of flows a firewall handles). Our scaling decision mechanism minimizes the consumed resources while maintaining the application's required SLA. In conjunction with the scaling decisions, our dynamic mechanism also handles load balancing, steering the traffic flows to the different VMs and balancing the load between them. In this study, we tackle both the scaling decision and the load-balancing strategy as a single problem, formulated as a Markov Decision Process.
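A heavily simplified sketch of such an MDP formulation is given below: states combine the current load level and the number of active VMs, actions are scale-in / keep / scale-out, and value iteration trades a resource cost against an SLA-violation penalty. The state space, transition probabilities and cost values are toy assumptions, not the mechanism studied in the project.

```python
# Toy MDP for VM scaling: value iteration over (load level, #VMs) states.
# All numbers (costs, penalties, transition probabilities) are illustrative.
import itertools

LOADS = range(4)          # discretized demand levels 0..3
VMS = range(1, 5)         # 1..4 VMs allocated to the VNF
ACTIONS = (-1, 0, +1)     # scale in / keep / scale out
GAMMA = 0.9


def load_transitions(load):
    """Load randomly goes up, stays, or goes down with fixed probabilities."""
    return [(min(load + 1, 3), 0.3), (load, 0.4), (max(load - 1, 0), 0.3)]


def cost(load, vms):
    sla_penalty = 10.0 if vms < load + 1 else 0.0   # under-provisioning penalty
    return vms * 1.0 + sla_penalty                  # VM cost + SLA violation cost


V = {s: 0.0 for s in itertools.product(LOADS, VMS)}
for _ in range(200):                                # value iteration
    for (load, vms) in V:
        best = float("inf")
        for a in ACTIONS:
            nvms = min(max(vms + a, VMS.start), VMS.stop - 1)
            q = cost(load, nvms) + GAMMA * sum(
                p * V[(nload, nvms)] for nload, p in load_transitions(load))
            best = min(best, q)
        V[(load, vms)] = best


def policy(load, vms):
    """Greedy scaling action for a given (load, #VMs) state."""
    def q(a):
        nvms = min(max(vms + a, VMS.start), VMS.stop - 1)
        return cost(load, nvms) + GAMMA * sum(
            p * V[(nload, nvms)] for nload, p in load_transitions(load))
    return min(ACTIONS, key=q)


print(policy(load=3, vms=1))   # expect +1 (scale out) under high load
```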

18. Automating workload fingerprinting to identify the hardware subsystems on a compute node that are most significantly affected by the deployment of a workload.

A characterisation approach was developed, based on the automated identification of the platform metrics that most significantly influence the behaviour of service KPIs. In addition, a profiling approach based on automated workload fingerprinting was developed, which can identify the hardware subsystems on a compute node that are most significantly affected by the deployment of a workload. The approach has utility for informing workload-placement decisions.
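As an illustration of the metric-ranking step, the sketch below correlates a set of platform metrics collected during a run with a service KPI and ranks them by absolute correlation; the metric names and synthetic data are placeholders rather than the project's actual telemetry or analysis pipeline.

```python
# Illustrative ranking of platform metrics by their influence on a service KPI,
# using absolute Pearson correlation on synthetic data (placeholder metrics).
import numpy as np

rng = np.random.default_rng(0)
n_samples = 200

# Synthetic telemetry: the KPI is made to depend mostly on cache misses and memory BW.
metrics = {
    "cpu_utilisation":  rng.uniform(0.2, 0.9, n_samples),
    "llc_misses":       rng.uniform(0.0, 1.0, n_samples),
    "memory_bandwidth": rng.uniform(0.0, 1.0, n_samples),
    "disk_iops":        rng.uniform(0.0, 1.0, n_samples),
}
kpi_latency = (5.0 * metrics["llc_misses"]
               + 2.0 * metrics["memory_bandwidth"]
               + rng.normal(0.0, 0.2, n_samples))

# Rank metrics by |correlation| with the KPI: a crude "fingerprint" of the workload.
ranking = sorted(
    ((name, abs(np.corrcoef(values, kpi_latency)[0, 1]))
     for name, values in metrics.items()),
    key=lambda item: item[1], reverse=True)

for name, score in ranking:
    print(f"{name:18s} |corr| = {score:.2f}")
```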

19. A novel automated approach for the implementation of block abstraction models was developed.

The block abstraction model provides a logical representation of a workload's affinity for infrastructure resource allocations and features. It also encapsulates the effect that the deployment topology across a heterogeneous resource landscape has on a workload's KPI performance. The block-abstraction modelling approach was applied to reasoning over performance-versus-cost deployment options; it enabled infrastructure cost reductions when favouring cost over performance for a test service.

20. Debugging P4 Programs with Vera.

Truly programmable switches allow the dataplane algorithm itself to be replaced, not just the rules it uses (as in OpenFlow). To ensure programmability and performance, 5G networks will adopt switches that support the P4 language or similar approaches (e.g. OpenState from the University of Rome Tor Vergata). We present Vera, a tool that exhaustively verifies P4 programs using symbolic execution. Vera automatically uncovers a number of common bugs, including parsing/deparsing errors, invalid memory accesses, loops and tunneling errors, among others. To enable scalable, exhaustive verification, Vera automatically generates all valid header layouts, uses symbolic table entries to simulate a variety of table-rule snapshots, and uses a novel data structure for match-action processing optimized for verification. These techniques allow Vera to scale very well: it takes only 5-15 s to track the execution of a purely symbolic packet in the largest P4 program currently available (6 KLOC), and it can compute SEFL model updates corresponding to table insertions and deletions in milliseconds. We have used Vera to analyze all the P4 programs we could find, including the P4 tutorials, P4 programs from the research literature and the switch code from https://p4.org. Vera has found several bugs in each of them in seconds.

21. Resource allocation for network functions in Click-based environments: a joint characterization, modeling and optimisation approach based on machine learning.

We tackled the optimisation of resource allocation for network functions in Click-based environments, devising a joint characterization, modeling and optimisation approach based on machine learning. The proposed approach is very general and can also be applied to other environments, such as the deployment of VMs, which was also investigated with promising results using this method. As this approach requires a dataset of network-function performance measurements, we first devised a tool, the Network Performance Framework (NPF), to help with such experiments, and we began the generation of an open dataset to foster further research into this topic, both in the networking and in the machine-learning communities. While this research work is not fully completed, the results achieved so far are very promising.
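As a flavour of the modeling step, the sketch below trains a regression model that predicts the throughput of a network function from a few workload and resource-allocation features; the feature names, synthetic data and choice of a random forest are illustrative assumptions rather than the models actually evaluated with NPF.

```python
# Illustrative performance model: predict network-function throughput from
# workload and resource features. Synthetic data; not the NPF dataset.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 500

# Hypothetical features: number of cores, packet size (bytes), offered load (Gbps).
cores = rng.integers(1, 9, n)
pkt_size = rng.choice([64, 256, 512, 1024, 1500], n)
offered = rng.uniform(1, 40, n)

# Synthetic ground truth: throughput saturates with cores and packet size.
capacity = 2.5 * cores * (pkt_size / 1500) ** 0.5
throughput = np.minimum(offered, capacity) + rng.normal(0, 0.3, n)

X = np.column_stack([cores, pkt_size, offered])
X_train, X_test, y_train, y_test = train_test_split(X, throughput, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))

# Query the model for a candidate resource allocation.
print("Predicted throughput:", model.predict([[4, 512, 20.0]])[0])
```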
