Cloud computing has significantly changed the IT landscape. Today it is possible for small companies or even single individuals to access virtually unlimited resources in large data centres (DCs) for running computationally demanding tasks. This has triggered the rise of "big data" applications, which operate on large amounts of data. These include traditional batch-oriented applications, such as data mining, data indexing, log collection and analysis, and scientific applications, as well as real-time stream processing, web search and advertising.
To support big data applications, parallel processing systems such as MapReduce adopt a partition/aggregate model: a large input data set is distributed over many servers, each of which processes a share of the data. Locally generated intermediate results must then be aggregated to obtain the final result. An open challenge of the partition/aggregate model is that it causes high contention for network resources in DCs because large amounts of data traffic are exchanged between servers. Improving the performance of such network-bound applications in DCs has attracted much interest from the research community. One class of solutions reduces bandwidth usage by employing overlay networks to distribute data and to perform partial aggregation. However, this requires applications to reverse-engineer the physical network topology in order to optimise the layout of the overlay. Even with perfect knowledge of the physical topology, fundamental inefficiencies remain: for example, any logical topology with a server fan-out higher than one cannot be mapped optimally onto the physical network if servers have only a single network interface, since all outgoing logical links must then share that one physical link.
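To make the model concrete, the following is a minimal sketch of the partition/aggregate pattern, assuming a word-count style job; the function names and the hash-based partitioning are illustrative rather than taken from any particular system. Note that every intermediate pair produced in the partition phase must cross the network in the shuffle step before aggregation, which is precisely where the contention described above arises.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Partition phase: each server maps over its share of the input and
# emits intermediate (key, value) pairs (here: per-word counts).
def map_shard(lines: Iterable[str]) -> Dict[str, int]:
    counts: Dict[str, int] = defaultdict(int)
    for line in lines:
        for word in line.split():
            counts[word] += 1
    return counts

# Shuffle: intermediate results are routed to reducers, e.g. by hashing
# the key. In a real deployment this step crosses the DC network.
def shuffle(partials: List[Dict[str, int]], n_reducers: int) -> List[Dict[str, List[int]]]:
    buckets: List[Dict[str, List[int]]] = [defaultdict(list) for _ in range(n_reducers)]
    for partial in partials:
        for word, count in partial.items():
            buckets[hash(word) % n_reducers][word].append(count)
    return buckets

# Aggregate phase: each reducer combines the values it received.
def reduce_bucket(bucket: Dict[str, List[int]]) -> Dict[str, int]:
    return {word: sum(counts) for word, counts in bucket.items()}

if __name__ == "__main__":
    shards = [["the quick brown fox", "the lazy dog"],
              ["the dog barks", "quick quick fox"]]
    partials = [map_shard(shard) for shard in shards]
    results = [reduce_bucket(b) for b in shuffle(partials, n_reducers=2)]
    print(results)
```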
In contrast, we argue that the problem can be solved more effectively by providing DC tenants with efficient, easy and safe control of network operations. Rather than over-provisioning the network, we focus on optimising network traffic by exploiting application-specific knowledge. We term this approach "network-as-a-service" (NaaS) because it allows tenants to customise the service that they receive from the network. NaaS-enabled tenants can deploy custom routing protocols, including multicast services or anycast/incast protocols, as well as more sophisticated mechanisms, such as content-based routing and content-centric networking. By modifying the content of packets on-path, they can efficiently implement advanced, application-specific network services, such as in-network data aggregation and smart caching. Parallel processing systems such as MapReduce would greatly benefit because data can be aggregated on-path, thus reducing execution times. Key-value stores (e.g. memcached) can improve their performance by caching popular keys within the network, which decreases latency and bandwidth usage compared to end-host-only deployments.
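As an illustration of the kind of in-network function that NaaS would enable, the sketch below shows on-path aggregation of partial results at a switch-like element: instead of forwarding every intermediate pair to the destination, the element combines values for the same key and sends a single, smaller message upstream. The AggregatingElement class and its interface are hypothetical, intended only to convey the idea rather than to reflect an actual NaaS API.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical in-network element sitting on the path between workers
# and a reducer. It merges partial results for the same key so that
# only one aggregated message travels further upstream.
class AggregatingElement:
    def __init__(self) -> None:
        self.pending: Dict[str, int] = defaultdict(int)
        self.expected = 0
        self.received = 0

    def expect(self, n_children: int) -> None:
        """Record how many downstream partial results to wait for."""
        self.expected = n_children

    def on_packet(self, partial: Dict[str, int]) -> None:
        """Fold a child's partial result into the running aggregate."""
        for key, value in partial.items():
            self.pending[key] += value
        self.received += 1

    def flush(self) -> Dict[str, int]:
        """Emit one aggregated result once all children have reported."""
        assert self.received == self.expected, "still waiting for children"
        return dict(self.pending)

if __name__ == "__main__":
    elem = AggregatingElement()
    elem.expect(2)
    elem.on_packet({"the": 3, "fox": 1})   # partial result from worker 1
    elem.on_packet({"the": 1, "dog": 2})   # partial result from worker 2
    print(elem.flush())                    # single combined message upstream
```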