Internet resilience

Richard Clayton has a great post summarising the recent paper on Internet resilience that he co-authored for ENISA. Food for thought:

Internet interconnectivity is a complex ecosystem with many interdependent layers. Its operation is governed by the collective self-interest of the Internet’s networks, but there is no central Network Operations Centre (NOC), staffed with technicians ready to leap into action when trouble occurs. The open and decentralised organisation that is the very essence of the ecosystem is essential to the success and resilience of the Internet. Yet there are a number of concerns.

First, the Internet is vulnerable to various kinds of common-mode technical failures in which systems are disrupted in many places simultaneously; service could be substantially disrupted by failures of other utilities, particularly the electricity supply; a flu pandemic could cause the people on whose work it depends to stay at home, just as demand for home working by others was peaking; and finally, because of its open nature, the Internet is at risk of intentionally disruptive attacks.

Second, there are concerns about the sustainability of the current business models. Internet service is cheap, and rapidly becoming cheaper, because the costs of service provision are mostly fixed costs; the marginal costs are low, so competition forces prices ever downwards. Some of the largest operators – the ‘Tier 1’ transit providers – are losing substantial amounts of money, and it is not clear how future capital investment will be financed. There is a risk that consolidation might reduce the current twenty-odd providers to a handful, at which point regulation may be needed to prevent monopoly pricing.

Third, dependability and economics interact in potentially pernicious ways. Most of the things that service providers can do to make the Internet more resilient, from maintaining excess capacity to route filtering, benefit other providers much more than the firm that pays for them, leading to a potential ‘tragedy of the commons’. Similarly, security mechanisms that would help reduce the likelihood and the impact of malice, error and mischance are not implemented because no-one has found a way to roll them out that gives sufficiently incremental and sufficiently local benefit.

Fourth, there is remarkably little reliable information about the size and shape of the Internet infrastructure or its daily operation. This hinders any attempt to assess its resilience in general, and the analysis of the true impact of incidents in particular. The opacity also hinders research and development of improved protocols, systems and practices by making it hard to know what the issues really are, and harder yet to test proposed solutions.

If you have the time, read the full report. If you don’t have the time, at least read the Executive Summary.