This section briefly discusses how the three kinds of knowledge (configuration, operation, and control knowledge) are collected in our system.
First, agents collect configuration knowledge from the local AS. To do so, agents need authorization to access BGP tables from local border routers. We believe that BGP information is usually not sensitive, so organizations may be willing to disclose it, and agents can apply policies when exposing such information to others. Network topology is also stable, so it can be stored locally and updated infrequently. In this work, we use BGP data from multiple sources to construct an AS-level network topology that is as complete as possible, as was done by He et al. [56].
To provide configuration knowledge, and network topology knowledge in particular, some agents maintain two kinds of topology knowledge at the Autonomous System level: (1) the AS number to which an IP address belongs; (2) the AS path between the local AS and another AS.
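The two kinds of topology knowledge can be illustrated with a minimal sketch. It assumes that prefix-to-origin-AS pairs and AS paths have already been extracted from BGP tables; the class `TopologyKnowledge` and its method names are hypothetical and are not part of the system described here. The IP-to-AS lookup uses longest-prefix matching, which is a common choice but an assumption on our part.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple


class TopologyKnowledge:
    """Toy store for AS-level topology knowledge (hypothetical names)."""

    def __init__(self, local_as: int) -> None:
        self.local_as = local_as
        # (prefix, origin AS) pairs, assumed to come from a BGP table dump.
        self._prefixes: List[Tuple[object, int]] = []
        # Destination AS -> AS path that starts at the local AS.
        self._paths: Dict[int, List[int]] = {}

    def add_prefix(self, prefix: str, origin_as: int) -> None:
        self._prefixes.append((ipaddress.ip_network(prefix), origin_as))

    def add_path(self, as_path: List[int]) -> None:
        if not as_path or as_path[0] != self.local_as:
            raise ValueError("AS path must start at the local AS")
        self._paths[as_path[-1]] = list(as_path)

    def as_of(self, ip: str) -> Optional[int]:
        """(1) AS number of an IP address, by longest-prefix match."""
        addr = ipaddress.ip_address(ip)
        best_len, best_as = -1, None
        for net, origin_as in self._prefixes:
            if addr in net and net.prefixlen > best_len:
                best_len, best_as = net.prefixlen, origin_as
        return best_as

    def as_path_to(self, dest_as: int) -> Optional[List[int]]:
        """(2) AS path from the local AS to another AS, if known."""
        if dest_as == self.local_as:
            return [self.local_as]
        return self._paths.get(dest_as)
```

The linear scan keeps the sketch short; a real table with hundreds of thousands of prefixes would more likely use a trie. A `None` result marks knowledge the local agent does not hold.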
Some AS information is already available [17]. Other researchers have successfully retrieved BGP information. For example, PlanetLab [96] implemented an application support layer to provide BGP information. To do this, a PlanetLab server is paired with a BGP router, and the two are configured to provide a one-way information flow from the router to the PlanetLab node [81]. This does not require implementing any special interface on the routers. Furthermore, a set of PlanetLab servers collectively constructs a peering AS graph. A PlanetLab node also implements a PLUTO BGP Sensor interface to give applications easy access to BGP information [115]. Note that the service that maps IP addresses to their AS numbers can be replicated in each region, because the data set is not large. For example, the compressed data set of the RouteViews BGP tables is only 13 MB.
In this way, an agent maintains only the local view of the Internet, which represents the reachability of the rest of the Internet from the local network. There are three reasons for this. First, it is hard to get an accurate global view of the network topology, but easy to obtain the local view. Second, such local network knowledge should be able to satisfy local requests most of the time. Third, remote network knowledge can be obtained from other agents through request resolution. Note that restrictions may be applied to the parameters of such queries. For example, an application may not be allowed to query the AS path between two arbitrary ASes, because the routing information of those ASes raises privacy concerns.
Geographic information is another kind of network knowledge we are interested in. It provides physical location information, which is often directly related to network performance, such as latency. It also enables a large set of location-aware applications. It is not trivial to obtain accurate geographic information today, but approximate location is enough to organize agents in this work. We use data from the GeoIP Database [41] for this purpose, and plan to leverage existing techniques like [90] in the future. Agents can run the geography service and register the service at the regional leaders. However, it gets complicated if an agent is mobile. We set aside the mobility issue in this work.
Second, operation knowledge may be more costly to obtain and maintain than configuration knowledge. Among the different kinds of operation knowledge, latency is usually easy and lightweight to measure with tools such as ping. Unless the latency to a large number of hosts or an average over a long time period is needed, real-time measurement works well because of its simplicity and low overhead. Other information, such as bandwidth and loss rate, can be obtained through measurement at a higher overhead. Many tools have been developed to measure network status, and new tools are being developed.
Agents can use those tools and share performance knowledge. A request for performance knowledge between two hosts is resolved by agents near the hosts. For example, the latency between two hosts can be approximated by the latency between two agents plus the latency between each host and its nearby agent, similar to [55]. As another example, agents may infer the properties of a new path by segmentation and composition using previous measurement results. This is similar to network tomography [28]. Consistency is another important issue here. Agents at different locations may return different answers to the same request, and the same request asked at different times may get different answers, even if it is not time sensitive.
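The latency approximation described above amounts to adding three measured segments. The following is a minimal sketch under stated assumptions: the function name, the dictionary layout, and the rule that two hosts served by the same agent contribute no agent-to-agent latency are illustrative choices, not details from this work.

```python
from typing import Dict, Optional, Tuple


def estimate_latency_ms(
    host_a: str,
    host_b: str,
    nearest_agent: Dict[str, str],
    host_to_agent_ms: Dict[str, float],
    agent_to_agent_ms: Dict[Tuple[str, str], float],
) -> Optional[float]:
    """Approximate host-to-host latency as
    (host_a -> its agent) + (agent -> agent) + (host_b -> its agent).

    Returns None if any segment has not been measured.
    """
    try:
        agent_a = nearest_agent[host_a]
        agent_b = nearest_agent[host_b]
        edge_a = host_to_agent_ms[host_a]
        edge_b = host_to_agent_ms[host_b]
    except KeyError:
        return None

    if agent_a == agent_b:
        # Assumption: both hosts share one agent, so no middle segment.
        middle = 0.0
    else:
        # Treat agent-to-agent latency as symmetric for lookup purposes.
        middle = agent_to_agent_ms.get((agent_a, agent_b))
        if middle is None:
            middle = agent_to_agent_ms.get((agent_b, agent_a))
        if middle is None:
            return None

    return edge_a + middle + edge_b
```

With host-to-agent latencies of 5 ms and 8 ms and an agent-to-agent latency of 40 ms, the estimate is 53 ms. The sum tends to overestimate the true latency when the direct path between the hosts does not pass near either agent.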
Note that the performance and geographic knowledge may be approximate rather than accurate. First, performance knowledge changes frequently. Even if we obtain accurate measurement results, they may be outdated by the time they are returned to the requester. Second, in many situations approximate information is enough. For example, a streaming video application only needs to know the class of bandwidth (high, medium, low) to determine the appropriate encoding method.
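The bandwidth-class example can be sketched as a simple threshold function. The thresholds below (1 Mbps and 5 Mbps) and all names are assumptions chosen only for illustration; the text does not specify where the class boundaries lie.

```python
from enum import Enum


class BandwidthClass(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical class boundaries in Mbps; not taken from the text.
MEDIUM_MIN_MBPS = 1.0
HIGH_MIN_MBPS = 5.0


def classify_bandwidth(mbps: float) -> BandwidthClass:
    """Map an approximate bandwidth estimate to a coarse class."""
    if mbps < 0:
        raise ValueError("bandwidth cannot be negative")
    if mbps >= HIGH_MIN_MBPS:
        return BandwidthClass.HIGH
    if mbps >= MEDIUM_MIN_MBPS:
        return BandwidthClass.MEDIUM
    return BandwidthClass.LOW
```

Because only the class matters to the application, two estimates that differ by a modest error usually lead to the same encoding decision, which is why approximate knowledge suffices in this case.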
Third, control knowledge, such as policies, is usually very hard to obtain directly or to infer. Sometimes we can discover it partially. For example, we can test whether an ISP blocks a given port. In general, however, this knowledge is hard to observe from outside. We focus on simple policy information in this thesis.
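A port test of this kind could look like the sketch below, which attempts a TCP connection and reports what happened. The function name and the result labels are hypothetical. Note the limits of such a probe: a timeout is consistent with filtering somewhere along the path, but it does not by itself show that the ISP is the party doing the blocking, and a refusal may simply mean that no service is listening.

```python
import socket


def probe_tcp_port(host: str, port: int, timeout_s: float = 2.0) -> str:
    """Try a TCP connection and classify the outcome.

    Returns one of: "open", "refused", "timeout", "error".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "open"
    except ConnectionRefusedError:
        # The remote end (or a middlebox) actively rejected the connection.
        return "refused"
    except socket.timeout:
        # No reply at all; packets may have been silently dropped.
        return "timeout"
    except OSError:
        # Anything else, e.g. name resolution failure or unreachable network.
        return "error"
```

Comparing results for the same destination from agents in different networks gives a stronger hint about where the filtering occurs than a single probe does.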