icarus.scenarios package¶
Submodules¶
icarus.scenarios.algorithms module¶
Various algorithms used for optimal cache placement.

pam
(distances, k, n_iter=10)[source]¶ Compute k-medoids clustering using the PAM algorithm
Parameters: distances : 2d NumPy array
Array of distances between points
k : int
Number of clusters
n_iter : int
Number of times the experiment is repeated. Each repetition is executed starting from a different random initial assignment. Repeating the experiment allows different local optima to be reached, possibly yielding a better solution.
Notes
Implementation based on: https://github.com/salspaugh/machine_learning/blob/master/clustering/kmedoids.py
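For illustration, here is a minimal, self-contained sketch of a PAM-style k-medoids procedure with random restarts, mirroring the behaviour described above (this is not the Icarus implementation; it uses plain lists instead of a NumPy distance array):

```python
import random

def pam_sketch(distances, k, n_iter=10, seed=None):
    """Illustrative PAM-style k-medoids with random restarts (sketch only)."""
    rng = random.Random(seed)
    n = len(distances)
    best_cost, best_medoids = float("inf"), None
    for _ in range(n_iter):
        # Each repetition starts from a different random initial assignment
        medoids = rng.sample(range(n), k)
        for _ in range(100):  # cap iterations to guarantee termination
            # Assignment step: attach every point to its closest medoid
            assign = [min(medoids, key=lambda m: distances[i][m]) for i in range(n)]
            # Update step: each cluster's new medoid minimizes intra-cluster distance
            new_medoids = [min((i for i in range(n) if assign[i] == m),
                               key=lambda c: sum(distances[c][j]
                                                 for j in range(n) if assign[j] == m))
                           for m in medoids]
            if set(new_medoids) == set(medoids):
                break  # converged to a local optimum
            medoids = new_medoids
        cost = sum(min(distances[i][m] for m in medoids) for i in range(n))
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    return best_medoids
```

On four points at positions 0, 1, 10, 11 with k=2, the restarts reliably find the natural split into {0, 1} and {10, 11}.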

extract_cluster_level_topology
(topology)[source]¶ Build a cluster-level topology.
Each node in the topology must have the ‘cluster’ attribute
Parameters: topology : Topology
The router-level topology
Returns: topology : Topology
The cluster-level topology
Notes
 Each router must have a cache deployed
 All sources and receivers must have a single attachment point with a cache
 Each node must be labelled with a cluster

deploy_clusters
(topology, clusters, assign_src_rcv=True)[source]¶ Annotate a topology with cluster information
This function checks that all ICR candidate nodes are assigned exactly to one cluster.
If assign_src_rcv is True, then it also labels source and receiver nodes to the closest cluster.
 This function assumes that:
 each node of the topology is either an icr_candidate, a source or a receiver
 each source and receiver has degree equal to 1
Parameters: topology : Topology
The topology onto which deploy clusters
clusters : list of sets
Router-cluster assignment. Each element of the list is a set of node identifiers. Nodes in the same set belong to the same cluster. The length of the list therefore corresponds to the number of clusters.
assign_src_rcv : bool, optional
If True, the function labels source and receiver nodes with the cluster label of the router they are attached to.

compute_clusters
(topology, k, distance='delay', nbunch=None, n_iter=10)[source]¶ Cluster the nodes of a topology so as to minimize intra-cluster latency.
This function assumes that every link is labelled with latencies and performs clustering using the k-medoids method with the PAM algorithm.
Parameters: topology : Topology
The topology
k : int
The number of clusters
distance : str, optional
The link metric used to represent distance between nodes. If None, hop count is used instead
n_iter : int, optional
The number of iterations
icarus.scenarios.cacheplacement module¶
Cache placement strategies
This module provides algorithms for performing cache placement, i.e., given a cumulative cache size and a topology where each possible node candidate is labelled, these functions deploy caching space to the nodes of the topology.

uniform_cache_placement
(topology, cache_budget, **kwargs)[source]¶ Places cache budget uniformly across cache nodes.
Parameters: topology : Topology
The topology object
cache_budget : int
The cumulative cache budget
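The uniform strategy admits a one-line sketch. The helper name and plain-set interface below are hypothetical (the real function operates on a Topology object and annotates it in place):

```python
def uniform_cache_placement_sketch(icr_candidates, cache_budget):
    # Divide the cumulative budget evenly across all cache candidate nodes
    # (integer division; any remainder is left unallocated in this sketch).
    cache_size = cache_budget // len(icr_candidates)
    return {v: cache_size for v in icr_candidates}
```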

degree_centrality_cache_placement
(topology, cache_budget, **kwargs)[source]¶ Places cache budget proportionally to the degree of the node.
Parameters: topology : Topology
The topology object
cache_budget : int
The cumulative cache budget

betweenness_centrality_cache_placement
(topology, cache_budget, **kwargs)[source]¶ Places cache budget proportionally to the betweenness centrality of the node.
Parameters: topology : Topology
The topology object
cache_budget : int
The cumulative cache budget
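Both centrality-based strategies follow the same proportional-allocation pattern; only the centrality metric differs. A sketch, assuming the centralities have already been computed into a dict (hypothetical helper, not the Icarus API):

```python
def centrality_cache_placement_sketch(centrality, cache_budget):
    # Allocate the cumulative budget proportionally to each node's centrality
    # (degree or betweenness); fractional allocations are truncated to int.
    total = sum(centrality.values())
    return {v: int(cache_budget * c / total) for v, c in centrality.items()}
```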

uniform_consolidated_cache_placement
(topology, cache_budget, spread=0.5, metric_dict=None, target='top', **kwargs)[source]¶ Consolidate caches in nodes with top centrality.
Unlike other cache placement strategies, which assign cache space to all nodes proportionally to their centrality, this strategy places caches of equal size on a set of selected nodes.
Parameters: topology : Topology
The topology object
cache_budget : int
The cumulative cache budget
spread : float [0, 1], optional
The spread factor. The greater it is, the more the cache budget is spread among nodes. If it is 1, all candidate nodes are assigned a cache; if it is 0, only the node with the highest/lowest centrality is assigned a cache
metric_dict : dict, optional
The centrality metric according to which nodes are selected. If not specified, betweenness centrality is selected.
target : (“top” | “bottom”), optional
The subsection of the ranked nodes on which to deploy caches.

random_cache_placement
(topology, cache_budget, n_cache_nodes, seed=None, **kwargs)[source]¶ Deploy caching nodes randomly
Parameters: topology : Topology
The topology object
cache_budget : int
The cumulative cache budget
n_cache_nodes : int
The number of caching nodes to deploy
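A sketch of the random strategy, again with a hypothetical helper name and a plain-set interface in place of the Topology object:

```python
import random

def random_cache_placement_sketch(icr_candidates, cache_budget,
                                  n_cache_nodes, seed=None):
    # Pick n_cache_nodes candidates uniformly at random and split the
    # budget evenly among them; sorting makes the draw reproducible.
    rng = random.Random(seed)
    caches = rng.sample(sorted(icr_candidates), n_cache_nodes)
    return {v: cache_budget // n_cache_nodes for v in caches}
```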

optimal_median_cache_placement
(topology, cache_budget, n_cache_nodes, hit_ratio, weight='delay', **kwargs)[source]¶ Deploy caching nodes in locations that minimize overall latency assuming a partitioned strategy (a la Google Global Cache). According to this strategy, a set of caching nodes is deployed in the network and each receiver is mapped to one and only one caching node. Requests from that receiver are always sent to the designated caching node. In case of a cache miss, requests are forwarded to the original source.
This placement problem can be mapped to the p-median location-allocation problem. This function solves the problem using the vertex substitution heuristic, which in practice works like the k-medoids PAM algorithm and is also similar to the k-means clustering algorithm. The result is not guaranteed to be globally optimal, only locally optimal.
Parameters: topology : Topology
The topology object
cache_budget : int
The cumulative cache budget
n_cache_nodes : int
The number of caching nodes to deploy
hit_ratio : float
The expected cache hit ratio of a single cache
weight : str
The weight attribute
Notes
This placement assumes that all receivers have degree = 1 and are connected to an ICR candidate node. It also assumes that contents are uniformly assigned to sources.

optimal_hashrouting_cache_placement
(topology, cache_budget, n_cache_nodes, hit_ratio, weight='delay', **kwargs)[source]¶ Deploy caching nodes for hash-routing in optimized locations
Parameters: topology : Topology
The topology object
cache_budget : int
The cumulative cache budget
n_cache_nodes : int
The number of caching nodes to deploy
hit_ratio : float
The expected global cache hit ratio
weight : str, optional
The weight attribute. Default is ‘delay’

clustered_hashrouting_cache_placement
(topology, cache_budget, n_clusters, policy, distance='delay', **kwargs)[source]¶ Deploy caching nodes for hash-routing with clusters
Parameters: topology : Topology
The topology object
cache_budget : int
The cumulative cache budget
n_clusters : int
The number of clusters
policy : str (node_const | cluster_const)
The policy used to allocate cache space across clusters: with node_const every node is assigned the same cache size, while with cluster_const every cluster is assigned the same total cache size regardless of how many nodes it contains
distance : str
The attribute used to quantify distance between pairs of nodes. Default is ‘delay’
icarus.scenarios.contentplacement module¶
Content placement strategies.
This module contains function to decide the allocation of content objects to source nodes.

uniform_content_placement
(topology, contents, seed=None)[source]¶ Places content objects to source nodes randomly following a uniform distribution.
Parameters: topology : Topology
The topology object
contents : iterable
Iterable of content objects
 source_nodes : list
List of nodes of the topology which are content sources
Returns: cache_placement : dict
Dictionary mapping content objects to source nodes
Notes
A deterministic placement of objects (e.g., for reproducing results) can be achieved by using a fixed seed value
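The behaviour described above can be sketched as follows (hypothetical helper; the real function derives the source nodes from the topology rather than taking them as an argument):

```python
import random

def uniform_content_placement_sketch(source_nodes, contents, seed=None):
    # Map each content object to a source node drawn uniformly at random;
    # a fixed seed makes the placement reproducible across runs.
    rng = random.Random(seed)
    return {c: rng.choice(source_nodes) for c in contents}
```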

weighted_content_placement
(topology, contents, source_weights, seed=None)[source]¶ Places content objects to source nodes randomly according to the weight of the source node.
Parameters: topology : Topology
The topology object
contents : iterable
Iterable of content objects
 source_weights : dict
Dict mapping the nodes of the topology which are content sources to the weight according to which the content placement decision is made.
Returns: cache_placement : dict
Dictionary mapping content objects to source nodes
Notes
A deterministic placement of objects (e.g., for reproducing results) can be achieved by using a fixed seed value
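A weighted draw can be sketched with the standard library's random.choices (hypothetical helper; the real function takes a Topology object as well):

```python
import random

def weighted_content_placement_sketch(source_weights, contents, seed=None):
    # Draw a source for each content with probability proportional to its
    # weight; a node with weight 0 is never selected.
    rng = random.Random(seed)
    nodes = list(source_weights)
    weights = [source_weights[n] for n in nodes]
    return {c: rng.choices(nodes, weights=weights)[0] for c in contents}
```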
icarus.scenarios.topology module¶
Functions for creating or importing topologies for experiments.
To create a custom topology, create a function returning an instance of the IcnTopology class. An IcnTopology is simply a subclass of a Topology class provided by FNSS.
 A valid ICN topology must have the following attributes:
 Each node must have one stack among: source, receiver, router
 The topology must have an attribute called icr_candidates which is a set of router nodes on which a cache may be possibly deployed. Caches are not deployed directly at topology creation, instead they are deployed by a cache placement algorithm.
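The labelling a valid ICN topology needs can be illustrated with plain data structures (a sketch only: plain dicts stand in for FNSS stacks and an IcnTopology instance, and the function name is hypothetical):

```python
def minimal_icn_labelling():
    # Smallest valid labelling: a path receiver - router - source.
    stacks = {0: 'receiver', 1: 'router', 2: 'source'}
    edges = [(0, 1), (1, 2)]
    # Set of routers on which a cache may later be deployed by a
    # cache placement algorithm (no caches at topology creation).
    icr_candidates = {n for n, s in stacks.items() if s == 'router'}
    return stacks, edges, icr_candidates
```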

class
IcnTopology
(data=None, name='', **kwargs)[source]¶ Bases:
fnss.topologies.topology.Topology
Class modelling an ICN topology
An ICN topology is a simple FNSS Topology with additional methods that return sets of caching nodes, sources and receivers.
Attributes
name
Methods
add_cycle(nodes, **attr): Add a cycle.
add_edge(u, v[, attr_dict]): Add an edge between u and v.
add_edges_from(ebunch[, attr_dict]): Add all the edges in ebunch.
add_node(n[, attr_dict]): Add a single node n and update node attributes.
add_nodes_from(nodes, **attr): Add multiple nodes.
add_path(nodes, **attr): Add a path.
add_star(nodes, **attr): Add a star.
add_weighted_edges_from(ebunch[, weight]): Add all the edges in ebunch as weighted edges with specified weights.
adjacency_iter(): Return an iterator of (node, adjacency dict) tuples for all nodes.
adjacency_list(): Return an adjacency list representation of the graph.
adjlist_dict_factory: alias of dict
applications(): Return a dictionary of all applications deployed, keyed by node
buffers(): Return a dictionary of all buffer sizes, keyed by interface
cache_nodes(): Return a dictionary mapping nodes with a cache and respective cache capacities
capacities(): Return a dictionary of all link capacities, keyed by link
clear(): Remove all nodes and edges from the graph.
copy(): Return a copy of the topology.
degree([nbunch, weight]): Return the degree of a node or nodes.
degree_iter([nbunch, weight]): Return an iterator for (node, degree).
delays(): Return a dictionary of all link delays, keyed by link
edge_attr_dict_factory: alias of dict
edges([nbunch, data, default]): Return a list of edges.
edges_iter([nbunch, data, default]): Return an iterator over the edges.
get_edge_data(u, v[, default]): Return the attribute dictionary associated with edge (u,v).
has_edge(u, v): Return True if the edge (u,v) is in the graph.
has_node(n): Return True if the graph contains the node n.
is_directed(): Return True if graph is directed, False otherwise.
is_multigraph(): Return True if graph is a multigraph, False otherwise.
nbunch_iter([nbunch]): Return an iterator of nodes contained in nbunch that are also in the graph.
neighbors(n): Return a list of the nodes connected to the node n.
neighbors_iter(n): Return an iterator over all neighbors of node n.
node_dict_factory: alias of dict
nodes([data]): Return a list of the nodes in the graph.
nodes_iter([data]): Return an iterator over the nodes.
nodes_with_selfloops(): Return a list of nodes with self-loops.
number_of_edges([u, v]): Return the number of edges between two nodes.
number_of_nodes(): Return the number of nodes in the graph.
number_of_selfloops(): Return the number of self-loop edges.
order(): Return the number of nodes in the graph.
receivers(): Return a set of receiver nodes
remove_edge(u, v): Remove the edge between u and v.
remove_edges_from(ebunch): Remove all edges specified in ebunch.
remove_node(n): Remove node n.
remove_nodes_from(nodes): Remove multiple nodes.
selfloop_edges([data, default]): Return a list of self-loop edges.
size([weight]): Return the number of edges.
sources(): Return a set of source nodes
stacks(): Return a dictionary of all node stacks, keyed by node
subgraph(nbunch): Return the subgraph induced on nodes in nbunch.
to_directed(): Return a directed representation of the topology.
to_undirected(): Return an undirected copy of the topology.
weights(): Return a dictionary of all link weights, keyed by link

topology_tree
(k, h, delay=1, **kwargs)[source]¶ Returns a tree topology, with a source at the root, receivers at the leaves and caches at all intermediate nodes.
Parameters: h : int
The height of the tree
k : int
The branching factor of the tree
delay : float
The link delay in milliseconds
Returns: topology : IcnTopology
The topology object

topology_path
(n, delay=1, **kwargs)[source]¶ Return a path topology with a receiver at node 0 and a source at node ‘n-1’
Parameters: n : int (>=3)
The number of nodes
delay : float
The link delay in milliseconds
Returns: topology : IcnTopology
The topology object

topology_ring
(n, delay_int=1, delay_ext=5, **kwargs)[source]¶ Returns a ring topology
This topology comprises a ring of n nodes. Each of these nodes is attached to a receiver. In addition, one router is attached to a source. Therefore, this topology in fact has 2n + 1 nodes.
It models the case of a metro ring network, with many receivers and only one source towards the core network.
Parameters: n : int
The number of routers in the ring
delay_int : float
The internal link delay in milliseconds
delay_ext : float
The external link delay in milliseconds
Returns: topology : IcnTopology
The topology object

topology_mesh
(n, m, delay_int=1, delay_ext=5, **kwargs)[source]¶ Returns a mesh topology
This topology comprises a mesh of n nodes. Each of these nodes is attached to a receiver. In addition, m routers are each attached to a source. Therefore, this topology in fact has 2n + m nodes.
Parameters: n : int
The number of routers in the mesh
m : int
The number of sources
delay_int : float
The internal link delay in milliseconds
delay_ext : float
The external link delay in milliseconds
Returns: topology : IcnTopology
The topology object

topology_geant
(**kwargs)[source]¶ Return a scenario based on GEANT topology
Parameters: seed : int, optional
The seed used for random number generation
Returns: topology : fnss.Topology
The topology object

topology_tiscali
(**kwargs)[source]¶ Return a scenario based on Tiscali topology, parsed from RocketFuel dataset
Parameters: seed : int, optional
The seed used for random number generation
Returns: topology : fnss.Topology
The topology object

topology_wide
(**kwargs)[source]¶ Return a scenario based on WIDE topology
Parameters: seed : int, optional
The seed used for random number generation
Returns: topology : fnss.Topology
The topology object

topology_garr
(**kwargs)[source]¶ Return a scenario based on GARR topology
Parameters: seed : int, optional
The seed used for random number generation
Returns: topology : fnss.Topology
The topology object

topology_rocketfuel_latency
(asn, source_ratio=0.1, ext_delay=34, **kwargs)[source]¶ Parse a generic RocketFuel topology with annotated latencies
An artificial receiver node is attached to each node of the parsed topology. Source nodes are also attached to the routers with the highest degree.
Parameters: asn : int
AS number
source_ratio : float
Ratio between number of source nodes (artificially attached) and routers
ext_delay : float
Delay on external nodes
icarus.scenarios.workload module¶
Traffic workloads
Every traffic workload to be used with Icarus must be modelled as an iterable class, i.e. a class with at least an __init__ method (through which it is initialized, with values taken from the configuration file) and an __iter__ method that is called to return a new event.
Each call to the __iter__ method must return a 2-tuple in which the first element is the timestamp at which the event occurs and the second is a dictionary, describing the event, which must contain at least the three following attributes:
 receiver: The name of the node issuing the request
 content: The name of the content for which the request is issued
 log: A boolean value indicating whether this request should be logged or not for measurement purposes.
Each workload must expose the contents attribute which is an iterable of all content identifiers. This is needed for content placement.
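The contract above can be illustrated with a minimal custom workload (a hypothetical class, not part of Icarus: uniform receiver/content selection with Poisson-spaced timestamps):

```python
import random

class UniformWorkloadSketch:
    """Minimal sketch of the workload contract: iterable of (time, event) pairs."""

    def __init__(self, receivers, contents, rate=1.0, n_req=10, seed=None):
        self.receivers = list(receivers)
        self.contents = list(contents)  # required attribute for content placement
        self.rate = rate
        self.n_req = n_req
        self.seed = seed

    def __iter__(self):
        rng = random.Random(self.seed)
        t = 0.0
        for _ in range(self.n_req):
            t += rng.expovariate(self.rate)  # Poisson arrivals
            yield (t, {'receiver': rng.choice(self.receivers),
                       'content': rng.choice(self.contents),
                       'log': True})
```

Each yielded event carries the three mandatory attributes (receiver, content, log), and the contents attribute exposes all content identifiers.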

class
StationaryWorkload
(topology, n_contents, alpha, beta=0, rate=1.0, n_warmup=100000, n_measured=400000, seed=None, **kwargs)[source]¶ Bases:
object
This class generates events on the fly, i.e. instead of creating an event schedule to be kept in memory, it returns an iterator that generates events when needed.
This is useful for running large schedules of events where RAM is limited, as its memory impact is considerably lower.
Requests are Poisson-distributed, while content popularity is Zipf-distributed.
All requests are mapped to receivers uniformly unless a positive beta parameter is specified.
If a beta parameter is specified, then receivers issue requests at different rates. The algorithm used to determine the requests rates for each receiver is the following:
 All receivers are sorted in decreasing order of the degree of the PoP they are attached to. This assumes that all receivers have degree = 1 and are attached to a node with degree > 1
 Rates are then assigned following a Zipf distribution of coefficient beta, where nodes attached to higher-degree PoPs have a higher request rate
Parameters: topology : fnss.Topology
The topology to which the workload refers
n_contents : int
The number of content objects
alpha : float
The Zipf alpha parameter
beta : float, optional
Spatial skewness of request rates
rate : float, optional
The mean rate of requests per second
n_warmup : int, optional
The number of warmup requests (i.e. requests executed to fill cache but not logged)
n_measured : int, optional
The number of logged requests after the warmup
Returns: events : iterator
Iterator of events. Each event is a 2tuple where the first element is the timestamp at which the event occurs and the second element is a dictionary of event attributes.
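The Zipf-distributed content popularity mentioned above can be sketched with an inverse-CDF draw from a truncated Zipf distribution (illustrative only, not the Icarus implementation):

```python
import bisect
import random

def zipf_sample(n_contents, alpha, rng):
    # Draw a content id in 1..n_contents from a truncated Zipf distribution:
    # content i has weight 1/i**alpha, so content 1 is the most popular.
    weights = [1.0 / (i ** alpha) for i in range(1, n_contents + 1)]
    cdf, acc = [], 0.0
    for w in weights:
        acc += w
        cdf.append(acc)
    # Invert the CDF at a uniform random point
    return bisect.bisect(cdf, rng.random() * acc) + 1
```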

name
= 'STATIONARY'¶

class
GlobetraffWorkload
(topology, reqs_file, contents_file, beta=0, **kwargs)[source]¶ Bases:
object
Parse requests from GlobeTraff workload generator
All requests are mapped to receivers uniformly unless a positive beta parameter is specified.
If a beta parameter is specified, then receivers issue requests at different rates. The algorithm used to determine the requests rates for each receiver is the following:
 All receivers are sorted in decreasing order of the degree of the PoP they are attached to. This assumes that all receivers have degree = 1 and are attached to a node with degree > 1
 Rates are then assigned following a Zipf distribution of coefficient beta, where nodes attached to higher-degree PoPs have a higher request rate
Parameters: topology : fnss.Topology
The topology to which the workload refers
reqs_file : str
The GlobeTraff request file
contents_file : str
The GlobeTraff content file
beta : float, optional
Spatial skewness of request rates
Returns: events : iterator
Iterator of events. Each event is a 2tuple where the first element is the timestamp at which the event occurs and the second element is a dictionary of event attributes.

name
= 'GLOBETRAFF'¶

class
TraceDrivenWorkload
(topology, reqs_file, contents_file, n_contents, n_warmup, n_measured, rate=1.0, beta=0, **kwargs)[source]¶ Bases:
object
Parse requests from a generic request trace.
 This workload requires two text files:
 a requests file, where each line corresponds to a string identifying the content requested
 a contents file, which lists all unique content identifiers appearing in the requests file.
Since the trace does not provide timestamps, requests are scheduled according to a Poisson process of rate rate. All requests are mapped to receivers uniformly unless a positive beta parameter is specified.
If a beta parameter is specified, then receivers issue requests at different rates. The algorithm used to determine the requests rates for each receiver is the following:
 All receivers are sorted in decreasing order of the degree of the PoP they are attached to. This assumes that all receivers have degree = 1 and are attached to a node with degree > 1
 Rates are then assigned following a Zipf distribution of coefficient beta, where nodes attached to higher-degree PoPs have a higher request rate
Parameters: topology : fnss.Topology
The topology to which the workload refers
reqs_file : str
The path to the requests file
contents_file : str
The path to the contents file
n_contents : int
The number of content objects (i.e. the number of lines of contents_file)
n_warmup : int
The number of warmup requests (i.e. requests executed to fill cache but not logged)
n_measured : int
The number of logged requests after the warmup
rate : float, optional
The networkwide mean rate of requests per second
beta : float, optional
Spatial skewness of request rates
Returns: events : iterator
Iterator of events. Each event is a 2tuple where the first element is the timestamp at which the event occurs and the second element is a dictionary of event attributes.

name
= 'TRACE_DRIVEN'¶

class
YCSBWorkload
(workload, n_contents, n_warmup, n_measured, alpha=0.99, seed=None, **kwargs)[source]¶ Bases:
object
Yahoo! Cloud Serving Benchmark (YCSB)
The YCSB is a set of reference workloads used to benchmark databases and, more generally, any storage/caching system. It comprises five workloads:
Workload A (Update heavy): Read: 50%, Update: 50%; record selection: Zipfian
Workload B (Read heavy): Read: 95%, Update: 5%; record selection: Zipfian
Workload C (Read only): Read: 100%; record selection: Zipfian
Workload D (Read latest): Read: 95%, Insert: 5%; record selection: Latest
Workload E (Short ranges): Scan: 95%, Insert: 5%; record selection: Zipfian/Uniform
Notes
At the moment only workloads A, B and C are implemented, since they are the most relevant for caching systems.

name
= 'YCSB'¶

Module contents¶
This package contains the code for generating simulation scenarios.