icarus.scenarios package¶
Submodules¶
icarus.scenarios.algorithms module¶
Various algorithms used for optimal cache placement.
-
pam
(distances, k, n_iter=10)[source]¶ Compute k-medoids using the PAM algorithm
Parameters: - distances : 2-d NumPy array
Array of distances between points
- k : int
Number of clusters
- n_iter : int
Number of iterations to repeat. Each repetition is executed using a different initial random assignment. Repeating the experiment allows reaching different local optima, possibly achieving a better solution.
Notes
Implementation based on: https://github.com/salspaugh/machine_learning/blob/master/clustering/kmedoids.py
-
extract_cluster_level_topology
(topology)[source]¶ Build a cluster-level topology.
Each node in the topology must have the ‘cluster’ attribute
Parameters: - topology : Topology
The router-level topology
Returns: - topology : Topology
The cluster-level topology
Notes
- Each router must have a cache deployed
- All sources and receivers must have a single attachment point with a cache
- Each node must be labelled with cluster
-
deploy_clusters
(topology, clusters, assign_src_rcv=True)[source]¶ Annotate topology with cluster information
This function checks that all ICR candidate nodes are assigned exactly to one cluster.
If assign_src_rcv is True, then it also labels source and receiver nodes to the closest cluster.
- This function assumes that:
- each node of the topology is either an icr_candidate, a source or a receiver
- each source and receiver has degree equal to 1
Parameters: - topology : Topology
The topology onto which deploy clusters
- clusters : list of sets
Router-cluster assignment. Each element of a list is a set of node identifiers. Nodes in the same set belong to the same cluster. The length of the list therefore corresponds to the number of clusters.
- assign_src_rcv : bool, optional
If True, the function labels source and receiver nodes with the cluster label of the router they are attached to.
-
compute_clusters
(topology, k, distance='delay', nbunch=None, n_iter=10)[source]¶ Cluster the nodes of a topology so as to minimize intra-cluster latency.
This function assumes that every link is labelled with latencies and performs clustering using the k-medoids method with the PAM algorithm.
Parameters: - topology : Topology
The topology
- k : int
The number of clusters
- distance : str, optional
The link metric used to represent distance between nodes. If None, hop count is used instead
- n_iter : int, optional
The number of iterations
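For illustration, a minimal sketch of the clustering pipeline is shown below. It assumes that compute_clusters returns clusters in the list-of-sets format expected by deploy_clusters, and that the ICR candidate set is exposed through the topology's icr_candidates graph attribute; the topology and cache budget values are arbitrary.

    # Sketch: cluster the ICR candidates of a delay-annotated tree and derive the
    # cluster-level topology (caches are deployed first, as required by the notes
    # of extract_cluster_level_topology).
    from icarus.scenarios.algorithms import (compute_clusters, deploy_clusters,
                                             extract_cluster_level_topology)
    from icarus.scenarios.cacheplacement import uniform_cache_placement
    from icarus.scenarios.topology import topology_tree

    topo = topology_tree(k=2, h=4)                    # links carry a 'delay' attribute
    uniform_cache_placement(topo, cache_budget=400)   # cache on every router
    clusters = compute_clusters(topo, k=3, distance='delay',
                                nbunch=topo.graph['icr_candidates'], n_iter=10)
    deploy_clusters(topo, clusters, assign_src_rcv=True)
    cluster_topo = extract_cluster_level_topology(topo)  # one node per cluster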
icarus.scenarios.cacheplacement module¶
Cache placement strategies
This module provides algorithms for performing cache placement, i.e., given a cumulative cache size and a topology where each possible node candidate is labelled, these functions deploy caching space to the nodes of the topology.
-
uniform_cache_placement
(topology, cache_budget, **kwargs)[source]¶ Places cache budget uniformly across cache nodes.
Parameters: - topology : Topology
The topology object
- cache_budget : int
The cumulative cache budget
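As a usage sketch (assuming, as for the other placement functions, that the topology is annotated in place rather than returned), the budget can be spread over the candidate caching nodes of a small tree and inspected with IcnTopology.cache_nodes():

    from icarus.scenarios.topology import topology_tree
    from icarus.scenarios.cacheplacement import uniform_cache_placement

    topo = topology_tree(k=2, h=3)                   # source at the root, receivers at the leaves
    uniform_cache_placement(topo, cache_budget=600)  # assumed to annotate the topology in place
    print(topo.cache_nodes())                        # {node: per-node cache size, ...}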
-
degree_centrality_cache_placement
(topology, cache_budget, **kwargs)[source]¶ Places cache budget proportionally to the degree of the node.
Parameters: - topology : Topology
The topology object
- cache_budget : int
The cumulative cache budget
-
betweenness_centrality_cache_placement
(topology, cache_budget, **kwargs)[source]¶ Places cache budget proportionally to the betweenness centrality of the node.
Parameters: - topology : Topology
The topology object
- cache_budget : int
The cumulative cache budget
-
uniform_consolidated_cache_placement
(topology, cache_budget, spread=0.5, metric_dict=None, target='top', **kwargs)[source]¶ Consolidate caches in nodes with top centrality.
Unlike other cache placement strategies, which assign cache space to all nodes proportionally to their centrality, this strategy places caches of equal size on a set of selected nodes.
Parameters: - topology : Topology
The topology object
- cache_budget : int
The cumulative cache budget
- spread : float [0, 1], optional
The spread factor. The greater it is, the more the cache budget is spread among nodes. If it is 1, all candidate nodes are assigned a cache; if it is 0, only the node with the highest/lowest centrality is assigned a cache.
- metric_dict : dict, optional
The centrality metric according to which nodes are selected. If not specified, betweenness centrality is selected.
- target : (“top” | “bottom”), optional
The subsection of the ranked nodes on which to deploy caches.
-
random_cache_placement
(topology, cache_budget, n_cache_nodes, seed=None, **kwargs)[source]¶ Deploy caching nodes randomly
Parameters: - topology : Topology
The topology object
- cache_budget : int
The cumulative cache budget
- n_cache_nodes : int
The number of caching nodes to deploy
-
optimal_median_cache_placement
(topology, cache_budget, n_cache_nodes, hit_ratio, weight='delay', **kwargs)[source]¶ Deploy caching nodes in locations that minimize overall latency, assuming a partitioned strategy (a la Google Global Cache). According to this strategy, a set of caching nodes is deployed in the network and each receiver is mapped to one and only one caching node. Requests from this receiver are always sent to the designated caching node. In case of a cache miss, requests are forwarded to the original source.
This placement problem can be mapped to the p-median location-allocation problem. This function solves the problem using the vertex substitution heuristic, which works essentially like the k-medoids PAM algorithm and is also similar to k-means clustering. The result is not guaranteed to be globally optimal, only locally optimal.
Parameters: - topology : Topology
The topology object
- cache_budget : int
The cumulative cache budget
- n_cache_nodes : int
The number of caching nodes to deploy
- hit_ratio : float
The expected cache hit ratio of a single cache
- weight : str
The weight attribute
Notes
This placement assumes that all receivers have degree = 1 and are connected to an ICR candidate node. It also assumes that contents are uniformly assigned to sources.
-
optimal_hashrouting_cache_placement
(topology, cache_budget, n_cache_nodes, hit_ratio, weight='delay', **kwargs)[source]¶ Deploy caching nodes for hashrouting in optimized locations
Parameters: - topology : Topology
The topology object
- cache_budget : int
The cumulative cache budget
- n_cache_nodes : int
The number of caching nodes to deploy
- hit_ratio : float
The expected global cache hit ratio
- weight : str, optional
The weight attribute. Default is ‘delay’
-
clustered_hashrouting_cache_placement
(topology, cache_budget, n_clusters, policy, distance='delay', **kwargs)[source]¶ Deploy caching nodes for hashrouting with clusters
Parameters: - topology : Topology
The topology object
- cache_budget : int
The cumulative cache budget
- n_clusters : int
The number of clusters
- policy : str (node_const | cluster_const)
The cache budget allocation policy
- distance : str
The attribute used to quantify distance between pairs of nodes. Default is ‘delay’
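A brief usage sketch follows; as with the other placement functions, it assumes the topology is annotated in place, and the budget and cluster count are arbitrary:

    from icarus.scenarios.topology import topology_tree
    from icarus.scenarios.cacheplacement import clustered_hashrouting_cache_placement

    topo = topology_tree(k=3, h=3)   # links carry a 'delay' attribute
    clustered_hashrouting_cache_placement(topo, cache_budget=2000, n_clusters=4,
                                          policy='cluster_const', distance='delay')
    print(topo.cache_nodes())        # caching nodes and their cache sizes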
icarus.scenarios.contentplacement module¶
Content placement strategies.
This module contains functions to decide the allocation of content objects to source nodes.
-
uniform_content_placement
(topology, contents, seed=None)[source]¶ Places content objects on source nodes randomly, following a uniform distribution.
Parameters: - topology : Topology
The topology object
- contents : iterable
Iterable of content objects
- source_nodes : list
List of nodes of the topology which are content sources
Returns: - cache_placement : dict
Dictionary mapping content objects to source nodes
Notes
A deterministic placement of objects (e.g., for reproducing results) can be achieved by using a fixed seed value
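A short usage sketch (the topology and content identifiers are arbitrary):

    from icarus.scenarios.topology import topology_tree
    from icarus.scenarios.contentplacement import uniform_content_placement

    topo = topology_tree(k=2, h=3)
    placement = uniform_content_placement(topo, contents=range(1, 1000), seed=42)
    # placement maps each content identifier to the source node hosting it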
-
weighted_content_placement
(topology, contents, source_weights, seed=None)[source]¶ Places content objects on source nodes randomly according to the weight of the source node.
Parameters: - topology : Topology
The topology object
- contents : iterable
Iterable of content objects
- source_weights : dict
Dict mapping the nodes of the topology that are content sources to the weight according to which the content placement decision is made.
Returns: - cache_placement : dict
Dictionary mapping content objects to source nodes
Notes
A deterministic placement of objects (e.g., for reproducing results) can be achieved by using a fixed seed value
icarus.scenarios.topology module¶
Functions for creating or importing topologies for experiments.
To create a custom topology, create a function returning an instance of the IcnTopology class. An IcnTopology is simply a subclass of a Topology class provided by FNSS.
- A valid ICN topology must have the following attributes:
- Each node must have one stack among: source, receiver, router
- The topology must have an attribute called icr_candidates which is a set of router nodes on which a cache may be possibly deployed. Caches are not deployed directly at topology creation, instead they are deployed by a cache placement algorithm.
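For example, a custom topology function might be sketched as follows. It mirrors the conventions apparently used by the built-in generators: node stacks are assigned with fnss.add_stack() and the icr_candidates set is stored as a graph attribute (an assumption, since the exact storage location is not documented here).

    import fnss
    from icarus.scenarios.topology import IcnTopology

    def topology_line_example(delay=1, **kwargs):
        """Source -- router -- router -- receiver, with both routers as ICR candidates."""
        topology = IcnTopology(fnss.line_topology(4))
        source, routers, receiver = 0, [1, 2], 3
        fnss.add_stack(topology, source, 'source')
        fnss.add_stack(topology, receiver, 'receiver')
        for v in routers:
            fnss.add_stack(topology, v, 'router')
        topology.graph['icr_candidates'] = set(routers)   # assumed storage location
        fnss.set_delays_constant(topology, delay, 'ms')
        return topology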
-
class
IcnTopology
(data=None, name='', **kwargs)[source]¶ Bases:
fnss.topologies.topology.Topology
Class modelling an ICN topology
An ICN topology is a simple FNSS Topology with addition methods that return sets of caching nodes, sources and receivers.
Attributes: - name
Methods
- add_cycle(nodes, **attr): Add a cycle.
- add_edge(u, v[, attr_dict]): Add an edge between u and v.
- add_edges_from(ebunch[, attr_dict]): Add all the edges in ebunch.
- add_node(n[, attr_dict]): Add a single node n and update node attributes.
- add_nodes_from(nodes, **attr): Add multiple nodes.
- add_path(nodes, **attr): Add a path.
- add_star(nodes, **attr): Add a star.
- add_weighted_edges_from(ebunch[, weight]): Add all the edges in ebunch as weighted edges with specified weights.
- adjacency_iter(): Return an iterator of (node, adjacency dict) tuples for all nodes.
- adjacency_list(): Return an adjacency list representation of the graph.
- adjlist_dict_factory: alias of __builtin__.dict
- applications(): Return a dictionary of all applications deployed, keyed by node
- buffers(): Return a dictionary of all buffer sizes, keyed by interface
- cache_nodes(): Return a dictionary mapping nodes with a cache and respective cache size
- capacities(): Return a dictionary of all link capacities, keyed by link
- clear(): Remove all nodes and edges from the graph.
- copy(): Return a copy of the topology.
- degree([nbunch, weight]): Return the degree of a node or nodes.
- degree_iter([nbunch, weight]): Return an iterator for (node, degree).
- delays(): Return a dictionary of all link delays, keyed by link
- edge_attr_dict_factory: alias of __builtin__.dict
- edges([nbunch, data, default]): Return a list of edges.
- edges_iter([nbunch, data, default]): Return an iterator over the edges.
- get_edge_data(u, v[, default]): Return the attribute dictionary associated with edge (u,v).
- has_edge(u, v): Return True if the edge (u,v) is in the graph.
- has_node(n): Return True if the graph contains the node n.
- is_directed(): Return True if graph is directed, False otherwise.
- is_multigraph(): Return True if graph is a multigraph, False otherwise.
- nbunch_iter([nbunch]): Return an iterator of nodes contained in nbunch that are also in the graph.
- neighbors(n): Return a list of the nodes connected to the node n.
- neighbors_iter(n): Return an iterator over all neighbors of node n.
- node_dict_factory: alias of __builtin__.dict
- nodes([data]): Return a list of the nodes in the graph.
- nodes_iter([data]): Return an iterator over the nodes.
- nodes_with_selfloops(): Return a list of nodes with self loops.
- number_of_edges([u, v]): Return the number of edges between two nodes.
- number_of_nodes(): Return the number of nodes in the graph.
- number_of_selfloops(): Return the number of selfloop edges.
- order(): Return the number of nodes in the graph.
- receivers(): Return a set of receiver nodes
- remove_edge(u, v): Remove the edge between u and v.
- remove_edges_from(ebunch): Remove all edges specified in ebunch.
- remove_node(n): Remove node n.
- remove_nodes_from(nodes): Remove multiple nodes.
- selfloop_edges([data, default]): Return a list of selfloop edges.
- size([weight]): Return the number of edges.
- sources(): Return a set of source nodes
- stacks(): Return a dictionary of all node stacks, keyed by node
- subgraph(nbunch): Return the subgraph induced on nodes in nbunch.
- to_directed(): Return a directed representation of the topology.
- to_undirected(): Return an undirected copy of the topology.
- weights(): Return a dictionary of all link weights, keyed by link
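A brief sketch of the ICN-specific query methods (sources, receivers, cache_nodes) on one of the built-in topologies, assuming a cache placement step has already annotated the topology:

    from icarus.scenarios.topology import topology_ring
    from icarus.scenarios.cacheplacement import uniform_cache_placement

    topo = topology_ring(n=6)
    uniform_cache_placement(topo, cache_budget=300)

    print(topo.sources())      # set of source nodes
    print(topo.receivers())    # set of receiver nodes
    print(topo.cache_nodes())  # {node: cache size} for nodes with a cache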
-
topology_tree
(k, h, delay=1, **kwargs)[source]¶ Returns a tree topology, with a source at the root, receivers at the leaves and caches at all intermediate nodes.
Parameters: - h : int
The height of the tree
- k : int
The branching factor of the tree
- delay : float
The link delay in milliseconds
Returns: - topology : IcnTopology
The topology object
-
topology_path
(n, delay=1, **kwargs)[source]¶ Return a path topology with a receiver at node 0 and a source at node n-1
Parameters: - n : int (>=3)
The number of nodes
- delay : float
The link delay in milliseconds
Returns: - topology : IcnTopology
The topology object
-
topology_ring
(n, delay_int=1, delay_ext=5, **kwargs)[source]¶ Returns a ring topology
This topology comprises a ring of n routers, each attached to a receiver. In addition, one router is attached to a source. Therefore, this topology has 2n + 1 nodes in total.
It models the case of a metro ring network, with many receivers and only one source towards the core network.
Parameters: - n : int
The number of routers in the ring
- delay_int : float
The internal link delay in milliseconds
- delay_ext : float
The external link delay in milliseconds
Returns: - topology : IcnTopology
The topology object
-
topology_mesh
(n, m, delay_int=1, delay_ext=5, **kwargs)[source]¶ Returns a mesh topology
This topology comprises a mesh of n routers, each attached to a receiver. In addition, m routers are each attached to a source. Therefore, this topology has 2n + m nodes in total.
Parameters: - n : int
The number of routers in the mesh
- m : int
The number of sources
- delay_int : float
The internal link delay in milliseconds
- delay_ext : float
The external link delay in milliseconds
Returns: - topology : IcnTopology
The topology object
-
topology_geant
(**kwargs)[source]¶ Return a scenario based on GEANT topology
Parameters: - seed : int, optional
The seed used for random number generation
Returns: - topology : fnss.Topology
The topology object
-
topology_tiscali
(**kwargs)[source]¶ Return a scenario based on Tiscali topology, parsed from RocketFuel dataset
Parameters: - seed : int, optional
The seed used for random number generation
Returns: - topology : fnss.Topology
The topology object
-
topology_wide
(**kwargs)[source]¶ Return a scenario based on the WIDE topology
Parameters: - seed : int, optional
The seed used for random number generation
Returns: - topology : fnss.Topology
The topology object
-
topology_garr
(**kwargs)[source]¶ Return a scenario based on GARR topology
Parameters: - seed : int, optional
The seed used for random number generation
Returns: - topology : fnss.Topology
The topology object
-
topology_rocketfuel_latency
(asn, source_ratio=0.1, ext_delay=34, **kwargs)[source]¶ Parse a generic RocketFuel topology with annotated latencies
An artificial receiver node is attached to each node of the parsed topology. Source nodes are also attached to the routers with the highest degree.
Parameters: - asn : int
AS number
- source_ratio : float
Ratio between number of source nodes (artificially attached) and routers
- ext_delay : float
Delay on external nodes
icarus.scenarios.workload module¶
Traffic workloads
Every traffic workload to be used with Icarus must be modelled as an iterable class, i.e. a class with at least an __init__ method (through which it is initialized, with values taken from the configuration file) and an __iter__ method that is called to return a new event.
Each iteration must yield a 2-tuple in which the first element is the timestamp at which the event occurs and the second is a dictionary describing the event, which must contain at least the three following attributes:
- receiver: The name of the node issuing the request
- content: The name of the content for which the request is issued
- log: A boolean value indicating whether this request should be logged or not for measurement purposes.
Each workload must expose the contents attribute which is an iterable of all content identifiers. This is needed for content placement.
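A minimal sketch of a custom workload honouring this contract is shown below. The class name, registration name and the uniform receiver/content selection are all illustrative; only the (timestamp, event) iteration protocol and the contents attribute follow the requirements above. Receivers are located by inspecting node stacks, which assumes the fnss convention of storing the stack name as the first element of the node's 'stack' attribute.

    import random

    class UniformWorkload(object):
        """Requests arrive at a constant mean rate from random receivers for random contents."""

        name = 'UNIFORM'   # hypothetical registration name

        def __init__(self, topology, n_contents, rate=1.0, n_warmup=1000,
                     n_measured=5000, seed=None, **kwargs):
            # assumed fnss convention: node stack stored as (name, properties)
            self.receivers = [v for v in topology.nodes()
                              if topology.node[v]['stack'][0] == 'receiver']
            self.contents = list(range(1, n_contents + 1))   # required 'contents' attribute
            self.rate = rate
            self.n_warmup = n_warmup
            self.n_measured = n_measured
            random.seed(seed)

        def __iter__(self):
            t_event = 0.0
            for req in range(self.n_warmup + self.n_measured):
                t_event += random.expovariate(self.rate)     # Poisson arrivals
                event = {'receiver': random.choice(self.receivers),
                         'content': random.choice(self.contents),
                         'log': req >= self.n_warmup}        # log only measured requests
                yield (t_event, event)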
-
class
StationaryWorkload
(topology, n_contents, alpha, beta=0, rate=1.0, n_warmup=100000, n_measured=400000, seed=None, **kwargs)[source]¶ Bases:
object
This class generates events on the fly: instead of creating an event schedule to be kept in memory, it returns an iterator that generates events when needed.
This is useful for running large schedules of events when RAM is limited, as its memory impact is considerably lower.
Request arrivals follow a Poisson process, while content popularity is Zipf-distributed
All requests are mapped to receivers uniformly unless a positive beta parameter is specified.
If a beta parameter is specified, then receivers issue requests at different rates. The algorithm used to determine the request rates of each receiver is the following:
- All receivers are sorted in decreasing order of the degree of the PoP they are attached to. This assumes that all receivers have degree = 1 and are attached to a node with degree > 1
- Rates are then assigned following a Zipf distribution of coefficient beta, where receivers attached to higher-degree PoPs have a higher request rate
Parameters: - topology : fnss.Topology
The topology to which the workload refers
- n_contents : int
The number of content objects
- alpha : float
The Zipf alpha parameter
- beta : float, optional
Spatial skewness of request rates across receivers
- rate : float, optional
The mean rate of requests per second
- n_warmup : int, optional
The number of warmup requests (i.e. requests executed to fill cache but not logged)
- n_measured : int, optional
The number of logged requests after the warmup
Returns: - events : iterator
Iterator of events. Each event is a 2-tuple where the first element is the timestamp at which the event occurs and the second element is a dictionary of event attributes.
-
name
= 'STATIONARY'¶
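For instance (a sketch; the parameter values are arbitrary), the workload can be instantiated on a small topology and a few events consumed from the resulting iterator:

    from icarus.scenarios.topology import topology_tree
    from icarus.scenarios.workload import StationaryWorkload

    topo = topology_tree(k=2, h=3)
    workload = StationaryWorkload(topo, n_contents=1000, alpha=0.8, rate=10.0,
                                  n_warmup=100, n_measured=400, seed=0)
    for i, (timestamp, event) in enumerate(workload):
        print(timestamp, event['receiver'], event['content'], event['log'])
        if i == 4:
            break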
-
class
GlobetraffWorkload
(topology, reqs_file, contents_file, beta=0, **kwargs)[source]¶ Bases:
object
Parse requests from GlobeTraff workload generator
All requests are mapped to receivers uniformly unless a positive beta parameter is specified.
If a beta parameter is specified, then receivers issue requests at different rates. The algorithm used to determine the request rates of each receiver is the following:
- All receivers are sorted in decreasing order of the degree of the PoP they are attached to. This assumes that all receivers have degree = 1 and are attached to a node with degree > 1
- Rates are then assigned following a Zipf distribution of coefficient beta, where receivers attached to higher-degree PoPs have a higher request rate
Parameters: - topology : fnss.Topology
The topology to which the workload refers
- reqs_file : str
The GlobeTraff request file
- contents_file : str
The GlobeTraff content file
- beta : float, optional
Spatial skewness of request rates
Returns: - events : iterator
Iterator of events. Each event is a 2-tuple where the first element is the timestamp at which the event occurs and the second element is a dictionary of event attributes.
-
name
= 'GLOBETRAFF'¶
-
class
TraceDrivenWorkload
(topology, reqs_file, contents_file, n_contents, n_warmup, n_measured, rate=1.0, beta=0, **kwargs)[source]¶ Bases:
object
Parse requests from a generic request trace.
- This workload requires two text files:
- a requests file, where each line corresponds to a string identifying the content requested
- a contents file, which lists all unique content identifiers appearing in the requests file.
Since the trace does not provide timestamps, requests are scheduled according to a Poisson process with the given rate. All requests are mapped to receivers uniformly unless a positive beta parameter is specified.
If a beta parameter is specified, then receivers issue requests at different rates. The algorithm used to determine the request rates of each receiver is the following:
- All receivers are sorted in decreasing order of the degree of the PoP they are attached to. This assumes that all receivers have degree = 1 and are attached to a node with degree > 1
- Rates are then assigned following a Zipf distribution of coefficient beta, where receivers attached to higher-degree PoPs have a higher request rate
Parameters: - topology : fnss.Topology
The topology to which the workload refers
- reqs_file : str
The path to the requests file
- contents_file : str
The path to the contents file
- n_contents : int
The number of content objects (i.e. the number of lines of contents_file)
- n_warmup : int
The number of warmup requests (i.e. requests executed to fill cache but not logged)
- n_measured : int
The number of logged requests after the warmup
- rate : float, optional
The network-wide mean rate of requests per second
- beta : float, optional
Spatial skewness of request rates
Returns: - events : iterator
Iterator of events. Each event is a 2-tuple where the first element is the timestamp at which the event occurs and the second element is a dictionary of event attributes.
-
name
= 'TRACE_DRIVEN'¶
-
class
YCSBWorkload
(workload, n_contents, n_warmup, n_measured, alpha=0.99, seed=None, **kwargs)[source]¶ Bases:
object
Yahoo! Cloud Serving Benchmark (YCSB)
The YCSB is a set of reference workloads used to benchmark databases and, more generally, any storage/caching system. It comprises five workloads:
- Workload A (Update heavy): Read 50%, Update 50%; record selection: Zipfian
- Workload B (Read heavy): Read 95%, Update 5%; record selection: Zipfian
- Workload C (Read only): Read 100%; record selection: Zipfian
- Workload D (Read latest): Read 95%, Insert 5%; record selection: Latest
- Workload E (Short ranges): Scan 95%, Insert 5%; record selection: Zipfian/Uniform
Notes
At the moment only workloads A, B and C are implemented, since they are the most relevant for caching systems.
-
name
= 'YCSB'¶
-
Module contents¶
This package contains the code for generating simulation scenarios.