app.domain
Subpackages
Submodules
app.domain.cluster_groups
This module contains domain specific classes that represent groups of
storage nodes.
-
class
Cluster(master, file_name, members, sim_id=0, origin='')[source] Bases: object
Represents a group of network nodes ensuring the durability of a file.
-
id A unique identifier of the Cluster instance.
- Type
-
current_epoch The simulation’s current epoch.
- Type
-
corruption_chances A two-element list containing the probability of FileBlockData being corrupted and not being corrupted, respectively. See get_disk_error_chances() for corruption chance configuration.
- Type
List[float]
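A minimal sketch of how a two-element probability list such as corruption_chances could drive a corruption decision. The helper name is_corrupted and the example probabilities are hypothetical and are not taken from the module; only random.choices from the standard library is used.

```python
import random


def is_corrupted(corruption_chances, rng=random):
    """Return True with probability corruption_chances[0].

    corruption_chances is assumed to be [p_corrupted, p_not_corrupted].
    """
    outcome = rng.choices([True, False], weights=corruption_chances, k=1)
    return outcome[0]


# Example values only; real values would come from the configuration.
chances = [0.25, 0.75]
rng = random.Random(42)
hits = sum(is_corrupted(chances, rng) for _ in range(10_000))
print(hits / 10_000)  # expected to be close to 0.25
```

Passing a seeded random.Random instance keeps a simulation run reproducible, which is why the sketch accepts rng as a parameter.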
-
master A reference to a server that coordinates or monitors the Cluster.
- Type
-
members A collection of network nodes that belong to the Cluster.
- Type
-
file A reference to the FileData object that represents the file being persisted by the Cluster instance.
- Type
-
critical_size Minimum number of network nodes plus required to exist in the Cluster to assure the target replication level.
- Type
-
sufficient_size Sum of critical_size and the number of nodes expected to fail between two successive recovery phases.
- Type
-
redundant_size Application-specific parameter, which indicates that membership of the Cluster must be pruned.
- Type
-
running Indicates if the Cluster instance is active. Used by Master to manage the simulation processes.
- Type
-
_membership_changed Flag that indicates whether or not _members_view needs to be updated during membership_maintenance(). The variable is set to False at the beginning of every epoch and set to True if the off_nodes list returned by nodes_execute() is not empty.
- Type
-
_recovery_epoch_sum Helper attribute that facilitates the storage of the sum of the values returned by all set_recovery_epoch() method calls. Important for logging purposes.
- Type
-
_recovery_epoch_calls Helper attribute that facilitates the storage of the sum of the values returned by all set_recovery_epoch() method calls throughout the current_epoch.
- Type
-
__init__(master, file_name, members, sim_id=0, origin='')[source] Instantiates a Cluster object.
- Parameters
master (MasterType) – A reference to a Master object that manages the Cluster being initialized.
file_name (str) – The name of the file the Cluster is responsible for persisting.
members (NodeDict) – A dictionary where keys are node identifiers and values are their instance objects.
sim_id (int) – Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
origin (str) – The name of the simulation file that started the simulation process.
- Return type
-
_get_new_members()[source] Helper method that searches for possible new network nodes by querying the master of the Cluster.
- Returns
A dictionary where keys are node identifiers and values are node instances.
- Return type
-
_log_evaluation(plive, ptotal=-1)[source] Helper that collects Cluster data and registers it on a logger object.
- Parameters
plive (int) – The number of parts held by online or suspect nodes in the cluster at the simulation’s current epoch.
ptotal (int) – The number of existing parts in the cluster at the simulation’s current epoch. This parameter is optional and may be used or not depending on the intent of the system. As a rule of thumb, plive tracks the number of parts that are alive in the system for logging purposes, whereas ptotal is used for comparisons and averages, e.g., SGCluster evaluate.
- Return type
-
_set_fail(message)[source] Ends the Cluster instance simulation.
Sets running to False and orders FileData to write the collected logs to disk and close its out_file stream.
-
_setup_epoch(epoch)[source] Initializes some cluster attributes at the start of an epoch.
This method also forces all of the Cluster’s members to update their connectivity status before any node is instructed to execute.
-
complain(complainter, complainee, reason)[source] Registers a complaint against a possibly offline node.
Note
This method provides no default functionality and should be overridden in subclasses if required.
- Parameters
complainter (str) – The identifier of the complaining network node.
complainee (str) – The identifier of the network node being complained about.
reason (app.type_hints.HttpResponse) – The http code that led to the complaint.
- Return type
-
evaluate()[source] Evaluates and logs the health, and possibly other parameters, of the Cluster at every epoch.
- Return type
-
execute_epoch(epoch)[source] Orders all members to execute their epoch.
Note
If the Cluster terminates early, before it reaches MAX_EPOCHS, nothing should be logged in LoggingData at the specified epoch to avoid skewing previously collected results.
-
get_cluster_status()[source] Determines the Cluster’s status based on the length of the current members list.
- Returns
The status of the Cluster as a string.
- Return type
-
get_node()[source] Retrieves a random node from the members of the cluster group, whose status is likely to be online.
-
maintain(off_nodes)[source] Offers basic maintenance functionality for Cluster types.
If the off_nodes list parameter has at least one node reference, _membership_changed is set to True.
- Parameters
off_nodes (List[th.NodeType]) – A possibly empty list of offline nodes.
- Return type
-
membership_maintenance()[source] Attempts to recruit new network nodes to be members of the cluster.
The method updates both members and _members_view.
- Returns
A dictionary that is empty if membership did not change.
- Return type
-
nodes_execute()[source] Queries all members to execute the epoch.
This method logs the number of replicas lost throughout current_epoch, according to the members who went offline and the FileBlockData replicas they possessed, and is responsible for setting a replication epoch. Similarly, it logs the number of members who disconnected.
- Returns
List of members that disconnected during the current_epoch. See app.domain.network_nodes.Node.update_status().
- Return type
List[NodeType]
-
route_part(sender, receiver, replica, is_fresh=False)[source] Sends a file block replica to some other network node in members.
- Parameters
sender (str) – An identifier of the network node who is sending the message.
receiver (str) – The destination network node identifier.
replica (FileBlockData) – The file block replica to be sent to the specified destination, receiver.
is_fresh (bool) – Prevents recently created replicas from being corrupted, since they are not likely to be corrupted on disk. This argument facilitates simulation.
- Returns
An http code sent by the receiver.
- Return type
-
set_replication_epoch(replica)[source] Delegates to set_replication_epoch().
- Parameters
replica (domain.helpers.smart_dataclasses.FileBlockData) – The file block replica that was lost.
- Return type
-
spread_files(replicas, strat='i')[source] Distributes a collection of file block replicas among the members of the cluster group.
- Parameters
replicas (ReplicasDict) – The FileBlockData replicas, without replication.
strat (str) – Defines how replicas will be initially distributed in the Cluster. Unless overridden in children of this class, the received value of strat is ignored and is always set to the default value i.
- i
This strategy creates a probability vector containing the normalization of the network nodes’ uptimes and uses that vector to randomly select which node will receive each replica. Using this probability vector results in a bias towards giving more replicas to the most resilient nodes.
- Return type
-
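The i strategy described above can be sketched as follows, assuming node uptimes are available as a plain dictionary. The function name spread_by_uptime and the data shapes are illustrative assumptions, not the module's actual implementation.

```python
import random


def spread_by_uptime(replica_ids, uptimes, rng=random):
    """Assign each replica to a node chosen with probability
    proportional to that node's uptime.

    uptimes is assumed to map node identifiers to positive uptimes.
    """
    node_ids = list(uptimes)
    total = sum(uptimes.values())
    # Normalized uptimes form the probability vector.
    weights = [uptimes[node_id] / total for node_id in node_ids]
    placement = {}
    for replica_id in replica_ids:
        placement[replica_id] = rng.choices(node_ids, weights=weights, k=1)[0]
    return placement


placement = spread_by_uptime(range(100), {"a": 90.0, "b": 10.0},
                             random.Random(1))
print(sum(1 for node in placement.values() if node == "a"))
```

With these example uptimes, node a is expected to receive roughly nine out of ten replicas, which is the bias towards resilient nodes that the description mentions.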
-
class
HDFSCluster(master, file_name, members, sim_id=0, origin='')[source] Bases: app.domain.cluster_groups.Cluster
Represents a group of network nodes ensuring the durability of a file in a Hadoop Distributed File System scenario.
Note
Members of HDFSCluster are of type HDFSNode. They do not perform swarm guidance behaviors and instead report to their monitors with regular heartbeats. This class could be a NameNode server in HDFS or a master server in GFS.
-
suspicious_nodes A set containing the identifiers of suspicious network nodes.
- Type
-
data_node_heartbeats A dictionary mapping node identifiers to the number of complaints made against them. Each node has five lives. When a node misses five beats in a row, i.e., when its dictionary value reaches zero, it is evicted from the cluster.
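The five-lives rule can be illustrated with the sketch below. It assumes that a received heartbeat restores a node to its full number of lives (so that only consecutive misses lead to eviction); the function name update_heartbeats and this reset behavior are assumptions made for illustration.

```python
def update_heartbeats(heartbeats, silent_nodes, lives=5):
    """Decrement the lives of silent nodes, restore the others, and
    return the identifiers of the nodes that ran out of lives."""
    evicted = []
    for node_id in list(heartbeats):
        if node_id in silent_nodes:
            heartbeats[node_id] -= 1
        else:
            # Assumption: a received beat breaks the streak of misses.
            heartbeats[node_id] = lives
        if heartbeats[node_id] <= 0:
            evicted.append(node_id)
            del heartbeats[node_id]
    return evicted


beats = {"n1": 5, "n2": 5}
for _ in range(4):
    update_heartbeats(beats, {"n1"})
print(update_heartbeats(beats, {"n1"}))  # ['n1'] on the fifth miss
```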
-
__init__(master, file_name, members, sim_id=0, origin='')[source] Instantiates a Cluster object.
- Parameters
master (MasterType) – A reference to a Master object that manages the Cluster being initialized.
file_name (str) – The name of the file the Cluster is responsible for persisting.
members (NodeDict) – A dictionary where keys are node identifiers and values are their instance objects.
sim_id (int) – Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
origin (str) – The name of the simulation file that started the simulation process.
- Return type
-
evaluate()[source] Logs the number of existing replicas in the HDFSCluster.
- Overrides:
- Return type
-
maintain(off_nodes)[source] Evicts any network node whose heartbeats in data_node_heartbeats reached zero.
-
membership_maintenance()[source] Attempts to recruit new network nodes to be members of the cluster.
The method updates both members and _members_view.
- Returns
A dictionary that is empty if membership did not change.
- Return type
-
nodes_execute()[source] Queries all members to execute the epoch.
- Returns
A collection of members who disconnected during the current epoch. See app.domain.network_nodes.HDFSNode.update_status().
- Return type
List[NodeType]
-
-
class
NewscastCluster(master, file_name, members, sim_id=0, origin='')[source] Bases: app.domain.cluster_groups.Cluster
Represents a P2P network of nodes performing mean degree aggregation, while simultaneously using Newscast for view shuffling.
-
__init__(master, file_name, members, sim_id=0, origin='')[source] Instantiates a Cluster object.
- Parameters
master (MasterType) – A reference to a Master object that manages the Cluster being initialized.
file_name (str) – The name of the file the Cluster is responsible for persisting.
members (NodeDict) – A dictionary where keys are node identifiers and values are their instance objects.
sim_id (int) – Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
origin (str) – The name of the simulation file that started the simulation process.
- Return type
-
_setup_epoch(epoch)[source] Initializes some cluster attributes at the start of an epoch.
-
evaluate()[source] Prints the epoch’s aggregated peer degree, to the command-line interface.
- Return type
-
execute_epoch(epoch)[source] Orders all members to execute their epoch.
Note
If the Cluster terminates early, before it reaches MAX_EPOCHS, nothing should be logged in LoggingData at the specified epoch to avoid skewing previously collected results.
-
nodes_execute()[source] Queries all network node members to execute the epoch.
- Overrides:
app.domain.cluster_groups.Cluster.nodes_execute().
- Note:
NewscastCluster.nodes_execute always returns None.
- Returns
A collection of members who disconnected during the current epoch. See app.domain.network_nodes.NewscastNode.update_status().
- Return type
List[NodeType]
-
spread_files(replicas, strat='o')[source] Distributes a collection of file block replicas among the members of the cluster group.
- Overrides:
app.domain.cluster_groups.Cluster.spread_files()
- Parameters
replicas (ReplicasDict) – The FileBlockData replicas, without replication.
strat (str) – Defines how replicas will be initially distributed in the Cluster. Unless overridden in children of this class, the received value of strat is ignored and is always set to the default value o.
- o
This strategy assumes erasure coding is being used and that each network node will have no more than one encoded block, i.e., the replication level is always equal to one. Note, however, that if there are more encoded blocks than there are network nodes, some of these nodes might end up possessing an excessive amount of blocks.
- Return type
-
wire_k_out()[source] Creates a random directed P2P topology.
The initial cache size of each network node is at most as big as NEWSCAST_CACHE_SIZE.
Note
The topology does not have self loops, because add_neighbor() does not accept the addition of a node to its own view. On rare occasions, all of the outgoing edges selected for a node might be invalid. This should be a non-issue, as the nodes will eventually join the overlay throughout the simulation.
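A rough sketch of a k-out wiring with the properties listed above: each view holds at most cache_size entries and never contains its owner. The standalone function below and its parameters are hypothetical; the real method works on node objects and their add_neighbor() method.

```python
import random


def wire_k_out(node_ids, cache_size, rng=random):
    """Give every node up to cache_size random out-neighbors, never itself."""
    views = {}
    for node in node_ids:
        picks = rng.sample(node_ids, min(cache_size, len(node_ids)))
        # Dropping a self pick mirrors a view that rejects self addition,
        # so a view can end up smaller than cache_size, or even empty.
        views[node] = [peer for peer in picks if peer != node]
    return views


views = wire_k_out(["a", "b", "c", "d"], cache_size=2, rng=random.Random(0))
for node, view in views.items():
    print(node, "->", view)
```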
-
-
class
SGCluster(master, file_name, members, sim_id=0, origin='')[source] Bases: app.domain.cluster_groups.Cluster
Represents a group of network nodes persisting a file using a swarm guidance algorithm.
-
v_ Density distribution cluster members must achieve with independent realizations for ideal persistence of the file.
- Type
-
cv_ Tracks the file current density distribution, updated at each epoch.
- Type
-
avg_ Tracks the file’s average density distribution. Used to assert whether, throughout the lifetime of a cluster, the desired density distribution v_ was achieved on average. Differs from cv_ because cv_ is used for instantaneous convergence comparison.
- Type
-
__init__(master, file_name, members, sim_id=0, origin='')[source] Instantiates a Cluster object.
- Parameters
master (MasterType) – A reference to a Master object that manages the Cluster being initialized.
file_name (str) – The name of the file the Cluster is responsible for persisting.
members (NodeDict) – A dictionary where keys are node identifiers and values are their instance objects.
sim_id (int) – Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
origin (str) – The name of the simulation file that started the simulation process.
- Return type
-
_log_evaluation(pcount, ptotal=-1)[source] Helper that collects Cluster data and registers it on a logger object.
- Parameters
plive – The number of parts held by online or suspect nodes in the cluster at the simulation’s current epoch.
ptotal (int) – The number of existing parts in the cluster at the simulation’s current epoch. This parameter is optional and may be used or not depending on the intent of the system. As a rule of thumb, plive tracks the number of parts that are alive in the system for logging purposes, whereas ptotal is used for comparisons and averages, e.g., SGCluster evaluate.
pcount (int) –
- Return type
-
_normalize_avg_()[source]
-
_pretty_print_eq_distr_table(target, rtol, atol)[source] Pretty prints a PSQL formatted table for visual vector comparison.
-
_validate_transition_matrix(m, v_)[source] Asserts if m is a Markov Matrix.
Verification is done by raising m to the power of 4096 (just a large number) and checking if all columns of the powered matrix are element-wise equal to the entries of target_distribution.
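The check described above can be sketched in plain Python for a small column-stochastic matrix. Squaring the matrix twelve times yields the 4096th power, since 2**12 == 4096. The helper names and the absolute tolerance are assumptions for illustration; the module itself may rely on numpy or pandas instead.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def converges_to(m, target, squarings=12, atol=1e-8):
    """Raise m to the power 2**squarings and compare every column of
    the result with target, entry by entry."""
    powered = m
    for _ in range(squarings):
        powered = mat_mul(powered, powered)
    n = len(m)
    return all(abs(powered[i][j] - target[i]) <= atol
               for i in range(n) for j in range(n))


# Each column sums to one; the steady state of this matrix is (5/6, 1/6).
m = [[0.9, 0.5],
     [0.1, 0.5]]
print(converges_to(m, [5 / 6, 1 / 6]))  # True
print(converges_to(m, [0.5, 0.5]))      # False
```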
-
add_cloud_reference()[source] Adds a cloud server to the members of the SGCluster.
This method is used when the SGCluster membership size becomes compromised and a backup solution using cloud approaches is desired. The idea is that surviving members upload their replicas to the cloud server, e.g., an Amazon S3 instance. See the Master method get_cloud_reference() for more details.
Note
This method is virtual.
- Return type
-
broadcast_transition_matrix(m)[source] Slices a matrix and delivers columns to the respective network nodes.
- Parameters
m (DataFrame) – A matrix to be broadcast to the network nodes who are currently members of the Cluster instance.
- Return type
Note
An optimization could be made that configures a transition matrix for the cluster, independent of file names, i.e., turn cluster groups into groups persisting multiple files instead of only one, thus reducing simulation space overheads and, in real-life scenarios, decreasing the load placed on metadata servers through queries and matrix calculations. For simplicity of implementation, each cluster only manages one file.
-
create_and_bcast_new_transition_matrix()[source] Helper method that attempts to generate a Markov matrix to be sliced and distributed to the SGCluster members.
At most three transition matrices will be generated. The first to be successfully validated is distributed to the network nodes. If all matrices are invalid, the last matrix will be used to prevent infinite loops in the simulation. This is not an issue, as the membership of the SGCluster will eventually change, giving more opportunities to perform a correct swarm guidance behavior.
- Return type
-
equal_distributions()[source] Asserts if the desired distribution and current distribution are equal.
Equality is calculated using numpy’s allclose function, which uses the following formula:
absolute(`a` - `b`) <= (`atol` + `rtol` * absolute(`b`))
- Returns
True if distributions are close enough to be considered equal, otherwise, it returns False.
- Return type
-
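For readers without numpy at hand, the formula can be reproduced element by element as below. The function name all_close is illustrative; the default tolerances match numpy.allclose (rtol=1e-05, atol=1e-08), while the explicit length check is a simplification that replaces numpy's broadcasting.

```python
def all_close(a, b, rtol=1e-05, atol=1e-08):
    """Element-wise check of |a - b| <= atol + rtol * |b|."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )


desired = [0.5, 0.3, 0.2]
current = [0.5000001, 0.3, 0.1999999]
print(all_close(current, desired))          # True
print(all_close([0.6, 0.2, 0.2], desired))  # False
```

Note that the formula is not symmetric: the relative tolerance is scaled by the second argument only, so the reference distribution should be passed as b.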
evaluate()[source] Evaluates and logs the health, and possibly other parameters, of the Cluster at every epoch.
- Return type
-
execute_epoch(epoch)[source] Orders all members to execute their epoch.
Note
If the Cluster terminates early, before it reaches MAX_EPOCHS, nothing should be logged in LoggingData at the specified epoch to avoid skewing previously collected results.
-
maintain(off_nodes)[source] Evicts any node who is referenced in off_nodes list.
-
membership_maintenance()[source] Attempts to recruit new network nodes to be members of the cluster.
The method updates both members and _members_view.
- Extends:
app.domain.cluster_groups.Cluster.membership_maintenance().
SGCluster.membership_maintenance adds and removes cloud references depending on the length of members before maintenance is performed.
- Returns
A dictionary that is empty if membership did not change.
- Return type
-
new_desired_distribution(member_ids, member_uptimes)[source] Sets a new desired distribution for the SGCluster.
Received member_uptimes are normalized to create a stochastic representation of the desired distribution, which can be used by the different transition matrix generation strategies.
- Parameters
member_ids (List[str]) – A list of node identifiers who are members of the SGCluster.
member_uptimes (List[float]) – A list of node uptimes.
- Return type
List[float]
Note
member_ids and member_uptimes elements at each index should belong to each other, i.e., they should originate from the same network node.
-
new_transition_matrix()[source] Creates a new transition matrix that is likely to be a Markov Matrix.
- Returns
The labeled matrix that has the fastest mixing rate from all the pondered strategies.
- Return type
-
nodes_execute()[source] Queries all network node members to execute the epoch.
- Returns
A collection of members who disconnected during the current epoch. See app.domain.network_nodes.Node.update_status().
- Return type
List[NodeType]
-
remove_cloud_reference()[source] Removes cloud references and deletes the files within them.
Note
This method is virtual.
- Return type
-
select_fastest_topology(a, v_)[source] Creates multiple transition matrices and selects the fastest.
The fastest of the created transition matrices is the one with the fastest mixing rate.
- Parameters
- Returns
A transition matrix that is likely to be a Markov matrix whose steady state is v_, but is not yet validated. See _validate_transition_matrix().
- Return type
-
spread_files(replicas, strat='i')[source] Distributes a collection of FileBlockData objects among the members of the SGCluster.
- Parameters
replicas (ReplicasDict) – The FileBlockData replicas, without replication.
strat (str) – Defines how replicas will be initially distributed in the Cluster.
- u
Each file block replica in replicas is distributed following a uniform probability vector among members of the cluster group.
- a
Each file block replica in replicas is given to up to N different members, where N is equal to REPLICATION_LEVEL.
- i
Each file block replica in replicas is distributed with bias towards the ideal steady state distribution. This implementation differs from app.domain.cluster_groups.Cluster.spread_files(), because it is not necessarily based on node uptime.
- Return type
-
-
class
SGClusterExt(master, file_name, members, sim_id=0, origin='')[source] Bases: app.domain.cluster_groups.SGCluster
Represents a group of network nodes persisting a file.
SGClusterExt instances differ from SGCluster because their members are of type SGNodeExt. When combined, these classes give nodes the responsibility of collaborating in the detection of faulty members of the SGClusterExt and eventually kicking them out of the group.
-
complaint_threshold Reference value that defines the maximum number of complaints a network node can receive before it is evicted from the SGClusterExt.
- Type
-
nodes_complaints A dictionary mapping network node identifiers to the number of complaints made against them by other members. When the number of complaints becomes bigger than complaint_threshold, the complainee is evicted from the group.
-
suspicious_nodes A dictionary containing the unique node identifiers of known suspicious members and how many epochs have passed since they changed to such status.
-
_epoch_complaints A set of unique identifiers formed from the concatenation of node identifiers, to avoid multiple complaint registrations in the same epoch, made by the same source towards the same target. The set is reset every epoch.
- Type
-
__init__(master, file_name, members, sim_id=0, origin='')[source] Instantiates a Cluster object.
- Parameters
master (MasterType) – A reference to a Master object that manages the Cluster being initialized.
file_name (str) – The name of the file the Cluster is responsible for persisting.
members (NodeDict) – A dictionary where keys are node identifiers and values are their instance objects.
sim_id (int) – Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
origin (str) – The name of the simulation file that started the simulation process.
- Return type
-
complain(complainter, complainee, reason)[source] Registers a complaint against a possibly offline node.
A unique identifier for the complaint is generated by concatenation of the complainter and the complainee unique identifiers.
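The complaint bookkeeping can be sketched as a small standalone class. The class name ComplaintBook, the "|" separator used in the concatenated identifier, and the end_epoch method are assumptions for illustration; the attribute names follow the ones documented for SGClusterExt.

```python
class ComplaintBook:
    """Counts complaints per node, ignoring duplicates within an epoch."""

    def __init__(self, complaint_threshold):
        self.complaint_threshold = complaint_threshold
        self.nodes_complaints = {}
        self._epoch_complaints = set()

    def complain(self, complainter, complainee):
        # Concatenating both identifiers gives a per-epoch unique key.
        complaint_id = f"{complainter}|{complainee}"
        if complaint_id in self._epoch_complaints:
            return False  # same source and target already seen this epoch
        self._epoch_complaints.add(complaint_id)
        count = self.nodes_complaints.get(complainee, 0)
        self.nodes_complaints[complainee] = count + 1
        return True

    def end_epoch(self):
        """Reset the per-epoch set and list nodes over the threshold."""
        self._epoch_complaints.clear()
        return [node for node, count in self.nodes_complaints.items()
                if count > self.complaint_threshold]


book = ComplaintBook(complaint_threshold=1)
book.complain("a", "c")
book.complain("a", "c")  # ignored, duplicate within the epoch
book.complain("b", "c")
print(book.end_epoch())  # ['c']
```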
-
execute_epoch(epoch)[source] Orders all members to execute their epoch.
Note
If the Cluster terminates early, before it reaches MAX_EPOCHS, nothing should be logged in LoggingData at the specified epoch to avoid skewing previously collected results.
-
maintain(off_nodes)[source] Evicts any network node who has been complained about more than complaint_threshold times.
- Overrides:
-
nodes_execute()[source] Queries all network node members to execute the epoch.
- Overrides:
app.domain.cluster_groups.SGCluster.nodes_execute().
Offline network nodes are considered suspects until enough complaints from other SGNodeExt members are received. This is important because lost parts cannot be logged multiple times. Yet suspected network nodes need to be accounted for as offline for simulation purposes without being evicted from the group until they are detected by their peers as being offline.
- Returns
A collection of members who disconnected during the current epoch. See app.domain.network_nodes.SGNodeExt.update_status().
- Return type
List[NodeType]
-
-
class
SGClusterPerfect(master, file_name, members, sim_id=0, origin='')[source] Bases: app.domain.cluster_groups.SGCluster
Represents a group of network nodes persisting a file using a swarm guidance algorithm.
This implementation assumes nodes never disconnect, there are no disk errors and there is no link loss, i.e., it is used to study properties of the system independently of the computing environment.
-
__init__(master, file_name, members, sim_id=0, origin='')[source] Instantiates a Cluster object.
- Parameters
master (MasterType) – A reference to a Master object that manages the Cluster being initialized.
file_name (str) – The name of the file the Cluster is responsible for persisting.
members (NodeDict) – A dictionary where keys are node identifiers and values are their instance objects.
sim_id (int) – Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
origin (str) – The name of the simulation file that started the simulation process.
- Return type
-
execute_epoch(epoch)[source] Orders all members to execute their epoch.
Note
If the Cluster terminates early, before it reaches MAX_EPOCHS, nothing should be logged in LoggingData at the specified epoch to avoid skewing previously collected results.
-
new_transition_matrix()[source] Creates a new transition matrix that is likely to be a Markov Matrix.
- Returns
The labeled matrix that has the fastest mixing rate from all the pondered strategies.
- Return type
-
nodes_execute()[source] Queries all network node members to execute the epoch.
- Returns
A collection of members who disconnected during the current epoch. See app.domain.network_nodes.Node.update_status().
- Return type
List[NodeType]
-
select_fastest_topology(a, v_)[source] Creates multiple transition matrices and selects the fastest.
The fastest of the created transition matrices is the one with the fastest mixing rate.
- Parameters
- Returns
A transition matrix that is likely to be a Markov matrix whose steady state is v_, but is not yet validated. See _validate_transition_matrix().
- Return type
-
app.domain.master_servers
This module contains domain specific classes that coordinate all
app.domain.cluster_groups of a simulation instance. These could
simulate centralized authentication servers, file localization or
file metadata servers or a bank of currently online and offline
storage nodes.
-
class
HDFSMaster(simfile_name, sid, epochs, cluster_class, node_class)[source] Bases: app.domain.master_servers.Master
-
_process_simfile(path, cluster_class, node_class)[source] Opens and processes the simulation file referenced in path.
- Overrides:
app.domain.master_servers.Master._process_simfile().
The method is the same as the one it overrides, with two differences. First, _split_files() is invoked with a fixed bsize = 1MB. The reason for this is two-fold:
- The default and, thus recommended, block size for the Hadoop Distributed File System is 128MB. The system is not designed to perform well with small file blocks, but SG requires many file blocks to work, hence being more effective with small block sizes.
- Hadoop limits the minimum block size to 1MB (dfs.namenode.fs-limits.min-block-size). For this reason, we make HDFSMaster split files into 1MB chunks, as that is the closest we would get to our Hive’s default block size in the real world.
The other difference is that the spread strategy is ignored. We are not interested in knowing if the way the files are initially spread affects the time it takes for clusters to achieve a steady-state distribution, since in HDFS file block replicas are stationary on data nodes until they die.
- Parameters
path (str) – The path to the simulation file. Including extension and parent folders.
cluster_class (str) – The name of the class used to instantiate cluster group instances through reflection. See app.domain.cluster_groups.
node_class (str) – The name of the class used to instantiate network node instances through reflection. See app.domain.network_nodes.
- Return type
-
-
class
Master(simfile_name, sid, epochs, cluster_class, node_class)[source] Bases: object
Simulation manager class, some kind of puppet-master. Could represent an authentication server or a monitor that decides, along with other Master entities, which network nodes are online using consensus algorithms.
-
origin The name of the simulation file that started the simulation process.
- Type
-
sid Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
- Type
-
epoch The simulation’s current epoch.
- Type
-
cluster_groups A collection of cluster groups managed by the Master. Keys are cluster identifiers and values are the cluster instances.
-
network_nodes A dictionary mapping node identifiers to their instance objects. This collection differs from the app.domain.cluster_groups.Cluster.members attribute in the sense that network_nodes includes all nodes, both online and offline, available on the entire distributed backup storage system regardless of their participation in any cluster group.
-
__init__(simfile_name, sid, epochs, cluster_class, node_class)[source] Instantiates a Master object.
- Parameters
simfile_name (str) – A path to the simulation file to be run by the simulator.
sid (int) – Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
epochs (int) – The number of discrete time steps the simulation lasts.
cluster_class (str) – The name of the class used to instantiate cluster group instances through reflection. See cluster groups module.
node_class (str) – The name of the class used to instantiate network node instances through reflection. See network nodes module.
- Return type
-
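Instantiating a class from its name, as the cluster_class and node_class parameters imply, is commonly done in Python with importlib and getattr. The sketch below demonstrates the pattern on a standard library class so that it runs anywhere; in the simulator the module name would presumably be app.domain.cluster_groups or app.domain.network_nodes, and the helper name instantiate_by_name is hypothetical.

```python
import importlib


def instantiate_by_name(module_name, class_name, *args, **kwargs):
    """Look up class_name inside module_name and call its constructor."""
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)  # raises AttributeError if missing
    return cls(*args, **kwargs)


# Stand-in example using a standard library class.
counter = instantiate_by_name("collections", "Counter", "hello")
print(counter["l"])  # 2
```

A misspelled class name surfaces as an AttributeError at lookup time, which makes configuration mistakes visible before the simulation starts.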
_create_network_nodes(json, node_class)[source] Helper method that instantiates all network nodes that are specified in the simulation file.
-
_new_cluster_group(cluster_class, size, fname)[source] Helper method that initializes a new Cluster group.
- Parameters
cluster_class (str) – The name of the class used to instantiate cluster group instances through reflection. See cluster groups module.
fname (str) – The name of the file being stored in the cluster.
- Returns
The Cluster instance.
- Return type
-
_new_network_node(node_class, nid, node_uptime)[source] Helper method that initializes a new Node.
- Parameters
node_class (str) – The name of the class used to instantiate network node instances through reflection. See network nodes module.
nid (str) – An id that uniquely identifies the network node.
node_uptime (str) – A float value in string representation that defines the uptime of the network node.
- Returns
The Node instance.
- Return type
-
_process_simfile(path, cluster_class, node_class)[source] Opens and processes the simulation file referenced in path.
This method opens the file and reads the json data inside it. Combined with app.environment_settings, it sets up the class instances to be used during the simulation (e.g., cluster groups and network nodes). This method is also responsible for splitting the file to be persisted in the simulation into multiple blocks or chunks and for triggering the initial file spreading mechanism.
- Parameters
path (str) – The path to the simulation file, including extension and parent folders.
cluster_class (str) – The name of the class used to instantiate cluster group instances through reflection. See app.domain.cluster_groups.
node_class (str) – The name of the class used to instantiate network node instances through reflection. See app.domain.network_nodes.
- Return type
-
_split_files(fname, cluster, bsize)[source] Helper method that splits the files into multiple blocks to be persisted in a cluster group.
- Parameters
fname (str) – The name of the file located in the SHARED_ROOT folder to be read and split.
cluster (ClusterType) – A reference to the cluster group whose members will be responsible for ensuring the file specified in fname becomes durable.
bsize (int) – The maximum amount of bytes each file block can have.
- Returns
A dictionary in which the keys are integers and values are file blocks, whose attribute number is the key.
- Return type
-
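The splitting step can be pictured with the following sketch, which works on an in-memory bytes object instead of a file on disk. The function name split_into_blocks and the choice to number blocks from 1 are assumptions; the real method also wraps each block in a FileBlockData object.

```python
def split_into_blocks(data, bsize):
    """Split data into consecutive blocks of at most bsize bytes,
    keyed by block number."""
    if bsize <= 0:
        raise ValueError("bsize must be positive")
    blocks = {}
    starts = range(0, len(data), bsize)
    for number, start in enumerate(starts, start=1):
        blocks[number] = data[start:start + bsize]
    return blocks


blocks = split_into_blocks(b"abcdefg", 3)
print(blocks)  # {1: b'abc', 2: b'def', 3: b'g'}
```

Only the last block can be shorter than bsize, which matches the description of bsize as a maximum.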
find_online_nodes(n=1, blacklist=None)[source] Finds
nnetwork nodeswho are currently registered at theMasterand whose status is online.- Parameters
n (int) – How many
network nodereferences the requesting entity wants to find.blacklist (
NodeDict) – A collection ofnodes identifiersand their object instances, which specify nodes the requesting entity has no interest in.
- Returns
A collection of
network nodes which is at most as big as n and does not include any node named in blacklist.
- Return type
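A minimal sketch of the selection rule described above, assuming nodes are tracked in a dictionary from identifier to a status string. The `find_online` helper and the `"online"` status value are hypothetical and only illustrate the "at most n, excluding the blacklist" contract.

```python
from typing import Dict, Optional


def find_online(nodes: Dict[str, str], n: int = 1,
                blacklist: Optional[Dict[str, object]] = None) -> Dict[str, str]:
    """Return at most n online nodes whose identifier is not in blacklist."""
    excluded = blacklist or {}
    found: Dict[str, str] = {}
    for node_id, status in nodes.items():
        if len(found) >= n:
            break  # enough nodes were collected
        if status == "online" and node_id not in excluded:
            found[node_id] = status
    return found
```

The result can be smaller than `n` when too few eligible nodes are registered, which is why the documented contract says "at most".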
-
MAX_EPOCHS: Optional[int] = None
-
MAX_EPOCHS_PLUS_ONE: Optional[int] = None
-
-
class
NewscastMaster(simfile_name, sid, epochs, cluster_class, node_class)[source] Bases:
app.domain.master_servers.Master
-
__init__(simfile_name, sid, epochs, cluster_class, node_class)[source] Instantiates a Master object.
- Parameters
simfile_name (str) – A path to the simulation file to be run by the simulator.
sid (int) – Identifier that generates unique output file names, thus guaranteeing that different simulation instances do not overwrite previous out files.
epochs (int) – The number of discrete time steps the simulation lasts.
cluster_class (str) – The name of the class used to instantiate cluster group instances through reflection. See
cluster groups module.
node_class (str) – The name of the class used to instantiate network node instances through reflection. See
network nodes module.
- Return type
-
_process_simfile(path, cluster_class, node_class)[source] Opens and processes the simulation file referenced in path.
- Overrides:
app.domain.master_servers.Master._process_simfile().
Newscast is a gossip-based P2P network. We assume erasure coding would be used in this scenario and thus, for simplicity, we divide the specified file into multiple blocks, each with
1/N of the file’s size, where N is the number of network nodes in the system.
Note
This class,
NewscastCluster and NewscastNode were created to test our simulator’s performance with respect to the number of simultaneous network nodes supported in a simulation. We do not actually care if the created file blocks are lost, as the network nodes’ job in the simulation is to carry out the protocol defined in PeerSim’s AverageFunction. PeerSim uses configuration Example 2, provided in release 1.0.5, as a means of testing the simulator’s performance, according to this M.Sc. dissertation by J. Neto. This configuration uses the Newscast protocol with AverageFunction and periodic monitoring of the system state. We implement our version of Adaptive Peer Sampling with Newscast by N. Tölgyesi and M. Jelasity, to avoid the effort of translating PeerSim’s code.
- Parameters
path (str) – The path to the simulation file. Including extension and parent folders.
cluster_class (str) – The name of the class used to instantiate cluster group instances through reflection. See
app.domain.cluster_groups.
node_class (str) – The name of the class used to instantiate network node instances through reflection. See
app.domain.network_nodes.
- Return type
-
-
class
SGMaster(simfile_name, sid, epochs, cluster_class, node_class)[source] Bases:
app.domain.master_servers.Master
-
get_cloud_reference()[source] Used to obtain a reference to a third-party cloud storage provider.
The cloud storage provider can be used to temporarily host files belonging to
cluster groups in bad conditions that may compromise the durability of the files they are responsible for persisting.
Note
This method is virtual.
- Returns
A pointer to the cloud server, e.g., an IP Address.
- Return type
-
app.domain.network_nodes
This module contains domain specific classes that represent network nodes
responsible for the storage of file blocks. These could be
reliable servers or P2P nodes.
-
class
HDFSNode(uid, uptime)[source] Bases:
app.domain.network_nodes.Node
Represents a data node in the Hadoop Distributed File System.
-
execute_epoch(cluster, fid)[source] Instructs the
HDFSNode instance to execute the epoch.
The method iterates over the
files held on disk and attempts to corrupt them silently. In HDFS, file blocks’ sha256 hashes are only verified when a user or client accesses the remote replica. Hence, no replication epoch is set up when a corruption occurs. The corruption is still logged in the output file.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that invoked the Node method.
fid (str) – The
file name identifier of the file being simulated.
- Return type
-
replicate_part(cluster, replica)[source] Attempts to restore the replication level of the specified file block replica.
Replicas are sent to the Nodes in the
Cluster in descending order of reliability, from the most reliable down to the least reliable.
Note
There are no guarantees that
REPLICATION_LEVEL will be completely restored during the execution of this method.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that will deliver the new replica.
replica (
FileBlockData) – The file block replica to be delivered.
- Return type
-
update_status()[source] Used to update the time to live of the node instance.
When invoked, the network node decides if it should remain online or change to some other state.
- Returns
The status of the
Node.
- Return type
-
-
class
NewscastNode(uid, uptime)[source] Bases:
app.domain.network_nodes.Node
Represents a peer running the Newscast protocol, using shuffling techniques to exchange acquaintances with other network peers and performing peer degree aggregation using AverageFunction.
-
view A partial view of the P2P network.
View is a collection of network nodes that the NewscastNode instance may contact, other than itself. Keys are NewscastNode instances, and values are their age in the dictionary. A key-value pair is commonly referred to as a descriptor.
-
aggregation_value Stores the aggregation value. The type of
aggregation_value is defined by the body of the aggregate() method.
-
__init__(uid, uptime)[source] Instantiates a
Node object.
These are network nodes responsible for persisting
file block replicas.
-
_merge(a, b)[source] Merges two network views. If a node descriptor exists in both views, the most recent descriptor is kept.
- Parameters
a (_NetworkView) – A dictionary where keys are
network nodes and values are their respective age in the view.
b (_NetworkView) – A dictionary where keys are
network nodes and values are their respective age in the view.
- Returns
The set union of both views with only the most up to date descriptors.
- Return type
_NetworkView
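The merge rule above can be sketched with plain dictionaries. This is an illustration under two assumptions: node identifiers are used as keys instead of node instances, and the "most recent" descriptor is taken to be the one with the lowest age. `merge_views` is a hypothetical name.

```python
from typing import Dict


def merge_views(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, int]:
    """Union of two views; on duplicates keep the descriptor with the lowest age."""
    merged = dict(a)  # copy so that neither input view is mutated
    for node, age in b.items():
        if node not in merged or age < merged[node]:
            merged[node] = age
    return merged
```

For example, merging `{"n1": 3, "n2": 0}` with `{"n1": 1, "n3": 5}` keeps `n1` at age 1 and includes both `n2` and `n3`.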
-
_select_view(view_buffer)[source] Reduces the size of the view to a predefined maximum size.
- Parameters
view_buffer (_NetworkView) – A dictionary where keys are network nodes and values are their respective age in the view.
- Returns
The view_buffer with at most max_view_size descriptors.
- Return type
_NetworkView
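A minimal sketch of the cropping step, assuming the youngest descriptors are the ones kept (the source only states the size bound). `select_view` is a hypothetical function and `max_view_size` is passed as an argument here rather than read from the node.

```python
from typing import Dict


def select_view(view_buffer: Dict[str, int], max_view_size: int) -> Dict[str, int]:
    """Keep at most max_view_size descriptors, preferring the lowest ages."""
    if max_view_size < 0:
        raise ValueError("max_view_size must not be negative")
    # sorted() is stable, so descriptors with equal age keep insertion order.
    youngest = sorted(view_buffer.items(), key=lambda item: item[1])
    return dict(youngest[:max_view_size])
```

A buffer that is already within the limit is returned with all of its descriptors.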
-
add_neighbor(node)[source] Adds a new network node to the node instance’s view.
If the view is full, the eldest node is replaced with the new node. Otherwise, the new
NewscastNode is added to the instance’s view with age zero, unless the entry is already in view or the node is the current NewscastNode instance.
- Returns
True if node was successfully added, False otherwise.
- Parameters
- Return type
-
aggregate(node=None)[source] The network node instance contacts another node from its view; both nodes then assign the mean of their degrees to
aggregation_value.
- Parameters
node (Optional[app.domain.network_nodes.NewscastNode]) – When
node is None, a random NewscastNode is selected from view. When specified, the node to be contacted is the one referenced in the parameter.
- Return type
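The aggregation step can be illustrated with a small stand-in class. `Peer` is hypothetical and only mirrors the documented behaviour: the degree is the number of descriptors in the view, and both participants end up with the mean of their degrees. Random peer selection and liveness checks are left out.

```python
from typing import Dict


class Peer:
    """Hypothetical, simplified stand-in for a Newscast peer."""

    def __init__(self, name: str, view: Dict[str, int]) -> None:
        self.name = name
        self.view = dict(view)
        self.aggregation_value = 0.0

    def get_degree(self) -> int:
        # The degree is the number of descriptors in the view.
        return len(self.view)

    def aggregate(self, other: "Peer") -> None:
        # Both peers store the mean of their degrees.
        mean = (self.get_degree() + other.get_degree()) / 2
        self.aggregation_value = mean
        other.aggregation_value = mean
```

With one peer of degree 3 and another of degree 1, both hold 2.0 after a single exchange.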
-
execute_epoch(cluster, fid)[source] Instructs the
NewscastNode instance to execute the epoch.
During the execution of the epoch, the
NewscastNode instance randomly selects another NewscastNode that belongs to its view and aggregates their degrees using the AverageFunction. Sometimes, during the epoch, the NewscastNode instance will also perform shuffling with the selected target.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that invoked the Node method.
fid (str) – The
file name identifier of the file being simulated.
- Return type
-
get_degree()[source] Counts the number of descriptors in the node’s view.
- Returns
The degree of the
NewscastNode instance.
- Return type
-
get_node()[source] Gets a random node from the current network view.
Each candidate
NewscastNode to be returned is first pinged. If no answer is obtained, the previous candidate is removed from the view and another node is selected as a candidate by iterating over a list representation of view.
Note
Newscast should always return a random node, thus iteration should not be used, but this search is more efficient and readable.
- Returns
The selected
NewscastNode.
- Return type
Optional[app.domain.network_nodes.NewscastNode]
-
replicate_part(cluster, replica)[source] Attempts to restore the replication level of the specified file block replica.
Similar to
send_part() but with slightly different instructions. In particular, new replicas cannot be corrupted at the current node during the current epoch.
Note
There are no guarantees that
REPLICATION_LEVEL will be completely restored during the execution of this method.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that will deliver the new replica.
replica (
FileBlockData) – The file block replica to be delivered.
- Raises
NotImplementedError – When children of
Node do not implement the abstract method.
- Return type
-
shuffle(node)[source] Starts a shuffle process that merges and crops two nodes’ views at the current node and at the destination node.
The final view consists of most up to date descriptors from both
views up to a maximum of max_view_size descriptors.
- Parameters
node (app.domain.network_nodes.NewscastNode) – The node to be contacted for shuffling.
- Return type
-
shuffle_request(senders_view)[source] Merges and crops two nodes’ views at the current node.
The final view consists of most up to date descriptors from both
views up to a maximum of max_view_size descriptors.
- Parameters
senders_view (_NetworkView) – A dictionary where keys are
network nodes and values are their respective age in the view.
- Returns
A
view and a fresh descriptor from the NewscastNode instance, before it is merged with the requester’s view.
- Return type
_NetworkView
-
update_status()[source] Used to update the time to live of the node instance.
When invoked, the network node decides if it should remain online or change to some other state.
- Returns
The status of the
Node.
- Return type
-
-
class
Node(uid, uptime)[source] Bases:
object
This class contains basic network node functionality that should always be useful.
-
id A unique identifier for the
Node instance.
- Type
-
uptime The amount of time the
Node is expected to remain online without disconnecting. The current uptime implementation is based on availability percentages.
Note
The current implementation expects
network nodes joining a cluster group to remain online for approximately time_to_live = uptime * MAX_EPOCHS.
However, a
network node that belongs to multiple cluster groups may disconnect earlier than that, i.e., network nodes remain online for time_to_live epochs after their first operation on the distributed backup system.
- Type
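As a worked example of the relation time_to_live = uptime * MAX_EPOCHS, the sketch below assumes uptime is expressed as a fraction between 0 and 1; the source only says it is based on availability percentages, so this scale is an assumption. `expected_time_to_live` is a hypothetical helper, not part of the module.

```python
def expected_time_to_live(uptime: float, max_epochs: int) -> float:
    """Approximate number of epochs a node stays online (uptime as a fraction)."""
    if not 0.0 <= uptime <= 1.0:
        raise ValueError("uptime is assumed to be a fraction in [0, 1]")
    return uptime * max_epochs


# A node with 75% availability in a 720-epoch simulation.
print(expected_time_to_live(0.75, 720))  # prints 540.0
```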
-
status Indicates if the
Node instance is online or offline. In later releases this could also contain a ‘suspect’ status.
-
suspicious_replies Collection that contains
http codes that, when received, trigger complaints to monitors about the replier.
- Type
-
files A dictionary mapping file names to dictionaries of file block identifiers and their respective contents, i.e., the
file block replicas hosted at the Node.
- Type
Dict[str,
ReplicasDict]
-
__init__(uid, uptime)[source] Instantiates a
Node object.
These are network nodes responsible for persisting
file block replicas.
-
discard_part(fid, number, corrupt=False, cluster=None)[source] Safely deletes a part from the Node instance’s disk.
- Parameters
fid (str) – Name of the file the file block replica belongs to.
number (int) – The part number that uniquely identifies the file block.
corrupt (bool) – Whether discard is being invoked due to identified file block corruption, e.g., the SHA-256 hash does not match the expected one.
cluster (
ClusterType) – Cluster that will set the replication epoch or mark the simulation as failed.
- Return type
-
execute_epoch(cluster, fid)[source] Instructs the
Node instance to execute the epoch.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that invoked the Node method.
fid (str) – The
file name identifier of the file being simulated.
- Raises
NotImplementedError – When children of
Node do not implement the abstract method.
- Return type
-
get_file_parts(fid)[source] Gets collection of file parts that correspond to the named file.
- Parameters
fid (str) – The
file name identifier that designates the file block replicas to be retrieved.
- Returns
A dictionary where keys are
file block numbers and values are file block replicas.
- Return type
-
get_file_parts_count(fid)[source] Counts the number of file block replicas of a specific file owned by the
Node.
- Parameters
fid (str) – The
file name identifier that designates the file block replicas to be counted.
- Returns
The number of counted replicas.
- Return type
-
is_suspect()[source] Returns
True if the node is behaving suspiciously, else False.
- Return type
-
receive_part(replica)[source] Endpoint for file block replica reception.
The
Node stores a new file block replica in files if it does not already have a replica with the same identifier.
- Parameters
replica (domain.helpers.smart_dataclasses.FileBlockData) – The
file block replica to be received by Node.
- Returns
If upon integrity verification the
sha256 hash value differs from the expected one, the worker replies with BAD_REQUEST. If the Node already owns a replica with the same identifier, it replies with NOT_ACCEPTABLE. Otherwise, it replies with OK, i.e., the delivery is successful.
- Return type
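The three replies described above can be sketched as follows. `Replica` and this free-standing `receive_part` are hypothetical simplifications of the documented method; using `http.HTTPStatus` for the codes and checking integrity before duplicates are both assumptions.

```python
import hashlib
from http import HTTPStatus
from typing import Dict, NamedTuple


class Replica(NamedTuple):
    """Hypothetical stand-in for a file block replica."""
    name: str     # file name identifier
    number: int   # block number within the file
    data: bytes
    sha256: str   # expected hex digest of data


def receive_part(files: Dict[str, Dict[int, Replica]], replica: Replica) -> HTTPStatus:
    """Store replica in files and reply with an HTTP status code."""
    if hashlib.sha256(replica.data).hexdigest() != replica.sha256:
        return HTTPStatus.BAD_REQUEST  # integrity check failed
    held = files.setdefault(replica.name, {})
    if replica.number in held:
        return HTTPStatus.NOT_ACCEPTABLE  # a replica of this block is already held
    held[replica.number] = replica
    return HTTPStatus.OK
```

Delivering the same valid replica twice therefore yields OK on the first call and NOT_ACCEPTABLE on the second.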
-
replicate_part(cluster, replica)[source] Attempts to restore the replication level of the specified file block replica.
Similar to
send_part() but with slightly different instructions. In particular, new replicas cannot be corrupted at the current node during the current epoch.
Note
There are no guarantees that
REPLICATION_LEVEL will be completely restored during the execution of this method.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that will deliver the new replica.
replica (
FileBlockData) – The file block replica to be delivered.
- Raises
NotImplementedError – When children of
Node do not implement the abstract method.
- Return type
-
send_part(cluster, destination, replica)[source] Attempts to send a replica to some other network node.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that will deliver the new replica. In a real-world implementation this argument would not make sense, but we use it to facilitate simulation management and environment logging.
destination (str) – The name, address or another unique identifier of the node that will receive the file block replica.
replica (
FileBlockData) – The file block container to be sent to some other worker.
- Returns
An http code.
- Return type
-
-
class
SGNode(uid, uptime)[source] Bases:
app.domain.network_nodes.Node
Represents a network node that executes a Swarm Guidance algorithm.
-
clusters A collection of
cluster groups the SGNode is a member of.
-
routing_table Contains the information required to appropriately route file blocks to other SGNode instances.
- Type
Dict[str,
DataFrame]
-
__init__(uid, uptime)[source] Instantiates a
Node object.
These are network nodes responsible for persisting
file block replicas.
-
execute_epoch(cluster, fid)[source] Instructs the
Node instance to execute the epoch.
The method iterates over all file blocks in
files and independently decides if they should be sent to another SGNode by following the probabilities in routing_table column vectors.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that invoked the Node method.
fid (str) – The
file name identifier of the file being simulated.
- Return type
-
remove_file_routing(fid)[source] Removes a file name from the
SGNode routing table.
This method is called when an
SGNode is evicted from the cluster group and results in the deletion from disk of all file block replicas with identifier fid.
- Parameters
fid (str) – The
file name identifier of the file whose routing is being eliminated.
- Return type
-
replicate_part(cluster, replica)[source] Attempts to restore the replication level of the specified file block replica.
Similar to
send_part() but with slightly different instructions. In particular, new replicas cannot be corrupted at the current node during the current epoch. The replicas are also sent selectively, in descending order of reliability, from the most reliable Nodes in the Cluster down to the least reliable, whereas send_part() follows stochastic swarm guidance routing.
Note
There are no guarantees that
REPLICATION_LEVEL will be completely restored during the execution of this method.
- Parameters
cluster (
ClusterType) – A reference to the Cluster that will deliver the new replica.
replica (
FileBlockData) – The file block replica to be delivered.
- Return type
-
select_destination(fid)[source] Selects a random message destination according to routing_table probabilities for the specified file name.
- Parameters
fid (str) – The
file name identifier to obtain the proper routing_table for destination selection.
- Returns
The name or address of the selected destination.
- Return type
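Probability-weighted destination selection can be sketched with the standard library. The documented routing_table holds DataFrame column vectors; here a plain dictionary from destination name to probability stands in for one column, and this `select_destination` is a hypothetical free function rather than the SGNode method.

```python
import random
from typing import Dict, Optional


def select_destination(column: Dict[str, float],
                       rng: Optional[random.Random] = None) -> str:
    """Pick a destination at random, weighted by the column's probabilities."""
    rng = rng or random.Random()
    destinations = list(column)
    weights = [column[name] for name in destinations]
    # random.choices draws with replacement according to the given weights.
    return rng.choices(destinations, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance makes a run reproducible, which can be convenient when comparing simulation outputs.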
-
set_file_routing(fid, v_)[source] Maps a file name identifier to a transition column vector used for file block replica routing.
- Parameters
fid (str) – The
file name identifier of the file whose routing is being configured.
v_ (Union[
Series, DataFrame]) – A column vector with probabilities that dictate the odds of sending file blocks belonging to the file with the specified id to other Cluster members also working on the persistence of those file blocks.
- Raises
ValueError – If
transition_vector is not a DataFrame and cannot be cast to it.
- Return type
-
-
class
SGNodeExt(uid, uptime)[source] Bases:
app.domain.network_nodes.SGNode
Represents a network node that executes a Swarm Guidance algorithm.
SGNodeExt instances differ from SGNode in that the latter does not monitor the peers belonging to its cluster groups with respect to their connectivity status or suspicious behaviours.
-
update_status()[source] Used to update the time to live of the node instance.
When invoked, the network node decides if it should remain online or change to some other state.
- Returns
The status of the
Node.
- Return type
-
-
_NetworkView: Dict[Union[str, app.domain.network_nodes.Node], int]