Gen-Netways

A Genetic Algorithm for Network Diameter Reduction

Contents

1 Introduction
1.1 Network Optimization
1.2 Genetic Algorithms
1.3 Simulated annealing
2 Architecture
2.1 Graph Structure
2.2 Gen-Netways Architecture
2.3 Mutation
2.4 Interbreeding
3 Test Results
3.1 ...
4 The Future
4.1 Breeding and mutation
4.2 Crystals
4.3 Beyond programming
4.4 Last Words

Abstract

Given is an arbitrary graph with nodes V and edges E. The goal is to minimize the diameter (the longest distance between two nodes) of the graph while maintaing a minimum spanning tree of all nodes (aka: every node is reachable by every other node). The only allowed modification hereby is to unify or split edges such that the total of waylength always remains the same (e.g. replacing two edges of length one with one edge of length two: n1-n2-n3 becomes n1--n3; n2 must be connected by other nodes to fulfill the spanning tree requirement).

It is widely accepted that this task is NP-hard, so a deterministic algorithm cannot have a good runtime. Instead, a genetic algorithm is used: One solution is one individual. By combinating and mutating superiour individuals and removing inferiour ones, it is hoped that a close-to-optimum solution can be found in far less time than with a deterministic algorithm.

The term "evolutionary programming" was coined for algorithms that mimick mother nature - in the case of "genetic programming" the darwinian evolution theory. Genetic programming is therefore subbranch of evolutionary programming.

1 Introduction

1.1 Network Optimization

Every computer network, may it be processors on a board or computers on the internet, has to deal with the problem of communication speed. Obviously, the more "stations" are involved on the way from one node to the other, the slower the response time. Thus, on any given network, one would wish that the diameter, that is: the longest way from any node to another, should be minimized.

On hardware systems like those used in parallel computing this diameter is particulary important in respect to guaranteed response times. While for some structures like quadratic meshs the optimal solution is already known (H-trees), irregular graphed networks require customized solutions.

The process of minimizing. A given graph of processing nodes has usually the constraint that the number and length of links is immutable. So the only possible adjustment is to join links, hereby removing the node in the middle (e.g. n1-n2-n3, a 3-node link, becomes n1-n3, a two node link; while the waylength from n1 to n3 is optimized (2 becomes 1), n2 may, dependent on the surrounding graph, suffer from this join).

Finding an optimal solution to this problem is NP-hard, so it is not desireable to use a determistic algorithm. Instead, heuristic approaches may be better suited.

1.2 Genetic Algorithms

Genetic algorithms, a subbranch of evolutionary programming, is such a heuristic approach to NP problems. Genetic algorithms mimick the evolution ("survival of the fittest") of natural beings. In such a paradigm, each solution is such a being ("individual").

Individuals can mutate or interbreed, thus creating new solutions that are only slightly different from their parent(s). Each solution has a fitness rating (in this case: the diameter of the graph), and only the fittest survive; thus, a step-by-step improvement process is initiated.

The basic process may be explained by a pseudo-algorithm.

while ( <evolution should run> ) {
  <kill everyone except the top 10% or so>
  <interbreed them>
  <let some of the children mutate>
}

It is easy to see that the top 10% of solutions never become worse than in the previous generation - good solutions are simply immortal and may only be superseded by better children or mutations.

While the breeding process combines the attributes of the parents at random, the mutation is far more random - virtually any configuration can be achieved by mutation. In Gen-netways the frequency and severeness of mutation is dependent on the radiation in the gene pool; this radiation changes over time.

As a conclusion, genetic algorithm need:

Genetic algorithms can achieve good results - yet it is improbable that they will get the optimal result. But for NP-hard problems, close-to-perfect is good enough.

1.3 Simulated annealing

Simulated annealing is another heuristic technique used for NP problems. Similar to the genetic programming paradigm, solutions are individuals. In contrast, there is no survival of the fittest and no interbreeding - yet there is mutation.

Annealing (cooling down) referes to the severeness of mutation. In the beginning, mutations tend to be big, thus reaching every region in the solution space; then, mutations begin to become smaller, heading for local minima (minima in the case of diameter reduction - normally maxima). This way, all local minima are reached, and one of them is hopefully the global minimum.

In Gen-netways this effect is (somewhat limited) incorporated by the principle of radiation. From time to time (random; factor can be customized) a nuclear outbreak occurs, raising the radiation to maximum; in the following time, radiation converges to the norm again.

It is hoped that the use of simulated annealing principles can reduce the concentration on local minima in the gene pool.

2 Architecture

2.1 Graph Structure

The native structure of the graph are Path objects that contain two or more nodes in order, indicating that these nodes are connected by a path (e.g. one Path object would say: n1-n2-n3). Together, those path objects form the "genes" of each individual - genes that can be recombinated or mutate.

For fitness calculation, the graph will be represented as a distance matrix M which is quadratic; each side has the number of nodes as length. In this matrix, an upper triangle is build which represents the costs of reaching a point from another. Since every edge is bidirectional, redundant information can be erased.

Graph Structure Example
Graph Structure Example

Example: 0 means reachable at no cost, 1 means directly reachable with waylength 1, -1 means not directly reachable, x means "does not matter".

2.2 Gen-Netways Architecture

The architecture in Gen-Netways consists of 4 components: Path, Individual, GenePool and World.

Path. The smallest unit in the system, a Path is simple a link from one node to another. It may have more nodes inbetween, though; these nodes indicate where the link lies on the graph (the link may be splitted at these nodes later on). Example: n1-n2 is a simple path of length 1; n1-n2-n3 is a path between n1 und n3 of length 1 but it may be splitted to form to paths (n1-n2 and n2-n3) of length 1 each. Paths are the "genes" of individuals.

Individual. Each individual is one solution; everyone contains a list of Paths (genes) that define the solution graph. The fitness of each individual is calculated using a adjacency matrix derived from the Path list. Individuals can breed by mixing their Path lists and mutate by making changes to the list.

GenePool. A gene pool consists of a number of individuals. Radiation and selection parameters are also properties of the gene pool. The number of individuals remains the same; in each evolution cycle a certain number of survivors (dependant on parameters, default 40%) is selected and the remaining slots are filled with children and mutations. A gene pool primarily serves as a collection of individuals and controller of their surroundings.

World. The world keeps the gene pool running. A world could, in future versions, also contain more than one gene pool. In Gen-Netways, the World serves as starting point and as model for the GUI (model-view-controller pattern, a software technique).

2.3 Mutation

Mutation is the first and more simply method of changing an individual. The simplest way to mutate a path list is:

In Gen-Netways, however, the amount of radiation is to be taken in account. The most obvious effect of higher radiation is the increased number of mutations per individual.

Additionally, to fulfill the idea of simulated annealing, higher radiation results in building "snakes" of path objects, thus creating bigger jumps (node-wise). All the time only if the result is still a spanning tree, of course. The slow-cool-down property of the radiation hopefully provides a simulated-annealing effect.

On the breakdown side, paths are chosen at random, and, when they have a length greater than the standard-2, broken at a random point. Since no connecting node is taken away, the resulting graph is always a spanning tree if it was one before the breakdown.

2.4 Interbreeding

Interbreeding is the process of mixing the genes (paths) of 2 or more parents to create a child.

The first parent is taken as blueprint (copy) for the child, thereafter all other parents are searched for paths that have a length greater than the standard-2. Each of these longer paths has only a probability of 1/ParentCount to be taken into account of all.

If this "outstanding gene" is to be applied the child, it is checked if there is a substitute that can be removed to compensate the adding of a new path (e.g. when n1-n2-n3-n4 is to be inserted that must be paid for with removal of equivalent paths, one e.g. being the combination n1-n2-n3 and n3-n4). This ensures that only the ways that exists are used, and none is used twice. Even if there are substitutes, the result after the insertion must still be a spanning tree. If not, insertion is rejected.

It is to be noted that the first parent is in clear advantage - all of its outstanding genes are in the first blueprint of the child; on the other hand, its outstanding genes are more likely to be removed during the insertion process of partner genes.

3 Test Results

[Lots of tables, differentiating parameters, blabla rhabarber...]

From first tests it seems that mutation is far more powerful than breedings - maybe the simulated annealing principle is working better than the overall approach of genetic algorithms. But since interbreeding is faulty (see also 4.1), this may be the mistake.

4 The Future

4.1 Breeding and mutation (from the yet-to-be-done dept.)

At this point, there is an interbreeding problem: When it comes to path replacement (a longer path replaces several shorter ones), all the subpaths must be standard-2 in length (e.g. inserting n1-n2-n3-n4 is only possible if n1-n2, n2-n3 and n3-n4 exist; n1-n2 and n2-n3-n4 would theoretically suffice, but are not taken in the current version). Should be fixed when I return from holidays.

Another idea is that there could be more than one gene pool. A number of "houses" would only interbreed among each other (with a few "interracial mixes" to improve gene pool quality), and from time to time wage nuclear wars, resulting in increased radiation on the others side. Unlike the real world, it is not recommended to kill the fittest individuals in senseless wars.

4.2 Crystals (from the cool-but-undeveloped-idea dept.)

Since the H-tree is optimal for quadratic meshs, I wonder how crystal-like structures can be applied to regular graphs of odd forms. In particular, each likeness for a different kind of crystals could be genes, and individuals with a tendency towards structure could compete with individuals with a tendency towards chaos (stronger mutation).

4.3 Beyond programming (from the on-the-other-side dept.)

Anyone into genetic programming could regularly take a look at both bio-computers (some even work with DNA) and quantum computers, as those both offer unprecedented parallel performance - most obvious, genetic programming with its massive parallelity could greatly profit from it, not to mention the similarity between the techniques.

4.4 Last Words

Thank you for reading this document. Have a nice day!

EOF (Jul:2001)