From 3b625af8fe0abdde66950bd9fa1135dd1098a8dc Mon Sep 17 00:00:00 2001 From: Francesc Verdugo Date: Wed, 25 Sep 2024 14:21:36 +0200 Subject: [PATCH] Polishing TSP notebook. --- notebooks/tsp.ipynb | 221 +++++++++++++++++++++++++++++++++++++------- 1 file changed, 186 insertions(+), 35 deletions(-) diff --git a/notebooks/tsp.ipynb b/notebooks/tsp.ipynb index 767c2c6..112b284 100644 --- a/notebooks/tsp.ipynb +++ b/notebooks/tsp.ipynb @@ -72,7 +72,7 @@ "## The traveling sales person (TSP) problem\n", "\n", "\n", - "In this notebook we will study another algorithm that works with graphs, the [traveling sales person (TSP) problem](https://en.wikipedia.org/wiki/Travelling_salesman_problem). The classical formulation of this problem is as follows (quoted from Wikipedia) \"Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once?\" This problem as applications in combinatorial optimization, theoretical computer science, and operations research. It is very expensive problem to solve (NP-hard problem) which often needs parallel computing.\n", + "In this notebook, we will study another algorithm that works with graphs, the [traveling sales person (TSP) problem](https://en.wikipedia.org/wiki/Travelling_salesman_problem). The classical formulation of this problem is as follows (quoted from Wikipedia): \"Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once?\" This problem has applications in combinatorial optimization, theoretical computer science, and operations research. It is a very expensive problem to solve (NP-hard), which often makes parallel computing necessary to solve it.\n", "\n", "
\n", "Note: There are two key variations of this problem. One in which the sales person returns to the initial city, and another in which the sales person does not return to the initial city. We will consider the second variant for simplicity.\n", @@ -106,7 +106,7 @@ "source": [ "### Sequential algorithm (branch and bound)\n", "\n", - "A well known method to solve this problem is based on a [branch and bound](https://en.wikipedia.org/wiki/Branch_and_bound) strategy. It consisting in organizing all possible routes in a tree-like structure (this is the \"branch\" part). The root of this tree is the initial city. The children of each node in the graph are the neighbor cities that have not been visited in the path so far. When all neighbor cities are already visited, the city becomes a leaf node in the tree. See figure below for the tree associated with our TSP problem example. The TSP problem consists now in finding which is the \"shortest\" branch in this tree. The tree data structure is just a convenient way of organizing all possible routes in order to search for the shortest one. We refer to it as the *search tree* or the *search space*." + "A well-known method to solve this problem is based on a [branch and bound](https://en.wikipedia.org/wiki/Branch_and_bound) strategy. It consists of organizing all possible routes in a tree-like structure (this is the \"branch\" part of the branch and bound strategy). The root of this tree is the initial city. The children of each node in the tree are the neighbor cities that have not been visited in the path so far. When all neighbor cities are already visited, the city becomes a leaf node in the tree. See the figure below for the tree associated with our TSP problem example. The TSP problem now consists of finding the \"shortest\" branch in this tree. 
The tree data structure is just a convenient way of organizing all possible routes in order to search for the shortest one by checking them one by one, from left to right, using a [depth-first search](https://en.wikipedia.org/wiki/Depth-first_search). We refer to it as the *search tree* or the *search space*." ] }, { @@ -131,7 +131,7 @@ "source": [ "### Nearest city first heuristic\n", "\n", - "When building the search tree we are free to choose any order when defining the children of a node. A clever order is using the *nearest city first heuristic*. I.e., we sort the children according to how far they are from the current node, in ascending order. This allows to quickly find a minimum bound for the distance which will be used to prune the remaining paths (see next section). The figure above used the nearest city first heuristic. In blue you can see the distance between cities. The first child is always the one with the shortest distance." + "When building the search tree, we are free to choose any order when defining the children of a node. A clever choice is the *nearest city first heuristic*. I.e., we sort the children according to how far they are from the current node, in ascending order. This heuristic increases the chances that the optimal route is located at the left of the tree, which will allow us to speed up the search using pruning (see next section). The figure above used the nearest city first heuristic. In blue you can see the distance between cities. The first child is always the one with the shortest distance." ] }, { @@ -141,9 +141,8 @@ "source": [ "### Pruning the search tree\n", "\n", - "The basic idea of the algorithm is to loop over all possible routes (all branches in the search tree) and find find the one with the shortest distance. One can optimize this process by \"pruning\" the search tree. We keep track of the best solution of all paths visited so far, which allows us to skip searching paths that already exceed this value. 
This is the \"bound\" part of the branch and bound strategy. \n", - "\n", - "For example, in the following graph only 3 out of 6 possible routes need to be fully traversed to find the shortest route. In particular, we do not need to fully traverse the second branch/route (figure below left). when visiting the third city in this branch the current distance is already equal to the full previous route. It means that the solution will not be in this part of the tree for sure. In figure below (right), the gray nodes are the ones we do not visit because the minimum distance had been exceeded before completing the route." + "The basic idea of the algorithm is to loop over all possible routes (all branches in the search tree) and find the one with the shortest distance. One can optimize this process by \"pruning\" the search tree. We visit all possible routes in the tree from left to right. In this process, we keep track of the shortest route visited so far. While visiting a new node in a route, we check if the distance traveled in this route is already equal to or larger than the length of the shortest route traversed so far. If this is the case, we can skip all routes that continue from this city, as we know for sure that a shorter route will not be in this part of the tree. This is called \"pruning\" since we are cutting branches of the tree, and it is the \"bound\" part of the branch and bound strategy. \n", + "For example, in the following graph, only 3 out of 6 possible routes need to be fully traversed to find the shortest route. In particular, we do not need to fully traverse the second branch/route (figure below left). When visiting the third city in this branch, the current distance is already equal to the length of the full previous route. It means that a shorter route will not be in this part of the tree. In the figure below (right), the gray nodes are the ones we do not visit because the minimum distance had been reached or exceeded before completing the route." 
] }, { @@ -168,9 +167,9 @@ "source": [ "### Computation complexity\n", "\n", - "The total number of routes we need to traverse is $O(N!)$, where $N$ is the number of cities. This comes from the fact that the number of possible routes is equal to the number of possible permutations of $N$ cities. Thus the cost of the algorithm is $O(N!)$, which becomes expensive very quickly when $N$ grows.\n", + "The total number of routes is $O(N!)$, where $N$ is the number of cities. This comes from the fact that the number of possible routes is equal to the number of possible permutations of $N$ cities. Thus, the cost of the algorithm is $O(N!)$, which becomes expensive very quickly when $N$ grows.\n", "\n", - "In practice, however, we will not need to traverse all $O(N!)$ possible routes to find the shortest one since we consider pruning. The nearest city first heuristic also makes more likely that the shortest route is among the first routes to be traversed (left part of the tree), thus speeding the process. However, the solution can be anywhere in the search tree, and the number of routes to be traversed is $O(N!)$ in the worse case scenario." + "In practice, however, we will not need to traverse all $O(N!)$ possible routes to find the shortest one since we consider pruning. The nearest city first heuristic also makes it more likely that the shortest route is among the first routes to be traversed (left part of the tree), thus increasing the chance of pruning routes afterwards. However, the solution can be anywhere in the search tree, and the number of routes to be traversed is $O(N!)$ in the worst case scenario (for instance, when the optimal route is the right-most one)." 
] }, { @@ -200,10 +199,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "id": "a50706bc", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "sort_neighbors (generic function with 1 method)" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "function sort_neighbors(C)\n", " n = size(C,1)\n", @@ -226,10 +236,25 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "id": "6dd0288e", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "4-element Vector{Vector{Tuple{Int64, Int64}}}:\n", + " [(1, 0), (2, 2), (4, 2), (3, 3)]\n", + " [(2, 0), (4, 1), (1, 2), (3, 4)]\n", + " [(3, 0), (1, 3), (4, 3), (2, 4)]\n", + " [(4, 0), (2, 1), (1, 2), (3, 3)]" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "C = [\n", " 0 2 3 2\n", @@ -250,12 +275,27 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 3, "id": "00608e1d", "metadata": { "scrolled": true }, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "4-element Vector{Tuple{Int64, Int64}}:\n", + " (3, 0)\n", + " (1, 3)\n", + " (4, 3)\n", + " (2, 4)" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "city = 3\n", "C_sorted[city]" @@ -282,7 +322,7 @@ "id": "6c91a99f", "metadata": {}, "source": [ - "Next, we write an algorithm that traverses the whole search tree and prints all the possible paths. To this end, the tree is traversed in [depth-first order](https://en.wikipedia.org/wiki/Depth-first_search) using a recursive function call. Before we go to a neighbouring city, we also have to verify that it has not been visited on this path yet. If we reach a leaf node, we print the complete path and continue searching. 
" + "Next, we write an algorithm that traverses the whole search tree and prints all the possible paths. To this end, the tree is traversed in [depth-first order](https://en.wikipedia.org/wiki/Depth-first_search) using a recursive function call. Before we go to a neighbouring city, we also have to verify that it has not been visited on this path yet. If we reach a leaf node, we print the complete path and continue searching." ] }, { @@ -302,10 +342,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "id": "2ddc2ec1", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "visit_all_paths_recursive! (generic function with 1 method)" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "function visit_all_paths(C_sorted,city)\n", " # Initialize path\n", @@ -339,10 +390,23 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 5, "id": "723a0f1a", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "I just completed route [1, 2, 4, 3]\n", + "I just completed route [1, 2, 3, 4]\n", + "I just completed route [1, 4, 2, 3]\n", + "I just completed route [1, 4, 3, 2]\n", + "I just completed route [1, 3, 4, 2]\n", + "I just completed route [1, 3, 2, 4]\n" + ] + } + ], "source": [ "city = 1\n", "visit_all_paths(C_sorted,city)" @@ -385,10 +449,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "id": "5989f0ac", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "tsp_serial_no_prune_recursive! 
(generic function with 1 method)" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "verbose::Bool = true\n", "function tsp_serial_no_prune(C_sorted,city)\n", @@ -428,10 +503,33 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 7, "id": "d1be2bfc", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "I just completed route [1, 2, 4, 3]. Min distance so far is 6\n", + "I just completed route [1, 2, 3, 4]. Min distance so far is 6\n", + "I just completed route [1, 4, 2, 3]. Min distance so far is 6\n", + "I just completed route [1, 4, 3, 2]. Min distance so far is 6\n", + "I just completed route [1, 3, 4, 2]. Min distance so far is 6\n", + "I just completed route [1, 3, 2, 4]. Min distance so far is 6\n" + ] + }, + { + "data": { + "text/plain": [ + "6" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "city = 1\n", "verbose = true\n", @@ -465,10 +563,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 8, "id": "8241e0df", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "tsp_serial_recursive! (generic function with 1 method)" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "function tsp_serial(C_sorted,city)\n", " num_cities = length(C_sorted)\n", @@ -512,10 +621,33 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 9, "id": "998087f2", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "I just completed route [1, 2, 4, 3]. 
Min distance so far is 6\n", + "I am pruning at [1, 2, 3]\n", + "I am pruning at [1, 4, 2, 3]\n", + "I am pruning at [1, 4, 3, 2]\n", + "I am pruning at [1, 3, 4]\n", + "I am pruning at [1, 3, 2]\n" + ] + }, + { + "data": { + "text/plain": [ + "6" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "city = 1\n", "verbose = true\n", @@ -533,10 +665,29 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 12, "id": "e1eb74d8", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " 1.214294 seconds (1 allocation: 144 bytes)\n", + " 0.002645 seconds (1 allocation: 144 bytes)\n" + ] + }, + { + "data": { + "text/plain": [ + "\u001b[32m\u001b[1mTest Passed\u001b[22m\u001b[39m" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "n = 11 # It is safe to test up to n=11 on a laptop\n", "using Random\n", @@ -650,12 +801,12 @@ "source": [ "### Performance issues: Load balance\n", "\n", - "Pruning is essential in this algorithm but makes challenging to evenly distribute the work over available processors. Image that we assign the same number of branches per worker and that the workers use pruning locally to speed up the solution process. It is not possible to know in advance how many branches will be fully traversed by each worker since pruning depends on the actual values in the input distance matrix (runtime values). It might happen that a worker can prune many branches and finishes fast, whereas other workers are not able to prune so many branches and they need more time to finish. This is a clear example of bad load balance. We will explain later a strategy to fix it.\n", + "Pruning is essential in this algorithm, but it makes it challenging to evenly distribute the work over the available processors. 
Imagine that we assign the same number of branches per worker and that the workers use pruning locally to speed up the solution process. It is not possible to know in advance how many branches will be fully traversed by each worker since pruning depends on the values in the input distance matrix (runtime values). It might happen that one worker can prune many branches and finishes fast, whereas other workers are not able to prune as many branches and need more time to finish. This is a clear example of bad load balance. We will explain a strategy to fix it later in the notebook.\n", "\n", "\n", "### Performance issues: Search overhead \n", "\n", - "Another disadvantage of this kind of parallel search is that the pruning is now less effective. The workers each run their own version of the search algorithm and keep track of their local minimum distances. This means that less nodes will be pruned in the parallel version than in the serial version. The parallel code might search more routes than the sequential ones. This is called *search overhead*." + "Another disadvantage of this kind of parallel search is that the pruning is now less effective. The workers each run their own version of the search algorithm and keep track of their local minimum distances, which may differ from the global minimum found so far. This means that fewer nodes will be pruned in the parallel version than in the serial version. The parallel code might search more routes than the sequential one. This is called *search overhead*." ] }, { @@ -664,7 +815,7 @@ "metadata": {}, "source": [ "
\n", - "Question: How routes are fully traversed in total when we assign two branches to each worker? Look at the illustration below. Assume that each worker does pruning locally and independently of the other workers.\n", + "Question: How many routes are fully traversed in total when we assign two branches to each worker? Look at the illustration below. Assume that each worker does pruning locally and independently of the other workers.\n", "
" ] }, @@ -732,9 +883,9 @@ "source": [ "### Negative search overhead\n", "\n", - "The parallel algorithm might search more branches than the sequential one when we parallelize the pruning process. However, it is also possible that parallel algorithm searches less branches that the sequential one for particular cases. Imagine that the optimal route is on the right side of the tree (or the last route in the tree in the limit case). The parallel algorithm will need less work than the sequential one in this case. The last workers might find the optimal route very quickly and inform the other workers about the optimal minimum, which can then prune branches very effectively. Whereas the sequential algorithm will need to traverse many branches in order to reach the optimal one. If the parallel code does less searches than the sequential one, we way that the search overhead is negative. \n", + "The parallel algorithm might search more branches than the sequential one when we parallelize the pruning process. However, it is also possible that parallel algorithm searches less branches that the sequential one for particular cases. Imagine that the optimal route is on the right side of the tree (or the last route in the tree in the limit case). The parallel algorithm will need less work than the sequential one in this case. The last workers might find the optimal route very quickly and inform the other workers about the optimal minimum, which can then prune branches very effectively. Whereas the sequential algorithm will need to traverse many branches in order to reach the optimal one. If the parallel code does less searches than the sequential one, we say that the search overhead is negative. \n", "\n", - "Negative search overhead is very good for parallel speedups, but it depends on the input values. We cannot rely on it to speed up the parallel execution of the algorithm. \n" + "Negative search overhead is very good for parallel speedups, but it depends on the input values. 
We cannot rely on it to speed up the parallel execution of the algorithm." ] }, { @@ -1182,7 +1333,7 @@ "- We studied the solution of the TSP problem using a branch and bound strategy\n", "- The problem is $O(N!)$ complex in the worse case scenario, where $N$ is the number of cities.\n", "- Luckily, the compute time can be drastically reduced in practice using the nearest city first heuristic and branch pruning.\n", - "- Pruning, however, introduces load imbalance in the parallel code. To this fix this, one needs a dynamic load balancing strategy as the actual work per worker depends on the input matrix (runtime values).\n", + "- Pruning, however, introduces load imbalance in the parallel code. To fix this, one needs a dynamic load balancing strategy as the actual work per worker depends on the input matrix (runtime values).\n", "- A replicated workers model is useful to distribute work dynamically. However, it introduces a trade-off between load balance and communication depending on the value of `maxhops`.\n", "- The parallel code might suffer from positive search overhead (if the optimal route is on the left of the tree) or it can benefit from negative search overhead (if the optimal route is on the right of the tree).\n", "- In some cases, it is possible to observe super-linear speedup thanks to negative search overhead.\n"