In this section we show a simple example of how to use PyGLPK to solve the Hamiltonian path problem. This particular example is intended to be much more high level for those frustrated by lengthly explanations with excessive hand holding.
Suppose we have an undirected graph. We want to find a path from any node to any other node such that we visit each node exactly one. This is the Hamiltonian path problem!
This is our strategy of how to solve this with a mixed integer program:
For the purpose of our function, we encode graphs as a list of two element tuples. Each tuple represents an edge between two vertices. Vertices can be anything Python can hash. We assume we can infer the vertex set from this edge set, i.e., there are no isolated vertices. (If there are, then the problem is trivial anyway.) For example, [(1,2), (2,3), (3,4), (4,2), (3,5)]
would represent a graph over five nodes, containing a K3 subgraph (among vertices 2,3,4) with additional vertices 1 attached to 2 and 5 attached to 3.
The function accepts such a list of edges, and return the sublist comprising the path, or None
if it could not find a path.
Here is the implementation of that function:
def hamiltonian(edges): node2colnums = {} # Maps node to col indices of incident edges. for colnum, edge in enumerate(edges): n1, n2 = edge node2colnums.setdefault(n1, []).append(colnum) node2colnums.setdefault(n2, []).append(colnum) lp = glpk.LPX() # A new empty linear program glpk.env.term_on = False # Stop annoying messages. lp.cols.add(len(edges)) # A struct var for each edge lp.rows.add(len(node2colnums)+1) # Constraint for each node for col in lp.cols: # Go over all struct variables col.kind = bool # Make binary, not continuous # For each node, select at least 1 and at most 2 incident edges. for row, edge_column_nums in zip(lp.rows, node2colnums.values()): row.matrix = [(cn, 1.0) for cn in edge_column_nums] row.bounds = 1, 2 # We should select exactly (number of nodes - 1) edges total lp.rows[-1].matrix = [1.0]*len(lp.cols) lp.rows[-1].bounds = len(node2colnums)-1 assert lp.simplex() == None # Should not fail this way. if lp.status != 'opt': return None # If no relaxed sol., no exact sol. assert lp.integer() == None # Should not fail this way. if lp.status != 'opt': return None # Count not find integer solution! # Return the edges whose associated struct var has value 1. return [edge for edge, col in zip(edges, lp.cols) if col.value > 0.99]
We shall now go over this function section by section, but not quite in such exhaustive detail as before.
node2colnums = {} # Maps node to col indices of incident edges. for colnum, edge in enumerate(edges): n1, n2 = edge node2colnums.setdefault(n1, []).append(colnum) node2colnums.setdefault(n2, []).append(colnum)
This is pure Python non-PyGLPK code. It is simply computing a mapping of nodes to a list of column numbers corresponding to each node's incident edges. Note that each edge will have the same index in both the column collection, as well as the input edges
list.
lp = glpk.LPX() # A new empty linear program glpk.env.term_on = False # Stop annoying messages. lp.cols.add(len(edges)) # A struct var for each edge lp.rows.add(len(node2colnums)+1) # Constraint for each node
In this section, we create a new empty linear program, make it quiet, and add as many columns as there are edges, and as many rows as there are vertices, plus one additional row to encode the constraint that exactly as many edges should be selected as there are vertices minus one.
for col in lp.cols: # Go over all struct variables col.kind = bool # Make binary, not continuous
Here, we set our LP as a MIP, and going over the columns set the associated structural variable to be binary (i.e., have 0 to 1 bounds and be an integer variables).
# For each node, select at least 1 and at most 2 incident edges. for row, edge_column_nums in zip(lp.rows, node2colnums.values()): row.matrix = [(cn, 1.0) for cn in edge_column_nums] row.bounds = 1, 2
These are the constraints that say for each node, either one or two edges should be selected. For each node, we set a row to have a constraint which sums over all of the structural variables representing edges incident to that node, and forces this sum to be between 1 and 2.
# We should select exactly (number of nodes - 1) edges total lp.rows[-1].matrix = [1.0]*len(lp.cols) lp.rows[-1].bounds = len(node2colnums)-1
Similarly, have the last row in the constraint matrix encode the constraint that the sum of all the structural variables (i.e., the number of edges selected) should be the number of vertices minux one.
Note how in this case we do not specify the column indices in our matrix assignment: we just assign a long list of 1.0 values, and use how matrix assignments will implicitly assue that each single value will be placed in the entry directly after the last assigned value.
assert lp.simplex() == None # Should not fail this way. if lp.status != 'opt': return None # If no relaxed sol., no exact sol. assert lp.integer() == None # Should not fail this way. if lp.status != 'opt': return None # Count not find integer solution!
As in the SAT example, we run the simplex solver to come up with an initial continuous relaxed basic solution to this problem. We fail, we miss the assertion, and if the optimal solution was not found, we return that there is no solution. We then run the integer optimizer.
# Return the edges whose associated struct var has value 1. return [edge for edge, col in zip(edges, lp.cols) if col.value > 0.99]
Finally, in this case, we select out those columns which have a value close to 1 (indicating this edge was selected) and return the associated edge using our colnum2edge
map we constructed at the start of the function.
Suppose we have, after this function definition, these calls.
# 1----2----3----5 # \ / # \/ Has one H path! # 4 g1 = [(1,2), (2,3), (3,4), (4,2), (3,5)] print hamiltonian(g1) # 4 5 6 # | | | # | | | Has no H path! # 1----2----3 g2 = [(1,2), (2,3), (1,4), (2,5), (3,6)] print hamiltonian(g2) # 4 5----6 # | | | # | | | Has two H paths! # 1----2----3 g3 = g2 + [(5,6)] print hamiltonian(g3)This will produce this output.
[(1, 2), (3, 4), (4, 2), (3, 5)] None [(1, 2), (1, 4), (2, 5), (3, 6), (5, 6)]
Note that we did not define an objective function value. If we wanted, we could solve the traveling salesman problem (with symmetric weights) by making the following modifications:
Here is the resulting function. Sections with important changes have been bolded.
def tsp(edges): node2colnums = {} # Maps node to col indices of incident edges. for colnum, edge in enumerate(edges): n1, n2, cost = edge node2colnums.setdefault(n1, []).append(colnum) node2colnums.setdefault(n2, []).append(colnum) lp = glpk.LPX() # A new empty linear program glpk.env.term_on = False # Stop annoying messages. lp.cols.add(len(edges)) # A struct var for each edge lp.rows.add(len(node2colnums)+1) # Constraint for each node lp.obj[:] = [e[-1] for e in edges] # Try to minimize the total costs. lp.obj.maximize = False for col in lp.cols: # Go over all struct variables col.kind = bool # Make binary, not continuous # For each node, select two edges, i.e.., an arrival and a departure. for row, edge_column_nums in zip(lp.rows, node2colnums.values()): row.matrix = [(cn, 1.0) for cn in edge_column_nums] row.bounds = 2 # We should select exactly (number of nodes) edges total lp.rows[-1].matrix = [1.0]*len(lp.cols) lp.rows[-1].bounds = len(node2colnums) assert lp.simplex() == None # Should not fail this way. if lp.status != 'opt': return None # If no relaxed sol., no exact sol. assert lp.integer() == None # Should not fail this way. if lp.status != 'opt': return None # Count not find integer solution! # Return the edges whose associated struct var has value 1. return [edge for edge, col in zip(edges, lp.cols) if col.value > 0.99]