# What is the minimum cost to connect all the islands?

There is a grid of size N x M. Some cells are islands denoted by ‘0’ and the others are water. Each water cell has a number on it denoting the cost of a bridge made on that cell. You have to find the minimum cost for which all the islands can be connected. A cell is connected to another cell if it shares an edge or a vertex.

What algorithm can be used to solve this problem? What can be used as a brute force approach if the values of N, M are very small, say NxM <= 100?

Example: In the given image, green cells indicate islands, blue cells indicate water and light blue cells indicate the cells on which a bridge should be made. Thus for the following image, the answer will be 17.

Initially I thought of marking all the islands as nodes and connecting every pair of islands by a shortest bridge. Then the problem could be reduced to Minimum spanning tree, but in this approach I missed the case where edges are overlapping. For example, in the following image, the shortest distance between any two islands is 7 (marked in yellow), so by using Minimum Spanning Trees the answer would be 14, but the answer should be 11 (marked in light blue).

To approach this problem, I would use an integer programming framework and define three sets of decision variables:

• x_ij: A binary indicator variable for whether we build a bridge at water location (i, j).
• y_ijbcn: A binary indicator for whether water location (i, j) is the n^th location linking island b to island c.
• l_bc: A binary indicator variable for whether islands b and c are directly linked (aka you can walk only on bridge squares from b to c).

For bridge building costs c_ij, the objective value to minimize is `sum_ij c_ij * x_ij`. We need to add the following constraints to the model:

• We need to ensure the y_ijbcn variables are valid. We can always only reach a water square if we build a bridge there, so `y_ijbcn <= x_ij` for every water location (i, j). Further, `y_ijbc1` must equal 0 if (i, j) does not border island b. Finally, for n > 1, `y_ijbcn` can only be used if a neighboring water location was used in step n-1. Defining `N(i, j)` to be the water squares neighboring (i, j), this is equivalent to `y_ijbcn <= sum_{(l, m) in N(i, j)} y_lmbc(n-1)`.
• We need to ensure the l_bc variables are only set if b and c are linked. If we define `I(c)` to be the locations bordering island c, this can be accomplished with `l_bc <= sum_{(i, j) in I(c), n} y_ijbcn`.
• We need to ensure that all islands are linked, either directly or indirectly. This can be accomplished in the following way: for every non-empty proper subset S of islands, require that at least one island in S is linked to at least one island in the complement of S, which we’ll call S’. In constraints, we can implement this by adding a constraint for every non-empty set S of size <= K/2 (where K is the number of islands), `sum_{b in S} sum_{c in S'} l_bc >= 1`.

For a problem instance with K islands, W water squares, and specified maximum path length N, this is a mixed integer programming model with `O(K^2WN)` variables and `O(K^2WN + 2^K)` constraints. Obviously this will become intractable as the problem size becomes large, but it may be solvable for the sizes you care about. To get a sense of the scalability, I’ll implement it in python using the pulp package. Let’s first start with the smaller 7 x 9 map with 3 islands at the bottom of the question:

``````import itertools
import pulp
water = {(0, 2): 2.0, (0, 3): 1.0, (0, 4): 1.0, (0, 5): 1.0, (0, 6): 2.0,
(1, 0): 2.0, (1, 1): 9.0, (1, 2): 1.0, (1, 3): 9.0, (1, 4): 9.0,
(1, 5): 9.0, (1, 6): 1.0, (1, 7): 9.0, (1, 8): 2.0,
(2, 0): 1.0, (2, 1): 9.0, (2, 2): 9.0, (2, 3): 1.0, (2, 4): 9.0,
(2, 5): 1.0, (2, 6): 9.0, (2, 7): 9.0, (2, 8): 1.0,
(3, 0): 9.0, (3, 1): 1.0, (3, 2): 9.0, (3, 3): 9.0, (3, 4): 5.0,
(3, 5): 9.0, (3, 6): 9.0, (3, 7): 1.0, (3, 8): 9.0,
(4, 0): 9.0, (4, 1): 9.0, (4, 2): 1.0, (4, 3): 9.0, (4, 4): 1.0,
(4, 5): 9.0, (4, 6): 1.0, (4, 7): 9.0, (4, 8): 9.0,
(5, 0): 9.0, (5, 1): 9.0, (5, 2): 9.0, (5, 3): 2.0, (5, 4): 1.0,
(5, 5): 2.0, (5, 6): 9.0, (5, 7): 9.0, (5, 8): 9.0,
(6, 0): 9.0, (6, 1): 9.0, (6, 2): 9.0, (6, 6): 9.0, (6, 7): 9.0,
(6, 8): 9.0}
islands = {0: [(0, 0), (0, 1)], 1: [(0, 7), (0, 8)], 2: [(6, 3), (6, 4), (6, 5)]}
N = 6

# Island borders
iborders = {}
for k in islands:
iborders[k] = {}
for i, j in islands[k]:
for dx in [-1, 0, 1]:
for dy in [-1, 0, 1]:
if (i+dx, j+dy) in water:
iborders[k][(i+dx, j+dy)] = True

# Create models with specified variables
x = pulp.LpVariable.dicts("x", water.keys(), lowBound=0, upBound=1, cat=pulp.LpInteger)
pairs = [(b, c) for b in islands for c in islands if b < c]
yvals = []
for i, j in water:
for b, c in pairs:
for n in range(N):
yvals.append((i, j, b, c, n))

y = pulp.LpVariable.dicts("y", yvals, lowBound=0, upBound=1)
l = pulp.LpVariable.dicts("l", pairs, lowBound=0, upBound=1)
mod = pulp.LpProblem("Islands", pulp.LpMinimize)

# Objective
mod += sum([water[k] * x[k] for k in water])

# Valid y
for k in yvals:
i, j, b, c, n = k
mod += y[k] <= x[(i, j)]
if n == 0 and not (i, j) in iborders[b]:
mod += y[k] == 0
elif n > 0:
mod += y[k] <= sum([y[(i+dx, j+dy, b, c, n-1)] for dx in [-1, 0, 1] for dy in [-1, 0, 1] if (i+dx, j+dy) in water])

# Valid l
for b, c in pairs:
mod += l[(b, c)] <= sum([y[(i, j, B, C, n)] for i, j, B, C, n in yvals if (i, j) in iborders[c] and B==b and C==c])

# All islands connected (directly or indirectly)
ikeys = islands.keys()
for size in range(1, len(ikeys)/2+1):
for S in itertools.combinations(ikeys, size):
thisSubset = {m: True for m in S}
Sprime = [m for m in ikeys if not m in thisSubset]
mod += sum([l[(min(b, c), max(b, c))] for b in S for c in Sprime]) >= 1

# Solve and output
mod.solve()
for row in range(min([m[0] for m in water]), max([m[0] for m in water])+1):
for col in range(min([m[1] for m in water]), max([m[1] for m in water])+1):
if (row, col) in water:
if x[(row, col)].value() > 0.999:
print "B",
else:
print "-",
else:
print "I",
print ""
``````

This takes 1.4 seconds to run using the default solver from the pulp package (the CBC solver) and outputs the correct solution:

``````I I - - - - - I I
- - B - - - B - -
- - - B - B - - -
- - - - B - - - -
- - - - B - - - -
- - - - B - - - -
- - - I I I - - -
``````

Next, consider the full problem at the top of the question, which is a 13 x 14 grid with 7 islands:

``````water = {(i, j): 1.0 for i in range(13) for j in range(14)}
islands = {0: [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)],
1: [(9, 0), (9, 1), (10, 0), (10, 1), (10, 2), (11, 0), (11, 1),
(11, 2), (12, 0)],
2: [(0, 7), (0, 8), (1, 7), (1, 8), (2, 7)],
3: [(7, 7), (8, 6), (8, 7), (8, 8), (9, 7)],
4: [(0, 11), (0, 12), (0, 13), (1, 12)],
5: [(4, 10), (4, 11), (5, 10), (5, 11)],
6: [(11, 8), (11, 9), (11, 13), (12, 8), (12, 9), (12, 10), (12, 11),
(12, 12), (12, 13)]}
for k in islands:
for i, j in islands[k]:
del water[(i, j)]

for i, j in [(10, 7), (10, 8), (10, 9), (10, 10), (10, 11), (10, 12),
(11, 7), (12, 7)]:
water[(i, j)] = 20.0

N = 7
``````

MIP solvers often obtain good solutions relatively quickly and then spend a huge about of time trying to prove the optimality of the solution. Using the same solver code as above, the program does not complete within 30 minutes. However, you can provide a timeout to the solver to get an approximate solution:

``````mod.solve(pulp.solvers.PULP_CBC_CMD(maxSeconds=120))
``````

This yields a solution with objective value 17:

``````I I - - - - - I I - - I I I
I I - - - - - I I - - - I -
I I - - - - - I - B - B - -
- - B - - - B - - - B - - -
- - - B - B - - - - I I - -
- - - - B - - - - - I I - -
- - - - - B - - - - - B - -
- - - - - B - I - - - - B -
- - - - B - I I I - - B - -
I I - B - - - I - - - - B -
I I I - - - - - - - - - - B
I I I - - - - - I I - - - I
I - - - - - - - I I I I I I
``````

To improve the quality of the solutions you obtain, you could use a commercial MIP solver (this is free if you are at an academic institution and likely not free otherwise). For instance, here’s the performance of Gurobi 6.0.4, again with a 2-minute time limit (though from the solution log we read that the solver found the current best solution within 7 seconds):

``````mod.solve(pulp.solvers.GUROBI(timeLimit=120))
``````

This actually finds a solution of objective value 16, one better than the OP was able to find by hand!

``````I I - - - - - I I - - I I I
I I - - - - - I I - - - I -
I I - - - - - I - B - B - -
- - B - - - - - - - B - - -
- - - B - - - - - - I I - -
- - - - B - - - - - I I - -
- - - - - B - - B B - - - -
- - - - - B - I - - B - - -
- - - - B - I I I - - B - -
I I - B - - - I - - - - B -
I I I - - - - - - - - - - B
I I I - - - - - I I - - - I
I - - - - - - - I I I I I I
``````