Finding locations of words as lists of coordinates in a grid of letters
Question:
Given a grid of letters and a list of words, find the location of each word as a list of coordinates. The resulting list can be in any order, but the coordinates for each individual word must be given in order. Letters cannot be reused across words. Each given word is guaranteed to be in the grid. Consecutive letters of a word are either down or to the right (i.e., no reversed words or reversed sections of words, only down or to the right).
For example, given the following grid and set of words,
[
['d', 'r', 'd', 'o', 'r', 's'],
['o', 'b', 'i', 'g', 'n', 'c'],
['g', 'f', 'n', 'm', 't', 'a'],
['x', 's', 'i', 'a', 'n', 't']
]
words1 = [ "dog", "dogma", "cat" ]
output the list of coordinates below:
findWords(grid, words)->
[ [ (1, 5), (2, 5), (3, 5) ], # cat
[ (0, 2), (0, 3), (1, 3), (2, 3), (3, 3)], # dogma
[ (0, 0), (1, 0), (2, 0) ], # dog
]
In this example, the "dog" in "dogma" cannot be used for the word "dog" since letters cannot be reused.
Answers:
Here is my attempt at a solution. First, I find all possible paths that I can take to spell any of the words. Paths are indexed by the word that they spell. Then I iterate through all possible combinations of paths by adding one possible path per word at a time while maintaining a seen set. Once I run out of feasible paths for a word before I find them all, then I backtrack.
def findWords(grid, words):
    # Regular old dfs through the grid, we only go right or down
    def dfs(row, col, path, idx):
        if idx == len(word):
            if word in all_paths:
                all_paths[word].append(list(path))
            else:
                all_paths[word] = [list(path)]
        else:
            if row + 1 < len(grid):
                if grid[row+1][col] == word[idx]:
                    path.append((row+1, col))
                    dfs(row+1, col, path, idx+1)
                    path.pop()
            if col + 1 < len(grid[0]):
                if grid[row][col+1] == word[idx]:
                    path.append((row, col+1))
                    dfs(row, col+1, path, idx+1)
                    path.pop()

    # For each word, find all possible paths through the grid to spell the word
    # Each path is a collection of coordinates as is desired from the function
    # Paths are indexed by word and stored in a list in a dictionary
    all_paths = {}
    for row in range(len(grid)):
        for col in range(len(grid[0])):
            for word in words:
                if grid[row][col] == word[0]:
                    dfs(row, col, [(row, col)], 1)

    # Try all possible combinations of paths, one path per word
    def dfs2(idx):
        if idx == len(words):
            return True
        word = words[idx]
        for path in all_paths[word]:
            # A path that overlaps an already used cell is skipped;
            # returning False here would give up on the remaining paths too
            if any(loc in seen for loc in path):
                continue
            for loc in path:
                seen.add(loc)
            if dfs2(idx+1):
                retlst.append(path)
                return True
            else:
                for loc in path:
                    seen.remove(loc)
        return False

    # Backtrack through possible combinations
    seen = set()
    retlst = []
    dfs2(0)
    return retlst
There’s probably a way to DFS through possible combinations of paths WHILE you’re DFSing through the words you need to spell to avoid pre-computing all paths, but it was way too complicated for me to figure out.
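One way the interleaving mentioned above might look is sketched below. It is only a sketch under the question's stated rules (right/down moves, no cell shared between words), and the names find_words_interleaved, place and next_word are made up for illustration. Instead of precomputing paths, each word is placed letter by letter, and as soon as a word is complete the search moves on to the next word, undoing cells on failure.

```python
def find_words_interleaved(grid, words):
    """Backtrack over words and cells together, without precomputing paths."""
    rows, cols = len(grid), len(grid[0])
    used = set()    # cells claimed by the words placed so far
    result = []     # one path per word, in the order of `words`

    def place(w, i, row, col, path):
        # Try to put letter i of words[w] on cell (row, col).
        if row >= rows or col >= cols:
            return False
        if (row, col) in used or grid[row][col] != words[w][i]:
            return False
        used.add((row, col))
        path.append((row, col))
        if i + 1 == len(words[w]):
            # Word finished: keep a copy of its path and try the next word.
            result.append(list(path))
            if next_word(w + 1):
                return True
            result.pop()
        elif (place(w, i + 1, row + 1, col, path)
              or place(w, i + 1, row, col + 1, path)):
            return True
        # Dead end: release the cell so the caller can try something else.
        path.pop()
        used.discard((row, col))
        return False

    def next_word(w):
        if w == len(words):
            return True
        return any(place(w, 0, r, c, [])
                   for r in range(rows) for c in range(cols))

    return result if next_word(0) else None


grid = [
    ['d', 'r', 'd', 'o', 'r', 's'],
    ['o', 'b', 'i', 'g', 'n', 'c'],
    ['g', 'f', 'n', 'm', 't', 'a'],
    ['x', 's', 'i', 'a', 'n', 't'],
]
print(find_words_interleaved(grid, ["dog", "dogma", "cat"]))
```

The paths come back in the same order as the input words, and the function returns None when no non-overlapping placement exists.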
Based on this answer, first you want to make a dictionary which maps letter to positions:
board = [
['d', 'r', 'd', 'o', 'r', 's'],
['o', 'b', 'i', 'g', 'n', 'c'],
['g', 'f', 'n', 'm', 't', 'a'],
['x', 's', 'i', 'a', 'n', 't']
]
words = [ "dog", "dogma", "cat" ]
letter_positions = {}
for y, row in enumerate(board):
    for x, letter in enumerate(row):
        letter_positions.setdefault(letter, []).append((x, y))
>>> letter_positions
{'d': [(0, 0), (2, 0)],
'r': [(1, 0), (4, 0)],
'o': [(3, 0), (0, 1)],
's': [(5, 0), (1, 3)],
'b': [(1, 1)],
'i': [(2, 1), (2, 3)],
'g': [(3, 1), (0, 2)],
'n': [(4, 1), (2, 2), (4, 3)],
'c': [(5, 1)],
'f': [(1, 2)],
'm': [(3, 2)],
't': [(4, 2), (5, 3)],
'a': [(5, 2), (3, 3)],
'x': [(0, 3)]}
As in the linked answer, you should keep track of valid moves. Since you can only move down or right, I added one extra condition compared to the original answer. I left the find_word function unchanged.
def is_valid_move(position, last):
    if last == []:
        return True
    if position[0] < last[0] or position[1] < last[1]:
        return False  # only allow down and right
    # exactly one step in one direction, so diagonal moves are rejected too
    return (
        abs(position[0] - last[0]) +
        abs(position[1] - last[1]) == 1
    )
def find_word(word, used=None):
    if word == "":
        return []
    if used is None:
        used = []
    letter, rest = word[:1], word[1:]
    for position in letter_positions.get(letter) or []:
        if position in used:
            continue
        if not is_valid_move(position, used and used[-1]):
            continue
        path = find_word(rest, used + [position])
        if path is not None:
            return [position] + path
    return None
A little bit of explanation of the logic of find_word. The idea is to take the first letter of the word in letter and store the remaining letters in rest, then iterate over the possible positions of that letter. Positions are skipped if they are already used or if they are not a valid move from the previous position. After that, find_word is called recursively on the rest of the letters.
for word in words:
    print(find_word(word))
[(0, 0), (0, 1), (0, 2)] # dog
[(2, 0), (3, 0), (3, 1), (3, 2), (3, 3)] # dogma
[(5, 1), (5, 2), (5, 3)] # cat
Well, the indexing is flipped compared to the question, but that shouldn’t be a big problem.
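If the (row, column) order from the question is needed, swapping each pair is enough. A tiny sketch, where to_row_col is a hypothetical helper name that is not part of the answer above:

```python
def to_row_col(path):
    """Convert a path of (x, y) pairs into (row, col) pairs."""
    return [(y, x) for x, y in path]


print(to_row_col([(5, 1), (5, 2), (5, 3)]))  # cat -> [(1, 5), (2, 5), (3, 5)]
```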
The task of finding the words in the grid can be done through the solutions provided in the other answers, or through tries, suffix trees or arrays.
As an example, based on the answer given by @Péter Leéh, this would be a modified version for finding all paths using python3:
grid = [
['d', 'r', 'd', 'o', 'r', 's'],
['o', 'b', 'i', 'g', 'n', 'c'],
['g', 'f', 'n', 'm', 't', 'a'],
['x', 's', 'i', 'a', 'n', 't']
]
words1 = [ "dog", "dogma", "cat" ]
# Building the dense grid
dense_grid = {}
for row, line in enumerate(grid):
    for col, letter in enumerate(line):
        dense_grid.setdefault(letter, []).append((row, col))

# Finding all paths for all words
def is_valid_move(p, q):
    return ( p[0] == q[0] and p[1]+1 == q[1] ) or ( p[0]+1 == q[0] and p[1] == q[1] )

def find_all_paths(curr_pos, suffix, dense_grid=dense_grid):
    if len(suffix) == 0:
        return [[curr_pos]]
    possible_suffix_paths = []
    for pos in dense_grid[suffix[0]]:
        if is_valid_move(curr_pos, pos):
            possible_suffix_paths += find_all_paths(pos, suffix[1:])
        # Since the list of positions is ordered, I can skip the rest
        elif pos[0] - curr_pos[0] >= 2:
            break
    return [ [curr_pos] + p for p in possible_suffix_paths ]

words_paths = [
    [ path for pos in dense_grid[word[0]] for path in find_all_paths(pos, word[1:]) ]
    for word in words1
]
The final dense_grid is a dictionary from character to a list of positions in the grid, with each position represented as (row, column):
{
'd': [(0, 0), (0, 2)],
'r': [(0, 1), (0, 4)],
'o': [(0, 3), (1, 0)],
's': [(0, 5), (3, 1)],
'b': [(1, 1)],
'i': [(1, 2), (3, 2)],
'g': [(1, 3), (2, 0)],
'n': [(1, 4), (2, 2), (3, 4)],
'c': [(1, 5)],
'f': [(2, 1)],
'm': [(2, 3)],
't': [(2, 4), (3, 5)],
'a': [(2, 5), (3, 3)],
'x': [(3, 0)]
}
The final words_paths is a list containing, for each word, a list of all possible paths, where each path is a sequence (list) of positions in the grid:
[
[
[(0, 0), (1, 0), (2, 0)], # dog
[(0, 2), (0, 3), (1, 3)]
],
[
[(0, 2), (0, 3), (1, 3), (2, 3), (3, 3)] # dogma
],
[
[(1, 5), (2, 5), (3, 5)] # cat
]
]
After you have all the possible paths for all the words, you can find the words with unique characters by transforming the problem into a digraph maximum flow problem.
To do the transformation of this problem, for every word, you have to create a starting and an ending node, henceforth called START_word and END_word. The START_word nodes are connected to all the first positions of the paths of the word, which are then connected to the second positions, and so on. The last positions of all the paths of the word are then connected to the END_word node.
The nodes of the positions are unique across the graph, meaning that words sharing the same positions in the grid will also share the same nodes.
Now that we have the graph representing all the possible paths for all the words, we just need to connect a SOURCE node to all the starting nodes, and connect all the ending nodes to a TARGET node. With the resulting graph, you can then solve the maximum flow problem, where every edge in the graph has a capacity of 1.
This would be the resulting graph that you get from the problem you defined in the question:
However, to make sure that there are no nodes where the minimum of the in-degree and out-degree is greater than 1, we also need to add choking nodes. If a node has this characteristic, we remove all of its out-edges and connect the original node to a single choking node. The original node’s out-edges are then added to the choking node.
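The choking-node rewrite can be illustrated on a plain edge list, without any graph library. This is a minimal sketch and not the code used for the test below: add_choke_nodes and the ("choke", node) label are hypothetical, and the sketch treats every node alike, while the networkx version only considers grid positions.

```python
def add_choke_nodes(edges):
    """Route the out-edges of every multi-in, multi-out node through one extra node."""
    ins, outs = {}, {}
    for s, t in edges:
        outs.setdefault(s, []).append(t)
        ins.setdefault(t, []).append(s)

    def needs_choke(node):
        return len(ins.get(node, [])) > 1 and len(outs.get(node, [])) > 1

    result = []
    for s, t in edges:
        # Out-edges of a crowded node now leave from its choke node instead.
        result.append(((("choke", s), t)) if needs_choke(s) else (s, t))
    for node in outs:
        if needs_choke(node):
            # The single edge node -> choke limits the node to one unit of flow
            # when every edge has capacity 1.
            result.append((node, ("choke", node)))
    return result


print(add_choke_nodes([("a", "x"), ("b", "x"), ("x", "c"), ("x", "d")]))
```

With unit capacities, the single edge into the choke node is what prevents two words from passing through the same position.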
I tested this idea using the library networkx, and here is the code I used to test it:
import networkx as nx

# Connecting source node with starting nodes
edges = [ ("SOURCE", "START_"+word) for word in words1 ]
# Connecting ending nodes with target nodes
edges += [ ("END_"+word, "TARGET") for word in words1 ]
# Connecting characters between them and to the starting and ending nodes too
edges += list(set(
    ( s_node if isinstance(s_node, tuple) else s_node,
      t_node if isinstance(t_node, tuple) else t_node )
    for word, paths in zip(words1, words_paths)
    for path in paths
    for s_node, t_node in zip(["START_"+word] + path, path + ["END_"+word])
))

# Generating graph from the nodes and edges created
g = nx.DiGraph()
g.add_edges_from(edges, capacity=1)

# Adding choke nodes if required
node_edge_dict = {}
nodes_indeg_gt1 = [ node for node, in_deg in g.in_degree() if not isinstance(node, str) and in_deg > 1 ]
for s_node, t_node in g.out_edges(nodes_indeg_gt1):
    node_edge_dict.setdefault(s_node, []).append(t_node)
for node, next_nodes in node_edge_dict.items():
    if len(next_nodes) <= 1: continue
    choke_node = node + (-1,)
    g.add_edge(node, choke_node, capacity=1)
    g.add_edges_from([ (choke_node, p) for p in next_nodes ], capacity=1)
    g.remove_edges_from([ (node, p) for p in next_nodes ])

# Solving the maximum flow problem
num_words, max_flow_dict = nx.algorithms.flow.maximum_flow(g, "SOURCE", "TARGET")

# Extracting final paths for all the words
final_words_path = []
for word in words1:
    word_path = []
    start = "START_"+word
    end = "END_"+word
    node = start
    while node != end:
        node = next( n for n,f in max_flow_dict[node].items() if f == 1 )
        # Skip START_/END_ labels and choke nodes (3-tuples)
        if isinstance(node, str) or len(node) == 3: continue
        word_path.append(node)
    final_words_path.append(word_path)
print(final_words_path)
The output for the problem stated in the question is this:
[
[(0, 0), (1, 0), (2, 0)], # dog
[(0, 2), (0, 3), (1, 3), (2, 3), (3, 3)], # dogma
[(1, 5), (2, 5), (3, 5)] # cat
]
Approach
- Find paths that spell words. We only continue a path as long as it’s a prefix of a word.
- We quickly check if the letters of a path are a prefix of some word by using bisect_left on the sorted list of words (a fast alternative to a trie).
- We gather the list of paths for each word.
- We reduce the paths to the non-overlapping ones to satisfy the requirement that no two words share a cell.
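The prefix check in the second bullet can be seen in isolation. A small sketch, assuming the word list is already sorted; is_prefix_of_some_word is a hypothetical name, and the code further down inlines the same idea rather than calling a helper:

```python
from bisect import bisect_left


def is_prefix_of_some_word(sorted_words, prefix):
    """True if at least one word in the sorted list starts with `prefix`."""
    # bisect_left returns the index of the first word >= prefix; if any word
    # starts with prefix, the first such word sits exactly at that index.
    i = bisect_left(sorted_words, prefix)
    return i < len(sorted_words) and sorted_words[i].startswith(prefix)


words = sorted(["dog", "dogma", "cat"])
print(is_prefix_of_some_word(words, "do"))   # True
print(is_prefix_of_some_word(words, "dot"))  # False
```

Each check costs one binary search over the word list, which is why the list has to be sorted before the grid walk starts.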
Code
from bisect import bisect_left

def find_words(board, words, x, y, prefix, path):
    ' Find words that can be generated starting at position x, y '
    # Base case
    # find if current word prefix is in list of words
    found = bisect_left(words, prefix)  # can use binary search since words are sorted
    if found >= len(words):
        return
    if words[found] == prefix:
        yield prefix, path  # Prefix in list of words
    # Give up on path if what we found is not even a prefix
    # (there is no point in going further)
    if len(words[found]) < len(prefix) or words[found][:len(prefix)] != prefix:
        return
    # Extend path by one letter in board
    # Since can only go right and down
    # No need to worry about same cell occurring multiple times in a given path
    for adj_x, adj_y in [(0, 1), (1, 0)]:
        x_new, y_new = x + adj_x, y + adj_y
        if x_new < len(board) and y_new < len(board[0]):
            yield from find_words(board, words, x_new, y_new,
                                  prefix + board[x_new][y_new],
                                  path + [(x_new, y_new)])

def check_all_starts(board, words):
    ' find all possible paths through board for generating words '
    # check each starting point in board
    for x in range(len(board)):
        for y in range(len(board[0])):
            yield from find_words(board, words, x, y, board[x][y], [(x, y)])

def find_non_overlapping(choices, path):
    ' Find set of choices with non-overlapping paths '
    if not choices:
        # Base case
        yield path
    else:
        word, options = choices[0]
        for option in options:
            set_option = set(option)
            if any(set_option.intersection(p) for w, p in path):
                # overlaps with path
                continue
            else:
                yield from find_non_overlapping(choices[1:], path + [(word, option)])

def solve(board, words):
    ' Solve for path through board to create words '
    words.sort()
    # Get choice of paths for each word
    choices = {}
    for word, path in check_all_starts(board, words):
        choices.setdefault(word, []).append(path)
    # Find non-intersecting paths (i.e. no two words should have a x, y in common)
    if len(choices) == len(words):
        return next(find_non_overlapping(list(choices.items()), []), None)
Tests
Test 1
from pprint import pprint as pp
words = [ "dog", "dogma", "cat" ]
board = [
['d', 'r', 'd', 'o', 'r', 's'],
['o', 'b', 'i', 'g', 'n', 'c'],
['g', 'f', 'n', 'm', 't', 'a'],
['x', 's', 'i', 'a', 'n', 't']]
pp(solve(board, words))
Output
Test 1
[('dog', [(0, 0), (1, 0), (2, 0)]),
('dogma', [(0, 2), (0, 3), (1, 3), (2, 3), (3, 3)]),
('cat', [(1, 5), (2, 5), (3, 5)])]
Test 2
words = ["by","bat"]
board = [ ['b', 'a', 't'],
['y', 'x', 'b'],
['x', 'x', 'y'], ]
pp(solve(board, words))
Output
Test 2
[('bat', [(0, 0), (0, 1), (0, 2)]),
('by', [(1, 2), (2, 2)])]
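The outputs above can also be checked mechanically against the rules from the question. A minimal sketch; is_valid_solution is a hypothetical helper that expects the same list of (word, path) pairs that solve returns:

```python
def is_valid_solution(board, solution):
    """Check a list of (word, path) pairs against the rules of the puzzle."""
    seen = set()
    for word, path in solution:
        if len(path) != len(word):
            return False
        for i, (row, col) in enumerate(path):
            # Each cell must hold the right letter and be unused so far.
            if board[row][col] != word[i] or (row, col) in seen:
                return False
            seen.add((row, col))
            if i > 0:
                prev_row, prev_col = path[i - 1]
                # Only one step right or one step down is allowed.
                if (row - prev_row, col - prev_col) not in ((0, 1), (1, 0)):
                    return False
    return True


board = [['b', 'a', 't'],
         ['y', 'x', 'b'],
         ['x', 'x', 'y']]
print(is_valid_solution(board, [('bat', [(0, 0), (0, 1), (0, 2)]),
                                ('by', [(1, 2), (2, 2)])]))
```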
Here is another way to do it:
def sol(word, board):
    rows = len(board)
    cols = len(board[0])
    coordinates = []
    wordCnt = 0
    co = []
    result = []

    def getWord(row, col, word, wordCnt, board):
        if row < 0 or col < 0 or row > len(board)-1 or col > len(board[0])-1 or wordCnt > len(word) -1 or board[row][col] != word[wordCnt]:
            return
        result.append(word[wordCnt])
        co.append((row, col))
        getWord(row+1, col, word, wordCnt+1, board)
        getWord(row, col+1, word, wordCnt+1, board)
        return co, result

    for row in range(rows):
        for col in range(cols):
            if board[row][col] == word[wordCnt]:
                co, result = getWord(row, col, word, wordCnt, board)
                if ''.join(result[-len(word):]) == word:
                    print(co[-len(word):])

sol('cat', board)