# Longest Repeating Subsequence: Edge Cases

## Problem

While solving the Longest Repeating Subsequence problem using bottom-up dynamic programming, I started running into an edge case whenever a letter was repeated an odd number of times.

The goal is to find the longest subsequence that occurs twice in the string using elements at different indices. The ranges can overlap, but the indices should be disjoint (i.e., `str[1]`, `str[4]` and `str[2]`, `str[6]` can be a solution, but not `str[1]`, `str[2]` and `str[2]`, `str[3]`.

## Minimum Reproducible Example

``````s = 'AXXXA'

n = len(s)

dp = [['' for i in range(n + 1)] for j in range(n + 1)]

for i in range(1, n + 1):
for j in range(1, n + 1):
if (i != j and s[i - 1] == s[j - 1]):
dp[i][j] = dp[i - 1][j - 1] + s[i - 1]
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

print(dp[n][n])
``````

## Question

Any pointers on how to avoid this?

With input s = ‘AXXXA’, the answer should be either A or X, but the final result returns XX, apparently pairing up the third X with both the first X and the second X.

## False Start

I don’t want to add a check on a match (`s[i - 1] == s[j - 1]`) to see if `s[i - 1] in dp[i - 1][j - 1]` because another input might be something like `AAJDDAJJTATA`, which must add the `A` twice.

• For retrieving the longest, it’s best to implement a new function, with the result of `dp` grid.

• Your algorithm is fine, you only need to increment your new `dp` by 1, when `s[i - 1] == s[j - 1]`:

``````    n = len(s)
dp = [[0 for _ in range(n + 1)] for _ in range(n + 1)]
for i in range(1, n + 1):
for j in range(1, n + 1):
if s[i - 1] == s[j - 1] and i != j:
dp[i][j] = 1 + dp[i - 1][j - 1]
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

return dp[-1][-1]

``````

If you want to build the longest, you can use the dp grid and whenever the `if s[i - 1] == s[j - 1] and i != j` satisfies, append the char to the longest:

``````    def get_longest():
i, j = n, n
lrs = []
while i > 0 and j > 0:
if s[i - 1] == s[j - 1] and i != j:
lrs.append(s[i - 1])
i -= 1
j -= 1
elif dp[i - 1][j] > dp[i][j - 1]:
i -= 1
else:
j -= 1

return ''.join(lrs[::-1])
``````

### Code

``````def LRS(s):
def get_longest():
i, j = n, n
lrs = []
while i > 0 and j > 0:
if s[i - 1] == s[j - 1] and i != j:
lrs.append(s[i - 1])
i -= 1
j -= 1
elif dp[i - 1][j] > dp[i][j - 1]:
i -= 1
else:
j -= 1

return ''.join(lrs[::-1])
n = len(s)
dp = [[0 for _ in range(n + 1)] for _ in range(n + 1)]
for i in range(1, n + 1):
for j in range(1, n + 1):
if s[i - 1] == s[j - 1] and i != j:
dp[i][j] = 1 + dp[i - 1][j - 1]
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
print(get_longest())
return dp[-1][-1]

s = 'AXXXA'
print(LRS(s))

``````

### Prints

XX
2

You can keep track of the indices of the last added characters, and make sure that when two characters are the same, their indices have to be not only different to each other but also to that of the last added character:

``````s = 'AXXXA'
n = len(s)
dp = [[('', 0, 0) for i in range(n + 1)] for j in range(n + 1)]

for i in range(1, n + 1):
for j in range(1, n + 1):
last = dp[i - 1][j - 1]
if last[2] != i != j != last[1] and s[i - 1] == s[j - 1]:
dp[i][j] = last[0] + s[i - 1], i, j
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], key=lambda t: t[0])

print(dp[n][n][0]) # outputs X
``````

Actually, your initial algorithm and its answer are correct (… but this is a good question because others might confuse what an LRS means).

Given your input (`in`), the subsequences (`s1`, `s2`) are:

``````in: AXXXA
s1:  XX
s2:   XX
``````

So `XX` (length 2) is indeed the correct answer here.

`X` would be the correct answer for the problem’s non-overlapping version, where the ranges – not just individual indices – must be disjoint.

Categories: questions
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.