LeetCode 1044. Longest Duplicate Substring
Question
Given a string s
, consider all duplicated substrings: (contiguous) substrings of s that occur 2 or more times. The occurrences may overlap.
Return any duplicated substring that has the longest possible length. If s
does not have a duplicated substring, the answer is ""
.
Example 1:
1 | Input: s = "banana" |
Example 2:
1 | Input: s = "abcd" |
Constraints:
2 <= s.length <= 3 * 104
s
consists of lowercase English letters.
Source: https://leetcode.com/problems/longest-duplicate-substring/
Solution
One intuitive idea is 2D DP, which has a O(n^2)
time complexity. Finding duplicate substring in one string can be regarded as a variant of the problem of finding common substrings between two strings, like LeetCode 718. Maximum Length of Repeated Subarray.
How to solve this problem faster? The key point is how to determine whether two substring are identical. To prove that two substring are identical, we have to compare their characters one by one. However, we do not have to do so to be sure that two substring are not identical. Instead, if the hash values of two substrings are not the same, these two substrings must be different.
Computing the hash value of each substring is still time consuming. We use rolling hash function, which treats character in a string as digits and takes the numeric value of these digits as the hash value. Thus, for a new string that only changes a few characters (in this problem, the first and the last), rolling hash function can compute the new hash value quickly.
In the rolling hash function, we use a large prime number 10^9+7
to avoid overflow.
A long duplicate string must contain shorter duplicate strings. Also, we only care about the longest duplicate substring. We can use binary search to find the largest length of duplicate substring.
1 | /** |
LeetCode 1044. Longest Duplicate Substring
http://yenotes.org/2022/01/10/LeetCode-1044-Longest-Duplicate-Substring/