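Problem:
Oh, no! You accidentally removed all the spaces and punctuation from a long article, and turned every uppercase letter into lowercase.
A sentence like "I reset the computer. It still didn’t boot!" has
become "iresetthecomputeritstilldidntboot". Before dealing with punctuation and capitalization,
you first have to split it back into words. Of course, you have a thick dictionary, called dictionary, but
some words are not in it. Given the article as sentence, design an algorithm that splits the article
so that the number of unrecognized characters is minimized, and return that number.
Note: this problem is slightly modified from the original; only the number of unrecognized characters needs to be returned.
Example:
Input:
dictionary = ["looked","just","like","her","brother"]
sentence = "jesslookedjustliketimherbrother"
Output: 7
Explanation: after splitting, the sentence is "jess looked just like tim her brother", with 7 unrecognized characters in total ("jess" and "tim").
Source: LeetCode (力扣)
Link: https://leetcode-cn.com/problems/re-space-lcci
Copyright belongs to LeetCode (领扣网络). For commercial reprints, contact the official site for authorization; for non-commercial reprints, please cite the source.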
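To make the count in the example concrete, here is a minimal sketch that scores one given segmentation. The helper name count_unrecognized is hypothetical and is not part of the problem's interface; it only checks a split that is already known, it does not search for the best one.

```python
from typing import List


def count_unrecognized(dictionary: List[str], pieces: List[str]) -> int:
    """Sum the lengths of the pieces that are not dictionary words."""
    words = set(dictionary)
    return sum(len(piece) for piece in pieces if piece not in words)


dictionary = ["looked", "just", "like", "her", "brother"]
pieces = "jess looked just like tim her brother".split()
print(count_unrecognized(dictionary, pieces))  # "jess" (4) + "tim" (3) = 7
```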
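Answer:
Time:
30min
from typing import List

class Solution:
    def respace(self, dictionary: List[str], sentence: str) -> int:
        d = set(dictionary)  # set gives O(1) average-case membership tests
        n = len(sentence)
        # dp[i]: minimum number of unrecognized characters in sentence[:i]
        dp = [0] * (n + 1)
        for i in range(1, n + 1):
            # Default: treat the character sentence[i-1] as unrecognized
            dp[i] = dp[i - 1] + 1
            # j is the 1-based start of the candidate word sentence[j-1:i];
            # start at 1 so that j-1 never becomes a negative index
            for j in range(1, i + 1):
                if sentence[j - 1:i] in d:
                    dp[i] = min(dp[j - 1], dp[i])
        return dp[-1]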
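One possible refinement of the DP above, offered as a sketch rather than as part of the original answer: a substring longer than the longest dictionary word can never match, so the inner loop only needs to look back that far. The function name respace_bounded and the variable max_len are assumptions introduced here.

```python
from typing import List


def respace_bounded(dictionary: List[str], sentence: str) -> int:
    words = set(dictionary)
    # Length of the longest dictionary word (0 if the dictionary is empty).
    max_len = max((len(w) for w in words), default=0)
    n = len(sentence)
    # dp[i]: minimum number of unrecognized characters in sentence[:i]
    dp = [0] * (n + 1)
    for i in range(1, n + 1):
        dp[i] = dp[i - 1] + 1
        # sentence[j-1:i] has length i-j+1, so require i-j+1 <= max_len.
        for j in range(max(1, i - max_len + 1), i + 1):
            if sentence[j - 1:i] in words:
                dp[i] = min(dp[i], dp[j - 1])
    return dp[n]
```

The result is the same as the full double loop; only the number of substrings examined per position changes, from up to i to at most max_len.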
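Answer 2:
import functools
from typing import List

class Solution:
    def respace(self, dictionary: List[str], sentence: str) -> int:
        n = len(sentence)
        d = set(dictionary)

        # dfs(i): minimum number of unrecognized characters in sentence[:i+1].
        # The recursion depth grows linearly with len(sentence).
        @functools.lru_cache(None)
        def dfs(i):
            if i == -1:
                return 0  # empty prefix
            res = dfs(i - 1) + 1  # treat sentence[i] as unrecognized
            for j in range(i + 1):
                if sentence[j:i + 1] in d:
                    res = min(res, dfs(j - 1))
            return res

        return dfs(n - 1)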
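Answer 3: speeding up with a Trie
Suffix Trie (each dictionary word is inserted reversed, so that words ending at a given position can be matched by walking backwards)
from collections import defaultdict
from functools import reduce
from typing import List

# A node is a dict from character to child node; missing children are created on demand.
TrieNode = lambda: defaultdict(TrieNode)

class Trie:
    def __init__(self):
        self.trie = TrieNode()

    def insert(self, word):
        # Walk (and create) the path for word, then mark its last node with 'end'
        reduce(dict.__getitem__, word, self.trie)['end'] = True

    def search(self, word):
        return reduce(lambda d, k: d[k] if k in d else TrieNode(), word, self.trie).get('end', False)

    def startsWith(self, word):
        return bool(reduce(lambda d, k: d[k] if k in d else TrieNode(), word, self.trie).keys())

class Solution:
    def respace(self, dictionary: List[str], sentence: str) -> int:
        n = len(sentence)
        d = set(dictionary)
        tree = Trie()
        for word in d:
            tree.insert(word[::-1])  # store each word reversed
        # dp[i]: minimum number of unrecognized characters in sentence[:i]
        dp = [0] * (n + 1)
        for i in range(1, n + 1):
            curNode = tree.trie
            dp[i] = dp[i - 1] + 1
            # Walk backwards from position i, following the trie
            for j in range(i, 0, -1):
                c = sentence[j - 1]
                if c not in curNode:
                    break  # no dictionary word ends with sentence[j-1:i]
                if "end" in curNode[c]:
                    dp[i] = min(dp[i], dp[j - 1])
                curNode = curNode[c]
        return dp[-1]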
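The same reversed-word trie idea can also be sketched with an explicit node class, which may be easier to follow than the nested defaultdict. The names Node, is_word, and respace_with_trie are assumptions introduced for this sketch and do not appear in the answer above.

```python
from typing import Dict, List


class Node:
    """One trie node; children maps a character to the next node."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.is_word = False  # True if a reversed dictionary word ends here


def respace_with_trie(dictionary: List[str], sentence: str) -> int:
    # Build a trie of the reversed dictionary words.
    root = Node()
    for word in dictionary:
        node = root
        for ch in reversed(word):
            node = node.children.setdefault(ch, Node())
        node.is_word = True

    n = len(sentence)
    # dp[i]: minimum number of unrecognized characters in sentence[:i]
    dp = [0] * (n + 1)
    for i in range(1, n + 1):
        dp[i] = dp[i - 1] + 1
        node = root
        # Read sentence backwards from position i while the trie has a path.
        for j in range(i, 0, -1):
            node = node.children.get(sentence[j - 1])
            if node is None:
                break
            if node.is_word:  # sentence[j-1:i] is a dictionary word
                dp[i] = min(dp[i], dp[j - 1])
    return dp[n]
```

Compared with testing every substring against a set, the backward walk stops as soon as no dictionary word can end with the characters read so far, and it avoids building a new substring for each (j, i) pair.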
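Key points:
1. Definition of dp
dp[i] is the minimum number of unrecognized characters among the first i characters of sentence (i counts from 1), with dp[0] = 0.
During the scan, for each end position i, try every start position from the beginning of the sentence up to i. By default dp[i] = dp[i-1] + 1; whenever the substring from a start position to i is in the dictionary, dp[i] can also transition from the dp value just before that start, and the smaller value is kept.
The indices are easy to get wrong.
For the same position, the dp index is 1 greater than the sentence index: dp[i] corresponds to the character sentence[i-1].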
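To see the index offset in practice, the following sketch returns the whole dp table for the example instead of only the last entry, then prints a few positions side by side. The function name dp_table is an assumption made for this illustration.

```python
from typing import List


def dp_table(dictionary: List[str], sentence: str) -> List[int]:
    words = set(dictionary)
    n = len(sentence)
    # dp has n + 1 entries: dp[0] is the empty prefix, dp[i] covers sentence[:i].
    dp = [0] * (n + 1)
    for i in range(1, n + 1):
        dp[i] = dp[i - 1] + 1
        for j in range(1, i + 1):
            if sentence[j - 1:i] in words:
                dp[i] = min(dp[i], dp[j - 1])
    return dp


dictionary = ["looked", "just", "like", "her", "brother"]
sentence = "jesslookedjustliketimherbrother"
dp = dp_table(dictionary, sentence)
# dp[i] describes the prefix whose last character is sentence[i - 1].
for i in (4, 10, 21, len(sentence)):
    print(f"dp[{i}] = {dp[i]}  (last character: sentence[{i - 1}] = {sentence[i - 1]!r})")
```

Here dp[4] covers "jess", dp[10] covers "jesslooked", and dp[21] covers everything up to and including "tim", so the values read 4, 4, and 7 before the final answer dp[31] = 7.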