Organizing Strings of Different Lengths
In “Enumerating k-mers Lexicographically”, we introduced the lexicographic order for strings of the same length constructed from some ordered underlying alphabet. However, our experience with dictionaries suggests that we should be able to order strings of different lengths just as easily. That is, we already have an intuitive sense that “APPLE” comes before “APPLET”, which comes before “ARTS,” and so we should be able to apply this intuition toward cataloguing genetic strings of varying lengths.
Problem
Say that we have strings s=s1s2⋯sms=s1s2⋯sm and t=t1t2⋯tnt=t1t2⋯tn with m<nm<n. Consider the substring t′=t[1:m]t′=t[1:m]. We have two cases:
- If s=t′s=t′, then we set s<Lexts<Lext because ss is shorter than tt (e.g., APPLE<APPLETAPPLE<APPLET).
- Otherwise, s≠t′s≠t′. We define s
Lexts>Lext if s>Lext′s>Lext′ (e.g., APPLET<LexARTSAPPLET<LexARTS because APPL<LexARTSAPPL<LexARTS).
Given: A permutation of at most 12 symbols defining an ordered alphabet AA and a positive integer nn (n≤4n≤4).
Return: All strings of length at most nn formed from AA, ordered lexicographically. (Note: As in “Enumerating k-mers Lexicographically”, alphabet order is based on the order in which the symbols are given.)
Sample Dataset
D N A
3
Sample Output
D
DD
DDD
DDN
DDA
DN
DND
DNN
DNA
DA
DAD
DAN
DAA
N
ND
NDD
NDN
NDA
NN
NND
NNN
NNA
NA
NAD
NAN
NAA
A
AD
ADD
ADN
ADA
AN
AND
ANN
ANA
AA
AAD
AAN
AAA
解题思路:森林中树根到子节点的路径
其实本题就是上一题 Enumerating k-mers Lexicographically 枚举的扩展,结果就是从任一根节点出发,到达其子节点的所有路径。
森林中以树根节点为起点的所有路径.drawio
如上图,我们只有枚举树高度为 k=3
即可,将所有从根节点出发的路径记录下即可。
from typing import List
class Solution:
def combine(self, s: str, k: int) -> List[str]:
res = []
path = []
def backtrace(idx: int) -> None:
"""回溯函数:每个字符 s[idx] 可以和其后 [idx+1, n) 中选择 n-1 个构成组合"""
if k >= len(path): # 达到 k 个组合,加入答案中
res.append(''.join(path)) # 此处必须拷贝,否则传引用是同一份 path
if k == len(path): return # 不准继续添加
for j in range(len(s)): # [idx, n) 表示可重复选择
path.append(s[j]) # 选择当前 s[j]
backtrace(j + 1) # 下一个字符在 [j+1,n) 选
path.pop()
backtrace(0)
return res
seqs = input().split() # 输出序列
k = int(input()) # 转为整数
for s in Solution().combine(seqs, k):
print(s)