值得一提的是,在 Common Lisp中,字符串是数组,即字符串也是序列。也就是说,数组和序列中的一些概念也可以应用在字符串上。当你找不到某个特定的字符串的函数时,试试在数组或序列通用的函数中找找看。
ASDF3中的一些操作字符串的函数(strcat, string-prefix-p, string-enclosed-p, first-char, last-char, split-string, stripln)。
Quicklisp 中一些拓展包有更多的函数和更简单的方法:
- str:
trim,words,unwords,lines,unlines,concat,split,shorten,repeat,replace-all,starts-with?,ends-with?,blankp,emptyp… - cl-change-case:包括将字符串转换成驼峰命名格式、参数格式、蛇形命名格式的函数,这些函数在
str包中也有。 - mk-string-metrics:包含一些高效测量字符串编辑距离的函数(Damerau-Levenshtein, Hamming, Jaro, Jaro-Winkler, Levenshtein 等)
cl-ppcre: 常用的包,包括字符串的一些正则表达式的操作。
一些关于字符串格式的资源:
- CLHS 官方文档
- 快速入门
- CLHS 在 HexstreamSoft 中的总结
- Slime tips:
C-c C-d ~加上字符格式就可以直接查看相关文档。
创建字符串
format nil
(defparameter *person* "you")(format nil "hello ~a" *person*) ;; => "hello you"
make-string count
(make-string 3 :initial-element #\♥) ;; => "♥♥♥"
访问子串:subseq
因为字符串也是序列,所以可以使用访问子序列的函数来访问子串。其中字符串的索引也和数组一样,从 0 开始。
(defparameter *my-string* (string "Groucho Marx"))*MY-STRING*(subseq *my-string* 8)"Marx"(subseq *my-string* 0 7)"Groucho"(subseq *my-string* 1 5)"rouc"
也可以同时使用 subseq 和 setf
(defparameter *my-string* (string "Harpo Marx"))*MY-STRING*(subseq *my-string* 0 5)"Harpo"(setf (subseq *my-string* 0 5) "Chico")"Chico"*my-string*"Chico Marx"
要注意的是,字符串不是可变的。引用 HyperSpec 的一段话:”如果新的字符串与子串的长度不同,由较短的字符串决定替换的长度。”
(defparameter *my-string* (string "Karl Marx"))*MY-STRING*(subseq *my-string* 0 4)"Karl"(setf (subseq *my-string* 0 4) "Harpo")"Harpo"*my-string*"Harp Marx"(subseq *my-string* 4) "o Marx")"o Marx"*my-string*"Harpo Mar"
访问单个字符:char
(defparameter *my-string* (string "Groucho Marx"))*MY-STRING*(char *my-string* 11)#\x(char *my-string* 7)#\Space(char *my-string* 6)#\o(setf (char *my-string* 6) #\y)#\y*my-string*"Grouchy Marx"
schar 适用于对效率要求比较高的场景。
由于字符串是数组,因此也是序列,所以也可以经常使用 aref 和 elt 函数( char 的效率可能要高一些。)
(defparameter *my-string* (string "Groucho Marx"))*MY-STRING*(aref *my-string* 3)#\u(elt *my-string* 8)#\M
(stream-external-format *standard-output*):UTF-8(code-char 200)#\LATIN_CAPITAL_LETTER_E_WITH_GRAVE(char-code #\LATIN_CAPITAL_LETTER_E_WITH_GRAVE)200(code-char 1488)#\HEBREW_LETTER_ALEF(char-code #\HEBREW_LETTER_ALEF)1488
操作字符串的部分
* (remove #\o "Harpo Marx")"Harp Marx"* (remove #\a "Harpo Marx")"Hrpo Mrx"* (remove #\a "Harpo Marx" :start 2)"Harpo Mrx"* (remove-if #'upper-case-p "Harpo Marx")"arpo arx"* (substitute #\u #\o "Groucho Marx")"Gruuchu Marx"* (substitute-if #\_ #'upper-case-p "Groucho Marx")"_roucho _arx"* (defparameter *my-string* (string "Zeppo Marx"))*MY-STRING** (replace *my-string* "Harpo" :end1 5)"Harpo Marx"* *my-string*"Harpo Marx"
* (replace-all "Groucho Marx Groucho" "Groucho" "ReplacementForGroucho")"ReplacementForGroucho Marx ReplacementForGroucho"
(defun replace-all (string part replacement &key (test #'char=))"Returns a new string in which all the occurrences of the partis replaced with replacement."(with-output-to-string (out)(loop with part-length = (length part)for old-pos = 0 then (+ pos part-length)for pos = (search part string:start2 old-pos:test test)do (write-string string out:start old-pos:end (or pos (length string)))when pos do (write-string replacement out)while pos)))
追加字符串
concatenate
* (concatenate 'string "Karl" " " "Marx")"Karl Marx"* (concatenate 'list "Karl" " " "Marx")(#\K #\a #\r #\l #\Space #\M #\a #\r #\x)
strcat concat
(uiop:strcat "karl" " marx")(str:concat "foo" "bar")
构建由多个部分组成的字符串
* (defparameter *my-string* (make-array 0:element-type 'character:fill-pointer 0:adjustable t))*MY-STRING** *my-string*""* (dolist (char '(#\Z #\a #\p #\p #\a))(vector-push-extend char *my-string*))NIL* *my-string*"Zappa"
format
* (format nil "This is a string with a list ~A in it"'(1 2 3))"This is a string with a list (1 2 3) in it"* (format nil "The Marx brothers are:~{ ~A~^,~}."'("Groucho" "Harpo" "Chico" "Zeppo" "Karl"))"The Marx brothers are: Groucho, Harpo, Chico, Zeppo, Karl."
concatenate
* (format nil "The Marx brothers are:~{ ~A~}."'("Groucho" "Harpo" "Chico" "Zeppo" "Karl"))"The Marx brothers are: Groucho Harpo Chico Zeppo Karl."
with-output-to-string
* (with-output-to-string (stream)(dolist (char '(#\Z #\a #\p #\p #\a #\, #\Space))(princ char stream))(format stream "~S - ~S" 1940 1993))"Zappa, 1940 - 1993"
同一时间处理字符串的一个字符
map
* (defparameter *my-string* (string "Groucho Marx"))*MY-STRING** (map 'string #'(lambda (c) (print c)) *my-string*)#\G#\r#\o#\u#\c#\h#\o#\Space#\M#\a#\r#\x"Groucho Marx"
loop
* (loop for char across "Zeppo"collect char)(#\Z #\e #\p #\p #\o)
字符串倒序
reversee
(defparameter *my-string* (string "DSL"))*MY-STRING*(reverse *my-string*)"LSD"
Common Lisp 中没有直接将字符串按照单词进行倒序的,你需要自己去定义函数或是使用 str 包
(defparameter *singing* "singing in the rain")*SINGING*(str:words *SINGING*)("singing" "in" "the" "rain")(reverse *)("rain" "the" "in" "singing")(str:unwords *)"rain the in singing"
自定义函数
* (defun split-by-one-space (string)"Returns a list of substrings of stringdivided by ONE space each.Note: Two consecutive spaces will be seen asif there were an empty string between them."(loop for i = 0 then (1+ j)as j = (position #\Space string :start i)collect (subseq string i j)while j))SPLIT-BY-ONE-SPACE* (split-by-one-space "Singing in the rain")("Singing" "in" "the" "rain")* (split-by-one-space "Singing in the rain")("Singing" "in" "the" "" "rain")* (split-by-one-space "Cool")("Cool")* (split-by-one-space " Cool ")("" "Cool" "")* (defun join-string-list (string-list)"Concatenates a list of stringsand puts spaces between the elements."(format nil "~{~A~^ ~}" string-list))JOIN-STRING-LIST* (join-string-list '("We" "want" "better" "examples"))"We want better examples"* (join-string-list '("Really"))"Really"* (join-string-list '())""* (join-string-list(nreverse(split-by-one-space"Reverse this sentence by word")))"word by sentence this Reverse"
大小写转换
* (string-upcase "cool")"COOL"* (string-upcase "Cool")"COOL"* (string-downcase "COOL")"cool"* (string-downcase "Cool")"cool"* (string-capitalize "cool")"Cool"* (string-capitalize "cool example")"Cool Example"* (string-capitalize "cool example" :start 5)"cool Example"* (string-capitalize "cool example" :end 5)"Cool example"* (defparameter *my-string* (string "BIG"))*MY-STRING** (defparameter *my-downcase-string* (nstring-downcase *my-string*))*MY-DOWNCASE-STRING** *my-downcase-string*"big"* *my-string*"big"
在 HyperSpec 中有一段话
for STRING-UPCASE, STRING-DOWNCASE, and STRING-CAPITALIZE, string is not modified. However, if no characters in string require conversion, the result may be either string or a copy of it, at the implementation’s discretion.
也就是说,下面这个例子中与实现无关,结果可能是 “BIG” 或是 “BUG”。如果想要一个确切的结果,使用 copy-seq
(defparameter *my-string* (string "BIG"))*MY-STRING*(defparameter *my-upcase-string* (string-upcase *my-string*))*MY-UPCASE-STRING*(setf (char *my-string* 1) #\U)#\U*my-string*"BUG"*my-upcase-string*"BIG"
使用 format 函数
转换为小写:~( ~)
(format t "~(~a~)" "HELLO WORLD");; => hello world
首字母大写:~:( ~)
(format t "~:(~a~)" "HELLO WORLD")Hello WorldNIL
第一个单词首字母大写:~@( ~)
(format t "~@(~a~)" "hello world")Hello worldNIL
转换为大写:~@:( ~)
(format t "~@:(~a~)" "hello world")HELLO WORLDNIL
删除空格
string-trim
(string-trim " " " trim me ")"trim me"(string-trim " et" " trim me ")"rim m"(string-left-trim " et" " trim me ")"rim me "(string-right-trim " et" " trim me ")" trim m"(string-right-trim '(#\Space #\e #\t) " trim me ")" trim m"(string-right-trim '(#\Space #\e #\t #\m) " trim me ")
符号与字符串之间的转换
intern
* (in-package "COMMON-LISP-USER")#<The COMMON-LISP-USER package, 35/44 internal, 0/9 external>* (intern "MY-SYMBOL")MY-SYMBOLNIL* (intern "MY-SYMBOL")MY-SYMBOL:INTERNAL* (export 'MY-SYMBOL)T* (intern "MY-SYMBOL")MY-SYMBOL:EXTERNAL* (intern "My-Symbol")|My-Symbol|NIL* (intern "MY-SYMBOL" "KEYWORD"):MY-SYMBOLNIL* (intern "MY-SYMBOL" "KEYWORD"):MY-SYMBOL:EXTERNAL
* (symbol-name 'MY-SYMBOL)"MY-SYMBOL"* (symbol-name 'my-symbol)"MY-SYMBOL"* (symbol-name '|my-symbol|)"my-symbol"* (string 'howdy)"HOWDY"
字符与字符串之间的转换
* (coerce "a" 'character)#\a* (coerce (subseq "cool" 2 3) 'character)#\o* (coerce "cool" 'list)(#\c #\o #\o #\l)* (coerce '(#\h #\e #\y) 'string)"hey"* (coerce (nth 2 '(#\h #\e #\y)) 'character)#\y* (defparameter *my-array* (make-array 5 :initial-element #\x))*MY-ARRAY** *my-array*#(#\x #\x #\x #\x #\x)* (coerce *my-array* 'string)"xxxxx"* (string 'howdy)"HOWDY"* (string #\y)"y"* (coerce #\y 'string)#\y can't be converted to type STRING.[Condition of type SIMPLE-TYPE-ERROR]
字符串查找
find, position
* (find #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equal)#\t* (find #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equalp)#\T* (find #\z "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equalp)NIL* (find-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks.")#\1* (find-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks." :from-end t)#\0* (position #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equal)17* (position #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equalp)0* (position-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks.")37* (position-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks." :from-end t)43
count
* (count #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equal)2* (count #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equalp)3* (count-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks.")6* (count-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks." :start 38)5
查找子串: search
* (search "we" "If we can't be free we can at least be cheap")3* (search "we" "If we can't be free we can at least be cheap" :from-end t)20* (search "we" "If we can't be free we can at least be cheap" :start2 4)20* (search "we" "If we can't be free we can at least be cheap" :end2 5 :from-end t)3* (search "FREE" "If we can't be free we can at least be cheap")NIL* (search "FREE" "If we can't be free we can at least be cheap" :test #'char-equal)15
将字符串转换为数字
转换为整数型:parse-integer
* (parse-integer "42")422* (parse-integer "42" :start 1)22* (parse-integer "42" :end 1)41* (parse-integer "42" :radix 8)342* (parse-integer " 42 ")423* (parse-integer " 42 is forty-two" :junk-allowed t)423* (parse-integer " 42 is forty-two")Error in function PARSE-INTEGER:There's junk in this string: " 42 is forty-two".
转换成任意的数字:read-from-string
* (read-from-string "#X23")354* (read-from-string "4.5")4.53* (read-from-string "6/8")3/43* (read-from-string "#C(6/8 1)")#C(3/4 1)9* (read-from-string "1.2e2")120.000015* (read-from-string "symbol")SYMBOL6* (defparameter *foo* 42)*FOO** (read-from-string "#.(setq *foo* \"gotcha\")")"gotcha"23* *foo*"gotcha"
转换成浮点型:parse-float
(ql:quickload "parse-float")(parse-float:parse-float "1.2e2");; 120.00001;; 5
数字转换成字符串
write-to-string prin1-to-string princ-to-string
(write-to-string 250)"250"(write-to-string 250.02)"250.02"(write-to-string 250 :base 5)"2000"(write-to-string (/ 1 3))"1/3"
字符串比较
equal, equalp, mismatch
(string= "Marx" "Marx")T(string= "Marx" "marx")NIL(string-equal "Marx" "marx")T(string< "Groucho" "Zeppo")T(string "groucho" "Zeppo")NIL(string-lessp "groucho" "Zeppo")0(mismatch "Harpo Marx" "Zeppo Marx" :from-end t :test #'char=)3
字符串格式
(defparameter movies '((1 "Matrix" 5)(10 "Matrix Trilogy swe sub" 3.3)))
结果如下:
1 Matrix 510 Matrix Trilogy swe sub 3.3
使用 mapcar 格式化
(mapcar (lambda (it)(format t "~a ~a ~a~%" (first it) (second it) (third it)))movies)
输出结果为:
1 foo baar 510 last 3.3
format 的结构
使用 format 进行格式化时,都是以 ~ 开始,要输出 ‘~’ 符号时,使用 ~~, 多个 ‘~’: ~10~
R: 将数字转换成英文:(format t "~R" 20)=> “twenty”$: 金融:(format t "~$" 21982)=> 21892.00D,B,O,X: 十进制,二进制,八进制,十六进制F: 修复的浮点数~A(~a):(format t "~a" movies)=> “((1 Matrix 5) (10 Matrix Trilogy swe sub 3.3))”~%/~&:~&不会输出新行如果输出的流已经有了。输出 10 个新行~10%。~T:~10T10 个 tab
文本对齐
右对齐:a
(format nil "~20a" "yo");; "yo "(mapcar (lambda (it)(format t "~2a ~a ~a~%" (first it) (second it) (third it)))movies);; 1 Matrix 5;; 10 Matrix Trilogy swe sub 3.3(mapcar (lambda (it)(format t "~2a ~25a ~2a~%" (first it) (second it) (third it)))movies);; 1 Matrix 5;; 10 Matrix Trilogy swe sub 3.3
左对齐: @
(format nil "~20@a" "yo");; " yo"(mapcar (lambda (it)(format nil "~2@a ~25@a ~2a~%" (first it) (second it) (third it)))movies);; 1 Matrix 5;;10 Matrix Trilogy swe sub 3.3
小数对齐:F
(format t "~,2F" 20.1);; 20.10(mapcar (lambda (it)(format t "~2@a ~25a ~2,2f~%" (first it) (second it) (third it)))movies);; 1 Matrix 5.00;; 10 Matrix Trilogy swe sub 3.30
格式化已格式的字符串
(let ((padding 30))(format nil "~va" padding "foo"));; "foo "
Capturing what is is printed into a stream
Inside (with-output-to-string (mystream) …), everything that is
printed into the stream mystream is captured and returned as a
string:
(defun greet (name &key (stream t));; by default, print to standard output.(format stream "hello ~a" name))(let ((output (with-output-to-string (stream)(greet "you" :stream stream))))(format t "Output is: '~a'. It is indeed a ~a, aka a string.~&" output (type-of output)));; Output is: 'hello you'. It is indeed a (SIMPLE-ARRAY CHARACTER (9)), aka a string.;; NIL
See also
- Pretty printing table data, in ASCII art, a tutorial as a Jupyter notebook.
