基础内容 - 基本类型-字符串-string - 《golang》

与其他编程语言的差异
字符串类型的内部结构定义
unicode 和 utf8
字符串处理">字符串处理
字符串转化

与其他编程语言的差异

string 是数据类型，不是指针或者引用类型，默认初始化的空值是空字符串，不是nil。
string 是只读的 byte slice
string 的 byte 数组可以存放任何数据，len 求byte的长度

字符串类型的内部结构定义

对于标准编译器，字符串类型的内部结构声明如下：

type _string struct {
    elements *byte // 引用着底层的字节
    len      int   // 字符串中的字节数
}

从这个声明来看，可以将一个字符串的内部定义看作为一个字节序列。事实上，我们确实可以把一个字符串看作是一个元素类型为byte的（且元素不可修改的）切片。

unicode 和 utf8

Unicode 是一种编码（code point），UTF8是unicode的存储实现（转化为字节序列的规则），详情参考字符编码笔记：ASCII，Unicode 和 UTF-8。

基本类型-字符串-string - 图1

在UTF-8编码中，一个码点值可能由1到4个字节组成。

英语码点值（均对应一个英语字符）均由一个字节组成
中文字符（均对应一个中文字符）均由三个字节组成
emoji 均由4个字节组成

示例，字母，中文，emoji 的 unicode 和 utf8 编码

// 英文字符
strEn := "a"
fmt.Printf("%s bytes len is %d\n", strEn, len(strEn))         // a bytes len is 1
fmt.Printf("%s chars len is %d\n", strEn, len([]rune(strEn))) // a chars len is 1
fmt.Printf("%s chars len is %d\n", strEn, utf8.RuneCountInString(strEn)) // a chars len is 1
// unicode 编码
fmt.Printf("%s unicode is %x\n", strEn, []rune(strEn)[0])     // a unicode is 61
// 存储形式
fmt.Printf("%s utf8 is %x\n", strEn, strEn)                   // a utf8 is 61
// 中文字符
strZh := "中"
fmt.Printf("%s bytes len is %d\n", strZh, len(strZh))         // 中 bytes len is 3
fmt.Printf("%s chars len is %d\n", strZh, len([]rune(strZh))) // 中 chars len is 1
fmt.Printf("%s chars len is %d\n", strZh, utf8.RuneCountInString(strZh)) // 中 chars len is 1
// unicode 编码
fmt.Printf("%s unicode is %x\n", strZh, []rune(strZh)[0])     // 中 unicode is 4e2d
// 存储形式
fmt.Printf("%s utf8 is %x\n", strZh, strZh)                   // 中 utf8 is e4b8ad
// 表情
emoji := "🌈"
fmt.Printf("%s bytes len is %d\n", emoji, len(emoji))  // 🌈 bytes len is 4
fmt.Printf("%s chars len is %d\n", emoji, len([]rune(emoji))) // 🌈 chars len is 1
fmt.Printf("%s chars len is %d\n", emoji, utf8.RuneCountInString(emoji)) // 🌈 chars len is 1
// unicode 编码
fmt.Printf("%s unicode is %x\n", emoji, []rune(emoji)[0]) // 🌈 unicode is 1f308
// 存储形式
fmt.Printf("%s utf8 is %x\n", emoji, emoji) // 🌈 utf8 is f09f8c88

字符串处理

包含
- func Contains(s, substr string) bool 是否包含子字符串
- func ContainsAny(s, chars string) bool 包含任意子串
切割
- func Fields(s string) []string 按空白切割
- func Split(s, sep string) []string 按指定内容切割
- func SplitAfter(s, sep string) []string
开头结尾判断
- func HasPrefix(s, prefix string) bool 开头
- func HasSuffix(s, suffix string) bool 结尾
替换
- func Replace(s, old, new string, n int) string 替换指定长度的内容
- func ReplaceAll(s, old, new string) string 替换所有的匹配的内容
删除指定字符
- func Trim(s string, cutset string) string 删除指定字符
- func TrimSpace(s string) string 删除两端空白字符
- func TrimLeft(s string, cutset string) string
- func TrimRight(s string, cutset string) string
- func TrimPrefix(s, prefix string) string
- func TrimSuffix(s, suffix string) string

示例，字符串分割连接

str := "hello, world!"
for _, v := range strings.Split(str, ",") {
    fmt.Println(v)
}
// out
hello
 world!

字符串转化

func Atoi(s string) (int, error)
func Itoa(i int) string
func ParseBool(str string) (bool, error)
func ParseFloat(s string, bitSize int) (float64, error)
func ParseInt(s string, base int, bitSize int) (i int64, err error)

示例，字符串/整数相互转化

s := strconv.Itoa(10)
fmt.Println("str" + s)
i, err := strconv.Atoi("10")
if err != nil {
    fmt.Println("convert failed")
    os.Exit(1)
}
fmt.Println(i)

https://gfw.go101.org/article/string.html