:::info Strings: array-style strings and the string class

  1. Array-style strings
  2. The string class :::

Array-style strings

  • An array-style string (also called a null-terminated string or an array of characters) is a sequence of characters stored in consecutive bytes in memory.
  • Strings of this kind can be declared as follows:

A string is itself an array: storing characters in an array forms a string.

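    // initchar.cpp
    char rabbit[16] = {'P', 'e', 't', 'e', 'r'}; // element type char, 16 elements; the remaining 11 are zero-initialized
    char bad_pig[9] = {'P', 'e', 'p', 'p', 'a', ' ', 'P', 'i', 'g'}; // a bad one! Exactly 9 elements, so there is no room for the terminating 0
    char good_pig[10] = {'P', 'e', 'p', 'p', 'a', ' ', 'P', 'i', 'g', '\0'};

When an array is treated as a string, the end of the string is marked by a 0. Without one, reading does not stop at the end of the array; it continues until a 0 happens to be found.

    // Returns the length of the string: the number of characters, not counting the terminating null character.
    size_t strlen( const char *str );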
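
To make the counting rule concrete, here is a minimal sketch of how a function like `strlen` can work. The name `my_strlen` is hypothetical, and this is not how any particular standard library implements it; real implementations are usually heavily optimized.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration only: counts characters up to,
// but not including, the first '\0'.
// Precondition: str points to a null-terminated array.
std::size_t my_strlen(const char *str)
{
    assert(str != nullptr);
    std::size_t n = 0;
    while (str[n] != '\0') // stop at the terminator; without one this runs off the array
        ++n;
    return n;
}
```

Calling `my_strlen` on an array with no terminator has the same problem as `strlen`: the loop keeps reading beyond the array.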

String length vs. array length

    char rabbit[16] = {'P', 'e', 't', 'e', 'r'}; // element type char, 16 elements

  • The array length is 16.
  • The string length is 5.

    char name[10] = {'Y', 'u', '\0', 'S', '.', '0'};
    cout << strlen(name) << endl;
    // The string length is 2: counting stops at the first '\0'.

```cpp
#include <iostream>
#include <cstring>
using namespace std;

int main()
{
    char rabbit[16] = {'P', 'e', 't', 'e', 'r'};
    cout << "String length is " << strlen(rabbit) << endl;
    // Unary + promotes the char to int, so its numeric code is printed
    for (int i = 0; i < 16; i++)
        cout << i << ":" << +rabbit[i] << "(" << rabbit[i] << ")" << endl;

    char bad_pig[9] = {'P', 'e', 'p', 'p', 'a', ' ', 'P', 'i', 'g'};
    char good_pig[10] = {'P', 'e', 'p', 'p', 'a', ' ', 'P', 'i', 'g', '\0'};
    cout << "Rabbit is (" << rabbit << ")" << endl;
    // bad_pig has no terminator: printing it reads past the array (undefined behavior)
    cout << "Pig's bad name is (" << bad_pig << ")" << endl;
    cout << "Pig's good name is (" << good_pig << ")" << endl;

    char name[10] = {'Y', 'u', '\0', 'S', '.', '0'};
    cout << strlen(name) << endl;
    return 0;
}
```
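Result:

    String length is 5
    0:80(P)
    1:101(e)
    2:116(t)
    3:101(e)
    4:114(r)
    5:0()
    6:0()
    7:0()
    8:0()
    9:0()
    10:0()
    11:0()
    12:0()
    13:0()
    14:0()
    15:0()
    Rabbit is (Peter)
    Pig's bad name is (Peppa PigPeter)
    Pig's good name is (Peppa Pig)
    2

The string length is 5 because the element after 'r' is 0, so counting stops there; otherwise it would continue past the array until a 0 is encountered. The bad name shows an out-of-bounds read: printing ran past the end of bad_pig, and in this run the rabbit array happened to be stored right after it. Reading past the end of an array is undefined behavior, so the output may differ with other compilers or runs.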

An array-style string must always end with a 0 (the null character '\0').
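
One way to guard against a missing terminator is to check the array before treating it as a string. The sketch below assumes a hypothetical helper named `is_terminated`; it uses `std::memchr` to look for a `'\0'` strictly within the array's own bounds, so the check itself never reads past the end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if the first n elements of arr
// contain a '\0', i.e. arr can safely be used as a string.
bool is_terminated(const char *arr, std::size_t n)
{
    assert(arr != nullptr);
    return std::memchr(arr, '\0', n) != nullptr;
}
```

For example, `is_terminated(bad_pig, sizeof(bad_pig))` is false for the 9-element array above, while the same call on `good_pig` or `rabbit` is true.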

String literals

  • It is inconvenient to initialize a string character by character.
  • String literals can help.

```cpp
char name1[] = "Southern University of Science and Technology";
char name2[] = "Southern University of " "Science and Technology"; // same as name1: adjacent literals are concatenated
char name3[] = "ABCD"; // how many bytes for the array? The string length is 4, the array length is 5
```
![image.png](https://cdn.nlark.com/yuque/0/2021/png/353587/1636794774699-867a581f-06fc-4e2b-ab1f-e6fad278203d.png#clientId=u4f03136c-d5a6-4&from=paste&height=150&id=ua80827b8&name=image.png&originHeight=300&originWidth=287&originalType=binary&ratio=1&size=10455&status=done&style=none&taskId=uc82a9439-5419-4a48-928e-5c1ad238bd1&width=143.5)
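
The size question can be checked by the compiler. The following sketch applies `sizeof` to arrays initialized from literals; the names `abcd`, `joined`, and `same_string` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// An array initialized from a string literal gets one extra element for '\0'.
const char abcd[] = "ABCD";
static_assert(sizeof(abcd) == 5, "4 characters plus the terminating null");

// Adjacent literals are concatenated by the compiler into a single literal.
const char joined[] = "AB" "CD";
static_assert(sizeof(joined) == sizeof(abcd), "same size as the single literal");

// Hypothetical helper: true if both arrays hold the same null-terminated string.
bool same_string(const char *a, const char *b)
{
    assert(a != nullptr && b != nullptr);
    return std::strcmp(a, b) == 0;
}
```

Here `std::strlen(abcd)` is 4 while `sizeof(abcd)` is 5, and `same_string(abcd, joined)` is true.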
```cpp
const wchar_t s5[] = L"ABCD";
const char16_t s9[] = u"ABCD"; // since C++11
const char32_t s6[] = U"ABCD"; // since C++11
```

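String operations
Copy

    char* strcpy( char* dest, const char* src ); // memory copy: src -> dest

This API carries no information about how long the src and dest arrays are, so how many elements does it copy? As many as src holds, up to and including its terminating null character. The problem is that if dest does not have enough space, the copy overflows it.

  • Safer one:

    char *strncpy(char *dest, const char *src, size_t count); // copies at most count elements, which avoids writing past the end of dest
    // count can be set to the smaller of the lengths of src and dest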
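
One caveat with `strncpy`: if `src` holds `count` or more characters, no terminator is written to `dest`. A common pattern is to reserve the last element and terminate explicitly. A minimal sketch, assuming a hypothetical helper named `copy_truncated`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest (capacity dest_size chars),
// truncating if necessary, and always leaves dest null-terminated.
void copy_truncated(char *dest, std::size_t dest_size, const char *src)
{
    assert(dest != nullptr && src != nullptr && dest_size > 0);
    std::strncpy(dest, src, dest_size - 1); // copy at most dest_size - 1 characters
    dest[dest_size - 1] = '\0';             // strncpy may not have written a terminator
}
```

With a 6-element destination, `copy_truncated(small, sizeof(small), "Peppa Pig")` leaves `"Peppa"` in `small` instead of overflowing it. Silent truncation is a design choice here; other code may prefer to report an error.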
    

Concatenate: appends a copy of src to dest

  • Merges two strings into one.

    char *strcat( char *dest, const char *src ); // copies src to the end of dest; this also risks writing past the end of the array
    // there is a safer API as well: `strncat`
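
With `strncat`, the `count` argument limits how many characters are appended, and a terminator is always written afterwards, so the limit must leave room for it. A sketch under those assumptions; `append_truncated` is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends as much of src as fits into dest
// (total capacity dest_size chars), keeping dest null-terminated.
// Precondition: dest already holds a null-terminated string shorter than dest_size.
void append_truncated(char *dest, std::size_t dest_size, const char *src)
{
    assert(dest != nullptr && src != nullptr);
    const std::size_t used = std::strlen(dest);
    assert(used < dest_size);
    // strncat appends at most `count` characters and then writes '\0',
    // so one element is kept free for that terminator.
    std::strncat(dest, src, dest_size - used - 1);
}
```

For a 10-element array holding `"Hello, "`, appending `"SUSTech"` this way yields `"Hello, SU"` rather than writing past the array.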
    

Compare

  • Compares two strings.

    int strcmp( const char *lhs, const char *rhs );
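
Only the sign of the value returned by `strcmp` is specified (negative, zero, or positive), so code should compare the result with zero rather than with a particular number. A small sketch; `order_of` is a hypothetical name:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns -1, 0, or 1 depending on whether lhs
// sorts before, equal to, or after rhs.
int order_of(const char *lhs, const char *rhs)
{
    assert(lhs != nullptr && rhs != nullptr);
    const int r = std::strcmp(lhs, rhs);
    return (r > 0) - (r < 0); // normalize: only the sign of r is portable
}
```

Because comparison stops at the first `'\0'`, `order_of("Hello, \0CPP", "Hello, ")` is 0: the characters after the embedded terminator are never examined.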
    

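```cpp
#include <iostream>
#include <cstring>
using namespace std;

int main()
{
    char str1[] = "Hello, \0CPP"; // 12 elements: includes the embedded '\0' and the implicit terminator
    char str2[] = "SUSTech";
    char result[128];

    // Print every element of str1, including those after the embedded '\0'
    for (size_t i = 0; i < sizeof(str1); i++)
        cout << i << ":" << +str1[i] << "(" << str1[i] << ")" << endl;

    strcpy(result, str1); // copying stops at the first '\0'
    cout << "Result = " << result << endl;
    strcat(result, str2);
    cout << "Result = " << result << endl;

    cout << "strcmp() = " << strcmp(str1, str2) << endl;

    //strcat(str1, str2); // dangerous: str1 has no room for the extra characters
    //cout << "str1 = " << str1 << endl;

    return 0;
}
```

Results:

    0:72(H)
    1:101(e)
    2:108(l)
    3:108(l)
    4:111(o)
    5:44(,)
    6:32( )
    7:0()
    8:67(C)
    9:80(P)
    10:80(P)
    11:0()
    Result = Hello,
    Result = Hello, SUSTech
    strcmp() = -11

strcmp returns a negative value here because 'H' (72) sorts before 'S' (83). In this run the value is their difference, 72 - 83 = -11, but only the sign is guaranteed.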

string class

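  • Null-terminated strings can easily be accessed out of bounds and cause problems.
  • The string class provides functions to manipulate and examine strings.

The array-style strings above perform no bounds checking, so a small slip leads to errors such as out-of-bounds access; the string class greatly reduces the chance of such mistakes.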

    std::string str1 = "Hello";
    std::string str2 = "SUSTech";
    std::string result = str1 + ", " + str2;

A complete program:

    #include <iostream>
    #include <string>

    using namespace std;
    int main()
    {
        std::string str1 = "Hello";
        std::string str2 = "SUSTech";
        std::string result = str1 + ", " + str2;

        cout << "result = " + result << endl;

        cout << "The length is " << result.length() << endl;

        cout << "str1 < str2 is " << (str1 < str2) << endl;

        return 0;
    }

Results:

    result = Hello, SUSTech
    The length is 14
    str1 < str2 is 1
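
As a sketch of how the string class avoids the overflow problems of array-style strings: a `std::string` grows as needed when strings are joined, and its `at()` member checks the index and throws `std::out_of_range` instead of reading out of bounds. The helper names `join` and `index_in_range` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: joins two strings with a separator. The result
// allocates whatever it needs, so there is no destination buffer to overflow.
std::string join(const std::string &a, const std::string &sep, const std::string &b)
{
    std::string out = a + sep + b;
    assert(out.length() == a.length() + sep.length() + b.length());
    return out;
}

// Hypothetical helper: true if index is a valid position in s.
// Relies on at(), which throws std::out_of_range for an invalid index.
bool index_in_range(const std::string &s, std::size_t index)
{
    try {
        (void)s.at(index);
        return true;
    } catch (const std::out_of_range &) {
        return false;
    }
}
```
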
Different types of strings:

    std::string
    std::wstring
    std::u8string  // (C++20)
    std::u16string // (C++11)
    std::u32string // (C++11)
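
These types differ in the character type they store: `std::string` holds `char`, while `std::u32string` holds `char32_t`. The sketch below, assuming C++11 or later, widens a plain ASCII `std::string` into a `std::u32string`. The function name `widen_ascii` is hypothetical, and this is not a general encoding conversion; it only handles ASCII input.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: converts an ASCII-only std::string to std::u32string
// by widening each character. Non-ASCII input is outside this sketch's scope.
std::u32string widen_ascii(const std::string &ascii)
{
    std::u32string out;
    out.reserve(ascii.length());
    for (unsigned char c : ascii) {
        assert(c < 128); // this sketch only handles ASCII
        out.push_back(static_cast<char32_t>(c));
    }
    return out;
}
```

For example, `widen_ascii("ABCD")` compares equal to the literal `U"ABCD"` and has length 4, the same as the narrow string.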