1. Generate a Series

    import pandas as pd
    import numpy as np

    # Four random integers drawn from [1, 10)
    ser = pd.Series(np.random.randint(1, 10, 4))
    print(ser)

Output:

    0    9
    1    8
    2    3
    3    8
    dtype: int32
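The values above change on every run because no seed is set. A minimal sketch of a repeatable version, assuming NumPy's `default_rng` generator is acceptable in place of `np.random.randint` (the seed value 0 is arbitrary):

```python
import numpy as np
import pandas as pd

# Assumption: seed 0 is arbitrary; any fixed seed gives repeatable draws.
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)

# integers(low, high, size) draws from [low, high), like randint
ser_a = pd.Series(rng_a.integers(1, 10, 4))
ser_b = pd.Series(rng_b.integers(1, 10, 4))

same = ser_a.equals(ser_b)                              # identical draws
in_range = bool(((ser_a >= 1) & (ser_a < 10)).all())    # all values in [1, 10)
print(same, in_range)
```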

2. Apply the exponential function

    # np.exp computes e**x element-wise and keeps the Series index
    print(np.exp(ser))

Output:

    0    8103.083928
    1    2980.957987
    2      20.085537
    3    2980.957987
    dtype: float64
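To make the index-preserving behaviour easier to see, here is a small sketch with fixed values and string labels (the labels `w` to `z` are illustrative, not from the original). It also contrasts `np.exp`, which computes e**x, with the `**` operator, which raises each element to a power:

```python
import math
import numpy as np
import pandas as pd

# Fixed values with illustrative string labels
ser = pd.Series([9, 8, 3, 8], index=list("wxyz"))

result = np.exp(ser)      # e**x for each element
squared = ser ** 2        # x**2 for each element

# The result is still a Series carrying the same labels
index_kept = result.index.equals(ser.index)
matches_math = all(
    math.isclose(result[k], math.exp(ser[k])) for k in ser.index
)
print(index_kept, matches_math)
print(squared.tolist())
```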

3. Generate a DataFrame

    # 2 rows x 3 columns of random integers in [1, 10)
    df = pd.DataFrame(np.random.randint(1, 10, (2, 3)), columns=['a', 'b', 'c'])
    print(df)

Output:

       a  b  c
    0  5  1  2
    1  8  9  2

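4. Multiply, divide, and take the sine

    print(np.sin(df * np.pi / 4))

Output:

                  a         b    c
    0 -7.071068e-01  0.707107  1.0
    1 -2.449294e-16  0.707107  1.0

The value -2.449294e-16 is floating-point rounding error: mathematically, sin(8π/4) = sin(2π) = 0.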

5. Randomly generate a DataFrame of a given shape

    # 3 rows x 4 columns; with no labels given, rows and columns are numbered from 0
    A = pd.DataFrame(np.random.randint(1, 10, (3, 4)))
    print(A)

Output:

       0  1  2  3
    0  9  2  3  9
    1  3  9  3  3
    2  7  1  1  1

6. Index alignment

    A = pd.Series([2, 4, 6], index=[0, 1, 2])
    B = pd.Series([1, 3, 5], index=[1, 2, 3])
    print(A)
    print(B)
    print(A + B)  # labels present in only one Series produce NaN

Output:

    0    2
    1    4
    2    6
    dtype: int64
    1    1
    2    3
    3    5
    dtype: int64
    0    NaN
    1    5.0
    2    9.0
    3    NaN
    dtype: float64
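The alignment rule can be checked directly: the result's index is the union of the two input indexes, and the NaN entries are exactly the labels that appear in only one operand. A small sketch using the same two Series:

```python
import pandas as pd

A = pd.Series([2, 4, 6], index=[0, 1, 2])
B = pd.Series([1, 3, 5], index=[1, 2, 3])

total = A + B

# The result is indexed by the union of both indexes
union_labels = A.index.union(B.index)

# Labels that exist in only one of the two Series end up as NaN
missing = total[total.isna()].index.tolist()

print(list(union_labels))  # [0, 1, 2, 3]
print(missing)             # [0, 3]
```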

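7. Add Series with different indexes, filling missing entries

    # fill_value=0 stands in for an entry that is missing on one side
    print(A.add(B, fill_value=0))

Output:

    0    2.0
    1    5.0
    2    9.0
    3    5.0
    dtype: float64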
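One detail worth knowing: `fill_value` only substitutes when exactly one side is missing. If both operands are missing at a label, the result stays NaN. A minimal sketch with hand-picked values (not from the original) that exercises all three cases:

```python
import numpy as np
import pandas as pd

# Illustrative data: label 1 is NaN in both Series
A = pd.Series([2.0, np.nan], index=[0, 1])
B = pd.Series([np.nan, 5.0], index=[1, 2])

out = A.add(B, fill_value=0)

# label 0: only A has a value -> 2.0 + 0
# label 1: missing on both sides -> stays NaN
# label 2: only B has a value -> 0 + 5.0
print(out)
```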

8. Add DataFrames with different shapes

    A = pd.DataFrame(np.random.randint(1, 10, (2, 3)))
    B = pd.DataFrame(np.random.randint(1, 10, (3, 4)))
    print("A=", A, sep="\n")
    print("B=", B, sep="\n")
    print(A + B)  # rows and columns are both aligned; unmatched cells become NaN

Output:

    A=
       0  1  2
    0  1  6  8
    1  4  8  6
    B=
       0  1  2  3
    0  5  6  5  8
    1  3  4  9  1
    2  5  2  4  9
         0     1     2    3
    0  6.0  12.0  13.0  NaN
    1  7.0  12.0  15.0  NaN
    2  NaN   NaN   NaN  NaN
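As with Series, the NaN cells can be avoided by using `add` with a `fill_value`. The sketch below reuses the values of A and B printed above as fixed data, so the result is repeatable:

```python
import pandas as pd

# Same values as the printed A and B, entered by hand
A = pd.DataFrame([[1, 6, 8], [4, 8, 6]])
B = pd.DataFrame([[5, 6, 5, 8], [3, 4, 9, 1], [5, 2, 4, 9]])

plain = A + B                     # unmatched cells become NaN
filled = A.add(B, fill_value=0)   # cells missing in A are treated as 0

nan_count = int(plain.isna().sum().sum())
print(nan_count)   # 6
print(filled)
```

Every cell of the result exists in B, so `filled` contains no NaN at all.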

9. Specify column labels with a list

Column labels can be passed as a list. `list("AB")` splits the string into one label per character, so each character becomes a column name.

    A = pd.DataFrame(np.random.randint(1, 10, (2, 2)), columns=list("AB"))
    B = pd.DataFrame(np.random.randint(1, 10, (3, 3)), columns=list("BAC"))
    print("A=", A, sep="\n")
    print("B=", B, sep="\n")

Output:

    A=
       A  B
    0  4  9
    1  2  4
    B=
       B  A  C
    0  1  1  1
    1  4  1  2
    2  1  8  7

10. Compute the mean

    # Mean over every cell of A, not per column
    print('Mean of A:', np.mean(A.values))

Output:

    Mean of A: 4.75
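Note that this is the mean over all cells. Calling `mean()` on the DataFrame itself gives one mean per column instead. A short sketch with the values of A shown above:

```python
import pandas as pd

# Same values as the printed A
A = pd.DataFrame([[4, 9], [2, 4]], columns=list("AB"))

overall = A.to_numpy().mean()   # one number over all four cells
per_column = A.mean()           # a Series with one mean per column

print(overall)      # 4.75
print(per_column)
```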

11. Fill missing values with the mean

    print(A.add(B))
    # Use the mean of A in place of a cell that is missing on one side, then add
    print(A.add(B, fill_value=np.mean(A.values)))

Output:

         A     B    C
    0  5.0  10.0  NaN
    1  3.0   8.0  NaN
    2  NaN   NaN  NaN
           A      B      C
    0   5.00  10.00   5.75
    1   3.00   8.00   6.75
    2  12.75   5.75  11.75

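12. Subtract row-wise

    # A is now a plain NumPy array, not a DataFrame
    A = np.random.randint(1, 10, (3, 4))
    print("A=", A, sep="\n")
    print(A - A[0])  # the first row is broadcast and subtracted from every row

Output:

    A=
    [[3 7 4 4]
     [8 3 7 6]
     [6 6 2 6]]
    [[ 0  0  0  0]
     [ 5 -4  3  2]
     [ 3 -1 -2  2]]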
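The subtraction above relies on NumPy broadcasting: a row of shape (4,) is stretched across the three rows. To subtract a column instead, it has to keep shape (3, 1). A small sketch with the same array values:

```python
import numpy as np

A = np.array([[3, 7, 4, 4],
              [8, 3, 7, 6],
              [6, 6, 2, 6]])

by_row = A - A[0]        # A[0] has shape (4,), broadcast down the rows
by_col = A - A[:, [0]]   # A[:, [0]] has shape (3, 1), broadcast across columns

print(by_row)
print(by_col)
```

Writing `A - A[:, 0]` would fail here, because a shape (3,) vector cannot be broadcast against shape (3, 4).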

13. Convert the ndarray to a DataFrame with column names

    df = pd.DataFrame(A, columns=list('QUWE'))
    print(df)

Output:

       Q  U  W  E
    0  3  7  4  4
    1  8  3  7  6
    2  6  6  2  6

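14. Subtract column-wise with DataFrame.subtract

    # axis=0 matches the Series to the row index, so column U is subtracted from every column
    print(df.subtract(df['U'], axis=0))

Output:

       Q  U  W  E
    0 -4  0 -3 -3
    1  5  0  4  3
    2  0  0 -4  0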
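For comparison, the `-` operator between a DataFrame and a Series works row-wise by default, matching the Series labels to the column names. The sketch below shows both directions on the same data:

```python
import pandas as pd

df = pd.DataFrame([[3, 7, 4, 4],
                   [8, 3, 7, 6],
                   [6, 6, 2, 6]], columns=list("QUWE"))

# Default: the Series labels (Q, U, W, E) are matched to the columns,
# so row 0 is subtracted from every row
by_row = df - df.iloc[0]

# axis=0: the Series labels (0, 1, 2) are matched to the row index,
# so column U is subtracted from every column
by_col = df.subtract(df["U"], axis=0)

print(by_row)
print(by_col)
```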

15. Implicit (positional) indexing

Take row 0 and select every second column (step 2).

    half = df.iloc[0, ::2]
    print(half)

Output:

    Q    3
    W    4
    Name: 0, dtype: int32
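Index alignment also applies when a partial row like `half` is used in arithmetic. A sketch of what happens when it is subtracted from the whole DataFrame, using the same values: columns that `half` does not contain come out as NaN.

```python
import pandas as pd

df = pd.DataFrame([[3, 7, 4, 4],
                   [8, 3, 7, 6],
                   [6, 6, 2, 6]], columns=list("QUWE"))

half = df.iloc[0, ::2]   # row 0, columns Q and W only

# half is matched to the columns by label; U and E have no partner
diff = df - half

print(diff["Q"].tolist())       # [0.0, 5.0, 3.0]
print(diff["W"].tolist())       # [0.0, 3.0, -2.0]
print(diff["U"].isna().all())   # True
```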