1. Create a Series
import pandas as pd
import numpy as np
ser = pd.Series(np.random.randint(1, 10, 4))
print(ser)
0 9
1 8
2 3
3 8
dtype: int32
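Because the values are drawn at random, a fresh run will not reproduce the numbers shown above. The sketch below assumes that seeding NumPy's global generator is acceptable for a tutorial; the seed value 0 and the names first and second are arbitrary choices, not part of the original notes.

```python
import numpy as np
import pandas as pd

# Seed NumPy's global random generator so the draw is repeatable.
# The seed value 0 is arbitrary and used only for illustration.
np.random.seed(0)
first = pd.Series(np.random.randint(1, 10, 4))

# Re-seeding with the same value replays the same sequence.
np.random.seed(0)
second = pd.Series(np.random.randint(1, 10, 4))

print(first.equals(second))        # both draws hold the same values
print(first.between(1, 9).all())   # randint's upper bound (10) is exclusive
```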
2. Compute the exponential (e**x)
print(np.exp(ser))
0 8103.083928
1 2980.957987
2 20.085537
3 2980.957987
dtype: float64
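The output above keeps the labels 0 to 3, which suggests that NumPy functions applied to a Series return a Series with the same index. A minimal sketch of that, using the values printed above; the string index is a hypothetical addition that makes the preserved labels easier to see.

```python
import numpy as np
import pandas as pd

# Values copied from the output above; the string labels are hypothetical.
ser = pd.Series([9, 8, 3, 8], index=["w", "x", "y", "z"])
result = np.exp(ser)

print(type(result).__name__)    # the result is still a pandas Series
print(result.index.tolist())    # labels are carried over unchanged
print(np.allclose(result.to_numpy(), np.exp(ser.to_numpy())))
```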
3. Create a DataFrame
df = pd.DataFrame(np.random.randint(1, 10, (2, 3)), columns=['a', 'b', 'c'])
print(df)
   a  b  c
0  5  1  2
1  8  9  2
4. Multiply, divide, and take the sine
print(np.sin(df*np.pi/4))
              a         b    c
0 -7.071068e-01  0.707107  1.0
1 -2.449294e-16  0.707107  1.0
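The entry -2.449294e-16 is sin(2π), which is mathematically 0. The sketch below reuses the values printed in step 3 to show why an exact comparison with 0 fails and how a tolerance-based check or rounding can be used instead; np.isclose with its default tolerances is an assumption about what counts as "close enough" here.

```python
import numpy as np
import pandas as pd

# Values copied from the DataFrame printed in step 3.
df = pd.DataFrame([[5, 1, 2], [8, 9, 2]], columns=["a", "b", "c"])
result = np.sin(df * np.pi / 4)

# sin(8 * pi / 4) = sin(2 * pi) is 0 in theory, but pi is not exactly
# representable as a float, so the computed value is only very close to 0.
print(result.loc[1, "a"] == 0)
print(np.isclose(result.loc[1, "a"], 0.0))

# Rounding is one way to tidy the display.
print(result.round(6))
```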
5. Create a random DataFrame with a given shape
A = pd.DataFrame(np.random.randint(1, 10, (3, 4)))
print(A)
   0  1  2  3
0  9  2  3  9
1  3  9  3  3
2  7  1  1  1
6. Index alignment
A = pd.Series([2, 4, 6], index=[0, 1, 2])
B = pd.Series([1, 3, 5], index=[1, 2, 3])
print(A)
print(B)
print(A+B) # labels present in only one Series produce NaN
0 2
1 4
2 6
dtype: int64
1 1
2 3
3 5
dtype: int64
0 NaN
1 5.0
2 9.0
3 NaN
dtype: float64
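The result above can be read as three separate effects: the index becomes the union of both indexes, unmatched labels become NaN, and the dtype changes to float64. A small sketch that checks each one; the name total is introduced here only for illustration.

```python
import pandas as pd

A = pd.Series([2, 4, 6], index=[0, 1, 2])
B = pd.Series([1, 3, 5], index=[1, 2, 3])
total = A + B

# The result is indexed by the union of both indexes.
print(total.index.equals(A.index.union(B.index)))

# Labels found in only one operand come out as NaN ...
print(total.isna().tolist())

# ... and because NaN is a float, the integer inputs are upcast to float64.
print(A.dtype, "->", total.dtype)
```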
7. Add Series with different indexes, filling missing labels
print(A.add(B, fill_value=0))
0 2.0
1 5.0
2 9.0
3 5.0
dtype: float64
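One way to picture fill_value=0 is as a reindex onto the union of both indexes followed by ordinary addition. The sketch below spells that out; it should match A.add(B, fill_value=0) in value as long as neither Series already contains NaN. The names labels, manual and with_fill are hypothetical.

```python
import pandas as pd

A = pd.Series([2, 4, 6], index=[0, 1, 2])
B = pd.Series([1, 3, 5], index=[1, 2, 3])

# Put both Series on the union of their indexes, using 0 for labels
# a Series lacks, then add as usual.
labels = A.index.union(B.index)
manual = A.reindex(labels, fill_value=0) + B.reindex(labels, fill_value=0)

with_fill = A.add(B, fill_value=0)
print(with_fill.tolist())
print(manual.tolist())
```

The values agree, although the manual version stays integer because no NaN is ever introduced, while the add call returns float64 as shown in the output above.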
8. Add DataFrames with different shapes
A = pd.DataFrame(np.random.randint(1, 10, (2, 3)))
B = pd.DataFrame(np.random.randint(1, 10, (3, 4)))
print("A=\n", A)
print("B=\n", B)
print(A+B)
A=
    0  1  2
0  1  6  8
1  4  8  6
B=
    0  1  2  3
0  5  6  5  8
1  3  4  9  1
2  5  2  4  9
     0     1     2   3
0  6.0  12.0  13.0 NaN
1  7.0  12.0  15.0 NaN
2  NaN   NaN   NaN NaN
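For DataFrames the alignment happens on rows and columns at once. A sketch with the values printed above, showing that the result takes the union shape and that only the overlapping cells receive a number; total is a hypothetical name.

```python
import pandas as pd

# Values copied from the printed A and B above.
A = pd.DataFrame([[1, 6, 8], [4, 8, 6]])
B = pd.DataFrame([[5, 6, 5, 8], [3, 4, 9, 1], [5, 2, 4, 9]])
total = A + B

# The result covers the union of row labels and the union of column labels.
print(A.shape, B.shape, total.shape)

# Only cells whose (row, column) label exists in both frames get a number.
print(total.notna().sum().sum())
```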
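9. Use a list as column labels
Column labels can be passed as a list; list("AB") splits the string into the separate labels 'A' and 'B'.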
A = pd.DataFrame(np.random.randint(1, 10, (2, 2)), columns=list("AB"))
B = pd.DataFrame(np.random.randint(1, 10, (3, 3)), columns=list("BAC"))
print("A=\n", A)
print("B=\n", B)
A=
    A  B
0  4  9
1  2  4
B=
    B  A  C
0  1  1  1
1  4  1  2
2  1  8  7
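Note that B lists its columns in the order B, A, C. The sketch below, built from the values printed above, illustrates that this order does not matter once the frames are combined, because columns are matched by label. The name total is introduced only for this example.

```python
import pandas as pd

# list() splits a string into one-character labels.
print(list("BAC"))

# Values copied from the printed A and B above.
A = pd.DataFrame([[4, 9], [2, 4]], columns=list("AB"))
B = pd.DataFrame([[1, 1, 1], [4, 1, 2], [1, 8, 7]], columns=list("BAC"))
total = A + B

# Columns are matched by name, not by position: A's "A" column (4, 2)
# is added to B's "A" column (1, 1), even though it is B's second column.
print(total.loc[0, "A"], total.loc[1, "A"])
```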
10. Compute the mean
print('Mean of A:', np.mean(A.values))
Mean of A: 4.75
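np.mean(A.values) collapses the whole table into a single number. For comparison, the sketch below also shows the per-column and per-row means that the DataFrame's own mean method returns; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Values copied from the printed A above.
A = pd.DataFrame([[4, 9], [2, 4]], columns=list("AB"))

overall = np.mean(A.values)    # one number for the whole table
per_column = A.mean()          # default axis=0: one mean per column
per_row = A.mean(axis=1)       # one mean per row

print(overall)
print(per_column.to_dict())
print(per_row.tolist())
```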
11. Fill missing values with the mean
print(A.add(B))
print(A.add(B, fill_value=np.mean(A.values))) # a cell missing from one frame is replaced by the mean of A before adding
     A     B   C
0  5.0  10.0 NaN
1  3.0   8.0 NaN
2  NaN   NaN NaN
       A      B      C
0   5.00  10.00   5.75
1   3.00   8.00   6.75
2  12.75   5.75  11.75
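In the example above every cell of the result exists in B, so fill_value removes all NaN. That is not guaranteed in general: a cell that is missing from both frames stays missing. A minimal sketch with two hypothetical frames, left and right, chosen so that each lacks one row and one column of the other.

```python
import pandas as pd

# Hypothetical frames: left lacks column "C" and row 2,
# right lacks column "A" and row 0.
left = pd.DataFrame([[1, 2], [3, 4]], index=[0, 1], columns=list("AB"))
right = pd.DataFrame([[10, 20], [30, 40]], index=[1, 2], columns=list("BC"))

total = left.add(right, fill_value=0)

# (1, "B") exists in both frames: ordinary addition, 4 + 10.
print(total.loc[1, "B"])
# (0, "A") exists only in left: right's side is treated as 0.
print(total.loc[0, "A"])
# (0, "C") exists in neither frame, so there is nothing to fill.
print(total.loc[0, "C"])
```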
12. Subtract the first row from every row
A = np.random.randint(1, 10, (3, 4))
print("A=\n", A)
print(A-A[0])
A=
 [[3 7 4 4]
 [8 3 7 6]
 [6 6 2 6]]
[[ 0  0  0  0]
 [ 5 -4  3  2]
 [ 3 -1 -2  2]]
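A - A[0] works through NumPy broadcasting: the one-dimensional row is stretched to match every row of A. The sketch below repeats that with the values printed above and adds the column-wise counterpart, which needs the column kept as a (3, 1) array; by_row and by_col are hypothetical names.

```python
import numpy as np

# Values copied from the printed array above.
A = np.array([[3, 7, 4, 4],
              [8, 3, 7, 6],
              [6, 6, 2, 6]])

# A has shape (3, 4) and A[0] has shape (4,): broadcasting repeats the
# row down the first axis, so it is subtracted from every row.
by_row = A - A[0]

# To subtract a column instead, keep it two-dimensional with shape (3, 1)
# so that it is repeated across the second axis.
by_col = A - A[:, [1]]

print(by_row)
print(by_col)
```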
13. Convert an ndarray to a DataFrame with named columns
df = pd.DataFrame(A, columns=list('QUWE'))
print(df)
   Q  U  W  E
0  3  7  4  4
1  8  3  7  6
2  6  6  2  6
14. Subtract column-wise with the DataFrame's subtract method
print(df.subtract(df['U'], axis=0))
   Q  U  W  E
0 -4  0 -3 -3
1  5  0  4  3
2  0  0 -4  0
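The axis argument decides which labels the Series is matched against. The sketch below contrasts the default row-wise behaviour of the - operator with subtract(..., axis=0), using the values printed above; row_wise and col_wise are hypothetical names.

```python
import pandas as pd

# Values copied from the printed df above.
df = pd.DataFrame([[3, 7, 4, 4], [8, 3, 7, 6], [6, 6, 2, 6]],
                  columns=list("QUWE"))

# Default behaviour: the Series index is matched against the column labels,
# so row 0 is subtracted from every row.
row_wise = df - df.iloc[0]

# axis=0: the Series index is matched against the row labels,
# so column "U" is subtracted from every column.
col_wise = df.subtract(df["U"], axis=0)

print(row_wise)
print(col_wise)
```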
15. Implicit (positional) indexing
Take row 0 and every second column (step 2).
half = df.iloc[0, ::2]
print(half)
Q 3
W 4
Name: 0, dtype: int32
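Although half was selected by position, it keeps the column labels Q and W as its index. A sketch of what that implies when half is subtracted from df, assuming the same values as above: alignment is by label, so the columns without a partner become NaN. The name diff is hypothetical.

```python
import pandas as pd

# Values copied from the printed df above.
df = pd.DataFrame([[3, 7, 4, 4], [8, 3, 7, 6], [6, 6, 2, 6]],
                  columns=list("QUWE"))

half = df.iloc[0, ::2]    # positions 0 and 2 of row 0 -> labels "Q" and "W"
print(half.index.tolist())

# Subtracting the partial row aligns on column labels: "Q" and "W" are
# computed, while "U" and "E" have no partner in half and become NaN.
diff = df - half
print(diff)
```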