Students15.xlsx

1.追加

  • 追加,把一张数据表的内容追加到另一张数据表的后面 这种操作叫union
  1. import pandas as pd
  2. page_001 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_001')
  3. page_002 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_002')
  4. students = page_001.append(page_002).reset_index(drop=True)
  5. # reset_index()重新设置index drop=True用来放弃原来的index
  6. print(students)
  • pd.concat()传入一个列表

  • concat将两张表串联起来默认从上到下0

  • 可以将数据根据不同的轴作简单的融合

  • 参数说明
    objs: series,dataframe或者是panel构成的序列list
    axis: 需要合并链接的轴,0是行,1是列
    join:连接的方式 inner,或者outer,join=’outer’为预设值,因此未设定任何参数时,函数默认join=’outer’。此方式是依照column来做纵向合并,有相同的column上下合并在一起,其他独自的column个自成列,原本没有值的位置皆以NaN填充。
    join = ‘innner’表示只有相同的column合并在一起,其他的会被抛弃。

  1. import pandas as pd
  2. page_001 = pd.read_excel('data/028/Students16.xlsx', sheet_name='Page_001')
  3. page_002 = pd.read_excel('data/028/Students16.xlsx', sheet_name='Page_002')
  4. """
  5. concat
  6. """
  7. students = pd.concat([page_001, page_002]).reset_index(drop=True)
  8. print(students)

2.手动创建一行的数据

  1. import pandas as pd
  2. page_001 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_001')
  3. page_002 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_002')
  4. students = page_001.append(page_002).reset_index(drop=True)
  5. stu1 = pd.Series({'ID': 41, 'Name': 'Denve', 'Score': 100})
  6. students = students.append(stu1, ignore_index=True)
  7. print(students)

3.直接修改表中的数值

  • 直接修改———students.at[39, ‘Name’] = ‘Baby’

  • 替换单元格——students.iloc[39] = stu2

  1. import pandas as pd
  2. page_001 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_001')
  3. page_002 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_002')
  4. students = page_001.append(page_002).reset_index(drop=True)
  5. # students.at[39, 'Name'] = 'Baby'
  6. # students.at[39, 'Score'] = '120'
  7. stu2 = pd.Series({'ID': 40, 'Name': 'Baby', 'Score': 120})
  8. students.iloc[39] = stu2
  9. print(students)

4.两行之中新加一行

要求:在id为20和21中间插入一行 index=19和20 利用切片

  1. import pandas as pd
  2. page_001 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_001')
  3. page_002 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_002')
  4. students = page_001.append(page_002).reset_index(drop=True)
  5. stu3 = pd.Series({'ID': 101, 'Name': 'Mike', 'Score': 120})
  6. part1 = students[:20]
  7. part2 = students[20:]
  8. students = part1.append(stu3, ignore_index=True).append(part2).reset_index(drop=True)
  9. print(students)

5.删除数据行

  • students.drop(index=[0, 1, 2], inplace=True)

  • students.drop(index=range(0, 10), inplace=True)

  • students.drop(index=students[0:10].index, inplace=True)

  1. import pandas as pd
  2. page_001 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_001')
  3. page_002 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_002')
  4. students = page_001.append(page_002).reset_index(drop=True)
  5. students.drop(index=[0, 1, 2], inplace=True)
  6. # students.drop(index=range(0, 10), inplace=True)
  7. # students.drop(index=students[0:10].index, inplace=True)
  8. print(students)

6.删除空数据行

  1. import pandas as pd
  2. page_001 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_001')
  3. page_002 = pd.read_excel('tmp1\Students15.xlsx', sheet_name='Page_002')
  4. students = page_001.append(page_002).reset_index(drop=True)
  5. """
  6. 创建空数据行
  7. """
  8. for i in range(5, 15):
  9. students['Name'].at[i] = ''
  10. """
  11. 删除空数据行
  12. """
  13. missing = students.loc[students['Name'] == '']
  14. students.drop(index=missing.index, inplace=True)
  15. students = students.reset_index(drop=True)
  16. print(students)