1.自动填充
数据表:
首先,导入数据集:
import pandas as pdbooks = pd.read_excel('tmp\Books.xlsx', skiprows=3, usecols='C:F')# 因为原数据集前三行是空的,所以我们使用skiprows=3 来跳过前三行,# 数据集的列是从C-F,所以我们用usecols=‘C:F'来取到正确的列
- 如何让其自动生成ID?
# 通过for循环books = pd.read_excel('tmp\Books.xlsx', skiprows=3, usecols='C:F', dtype={'ID': str})for i in books.index:books['ID'].at[i] = i + 1# books.at[i, 'ID'] = i + 1print(books)"""ID Name InStore Date0 1 Book_001 NaN NaN1 2 Book_002 NaN NaN2 3 Book_003 NaN NaN3 4 Book_004 NaN NaN4 5 Book_005 NaN NaN5 6 Book_006 NaN NaN6 7 Book_007 NaN NaN7 8 Book_008 NaN NaN8 9 Book_009 NaN NaN9 10 Book_010 NaN NaN10 11 Book_011 NaN NaN11 12 Book_012 NaN NaN12 13 Book_013 NaN NaN13 14 Book_014 NaN NaN14 15 Book_015 NaN NaN15 16 Book_016 NaN NaN16 17 Book_017 NaN NaN17 18 Book_018 NaN NaN18 19 Book_019 NaN NaN19 20 Book_020 NaN NaN"""
- 规定索引可以除2的InStore为’Yes’,否则为’No’
books = pd.read_excel('tmp\Books.xlsx', skiprows=3, usecols='C:F', dtype={'InStore': str})for i in books.index:books.at[i, 'InStore'] = 'Yes' if i % 2 == 0 else 'No'print(books)"""ID Name InStore Date0 NaN Book_001 Yes NaN1 NaN Book_002 No NaN2 NaN Book_003 Yes NaN3 NaN Book_004 No NaN4 NaN Book_005 Yes NaN5 NaN Book_006 No NaN6 NaN Book_007 Yes NaN7 NaN Book_008 No NaN8 NaN Book_009 Yes NaN9 NaN Book_010 No NaN10 NaN Book_011 Yes NaN11 NaN Book_012 No NaN12 NaN Book_013 Yes NaN13 NaN Book_014 No NaN14 NaN Book_015 Yes NaN15 NaN Book_016 No NaN16 NaN Book_017 Yes NaN17 NaN Book_018 No NaN18 NaN Book_019 Yes NaN19 NaN Book_020 No NaN"""
2.函数填充
Books (1).xlsx
首先,导入数据集:
import pandas as pdbooks = pd.read_excel('tmp\Books (1).xlsx', index_col='ID')# index_col = 'ID'设置index以ID列
1.求出价格Price(Price = ListPrice * Discount )
books['Price'] = books['ListPrice'] * books['Discount']print(books)"""Name ListPrice Discount PriceID1 Book_001 10 0.5 5.02 Book_002 20 0.5 10.03 Book_003 30 0.5 15.04 Book_004 40 0.5 20.05 Book_005 50 0.5 25.06 Book_006 60 0.5 30.07 Book_007 70 0.5 35.08 Book_008 80 0.5 40.09 Book_009 90 0.5 45.010 Book_010 100 0.5 50.011 Book_011 110 0.5 55.012 Book_012 120 0.5 60.013 Book_013 130 0.5 65.014 Book_014 140 0.5 70.015 Book_015 150 0.5 75.016 Book_016 160 0.5 80.017 Book_017 170 0.5 85.018 Book_018 180 0.5 90.019 Book_019 190 0.5 95.020 Book_020 200 0.5 100.0"""
2.只计算5-15这一段的价格
books = pd.read_excel('tmp\Books (1).xlsx', index_col='ID')for i in range(5, 16):books['Price'].at[i] = books['ListPrice'].at[i] * books['Discount'].at[i]print(books)"""Name ListPrice Discount PriceID1 Book_001 10 0.5 NaN2 Book_002 20 0.5 NaN3 Book_003 30 0.5 NaN4 Book_004 40 0.5 NaN5 Book_005 50 0.5 25.06 Book_006 60 0.5 30.07 Book_007 70 0.5 35.08 Book_008 80 0.5 40.09 Book_009 90 0.5 45.010 Book_010 100 0.5 50.011 Book_011 110 0.5 55.012 Book_012 120 0.5 60.013 Book_013 130 0.5 65.014 Book_014 140 0.5 70.015 Book_015 150 0.5 75.016 Book_016 160 0.5 NaN17 Book_017 170 0.5 NaN18 Book_018 180 0.5 NaN19 Book_019 190 0.5 NaN20 Book_020 200 0.5 NaN"""
3.把每本书都涨价2元
import pandas as pdbooks = pd.read_excel('tmp\Books (1).xlsx', index_col='ID')# books['ListPrice'] += 2books['ListPrice'] = books['ListPrice'].apply(lambda x: x + 2)print(books)# df.apply(f, axis=0) 将f函数应用到由各行各列所形成的一维数组上"""Name ListPrice Discount PriceID1 Book_001 12 0.5 NaN2 Book_002 22 0.5 NaN3 Book_003 32 0.5 NaN4 Book_004 42 0.5 NaN5 Book_005 52 0.5 NaN6 Book_006 62 0.5 NaN7 Book_007 72 0.5 NaN8 Book_008 82 0.5 NaN9 Book_009 92 0.5 NaN10 Book_010 102 0.5 NaN11 Book_011 112 0.5 NaN12 Book_012 122 0.5 NaN13 Book_013 132 0.5 NaN14 Book_014 142 0.5 NaN15 Book_015 152 0.5 NaN16 Book_016 162 0.5 NaN17 Book_017 172 0.5 NaN18 Book_018 182 0.5 NaN19 Book_019 192 0.5 NaN20 Book_020 202 0.5 NaN"""
