Introduction to Python

Source: Geek Time (极客时间), "Learn Python from Zero" (0基础学python)

Course outline:

Python from Zero - Figure 1

Introduction to Python

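About Anaconda

Official site: https://www.anaconda.com/
Getting started with Anaconda: https://docs.anaconda.com/anaconda/user-guide/getting-started/
Installing on Linux: https://docs.anaconda.com/anaconda/install/linux/

You may already have Python installed, so why would you also need Anaconda? There are three reasons:
1) A bundled set of data science packages
Anaconda ships with conda, Python, and more than 150 scientific packages together with their dependencies, so you can start working with data right away.
2) Package management
Anaconda is built on top of conda, which is both a package manager and an environment manager.
Data analysis relies on many third-party packages, and conda handles installing, uninstalling, and updating them on your machine.
3) Environment management
Why do environments need managing?
Suppose project A uses Python 2 while your lead requires Python 3 for the new project B. Installing both versions side by side in a single setup can cause a lot of confusion and errors. conda lets you create a separate runtime environment for each project.

Projects also often depend on different versions of the same package, such as pandas or NumPy. Two versions of NumPy cannot be installed in the same environment at the same time, so the usual approach is to create one environment per NumPy version and work on each project inside its matching environment. conda handles this as well.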
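When several environments exist, it helps to confirm which interpreter a script is actually running under. The following is a minimal sketch that uses only the standard library; the helper name describe_interpreter is made up for this illustration and is not part of conda or Anaconda.

```python
import sys


def describe_interpreter():
    """Return the path, version and prefix of the Python that is running this code."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return {
        "executable": sys.executable,  # full path of the interpreter binary
        "version": version,            # e.g. "3.11.4"
        "prefix": sys.prefix,          # root directory of the active environment
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running this inside two different environments should print two different executable paths, which is a quick way to check that the intended environment is active.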

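Some handy sites for learning Python

1. Official documentation: https://docs.python.org/zh-cn/3/
2. IPython: https://ipython.org/
IPython is an interactive shell for Python that is considerably more convenient than the default Python shell: it supports tab completion of variable names, automatic indentation, and shell commands, and it comes with many useful built-in features and functions. Learning IPython lets us use Python more efficiently. It is also a very good platform for scientific computing and interactive visualization in Python.
IPython provides two main components:

1) A powerful interactive Python shell
2) A Jupyter kernel used by Jupyter notebooks (formerly IPython notebook)

3. Jupyter: https://jupyter.org/
Write and debug Python in a web page
4. PyCharm: https://www.jetbrains.com/pycharm/download/
An integrated development environment (IDE)
5. pip: https://pypi.org/project/pip/
Installs packages and their dependencies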

A web crawler mind map from a colleague

spiderman (1).xmind

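Comprehensions in Python

Comprehensions are one of Python's distinctive features (similar constructs exist in some other languages). A comprehension is an expression that builds a new collection from an existing iterable. There are three kinds, available in both Python 2 (2.7 for the dict and set forms) and Python 3:

List comprehensions
Dict comprehensions
Set comprehensions

List comprehensions

The basic format is:
[expression for variable in iterable] or [expression for variable in iterable if condition]
There are two common forms:

* [x for x in data if condition]

Here the if acts as a filter: only the items of data that satisfy the condition are kept, and the results are collected into a list.

* [exp1 if condition else exp2 for x in data]

Here if…else chooses a value: an item of data that satisfies the condition is processed as exp1, otherwise as exp2, and the results are collected into a list. No items are dropped.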
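The difference between the two forms is easiest to see side by side. The sketch below uses a small made-up list; the variable names and the threshold 4 are chosen only for illustration.

```python
data = [3, 8, 1, 10, 5]

# Form 1: the trailing `if` filters, so the result can be shorter than data.
large = [x for x in data if x > 4]

# Form 2: the conditional expression maps every item, so the length is unchanged.
labels = ["big" if x > 4 else "small" for x in data]

print(large)   # [8, 10, 5]
print(labels)  # ['small', 'big', 'small', 'big', 'big']
```

A filter goes after the for clause, while a conditional expression goes before it; the two can also be combined in one comprehension.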

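To make this concrete, consider the following pattern:

variable = [out_exp_res for out_exp in input_list if out_exp == 2]

out_exp_res: the expression that produces each element of the list; it can be a function call that returns a value.
for out_exp in input_list: iterates over input_list and passes each out_exp into the out_exp_res expression.
if out_exp == 2: filters which values are kept, based on the condition.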

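Here are a few more examples:

multiples = [i for i in range(30) if i % 3 == 0]
print(multiples)
Output: [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]

def squared(x):
    return x * x

multiples = [squared(i) for i in range(30) if i % 3 == 0]
print(multiples)
Output: [0, 9, 36, 81, 144, 225, 324, 441, 576, 729]

data = ['driver', '2017-07-13', 1827.0, 2058.0, 978.0, 1636.0, 1863.0, 2537.0, 1061.0]

(1) To pick out the values greater than 2000 from this list, form (1) of the list comprehension looks like the natural choice:
[x for x in data if x > 2000]
In Python 2, where a string always compares as greater than a number, this gives:
['driver', '2017-07-13', 2058.0, 2537.0]
In Python 3, comparing a str with a number raises TypeError, so the non-numeric items have to be skipped before comparing.

(2) To handle the mixed types mentioned above, for example converting each float to an int while leaving the strings untouched, use form (2):
[int(x) if type(x) == float else x for x in data]
Result: ['driver', '2017-07-13', 1827, 2058, 978, 1636, 1863, 2537, 1061]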
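In Python 3, comparing a str with a number raises TypeError, so filtering a mixed list by value needs a type check first. One possible way to do this, sketched with the same data list, is to test the type with isinstance before comparing; the name over_2000 is chosen only for this example.

```python
data = ['driver', '2017-07-13', 1827.0, 2058.0, 978.0,
        1636.0, 1863.0, 2537.0, 1061.0]

# Check the type first; `and` short-circuits, so strings never reach the comparison.
over_2000 = [x for x in data if isinstance(x, (int, float)) and x > 2000]

print(over_2000)  # [2058.0, 2537.0]
```

Because the type check runs first, the string items are skipped instead of triggering an error.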

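Two examples are not enough to build real understanding, so let's practice by typing some code ourselves.
Example 1: drop the strings whose length is 3 or less from a list, and convert the remaining ones to uppercase:

>>> names = ['Bob','Tom','alice','Jerry','Wendy','Smith']
>>> new_names = [name.upper() for name in names if len(name) > 3]
>>> print(new_names)
['ALICE', 'JERRY', 'WENDY', 'SMITH']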

Example 2: generate a list of times at 5-minute intervals:

>>> time_list = ['%.2d:%.2d' % (h, m) for h in range(24) for m in range(0, 60, 5)]
>>> print(time_list)
['00:00', '00:05', '00:10', '00:15', '00:20', '00:25', '00:30', '00:35', '00:40', '00:45', '00:50', '00:55', '01:00', '01:05', '01:10', '01:15', '01:20', '01:25', '01:30', '01:35', '01:40', '01:45', '01:50', '01:55', '02:00', '02:05', '02:10', '02:15', '02:20', '02:25', '02:30', '02:35', '02:40', '02:45', '02:50', '02:55', '03:00', '03:05', '03:10', '03:15', '03:20', '03:25', '03:30', '03:35', '03:40', '03:45', '03:50', '03:55', '04:00', '04:05', '04:10', '04:15', '04:20', '04:25', '04:30', '04:35', '04:40', '04:45', '04:50', '04:55', '05:00', '05:05', '05:10', '05:15', '05:20', '05:25', '05:30', '05:35', '05:40', '05:45', '05:50', '05:55', '06:00', '06:05', '06:10', '06:15', '06:20', '06:25', '06:30', '06:35', '06:40', '06:45', '06:50', '06:55', '07:00', '07:05', '07:10', '07:15', '07:20', '07:25', '07:30', '07:35', '07:40', '07:45', '07:50', '07:55', '08:00', '08:05', '08:10', '08:15', '08:20', '08:25', '08:30', '08:35', '08:40', '08:45', '08:50', '08:55', '09:00', '09:05', '09:10', '09:15', '09:20', '09:25', '09:30', '09:35', '09:40', '09:45', '09:50', '09:55', '10:00', '10:05', '10:10', '10:15', '10:20', '10:25', '10:30', '10:35', '10:40', '10:45', '10:50', '10:55', '11:00', '11:05', '11:10', '11:15', '11:20', '11:25', '11:30', '11:35', '11:40', '11:45', '11:50', '11:55', '12:00', '12:05', '12:10', '12:15', '12:20', '12:25', '12:30', '12:35', '12:40', '12:45', '12:50', '12:55', '13:00', '13:05', '13:10', '13:15', '13:20', '13:25', '13:30', '13:35', '13:40', '13:45', '13:50', '13:55', '14:00', '14:05', '14:10', '14:15', '14:20', '14:25', '14:30', '14:35', '14:40', '14:45', '14:50', '14:55', '15:00', '15:05', '15:10', '15:15', '15:20', '15:25', '15:30', '15:35', '15:40', '15:45', '15:50', '15:55', '16:00', '16:05', '16:10', '16:15', '16:20', '16:25', '16:30', '16:35', '16:40', '16:45', '16:50', '16:55', '17:00', '17:05', '17:10', '17:15', '17:20', '17:25', '17:30', '17:35', '17:40', '17:45', '17:50', '17:55', '18:00', '18:05', '18:10', '18:15', '18:20', '18:25', 
'18:30', '18:35', '18:40', '18:45', '18:50', '18:55', '19:00', '19:05', '19:10', '19:15', '19:20', '19:25', '19:30', '19:35', '19:40', '19:45', '19:50', '19:55', '20:00', '20:05', '20:10', '20:15', '20:20', '20:25', '20:30', '20:35', '20:40', '20:45', '20:50', '20:55', '21:00', '21:05', '21:10', '21:15', '21:20', '21:25', '21:30', '21:35', '21:40', '21:45', '21:50', '21:55', '22:00', '22:05', '22:10', '22:15', '22:20', '22:25', '22:30', '22:35', '22:40', '22:45', '22:50', '22:55', '23:00', '23:05', '23:10', '23:15', '23:20', '23:25', '23:30', '23:35', '23:40', '23:45', '23:50', '23:55']

Example 3: build a list of (x, y) tuples where x is an even number and y is an odd number, both taken from range(5), i.e. 0 to 4:

pairs = [(x, y) for x in range(5) if x % 2 == 0 for y in range(5) if y % 2 == 1]
print(pairs)
D:\anaconda\python.exe D:/bilibili大学/简书代码/推导式.py
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
Process finished with exit code 0

Example 4: build the list [3, 6, 9], the last column of M:

M = [[1,2,3],[4,5,6],[7,8,9]]
list_1 = [row[2] for row in M]
print(list_1)
Output: [3, 6, 9]

Example 5: build the list [1, 5, 9], the main diagonal of M:

M = [[1,2,3],[4,5,6],[7,8,9]]
list_1 = [M[x][x] for x in range(len(M))]
print(list_1)
Output: [1, 5, 9]

Example 6: compute the element-wise products of the matrices M and N:

M = [[1,2,3],[4,5,6],[7,8,9]]
N = [[2,2,2],[3,3,3],[4,4,4]]
products = [M[row][col] * N[row][col] for row in range(3) for col in range(3)]
print(products)
Output: [2, 4, 6, 12, 15, 18, 28, 32, 36]
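The element-wise product above comes out as one flat list. If the 3x3 shape should be preserved, a nested comprehension with zip is one option. This is only a sketch of an alternative, not the approach used in the example above; the name product is an assumption for illustration.

```python
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
N = [[2, 2, 2], [3, 3, 3], [4, 4, 4]]

# The outer comprehension walks matching rows of M and N;
# the inner comprehension multiplies matching cells within those rows.
product = [[m * n for m, n in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]

print(product)  # [[2, 4, 6], [12, 15, 18], [28, 32, 36]]
```

Using zip avoids hard-coding the size 3, so the same line works for any two matrices of equal shape.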

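Using () to create a generator: changing the [] of a list comprehension to () gives a generator expression.

multiples = (i for i in range(30) if i % 3 == 0)
print(type(multiples))
Output: <class 'generator'> (Python 2 prints <type 'generator'>)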
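Unlike a list, a generator produces its values lazily and can be consumed only once. The short sketch below illustrates this with the same expression; the variable names first, second, remaining, and leftover are chosen only for this example.

```python
multiples = (i for i in range(30) if i % 3 == 0)

first = next(multiples)       # values are computed one at a time, on demand
second = next(multiples)
remaining = list(multiples)   # consumes everything that is left
leftover = list(multiples)    # the generator is now exhausted

print(first, second)  # 0 3
print(remaining)      # [6, 9, 12, 15, 18, 21, 24, 27]
print(leftover)       # []
```

This makes generators a good fit for large or one-pass data, and a poor fit when the results need to be traversed more than once.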

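Dict comprehensions

Let's start with the basic template for a dict comprehension:

{ key:value for key,value in existing_data_structure }

This differs a little from the list form because a dict entry has two parts, a key and a value. Otherwise it works much the same way, and the expression part can now operate on both the key and the value.
Here are the most common uses.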
Example 1: a dict comprehension combined with enumerate:

strings = ['import','is','with','if','file','exception','shim','lucy']
word_to_index = {k: v for v, k in enumerate(strings)}
print(word_to_index)
Output: {'import': 0, 'is': 1, 'with': 2, 'if': 3, 'file': 4, 'exception': 5, 'shim': 6, 'lucy': 7}

Building on this example: above, k is the string and v is its index. What happens if we swap k and v in the for clause?

strings = ['import','is','with','if','file','exception','shim','lucy']
index_to_word = {k: v for k, v in enumerate(strings)}
print(index_to_word)
Output: {0: 'import', 1: 'is', 2: 'with', 3: 'if', 4: 'file', 5: 'exception', 6: 'shim', 7: 'lucy'}

Clearly, the order of k and v in the for clause determines which one becomes the key and which becomes the value.
For the enumerate() function, see: https://www.runoob.com/python/python-func-enumerate.html
Example 2: swap the keys and values of a dict:

person = {'character': 'Miyamoto Musashi', 'role': 'assassin'}
person_reverse = {k: v for v, k in person.items()}
# person_reverse = {v: k for k, v in person.items()}  # equivalent
print(person_reverse)
Output: {'Miyamoto Musashi': 'character', 'assassin': 'role'}

Example 3: the keys of the source data mix uppercase and lowercase letters, and we want the sum of the values for each letter regardless of case:

nums = {'a':10,'b':20,'A':5,'B':3,'d':4}
num_frequency = {k.lower(): nums.get(k.lower(), 0) + nums.get(k.upper(), 0)
                 for k in nums.keys()}
# nums is a dict; nums.get(k.lower(), 0) looks up the lowercase key in nums and
# returns its value, or the default 0 if the key is missing; nums.get(k.upper(), 0) does the same for the uppercase key
print(num_frequency)
Output: {'a': 15, 'b': 23, 'd': 4}

Example 4: we have a list of fruits and want the length of each fruit's name:

fruits = ['apple','orange','banana','mango','peach']
fruits_dict = {fruit: len(fruit) for fruit in fruits}
print(fruits_dict)
Output: {'apple': 5, 'orange': 6, 'banana': 6, 'mango': 5, 'peach': 5}
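Dict comprehensions accept the same trailing if filter as list comprehensions, and they pair naturally with zip when keys and values live in two separate lists. The sketch below shows both; the prices list is made-up data used purely for illustration.

```python
fruits = ['apple', 'orange', 'banana', 'mango', 'peach']
prices = [3.0, 2.5, 1.2, 4.0, 3.5]  # hypothetical prices, for illustration only

# Pair two parallel lists into one dict.
price_by_fruit = {fruit: price for fruit, price in zip(fruits, prices)}

# Keep only the entries that pass a condition.
long_names = {fruit: len(fruit) for fruit in fruits if len(fruit) > 5}

print(price_by_fruit['mango'])  # 4.0
print(long_names)               # {'orange': 6, 'banana': 6}
```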

Set comprehensions

Let's start with the basic template for a set comprehension:

{ expression for item in Sequence if conditional }

A set comprehension looks much like a list comprehension, but since the result is a set, we usually use it together with the properties specific to sets.
If you are not yet familiar with the set data structure, this reference is recommended: https://segmentfault.com/a/1190000018109634?_ea=7068836
Example 1: first, an example that relies on the uniqueness of set elements. We have a list called names that stores names; the data is messy, with uppercase, lowercase, and duplicate entries. We want to remove the duplicates and normalize every name to an initial capital, which a set comprehension does directly:

names = [ 'Bob', 'JOHN', 'alice', 'bob', 'ALICE', 'James', 'Bob','JAMES','jAMeS' ]
new_names = {n[0].upper() + n[1:].lower() for n in names}
print(new_names)
Output (sets are unordered, so the order may differ): {'Bob', 'James', 'John', 'Alice'}
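Since a set has no defined order, printing it directly can give a different order from run to run. A small variation, sketched below, uses the built-in str.capitalize() for the normalization and sorted() to get a stable list; the name unique_names is chosen only for this example.

```python
names = ['Bob', 'JOHN', 'alice', 'bob', 'ALICE', 'James', 'Bob', 'JAMES', 'jAMeS']

# capitalize() uppercases the first character and lowercases the rest;
# the set removes duplicates, and sorted() turns it into an ordered list.
unique_names = sorted({n.capitalize() for n in names})

print(unique_names)  # ['Alice', 'Bob', 'James', 'John']
```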

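File operations in Python

The notion of a file in Python is fairly broad.
It can be a file on disk, a network request, or the exchange between two programs. In short, a file is a stream of data.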
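One way to see the "stream of data" idea in practice is io.StringIO from the standard library, which behaves like a text file but lives entirely in memory. The sketch below is illustrative only; count_lines is a hypothetical helper that accepts any object that can be iterated line by line.

```python
import io


def count_lines(stream):
    """Count the lines of any file-like object that yields lines when iterated."""
    return sum(1 for _ in stream)


# An in-memory text stream that supports the same methods as an open text file.
memory_file = io.StringIO("first line\nsecond line\nthird line\n")

total = count_lines(memory_file)
memory_file.seek(0)                 # rewind, exactly as with a file on disk
first_line = memory_file.readline()

print(total)             # 3
print(repr(first_line))  # 'first line\n'
```

The same count_lines function would work unchanged on an object returned by open(), which is the practical benefit of treating files as streams.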

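Trying out the built-in file functions

# Trying out the built-in file functions
# Write (mode 'w' creates the file or overwrites an existing one)
file1 = open('test_file.txt', 'w')
file1.write('hello world')
file1.close()
# Read (the default mode is 'r')
file2 = open('test_file.txt')
print(file2.read())
file2.close()
# Append (mode 'a' adds to the end instead of overwriting)
file3 = open('test_file.txt', 'a')
file3.write('appended content')
file3.close()

# Working with a file (assumes test_file2.txt already exists and has several lines)
file4 = open('test_file2.txt')
print(file4.readline())  # read a single line
file4.close()
# Process the file line by line
file5 = open('test_file2.txt')
for line in file5.readlines():
    print(line)
    print('#############')
file5.close()
# After reading, move back to the start of the file
file6 = open('test_file2.txt')
print(file6.tell())   # report the current position of the file pointer
print(file6.read(3))  # reading 3 characters moves the file pointer forward
print(file6.tell())   # report the position again
file6.seek(0)         # jump back to the beginning
print(file6.tell())
# seek(offset, whence): the first argument is the offset; the second is 0 to offset
# from the start, 1 from the current position, 2 from the end of the file
# (in text mode only seeks relative to the start are allowed, apart from seek(0, 2))
print(file6.read(10))
file6.seek(5, 0)
print(file6.tell())
file6.close()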
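Calling close() by hand is easy to forget, especially when an error interrupts the code in between. A common alternative is the with statement, which closes the file automatically when the block ends. The sketch below writes into a temporary directory so that it does not touch real files; the file name demo.txt is hypothetical.

```python
import os
import tempfile

# A temporary directory keeps the sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:   # write, overwriting
        f.write("hello world\n")

    with open(path, "a", encoding="utf-8") as f:   # append
        f.write("appended line\n")

    with open(path, encoding="utf-8") as f:        # read line by line
        lines = [line.rstrip("\n") for line in f]

print(lines)     # ['hello world', 'appended line']
print(f.closed)  # True: the file was closed when the with block ended
```

Passing encoding explicitly is optional, but it avoids depending on the platform default when the text contains non-ASCII characters.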

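Exception handling in Python

# A single exception
try:
    year = int(input('please:'))
except ValueError:
    print('Please enter a number')
# Several exceptions in one except clause
try:
    year = int(input('please:'))
except (ValueError, AttributeError, KeyError):
    print('Please enter a number')
# Print the error message
try:
    print(1/0)
except ZeroDivisionError as e:
    print('Cannot divide by zero: %s' % e)
# Catch any exception derived from Exception (here 1/'a' raises TypeError)
try:
    print(1/'a')
except Exception as e:
    print('%s' % e)
# Raise an error manually with raise, using a custom message
try:
    raise NameError('custom error')
except NameError as e:
    print('My custom error: %s' % e)
# Using finally
a = None
try:
    a = open('test_file1.txt')
except Exception as e:
    print('%s' % e)
finally:
    print('Releasing resources')
    # a stays None if open() failed, so only close a file that was actually opened
    if a is not None:
        a.close()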
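The pieces above can be combined: try also accepts an else clause, which runs only when no exception occurred, and a custom exception class can make errors easier to tell apart. The following is a minimal sketch under those assumptions; InvalidYearError, parse_year, and the attempts list are hypothetical names invented for this example.

```python
class InvalidYearError(ValueError):
    """Hypothetical custom exception used only in this sketch."""


attempts = []  # records every input that was processed


def parse_year(text):
    try:
        year = int(text)
    except ValueError as e:
        # Re-raise as our own type, keeping the original error as the cause.
        raise InvalidYearError(f"not a number: {text!r}") from e
    else:
        # Runs only when the try block raised nothing.
        if year < 0:
            raise InvalidYearError(f"year must not be negative: {year}")
        return year
    finally:
        # Runs on every path, whether an exception was raised or not.
        attempts.append(text)


print(parse_year("2024"))  # 2024
try:
    parse_year("abc")
except InvalidYearError as e:
    print(e)               # not a number: 'abc'
print(attempts)            # ['2024', 'abc']
```

Because InvalidYearError subclasses ValueError, existing code that catches ValueError would still handle it.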