Python official documentation

Python standard library

Common Python snippets

Notes:

Cython

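The Cython project is built around a compiler that translates Python code into equivalent C code. The generated C code is then compiled into an extension module that the Python interpreter loads and runs. This compilation mechanism makes it possible to embed efficiency-boosting C code in Python code. Cython can be regarded as a programming language in its own right, one that merges the two languages. Plenty of related documentation is available online.
We recommend visiting the link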

Jython

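As a counterpart to Cython, there is Jython, which is developed in and compiled with Java. It was created by Jim Hugunin in 1997 (http://www.jython.org). Jython is an implementation of the Python programming language in Java. More specifically, its extensions and packages are implemented as Java classes rather than Python modules.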

PyPy

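The PyPy interpreter includes a just-in-time (JIT) compiler that translates Python code into machine code at run time in order to speed up execution. To make this possible, the interpreter itself is written using only a small subset of the Python language. This restricted subset is called RPython.
For more information about PyPy, visit its website: http://pypy.org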

Python basics

1. Avoid explicit for loops; use functional programming

items = [1,2,3,4,5]
result = []
for item in items:
    result.append(item + 1)
result
# out:[2,3,4,5,6]

Using the map function

items = [1,2,3,4,5]
def inc(x):return x+1
list(map(inc,items))
# out:[2,3,4,5,6]

Using lambda

items = [1,2,3,4,5]
list(map(lambda x:x+1, items))
# out:[2,3,4,5,6]
# filter() keeps only the list elements for which the function returns True.
list(filter(lambda x:x<4, items))
# out:[1,2,3]

2. List comprehensions

s = [x**2 for x in range(5)]
s
# out:[0,1,4,9,16]
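
The map() and filter() calls from section 1 can usually be written as list comprehensions as well. The following is a minimal sketch that reuses the `items` list from above; the names `incremented`, `small`, and `both` are only illustrative.

```python
items = [1, 2, 3, 4, 5]

# Same result as list(map(lambda x: x + 1, items))
incremented = [x + 1 for x in items]

# Same result as list(filter(lambda x: x < 4, items))
small = [x for x in items if x < 4]

# Filtering and transforming in a single comprehension
both = [x + 1 for x in items if x < 4]

print(incremented)  # [2, 3, 4, 5, 6]
print(small)        # [1, 2, 3]
print(both)         # [2, 3, 4]
```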

3. Iterating over the key-value pairs of a dict

for key, value in d.items():  # d is any dict
    print(key, value)
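
Iterating over a dict directly yields only its keys, so key-value pairs have to come from .items(). A small sketch, assuming a hypothetical `prices` dict that is not part of the original notes:

```python
prices = {"apple": 3, "pear": 5}  # hypothetical example data

# .items() yields (key, value) tuples that can be unpacked in the loop header
for key, value in prices.items():
    print(key, value)

keys = list(prices)             # iterating the dict itself yields keys
values = list(prices.values())  # values only
pairs = list(prices.items())    # (key, value) tuples in insertion order
```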

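4. Checking whether a variable is of a given type

Given a reference to an object, how can we find out its type and which methods it has?
type() returns the type of an object
isinstance() checks whether an object is an instance of a class (subclasses included)
dir() lists all attributes and methods of an object
Together with getattr(), setattr(), and hasattr(), we can inspect and change an object's state directly: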

>>> isinstance(a, list)
True
>>> isinstance(b, Animal)
True
>>> isinstance(c, Dog)
True
>>> type(123)
<class 'int'>
>>> type('str')
<class 'str'>
>>> type(None)
<class 'NoneType'>
>>> type(123)==type(456)
True
>>> type(123)==int
True
>>> type('abc')==type('123')
True
>>> type('abc')==str
True
>>> type('abc')==type(123)
False
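
The session above assumes that `a`, `b`, and `c` already exist and does not show dir(), getattr(), setattr(), or hasattr() in action. Here is a self-contained sketch; the `Animal` and `Dog` classes and their attributes are hypothetical stand-ins chosen to match the names used above.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return "..."


class Dog(Animal):
    def speak(self):
        return "Woof"


c = Dog("Rex")

print(isinstance(c, Animal))  # True: isinstance() follows inheritance
print(type(c) == Animal)      # False: type() compares the exact class

print(hasattr(c, "name"))     # True
print(getattr(c, "name"))     # Rex
setattr(c, "age", 3)          # creates a new attribute on the instance
print(c.age)                  # 3
print(getattr(c, "color", None))  # None: default returned for a missing attribute

# dir() returns a sorted list of names; filter out the dunder entries
public = [n for n in dir(c) if not n.startswith("_")]
print(public)  # ['age', 'name', 'speak']
```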

5. enumerate

enumerate is useful for obtaining an indexed list:

nums = [2,3,4]
for i,e in enumerate(nums,1):
    print(i,e)
print('--------------')
for i,e in enumerate(nums,2):
    print(i,e)
Output:
1 2
2 3
3 4
--------------
2 2
3 3
4 4

6. Prettier plots

%config InlineBackend.figure_format = 'retina'
%matplotlib inline
import seaborn as sns
sns.set(font= "Kaiti",style="ticks",font_scale=1.4)
import matplotlib
import matplotlib.pyplot as plt
matplotlib.rcParams['axes.unicode_minus']=False # fix the display of minus signs on the axes

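7. Python function parameters with * and **

Reference
Explanation
An asterisk in front of a parameter means the function accepts a variable number of arguments. A parameter with one asterisk (*) collects the extra positional arguments into a tuple, while one with two asterisks (**) collects the extra keyword arguments into a dict.
A single asterisk (*) can also unpack an argument list at the call site.
(*) and (**) can also be used together.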

def t4(a, b=10, *args, **kwargs):
    print(a)
    print(b)
    print(args)
    print(kwargs)
t4(1, 2, 3, 4, e=5, f=6, g=7)
# 1
# 2
# (3, 4)
# {'e': 5, 'f': 6, 'g': 7}
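
The unpacking direction mentioned above works at the call site: * spreads a sequence into positional arguments and ** spreads a dict into keyword arguments. A short sketch, using a variant of `t4` that returns its arguments instead of printing them; the `positional` and `named` variables are illustrative.

```python
def t4(a, b=10, *args, **kwargs):
    # Return the bound values so the result can be inspected
    return a, b, args, kwargs


positional = [1, 2, 3, 4]
named = {"e": 5, "f": 6}

# Equivalent to t4(1, 2, 3, 4, e=5, f=6)
result = t4(*positional, **named)
print(result)  # (1, 2, (3, 4), {'e': 5, 'f': 6})

# With a single argument, b keeps its default and nothing is collected
print(t4(1))   # (1, 10, (), {})
```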

8. Parallelism in Python (1)

Reference on using Parallel and delayed (from joblib)

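9. import logging (Python logging)

Logging HOWTO
Logging Cookbook

| Level | When it is used |
| ---- | ---- |
| DEBUG | Detailed information, of interest only when diagnosing problems. |
| INFO | Confirmation that things are working as expected. |
| WARNING | An indication that something unexpected has happened or is about to happen (for example, low disk space). The program is still working as expected. |
| ERROR | Due to a more serious problem, the program has not been able to perform some function. |
| CRITICAL | A serious error, indicating that the program may be unable to continue running. |

The default level is WARNING, which means that only events of this level and above are tracked unless the logging configuration is changed.
Tracked events can be handled in different ways. The simplest is to print them to the console. Another common way is to write them to a file on disk.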

import logging
def demo01():
    # Simple example
    logging.warning('Watch out!')  # will print a message to the console
    logging.info('I told you so')  # will not print anything
    text = 'variable'
    num = 10
    logging.warning('print %s %d', text, num)
def demo02():
    logging.basicConfig(filename='example.log', level=logging.DEBUG)
    logging.debug('This message should go to the log file')
    logging.info('So should this')
    logging.warning('And this, too')
    logging.error('And non-ASCII stuff, too, like Øresund and Malmö')
def demo03():
    # Change the format of displayed messages
    logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG) # the "root" logger name no longer appears
    logging.debug('This message should appear on the console')
    logging.info('So should this')
    logging.warning('And this, too')
def demo04():
    # Show the date/time in messages
    logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m-%d-%Y %I:%M:%S %p')
    # logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m-%d-%Y %H:%M:%S')
    logging.warning('is when this event was logged.')
def finalUse():
    logging.basicConfig(filename='example.log', level=logging.DEBUG, format='%(asctime)s %(message)s', datefmt='%m-%d-%Y %H:%M:%S')
    text = 'string'  # avoid shadowing the built-in str
    num = 10
    logging.debug('This message should go to the log file')
    logging.info('So should this')
    logging.warning('And this, too')
    logging.info('variables such as string "%s" and number "%d" can also be interpolated.', text, num)
    logging.error('And non-ASCII stuff, too, like Øresund and Malmö')
if __name__ == '__main__':
    # demo04()
    finalUse()
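
The two destinations mentioned earlier, console and disk file, can also be used at the same time by attaching two handlers to one logger. The sketch below is only one possible setup: the logger name "demo", the helper `build_logger`, and the temporary file location are assumptions made for illustration.

```python
import logging
import os
import tempfile


def build_logger(log_path):
    logger = logging.getLogger("demo")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)      # let every level reach the handlers
    logger.handlers.clear()             # avoid duplicate handlers on re-run
    logger.propagate = False            # keep records away from the root logger

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    console = logging.StreamHandler()   # writes to stderr by default
    console.setLevel(logging.WARNING)   # console shows WARNING and above
    console.setFormatter(formatter)

    to_file = logging.FileHandler(log_path, encoding="utf-8")
    to_file.setLevel(logging.DEBUG)     # file keeps everything
    to_file.setFormatter(formatter)

    logger.addHandler(console)
    logger.addHandler(to_file)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), "example.log")
logger = build_logger(log_path)
logger.debug("written to the file only")
logger.warning("written to the file and the console")

with open(log_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(len(lines))  # 2
```

Each handler has its own level, so the file can keep a detailed record while the console stays quiet.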

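10. Iterators (iter)

See the official documentation

import os
import gensim

class MySentences(object):
    """Yield one tokenized line at a time from every file in a directory."""
    def __init__(self, dirname):
        self.dirname = dirname
    def __iter__(self):
        for fname in os.listdir(self.dirname):
            # the with block closes each file once it has been read
            with open(os.path.join(self.dirname, fname)) as f:
                for line in f:
                    yield line.split()
sentences = MySentences('/some/directory') # a memory-friendly iterator
model = gensim.models.Word2Vec(sentences)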
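
One reason to wrap the generator in a class with `__iter__`, as above, is that the object can then be iterated more than once, whereas a plain generator is exhausted after a single pass. The sketch below illustrates the difference with an in-memory list instead of a directory; `Lines` and `gen_lines` are hypothetical stand-ins, not part of gensim.

```python
class Lines:
    """Re-iterable container: every for loop calls __iter__ and starts over."""

    def __init__(self, lines):
        self.lines = lines

    def __iter__(self):
        for line in self.lines:
            yield line.split()


def gen_lines(lines):
    """Plain generator function: the returned generator can be consumed once."""
    for line in lines:
        yield line.split()


data = ["a b", "c d e"]

reusable = Lines(data)
first_pass = list(reusable)
second_pass = list(reusable)

one_shot = gen_lines(data)
gen_first = list(one_shot)
gen_second = list(one_shot)

print(first_pass == second_pass)  # True
print(gen_first)                  # [['a', 'b'], ['c', 'd', 'e']]
print(gen_second)                 # []
```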

NumPy

1. array()

import numpy as np

a = np.array([1,2,3])
# a = np.array([1,2,3], dtype=np.float16)  # specify the data type
# a = np.array([1,2,3], dtype=np.int16)
# a = np.array([1,2,3], dtype=np.float64)
# a = np.array([1,2,3], dtype=complex)  # complex numbers
b = np.array([[1,2,3],[4,5,6]])  # a 2-D array takes a nested list
a

a.ndim
# out: 1
a.size
# out: 3
a.shape
# out: (3,)
a.itemsize
# out: 8 when the default integer dtype is int64 (4 when it is int32)

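Checking whether a newly created object is an ndarray is simple: pass the new variable to the type() function.

type(a)

The dtype attribute reports which data type the new ndarray holds. The default integer type depends on the platform and is often int64.
a.dtype
dtype('int32')
The array just created has a single axis, so its rank is 1 and its shape is (3,). These values are obtained as follows: the ndim attribute gives the number of axes, the size attribute gives the number of elements, and the shape attribute gives the shape.
a.ndim
a.size
a.shape

ndarray objects have another important attribute called itemsize. It gives the size of each array element in bytes.

b.itemsize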

2. np.ones(), np.zeros(), np.arange()

>>> np.zeros((3, 3))
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
>>> np.ones((3, 3))
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

>>> np.arange(4, 10)
array([4, 5, 6, 7, 8, 9])
>>> np.arange(0, 12, 3)  # with a step
array([0, 3, 6, 9])
# Unlike Python's range(), which only accepts integer steps, arange() also accepts floating-point steps
>>> np.arange(0, 6, 0.6)
array([0. , 0.6, 1.2, 1.8, 2.4, 3. , 3.6, 4.2, 4.8, 5.4])
>>> np.arange(0, 12).reshape(3, 4)
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

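3. np.random.random()

np.random.random((3, 3))  # uniform samples in [0, 1); the shape is passed as a tuple
np.random.randn(3, 3)     # samples from the standard normal distribution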

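4. Matrix product dot() and element-wise multiplication *

np.dot(A, B)  # matrix product
A.dot(B)      # same as np.dot(A, B)
A * B         # element-wise product
# The matrix product is not commutative: in general np.dot(A, B) differs from np.dot(B, A)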
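
To make the difference between the two operations concrete, here is a small sketch with two hypothetical 2x2 matrices. It shows that the element-wise product does not depend on operand order, while the matrix product does.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

elementwise = A * B        # multiplies matching entries
product_ab = np.dot(A, B)  # rows of A times columns of B
product_ba = np.dot(B, A)  # operands swapped

print(elementwise.tolist())  # [[0, 2], [3, 0]]
print(product_ab.tolist())   # [[2, 1], [4, 3]]
print(product_ba.tolist())   # [[3, 4], [1, 2]]
print(np.array_equal(A * B, B * A))            # True
print(np.array_equal(product_ab, product_ba))  # False
```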

5. Indexing and slicing

Indexing is straightforward, so it is not recorded here.

Slicing

a[:,0]
a[:,1:]
a[:,-1]

6. Structured arrays

bytes                   b1
int                     i1, i2, i4, i8
unsigned ints           u1, u2, u4, u8
floats                  f2, f4, f8
complex                 c8, c16
fixed length strings    a<n>

>>> structured = np.array([(1, 'First', 0.5, 1+2j), (2, 'Second', 1.3, 2-2j),
...                        (3, 'Third', 0.8, 1+3j)], dtype=('i2,a6,f4,c8'))
>>> structured
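
Once a structured array exists, its columns can be read by field name. The sketch below rebuilds the same data; it uses the byte-string code S6 in place of a6 (an assumption made here so the example also runs cleanly on recent NumPy versions) and relies on the default field names f0, f1, and so on that NumPy assigns when none are given.

```python
import numpy as np

structured = np.array(
    [(1, 'First', 0.5, 1 + 2j), (2, 'Second', 1.3, 2 - 2j), (3, 'Third', 0.8, 1 + 3j)],
    dtype='i2,S6,f4,c8',  # int16, 6-byte string, float32, complex64
)

print(structured.dtype.names)     # ('f0', 'f1', 'f2', 'f3')
print(structured['f1'].tolist())  # [b'First', b'Second', b'Third']
print(structured['f0'].tolist())  # [1, 2, 3]
print(structured[1])              # the whole second record
```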
