Installing pyaudio on the Raspberry Pi and Using It for Sound Monitoring

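On the Raspberry Pi, pyaudio can record from a USB microphone, which provides the basis for human-machine interaction features such as speech recognition and speech synthesis.
See the official pyaudio documentation at the following link: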

http://people.csail.mit.edu/hubert/pyaudio/docs/

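pyaudio is a Python module. Before installing it on the Raspberry Pi, install portaudio.dev.
The installation steps are:
1. Install portaudio.dev: sudo apt-get install portaudio.dev (apt-get treats the name as a pattern; the package itself is usually called portaudio19-dev)
2. Install python-pyaudio: sudo apt-get install python-pyaudio (python3-pyaudio for Python 3)
3. Install sox to quickly check whether the microphone is configured correctly: sudo apt-get install sox
4. Test the microphone by entering the following command in a Raspberry Pi terminal:
rec temp.wav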
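5. Test pyaudio with the following code, which records 40 buffers of 2000 samples at 16 kHz (about 5 seconds), saves them as audio.wav, and plays the file back: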

    # -*- coding: utf-8 -*-
    # @author: zdl
    # PyAudio test: record from the microphone, then play the recording back.
    # Requires pyaudio; installation is covered in the steps above.
    # PyAudio API reference: http://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.Stream.write
    from __future__ import print_function  # keeps print() working on Python 2
    import sys
    import wave
    from pyaudio import PyAudio, paInt16

    # Sampling parameters
    NUM_SAMPLES = 2000  # frames read per buffer while recording
    TIME = 2            # the recording loop runs TIME * 20 times
    chunk = 1024        # frames read per step during playback


    def read_wave_file(filename):
        """Read a WAV file, print its parameters, and return the raw frames."""
        fp = wave.open(filename, 'rb')
        nf = fp.getnframes()  # number of sample frames
        print('sampwidth:', fp.getsampwidth())
        print('framerate:', fp.getframerate())
        print('channels:', fp.getnchannels())
        audio_data = fp.readframes(nf)
        fp.close()
        return audio_data


    def save_wave_file(filename, data):
        """Save a list of raw audio buffers to a WAV file."""
        wf = wave.open(filename, 'wb')
        wf.setnchannels(1)      # channels: 1 or 2
        wf.setsampwidth(2)      # sample width in bytes: 1 or 2
        wf.setframerate(16000)  # sample rate: 8 kHz or 16 kHz
        wf.writeframes(b"".join(data))
        wf.close()


    def record():
        """Record from the default input device and save to audio.wav."""
        pa = PyAudio()
        # Open an input stream: 1 channel, 16-bit samples, 16 kHz
        stream = pa.open(format=paInt16,
                         channels=1,
                         rate=16000,
                         input=True,
                         frames_per_buffer=NUM_SAMPLES)
        audioBuffer = []  # recorded buffers
        count = 0
        # Record TIME * 20 = 40 buffers (about 5 s at 16 kHz)
        while count < TIME * 20:
            string_audio_data = stream.read(NUM_SAMPLES)  # one buffer of frames
            audioBuffer.append(string_audio_data)
            count += 1
            print('.', end='')  # progress dots on one line
            sys.stdout.flush()
        print()
        # Save the recording to audio.wav and release the stream
        save_wave_file('audio.wav', audioBuffer)
        stream.stop_stream()
        stream.close()
        pa.terminate()


    def play():
        """Play audio.wav through the default output device."""
        wf = wave.open(r"audio.wav", 'rb')
        p = PyAudio()
        # Open an output stream that matches the file's format
        stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                        channels=wf.getnchannels(),
                        rate=wf.getframerate(),
                        output=True)
        # Play the audio chunk by chunk until the file is exhausted
        while True:
            data = wf.readframes(chunk)
            if not data:  # readframes() returns empty bytes at end of file
                break
            stream.write(data)
        # Release resources
        stream.stop_stream()
        stream.close()
        p.terminate()
        wf.close()


    # Record the audio, then play it back
    if __name__ == '__main__':
        print('record ready...')
        record()
        print('record over!')
        play()
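
The loop in record() reads TIME * 20 = 40 buffers of NUM_SAMPLES = 2000 frames at 16 kHz, so the clip length follows from simple arithmetic. The sketch below works that out and also shows how many reads a given duration would need. The helper names recording_seconds and buffers_needed are introduced here for illustration and are not part of the original script.

```python
def recording_seconds(num_buffers, frames_per_buffer, rate):
    """Return the length in seconds of num_buffers reads of frames_per_buffer frames."""
    return num_buffers * frames_per_buffer / float(rate)


def buffers_needed(seconds, frames_per_buffer, rate):
    """Return how many reads are needed to cover at least `seconds` of audio."""
    total_frames = int(seconds * rate)
    # Ceiling division so that a partial last buffer is still recorded.
    return -(-total_frames // frames_per_buffer)


# Values from the script: 40 reads of 2000 frames at 16 kHz.
print(recording_seconds(40, 2000, 16000))  # 5.0
# Reads needed for a full 40-second recording with the same settings.
print(buffers_needed(40, 2000, 16000))     # 320
```

With these settings, changing the loop bound to 320 would give a 40-second recording.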

Program output:

[Figure 2: screenshot of the program running]

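Depending on the USB camera used as the capture device, a sample-rate error may be reported. The following blog posts describe the fix: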
https://blog.csdn.net/u013860985/article/details/79326379
https://blog.csdn.net/u013372900/article/details/80296125
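
To see whether the error really is a rate mismatch, one option is to ask PyAudio which sample rates the device accepts. This is a minimal sketch, assuming a pyaudio.PyAudio instance is passed in as pa; the function name supported_input_rates and the list of candidate rates are chosen here for illustration. It relies on PyAudio.is_format_supported, which returns True or raises ValueError.

```python
def supported_input_rates(pa, device_index, sample_format,
                          candidates=(8000, 16000, 22050, 44100, 48000),
                          channels=1):
    """Return the candidate sample rates that the input device accepts.

    `pa` is expected to be a pyaudio.PyAudio instance. Its
    is_format_supported() method returns True when the combination is
    accepted and raises ValueError when it is rejected.
    """
    accepted = []
    for rate in candidates:
        try:
            if pa.is_format_supported(rate,
                                      input_device=device_index,
                                      input_channels=channels,
                                      input_format=sample_format):
                accepted.append(rate)
        except ValueError:
            pass  # the device rejected this rate
    return accepted
```

A typical call would be supported_input_rates(PyAudio(), 1, paInt16), where device index 1 is only an example. If 16000 is missing from the result, opening the device directly at rate=16000 is likely to fail.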
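Create a new ~/.asoundrc file in the pi user's home directory: nano ~/.asoundrc (sudo is not needed for a file in your own home directory)
Enter the following content: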

    pcm.!default {
        type hw
        card 1
    }
    ctl.!default {
        type hw
        card 1
    }

Or:

    pcm.!default {
        type asym
        playback.pcm {
            type plug
            slave.pcm "hw:0,0"
        }
        capture.pcm {
            type plug
            slave.pcm "hw:1,0"
        }
    }
    ctl.!default {
        type hw
        card 1
    }
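
The card and device numbers in these files (card 1, hw:1,0) depend on how the system enumerates the hardware. One way to look them up is to parse the output of arecord -l. The sketch below assumes the usual English-locale line format; the helper parse_arecord_list and the sample text are illustrative, and the real listing depends on the attached devices.

```python
import re

# Matches lines such as:
# card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
_CARD_LINE = re.compile(r'^card (\d+): (\S+) \[(.*?)\], device (\d+):')


def parse_arecord_list(text):
    """Return a list of (card, device, name) tuples from `arecord -l` output."""
    found = []
    for line in text.splitlines():
        match = _CARD_LINE.match(line.strip())
        if match:
            card, _short_name, long_name, device = match.groups()
            found.append((int(card), int(device), long_name))
    return found


# Illustrative output only; the real listing depends on the hardware.
sample = """**** List of CAPTURE Hardware Devices ****
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""
for card, device, name in parse_arecord_list(sample):
    print('hw:%d,%d  %s' % (card, device, name))
```

On the Raspberry Pi the text could come from subprocess.check_output(['arecord', '-l']), decoded to a string before parsing.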


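Raspberry Pi audio output settings:
1. Set the Raspberry Pi audio output to AUTO or 3.5mm using sudo raspi-config.
2. To adjust the output volume, run alsamixer in a terminal and use the up and down arrow keys.

3. Speech synthesis produces MP3 files, while pyaudio as used here plays only WAV files, so an MP3 player, mplayer, needs to be installed.
Install: sudo apt-get install mplayer
Test: mplayer xx.mp3 (run it from the directory that contains xx.mp3)
Play an MP3 from a Python program: os.system('mplayer %s' % 'xx.mp3')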
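
Building the command with string formatting and os.system breaks when the file name contains spaces or shell metacharacters. A possible alternative, sketched here with the standard subprocess module, passes the arguments as a list. The helper names mplayer_command and play_mp3 are introduced for illustration; -really-quiet is an mplayer option that suppresses its console output.

```python
import subprocess


def mplayer_command(path):
    """Build the argument list for playing `path` with mplayer."""
    # -really-quiet suppresses mplayer's console output.
    return ['mplayer', '-really-quiet', path]


def play_mp3(path):
    """Play an MP3 file with mplayer and return its exit status."""
    # A list of arguments avoids shell quoting problems with spaces in names.
    return subprocess.call(mplayer_command(path))


print(mplayer_command('my song.mp3'))
```

Calling play_mp3('xx.mp3') blocks until playback finishes, just as os.system does.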

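Exercise: how could pyaudio be used to build a sound monitoring device that detects noise, or the movement of people or objects, and then raises an alarm?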

    # -*- coding: utf-8 -*-
    # @author: zdl
    # Requires pyaudio and numpy
    from __future__ import print_function  # keeps print() working on Python 2
    import time
    import wave

    import numpy as np
    from pyaudio import PyAudio, paInt16

    NUM_SAMPLES = 2000  # frames read per buffer
    chunk = 1024        # frames read per step during playback


    def play(filename):
        """Play a WAV file through the default output device."""
        wf = wave.open(filename, 'rb')
        p = PyAudio()
        stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                        channels=wf.getnchannels(),
                        rate=wf.getframerate(),
                        output=True)
        while True:
            data = wf.readframes(chunk)
            if not data:  # readframes() returns empty bytes at end of file
                break
            stream.write(data)
        stream.stop_stream()
        stream.close()
        p.terminate()
        wf.close()


    def save_wave_file(filename, data):
        """Save a list of raw audio buffers to a WAV file."""
        wf = wave.open(filename, 'wb')
        wf.setnchannels(1)      # channels
        wf.setsampwidth(2)      # sample width in bytes: 1 or 2
        wf.setframerate(16000)  # sample rate: 8 kHz or 16 kHz
        wf.writeframes(b"".join(data))
        wf.close()


    def Monitor():
        """Watch the microphone and save a clip to 01.wav after a loud sound."""
        pa = PyAudio()
        stream = pa.open(format=paInt16,
                         channels=1,
                         rate=16000,
                         input=True,
                         frames_per_buffer=NUM_SAMPLES)
        print('Buffering audio...')
        rec = []          # buffers belonging to the current clip
        t = False         # True while a clip is being recorded
        timeFlag = False  # set once 5 s have passed since the trigger
        saveCount = 0     # buffers still to keep after the last loud one
        begin = 0.0       # time at which the current clip was triggered
        try:
            while True:
                # exception_on_overflow=False avoids an error on input overflow
                data = stream.read(NUM_SAMPLES, exception_on_overflow=False)
                audioData = np.frombuffer(data, dtype=np.int16)
                largeSampleCount = np.sum(audioData > 2000)
                temp = np.max(audioData)
                print(temp)
                # Trigger threshold; tune it by experiment
                if temp > 8000 and not t:
                    t = True
                    print('Sound detected, recording started')
                    begin = time.time()
                if t:
                    if time.time() - begin > 5:
                        timeFlag = True  # 5 s have elapsed
                    if largeSampleCount > 20:
                        saveCount = 4
                    else:
                        saveCount = max(saveCount - 1, 0)
                    if saveCount > 0:
                        rec.append(data)
                    elif len(rec) > 0 or timeFlag:
                        save_wave_file('01.wav', rec)
                        rec = []
                        t = False
                        timeFlag = False
        except KeyboardInterrupt:
            pass  # Ctrl+C stops monitoring
        finally:
            stream.stop_stream()
            stream.close()
            pa.terminate()


    if __name__ == '__main__':
        Monitor()
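
The detection rule in Monitor() can be tried without a microphone by feeding it synthetic buffers. The sketch below restates the rule with only the standard library: a peak above 8000 triggers recording, a buffer with more than 20 samples above 2000 counts as loud, and up to three quieter buffers are kept after the last loud one. The function names analyze and select_buffers are introduced here for illustration, and the 5-second flag is left out for brevity.

```python
import array
import struct

PEAK_THRESHOLD = 8000    # trigger level used by Monitor()
LARGE_THRESHOLD = 2000   # a sample above this counts as "large"
LARGE_COUNT_MIN = 20     # a buffer needs more large samples than this to be loud
HANGOVER_BUFFERS = 4     # counter value set by each loud buffer


def analyze(data):
    """Return (peak, large_count) for a buffer of native-endian 16-bit samples."""
    samples = array.array('h', data)
    peak = max(samples) if samples else 0
    large_count = sum(1 for s in samples if s > LARGE_THRESHOLD)
    return peak, large_count


def select_buffers(buffers):
    """Return the buffers that the trigger-plus-hangover rule would keep."""
    kept, triggered, save_count = [], False, 0
    for data in buffers:
        peak, large_count = analyze(data)
        if not triggered and peak > PEAK_THRESHOLD:
            triggered = True
        if not triggered:
            continue
        if large_count > LARGE_COUNT_MIN:
            save_count = HANGOVER_BUFFERS
        else:
            save_count = max(save_count - 1, 0)
        if save_count > 0:
            kept.append(data)
        else:
            triggered = False  # clip finished, wait for the next trigger
    return kept


# Synthetic test data: one loud buffer surrounded by silence.
quiet = struct.pack('100h', *([0] * 100))
loud = struct.pack('100h', *([10000] * 100))
print(len(select_buffers([quiet, loud] + [quiet] * 5)))  # 4
```

The loud buffer and the three quiet buffers after it are kept, which matches the saveCount countdown from 4 in Monitor().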

More information is available on my GitHub: https://github.com/dalinzhangzdl/AI_Car_Raspberry-pi