This post documents a way to save audio chunks during a long recording with Python.
First of all
This is my development environment.
Hardware
- Raspberry Pi CM4 Model B
- Waveshare Raspberry Pi CM4 IO board (https://www.waveshare.com/cm4-io-base-b.htm)
- ReSpeaker 2-Mics Pi HAT (https://wiki.seeedstudio.com/ReSpeaker/)
Software
- Raspberry Pi OS Bullseye
- PyAudio (https://people.csail.mit.edu/hubert/pyaudio/)
Background
Last week, I recorded about 25 minutes of voice audio with PyAudio. It then took over 10 minutes to process the audio stream. I had simply copied and pasted the recording example from the PyAudio docs.
I could probably have reduced that processing time somehow. However, I realized I didn't have to wait for the recording to finish and could start saving while it was still running.
First, I searched for a way to do that, because I expected an existing approach or library to be available. After researching for a while, I didn't find a good solution, so I decided to document my own approach.
What I did
I didn't find anything that did exactly what I wanted. However, I found the following article written by the Google team and thought I could modify its code for my purpose.
That code snippet performs real-time transcription by sending API requests to a Google service, so I needed to make it save audio chunks instead of sending API requests. The reference uses a generator to send requests, but I decided to save chunks in the stream_callback function because the PyAudio docs say that function is called in a thread separate from the main thread by default. I thought I could do it with little effort.
stream_callback is called in a separate thread (from the main thread).
So I changed the reference code as shown below.
def _fill_buffer(
    self: object,
    in_data: object,
    frame_count: int,
    time_info: object,
    status_flags: object,
) -> object:
    """Continuously collect data from the audio stream, into the buffer.

    Args:
        in_data: The audio data as a bytes object
        frame_count: The number of frames captured
        time_info: The time information
        status_flags: The status flags
    Returns:
        A (None, pyaudio.paContinue) tuple; an input-only stream returns no output data
    """
    logger.info(f'fill buffer: {len(self._recording_frames)}')
    self._recording_frames.append(in_data)
    if len(self._recording_frames) >= RECORDING_CHUNK_SIZE:
        # Copy the full chunk, then start collecting into a fresh list
        saving_frames = self._recording_frames[:]
        self._recording_frames = []
        self._count += 1
        self._save(saving_frames, self._count)
    return None, pyaudio.paContinue

def _save(self, frames, count):
    # Write one chunk to its own WAV file, numbered by count
    with wave.open(f'{CHUNK_DIR}{self._session_id}_{count:02}.{FILE_EXTENSION}', 'wb') as wf:
        wf.setnchannels(self._channel)
        wf.setsampwidth(self._audio_interface.get_sample_size(self._sample_width))
        wf.setframerate(self._rate)
        wf.writeframes(b''.join(frames))
    logger.info(f'Finish recording: count: {count}, frames: {len(frames)}')
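Note that RECORDING_CHUNK_SIZE in _fill_buffer counts callback buffers, not seconds. The following is a minimal sketch of the arithmetic that relates the two. The helper names chunk_seconds and chunk_bytes are hypothetical, and the sample rate, buffer size, channel count and sample width are assumed example values, not the ones used in this post.

```python
def chunk_seconds(buffers_per_chunk: int, frames_per_buffer: int, rate: int) -> float:
    """Seconds of audio contained in one saved chunk."""
    return buffers_per_chunk * frames_per_buffer / rate


def chunk_bytes(buffers_per_chunk: int, frames_per_buffer: int,
                channels: int, sample_width: int) -> int:
    """Raw audio bytes in one saved chunk (excluding the WAV header)."""
    return buffers_per_chunk * frames_per_buffer * channels * sample_width


if __name__ == '__main__':
    # Assumed example values: 16 kHz, mono, 16-bit samples, 100 ms per buffer
    rate = 16000
    frames_per_buffer = 1600
    buffers_per_chunk = 100  # plays the role of RECORDING_CHUNK_SIZE

    print(chunk_seconds(buffers_per_chunk, frames_per_buffer, rate))  # 10.0
    print(chunk_bytes(buffers_per_chunk, frames_per_buffer, 1, 2))    # 320000
```

With these assumed values, each chunk file would hold 10 seconds of audio, so _save would run once every 10 seconds.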
However, the _save function blocked the main thread even though the docs say the callback is run in a separate thread. I tested the callback_function with time.sleep and checked the log. The main thread was blocked by time.sleep in the callback_function, which means the recording stopped during the time.sleep in my case. I haven't figured out the reason yet, but for now I changed the code snippet to create the thread myself, as shown below.
def _fill_buffer(
    self: object,
    in_data: object,
    frame_count: int,
    time_info: object,
    status_flags: object,
) -> object:
    """Continuously collect data from the audio stream, into the buffer.

    Args:
        in_data: The audio data as a bytes object
        frame_count: The number of frames captured
        time_info: The time information
        status_flags: The status flags
    Returns:
        A (None, pyaudio.paContinue) tuple; an input-only stream returns no output data
    """
    logger.info(f'fill buffer: {len(self._recording_frames)}')
    self._recording_frames.append(in_data)
    if len(self._recording_frames) >= RECORDING_CHUNK_SIZE:
        # Copy the full chunk, then start collecting into a fresh list
        saving_frames = self._recording_frames[:]
        self._recording_frames = []
        self._count += 1
        # FIXME: Should not have to create a thread by myself
        self._create_chunk_saving_thread(saving_frames, self._count)
    return None, pyaudio.paContinue

def _create_chunk_saving_thread(self, saving_frames, count):
    created_at = datetime.datetime.now().strftime('%y%m%d%H%M%S')
    # Save on a separate daemon thread so the callback returns immediately
    saving_thread = Thread(
        target=self._save,
        args=(saving_frames, count, created_at,),
        daemon=True,
    )
    saving_thread.start()
    logger.info(f'Start saving: session_id: {self._session_id}, count: {self._count}')

def _save(self, frames, count, start_time):
    # Write one chunk to its own WAV file, numbered by count
    with wave.open(f'{CHUNK_DIR}{self._session_id}_{count:02}.{FILE_EXTENSION}', 'wb') as wf:
        wf.setnchannels(self._channel)
        wf.setsampwidth(self._audio_interface.get_sample_size(self._sample_width))
        wf.setframerate(self._rate)
        wf.writeframes(b''.join(frames))
    logger.info(f'Finish recording: count: {count}, start_time: {start_time}, frames: {len(frames)}')
After that, the chunk saving worked well without blocking the audio recording. Here is the whole code.
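Creating one thread per chunk is what I did, but it is not the only option. As a possible alternative, here is a minimal sketch that uses a single long-lived worker thread fed by a queue, so chunks are written in the order they were submitted. It uses only the standard library and is not the code from this post: the ChunkWriter class, its submit and close methods, the chunk_NN.wav file naming, and the default format (mono, 16-bit, 16 kHz) are all assumptions made for illustration.

```python
import queue
import threading
import wave
from pathlib import Path


class ChunkWriter:
    """Write queued audio chunks to WAV files on one background thread."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, out_dir, channels=1, sample_width=2, rate=16000):
        self._out_dir = Path(out_dir)
        self._channels = channels
        self._sample_width = sample_width  # bytes per sample
        self._rate = rate
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, frames, count):
        """Enqueue a chunk; intended to be cheap enough to call from a callback."""
        self._queue.put((list(frames), count))

    def close(self):
        """Write everything still queued, then stop the worker."""
        self._queue.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            frames, count = item
            path = self._out_dir / f'chunk_{count:02}.wav'
            with wave.open(str(path), 'wb') as wf:
                wf.setnchannels(self._channels)
                wf.setsampwidth(self._sample_width)
                wf.setframerate(self._rate)
                wf.writeframes(b''.join(frames))


if __name__ == '__main__':
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        writer = ChunkWriter(tmp)
        # Ten buffers of 1600 silent 16-bit frames stand in for recorded audio
        writer.submit([b'\x00\x00' * 1600] * 10, 1)
        writer.close()
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

In _fill_buffer, the call to self._create_chunk_saving_thread(saving_frames, self._count) would then become something like writer.submit(saving_frames, self._count), with writer.close() called once the stream is stopped. I have not tested this on the Raspberry Pi setup above.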
I'll update this post once I find out why the callback_function blocked the main thread.
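Until then, one way to gather evidence is to record which thread each callback call runs on and how long it takes. The sketch below is only a diagnostic aid under assumptions, not an explanation of the blocking: timed_callback and slow_callback are hypothetical names, the fake thread stands in for whatever thread PyAudio uses, and the literal 0 stands in for pyaudio.paContinue so the example runs without PyAudio or audio hardware.

```python
import functools
import threading
import time


def timed_callback(callback, records):
    """Wrap a stream callback and append (thread_name, seconds) for every call."""
    @functools.wraps(callback)
    def wrapper(in_data, frame_count, time_info, status_flags):
        started = time.monotonic()
        try:
            return callback(in_data, frame_count, time_info, status_flags)
        finally:
            elapsed = time.monotonic() - started
            records.append((threading.current_thread().name, elapsed))
    return wrapper


def slow_callback(in_data, frame_count, time_info, status_flags):
    """Stand-in for a callback that does slow work such as saving a file."""
    time.sleep(0.05)
    return None, 0  # 0 stands in for pyaudio.paContinue


if __name__ == '__main__':
    records = []
    wrapped = timed_callback(slow_callback, records)
    # Simulate the callback being invoked from a non-main thread
    worker = threading.Thread(
        target=wrapped, args=(b'', 0, None, 0), name='fake-audio-thread')
    worker.start()
    worker.join()
    for thread_name, seconds in records:
        print(f'{thread_name}: {seconds:.3f}s')
```

Passing a wrapped version of _fill_buffer as stream_callback and comparing the recorded thread names with threading.main_thread().name should show whether the callback really runs off the main thread and how long each call holds it.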
That’s it!