Adventures With Ami: Using GPT-3.5 on Streaming Audio

Away from home for a week, with a little time on my hands and after talking to Maffy (my son, junior doc in hospital) I thought it would be fun to write a python script that would automate his note taking. The latest version is here : https://drive.google.com/file/d/1hhEjKgoq4BHMQuzazvEmufyp09nkvql8/view?usp=sharing

Still haven’t figured out how to use github!

The consult normally has a 6 part structure.

Picovoice

I am using picovoice to capture the audio stream from the microphone (pvrecorder) and pvporcupine as a wake up word detector.

After installing picovoice with pip I found that it had picked up an old version and the custom wake word library as downloaded from the picovoice console was not working. So had to run python3 -m pip install –upgrade picovoice = all good.

Once the wake up word is detected I use pvcheetah as a speech to text engine. It works well. As the consult proceeds the results from cheetah are added on the end of a growing string (partial_transcript).

To detect the end of the consult I use an end phrase. The look_for_end is a ‘last in first out’ field that is 5 characters longer than the end phrase. I found it best to use a three word phrase. I use this field as the results from cheetah are too small to contain a 3 word phrase.

GPT-3-turbo

Once the transcript is complete I submit it to GPT-3-turbo. This is the oldest and cheapest of the OpenAi models. It seems to work in testing, but I am not sure how it will perform if ever used in real life. There are many other newer more expensive models, but I have not tried them yet.

Using 6 different prompts I submit the entire transcript 6 times. This is a lazy expensive way to do it. There are more economical ways to do it, these rely on embedding and vector representation.

I ran into the model submission limit of 3/minute when I tried to submit 6 queries in rapid succession. The insertion of time.sleep(25) solved that problem. Though it slowed down the script considerably.

Saving to a .docx file

As the six answers came in I assemble them into a field and then save them all as a single file. I use the docx library to do this

Emailing an attachment

I sent the completed .docx document to an email address. To do that I used the smptlib and email libraries to send an email. I used a gmail account to send the emails. Google have recently upgraded their security so most of the tutorials are out of date. I found I had to get an app password, which I created in the security section of my google account. Apart from that it all works well. It does take a few minutes for the email carrying the attachment to arrive in the destination inbox.

Converting to An Android App

That python code worked well, but it will be far more useful if it was an android app.

So, it turns out that to build an Android app you need Android Studio, for that you need to have at least 8GB memory. I installed it using the instructions here https://www.tabnine.com/blog/install-android-studio-ubuntu-step-by-step-guide/

My machine was not able to run the emulator, but that was not serious as I was fairly easily able to connect my android phone (running android 11) through the wifi. To do that I first had to enable developer options on my phone https://nokiaandroidphones.com/developer-options-on-nokia-phones/.

After that it was fairly simple to pair my phone with Android Studio https://developer.android.com/studio/run/device. Open the Device Manager tab on the right hand side. Open the Physical Device tab and pair the phone. Make sure they are all on the same network. You should see the device name appear at the top of the screen. To build and deploy the current code, click on the green right arrow next to the device name.

Error – targetSdk wrong

I suddenly ran into this error whilst trying to build – it used to work, but this suddenly appeared. It said the target Sdk was 33 but should be 34. To fix that – Within the Gradle Scripts tab on the left hand side. Within build.gradle.kts(Module app) I changed the android compileSdk = 34 and within defaultConfig I changed targetSdk = 34. At the top it asked me if I wanted to sync the changed file. I did.

Picovoice

Using picovoice Cheetah to perform speech to text . I followed the instructions here https://picovoice.ai/blog/android-speech-recognition/#cheetah-streaming-speech-to-text

They are a little out of date. It asks you to include mavenCentral() in your top-level build.gradle file. (That is build.gradle.kts(Project <your project>)) Don’t. That is the old way. The new way is to include in settings.gradle.kts. But in my build it is included by default.

In the build.gradle.kts(Module app) file I included implementation(“ai.picovoice:android-voice-processor:1.0.2”) implementation(“ai.picovoice:cheetah-android:1.0.2”)

App Inventor

After telling my friend Graham about my problems with Kotlin, he suggested I use App Inventor. A block based language developed by MIT. It is much better. Though I believe the code it produces is bigger and less efficient than coding directly in Kotlin.

Using App Inventor was intuitive and fun, once I got the hang of using interrupts to navigate. My code is https://drive.google.com/drive/folders/1WnNOx8W0OxuQpEpq9FamrWl7APZ3-dlX?usp=sharing

The major problem I had was trying to attach a file to an email. I bought!! the taifun mail extension for App Inventor it worked well for picked files in the public folders. However I failed to create and write to a file inside the app, and then attach that file to the email. It kept on throwing an error saying it could not find the file. I could read the file from inside the app, but not attach it to the email.

I think the problem has to do with Android security and private, app specific files and public ones The blogs suggested the fix of copying the file to a public folder – like Documents for example. I could not find the full address of the Documents folder. It is an area with much confusion. So after some time banging my head against that problem – I gave up.

Published by ami.scunao

I am a Nao v5 robot, I hang out with my friend Jim. We have lots of adventures

Leave a comment

Design a site like this with WordPress.com
Get started