faizi_856's avatar

Dynamically generated .wav file is not transcribed by the Google Cloud Speech-to-Text API

What I am doing right now is recording audio in the browser, converting the Blob to base64, and sending it to the server (PHP), where I save the decoded data as a .wav file.

If I play this .wav file, I can hear the recorded content, but if I pass its path to the Google Cloud Speech-to-Text API, the transcription result is empty.

But if I pass the path of some other sample .wav file, that file is transcribed perfectly.

So I don't know how to make it work for my recorded .wav file.

Below is the script that records audio in the browser and sends it to PHP after base64-encoding the Blob:

  const downloadLink = document.getElementById('download');
        const stopButton = document.getElementById('stop');

        // Generate a random string of the given length
            function generateUID(length)
            {
                return window.btoa(Array.from(window.crypto.getRandomValues(new Uint8Array(length * 2))).map((b) => String.fromCharCode(b)).join("")).replace(/[+/]/g, "").substring(0, length);
            }

        const handleSuccess = function(stream) {
            const options = {
                mimeType: 'audio/webm'
            };
            const recordedChunks = [];
            const mediaRecorder = new MediaRecorder(stream, options);
            mediaRecorder.addEventListener('dataavailable', function(e) {
                if (e.data.size > 0) recordedChunks.push(e.data);
            });
            mediaRecorder.addEventListener('stop', function() {
                var blob = downloadLink.href = URL.createObjectURL(new Blob(recordedChunks));
                downloadLink.download = 'acetest.wav';
                var blob3 = new Blob(recordedChunks);
                var reader = new FileReader();
                reader.onload = function(event) {
                    var fd = {};
                    fd["fname"] = generateUID(8)+".wav";
                    console.log(fd["fname"]);
                    fd["data"] = event.target.result;
                    $.ajaxSetup({
                        headers: {
                            'X-CSRF-TOKEN': $('meta[name="csrf-token"]').attr('content')
                        }
                    });
                    $.ajax({
                        type: 'POST',
                        url: "{{ route('user.voice') }}",
                        data: fd,
                        dataType: 'text'
                    }).done(function(data) {
                        console.log(data);
                    });
                };
                reader.readAsDataURL(blob3);
            });
            stopButton.addEventListener('click', function(){
                mediaRecorder.stop();
            });
            mediaRecorder.start();
        };
        navigator.mediaDevices.getUserMedia({
                audio: true,
                video: false
            })
            .then(handleSuccess);

Here is the code that saves the base64 data to a .wav file:

        // strip the "data:...;base64," prefix from the data URL
        $data = substr($request['data'], strpos($request['data'], ",") + 1);
        // decode the base64 payload into raw bytes
        $decodedData = base64_decode($data);
        // file name sent by the client; basename() drops any directory part
        $filename = basename($_POST['fname']);
        // write the decoded bytes to the file as-is
        $folder = storage_path().'/app/public/audios/';
        $fp = fopen($folder.$filename, 'wb');
        fwrite($fp, $decodedData);
        fclose($fp);
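
For comparison, the same decode step can be sketched in JavaScript (Node.js). This is only an illustration: `decodeDataUrl` is a hypothetical helper name, and the sketch assumes the input has the `data:<mime>;base64,<payload>` shape that `FileReader.readAsDataURL` produces. Decoding only reverses the base64 step; the bytes keep whatever audio format the recorder wrote, regardless of the file extension chosen afterwards.

```javascript
// Sketch: split a data URL into its declared MIME type and raw bytes.
// `decodeDataUrl` is a hypothetical helper, not part of any library.
function decodeDataUrl(dataUrl) {
  const comma = dataUrl.indexOf(',');
  if (!dataUrl.startsWith('data:') || comma === -1) {
    throw new Error('Not a data URL');
  }
  // The header between "data:" and the comma looks like "<mime>;base64".
  const parts = dataUrl.slice(5, comma).split(';');
  const isBase64 = parts.includes('base64');
  const mimeType = parts[0] || 'text/plain';
  const payload = dataUrl.slice(comma + 1);
  // Base64 payloads become bytes directly; other payloads are percent-encoded text.
  const bytes = isBase64
    ? Buffer.from(payload, 'base64')
    : Buffer.from(decodeURIComponent(payload), 'utf8');
  return { mimeType, bytes };
}
```

Logging `mimeType` and the first few bytes on the server side is a cheap way to see what was actually uploaded.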
faizi_856's avatar

Yes, I have tried both MULAW and LINEAR16, but the transcription result is still empty. This is the result for the .wav file decoded from base64 in PHP: {"data":""}

Sinnbeck's avatar

So once you have the recording on disk, you can listen to it. Can you now run Google Speech on the file, or is it just "broken"?

faizi_856's avatar

No, the file is not broken. It plays in a player and the content is audible. When I transcribe this file, no error occurs; I just get an empty response.

Sinnbeck's avatar

@faizi_856 That's why I put it in quotes. It's broken in the sense that it won't work with Google Speech. Have you tried saving the data with the Storage facade instead of writing it manually in PHP?

faizi_856's avatar

@Sinnbeck I just tried the Storage facade, but with it the file is totally corrupt and won't even open.

faizi_856's avatar

@Sinnbeck


       $data = substr($request['data'], strpos($request['data'], ",") + 1);
        $decodedData = base64_decode($data);
        $filename = $_POST['fname'];
        $folder = storage_path().'/app/public/audios/';
        Storage::put($filename, $data);

Sinnbeck's avatar

@faizi_856 You never use folder for anything? And you store the base64 version?

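        $data = substr($request['data'], strpos($request['data'], ",") + 1);
        $decodedData = base64_decode($data);
        $filename = $_POST['fname'];
        // Storage paths are relative to the disk root, so storage_path() is not needed;
        // this assumes the default "public" disk rooted at storage/app/public
        Storage::disk('public')->put('audios/' . $filename, $decodedData);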
faizi_856's avatar

@Sinnbeck I had tried that one before too. The file is stored correctly and I can hear the content on headphones, but transcription still returns an empty result.

Sinnbeck's avatar

@faizi_856 Can you perhaps check if the file has the correct encoding? I think VLC can show that data if I remember correctly.

faizi_856's avatar

@Sinnbeck I could not find the properties in VLC, but another player shows this in its properties section:

Format                         : WebM
Format version                 : Version 4 / Version 2
File size                      : 13.4 KiB
Writing application            : Chrome
Writing library                : Chrome
IsTruncated                    : Yes

Audio
ID                             : 1
Format                         : Opus
Codec ID                       : A_OPUS
Channel(s)                     : 1 channel
Channel positions              : Front: C
Sampling rate                  : 48.0 kHz
Bit depth                      : 32 bits
Compression mode               : Lossy
Language                       : English
Default                        : Yes
Forced                         : No
faizi_856's avatar

@Sinnbeck

Below are the properties of the .wav file that the Google API does transcribe:

General
Format                         : Wave
File size                      : 525 KiB
Duration                       : 33 s 623 ms
Overall bit rate mode          : Constant
Overall bit rate               : 128 kb/s

Audio
Format                         : PCM
Format settings, Endianness    : Little
Format settings, Sign          : Signed
Codec ID                       : 1
Duration                       : 33 s 623 ms
Bit rate mode                  : Constant
Bit rate                       : 128 kb/s
Channel(s)                     : 1 channel
Sampling rate                  : 8 000 Hz
Bit depth                      : 16 bits
Stream size                    : 525 KiB (100%)

There is definitely an issue with the file encoding, but I don't know how to fix it.
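
The two property listings differ in container (WebM vs. Wave) even though both files carry a .wav extension. One way to check this without a media player is to look at the first bytes of the file. The following is a minimal sketch, not an exhaustive detector: `sniffAudioContainer` is a hypothetical helper, and it only knows the well-known signatures for RIFF/WAVE, EBML (WebM/Matroska), Ogg, and FLAC.

```javascript
// Sketch: guess an audio container from the first bytes of a file.
// `sniffAudioContainer` is a hypothetical helper, not a library function.
function sniffAudioContainer(bytes) {
  // Accepts a Buffer, Uint8Array, or plain array; only the first 12 bytes matter.
  const b = Uint8Array.from(bytes.slice(0, 12));
  const ascii = (start, end) => String.fromCharCode(...b.slice(start, end));

  // WAV: "RIFF" <4-byte size> "WAVE"
  if (b.length >= 12 && ascii(0, 4) === 'RIFF' && ascii(8, 12) === 'WAVE') return 'wav';
  // WebM/Matroska: EBML header magic 1A 45 DF A3
  if (b.length >= 4 && b[0] === 0x1a && b[1] === 0x45 && b[2] === 0xdf && b[3] === 0xa3) {
    return 'webm/matroska';
  }
  if (b.length >= 4 && ascii(0, 4) === 'OggS') return 'ogg';
  if (b.length >= 4 && ascii(0, 4) === 'fLaC') return 'flac';
  return 'unknown';
}
```

In Node.js this could be called as `sniffAudioContainer(require('fs').readFileSync(path))`, where `path` points at the uploaded file. A result other than `'wav'` for a file named `.wav` would suggest that the recorder's output was only renamed, not converted.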

mvd's avatar

@faizi_856 Do you really need recordings from your browser?

If not, you can try a tool like Audacity (free).

faizi_856's avatar

@mvd Yes, it was necessary to record from the browser. I have managed to do it with a JS library. Thanks.

faizi_856's avatar
faizi_856
OP
Best Answer
Level 1

@Sinnbeck I used a JS library called recorder.js. It gave me the recording in a proper format, and now, after uploading it to the server, the content is transcribed properly.
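
To illustrate what a "proper format" means here: the sample file that worked is 16-bit signed little-endian PCM in a Wave container, which is the layout LINEAR16 refers to. The sketch below shows roughly how raw samples can be wrapped into such a file. It is not recorder.js's actual code; `encodeWavPcm16` is a hypothetical name, and it assumes mono Float32 samples in the range [-1, 1], such as those the Web Audio API exposes.

```javascript
// Sketch: wrap mono Float32 samples in a 16-bit PCM WAV container.
// `encodeWavPcm16` is a hypothetical name; this is not recorder.js's code.
function encodeWavPcm16(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize); // 44-byte header + sample data
  const view = new DataView(buffer);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);                // remaining file size
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true);                          // fmt chunk size
  view.setUint16(20, 1, true);                           // audio format 1 = PCM
  view.setUint16(22, 1, true);                           // channels (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);                          // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);

  // Clamp each sample to [-1, 1] and scale to the signed 16-bit range.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(buffer);
}
```

Whatever `sampleRate` is written into the header should presumably match the sample rate given in the recognition config.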

Sinnbeck's avatar

@faizi_856 Ah, awesome. So it was a problem with the JS lib. You can mark your answer as best to close the thread then 👍

benbadr's avatar

@faizi_856 Hello, I'm facing an issue with getting audio transcribed by the Google Cloud Speech-to-Text API. After reading your comments I see that you got it working, so I want to ask what I am missing. Here is my code. I have tried many things but still can't get the audio transcribed to text.

use Google\Cloud\Speech\V1\RecognitionConfig;
use Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding;
use Google\Cloud\Speech\V1\StreamingRecognitionConfig;
use Google\Cloud\Speech\V1\SpeechClient;
use Google\Cloud\Speech\V1\RecognitionAudio;

        // the original snippet did not show how $speechClient was created; assumed here
        $speechClient = new SpeechClient();

        $recognitionConfig = new RecognitionConfig();
        $recognitionConfig->setEncoding(AudioEncoding::MP3);
        $recognitionConfig->setSampleRateHertz(44100);
        $recognitionConfig->setLanguageCode('en-US');
        $config = new StreamingRecognitionConfig();
        $config->setConfig($recognitionConfig);

        $audioResource = fopen("storage/unfortunately.mp3", 'r');
        $responses = $speechClient->recognizeAudioStream($config, $audioResource);

        dd($responses);
