miliclub.blogg.se - Azure speech to text read audio


Setting the api_location property in the GitHub action step to an empty string keeps the static web app from deploying functions of its own. Instead, we link an external function to the static web app. Click the Link to a Function app link, select the translator function, then click the Link button. The external function is then linked to the static web app and exposed under the same hostname.

With the function linked to the static app, open the static app's public URL, record an audio file, and transcribe it. The web app contacts the backend on the relative URL /app, now available thanks to the "bring your own function" feature.

Bringing Your Own Function

Use the "bring your own function" feature of static web apps to expose the function app under the same hostname as the static web app. First, open the static website resource in Azure and select the Functions link in the left-hand menu. The static web app doesn't expose any functions.

Create the sample application with Java 11 rather than Java 8, which is the version the Microsoft documentation specifies.

The application adds these Maven dependencies (artifact ID and version):

    client-sdk 1.19.0
    okhttp 4.9.3
    commons-io 2.11.0
    commons-text 1.9

Working with Compressed Audio Files

The first challenge of transcribing audio files is that the Azure APIs only natively support WAV files. However, audio files recorded by a browser will almost certainly be in a compressed format like WebM.

Fortunately, the Azure SDK does allow converting compressed audio files using the GStreamer library. This is why GStreamer is one of our backend app's prerequisites.

To use compressed audio files, we extend the PullAudioInputStreamCallback class to provide a reader that consumes a compressed audio file byte array with no additional processing. We do this using the ByteArrayReader class.

The backend is packaged as a Docker image using this Dockerfile:

    # This image additionally contains function core tools – useful when using custom extensions
    #FROM /azure-functions/java:3.0-java$JAVA_VERSION-core-tools AS installer-env
    FROM /azure-functions/java:3.0-java$JAVA_VERSION-build AS installer-env

    FROM /azure-functions/java:3.0-java$JAVA_VERSION-appservice
    #FROM /azure-functions/java:3.0-java$JAVA_VERSION

    ENV AzureWebJobsScriptRoot=/home/site/wwwroot \
        AzureFunctionsJobHost__Logging__Console__IsEnabled=true

    COPY --from=installer-env

We build this Docker image with docker build, replacing the dockerhubuser placeholder in the image name with a Docker Hub user name.

The speech region is supplied to the function app as an application setting:

    az functionapp config appsettings set --name yourappname --resource-group yourresourcegroup --settings "SPEECH_REGION=yourspeechregion"

Once deployed, the Azure function is available on its own domain name. This URL is acceptable, but it would be even better to expose the function app under the same hostname as the static web app. This approach would allow the static web app to interact with the functions using relative URLs rather than hardcoding external hostnames.
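The ByteArrayReader mentioned in the compressed-audio discussion could look roughly like the following. This is a minimal sketch, not the author's code: in the function app the class would be declared as extending PullAudioInputStreamCallback from the Speech SDK, whose abstract methods are read(byte[]) and close(). The parent class is left out here so the example compiles without the SDK on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch only. In the real app this would be declared as
// "extends PullAudioInputStreamCallback" (Speech SDK); that parent
// is omitted so the example has no external dependencies.
final class ByteArrayReader {
    private final InputStream input;

    ByteArrayReader(byte[] audio) {
        // The compressed bytes are handed over untouched; decoding is
        // left to the SDK and GStreamer.
        this.input = new ByteArrayInputStream(audio);
    }

    // Copies up to dataBuffer.length bytes into dataBuffer and returns the
    // number of bytes copied, or 0 once the audio is exhausted.
    public int read(byte[] dataBuffer) {
        try {
            int count = input.read(dataBuffer, 0, dataBuffer.length);
            return Math.max(count, 0); // InputStream reports end of stream as -1
        } catch (IOException e) {
            return 0; // treat a failed read as end of stream
        }
    }

    public void close() {
        try {
            input.close();
        } catch (IOException ignored) {
            // nothing useful to do for an in-memory stream
        }
    }
}
```

In the Speech SDK, a callback like this is typically passed to AudioInputStream.createPullStream together with a compressed AudioStreamFormat, and the resulting stream is wrapped in an AudioConfig for the recognizer.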
The Azure Speech service provides the ability to convert speech to text. Microsoft's documentation provides instructions on creating a speech service in Azure. After creating the service, take note of the key, which the application needs to interact with the service.

The backend API will serve as a proxy between the frontend web app and the Azure speech service. The Microsoft documentation provides instructions for creating the sample project that forms the base for this tutorial.
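Because the backend sits between the browser and the Speech service, it has to obtain the service key and region from somewhere outside the source code. One possible way to do that is to read them from application settings, which Azure Functions exposes as environment variables. The sketch below assumes the setting names SPEECH_KEY and SPEECH_REGION; SPEECH_KEY in particular is a hypothetical name, and the SpeechSettings class is illustrative only.

```java
import java.util.Map;

// Hypothetical holder for the two values needed to call the Speech service.
final class SpeechSettings {
    final String key;
    final String region;

    private SpeechSettings(String key, String region) {
        this.key = key;
        this.region = region;
    }

    // Builds the settings from a map of environment variables, for example
    // System.getenv(). The setting names are assumptions for this sketch.
    static SpeechSettings fromEnvironment(Map<String, String> env) {
        return new SpeechSettings(
                require(env, "SPEECH_KEY"),
                require(env, "SPEECH_REGION"));
    }

    // Fails fast with a clear message when a setting is missing or blank.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing app setting: " + name);
        }
        return value.trim();
    }
}
```

Calling SpeechSettings.fromEnvironment(System.getenv()) once at startup keeps the key out of the code and reports a missing setting early, rather than on the first transcription request.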


This tutorial requires the Azure Functions runtime version 4 and GStreamer, which converts WebM audio files into a form the Azure APIs can process. Our backend application also requires the Java 11 JDK. This application's complete source code is available on GitHub, and the backend application is available as the Docker image mcasperson/translator.

The frontend web app exposed a wizard to record speech in the browser, transcribe it, translate it, then convert the resulting text back to audio. However, the frontend web app doesn't contain the logic required to process audio or text. A backend API, written in Java as an Azure function app, handles this logic. This tutorial will build the first part of the API, exposing the /transcribe endpoint to convert an audio file into text.
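The job of the /transcribe endpoint can be pictured as three steps: accept the uploaded audio bytes, pass them to a transcriber, and return the text. The following framework-free sketch illustrates that flow under stated assumptions. The Azure Functions HTTP binding and the Speech SDK call are deliberately left out, and TranscribeHandler, Response, and the transcriber function are hypothetical names rather than part of the tutorial's code.

```java
import java.util.function.Function;

// Framework-free sketch of the request handling behind /transcribe.
// All names here are hypothetical.
final class TranscribeHandler {

    // Minimal stand-in for an HTTP response: a status code and a text body.
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    // Stand-in for the code that sends audio bytes to the Speech service.
    private final Function<byte[], String> transcriber;

    TranscribeHandler(Function<byte[], String> transcriber) {
        this.transcriber = transcriber;
    }

    // Validates the uploaded audio, delegates to the transcriber, and maps
    // the outcome to an HTTP-style response.
    Response handle(byte[] audio) {
        if (audio == null || audio.length == 0) {
            return new Response(400, "Request body must contain an audio file");
        }
        try {
            return new Response(200, transcriber.apply(audio));
        } catch (RuntimeException e) {
            return new Response(500, "Transcription failed");
        }
    }
}
```

Keeping the Speech service call behind a plain function makes the handler easy to exercise with a stub, for example new TranscribeHandler(audio -> "hello world").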


The first part of this three-part series built a frontend web app for a universal translator.