Configure Google Cloud Speech-to-Text API

Overview

The Speech-to-Text API enables easy integration of Google speech recognition technologies into developer applications. It allows you to send audio and receive a text transcription from the service.

What we’ll cover
In this lab, you will learn how to:

Create an API key
Create a Speech-to-Text API request
Call the Speech-to-Text API

Step 1: Create an API Key

In the Google Cloud Console, navigate to Navigation menu > APIs & services > Credentials.
Click on Create credentials and select API key.
Copy the generated key and click Close.

Save API Key as Environment Variable

Connect to your VM instance via SSH.
At the command line, set the environment variable, replacing <YOUR_API_KEY> with the key you copied:

export API_KEY=<YOUR_API_KEY>
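Before moving on, it can help to confirm that the variable is actually set in the current SSH session. The following is a minimal sketch, not part of the original lab; the function name require_api_key is hypothetical.

```shell
# Hypothetical helper: fail early if API_KEY is missing from the environment.
require_api_key() {
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set; run: export API_KEY=<YOUR_API_KEY>" >&2
    return 1
  fi
  # Print only the length so the key itself never appears in the terminal.
  echo "API_KEY is set (${#API_KEY} characters)"
}
```

Calling require_api_key after the export should report the key length. A non-zero exit status means the export did not take effect in this shell.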

Step 2: Create Your Speech-to-Text API Request

Create a new file named request.json:

touch request.json

Open the file in a text editor and add the following JSON configuration, specifying the audio file’s URI:

{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
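If you prefer not to open an editor, the same file can be written from the command line. This is a sketch of one alternative, using a shell here-document with the request body shown above. The validation step assumes python3 is installed on the VM, which may not be true of every image.

```shell
# Write the request body shown above to request.json in one step.
cat > request.json <<'EOF'
{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
EOF

# Assumption: python3 is available; json.tool exits non-zero on invalid JSON.
python3 -m json.tool request.json > /dev/null && echo "request.json is valid JSON"
```

Quoting the EOF delimiter stops the shell from expanding anything inside the JSON, so the file is written exactly as typed.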

Step 3: Call the Speech-to-Text API

Run the following curl command to send request.json to the API:

curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json "https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}"

The JSON response contains a results array. Each result lists one or more alternatives, each with a transcript and a confidence score between 0.0 and 1.0.

Save Response to a File

curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json "https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}" > result.json
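Once result.json exists, the transcript can be pulled out of it without reading the raw JSON. The sketch below assumes python3 is installed on the VM and that the response follows the results[].alternatives[] layout of the v1 recognize method. The function name print_transcripts is hypothetical.

```shell
# Hypothetical helper: print each transcript and its confidence from a saved response.
# Assumption: python3 is installed on the VM.
print_transcripts() {
  python3 - "$1" <<'PY'
import json
import sys

with open(sys.argv[1]) as f:
    response = json.load(f)

# With curl -s, a failed call (for example, a bad key) still writes a JSON
# body, but it holds an "error" object instead of "results".
if "error" in response:
    print("API error:", response["error"].get("message"), file=sys.stderr)
    sys.exit(1)

for result in response.get("results", []):
    for alt in result.get("alternatives", []):
        print(f'{alt.get("transcript", "")}\t{alt.get("confidence")}')
PY
}
```

Running print_transcripts result.json prints one tab-separated line per alternative. A non-zero exit status indicates that the API returned an error instead of a transcription.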

Conclusion

Congratulations! You have successfully used the Speech-to-Text API to transcribe an audio file. This hands-on lab demonstrated how to create an API key, construct a request, and call the Speech-to-Text service.
