POST /transcripts

Transcribe an audio file from a given URL. The API downloads the audio found at the URL and transcribes it on our crowdsourcing platform.

JSON Parameters

Parameter: audioSrcUrl
Example: http://foo.com/bar.mp3
Required: Yes
Description: URL of the audio file to be transcribed. This must be a publicly accessible URL; a temporarily public URL is also acceptable.

Parameter: asrTranscriptUrl
Example: http://foo.com/bar.json
Required: No
Description: URL of a transcript file generated by an ASR system, used as input to human transcription. The file must be in Amazon Transcribe JSON format.

Parameter: tags
Example: ["test26", "test1"]
Required: No
Description: Tags add context to a transcription or keep track of topics in your pipeline. Use them to categorize, prioritize, or diagnose transcription progress according to the needs of your projects or customers.

Parameter: speakerCount
Example: n
Required: No
Description: Number of speakers in the audio. If you are transcribing dual-channel (stereo) audio and know there is exactly one speaker on each channel (for example, a phone call), setting speakerCount to 2 enables speaker labels. If you are unsure of the speaker count, give your highest estimate.

Parameter: languageCode
Example: en
Required: Yes
Description: The language code of the language spoken in the input media file, for example en or es. Country codes that indicate an accent, such as US or UK, are not accepted.

Parameter: options
Example:
{"formatText": false}
{"truncatedWords": false}
{"overlapedVoices": false}
{"nonAudible": false}
{"nonVerbal": false}
{"phoneticWriting": false}
{"interjections": false}
Required: No
Description: Optional settings that control how the transcript is produced.

Setting formatText to false disables punctuation and casing.

Setting truncatedWords to false disables annotation of truncated words.

Setting overlapedVoices to false disables annotation of overlapping voices.

Setting nonAudible to false disables annotation of inaudible speech fragments.

Setting nonVerbal to false disables annotation of non-verbal human noises such as laughter.

Setting phoneticWriting to false disables annotation of words written phonetically.

Setting interjections to false disables transcription of interjections such as "uhm" and "er".
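
As a minimal sketch of how a client might assemble and sanity-check this JSON body before sending it, the Python below builds the payload from the parameters described above. The helper name build_transcript_payload and both client-side checks are illustrative assumptions, not part of the API; the server remains the authority on what it accepts.

```python
import json

# Option names as spelled in the option descriptions above.
KNOWN_OPTIONS = {
    "formatText", "truncatedWords", "overlapedVoices", "nonAudible",
    "nonVerbal", "phoneticWriting", "interjections",
}


def build_transcript_payload(audio_src_url, language_code, *,
                             asr_transcript_url=None, tags=None,
                             speaker_count=None, options=None):
    """Return a dict for the POST /transcripts JSON body (hypothetical helper)."""
    # Assumed client-side check: country/accent codes are not accepted,
    # so reject anything that is not plain lowercase letters ("en-US", "US").
    if not (language_code.isalpha() and language_code.islower()):
        raise ValueError(f"languageCode must be a bare language code: {language_code!r}")

    payload = {"audioSrcUrl": audio_src_url, "languageCode": language_code}
    if asr_transcript_url is not None:
        payload["asrTranscriptUrl"] = asr_transcript_url
    if tags:
        payload["tags"] = list(tags)
    if speaker_count is not None:
        payload["speakerCount"] = int(speaker_count)
    if options:
        # Assumed client-side check: catch misspelled option names early.
        unknown = set(options) - KNOWN_OPTIONS
        if unknown:
            raise ValueError(f"unknown options: {sorted(unknown)}")
        payload["options"] = {name: bool(value) for name, value in options.items()}
    return payload


payload = build_transcript_payload(
    "AUDIO_URL", "es",
    tags=["sample", "project-1"],
    speaker_count=2,
    options={"formatText": False, "truncatedWords": False},
)
print(json.dumps(payload, indent=2))
```

Optional parameters that are not supplied are left out of the body entirely rather than sent as null, which mirrors how the parameters are described as optional.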

Example: Create a transcript and disable text formatting and truncated words annotation.

curl --request POST \
  --url 'https://api.atext.io/transcripts' \
  --header 'Authorization: Basic AUTHORIZATION_TOKEN' \
  --data '
  {
    "audioSrcUrl": "AUDIO_URL",
    "tags": ["sample", "project-1"],
    "speakerCount": 2,
    "languageCode": "es",
    "options": 
    {
      "formatText": false,
      "truncatedWords": false
    }
  }'

Expected Response

{
  "uuid": "16a54748-da46-4080-b4a1-fa18a6b223cf",
  "status": "started",
  "audioSrcUrl": "AUDIO_URL",
  "language": "es",
  "tags": [
    "sample",
    "project-1"
  ],
  "speakerCount": 2,
  "options": {
    "formatText": false,
    "truncatedWords": false
  },
  "createdAt": "2018-12-08T01:25:49.031Z"
}
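
The same call can be issued from Python's standard library. This is a sketch under assumptions: it reuses the endpoint and Basic authorization header from the curl example, adds a Content-Type: application/json header (an assumption; the curl example does not send one), and the function names make_create_request and create_transcript are hypothetical. Building the request is kept separate from sending it so the body and headers can be inspected first.

```python
import json
import urllib.request

API_URL = "https://api.atext.io/transcripts"  # endpoint from the curl example


def make_create_request(payload, token):
    """Build (but do not send) the POST /transcripts request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",  # assumption, not in the curl example
        },
    )


def create_transcript(payload, token):
    """Send the request and return the decoded JSON response as a dict."""
    request = make_create_request(payload, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A caller would then keep the identifier from the response for later lookups, for example transcript_id = create_transcript(payload, "AUTHORIZATION_TOKEN")["uuid"].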

When you create a transcript, its status goes from started to processing to ready. Processing normally takes under 24 hours.

To retrieve the finished transcript, poll for the transcript ID with GET requests.
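
A polling loop for this status progression might look like the sketch below. The GET endpoint is not documented in this section, so the lookup is passed in as a function: fetch_transcript is a hypothetical callable that takes the transcript ID and returns the transcript JSON as a dict. The polling interval and the timeout are arbitrary assumptions.

```python
import time


def wait_until_ready(fetch_transcript, transcript_id, *,
                     interval_s=60.0, timeout_s=24 * 3600, sleep=time.sleep):
    """Poll until the transcript status is "ready", then return the transcript."""
    waited = 0.0
    while True:
        transcript = fetch_transcript(transcript_id)
        status = transcript.get("status")
        if status == "ready":
            return transcript
        # Only the two in-progress statuses named above are treated as "keep waiting".
        if status not in ("started", "processing"):
            raise RuntimeError(f"unexpected status: {status!r}")
        if waited >= timeout_s:
            raise TimeoutError(f"transcript {transcript_id} not ready after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays; in normal use the default time.sleep applies.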
