Create a note from an audio file using a three-step process: generate a presigned URL for the upload, upload the audio file to that URL, then create the note from the uploaded audio.
This feature is available for Enterprise plans only. Both endpoints require Enterprise-level API keys.

Step 1: Generate Upload URL

POST /v1/notes/from-audio/generate-upload-url

Generate a presigned URL for uploading an audio file to our servers.

Request Body

fileExtension (string, required): File extension of the audio file. Options: m4a, mp3, wav, aac
duration (number, required): Duration of the audio file in seconds (maximum 4 hours / 14,400 seconds)
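
The two request fields can be checked locally before calling the endpoint. The following is a minimal sketch, not part of the API: the function name validate_upload_request is hypothetical, and restricting duration to whole seconds is an assumption of this sketch (the field is typed as a number, so fractional values may be accepted by the API).

```shell
# Hypothetical helper: checks Step 1 inputs locally before any API call.
MAX_DURATION_SECONDS=14400  # 4 hours, per the duration field description

validate_upload_request() {
  local extension="$1" duration="$2"

  # Only the four documented extensions are accepted.
  case "$extension" in
    m4a|mp3|wav|aac) ;;
    *) echo "unsupported file extension: $extension" >&2; return 1 ;;
  esac

  # Assumption of this sketch: duration is given in whole seconds.
  case "$duration" in
    ''|*[!0-9]*) echo "duration must be a whole number of seconds" >&2; return 1 ;;
  esac

  if [ "$duration" -gt "$MAX_DURATION_SECONDS" ]; then
    echo "duration exceeds $MAX_DURATION_SECONDS seconds" >&2
    return 1
  fi
}
```

For example, validate_upload_request m4a 1800 succeeds, while validate_upload_request ogg 60 prints an error and returns a non-zero status.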

Response

presignedUrl (string): The presigned URL for uploading the audio file to S3. This URL expires in 10 minutes.
audioUrl (string): The final URL where the audio file will be accessible after upload. Use this URL in Step 2.
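
Because the presigned URL expires 10 minutes after it is generated, a client that queues uploads may want to check whether a URL is still usable before starting. The sketch below is one way to do that; the function name presigned_url_is_fresh and the 30-second safety margin are assumptions of this example, and the caller is assumed to record the time at which it requested the URL.

```shell
PRESIGNED_URL_TTL_SECONDS=600  # 10 minutes, per the presignedUrl description

# Succeeds while a URL requested at epoch time $1 is still usable.
# $2 is the current epoch time (defaults to now); $3 is a safety margin in
# seconds (hypothetical default of 30) to leave time for the upload to start.
presigned_url_is_fresh() {
  local issued_at="$1" now="${2:-$(date +%s)}" margin="${3:-30}"
  [ $((now - issued_at + margin)) -lt "$PRESIGNED_URL_TTL_SECONDS" ]
}
```

If the check fails, the simplest recovery is to request a new URL from the generate-upload-url endpoint.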

Example Request

curl -X POST 'https://api.caret.so/v1/notes/from-audio/generate-upload-url' \
  -H 'Authorization: Bearer {api_key}' \
  -H 'Content-Type: application/json' \
  -d '{
    "fileExtension": "m4a",
    "duration": 1800
  }'

Example Response

{
  "presignedUrl": "https://caret-cdn.s3.ap-northeast-2.amazonaws.com/temp/note_upload_20241225_143052_abc123.m4a?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=...",
  "audioUrl": "https://caret-cdn.s3.ap-northeast-2.amazonaws.com/temp/note_upload_20241225_143052_abc123.m4a"
}

Step 2: Upload Audio File

Use the presignedUrl from Step 1 to upload your audio file directly to our servers with an HTTP PUT request. Complete the upload before the URL expires (10 minutes after generation).

Example Upload

curl -X PUT '{presignedUrl}' \
  -H 'Content-Type: audio/m4a' \
  --data-binary '@/path/to/your/audio.m4a'
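
The curl command above does not report whether the upload succeeded. A minimal sketch of a wrapper that checks the HTTP status is shown below; the function name upload_audio is hypothetical, and the default Content-Type simply reuses the audio/m4a value from the example (the right value for other formats is not specified here, so it is passed in as a parameter).

```shell
# Uploads a file to a presigned URL and fails unless the response is 2xx.
# Arguments: presigned URL, file path, optional Content-Type
# (defaults to audio/m4a, the value used in the example above).
upload_audio() {
  local presigned_url="$1" file_path="$2" content_type="${3:-audio/m4a}" http_status

  if [ ! -r "$file_path" ]; then
    echo "cannot read audio file: $file_path" >&2
    return 1
  fi

  # -s hides the progress meter, -o discards the response body,
  # -w prints only the HTTP status code.
  http_status=$(curl -s -o /dev/null -w '%{http_code}' -X PUT "$presigned_url" \
    -H "Content-Type: $content_type" \
    --data-binary "@$file_path") || return 1

  case "$http_status" in
    2??) return 0 ;;
    *) echo "upload failed with HTTP status $http_status" >&2; return 1 ;;
  esac
}
```

A call such as upload_audio "$PRESIGNED_URL" /path/to/your/audio.m4a can then gate the note-creation request on a successful upload.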

Step 3: Create Note from Audio

POST /v1/notes/from-audio

Create a note from the uploaded audio file using the audioUrl from Step 1.

Request Body

audioUrl (string, required): The audio URL returned from Step 1 (generate upload URL)
title (string): Optional title for the note. If not provided, a title will be generated automatically.
tags (array): Array of tag IDs to associate with the note
kind (string): Type of meeting. Options: online, in-person, podcast. Default: in-person
existingNoteId (string): Optional existing note ID to update instead of creating a new note
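
When the title comes from user input, interpolating it directly into a JSON string can produce an invalid request body. The sketch below builds the body with basic escaping; the function names json_escape and build_note_body are hypothetical, only audioUrl, title, and kind are covered (tags and existingNoteId are left out for brevity), and the escaping handles backslashes and double quotes only, not control characters such as newlines.

```shell
# Escapes backslashes and double quotes so a value can sit inside a JSON string.
# Simplification: control characters such as newlines are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Builds the request body from an audio URL, an optional title, and an
# optional kind (online, in-person, or podcast). Empty values are omitted.
build_note_body() {
  local audio_url="$1" title="$2" kind="$3" body

  body="{\"audioUrl\":\"$(json_escape "$audio_url")\""
  if [ -n "$title" ]; then
    body="$body,\"title\":\"$(json_escape "$title")\""
  fi
  if [ -n "$kind" ]; then
    case "$kind" in
      online|in-person|podcast) body="$body,\"kind\":\"$kind\"" ;;
      *) echo "unsupported kind: $kind" >&2; return 1 ;;
    esac
  fi
  printf '%s}\n' "$body"
}
```

The output can be passed to curl with -d "$(build_note_body "$AUDIO_URL" 'Weekly Team Meeting' online)".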

Response

noteId (string): The ID of the created note

Example Request

curl -X POST 'https://api.caret.so/v1/notes/from-audio' \
  -H 'Authorization: Bearer {api_key}' \
  -H 'Content-Type: application/json' \
  -d '{
    "audioUrl": "https://caret-cdn.s3.ap-northeast-2.amazonaws.com/temp/note_upload_20241225_143052_abc123.m4a",
    "title": "Weekly Team Meeting",
    "tags": ["01887270-23f5-7da0-b95c-9a9e9ebc3c25"],
    "kind": "online"
  }'

Example Response

{
  "noteId": "01887270-12d4-7da0-b95c-9a9e9ebc3b13"
}

Complete Workflow Example

Here’s a complete example of the audio-to-note conversion process:
# Step 1: Generate upload URL
UPLOAD_RESPONSE=$(curl -s -X POST 'https://api.caret.so/v1/notes/from-audio/generate-upload-url' \
  -H 'Authorization: Bearer {api_key}' \
  -H 'Content-Type: application/json' \
  -d '{
    "fileExtension": "m4a",
    "duration": 1800
  }')

# Extract URLs from response
PRESIGNED_URL=$(echo "$UPLOAD_RESPONSE" | jq -r '.presignedUrl')
AUDIO_URL=$(echo "$UPLOAD_RESPONSE" | jq -r '.audioUrl')

# Step 2: Upload audio file
curl -X PUT "$PRESIGNED_URL" \
  -H 'Content-Type: audio/m4a' \
  --data-binary '@/path/to/your/audio.m4a'

# Step 3: Create note from audio
curl -X POST 'https://api.caret.so/v1/notes/from-audio' \
  -H 'Authorization: Bearer {api_key}' \
  -H 'Content-Type: application/json' \
  -d "{
    \"audioUrl\": \"$AUDIO_URL\",
    \"title\": \"Weekly Team Meeting\",
    \"kind\": \"online\"
  }"

Processing Status

After you create a note from audio, the note passes through three processing stages in order:
  1. transcript-processing: The audio is being transcribed
  2. summarizing: The transcript is being summarized and processed
  3. completed: The note is fully processed and ready
You can check the processing status by getting the note details using the returned noteId.
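
A client can poll until the note reaches the completed stage. The sketch below shows one possible polling loop; wait_for_note is a hypothetical name, and the status lookup is passed in as a command because the note-details endpoint and its response shape are not described in this section. The attempt limit and delay defaults are arbitrary choices for illustration.

```shell
# Polls a status command until the note is completed.
# $1: name of a command or function that prints the note's current status
#     (hypothetical; supply your own call to the note-details endpoint)
# $2: maximum number of checks (default 60)
# $3: delay between checks in seconds (default 5)
# Returns 0 when completed, 1 on timeout, 2 on a lookup error or unknown status.
wait_for_note() {
  local fetch_status="$1" max_attempts="${2:-60}" delay_seconds="${3:-5}"
  local attempt=1 note_status

  while [ "$attempt" -le "$max_attempts" ]; do
    note_status=$("$fetch_status") || return 2
    case "$note_status" in
      completed) echo "$note_status"; return 0 ;;
      transcript-processing|summarizing) sleep "$delay_seconds" ;;
      *) echo "unexpected status: $note_status" >&2; return 2 ;;
    esac
    attempt=$((attempt + 1))
  done

  echo "note still processing after $max_attempts checks" >&2
  return 1
}
```

Since processing time depends on audio length, longer recordings may need a higher attempt limit or a longer delay than these defaults.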

Important Notes

  • Audio files must be no longer than 4 hours (14,400 seconds)
  • Supported formats: M4A, MP3, WAV, AAC
  • The presigned URL expires in 10 minutes after generation
  • Processing time depends on audio length and quality
  • Enterprise plan required for this feature