POST /api/v1/client/txt2music
cURL
curl --request POST \
  --url https://api.deapi.ai/api/v1/client/txt2music \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'caption=upbeat electronic dance music with energetic synths' \
  --form model=ACE-Step-v1.5-turbo \
  --form 'lyrics=[Instrumental]' \
  --form duration=30 \
  --form inference_steps=8 \
  --form guidance_scale=7 \
  --form seed=-1 \
  --form format=flac \
  --form bpm=120 \
  --form 'keyscale=C major' \
  --form timesignature=4 \
  --form vocal_language=en \
  --form 'reference_audio=@example-file' \
  --form webhook_url=https://your-server.com/webhooks/deapi
{
  "data": {
    "request_id": "c08a339c-73e5-4d67-a4d5-231302fbff9a"
  }
}
Text-to-Music generates music tracks from text descriptions. You can control genre, tempo, key, and time signature, and supply lyrics. Optionally, upload a reference_audio file for style transfer; the model uses it as a stylistic reference for the generated track. The endpoint returns a request_id that you use to track processing status. It suits apps that need automated music creation, such as background tracks, jingles, or full songs with vocals.
Prerequisite: Before sending a request, call the Model Selection endpoint (GET /api/v1/client/models) to find a valid model slug and that model's specific limits.
Reference audio requirements (optional):
  • Supported formats: MP3, WAV, FLAC, OGG, M4A
  • Maximum file size: 10 MB by default (configurable)
  • Duration must be within model-specific limits
This endpoint uses multipart/form-data content type to support file uploads.
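For clients that do not use cURL, the multipart request can be assembled by hand. The following is a minimal sketch using only the Python standard library; the helper names `encode_multipart` and `build_request` are hypothetical and not part of the API, and the example builds the request without sending it.

```python
import urllib.request
import uuid

API_URL = "https://api.deapi.ai/api/v1/client/txt2music"


def encode_multipart(fields, files=None):
    """Encode form fields and optional files as a multipart/form-data body.

    fields: mapping of field name -> value (converted with str()).
    files:  mapping of field name -> (filename, content_bytes, mime_type).
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        parts.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    for name, (filename, content, mime_type) in (files or {}).items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {mime_type}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    # Closing boundary terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(token, fields, files=None):
    """Build (but do not send) a POST request for the txt2music endpoint."""
    body, content_type = encode_multipart(fields, files)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
    )


# Required fields, using the example values from this page.
example_fields = {
    "caption": "upbeat electronic dance music with energetic synths",
    "model": "ACE-Step-v1.5-turbo",
    "lyrics": "[Instrumental]",
    "duration": 30,
    "inference_steps": 8,
    "guidance_scale": 7,
    "seed": -1,
    "format": "flac",
}
request = build_request("<token>", example_fields)
```

Passing `request` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without credentials.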

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

Accept
enum<string>
default: application/json
required
Available options:
application/json

Body

multipart/form-data

Music generation parameters

caption
string
required

Text description of the music to generate

Example:

"upbeat electronic dance music with energetic synths"

model
string
required

The model to use for music generation. Available models can be retrieved via the GET /api/v1/client/models endpoint.

Example:

"ACE-Step-v1.5-turbo"

lyrics
string
required

Lyrics for the music. Use "[Instrumental]" for instrumental tracks without vocals.

Example:

"[Instrumental]"

duration
number
required

Duration in seconds (10-600)

Example:

30

inference_steps
integer
required

Number of diffusion inference steps (1-100). Use 8 for turbo models, 32+ for base models.

Example:

8
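The step guidance above can be turned into a default. This is a sketch only: it assumes a turbo model can be recognized by the substring "turbo" in its slug, which this page does not guarantee, and the function name is hypothetical.

```python
def suggested_inference_steps(model_slug: str) -> int:
    """Return a starting value for inference_steps.

    Assumption: turbo models contain "turbo" in their slug. Check the
    model list returned by the API rather than relying on this heuristic.
    """
    return 8 if "turbo" in model_slug.lower() else 32
```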

guidance_scale
number
required

Classifier-free guidance scale (0-20)

Example:

7

seed
integer
required

Random seed. Use -1 for random.

Example:

-1

format
string
required

Audio output format

Example:

"flac"

bpm
integer | null

Beats per minute (30-300)

Example:

120

keyscale
string | null

Musical key/scale (e.g. "C major", "F# minor")

Example:

"C major"

timesignature
integer | null

Time signature. Must be 2, 3, 4, or 6.

Example:

4
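Checking the numeric parameters on the client side avoids a round trip for obviously invalid values. The sketch below assumes the documented ranges are inclusive; the function name is hypothetical, and model-specific limits may be stricter than these general bounds.

```python
def validate_music_params(params: dict) -> list:
    """Return a list of error messages; an empty list means no problems found."""
    errors = []

    def check_range(name, low, high, required):
        value = params.get(name)
        if value is None:
            if required:
                errors.append(f"{name} is required")
            return
        if not low <= value <= high:
            errors.append(f"{name} must be between {low} and {high}")

    # Required numeric parameters (ranges assumed inclusive).
    check_range("duration", 10, 600, required=True)
    check_range("inference_steps", 1, 100, required=True)
    check_range("guidance_scale", 0, 20, required=True)
    # Optional parameters may be omitted or None.
    check_range("bpm", 30, 300, required=False)

    time_signature = params.get("timesignature")
    if time_signature is not None and time_signature not in (2, 3, 4, 6):
        errors.append("timesignature must be 2, 3, 4, or 6")
    return errors
```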

vocal_language
string | null

Language code for vocals (e.g. "en", "es", "fr")

Example:

"en"

reference_audio
file | null

Optional reference audio file for style transfer. Supported formats: MP3, WAV, FLAC, OGG, M4A. Maximum size: 10 MB by default (configurable). Duration must be within model-specific limits.
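A reference file can be pre-checked before upload. The sketch below is illustrative only: it judges the format by file extension, treats 10 MB as 10 × 1024 × 1024 bytes (an assumption), and does not check the model-specific duration limit.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a"}
MAX_BYTES = 10 * 1024 * 1024  # assumption: default limit, 10 MB read as 10 MiB


def check_reference_audio(filename: str, size_bytes: int) -> list:
    """Return a list of problems with a candidate reference audio file."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the default 10 MB limit")
    return problems
```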

webhook_url
string<uri> | null

Optional URL that receives webhook notifications when the job status changes (processing, completed, failed). Must use HTTPS.

Maximum string length: 2048
Example:

"https://your-server.com/webhooks/deapi"
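The two stated constraints on webhook_url (HTTPS scheme, at most 2048 characters) can be checked locally. This is a minimal sketch with a hypothetical function name; the server may apply further checks that this page does not describe.

```python
from urllib.parse import urlparse


def is_valid_webhook_url(url: str) -> bool:
    """Check the documented constraints: HTTPS scheme and length <= 2048."""
    if len(url) > 2048:
        return False
    parsed = urlparse(url)
    # A usable URL also needs a host component.
    return parsed.scheme == "https" and bool(parsed.netloc)
```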

Response

ID of the inference request.

data
object

Response payload for a successful request. Contains request_id, the ID of the inference request.
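Reading the ID out of the response body might look like the sketch below. It assumes the body matches the sample response on this page (a `data` object with a `request_id` string); the function name is hypothetical.

```python
import json


def extract_request_id(response_body: str) -> str:
    """Return data.request_id from a JSON response body, or raise ValueError."""
    payload = json.loads(response_body)
    data = payload.get("data") if isinstance(payload, dict) else None
    request_id = data.get("request_id") if isinstance(data, dict) else None
    if not isinstance(request_id, str) or not request_id:
        raise ValueError("response does not contain data.request_id")
    return request_id
```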