MLLP's Transcription and Translation platform (TTP): gRPC-based streaming API for real-time transcription of continuous audio streams.
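To use this API, you must use your TTP API credentials (please see: https://ttp.mllp.upv.es/index.php?page=api). Usage of this service is recorded on your account info page.
The logical steps to use the service are: obtain an auth token with GetAuthToken; list the available transcription (ASR) systems with GetTranscribeSystemsInfo and choose a system ID; then call Transcribe, sending the system ID in the first message and raw audio data in the following ones.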
Method Name | Request Type | Response Type | Description |
GetAuthToken | AuthTokenRequest | AuthTokenResponse | Get a valid auth token to perform further rpc requests. Without a valid auth token, all other rpc calls will return a PERMISSION_DENIED status code error. |
GetTranscribeSystemsInfo | google.protobuf.Empty | TranscribeSystemsInfoResponse | Get information about all available streaming transcription (ASR) systems. Requires a valid auth token, supplied as metadata. |
Transcribe | TranscribeRequest | TranscribeResponse | Transcribe a stream of raw audio samples using a streaming transcription (ASR) system. Requires a valid auth token, supplied as metadata. |
AddASRNode | AddASRNodeRequest | AddASRNodeResponse | Admin method: add an ASR Server. Requires a valid auth token, supplied as metadata. |
ListASRNodes | google.protobuf.Empty | ListASRNodesResponse | Admin method: list ASR Servers. Requires a valid auth token, supplied as metadata. |
RemoveASRNode | RemoveASRNodeRequest | RemoveASRNodeResponse | Admin method: remove an ASR Server. Requires a valid auth token, supplied as metadata. |
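
The order in which the three non-admin methods are typically called can be sketched as follows. This is a minimal sketch, not the reference client: `run_workflow`, `stub`, `messages` and `empty_request` are hypothetical names standing for a stub and message module generated from the service's .proto file and for a google.protobuf.Empty instance. It also assumes that GetTranscribeSystemsInfo returns an iterable stream of responses and that Transcribe accepts an iterator of requests.

```python
AUTH_HEADER = "x-mllp-auth-token"  # metadata key required by all calls after GetAuthToken


def run_workflow(stub, messages, empty_request, user, secret_key, audio_chunks):
    """Call GetAuthToken, GetTranscribeSystemsInfo and Transcribe in order.

    `stub` and `messages` are assumed to come from code generated from the
    service's .proto file; their names here are placeholders.
    """
    # Step 1: exchange the TTP API credentials for an auth token.
    auth = stub.GetAuthToken(
        messages.AuthTokenRequest(user=user, password=secret_key)
    )
    if auth.code != 0:  # AuthTokenResponse.Code: OK = 0, ERR = 1
        raise RuntimeError("GetAuthToken returned an error code")
    metadata = ((AUTH_HEADER, auth.auth_token),)

    # Step 2: list the ASR systems and keep those with a free decoder.
    available = [
        system.id
        for system in stub.GetTranscribeSystemsInfo(empty_request, metadata=metadata)
        if system.num_decoders_available > 0
    ]
    if not available:
        raise RuntimeError("no ASR system has a decoder available")
    system_id = available[0]

    # Step 3: send the system ID first, then the raw audio chunks.
    def requests():
        yield messages.TranscribeRequest(system_id=system_id)
        for chunk in audio_chunks:
            yield messages.TranscribeRequest(data=chunk)

    return stub.Transcribe(requests(), metadata=metadata)
```

The function returns whatever the Transcribe call returns, which the caller would iterate over to read TranscribeResponse messages as they arrive.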
Field | Type | Description |
host | string | ASR Server host |
port | int32 | ASR Server port |
Field | Type | Description |
code | AddASRNodeResponse.Code | Response code |
details | string | Response details |
Mandatory parameter for the GetAuthToken rpc call.
Contains TTP API user credentials (please see: https://ttp.mllp.upv.es/index.php?page=api)
Field | Type | Description |
user | string | TTP API user |
password | string | TTP API secret key |
Message returned by the GetAuthToken rpc call.
Returns a valid auth token (auth_token) to call all other rpc services.
Important note: for all other rpc calls, the auth token must be supplied as metadata under the following header key: "x-mllp-auth-token".
Field | Type | Description |
code | AuthTokenResponse.Code | Status code of the response |
auth_token | string | Authentication token, to be provided in the rest of rpc calls |
expiry_date | int32 | Expiry date in UNIX timestamp format |
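
As an illustration of how the token might be attached and refreshed on the client side, the sketch below builds the metadata entry and checks the expiry timestamp. The helper names (`auth_metadata`, `token_expired`) and the 30-second safety margin are assumptions made for this example, not part of the API.

```python
import time

AUTH_HEADER = "x-mllp-auth-token"  # header key stated in the note above


def auth_metadata(auth_token):
    """Return call metadata carrying the auth token under the required key."""
    return ((AUTH_HEADER, auth_token),)


def token_expired(expiry_date, now=None, margin_s=30):
    """Return True if the token has expired or expires within `margin_s` seconds.

    `expiry_date` is the UNIX timestamp from AuthTokenResponse; `margin_s`
    is an arbitrary safety margin chosen for this sketch.
    """
    if now is None:
        now = time.time()
    return now >= expiry_date - margin_s
```

With stubs generated by grpcio-tools, the tuple returned by `auth_metadata` would normally be passed through the `metadata` keyword argument of each call.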
Field | Type | Description |
host | string | ASR Server host |
port | int32 | ASR Server port |
Field | Type | Description |
host | string | ASR Server host |
port | int32 | ASR Server port |
Field | Type | Description |
code | RemoveASRNodeResponse.Code | Response code |
details | string | Response details |
Mandatory parameter of the Transcribe rpc call.
The first TranscribeRequest message must contain only a valid transcription system ID (system_id), obtained with a previous rpc call to GetTranscribeSystemsInfo.
Subsequent TranscribeRequest messages must contain bytes of raw audio data from a continuous audio stream, with the following properties: 1 channel, 16 kHz, signed 16-bit little endian. The size of each audio packet is variable.
Note: a valid auth token must be supplied as metadata; please see AuthTokenResponse.
Field | Type | Description |
system_id | int32 | Transcription (ASR) system identifier (obtained in a previous rpc call to GetTranscribeSystemsInfo). Only for the first TranscribeRequest message sent. |
data | bytes | Raw audio data, in the following format: single channel, 16 kHz, signed 16-bit little endian. |
token | string | A custom string to be injected in the audio stream, and to be returned unmodified by the system, preserving its time alignment with respect to the audio data. |
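
The message ordering described above (system ID first, audio afterwards) could be produced by a generator like the one below. It is a sketch under assumptions: `request_cls` stands for the generated TranscribeRequest class, and the 250 ms chunk duration is an arbitrary choice, since the packet size is variable.

```python
SAMPLE_RATE_HZ = 16000   # 16 kHz
BYTES_PER_SAMPLE = 2     # signed 16-bit samples
CHANNELS = 1             # mono


def chunk_size_bytes(chunk_ms):
    """Number of bytes of raw audio that span `chunk_ms` milliseconds."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * chunk_ms // 1000


def transcribe_requests(request_cls, system_id, pcm, chunk_ms=250):
    """Yield the request sequence for one Transcribe call.

    `request_cls` is assumed to be the generated TranscribeRequest class;
    `pcm` is a bytes object holding raw audio in the required format.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    # First message: only the system ID.
    yield request_cls(system_id=system_id)
    # Following messages: consecutive slices of the raw audio.
    step = chunk_size_bytes(chunk_ms)
    for start in range(0, len(pcm), step):
        yield request_cls(data=pcm[start:start + step])
```

For a live source such as a microphone, the same pattern applies with the slicing loop replaced by reads from the capture device.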
Messages returned by the Transcribe rpc call as a continuous stream.
These messages will contain the transcription of the incoming audio stream (sent as TranscribeRequest), divided into two parts: a consolidated part (hyp_novar) and an ongoing part that may still change (hyp_var).
Note: call to Transcribe might return a RESOURCE_EXHAUSTED rpc error code in the following cases:
Field | Type | Description |
status | TranscribeResponse.Status | Status of the response. |
hyp_novar | string | Consolidated transcription text of the previous audio segment. |
hyp_var | string | Ongoing, increasing and varying transcription text. |
eos | bool | Set to true if the transcription system determines the end of a segment. |
Field | Type | Description |
code | TranscribeResponse.Status.Code | Status code |
details | string | Error details, when code > 0. |
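
One way a client might combine the consolidated and ongoing fields into a running transcript is sketched below. `TranscriptAssembler` is a hypothetical helper, and the sketch assumes that each non-empty hyp_novar carries newly consolidated text to append rather than a repeat of earlier text. The eos flag is ignored here.

```python
class TranscriptAssembler:
    """Accumulate consolidated text and track the current varying hypothesis."""

    def __init__(self):
        self.segments = []  # consolidated text received so far (hyp_novar)
        self.live = ""      # latest varying hypothesis (hyp_var)

    def update(self, response):
        """Fold one TranscribeResponse-like object into the transcript."""
        if response.status.code != 0:
            raise RuntimeError(
                f"transcription error {response.status.code}: "
                f"{response.status.details}"
            )
        if response.hyp_novar:
            # Assumption: hyp_novar holds new consolidated text only.
            self.segments.append(response.hyp_novar)
        # The varying part replaces the previous one instead of extending it.
        self.live = response.hyp_var
        return self.text()

    def text(self):
        """Return consolidated text followed by the current varying text."""
        parts = self.segments + ([self.live] if self.live else [])
        return " ".join(parts)
```

A caller would invoke `update` once per response read from the Transcribe stream and display the returned string.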
Specific information about a transcription (ASR) system, provided within a TranscribeSystemsInfoResponse.
Field | Type | Description |
id | string | ASR system tag. |
langs | TranscribeSystemInfo.Lang | List of languages supported by this ASR system. |
Field | Type | Description |
code | string | Language code (ISO 639-1). |
text | string | Language name. |
Messages returned by the GetTranscribeSystemsInfo rpc call. One message per transcription (ASR) system.
Field | Type | Description |
info | TranscribeSystemInfo | Information about the transcription (ASR) system. |
num_decoders | int32 | Total number of running decoders (transcription rooms) for that particular ASR system. |
num_decoders_available | int32 | Number of decoders (transcription rooms) currently available for that particular ASR system. |
id | int32 | System ID, to be used in the first TranscribeRequest message sent via the Transcribe rpc call. |
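
Choosing a system ID from these responses might look like the sketch below. `pick_system` is a hypothetical helper; it assumes `systems` is an iterable of TranscribeSystemsInfoResponse-like objects and that `langs` can be iterated as a list of Lang entries.

```python
def pick_system(systems, lang_code):
    """Return the ID of the first system that supports `lang_code` and has a
    decoder available, or None if there is no such system."""
    for system in systems:
        supported = {lang.code for lang in system.info.langs}
        if lang_code in supported and system.num_decoders_available > 0:
            # This is the value to send as system_id in the first request.
            return system.id
    return None
```

Returning None rather than raising leaves it to the caller to decide whether to retry later or fall back to another language.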
Name | Number | Description |
OK | 0 | Regular response |
ERR | 1 | An error occurred |
Name | Number | Description |
OK | 0 | Regular response |
ERR | 1 | An error occurred |
Name | Number | Description |
OK | 0 | Regular response |
ERR | 1 | An error occurred |
Name | Number | Description |
OK | 0 | Regular response. Transcription is provided. |
ERR_NO_RECO_AVAILABLE | 1 | No decoders available for the provided system ID. |
ERR_UNKNOWN_SYSTEM | 2 | Provided system ID is unknown. |
ERR_RECO | 3 | An unexpected error occurred while the transcription system was processing the input audio data. |
ERR_WRONG_FORMAT | 4 | Bad format of input audio data. Please check specs of TranscribeRequest messages. |
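
A client-side check before streaming may help avoid ERR_WRONG_FORMAT. The sketch below assumes the audio is stored in a WAV container (the API itself takes raw samples, and 16-bit PCM WAV data is signed little endian) and uses Python's standard `wave` module. The names `wav_format_problems` and `make_test_wav` are illustrative only.

```python
import io
import wave

# Format required for TranscribeRequest audio data.
EXPECTED = {"channels": 1, "sample_rate_hz": 16000, "sample_width_bytes": 2}


def wav_format_problems(source):
    """Return a list of mismatches between a WAV file and the required format.

    `source` is a file path or a binary file object. An empty list means the
    frames read with `readframes()` can be sent as raw audio data.
    """
    with wave.open(source, "rb") as wav:
        actual = {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
        }
    return [
        f"{name}: expected {EXPECTED[name]}, got {actual[name]}"
        for name in EXPECTED
        if actual[name] != EXPECTED[name]
    ]


def make_test_wav(channels, rate, width, n_frames=160):
    """Build a silent in-memory WAV file, used only to exercise the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (channels * width * n_frames))
    buf.seek(0)
    return buf
```

Audio that fails the check would need to be resampled or downmixed with an external tool before being streamed.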
.proto Type | Notes | Python Type | PHP Type | C++ Type | Java Type |
double | | float | float | double | double |
float | | float | float | float | float |
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. | int | integer | int32 | int |
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. | int/long | integer/string | int64 | long |
uint32 | Uses variable-length encoding. | int/long | integer | uint32 | int |
uint64 | Uses variable-length encoding. | int/long | integer/string | uint64 | long |
sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int | integer | int32 | int |
sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int/long | integer/string | int64 | long |
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | int | integer | uint32 | int |
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | int/long | integer/string | uint64 | long |
sfixed32 | Always four bytes. | int | integer | int32 | int |
sfixed64 | Always eight bytes. | int/long | integer/string | int64 | long |
bool | | boolean | boolean | bool | boolean |
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | str/unicode | string | string | String |
bytes | May contain any arbitrary sequence of bytes. | str | string | string | ByteString |