MLLP-TTP gRPC Streaming API: protocol description

MLLP's Transcription and Translation platform (TTP): gRPC-based streaming API for real-time transcription of continuous audio streams.

To use this API, you need TTP API credentials (see: https://ttp.mllp.upv.es/index.php?page=api). Usage of this service is recorded on your account info page.

RPC Services

MLLPStreaming

To use the service, follow these steps:

  1. Call GetAuthToken with your TTP API credentials to obtain a valid auth token. This token must be supplied to all other rpc methods as metadata, under the header key "x-mllp-auth-token".
  2. Call GetTranscribeSystemsInfo to obtain the list of currently available transcription (ASR) systems.
  3. Call Transcribe, sending a continuous audio stream.
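As an illustration of step 1, the auth token can be wrapped once into a metadata sequence and reused for every later call. This is a minimal sketch using only the standard library; the helper name auth_metadata is hypothetical, and only the header key comes from this document.

```python
from typing import Tuple

# Header key required by the service for all calls after GetAuthToken.
AUTH_HEADER_KEY = "x-mllp-auth-token"


def auth_metadata(auth_token: str) -> Tuple[Tuple[str, str], ...]:
    """Build the metadata sequence carrying the auth token.

    Hypothetical helper: the name and the validation are illustrative only.
    """
    if not auth_token:
        raise ValueError("auth_token must be a non-empty string")
    # gRPC metadata is a sequence of (key, value) pairs.
    return ((AUTH_HEADER_KEY, auth_token),)
```

With the Python grpcio package, a sequence like this is typically passed through the metadata keyword argument of a stub method. The stub and module names depend on the code generated from the service's .proto file and are not shown here.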

Method Name | Request Type | Response Type | Description
GetAuthToken AuthTokenRequest AuthTokenResponse

Get a valid auth token to perform further rpc requests. Without a valid auth token, all other rpc calls will return a PERMISSION_DENIED status code error.

GetTranscribeSystemsInfo google.protobuf.Empty TranscribeSystemsInfoResponse

Get information about all available streaming transcription (ASR) systems. Requires a valid auth token, supplied as metadata.

Transcribe TranscribeRequest TranscribeResponse

Transcribe a stream of raw audio samples using a streaming transcription (ASR) system. Requires a valid auth token, supplied as metadata.

AddASRNode AddASRNodeRequest AddASRNodeResponse

Admin method: add an ASR Server. Requires a valid auth token, supplied as metadata.

ListASRNodes google.protobuf.Empty ListASRNodesResponse

Admin method: list ASR Servers. Requires a valid auth token, supplied as metadata.

RemoveASRNode RemoveASRNodeRequest RemoveASRNodeResponse

Admin method: remove an ASR Server. Requires a valid auth token, supplied as metadata.

RPC Messages

AddASRNodeRequest

Field | Type | Description
host string

ASR Server host

port int32

ASR Server port

AddASRNodeResponse

Field | Type | Description
code AddASRNodeResponse.Code

Response code

details string

Response details

AuthTokenRequest

Mandatory parameter for the GetAuthToken rpc call.

Contains TTP API user credentials (please see: https://ttp.mllp.upv.es/index.php?page=api)

Field | Type | Description
user string

TTP API user

password string

TTP API secret key

AuthTokenResponse

Message returned by the GetAuthToken rpc call.

Returns a valid auth token (auth_token) to call all other rpc services.

Important note: for all other rpc calls, the auth token must be supplied as metadata under the header key "x-mllp-auth-token".

Field | Type | Description
code AuthTokenResponse.Code

Status code of the response

auth_token string

Authentication token, to be provided in all other rpc calls

expiry_date int32

Expiry date in UNIX timestamp format
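Since expiry_date is a UNIX timestamp, a client can check it locally before each call and request a new token when needed. The sketch below assumes the timestamp is expressed in seconds; the function name and the safety margin are hypothetical choices, not part of the API.

```python
import time
from typing import Optional


def token_is_expired(expiry_date: int,
                     now: Optional[float] = None,
                     margin_s: int = 30) -> bool:
    """Return True if the token is expired or expires within margin_s seconds.

    Assumption: expiry_date is in seconds since the UNIX epoch.
    """
    current = time.time() if now is None else now
    return current >= expiry_date - margin_s
```

The margin reduces the chance that a token expires between the local check and the server receiving the call.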

ListASRNodesResponse

Field | Type | Description
host string

ASR Server host

port int32

ASR Server port

RemoveASRNodeRequest

Field | Type | Description
host string

ASR Server host

port int32

ASR Server port

RemoveASRNodeResponse

Field | Type | Description
code RemoveASRNodeResponse.Code

Response code

details string

Response details

TranscribeRequest

Mandatory parameter of the Transcribe rpc call.

The first TranscribeRequest message must contain only a valid transcription system ID (system_id), obtained from a previous rpc call to GetTranscribeSystemsInfo.

Subsequent TranscribeRequest messages must contain bytes of raw audio data from a continuous audio stream, with the following properties: 1 channel, 16 kHz sample rate, signed 16-bit little-endian samples. The size of each audio packet may vary.

Note: a valid auth token must be supplied as metadata; please see AuthTokenResponse.
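When the audio comes from a WAV file, its header can be checked against these properties before streaming. The following sketch uses Python's standard wave module; the function name is hypothetical. It relies on the fact that 16-bit PCM WAV data is stored as signed little-endian samples, so the frames read from a matching file already have the required sample layout.

```python
import wave

EXPECTED_CHANNELS = 1        # single channel
EXPECTED_RATE_HZ = 16000     # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2    # 16-bit samples, in bytes


def check_wav_format(wav_file) -> list:
    """Return a list of format mismatches; an empty list means the file fits.

    wav_file may be a path or a binary file-like object.
    """
    problems = []
    with wave.open(wav_file, "rb") as w:
        if w.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channels: {w.getnchannels()}")
        if w.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate: {w.getframerate()} Hz")
        if w.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width: {w.getsampwidth()} bytes")
    return problems
```

Only the raw frames (for example from readframes) would be sent to the service, not the WAV header itself.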

Field | Type | Description
system_id int32

Transcription (ASR) system identifier (obtained from a previous rpc call to GetTranscribeSystemsInfo). Set only in the first TranscribeRequest message sent.

data bytes

Raw audio data, in the following format: single channel, 16 kHz, signed 16-bit little-endian.

token string

A custom string to be injected into the audio stream and returned unmodified by the system, preserving its time alignment with respect to the audio data.
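The message ordering described above (system_id first, then audio data) can be sketched as a generator. Plain dictionaries stand in for TranscribeRequest messages here, because the generated message classes depend on the .proto file; the function names and the default chunk duration of 100 ms are assumptions made for illustration.

```python
from typing import Dict, Iterator, Union

SAMPLE_RATE_HZ = 16000   # 16 kHz
BYTES_PER_SAMPLE = 2     # signed 16-bit samples, 1 channel


def chunk_size_bytes(chunk_ms: int) -> int:
    """Number of bytes covering chunk_ms milliseconds of audio."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000


def transcribe_requests(system_id: int, pcm: bytes,
                        chunk_ms: int = 100
                        ) -> Iterator[Dict[str, Union[int, bytes]]]:
    """Yield dict stand-ins for TranscribeRequest messages, in order."""
    # First message: only the system ID, no audio.
    yield {"system_id": system_id}
    # Following messages: consecutive slices of raw audio bytes.
    step = chunk_size_bytes(chunk_ms)
    for start in range(0, len(pcm), step):
        yield {"data": pcm[start:start + step]}
```

At 16 kHz with 2 bytes per sample, one second of audio is 32000 bytes, so a 100 ms chunk is 3200 bytes. In a real client, each dictionary would be replaced by the generated TranscribeRequest message with the same field set.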

TranscribeResponse

Messages returned by the Transcribe rpc call as a continuous stream.

These messages contain the transcription of the incoming audio stream (sent as TranscribeRequest messages), divided into two parts: the consolidated text (hyp_novar) and the ongoing, still-changing text (hyp_var).

Note: call to Transcribe might return a RESOURCE_EXHAUSTED rpc error code in the following cases:

Field | Type | Description
status TranscribeResponse.Status

Status of the response.

hyp_novar string

Consolidated transcription text of the previous audio segment.

hyp_var string

Ongoing, increasing and varying transcription text.

eos bool

Set to true if the transcription system determines the end of a segment.
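One possible way for a client to combine the two hypothesis fields into a running transcript is sketched below. It assumes that each hyp_novar carries only newly consolidated text (not the whole transcript so far) and that each hyp_var replaces the previous one; neither assumption is confirmed by this document, and the class name is hypothetical.

```python
class TranscriptBuffer:
    """Accumulate consolidated text and keep the latest varying hypothesis."""

    def __init__(self) -> None:
        self.consolidated = []   # text that will no longer change
        self.pending = ""        # latest varying hypothesis

    def update(self, hyp_novar: str, hyp_var: str) -> None:
        # Assumption: hyp_novar holds only newly consolidated text.
        fixed = hyp_novar.strip()
        if fixed:
            self.consolidated.append(fixed)
        # Assumption: hyp_var supersedes the previous varying hypothesis.
        self.pending = hyp_var.strip()

    def text(self) -> str:
        parts = self.consolidated + ([self.pending] if self.pending else [])
        return " ".join(parts)
```

A caller would invoke update once per TranscribeResponse and display text() after each update, for example as live captions.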

TranscribeResponse.Status

Field | Type | Description
code TranscribeResponse.Status.Code

Status code

details string

Error details, when code > 0.

TranscribeSystemInfo

Specific information about a transcription (ASR) system, provided within a TranscribeSystemsInfoResponse.

Field | Type | Description
id string

ASR system tag.

langs TranscribeSystemInfo.Lang

List of languages supported by this ASR system.

TranscribeSystemInfo.Lang

Field | Type | Description
code string

Language code (ISO 639-1).

text string

Language name.

TranscribeSystemsInfoResponse

Messages returned by the GetTranscribeSystemsInfo rpc call. One message per transcription (ASR) system.

Field | Type | Description
info TranscribeSystemInfo

Information about the transcription (ASR) system.

num_decoders int32

Total number of running decoders (transcription rooms) for that particular ASR system.

num_decoders_available int32

Number of decoders (transcription rooms) currently available for that particular ASR system.

id int32

System ID, to be used in the first TranscribeRequest message sent via the Transcribe rpc call.
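A client typically needs to pick one system from these messages before calling Transcribe. The sketch below selects a system that supports a given language and has free decoders. The SystemEntry dataclass is a hypothetical, flattened stand-in for TranscribeSystemsInfoResponse (with info.id as tag and the language codes from info.langs), and preferring the system with the most free decoders is an illustrative policy rather than a documented recommendation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class SystemEntry:
    """Hypothetical flattened view of one TranscribeSystemsInfoResponse."""
    id: int                       # numeric system ID, used as system_id
    tag: str                      # info.id, the ASR system tag
    lang_codes: List[str]         # codes taken from info.langs
    num_decoders: int
    num_decoders_available: int


def pick_system(entries: Iterable[SystemEntry],
                lang_code: str) -> Optional[SystemEntry]:
    """Return the matching system with the most free decoders, or None."""
    candidates = [e for e in entries
                  if lang_code in e.lang_codes
                  and e.num_decoders_available > 0]
    return max(candidates,
               key=lambda e: e.num_decoders_available,
               default=None)
```

The id of the selected entry is the value to place in system_id of the first TranscribeRequest message.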

AddASRNodeResponse.Code

Name | Number | Description
OK 0

Regular response

ERR 1

An error occurred

AuthTokenResponse.Code

Name | Number | Description
OK 0

Regular response

ERR 1

An error occurred

RemoveASRNodeResponse.Code

Name | Number | Description
OK 0

Regular response

ERR 1

An error occurred

TranscribeResponse.Status.Code

Name | Number | Description
OK 0

Regular response. Transcription is provided.

ERR_NO_RECO_AVAILABLE 1

No transcription decoders available for the provided system ID.

ERR_UNKNOWN_SYSTEM 2

Provided system ID is unknown.

ERR_RECO 3

An unexpected error occurred while the transcription system was processing the input audio data.

ERR_WRONG_FORMAT 4

Bad format of input audio data. Please check the audio format required by TranscribeRequest messages.
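On the client side, the codes in this table can be mirrored in an enumeration so that non-OK statuses are turned into exceptions. This is a minimal sketch: the names TranscribeStatusCode, TranscribeError and check_status are hypothetical, and only the code names and numbers are taken from the table above.

```python
from enum import IntEnum


class TranscribeStatusCode(IntEnum):
    """Client-side mirror of TranscribeResponse.Status.Code."""
    OK = 0
    ERR_NO_RECO_AVAILABLE = 1
    ERR_UNKNOWN_SYSTEM = 2
    ERR_RECO = 3
    ERR_WRONG_FORMAT = 4


class TranscribeError(RuntimeError):
    """Raised when a response carries a non-OK status code."""


def check_status(code: int, details: str = "") -> None:
    """Raise TranscribeError unless code is OK.

    An unknown code raises ValueError from the enum lookup.
    """
    status = TranscribeStatusCode(code)
    if status is not TranscribeStatusCode.OK:
        message = f"{status.name}: {details}" if details else status.name
        raise TranscribeError(message)
```

A caller would pass status.code and status.details from each TranscribeResponse to check_status before reading the hypothesis fields.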

Scalar Value Types

.proto Type | Notes | Python Type | PHP Type | C++ Type | Java Type
double | | float | float | double | double
float | | float | float | float | float
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. | int | integer | int32 | int
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. | int/long | integer/string | int64 | long
uint32 | Uses variable-length encoding. | int/long | integer | uint32 | int
uint64 | Uses variable-length encoding. | int/long | integer/string | uint64 | long
sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int | integer | int32 | int
sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int/long | integer/string | int64 | long
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | int | integer | uint32 | int
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | int/long | integer/string | uint64 | long
sfixed32 | Always four bytes. | int | integer | int32 | int
sfixed64 | Always eight bytes. | int/long | integer/string | int64 | long
bool | | boolean | boolean | bool | boolean
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | str/unicode | string | string | String
bytes | May contain any arbitrary sequence of bytes. | str | string | string | ByteString