MLLP | Media Transcription and Translation Platform

About the MLLP Platform

What is the MLLP transcription and translation platform?
The MLLP transcription and translation platform is an online platform for automated and assisted multilingual media subtitling and text translation created by Universitat Politècnica de València's Machine Learning and Language Processing (MLLP) research group. It provides support for the transcription and translation of video, audio and the full content of MOOCs, and integrates other MLLP-developed technologies such as Text-to-Speech synthesis for enhanced accessibility.
Who are the MLLP?
The Machine Learning and Language Processing (MLLP) research group (www.mllp.upv.es) is composed of researchers based at the Universitat Politècnica de València's Departament de Sistemes Informàtics i Computació. Our main research areas of interest are: machine learning and applications; natural language processing; and educational technologies and big data.
One of the main activities of the MLLP group has been the development of technologies for the automatic transcription and translation of video, audio and learning content, most recently within the EU projects transLectures and EMMA. These technologies have been deployed within these two projects, and are now also being provided to other universities and organizations.
What services do you offer?
1. Remote automatic multilingual media subtitling and text translation services (full MOOC content support), including:
  - Automatic media transcription in several languages, with topic adaptation for improved accuracy.
  - Automatic media translation into several languages.
  - Automatic text translation into several languages.
  - Text-to-speech synthesis.
  - Additional adaptation options for large repositories, to further improve the accuracy of the automatic transcriptions and translations.
2. An online service for the management and editing of automatic transcriptions and translations, including:
  - TLP Media Player: An advanced interface for the post-editing of multilingual subtitles.
  - TLP Transcription Editor: An advanced interface for audio-synched, full text transcription post-editing.
  - TLP Text Translation Editor: An advanced interface enabling side-by-side text translation post-editing.
  - TLP Web Service: An advanced API enabling the automation and integration of the MLLP Platform's tasks in your media workflow.
Who provides the technology behind the MLLP Platform?
The MLLP Platform has been developed 100% at Universitat Politècnica de València (UPV).
The statistical models we use for automatic transcription and translation have been developed at the UPV's Machine Learning and Language Processing research group (MLLP). Our speech recognition engine is our own TLK: The transLectures-UPV Toolkit, while the MLLP Platform, its API and its advanced post-editing interface are based on our own TLP: The transLectures-UPV Platform software.
These technologies have matured and been put into practice for large video repositories and full MOOC courses in the EU projects transLectures and EMMA, and are now also available for use by other universities and organizations.
Our technology adapts to your videos for enhanced accuracy, going beyond what generalist speech recognition systems can provide. Furthermore, using our own technology allows us to customize our systems for interested organizations through premium accounts.
Which transcription languages do you support?
We are continuously adding new languages to our automatic transcription services. Currently the transcription languages we support are:
- Català
- Deutsch
- English
- Español
- Français
- Italiano
- Dutch
- Português
- Slovene
Which translation language pairs do you support?
As with transcription, we are continuously adding new language pairs to our automatic translation services. Currently the language pairs we support are:
- Català → Español, English
- Deutsch → English, Français
- English → Català, Romanian, Italiano, Français, Ukrainian, Slovene, Deutsch, Português, Español
- Español → Català, Português, Galego, English
- Français → Deutsch, English
- Italiano → English
- Dutch → English
- Português → Español, English
- Slovene → English
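If you plan to request translations programmatically, it can help to check a language pair against the list above before submitting anything. The following is a minimal Python sketch of such a check. The table is copied from the list above and will go out of date as new pairs are added; the names `SUPPORTED_PAIRS`, `is_supported` and `targets_for` are illustrative assumptions and are not part of the TLP Web Service.

```python
# Source language -> set of target languages, copied from the list above.
# Illustrative only: the platform's own list is the authoritative one.
SUPPORTED_PAIRS = {
    "Català": {"Español", "English"},
    "Deutsch": {"English", "Français"},
    "English": {
        "Català", "Romanian", "Italiano", "Français", "Ukrainian",
        "Slovene", "Deutsch", "Português", "Español",
    },
    "Español": {"Català", "Português", "Galego", "English"},
    "Français": {"Deutsch", "English"},
    "Italiano": {"English"},
    "Dutch": {"English"},
    "Português": {"Español", "English"},
    "Slovene": {"English"},
}


def is_supported(source: str, target: str) -> bool:
    """Return True if source -> target appears in the table above."""
    return target in SUPPORTED_PAIRS.get(source, set())


def targets_for(source: str) -> list[str]:
    """Return the target languages listed for a source, sorted by name."""
    return sorted(SUPPORTED_PAIRS.get(source, set()))


if __name__ == "__main__":
    print(is_supported("Español", "Galego"))  # True
    print(is_supported("Galego", "Español"))  # False: pairs are directional
    print(targets_for("Italiano"))            # ['English']
```

Note that the pairs are directional: Español → Galego is listed, but Galego → Español is not.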
Do I need to install any software to use your services?
No. Our transcription and translation services are 100% cloud based, so you can access and transcribe your media files through the Internet.
Can I integrate your services with my current technology and workflow?
Indeed! We have developed an advanced API through which you can ingest media and text, and manage your transcriptions and translations. You will find customized API information within your account when you register.
Do you offer a trial period?
Of course. You can register right now and upload up to 5 videos (or 2 hours in total) and 50 text documents to be transcribed and translated by our automatic services.

Using the MLLP Platform

How do I begin using the MLLP platform for automatic media subtitling?
These are the basic steps for the most important functionalities of the MLLP platform:
- Automatic multilingual subtitling:
- API access:
What can I do to obtain the best transcription results?
The speech recognition system will work with three elements that you provide: the video or audio file itself; the title of the recording; and the slides and external documents (if there are any).
- The video (or audio) file is the main input for the system. The recording conditions are important for the accuracy of the automatic transcription. Best results will be obtained for videos with only one speaker, and both the quality of the recording and the clarity of the speech and pronunciation will have an impact on the quality of the transcription. Non-speech elements such as background music can hinder speech recognition and impact transcription results negatively.
- The title of the talk can be used to automatically search for related documents on the net, from which vocabulary and language characteristics will be extracted to improve the transcription. Try to be descriptive with the title: a generic title such as “Medicine” will be less useful for the system than something more specific such as “Methodology for data analysis in medical sciences”. Remember to switch on Topic Adaptation in the media upload form to take advantage of this feature (currently available for English, Spanish and Catalan).
- Finally, if the video shows any accompanying slides or you have other documents related to the contents of the recording, providing the system with these files will allow it to analyse them as well and use their contents to improve transcription results. Switch on Topic Adaptation in the media upload form to take advantage of this feature (currently available for English, Spanish and Catalan).
What is the recommended workflow to minimize the effort to obtain quality multilingual subtitles?
To post-edit the generated automatic subtitles to your liking with the minimum possible effort, we recommend following this order:
1. Upload your media to generate the automatic transcription, and initial automatic translations into the languages you request.
2. Revise the automatic transcription. The MLLP Platform's TLP Media Player includes advanced subtitle editing functionalities for this purpose (alternatively, you can export your subtitles and import a revised version later).
3. Regenerate the automatic translations from the revised transcription (you can request this for any given video in the "My videos" section).
4. When the new automatic translations are complete, open the video again and revise the improved automatic translations.
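The reason for this order is that translations are generated from the transcription, so correcting the transcription first means its fixes carry over into every regenerated translation. The Python sketch below only illustrates that ordering with an in-memory stand-in. `FakePlatform`, `Media` and all method names are hypothetical and do not reflect the TLP Web Service; the actual API information is provided within your account.

```python
from dataclasses import dataclass, field


@dataclass
class Media:
    """Hypothetical in-memory stand-in for one uploaded recording."""
    title: str
    language: str
    transcription: str = ""
    revised: bool = False
    translations: dict = field(default_factory=dict)


class FakePlatform:
    """Stub that mimics only the order of operations (not the real API)."""

    def upload(self, title, language, targets):
        # Step 1: automatic transcription plus initial automatic translations.
        media = Media(title=title, language=language)
        media.transcription = f"[automatic {language} transcription]"
        self.regenerate_translations(media, targets)
        return media

    def revise_transcription(self, media, corrected_text):
        # Step 2: a human post-edits the transcription.
        media.transcription = corrected_text
        media.revised = True

    def regenerate_translations(self, media, targets):
        # Step 3: translations are always derived from the current transcription.
        basis = "revised" if media.revised else "automatic"
        media.translations = {
            target: f"[{target} translation of {basis} transcription]"
            for target in targets
        }


def recommended_workflow(platform, title, language, targets, corrected_text):
    media = platform.upload(title, language, targets)       # step 1
    platform.revise_transcription(media, corrected_text)    # step 2
    platform.regenerate_translations(media, targets)        # step 3
    return media  # step 4: the translations are now ready for revision


if __name__ == "__main__":
    media = recommended_workflow(
        FakePlatform(), "Example talk", "English", ["Español"], "Corrected text."
    )
    print(media.translations["Español"])
```

In this sketch, skipping step 3 would leave the translations based on the uncorrected automatic transcription, which is the extra post-editing effort the recommended order avoids.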
Where can I find a more detailed user guide for automatic media subtitling in the MLLP platform?
If the instructions and tips above didn't cover what you need, here is the MLLP transcription and translation platform User Guide.
(Please note: as the MLLP platform is constantly evolving, parts of the guide might not correspond exactly to the most recent version).

Contact

How can I contact you and learn more?
For news and updates, you can visit the MLLP website. You can also follow us on Twitter at @mllpresearch.
For support and information, contact us at mllp-support@upv.es.