Automatic Speech Recognition (ASR)

What is Automatic Speech Recognition?

Automatic Speech Recognition (ASR) is the term given to the technology used to transcribe spoken words into written text.

Ubiqus uses one form of ASR, Large Vocabulary Continuous Speech Recognition (LVCSR), which is based on the automatic identification of very short audio sequences. This technology can produce a very high-quality transcription, provided that the recording itself is of good quality. ASR has developed significantly in recent years, and our R&D team is contributing to its continued progress.

Our working method means that we can handle not only recordings containing non-specialized vocabulary, but also those that include more specific terms (technical, legal, medical, etc.).

Producing the final transcription involves a four-step process:

1 | Voice Activity Detection

First, the system identifies where speech is present in the recording so that the audio can be cut into segments. The machine then works on each of these segments.
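To make the idea concrete, here is a minimal sketch of one common approach, energy-based voice activity detection. It is an illustration only and not a description of the Ubiqus system: the function name `detect_speech_segments`, the frame size, and the energy threshold are all assumptions chosen for a toy signal.

```python
from typing import List, Optional, Tuple


def detect_speech_segments(
    samples: List[float],
    frame_size: int = 4,             # assumed frame length, in samples
    energy_threshold: float = 0.01,  # assumed threshold for this toy signal
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of regions whose frame energy
    exceeds the threshold."""
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None  # start of the segment being built, if any
    for frame_start in range(0, len(samples), frame_size):
        frame = samples[frame_start:frame_start + frame_size]
        energy = sum(s * s for s in frame) / len(frame)  # mean squared amplitude
        is_speech = energy > energy_threshold
        if is_speech and start is None:
            start = frame_start                      # speech begins
        elif not is_speech and start is not None:
            segments.append((start, frame_start))    # speech ends
            start = None
    if start is not None:
        segments.append((start, len(samples)))       # speech runs to the end
    return segments


# Toy signal: silence, a burst of "speech", silence, then a second burst.
signal = [0.0] * 8 + [0.5, -0.4, 0.6, -0.5] * 2 + [0.0] * 4 + [0.3, -0.3, 0.3, -0.3]
print(detect_speech_segments(signal))  # [(8, 16), (20, 24)]
```

Each returned pair marks one segment that the later steps would process independently.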

2 | Diarization

Next, we need to identify the different speakers in each recording and group the segments by speaker, answering the question ‘who speaks when?’ For this, the machine uses different models containing specific data (language, voice). In this way, it can account for the subtleties of a language, such as accents. Note that at this point, we are still processing the data in a “mathematical” way.
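As a rough illustration of grouping segments by speaker, the sketch below clusters segments greedily by the distance between their voice feature vectors. It assumes each segment has already been reduced to a small numeric vector; the function name `assign_speakers`, the two-dimensional vectors, and the distance threshold are hypothetical, and real systems use far richer voice models.

```python
import math
from typing import List, Sequence


def assign_speakers(
    embeddings: Sequence[Sequence[float]],
    max_distance: float = 0.5,  # assumed threshold for "same voice"
) -> List[int]:
    """Greedy clustering: each segment joins the nearest known speaker
    centroid, or starts a new speaker if none is close enough."""
    centroids: List[List[float]] = []
    counts: List[int] = []
    labels: List[int] = []
    for vec in embeddings:
        best, best_dist = None, float("inf")
        for idx, centroid in enumerate(centroids):
            dist = math.dist(vec, centroid)  # Euclidean distance
            if dist < best_dist:
                best, best_dist = idx, dist
        if best is None or best_dist > max_distance:
            # No known speaker is close enough: create a new one.
            centroids.append(list(vec))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the matched speaker's centroid as a running mean.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (v - c) / n for c, v in zip(centroids[best], vec)]
            labels.append(best)
    return labels


# Hypothetical voice vectors for four consecutive segments.
segment_vectors = [(0.1, 0.9), (0.9, 0.1), (0.15, 0.85), (0.88, 0.2)]
print(assign_speakers(segment_vectors))  # [0, 1, 0, 1]
```

The output labels say that segments 1 and 3 belong to one speaker and segments 2 and 4 to another, which is the ‘who speaks when?’ answer in its simplest form.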

3 | Decoding

This is when the actual transcription starts. A list of possible phonemes (the basic sound units of speech) is established for each audio segment. At this stage, no full sentences have been generated, only one long list of possibilities, each with a score.
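The scored list of possibilities can be pictured with a small sketch. It assumes an acoustic model has already produced a probability for each phoneme in each short audio frame; the probabilities, the phoneme labels, and the function name `top_candidates` are invented for illustration.

```python
import math
from typing import Dict, List, Tuple


def top_candidates(
    frame_probs: List[Dict[str, float]],
    k: int = 2,  # assumed number of candidates kept per frame
) -> List[List[Tuple[str, float]]]:
    """For each frame, keep the k most probable phonemes, each paired
    with its log-probability score."""
    candidates = []
    for probs in frame_probs:
        ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
        candidates.append([(phoneme, math.log(p)) for phoneme, p in ranked])
    return candidates


# Hypothetical per-frame phoneme probabilities from an acoustic model.
frame_probs = [
    {"k": 0.7, "g": 0.2, "t": 0.1},
    {"ae": 0.6, "eh": 0.3, "ah": 0.1},
    {"t": 0.5, "d": 0.4, "p": 0.1},
]
for frame in top_candidates(frame_probs):
    print([(phoneme, round(score, 2)) for phoneme, score in frame])
```

Nothing is decided yet at this point: every frame still carries several competing candidates, and the scores are what the next step uses to choose between them.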

4 | Rescoring

From all the candidate phonemes and words produced during decoding, the computer chooses those most likely to form the most accurate sentence (it’s a little like the way a GPS identifies the best route). It is this sentence that is transcribed into the document.
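The GPS analogy corresponds to a best-path search. The sketch below uses a Viterbi-style search that combines each candidate word's acoustic score with a toy bigram language model score and keeps the highest-scoring path. The word candidates, the scores, the `UNSEEN_BIGRAM` penalty, and the function name `best_sentence` are all assumptions made for the example, not the actual Ubiqus rescoring method.

```python
from typing import Dict, List, Tuple

UNSEEN_BIGRAM = -5.0  # assumed penalty for word pairs missing from the toy model


def best_sentence(
    lattice: List[List[Tuple[str, float]]],
    bigram_logprob: Dict[Tuple[str, str], float],
) -> List[str]:
    """Viterbi search: pick one word per slot so that the sum of acoustic
    and language-model scores is as high as possible."""
    # best[word] = (total score, path) for the best path ending in `word`.
    best = {word: (score, [word]) for word, score in lattice[0]}
    for slot in lattice[1:]:
        new_best = {}
        for word, acoustic in slot:
            options = [
                (
                    prev_score
                    + bigram_logprob.get((prev, word), UNSEEN_BIGRAM)
                    + acoustic,
                    path + [word],
                )
                for prev, (prev_score, path) in best.items()
            ]
            new_best[word] = max(options, key=lambda option: option[0])
        best = new_best
    return max(best.values(), key=lambda option: option[0])[1]


# Two slots, each with two candidate words and acoustic log scores.
lattice = [
    [("recognize", -0.4), ("wreck", -0.3)],
    [("speech", -0.5), ("beach", -0.4)],
]
# Toy language model: "recognize speech" is a far more likely word pair.
bigrams = {("recognize", "speech"): -0.2, ("wreck", "beach"): -3.0}
print(" ".join(best_sentence(lattice, bigrams)))  # recognize speech
```

On acoustic scores alone, "wreck beach" would win in this toy example; adding the language model score is what moves the choice to "recognize speech".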

This process is applied to every segment of the recording. The final result is a complete transcription.

At the end of this automated process, the document is re-read by our teams, in the same way as any other Ubiqus document. In addition to checking the content as a whole, the proofreader will also ensure the speech has been correctly attributed.

To learn more about our translation interfaces and APIs, contact us.

Combining technology and human know-how at Ubiqus

Are you used to the quality of Ubiqus documents and tempted to try automatic transcription? Give it a go! The quality of the final document remains as high as that of a “traditional” transcription: once the automatic transcription has been carried out, a human proofreader reviews it, just as they would for a traditional transcription.

The sectors using our technologies

Learn more about our technological solutions for your industry.

Finance

• Standard translation

• Automatic online translation

• Minutes and summaries

Find out more

Medical

• Specialized translation

• Medical transcription

• Online medical translation

Find out more

Lifestyle

• Subtitling of your promotional videos

• Adaptation of your packaging

• Translation of your e-commerce site

Find out more

IT & media

• Automatic online translation
• Translation API and connectors
• Video content localization
• Optimized web translation

Find out more

Legal

• Legal translation

• Sworn translation

• Online legal translation

Find out more

Public Sector

• Minute-taking and summaries

• Translation

• Interpreting

Find out more

Industry

• Technical translation

• Compliance with your industry standards

• Minutes and meeting summaries

Find out more

Aerospace

• Technical translation

• Interpreting

• Writing minutes and meeting summaries

Find out more

Luxury

• Translation and proofreading

• Graphic design

• Copy editing

Find out more

Shall we talk about your project?

Contact us
