
This is how Google's speaker labeling works, one of the most impressive technologies on Android

Google has explained in detail how one of its best creations to date works.

It's not magic: this is how Google's speaker labeling, one of the most impressive technologies for Android, works
The Google Pixel Recorder app is one of the best tools Google has come up with so far.

It launched alongside the Google Pixel 4 in 2019 and has been a staple app on Pixel series devices ever since. This voice recording app seems like a simple tool, but Google has used it to showcase its advances in artificial intelligence, machine learning and speech recognition.

Recently, Google added a feature to this app that is almost like magic: it automatically recognizes whether several people are taking part in a conversation and labels each speaker's contributions in the transcription of the recording (the user can later change these labels to the speakers' names). All of this happens in real time and on the device, with no internet connection.

But although it looks simple in operation, behind this feature lies a very advanced technology, which Google wanted to explain in detail.

The Tensor processor brings to life one of the Google Pixel's best features

In an article focusing on advances in artificial intelligence, Google explains that much of the speaker labeling system runs on the CPU block of Tensor, the processor built into Google Pixel series devices starting with the Pixel 6. However, Google plans to delegate part of the work to the Tensor Processing Unit (TPU) in the future to reduce power consumption.

The feature is built on a speaker diarization system named "Turn-to-Diarize". Its job is to use optimized machine learning models to segment hours of recorded audio by speaker in real time, using the technical resources available on the Google Pixel.

Google has put together a number of different techniques to make this system work effectively. On the one hand, it can detect every change of speaker in the recording, and an encoder model is responsible for extracting each speaker's voice characteristics.

On the other hand, a clustering algorithm is responsible for assigning labels to each person who takes part in the recording.

Once the audio recording has been segmented into homogeneous speaker turns, we use a speaker encoder model to extract an embedding vector (i.e., d-vector) that represents the voice characteristics of each speaker turn.
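To make the idea of grouping speaker turns by their embedding vectors more concrete, here is a minimal sketch in Python. It is not Google's implementation: the article only says that a clustering algorithm assigns the labels, so this example swaps in a simple greedy centroid clustering based on cosine similarity. The function names, the three-dimensional toy vectors and the 0.8 threshold are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_turns(embeddings, threshold=0.8):
    """Greedy clustering (illustrative assumption, not Google's algorithm).

    Each turn joins the most similar known speaker if the similarity
    reaches `threshold`; otherwise it opens a new speaker.
    """
    centroids = []  # running mean embedding per speaker
    counts = []     # number of turns merged into each centroid
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            centroids.append(list(emb))
            counts.append(1)
            best = len(centroids) - 1
        else:
            n = counts[best]
            centroids[best] = [
                (c * n + x) / (n + 1) for c, x in zip(centroids[best], emb)
            ]
            counts[best] = n + 1
        labels.append(f"Speaker {best + 1}")
    return labels


# Hypothetical embeddings for four speaker turns (two voices alternating).
turns = [
    [0.90, 0.10, 0.00],
    [0.10, 0.90, 0.10],
    [0.85, 0.15, 0.05],
    [0.05, 0.95, 0.00],
]
labels = label_turns(turns)
print(labels)
```

In this toy input the first and third turns point in nearly the same direction, as do the second and fourth, so the turns end up grouped into two speakers. A real d-vector has many more dimensions, but the grouping principle is the same.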

One of the most notable aspects of this feature is that it learns from its mistakes over time. Google explains that as the model analyzes more and more audio, it can assign labels more precisely, and it can even correct previously assigned labels.

As the model consumes more audio input in our real-time speaker diarization system, it builds confidence in the predicted speaker labels and may occasionally make corrections to previously predicted low-confidence speaker labels. The Recorder app automatically updates on-screen speaker labels as you record to reflect the latest and most accurate predictions.
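The way earlier labels can change as more audio arrives can be illustrated with a toy example in Python. This is only a sketch under stated assumptions, not the method used by the Recorder app: the `StreamingLabeler` class, the nearest-prototype rule and the 0.5 threshold are all hypothetical. Each speaker is represented by the first embedding that did not match any known speaker, and every time a new turn arrives, all turns seen so far are labeled again against the current set of speakers.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class StreamingLabeler:
    """Toy nearest-prototype labeler (hypothetical, for illustration only)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.prototypes = []  # one reference embedding per speaker
        self.turns = []       # every turn embedding seen so far

    def add_turn(self, embedding):
        """Store a new turn, open a new speaker if needed, relabel everything."""
        self.turns.append(embedding)
        sims = [cosine(embedding, p) for p in self.prototypes]
        if not sims or max(sims) < self.threshold:
            self.prototypes.append(embedding)
        return self.labels()

    def labels(self):
        """Label every stored turn with its most similar speaker prototype."""
        result = []
        for turn in self.turns:
            sims = [cosine(turn, p) for p in self.prototypes]
            result.append(f"Speaker {sims.index(max(sims)) + 1}")
        return result


labeler = StreamingLabeler()
history = [
    labeler.add_turn([1.0, 0.0]),  # clear first voice
    labeler.add_turn([0.6, 0.8]),  # ambiguous turn, only one speaker known
    labeler.add_turn([0.0, 1.0]),  # clear second voice appears
]
for snapshot in history:
    print(snapshot)
```

In this example the ambiguous second turn is first attributed to Speaker 1, because no other speaker is known yet. Once a clearly different voice appears, the same turn is closer to the new speaker and its label is revised, which mirrors the on-screen label updates described above.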

What is remarkable is that this whole process can be done on a phone in real time, without resorting to any connection to a server. And although automatic labeling is currently only available in English, the feature is expected to support more languages in the future.
