Google develops Tacotron 2, a human-like text-to-speech AI system
Taking a giant leap towards its "AI first" dream, Google has come up with a text-to-speech AI system. This new system is sure to confuse you with its human-like articulation.
Google's new text-to-speech artificial intelligence system is called Tacotron 2, and it produces AI-generated speech that closely matches a human voice, claims a report by Inc.com. At Google I/O 2017, CEO Sundar Pichai announced that the company would focus on being "AI first" and launch several products and features such as Smart Reply for Gmail, Google Lens, and Google Assistant for iPhone.
As per a paper published on arXiv.org, Tacotron 2 first creates a spectrogram from the text, which is a visual representation of how the speech should sound. This spectrogram is then fed to Google's existing WaveNet algorithm, which turns it into audio that comes close to mimicking human speech. The WaveNet algorithm can learn different voices and even generate artificial breaths.
The Tacotron 2 researchers report that their model achieves a mean opinion score (MOS) of 4.53, compared with a MOS of 4.58 for professionally recorded speech. Based on the audio samples, Google claims that Tacotron 2 can tell nouns from verbs using context (for words such as "desert" and "present", which can be either) and alter the pronunciation accordingly. The AI system can also stress capitalized words and apply the proper inflection when a question is asked rather than a statement made, claims the company.
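To make the MOS figures concrete, here is a minimal sketch of how a mean opinion score is typically computed: listeners rate each clip on a 1-to-5 scale and the ratings are averaged. The function name and the sample ratings are hypothetical and are not taken from the paper; only the 4.53 and 4.58 figures come from the report.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of listener ratings on the usual 1-5 MOS scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("MOS ratings must lie between 1 and 5")
    return mean(ratings)


# Hypothetical ratings from four listeners for a single clip.
example_ratings = [5, 4, 5, 4]
print(mean_opinion_score(example_ratings))

# Gap between the two scores quoted in the report.
reported_gap = round(4.58 - 4.53, 2)
print(reported_gap)
```

On that scale, the reported gap between Tacotron 2 and professionally recorded speech is only 0.05 points.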
However, the Google engineers have not revealed a lot of details about the Tacotron 2 text-to-speech artificial intelligence system. But they have left a clue for the developers to figure out their progress in developing the system.
As per the report, each '.wav' sample file has a filename that contains either 'gen' or 'gt'. The published paper points out that 'gen' marks speech generated by Tacotron 2 and 'gt' marks real human speech. To be specific, 'gt' stands for 'ground truth', a machine learning term meaning 'the real deal'.
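The naming clue can be checked with a few lines of Python. This is only a sketch: it assumes the tag appears as a separate token in the filename (split on underscores, hyphens, or dots), and the example filenames and the `label_sample` helper are hypothetical, not the actual names used on Google's sample page.

```python
import re
from pathlib import PurePath

# Mapping from filename tag to its meaning, as described in the report.
TAGS = {"gen": "generated", "gt": "ground truth"}


def label_sample(filename):
    """Return 'generated', 'ground truth', or None if no tag is found."""
    stem = PurePath(filename).stem.lower()
    # Split the stem into tokens on any non-alphanumeric character.
    for token in re.split(r"[^a-z0-9]+", stem):
        if token in TAGS:
            return TAGS[token]
    return None


# Hypothetical filenames, for illustration only.
for name in ["sample_01_gen.wav", "sample_01_gt.wav", "intro.wav"]:
    print(name, "->", label_sample(name))
```

Matching whole tokens rather than substrings avoids mislabeling a file whose name merely happens to contain the letters "gt" or "gen" inside a longer word.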