Published: Monday, May 13, 2020
Alexis “Lexi” Bogan had a vivacious voice before the summer of last year.
She would sing Taylor Swift and Zach Bryan songs in the car. She was always laughing, even when she was taming misbehaving toddlers or discussing politics with her friends around a backyard firepit. She was a soprano when she attended high school.
The voice disappeared.
In August, doctors removed a life-threatening tumor located near the back of her brain. When the breathing tube was removed a month later, Bogan struggled to swallow and was unable to greet her parents. Her speech remains impaired despite months of rehabilitation. Strangers, friends and even her family struggle to understand what she is saying.
The 21-year-old got her voice back in April. It’s not her real voice, but one generated by artificial intelligence that she can summon through a phone app. Trained on a 15-second time capsule of her teenage voice, taken from a cooking demonstration video she recorded for a high school project, her artificial but remarkably realistic-sounding AI voice can now say anything she wants.
She enters a few sentences or words into her phone, and the app reads them out loud.
Bogan’s AI voice said, “Hi. Can I please order a large iced espresso with brown sugar and oats?” as she held her phone out of her car window at a Starbucks drive-thru.
Experts warn that rapidly improving AI voice-cloning technology could amplify phone scams, disrupt democratic elections and violate the dignity of people who never consented to having their voices recreated to say things they never said.
It has been used to create deepfake robocalls to New Hampshire voters that mimicked President Joe Biden. Authorities in Maryland recently accused a high school athletic director of using AI to create a fake audio clip of the school’s principal making racist remarks.
But Bogan and a team of doctors at the Lifespan hospital group in Rhode Island believe they have found a use that justifies the risks. Bogan, the only person with her condition to have done so, is the first to be able to recreate a voice using OpenAI’s Voice Engine. Other AI providers, such as ElevenLabs, have tested similar technologies for people with speech impairments and speech loss. One lawyer now uses a voice clone to speak in court.
“We hope Lexi will be a pioneer as technology advances,” said Rohaid Ali, a neurosurgery resident at Brown University Medical School and Rhode Island Hospital. Millions of people with debilitating strokes, throat cancer or neurodegenerative diseases could benefit, he said.
“We must be aware of the risks but not forget the patient or the social good,” said Dr. Fatima Mirza, a resident who worked on the pilot. “We’re able to help Lexi regain her voice, and she can speak in a way that is true to her.”
Mirza and Ali caught the attention of OpenAI, the maker of ChatGPT, because of a previous project at Lifespan that used an AI chatbot to simplify medical consent forms. The San Francisco-based company reached out to them earlier this year while searching for promising medical applications of its new AI voice generator.
At the time, Bogan was still recovering slowly from surgery. Last summer, headaches, blurry vision and a drooping face alerted doctors at Hasbro Children’s Hospital in Providence. They discovered a vascular tumor the size of a golf ball pressing on her brainstem and intertwined with blood vessels and cranial nerves.
The pediatric neurosurgeon, Dr. Konstantina Svokos, said that it was a struggle to control the bleeding and remove the tumor.
Svokos explained that the location and severity of the tumor, combined with the difficulty of the 10-hour operation, affected Bogan’s ability to control her tongue muscles and her vocal cords. This impaired her ability to speak and eat.
Bogan explained, “It was almost as if a piece of my identity had been taken away when I lost my vocal cords.”
This year, the feeding tube was removed. Speech therapy is ongoing, and she can speak clearly in a quiet environment. However, there are no signs that her natural voice will return.
“I was beginning to forget how I sounded at some point,” Bogan said. “I’ve gotten so used to the way I sound.”
Whenever the phone rang at the family’s home in North Smithfield, a suburb of Providence, she would hand it to her mother. She felt that she was burdening friends when they went to noisy restaurants. Her hearing-impaired father struggled to understand what she was saying.
Meanwhile, doctors at the hospital were looking for a patient to test OpenAI’s technology.
Ali said the first person who came to Dr. Svokos’ mind was Lexi. The team reached out to her without knowing how she would react, and she was willing to give it a try and see what would happen.
Bogan went back several years to find an appropriate recording of her voice in order to “train” AI on the way she speaks. It was a video where she demonstrated how to make pasta salad.
Her doctors deliberately fed the AI only a 15-second clip of the video. Cooking sounds make other parts of the video imperfect. A short sample was also all that OpenAI’s system needed, an improvement over earlier technology that required much longer samples.
They also knew that getting something useful out of 15 seconds would be crucial for future patients who may not have a trace of their voice online. For such patients, a brief voicemail left for a family member may be the only sample available.
When they first tested the voice clone, everyone was amazed by its quality. The occasional glitches, such as a mispronounced word or a missing intonation, were almost imperceptible. In April, Bogan was given a phone app designed specifically for her that she alone can use.
Her mother, Pamela Bogan, said, “I cry every time I hear the sound of her voice.”
Lexi Bogan said, “It’s amazing that I can hear that sound again.” She added that it “boosted my confidence back to where it was a little before all of this happened.”
She uses the app 40 times per day and sends feedback intended to help future patients. As a first experiment, she spoke to the children at the preschool where she works as a teaching assistant. She typed in “ha ha haha,” expecting a robotic response. She was surprised to hear her old laugh.
She has used the app to ask for help at Target and Marshalls. She has been able to reconnect with her father. It has also made it easier for her to order fast food.
Bogan’s doctors are cloning the voices of other willing Rhode Island patients and plan to introduce the technology in hospitals all over the world. OpenAI has said that it will expand the use of Voice Engine cautiously; the tool is not yet available to the public.
Several smaller AI startups already sell voice-cloning services to entertainment studios or make them more widely available. Most voice-generation vendors prohibit impersonation and abuse, but they differ in how they enforce those rules.
“We want to ensure that all users whose voices are used in the service consent on a continuous basis,” said Jeff Harris, OpenAI’s product lead. “We want to ensure that the technology is not used for political purposes. We’ve been very selective in the people we give this technology to.”
Harris said OpenAI will next develop a “voice authentication tool” so that users can replicate only their own voice. He said that this could be “limiting” for patients like Lexi, who suddenly lost her ability to speak. “We do believe that we will need to have high trust relationships, especially with the medical providers, in order to give a bit more unfettered accessibility to the technology.”
Bogan’s doctors were impressed by how much thought she has given to the ways the technology could help other people with speech impairments.
Mirza said Bogan had thought of ways to improve and change the process. “She has been an inspiration to us.”
Bogan envisions a future AI voice engine that improves on older speech-recovery remedies, such as the robotic-sounding electrolarynx or a voice prosthesis, by integrating with the body or translating words in real time.
She is less certain about what will happen as she grows older and her AI voice continues to sound as it did when she was a teenager. Perhaps the technology could “age” her AI voice, she said.
She said that even though her voice is not fully restored, she has found a way to find it again.
___
OpenAI and The Associated Press have a technology and licensing agreement which allows OpenAI to access a part of AP’s text archives.
Source: ABC News