The well-known speaker Thomas MacEntee recently gave a presentation to my local genealogy club about how to use AI. During the Q&A period, I asked him whether AI could translate and transcribe foreign language audio for free. His answer was a definite yes. So this week I began to experiment.
I have a 1998 video interview of my mother's first cousin, Viola, speaking emotionally about her experiences as a Holocaust survivor and her early years with her family. The interview was conducted in Russian in Israel through the USC Shoah Foundation. A decade ago, a friend who knows Russian kindly translated the gist of this video interview. Now I wanted to see what AI could do, for free, to help me better understand the family history comments that Viola made early in the interview.
Process: video audio to digital audio to transcript
First, I popped the DVD into my player and, as soon as the interview began, I started recording a voice memo on my iPhone. My first audio recording was 11 minutes long. Keeping it short was important because Thomas said that without a paid AI account, it's better to keep projects shorter and simpler to get things done.
Next, I had to change the m4a recording to mp3 format, which I did with a free online converter (I used CloudConvert but there are other sites out there).
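For readers who would rather not upload audio to a conversion site, the same m4a-to-mp3 step can be done on a home computer. Below is a minimal sketch in Python, assuming the free ffmpeg tool is installed and on the PATH. This is not the method used in this post (that was CloudConvert), and the file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: str, target: str) -> list[str]:
    """Return the ffmpeg argument list for an audio-only MP3 conversion."""
    return [
        "ffmpeg",
        "-i", source,               # input file, e.g. an iPhone voice memo
        "-vn",                      # skip any video or cover-art stream
        "-codec:a", "libmp3lame",   # encode the audio as MP3
        "-q:a", "2",                # variable bitrate, high quality
        target,
    ]


def convert_to_mp3(source: str) -> Path:
    """Convert `source` to an MP3 saved beside it. Requires ffmpeg."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg is not installed or not on PATH")
    target = Path(source).with_suffix(".mp3")
    subprocess.run(build_ffmpeg_command(source, str(target)), check=True)
    return target


# Hypothetical usage:
# convert_to_mp3("viola_part1.m4a")  # writes viola_part1.mp3
```

Building the command in its own function makes it easy to inspect what will run before any file is touched.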
I tried uploading the mp3 to ChatGPT (free version) and asked for a transcription and translation from the Russian. But the AI responded: "It looks like I can’t run Whisper (speech-to-text) directly in this environment, so I can’t transcribe the MP3 automatically here."
So I uploaded the mp3 recording in Russian to TurboScribe (one of many sites that do this) and I asked for a free transcription. I chose the best quality/accuracy and within minutes, I downloaded the written output as a pdf.
Formatted translation from the Russian
Finally, I took the pdf of the Russian transcription and uploaded it to ChatGPT, explaining a bit about this being an interview. I asked for this transcript to be translated into English and formatted as interviewer and interviewee.
The AI had no difficulty distinguishing between the voices of the person asking the questions and the person answering. It did ask whether I wanted a summary or a complete transcription (I wanted everything). Also it asked whether I wanted some original terminology left as is, with translation in brackets (yes).
ChatGPT finished that initial translation and asked for me to upload more so it could create a single, seamless document. So I went back and recorded 6 more minutes, going through the audio to digital audio to mp3 conversion rigamarole, next getting the free transcription, and then uploading the pdf from this second segment to ChatGPT.
This time, I named Viola as the interviewee and the AI showed her name in front of all of her responses. In the blink of two eyes, the answer showed me both segments compiled into one seamless interview about Viola's mother, father, and grandparents and their life before World War II. The top of page one is shown here.
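To make the "compile into one document" step concrete, here is a rough sketch in Python of what combining two translated segments and swapping in the interviewee's name amounts to. It only illustrates the idea and says nothing about how ChatGPT works internally. The function name, the data layout, and the placeholder text are all assumptions made for this example.

```python
def merge_segments(segments, interviewee_name):
    """Join several interview segments into one labeled transcript.

    Each segment is a list of (speaker, text) pairs, where speaker is
    "Interviewer" or "Interviewee". The generic "Interviewee" label is
    replaced with the person's name.
    """
    lines = []
    for segment in segments:
        for speaker, text in segment:
            label = interviewee_name if speaker == "Interviewee" else speaker
            lines.append(f"{label}: {text}")
    return "\n".join(lines)


# Placeholder text only, not actual interview content.
part_one = [("Interviewer", "(first question)"),
            ("Interviewee", "(first answer)")]
part_two = [("Interviewer", "(second question)"),
            ("Interviewee", "(second answer)")]

print(merge_segments([part_one, part_two], "Viola"))
```

The segments stay in the order they were recorded, so the combined transcript reads as one continuous interview.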
Output: Formatted to my specs
ChatGPT preserved some of the original terminology (see image at top, look for the word for tavern) and some of the less distinct words were picked up and translated, too.
No cut and paste for me. I just asked for a .docx Word document, which was quickly created for easy and free download. The output is in complete sentences, with proper punctuation, a smooth read. I added a note stating that ChatGPT had created the document, along with the date. Done!
Use with caution
ChatGPT warns that it can make mistakes (see image here). I also asked it to please delete the file at the end, after I had finished my download.
Given how many steps were needed to go from video interview to final document, there are multiple opportunities for mistakes, omissions, and typos to creep in. Very likely some nuances got lost along the way, but in the end, I believe this was a successful experiment. Thank you to Thomas MacEntee for the encouragement!