2 Comments
Feb 14 · edited Feb 14 · Liked by Marco Moauro

Very interested in this project. Any thoughts or ideas on doing this for other types of videos that do not have a transcript available? AI can ingest and analyze audio much faster than a human can, so something like building the transcript from a video's audio on the fly might work.

I think I may have found some information. I'm looking for a self-hosted or development solution similar to this -> https://summify.io/

author

Hi Movah, I'm glad this is of interest to you!

Yes, I had thought about it. There are solutions like Deepgram (https://deepgram.com/product/speech-to-text) or MS Azure (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-stt-diarization?tabs=windows&pivots=programming-language-csharp) that start from a video and extract the text via ASR (automatic speech recognition).

I was also thinking about self-hosting OpenAI's Whisper.
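As a rough idea of what the self-hosted Whisper route might look like, here is a minimal sketch. It assumes the open-source `openai-whisper` Python package and ffmpeg are installed locally; the helper names (`format_timestamp`, `segments_to_transcript`, `transcribe_file`) and the default `"base"` model size are illustrative choices, not part of this project.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_transcript(segments: Iterable[Mapping]) -> str:
    """Turn Whisper-style segments (dicts with 'start' and 'text')
    into one timestamped line per non-empty segment."""
    lines = []
    for seg in segments:
        text = seg["text"].strip()
        if text:
            lines.append(f"[{format_timestamp(seg['start'])}] {text}")
    return "\n".join(lines)


def transcribe_file(path: str, model_name: str = "base") -> str:
    """Hypothetical wrapper: run a locally hosted Whisper model on an
    audio or video file and return a timestamped transcript."""
    # Imported lazily so the formatting helpers work without the package.
    import whisper

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(path)         # decodes the audio track via ffmpeg
    return segments_to_transcript(result["segments"])
```

Calling `transcribe_file("talk.mp4")` on a local file (the filename is a placeholder) would give a transcript to use in place of the missing one. Larger model sizes are generally more accurate but slower to run.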

Definitely a topic I will explore in the future!
