Hey, I'm Marco and welcome to my newsletter!
As a software engineer, I created this newsletter to share my first-hand knowledge of the development world. Each topic we explore will provide valuable insights, with the goal of inspiring and helping you on your journey.
In this episode I show you the process behind Quickview, my crypto newsletter where every day I publish AI-based summaries of YouTube videos from top creators on the topic.
I am developing a SaaS to create email sequences on the Substack platform, a feature currently unavailable. You can set up sequences such as emailing free subscribers 7 days after they sign up, or emailing paying customers after 30 days.
If you are interested, please feel free to contact me. I will be happy to talk and share current progress.
1) 🚀 Quickview
The world of cryptocurrency moves fast, with news every day, which makes it hard to keep up. I follow about twelve popular creators on YouTube who post almost every day: 6 to 8 videos a day, roughly 50 videos a week.
That's why a few months ago I thought I would summarize the YouTube videos of these creators and share them in a newsletter. That's how Quickview was born. You can subscribe here:
Being a big fan of technology, I thought, why not find a way to automatically collect, summarize, and share new video summaries?
So I did, and in this episode I will show you how.
2) 🗄️ Database modeling
To support this, I created a PostgreSQL database in which I modeled the domain as follows:
videos: stores all posted videos; holds only references, including the creator and the unique YouTube ID.
creators: stores all creators whose videos are of interest.
categories: contains creator categories; each creator belongs to at least one category.
languages: contains the list of languages supported by the application.
summaries: contains video summaries; each summary refers to a video summarized in a specific language.
newsletters_substack and substack_tokens: contain the references that allow posts to be published automatically to the Substack newsletter: the newsletter URL and the token for impersonation.
newsletters: represents the publication of an episode for a specific Substack newsletter and a specific video category. Each episode may contain a variable list of summaries.
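The tables above can be sketched as a minimal schema. This is purely an illustration of the relationships described (the real database is PostgreSQL; here SQLite is used so the snippet is self-contained, and all column names are assumptions):

```python
import sqlite3

# Hypothetical sketch of the schema described above.
# The real project uses PostgreSQL; table columns are assumptions.
SCHEMA = """
CREATE TABLE languages (code TEXT PRIMARY KEY);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE creators (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
CREATE TABLE videos (
    youtube_id TEXT PRIMARY KEY,                      -- unique YouTube id
    creator_id INTEGER NOT NULL REFERENCES creators(id),
    published_at TEXT                                 -- used to prioritize recent videos
);
CREATE TABLE summaries (
    id INTEGER PRIMARY KEY,
    video_id TEXT NOT NULL REFERENCES videos(youtube_id),
    language TEXT NOT NULL REFERENCES languages(code),
    body TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```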
3) 🤖 How automation happens
As I mentioned earlier, the process consists of 3 major steps:
Collection of YouTube videos
Generation of the summary
Publication of a new post every day
To automate these three steps, I made a simple backend in which I created several scripts, each for a specific task. I then deployed it on Heroku, setting it up to automate these tasks with the help of the Heroku Scheduler add-on.
You can refer to these posts I published earlier for a simple implementation:
1) Collection of YouTube videos
To collect YouTube videos, I use a script that runs every hour (via cron). It checks the YouTube channels of the creators stored in the database, gathers video IDs, and saves them. Additionally, for each video, I fetch the publication date by making another HTTP call to the video's webpage. I save this date information to prioritize summarizing the most recent videos.
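A minimal sketch of the collection step, assuming video IDs are pulled out of the channel page's HTML with a regex (the actual fetching and parsing in my script may differ; the sample HTML and function name here are illustrative):

```python
import re

def extract_video_ids(channel_html: str) -> list[str]:
    """Extract unique 11-character YouTube video IDs from channel page HTML."""
    ids = re.findall(r'"videoId"\s*:\s*"([A-Za-z0-9_-]{11})"', channel_html)
    seen, ordered = set(), []
    for vid in ids:
        if vid not in seen:  # keep the first occurrence, preserving order
            seen.add(vid)
            ordered.append(vid)
    return ordered

# In the real script this HTML would come from an HTTP GET of the channel page.
sample = '{"videoId":"dQw4w9WgXcQ"} ... {"videoId":"dQw4w9WgXcQ"} {"videoId":"abc123XYZ_-"}'
print(extract_video_ids(sample))  # ['dQw4w9WgXcQ', 'abc123XYZ_-']
```

The IDs returned here would then be saved to the videos table, along with the publication date fetched from each video's webpage.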
2) Generation of the summary
Every hour, another scheduled task (cron) checks all the videos that haven't been summarized yet. I use ChatGPT for summarizing. Through calls to the completions API, I can create summaries for each video, containing up to 6 paragraphs. I've explained the process in an earlier post; if you missed it, you can find it here:
I've made enhancements in this version. With the release of the new model gpt-3.5-turbo-1106, the context window has expanded from 4k to 16k tokens without an increase in cost. I've removed the recursive summarization part, which used to lose context when splitting and merging text. Now, with a 4 times larger context window, I can summarize the video in one call, resulting in a better quality outcome.
The key factor now is the prompt, and the one I use is like this:
system message:
You are designed to work with any type of YouTube video transcript.
Act as a journalist. Your task is to write a summary of the transcript that the user will provide you.
You'll focus on identifying key themes and ideas across various content genres.
user message:
here I pass the full transcript (captions) of the video
system message:
Generate the summary in the language: "${LANGUAGE_CODE_TO_LANGUAGE[language]}" between 300 and 500 words.
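Putting the prompt together, here is a minimal sketch of the messages list sent to the completions API. The helper name and the language mapping are assumptions; the actual API call is shown only as a comment:

```python
# Assumed mapping; the real LANGUAGE_CODE_TO_LANGUAGE may contain more entries.
LANGUAGE_CODE_TO_LANGUAGE = {"en": "English", "it": "Italian"}

def build_messages(transcript: str, language: str) -> list[dict]:
    """Assemble the chat messages for a single-call summary, mirroring the prompt above."""
    return [
        {"role": "system", "content": (
            "You are designed to work with any type of YouTube video transcript.\n"
            "Act as a journalist. Your task is to write a summary of the transcript "
            "that the user will provide you.\n"
            "You'll focus on identifying key themes and ideas across various content genres."
        )},
        {"role": "user", "content": transcript},
        {"role": "system", "content": (
            f'Generate the summary in the language: "{LANGUAGE_CODE_TO_LANGUAGE[language]}" '
            "between 300 and 500 words."
        )},
    ]

# The actual call would then be something like:
# response = client.chat.completions.create(
#     model="gpt-3.5-turbo-1106",
#     messages=build_messages(captions, "en"),
# )
```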
The target language is flexible; for Quickview I chose English. After generating the summaries, I save them in the database, linking each summary to its respective video.
3) Publication of a new post every day
I've set up a scheduled task (cron) that runs daily at 6 p.m. Italian time. It fetches summaries for the latest 6 videos to create a new newsletter post. Initially, I did this manually every day, but I recently improved the script. By reverse engineering the Substack platform, I found a way to create a draft post directly, which I schedule for the next morning. This has been challenging and time-consuming but it saves me a lot of time daily.
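One small piece of that step, computing the "next morning" publish time for the draft, can be sketched like this. The 8 a.m. hour is an assumption, and the Substack draft-creation call itself is omitted since it comes from reverse engineering:

```python
from datetime import datetime, timedelta

def next_morning(now: datetime, hour: int = 8) -> datetime:
    """Return the next day's publish time, e.g. 8:00 the morning after the 6 p.m. run."""
    tomorrow = now.date() + timedelta(days=1)
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, hour, 0)

run = datetime(2024, 1, 15, 18, 0)   # the daily 6 p.m. cron run
print(next_morning(run))             # 2024-01-16 08:00:00
```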
At the beginning of the post I also include a bulleted list of the topics covered in the episode. To do this, I use ChatGPT as follows: I extract 5 topics from each video and combine them, for a total of 30. Then I use artificial intelligence again to condense these into 5 topics, which I include in the post and also use to generate the episode title, once more leveraging ChatGPT.
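The topic step above can be sketched as: collect 5 topics per video, flatten them, and ask the model to condense them into 5. The prompt wording and helper name are assumptions:

```python
def build_condense_prompt(topics_per_video: list[list[str]], n: int = 5) -> str:
    """Flatten the per-video topic lists and build a prompt asking for n condensed topics."""
    all_topics = [t for topics in topics_per_video for t in topics]  # e.g. 6 videos x 5 = 30
    bullet_list = "\n".join(f"- {t}" for t in all_topics)
    return (
        f"Summarize the following {len(all_topics)} topics into {n} topics, "
        f"as a bulleted list:\n{bullet_list}"
    )

prompt = build_condense_prompt([["ETF inflows", "BTC price action"], ["Halving", "Mining costs"]])
print(prompt)
```

The model's 5-topic answer then goes at the top of the post and feeds a second prompt that generates the episode title.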
Here is an example of a published post:
And that’s it for today! If you are finding this newsletter valuable, consider doing any of these:
🍻 Read with your friends — This newsletter lives thanks to word of mouth. Share the article with someone who would like it.
📣 Provide your feedback — We welcome your thoughts! Please share your opinions or suggestions for improving the newsletter; your input helps us tailor the content to your tastes.
I wish you a great day! ☀️
Marco