Hey, I'm Marco and welcome to my newsletter!
As a software engineer, I created this newsletter to share my first-hand knowledge of the development world. Each topic we will explore will provide valuable insights, with the goal of inspiring and helping all of you on your journey.
In this episode, I will explain how to build a search engine with Elasticsearch. We will use short descriptions of YouTube videos to build an app that lets you find video summaries by searching with full sentences.
At the end of this episode, I'll show you a user interface application that I created. You can use it to see the project in action.
👋 Introduction
In the next parts, I'll talk about the systems I use. I'll show you each step so you can do it yourself for your own projects!
1) Prerequisites
Node.js: For this tutorial you need Yarn and Node.js installed. I used the LTS version 20.9.0. If you don't have Node.js on your machine, you can install it from the official website.
Heroku: I explained step by step in this episode how to create an account and take advantage of the 1000-hour execution plan.
Elasticsearch: We need a running instance to build the project. I will explain later how to get one for free with Heroku.
You can download all the code shown for free directly from my GitHub repository: https://github.com/marcomoauro/fulltext-search-be
🔎 Elasticsearch
Elasticsearch is a distributed, open-source search and analytics engine. It is designed to store, analyse and query large amounts of data in near real time, and it is known for its speed, scalability and flexibility. Elasticsearch is particularly useful for full-text search and for log and event analysis, and it is often used as part of the ELK (Elasticsearch, Logstash, Kibana) stack.
Elasticsearch works by dividing data into documents, which are JSON objects containing the data to be indexed. These documents are organised into indexes, which can be thought of as logical groups of similar documents. Once the data has been indexed, it can be queried using a variety of search and analysis methods provided by Elasticsearch.
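To make the document and index idea concrete, here is a minimal sketch of what the REST calls look like. It only builds the request descriptions and does not contact a cluster. The index name `video-summaries` and the fields `title` and `summary` are my assumptions for illustration, not names taken from the project.

```javascript
// Hypothetical index name; the real project may use a different one.
const INDEX = 'video-summaries';

// A document is just a JSON object (field names are assumptions).
const doc = {
  title: 'How to deploy a Node.js app',
  summary: 'A walkthrough of deploying a Node.js backend to Heroku.',
};

// Describe the REST call that stores a document under a given id.
function indexRequest(index, id, document) {
  return {
    method: 'PUT',
    path: `/${index}/_doc/${encodeURIComponent(id)}`,
    body: JSON.stringify(document),
  };
}

// Describe the REST call that runs a full-text `match` query on one field.
function searchRequest(index, field, text) {
  return {
    method: 'POST',
    path: `/${index}/_search`,
    body: JSON.stringify({ query: { match: { [field]: text } } }),
  };
}

console.log(indexRequest(INDEX, '1', doc));
console.log(searchRequest(INDEX, 'summary', 'deploy node backend'));
```

Sending these requests to a running instance (with `Content-Type: application/json`) would store the document and then search it. Keeping the request building separate from the HTTP call makes it easy to inspect what is sent.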
Elasticsearch offers a wide range of functionalities, including:
Full-Text Search: Elasticsearch is known for its ability to perform full-text searches on large datasets efficiently and quickly.
Horizontal Scalability: It is designed to easily scale horizontally, which means it can handle a high volume of queries and data distributed across multiple nodes.
Geographical Search: Elasticsearch offers geographical search functionality, which allows location-based data to be found and analysed.
Log Analysis: Elasticsearch is often used to monitor and analyse large amounts of log data from servers, applications and other systems.
Semantic Search: With features such as synonyms, custom analyzers and vector (kNN) search, Elasticsearch can be used to build search engines that match on the meaning of a query rather than only its exact words, which gives more relevant results.
Aggregation and Analysis: Elasticsearch offers a number of aggregation functionalities that allow statistics to be calculated on indexed data.
In short, Elasticsearch is a powerful data search and analysis engine that allows large datasets to be stored, analysed and queried in real time, offering a range of advanced features for full-text search, log analysis and more.
1) Host a free version with Heroku!
We can use the Bonsai Elasticsearch add-on shown in this episode. The free plan allows us to create indexes that can contain a maximum of 35,000 documents with a total size of 125 MB.
We will see in the next section how to add it to the Heroku project.
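To get a feeling for the free-plan limits quoted above, here is a rough back-of-the-envelope check. It is only a sketch: I assume 1 MB = 1,000,000 bytes, and I measure raw JSON size, which is not the same as the size the index takes on disk. The function name `estimateUsage` is hypothetical.

```javascript
// Limits quoted in the text for the free Bonsai plan.
const MAX_DOCS = 35000;
const MAX_BYTES = 125 * 1000 * 1000; // assumption: 1 MB = 1,000,000 bytes

// Rough estimate only: raw JSON size differs from on-disk index size.
function estimateUsage(docs) {
  const bytes = docs.reduce(
    (sum, doc) => sum + Buffer.byteLength(JSON.stringify(doc), 'utf8'),
    0,
  );
  return {
    count: docs.length,
    bytes,
    fits: docs.length <= MAX_DOCS && bytes <= MAX_BYTES,
  };
}

// Average raw-size budget per document when the index is full.
const bytesPerDoc = Math.floor(MAX_BYTES / MAX_DOCS);
console.log(bytesPerDoc); // 3571
```

Under these assumptions, a full index leaves roughly 3.5 KB per document, which is plenty for a short video summary.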
👨‍💻 Let's get down to practice
Let us now move on to the implementation. The steps are as follows:
Creation of the server based on my backend template
Host Bonsai Elasticsearch add-on
Index creation
Seed with YouTube video summaries
API development: route, controller and model
Deploy and test the API
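As a preview of the index creation and seed steps listed above, here is a minimal sketch of the two request bodies involved. It is not necessarily how the repository does it. The field names (`title`, `summary`, `url`) and the helper names are my assumptions. The first body would go to `PUT /<index>`, the second to `POST /<index>/_bulk` with the `application/x-ndjson` content type.

```javascript
// Index definition with hypothetical fields for a video summary.
// `text` fields are analysed for full-text search, `keyword` is stored as-is.
function createIndexBody() {
  return {
    mappings: {
      properties: {
        title: { type: 'text' },
        summary: { type: 'text' },
        url: { type: 'keyword' },
      },
    },
  };
}

// Build the newline-delimited JSON payload for the _bulk endpoint:
// one action line plus one source line per document, each ending in "\n".
function bulkBody(videos) {
  const lines = [];
  for (const { id, ...source } of videos) {
    lines.push(JSON.stringify({ index: { _id: id } }));
    lines.push(JSON.stringify(source));
  }
  return lines.map((line) => line + '\n').join('');
}

const seed = [
  { id: '1', title: 'Intro to Elasticsearch', summary: 'What an index is.' },
  { id: '2', title: 'Heroku basics', summary: 'Deploying a Node.js app.' },
];
console.log(JSON.stringify(createIndexBody(), null, 2));
console.log(bulkBody(seed));
```

Seeding in one bulk request avoids one HTTP round trip per document, which matters when loading thousands of summaries.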
1) Creation of the server based on my backend template
Let us start with the backend template I made for Node.js and showed in this episode.
Create the project folder with:
mkdir fulltext-search-be
cd fulltext-search-be
Now copy and paste the template inside, and remember to update the name key in the package.json file. The project tree should look like this:
You can safely remove the files controllers/newsletters.js and models/Newsletter.js, along with the GET /newsletters/:id API defined in router.js. They were introduced in the previous episode and we no longer need them.