Have you ever listened to a song and liked it overall, but felt that certain features could be improved? Maybe the chorus was too loud, the energy level wasn’t quite right, or there were either too many or too few words. I’ve had those experiences too, and that’s what inspired me to create a song recommender based on user reviews. So how does it work? Let’s walk through the process in four key steps:
Choose Dataset
The main goal is to create a working example, so I tried to keep the dataset small enough to limit computational cost yet large enough to provide relevant results. The dataset I used consists of approximately 170,000 songs obtained from the Spotify API. To describe the songs, I used the same 15 features provided by Spotify. These features offer valuable insights into the characteristics of each song; some of them are described below.
- Acousticness: This value describes how acoustic a song is. A score of 1.0 means the song is most likely to be an acoustic one.
- Instrumentalness: This value estimates how likely the song is to contain no vocals. The closer it is to 1.0, the more instrumental the song is.
- Liveness: This value describes the probability that the song was recorded with a live audience. According to the official documentation “a value above 0.8 provides strong likelihood that the track is live”.
- Speechiness: “Speechiness detects the presence of spoken words in a track”. If the speechiness of a song is above 0.66, it is probably made entirely of spoken words; a score between 0.33 and 0.66 describes a song that may contain both music and speech; and a score below 0.33 most likely means music with little or no speech.
- Energy: “(energy) represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy”.
- Danceability: “Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable”.
- Valence: “A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry)”.
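To make the thresholds quoted in the list above concrete, here is a minimal sketch that turns the speechiness and liveness scores into readable labels. The function names and label strings are hypothetical and only for illustration; the cutoffs (0.33, 0.66, and 0.8) are the ones mentioned in the feature descriptions.

```python
def describe_speechiness(speechiness):
    """Map a speechiness score (0.0 to 1.0) to a rough label."""
    if speechiness > 0.66:
        return "mostly spoken words"
    if speechiness >= 0.33:
        return "mix of music and speech"
    return "mostly music"


def is_probably_live(liveness):
    """A liveness above 0.8 gives a strong likelihood that the track is live."""
    return liveness > 0.8


# Made-up example values, not taken from the dataset.
print(describe_speechiness(0.05))
print(is_probably_live(0.85))
```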
Create Model
I use k-means clustering as the main model, after standardizing the features with a Standard Scaler. K-means clustering is a popular unsupervised machine learning algorithm used to group similar data points into clusters. It works by iteratively assigning data points to clusters and updating the cluster centroids until convergence. The algorithm begins by randomly selecting K centroids, which represent the initial cluster centers. Then, in the assignment step, each data point is assigned to the cluster whose centroid is closest to it, typically based on the Euclidean distance. Once all data points are assigned, the algorithm proceeds to the update step, where the centroids are recalculated by taking the mean of the data points assigned to each cluster. These two steps are repeated iteratively until the centroids no longer move significantly or a maximum number of iterations is reached, resulting in the final clustering solution.
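The assignment and update steps described above can be sketched in plain Python. This is an illustrative toy version, not the project's actual code: in practice a library implementation would normally be used, and the helper names (`standardize`, `kmeans`), the three-feature toy songs, and the choice of K = 2 are all assumptions made for the example.

```python
import math
import random


def standardize(rows):
    """Scale every column to zero mean and unit variance, as a standard scaler does."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, max_iter=100, seed=0):
    """Return (labels, centroids) after running the assign/update loop."""
    rng = random.Random(seed)
    # Initialization: pick K distinct data points as the starting centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        if new_centroids == centroids:  # centroids stopped moving: converged
            break
        centroids = new_centroids
    return labels, centroids


# Toy songs as (energy, danceability, acousticness); the values are made up.
songs = [
    [0.90, 0.80, 0.05], [0.85, 0.75, 0.10], [0.95, 0.90, 0.02],
    [0.20, 0.30, 0.90], [0.15, 0.25, 0.95], [0.25, 0.35, 0.85],
]
labels, centroids = kmeans(standardize(songs), k=2)
print(labels)
```

On this toy data the three energetic songs end up in one cluster and the three acoustic songs in the other. Which cluster gets label 0 depends on the random initialization.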
Process Reviews with NLP
When it comes to using user reviews to recommend songs, there’s a tricky aspect that needs attention. The idea behind my NLP-based Song Recommender project was to leverage a language model for analyzing the reviews and suggesting necessary changes to the song features mentioned earlier. Since I needed a language model for this task, I used ChatGPT because it is powerful and has an API. Creating a fine-tuned model can be expensive and requires a significant amount of data, which I didn’t have, so I explored an alternative approach: prompt engineering.
Prompt engineering involves crafting a prompt for the language model to elicit the desired response. In my case, I used a prompt that starts with “act as” (a pattern you are probably familiar with), and I provided the array of song values, including attributes like duration, energy, loudness, and more. Along with the song values, I included the user review and asked the model to suggest changes to the values based on the review. Initially, the results were not promising. However, once I also included the descriptions of these values from the dataset, the language model generated more meaningful and relevant recommendations based on the user review and the song’s attributes.
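As a rough illustration of the prompt structure described above, the sketch below assembles an “act as” instruction, the feature descriptions, the song values, and the user review into a single string. The exact wording of the prompt, the `build_prompt` helper, and the `FEATURE_DESCRIPTIONS` dictionary are assumptions for this example, not the prompt actually used in the project; sending the prompt to the language model API is deliberately left out.

```python
import json

# Hypothetical short descriptions, paraphrased from the feature list earlier.
FEATURE_DESCRIPTIONS = {
    "energy": "Perceptual measure of intensity and activity, from 0.0 to 1.0.",
    "speechiness": "Presence of spoken words in a track, from 0.0 to 1.0.",
    "valence": "Musical positiveness conveyed by a track, from 0.0 to 1.0.",
}


def build_prompt(song_values, review, descriptions=FEATURE_DESCRIPTIONS):
    """Combine role, feature descriptions, song values, and review into one prompt."""
    lines = [
        "Act as a music expert who tunes song features.",
        "Feature descriptions:",
    ]
    # Only describe the features that are present in this song's values.
    for name in song_values:
        lines.append(f"- {name}: {descriptions.get(name, 'no description available')}")
    lines += [
        f"Current song values (JSON): {json.dumps(song_values)}",
        f"User review: {review}",
        "Suggest changed values based on the review.",
        "Reply with a JSON object that uses the same keys and nothing else.",
    ]
    return "\n".join(lines)


prompt = build_prompt(
    {"energy": 0.4, "valence": 0.7},
    "Too sleepy, I want more punch.",
)
print(prompt)
```

Asking for a JSON reply with the same keys is one design choice that makes the answer easy to parse back into a feature vector; other output formats would work as well.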
Create Application
I use Streamlit, a powerful framework for creating web applications, to build the application. Streamlit offered the perfect combination of aesthetics, ease of development, and functionality for my project. I also use streamlit function menu to enhance the visuals and functionality. To get recommendations, click the random song button for a random song or enter a song yourself, then enter your critique of the song. After you click the start recommendation button, a spinner will appear. Once the spinner completes, you can access the recommended songs on the results page. Thanks also to abdelrhmanelruby for the base of the Streamlit application.
Conclusion
It’s important to note that the dataset used for training the model was relatively small (170k songs), but the recommender still aims to provide valuable suggestions. While it may not reach its full potential due to the limited data, it serves as a starting point for exploring new songs that align with your individual tastes.
So, if you’re looking to fine-tune your music experience and discover songs that better match your preferences, give this song recommender a try. Enter your review, and let the algorithm work its magic to recommend songs that you’re more likely to enjoy.