![](https://crypto4nerd.com/wp-content/uploads/2023/06/18DrG2pW5Y7MdbiqltcwECg-1024x473.png)

Image from https://dripify.io/the-perfect-linkedin-message-best-practices-3-templates/

Bingyang Hou, Jae Ihn, Jinhong Liu, and Ashwin Shankar. As students in the University of British Columbia's Master of Data Science in Computational Linguistics program, we introduce our Capstone project: a fine-tuned language model that automatically generates outreach messages tailored to sender and recipient information.
---
Cold contacting is a common marketing strategy for businesses trying to expand their customer base. You know those emails you get from people you've never met? That's cold contacting: door-to-door sales, but in the digital world. Popular social networking platforms such as LinkedIn even offer advertising campaigns where outreach messages can be sent to target audiences in bulk. You have probably received such messages yourself, but how did you respond to them? More often than not, you recognized them as obvious spam and simply deleted them from your inbox.
But honestly, just like door-to-door sales, the success rate is… well, not so high. What these messages lack is a personal touch: an acknowledgment of the recipient's work or interests, something that can help build a meaningful connection, a little warmth that makes you feel noticed for being specifically "you" and not just any random person off the internet. After all, who would want to take the time to respond to a boring, cookie-cutter message that could be meant for anybody? If the sender could just show that they had spent some time getting to know you… now, that would be a different story.
There is just one catch, though: there is usually a trade-off between the level of personalization and the efficiency of generating outreach messages. Think of it like writing a heartfelt letter to each customer: nice in theory, but who realistically has the time for that? Writing hyper-personalized messages would certainly improve the recipient response rate, but at the expense of the extra time and energy that goes into carefully crafting those messages. If only there were a way to automate the entire process and generate high-quality messages in bulk…
During our fast-paced six-week project, we rolled up our sleeves and dove in, building a machine learning pipeline designed to craft personalized messages. The pipeline tailors each message to the sender's and receiver's specific information, streamlining the entire process.
To build our data product, we took a page from the book of resourcefulness: fine-tuning an existing model rather than starting from scratch. In other words, we took a model that others had already trained and tailored it to our needs. This approach let our small-but-mighty team create powerful models without a mountain of resources.
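For the curious, here's roughly what that looks like in code. The sketch below uses Hugging Face Transformers; the checkpoint size, file name, field names, and hyperparameters are illustrative assumptions rather than our exact setup.

```python
# A minimal fine-tuning sketch with Hugging Face Transformers.
# NOTE: checkpoint, file name, field names, and hyperparameters are
# illustrative assumptions, not the project's exact configuration.
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

checkpoint = "google/flan-t5-base"  # assumed model size
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Hypothetical JSONL file: {"prompt": instruction + details, "message": target}
data = load_dataset("json", data_files="outreach_pairs.jsonl")["train"]

def preprocess(example):
    # Encode the instruction-style prompt; the human-written message is the label.
    enc = tokenizer(example["prompt"], max_length=512, truncation=True)
    enc["labels"] = tokenizer(text_target=example["message"],
                              max_length=128, truncation=True)["input_ids"]
    return enc

train = data.map(preprocess, remove_columns=data.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="flan-t5-outreach",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=3,
                                  learning_rate=3e-4),
    train_dataset=train,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```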
Among the many pre-trained models available, we chose one called Flan-T5. This model has a knack for following instructions, which made it a perfect match for our needs. We could prompt it with a simple task like 'Write a new message', feed it some details, and even show it examples of what we believe makes a good message. Flan-T5 was our helpful message chef, cooking up exactly what we ordered.
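To make that concrete, the prompt could look something like this; the wording and fields below are simplified stand-ins, not our production prompt.

```python
# Illustrative prompt shape (wording and fields are assumptions, not the
# production prompt): an instruction, the sender/receiver details, and an
# in-context example of a message considered good.
PROMPT_TEMPLATE = """Write a new message.

Sender value proposition: {value_prop}
Receiver profile: {receiver_profile}

Example of a good message:
{example_message}

New message:"""
```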
Our final data product is extremely easy to use. All you have to do is supply information about the sender and receiver, and the language model takes care of the rest. For the sender, you can provide your company's value proposition in free text; for the receiver, you can provide data scraped from their LinkedIn profile. The model then draws on both sets of information to create compelling candidate messages in a matter of seconds.
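Under the hood, producing a handful of candidates can be as simple as sampling several sequences from the fine-tuned model. A minimal sketch, assuming the model was saved to a local flan-t5-outreach directory and using made-up decoding parameters:

```python
# Generating candidate messages from a fine-tuned Flan-T5. The checkpoint
# directory, prompt contents, and decoding parameters are assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("flan-t5-outreach")  # hypothetical dir
model = AutoModelForSeq2SeqLM.from_pretrained("flan-t5-outreach")

prompt = (
    "Write a new message.\n"
    "Sender value proposition: We automate personalized LinkedIn outreach.\n"
    "Receiver profile: Data scientist at Acme; writes about NLP.\n"
    "New message:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=96, do_sample=True,
                         top_p=0.95, num_return_sequences=3)
for candidate in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(candidate)
```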
One challenge that we encountered was securing enough data to train our model. Our industry partner, Rocketbrew, provided us with data scraped from about 1,500 LinkedIn profiles, along with corresponding personalized messages written by their team. But to train an effective model, we wanted data on the order of tens of thousands of examples.
Instead of throwing in the towel, we decided to generate our own synthetic data based on the data we already had. Synthetic data can be thought of as artificially created information that closely mirrors real data. It's a stand-in, or a stunt double, that plays the role of real data in situations like model training.
Initially, we thought of creating a rule-based system to make new messages, but this quickly proved too inflexible to produce exemplary messages. Variety was crucial for us, especially since we wanted a model that generalized well to new senders and receivers.
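To give a flavour of why the rule-based route felt so rigid, here's a toy slot-filling generator (our own simplified illustration, not the actual rules we tried): every output shares the same skeleton, no matter whose profile goes in.

```python
# A toy rule-based (template slot-filling) generator. Templates and fields
# are invented for illustration; outputs are grammatical but formulaic.
import random

TEMPLATES = [
    "Hi {name}, I came across your work in {field} at {company} and was "
    "impressed. At {sender_company} we help teams like yours with "
    "{value_prop}. Open to a quick chat?",
    "Hello {name}, your experience in {field} caught my eye. "
    "{sender_company} builds {value_prop}, and I'd love to hear your thoughts.",
]

def rule_based_message(profile: dict) -> str:
    # Pick a template at random and fill in the slots from the profile.
    return random.choice(TEMPLATES).format(**profile)

print(rule_based_message({
    "name": "Alex", "field": "growth marketing", "company": "Acme",
    "sender_company": "Rocketbrew", "value_prop": "automated LinkedIn outreach",
}))
```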
We finally settled on engineering prompts for OpenAI's ChatGPT (GPT-3.5-turbo), which was able to generate natural messages appropriate for the provided data. With this technique, we generated 8,000 more messages, boosting the training process and the performance of our final model.
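Here's a minimal sketch of that synthetic-data step using the openai Python package (v1-style client). The prompts below are simplified placeholders; our engineered prompts were considerably more detailed.

```python
# Sketch of synthetic message generation with GPT-3.5-turbo via the openai
# package. The system and user prompts are simplified placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def synthesize_message(value_prop: str, receiver_profile: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0.9,  # higher temperature for more varied outputs
        messages=[
            {"role": "system",
             "content": "You write short, warm, personalized LinkedIn "
                        "outreach messages."},
            {"role": "user",
             "content": f"Sender value proposition: {value_prop}\n"
                        f"Receiver profile: {receiver_profile}\n"
                        "Write one outreach message under 300 characters."},
        ],
    )
    return resp.choices[0].message.content
```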
Another challenge we faced, perhaps the most significant one, was teaching our model true "personalization". Such a quality is too abstract and subjective to be captured accurately through machine learning. The model did demonstrate some degree of personalization, certainly much more than one-size-fits-all outreach messages. But it wasn't quite human-like, missing some of the connections a human writer might instinctively make between sender and receiver.
This was especially difficult since every message we crafted had to fit within the stringent 300-character limit set by LinkedIn. That cap limited how much personalization could be packed into each message, making it a rather tricky balancing act.
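One simple way to respect the limit, sketched here as a plausible post-processing step rather than a description of our exact pipeline, is to over-generate and keep only the candidates that fit:

```python
# Over-generate, then keep only candidates within LinkedIn's 300-character
# limit for connection notes. A simple post-filter sketch, not necessarily
# how our pipeline enforced the constraint.
LINKEDIN_CHAR_LIMIT = 300

def filter_by_length(candidates: list[str],
                     limit: int = LINKEDIN_CHAR_LIMIT) -> list[str]:
    trimmed = (c.strip() for c in candidates)
    return [c for c in trimmed if len(c) <= limit]
```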
This is an inherent problem of natural language understanding: machines learn linguistic patterns rather than truly "understanding" language. Still, we attempted to address it by designing a custom evaluation model. Using a set of messages scored by their level of personalization, our evaluation model performed contrastive learning, learning patterns that differentiate pairs of messages with different scores. Although this evaluation model was not perfect, it served as a reasonably good proxy for measuring personalization.
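Conceptually, the evaluation model boils down to pairwise ranking: given two messages whose personalization scores differ, the scorer should rank the better one higher. The PyTorch sketch below shows one common way to set that up; the embedding dimension, head architecture, and margin are assumptions for illustration.

```python
# A pairwise-ranking ("contrastive") training signal in PyTorch. Embedding
# dimension, head architecture, and margin are illustrative assumptions.
import torch
import torch.nn as nn

class PersonalizationScorer(nn.Module):
    """Maps a message embedding (e.g., from a sentence encoder) to a score."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                  nn.Linear(128, 1))

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        return self.head(emb).squeeze(-1)  # (batch,) scalar scores

scorer = PersonalizationScorer()
loss_fn = nn.MarginRankingLoss(margin=0.5)

def pairwise_loss(emb_better: torch.Tensor,
                  emb_worse: torch.Tensor) -> torch.Tensor:
    # target = +1 says "the first input should be ranked higher than the second"
    s_better, s_worse = scorer(emb_better), scorer(emb_worse)
    return loss_fn(s_better, s_worse, torch.ones_like(s_better))
```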
Stepping out of the relatively predictable world of the lab, we were introduced to the raw, vast expanse of real-world challenges through this project. Each obstacle we encountered taught us something new and made us more resourceful.
The data scarcity issue described above was a big change for us. In the controlled environment of a lab, data is often abundant and readily available, neatly organized for experimentation. In the real world, however, we found ourselves working with a limited amount of data, which challenged us to use it effectively and efficiently, and to find new ways to increase the quantity of available data while making sure the new data met quality standards.
Throughout the past six weeks, we gained valuable experience as budding data scientists.
In terms of technical skills, we had the opportunity to practice new techniques like synthetic data generation, prompt engineering, using virtual machines, and fine-tuning large language models. We were particularly impressed by ChatGPT's performance in generating synthetic data. If we could rewind time, we would certainly tell our past selves to embrace ChatGPT sooner instead of trying a rule-based approach for a highly subjective task like ours. While rule-based approaches are vital in a wide variety of situations, this just wasn't one of them.
We leveled up our soft skills as well, learning how to manage longer projects, how to communicate with clients in meetings, and how to document a project so it can be continued by other teams in the future. One particularly daunting task was securing funding for our project. This was a significant departure from our typical data-centric tasks and pushed us to develop a whole new set of skills. Through this, we learned the art of negotiating and advocating for our work in order to secure the resources we needed to see it through.
We also became familiar with the hardships of working as a data scientist and the overall process of creating a data science solution for real-world problems. Innovation often means leaving behind the tried-and-tested and daring to explore the unknown. In the realm of technology, where advancements like AI and machine learning are transforming the landscape, it's important to adapt, experiment, and be ready to venture into new territories. And that's the crux of our takeaway: embracing new possibilities can propel us toward solutions we might not have even imagined. This isn't just limited to our project; it's a valuable lesson we carry forward into our future endeavours.
Our capstone project was the perfect segue for us to transition from master’s students to professional data scientists. This journey wasn’t devoid of hurdles, but overcoming them has been rewarding.
As we embark on our next tech journeys, we’d like to remember that every challenge we encounter is just another opportunity for growth. With the right tools and a pinch of creativity, there’s no problem that can’t be solved. So, where do we go from here? Maybe it’s creating an automated birthday wisher, or a bot that places coffee orders based on your mood. Whatever it is, we’ll be sure to keep our spirits high, and our code even higher.
Signing off until our next adventure, adieu!