![](https://crypto4nerd.com/wp-content/uploads/2023/02/0oRtyVVyMCymQvmbP.png)
Business Understanding
The client is an online mobile ad business that provides the following services to its clients:
- the ability to design an interactive ad, also called a “creative”: a rich ad containing interactive elements such as a mini-game engine, video, text, and images.
- serving these creatives to audiences on behalf of the client.
- optimization of the creative design process, as well as targeting of guaranteed inventory, using sophisticated machine learning algorithms.
Each advertisement has its own creative. Until now, creatives have been produced based on the experience of the designers and the company, so there is no way to evaluate a creative during production or to know how well it might perform once served. To change this, an algorithm can be developed that helps optimize creatives based on campaign performance data.
In order to do that, you are tasked with developing a deep-learning-based computer vision (CV) algorithm that segments objects from creative assets and relates them to the KPI parameters of the corresponding campaigns.
Objectives
- To extract and identify the features that best attract a user to interact with the last screen of an ad, i.e. the advertiser’s target page.
- To implement a machine learning algorithm that determines how these features relate to the KPI parameters of the corresponding ad campaigns.
- To develop and apply deep-learning-based CV techniques for creative optimization in mobile advertising.
Creative — an advertisement (ad) that a user sees and interacts with when browsing a website or using an ad-powered mobile app.
Dynamic Creative Optimization (DCO)
DCO uses artificial intelligence (AI) and machine learning technology to create hyper-personalized experiences for viewers.
A dynamic creative is an ad in which components — headlines, descriptions, images, CTAs, etc. — are changed in real-time according to parameters predefined by the advertiser. Common parameters include the time of day, weather, location, etc. The aim is to create a bigger impact on the viewer.
It’s an automated process that leverages existing customer data and other connected data sources plus real-time testing and analytics to select the most effective combination of creative elements for each viewer. DCO is even more effective for returning visitors and existing customers, creating increasingly relevant ads that drive more engagement and boost conversions.
In conclusion, the benefits of DCO include personalization, automation, real-time response, improved performance, improved ROI, and reduced ad fatigue.
Data
A compressed file was downloaded from the S3 bucket and unzipped to generate the asset folders. Each folder stores the components that were used to create ad creatives: image files in .png or .jpg format and, in some folders, short video clips.
Feature extraction
Important features were extracted from the images in the dataset, including:
- Faces — along with the emotions they convey. Deepface was used for this task: a lightweight face recognition and facial attribute analysis (age, gender, emotion, and race) framework for Python. It is a hybrid face recognition framework that wraps other state-of-the-art models.
- Colors — the dominant colors were extracted into a dataframe using the extcolors Python package.
- Texts — Pytesseract was used: an OCR tool for Python that works as a wrapper for the Tesseract-OCR engine. It can read many image formats (.jpeg, .png, .gif, .bmp, .tiff, etc.) and recognize the text they contain.
- Logos — template matching was used. Template matching is a method for finding the location of a template image within a larger image. OpenCV provides the matchTemplate() function for this purpose: it slides the template image over the input image and compares the template against the patch of the input image beneath it. OpenCV supports six template matching operations.
- CTA buttons — template matching was used, with the image of the CTA button passed as the template alongside the full image containing it.
- Object detection — YOLOv7 was used: an object detection model known for its speed and accuracy that can analyze both videos and images. YOLOv7’s weights are trained from scratch on Microsoft’s COCO dataset, without any pre-trained weights.
- Engagement button — template matching was used, with the image of the engagement button passed as the template alongside the full image containing it.
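To make the color-extraction step concrete, here is a minimal NumPy sketch of what a dominant-color extractor like extcolors does under the hood: bucket each pixel's RGB values into coarse levels, count the buckets, and report the most frequent ones. The function name and bin counts are illustrative, not the package's actual API.

```python
import numpy as np

def dominant_colors(image, n_bins=4, top_k=3):
    """Return the top_k most frequent quantized RGB colors in an image.

    image: HxWx3 uint8 array. Each channel is bucketed into n_bins
    levels, then bucket frequencies are counted — a rough stand-in
    for what a package like extcolors computes.
    """
    pixels = image.reshape(-1, 3)
    # Quantize each channel (e.g. for 4 bins: 0, 64, 128, 192).
    step = 256 // n_bins
    quantized = (pixels // step) * step
    colors, counts = np.unique(quantized, axis=0, return_counts=True)
    order = np.argsort(counts)[::-1][:top_k]
    return [(tuple(int(c) for c in colors[i]), int(counts[i])) for i in order]

# Synthetic demo: a mostly-red 10x10 image with a blue stripe on top.
img = np.zeros((10, 10, 3), dtype=np.uint8)
img[..., 0] = 200            # red everywhere
img[:2, :, :] = (0, 0, 250)  # blue stripe (2 rows of 10 pixels)
print(dominant_colors(img))  # -> [((192, 0, 0), 80), ((0, 0, 192), 20)]
```

The quantization step matters: without it, near-identical shades (e.g. RGB 200,0,0 vs 201,0,0) would be counted as distinct colors and no clear dominant color would emerge.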
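The sliding-and-comparing idea behind matchTemplate() — used above for logos, CTA buttons, and engagement buttons — can be sketched in a few lines of NumPy. This brute-force version scores every position with the squared-difference measure (one of OpenCV's six modes, TM_SQDIFF) and is a teaching sketch, not a replacement for cv2.matchTemplate.

```python
import numpy as np

def match_template_sqdiff(image, template):
    """Slide `template` over `image` and return the (row, col) of the
    best match under the squared-difference score (cf. cv2 TM_SQDIFF).

    Both arguments are 2-D grayscale float arrays.
    """
    ih, iw = image.shape
    th, tw = template.shape
    scores = np.empty((ih - th + 1, iw - tw + 1))
    for r in range(scores.shape[0]):
        for c in range(scores.shape[1]):
            patch = image[r:r + th, c:c + tw]
            scores[r, c] = np.sum((patch - template) ** 2)
    # Lowest squared difference = best match.
    r, c = np.unravel_index(np.argmin(scores), scores.shape)
    return (int(r), int(c))

# Demo: hide a 3x3 bright patch at (5, 7) in a blank image, then find it.
image = np.zeros((20, 20))
template = np.full((3, 3), 9.0)
image[5:8, 7:10] = template
print(match_template_sqdiff(image, template))  # -> (5, 7)
```

In practice, the template (e.g. a logo crop) must appear at roughly the same scale and orientation in the creative for this to work, which is why template matching suits fixed UI elements like CTA buttons.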
A pipeline was created to extract the above features. It takes the path to the assets folder as input; the image files in each folder are then accessed for feature extraction, and the results are collected into dataframes and saved as CSV files.
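The pipeline's skeleton — walk the asset folders, run the extractors on each image, write the rows to CSV — can be sketched with the standard library alone. The folder layout, field names, and the placeholder `extract_features` below are assumptions for illustration; the real pipeline would call Deepface, extcolors, pytesseract, and the template matchers inside that function.

```python
import csv
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg"}

def extract_features(image_path):
    """Placeholder for the real extractors (faces, colors, texts, logos...).

    Here it only records the asset folder, file name, and file size;
    the real pipeline would open the image and run each extractor.
    """
    return {"asset": image_path.parent.name,
            "file": image_path.name,
            "bytes": image_path.stat().st_size}

def run_pipeline(assets_dir, out_csv):
    """Walk every asset folder under assets_dir, extract features from
    each image file, and save one row per image to a CSV file."""
    rows = [extract_features(p)
            for p in sorted(Path(assets_dir).rglob("*"))
            if p.suffix.lower() in IMAGE_EXTS]
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["asset", "file", "bytes"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Filtering by extension keeps the video clips out of the image extractors, matching the note above that some folders also contain short videos.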
The data in the CSV files was then used for modelling, where a CNN algorithm was implemented. As the number of epochs increased, the training loss decreased, indicating that the model was learning.
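The epochs-vs-loss trend can be demonstrated with a tiny gradient-descent loop. The data and model below are synthetic stand-ins, not the project's CNN or its features; the point is only that repeated parameter updates drive the loss down across epochs.

```python
import numpy as np

# Synthetic regression data: targets are a noisy linear function of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)   # model weights, initialized at zero
lr = 0.1          # learning rate
losses = []
for epoch in range(50):
    pred = X @ w
    loss = np.mean((pred - y) ** 2)       # mean squared error
    grad = 2 * X.T @ (pred - y) / len(y)  # gradient of MSE w.r.t. w
    w -= lr * grad                        # gradient-descent update
    losses.append(loss)

print(f"epoch 0 loss={losses[0]:.3f}, epoch 49 loss={losses[-1]:.3f}")
```

The final loss settles near the noise floor of the data; the same monotone-decreasing curve is what the CNN's training log showed.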
In conclusion, ads need to be impactful, and the way to achieve this is to find out which elements of an ad resonate with the viewer. Each viewer has different preferences; once these are determined, they can be used to fine-tune an ad in a way that increases the click-through rate (CTR).
In the future, I recommend extracting features from videos in addition to images.
The code for this project can be found here.