![](https://crypto4nerd.com/wp-content/uploads/2023/07/1KYuUTNhslO7fQwGQqqYszg-1024x1045.png)
In real-world scenarios, Machine Learning (ML) models are rarely standalone components. They are typically part of a larger system that involves preprocessing, business logic, and post-processing. This article explores two common approaches to architecting such systems in production.
You can think of the final deployable system as follows:

𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗥𝗲𝘀𝘂𝗹𝘁𝘀 = Preprocessing + Business Logic + ML Model + Post-Processing + Business Logic
In many cases, it is likely to extend to:
𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗥𝗲𝘀𝘂𝗹𝘁𝘀 = Preprocessing + Business Logic + ML Model + Post-Processing + Business Logic + Additional ML Model + …
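The formula above can be sketched as a simple function composition: each stage is a plain function, and the deployable system is the chain of all of them. All names and transformations below are illustrative stand-ins, not a real framework or model.

```python
from functools import reduce

def preprocess(features):
    # Illustrative: scale raw feature values down by 100
    return [f / 100.0 for f in features]

def business_logic_pre(features):
    # Illustrative business rule: clamp values into [0, 1] before the model
    return [min(max(f, 0.0), 1.0) for f in features]

def ml_model(features):
    # Stand-in for a real model call: a simple mean of the features
    return sum(features) / len(features)

def postprocess(score):
    # Illustrative post-processing: round the raw score
    return round(score, 3)

def business_logic_post(score):
    # Illustrative business rule applied after inference
    return {"score": score, "approved": score > 0.5}

# The "Inference Results" pipeline: each stage feeds the next
PIPELINE = [preprocess, business_logic_pre, ml_model, postprocess, business_logic_post]

def infer(raw_features):
    return reduce(lambda value, stage: stage(value), PIPELINE, raw_features)

result = infer([60, 80, 120])
```

Extending the system with an additional ML model, as in the second formula, amounts to appending more stages to `PIPELINE`.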
In real-world production environments, you will find two common ways this system is architected:
A: Single Service Deployment
The most straightforward approach to package the deployable system is to combine all additional processing logic with the ML model and deploy it as a single service.
Here's how it would work for a request-response type of deployment:
- The backend service calls the ML service exposed via gRPC.
- The ML service retrieves features from a feature store. Preprocessing and additional business logic are applied to these features.
- The processed features are then fed into the ML model.
- The inference results are subjected to additional post-processing and business logic.
- The final results are returned to the backend service and can be used in the product application.
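The steps above can be sketched as a single service object that owns feature retrieval, pre/post-processing, and the model call. The feature store, model, and transport are in-memory stand-ins with made-up names; in production the `predict` method would be wired into a gRPC servicer.

```python
class FakeFeatureStore:
    """In-memory stand-in for a real feature store."""
    def __init__(self, table):
        self._table = table

    def get_features(self, entity_id):
        return self._table[entity_id]

class SingleMLService:
    """Everything lives in one deployable unit: the single-service approach."""
    def __init__(self, feature_store, model):
        self.feature_store = feature_store
        self.model = model

    def predict(self, entity_id):
        # Step: retrieve features from the feature store
        raw = self.feature_store.get_features(entity_id)
        # Step: preprocessing + business logic (illustrative scaling + clamping)
        feats = [min(max(f / 100.0, 0.0), 1.0) for f in raw]
        # Step: model inference
        score = self.model(feats)
        # Step: post-processing + business logic on the inference result
        return {"entity_id": entity_id, "score": round(score, 3)}

store = FakeFeatureStore({"user-42": [60, 80, 120]})
service = SingleMLService(store, model=lambda feats: sum(feats) / len(feats))
response = service.predict("user-42")
```

The appeal is simplicity: one codebase, one deployment. The cost is that any change to the business rules requires redeploying the whole ML service.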
B: Business Logic Decoupled from the ML Model
Another approach involves introducing a separate service that sits between the backend product service and the service exposing the ML model. This architecture is often referred to as "The Blender" because it facilitates blending: combining multiple ML model inference results to produce a more powerful statistical prediction.
Here's the breakdown of the diagram:
1: The backend service calls the service containing the business logic rules ("The Blender"), which is exposed via gRPC.
2: The service containing business logic rules calls the ML service exposed via gRPC.
3–5: Same as steps 2–4 in example A.
6: The Blender receives the results and applies business rules to the inference results.
7: The final results are returned to the backend service and can be utilized within the product application.
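The decoupled flow can be sketched as two separate service objects: the ML service owns feature retrieval and inference, while the Blender owns the business rules and delegates to it. Both are in-process stand-ins with illustrative names; in production each would sit behind its own gRPC endpoint.

```python
class MLService:
    """Owns feature retrieval, preprocessing, and the model call (steps 3-5)."""
    def __init__(self, feature_table):
        self.feature_table = feature_table

    def predict(self, entity_id):
        # Illustrative preprocessing + model: scale features, then average
        feats = [f / 100.0 for f in self.feature_table[entity_id]]
        return sum(feats) / len(feats)

class BlenderService:
    """Owns business rules only; inference is delegated to the ML service."""
    def __init__(self, ml_service, threshold):
        self.ml_service = ml_service
        self.threshold = threshold

    def handle(self, entity_id):
        score = self.ml_service.predict(entity_id)   # steps 2-5
        approved = score >= self.threshold           # step 6: business rules
        # step 7: final result returned to the backend service
        return {"entity_id": entity_id, "score": round(score, 3), "approved": approved}

ml = MLService({"user-42": [60, 80, 100]})
blender = BlenderService(ml, threshold=0.5)
response = blender.handle("user-42")
```

With this split, swapping in a different model version (or adding a second `MLService` to blend against) only requires changing what the Blender calls, which is exactly why it suits teams testing many model/business-logic combinations in production.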
This architecture becomes valuable when multiple individuals work on the pipeline and various combinations of model versions and business logic need to be tested in production. In a future long-form newsletter issue, I'll delve deeper into this topic, so stay tuned!