In this article series, we focus on Trendify's approach to segmentation and compare it with traditional segmentation.

This is the third article in the series. It covers:

III. What is the Trendify Segmentation Approach?

Comparing traditional and Trendify segmentation for:

  • Data Preparation
  • Feature Selection & Data Conversion
  • Model Selection
  • Segmentation Profiling
  • Trendify Pipeline

III. What is the Trendify Segmentation Approach?

In this article, we describe how Trendify product / SKU segmentation works in general and which features set it apart from traditional segmentation. The comparison of traditional and Trendify segmentation logic is summarized in Image 1.

* You can import all of your data into the Trendify product in several ways.

  • You can upload a file from your local computer, load a table from your database, or send the data via an API.

* After you upload your data, the data preparation process starts. During preparation:

  • Missing values are filled in with one of several methods: the KNN Imputer or Trendify's own imputation method. Filling the gaps in the most appropriate way keeps the observations that product segmentation needs.
  • To separate outliers from the rest of the data, the z-score method is tried with different standard deviation thresholds until the most suitable outlier group is found. Trendify also segments the outliers among themselves.
  • Columns that contain date and time values are transformed appropriately, so they can be fed to the model without losing their time information.
  • Various scaling techniques are applied so that columns with very different ranges are brought to the same range of values.
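The preparation steps above can be sketched with pandas and scikit-learn. This is only an illustrative sketch, not Trendify's implementation: the `prepare` function, the column names, the choice of two neighbours, and the single z-score threshold are assumptions made for the example, and `StandardScaler` stands in for whichever scaling technique is actually chosen.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler


def prepare(df, date_col, z_threshold=3.0):
    """Hypothetical helper: expand dates, impute, split off outliers, scale."""
    df = df.copy()

    # Turn the date column into numeric parts so the model can use it.
    dates = pd.to_datetime(df.pop(date_col))
    df["month"] = dates.dt.month
    df["day_of_week"] = dates.dt.dayofweek

    # Fill each missing value from the 2 most similar rows (assumed k=2).
    imputed = KNNImputer(n_neighbors=2).fit_transform(df)
    df = pd.DataFrame(imputed, columns=df.columns, index=df.index)

    # Flag rows where any column is more than z_threshold standard
    # deviations away from that column's mean.
    z = (df - df.mean()) / df.std(ddof=0)
    is_outlier = (z.abs() > z_threshold).any(axis=1)

    # Scale the remaining rows to zero mean and unit variance.
    inliers = df[~is_outlier]
    scaled = pd.DataFrame(
        StandardScaler().fit_transform(inliers),
        columns=inliers.columns,
        index=inliers.index,
    )
    return scaled, df[is_outlier]


# Made-up product data: one missing price and one extreme product.
raw = pd.DataFrame({
    "price": [10.0, 12.0, 11.0, np.nan, 9.0, 10.5, 11.5, 500.0],
    "units_sold": [100, 110, 105, 95, 90, 102, 108, 400],
    "listed_on": pd.date_range("2021-01-04", periods=8).astype(str),
})

# A lower threshold is used here because the sample is tiny.
scaled, outliers = prepare(raw, "listed_on", z_threshold=2.0)
print(scaled.round(2))
print(outliers)
```

Returning the outliers as a separate table, rather than dropping them, leaves room for segmenting them on their own afterwards. Trying several values of `z_threshold` and comparing the resulting outlier groups would mirror the "different standard deviation thresholds" step.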

* Feature selection and conversion are handled automatically.

  • Columns that are similar to each other and do not benefit the model are removed. To find similar features, the correlation between columns is computed, and columns whose correlation exceeds a certain threshold are removed. In addition, other feature selection methods are used to drop features that do not benefit the model or improve it only slightly. This both increases model accuracy and saves time.
  • Near-constant columns with very few distinct values are removed automatically. Such columns are best excluded, as they do not contribute to the model and only inflate the data.
  • Categorical and numeric columns are detected automatically. Column types matter most here: if the number of distinct values in a numeric column is below a certain threshold, Trendify treats the column as categorical. Feeding each column to the model in its most appropriate type brings you one step closer to the best product segmentation.
  • Dimensionality reduction methods condense and simplify the variables. Projecting the data into fewer dimensions can both represent it better and save time.
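A minimal sketch of these selection steps is shown below. It is not Trendify's code: the `select_features` helper, its thresholds (`corr_threshold`, `min_unique`, `cat_max_unique`), and the synthetic columns are hypothetical, and PCA is used here only as one common example of a dimensionality reduction method.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA


def select_features(df, corr_threshold=0.9, min_unique=2, cat_max_unique=5):
    """Hypothetical helper: drop constant and redundant columns, split types."""
    # Drop columns with too few distinct values to be informative.
    df = df[[c for c in df.columns if df[c].nunique() >= min_unique]]

    # Numeric columns with only a few distinct values are treated as categorical.
    categorical = [c for c in df.columns if df[c].nunique() <= cat_max_unique]
    numeric = [c for c in df.columns if c not in categorical]

    # For each highly correlated pair, drop the later column.
    corr = df[numeric].corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    redundant = [c for c in upper.columns if (upper[c] > corr_threshold).any()]
    numeric = [c for c in numeric if c not in redundant]
    return numeric, categorical, redundant


# Synthetic product table, made up for the example.
rng = np.random.default_rng(0)
n = 200
price = rng.normal(50, 10, n)
data = pd.DataFrame({
    "price": price,
    "price_with_tax": price * 1.18,       # duplicates the information in price
    "weight": rng.normal(2, 0.5, n),
    "rating": rng.normal(4, 0.3, n),
    "currency_code": np.ones(n),          # constant column
    "size_class": rng.integers(0, 3, n),  # few distinct values
})

numeric, categorical, redundant = select_features(data)
print(numeric, categorical, redundant)

# Standardize the kept numeric columns and project them onto 2 components.
standardized = (data[numeric] - data[numeric].mean()) / data[numeric].std(ddof=0)
reduced = PCA(n_components=2).fit_transform(standardized)
print(reduced.shape)
```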

* The selected models are tested iteratively, and at each stage the segmentation is taken from the model that scores highest.

  • If the product groups are not sufficiently differentiated, the model is run again, and segmentation continues until the groups reach the optimal level. Model success is measured with clustering score evaluation metrics. Because the best model is chosen at each stage, the results are more efficient and effective than those of classical clustering algorithms.
  • To segment products in the most appropriate way, hyperparameter optimization is performed on the model. Trendify algorithms run this process in the background, fit the model with the hyperparameters that give the best results, and create the most accurate product groups.
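As a rough illustration of this idea, the sketch below tries a few scikit-learn clustering models over a range of segment counts and keeps the combination with the best silhouette score. The candidate models, the silhouette metric, the range of `k`, and the `best_segmentation` helper are all assumptions chosen for the example; the article does not say which models or metrics Trendify actually uses.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture


def best_segmentation(X, k_values=range(2, 7)):
    """Hypothetical helper: return (score, model_name, k, labels) of the best run."""
    # Candidate models; the number of segments is the tuned hyperparameter.
    candidates = {
        "kmeans": lambda k: KMeans(n_clusters=k, n_init=10, random_state=0),
        "agglomerative": lambda k: AgglomerativeClustering(n_clusters=k),
        "gmm": lambda k: GaussianMixture(n_components=k, random_state=0),
    }
    best = None
    for name, make_model in candidates.items():
        for k in k_values:
            labels = make_model(k).fit_predict(X)
            if len(set(labels)) < 2:
                continue  # the silhouette score needs at least two groups
            score = silhouette_score(X, labels)
            if best is None or score > best[0]:
                best = (score, name, k, labels)
    return best


# Synthetic data with four well-separated groups, made up for the example.
X, _ = make_blobs(
    n_samples=200,
    centers=[(-10, -10), (-10, 10), (10, -10), (10, 10)],
    cluster_std=1.0,
    random_state=0,
)

score, name, k, labels = best_segmentation(X)
print(f"best model: {name}, segments: {k}, silhouette: {score:.2f}")
```

A minimum acceptable score could be added on top of this loop, so that a poorly differentiated result triggers another round with different settings.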

* Profiles for the product groups are also created automatically.

  • Profiling and defining segments seems to be a relatively underappreciated topic in the literature. The Trendify product aims to produce explainable results so that model output can be integrated directly into business processes.
  • The features that are meaningful and that best describe the products are highlighted within each segment.
  • Because the selected features point directly to actions, the product offers the business an easy workflow.

For example, in a model built on 10 variables that yields 5 segments, the segments will not differ from each other on every variable. The Trendify product makes it easy to see which variables define each segment, so segment-specific decisions can be made and implemented more reliably and accurately.

The following example is a model result in which products are segmented according to 13 variables. With the Trendify product, all segments are created automatically, and their definitions and prominent features are determined automatically. Variables shown in green are higher than the overall mean; variables shown in red are lower than the overall mean.
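Independently of that model result, the higher/lower-than-mean labeling can be sketched in a few lines of pandas. The `profile_segments` helper, the `min_gap` cut-off (in overall standard deviations), and the sample data are hypothetical; the article only states that each variable is compared with the overall mean.

```python
import pandas as pd


def profile_segments(df, segment_col, min_gap=0.5):
    """Hypothetical helper: label each segment mean against the overall mean."""
    features = df.drop(columns=segment_col)
    overall_mean = features.mean()
    overall_std = features.std(ddof=0)

    # Mean of every variable inside each segment.
    segment_means = features.groupby(df[segment_col]).mean()

    # Distance from the overall mean, measured in overall standard deviations.
    gap = (segment_means - overall_mean) / overall_std

    # "higher" plays the role of green, "lower" the role of red.
    labels = pd.DataFrame("similar", index=gap.index, columns=gap.columns)
    labels = labels.mask(gap > min_gap, "higher").mask(gap < -min_gap, "lower")
    return labels


# Two made-up segments that differ in price but not in return rate.
products = pd.DataFrame({
    "segment": [0, 0, 0, 1, 1, 1],
    "price": [10, 12, 11, 50, 52, 51],
    "return_rate": [0.05, 0.04, 0.06, 0.05, 0.06, 0.04],
})

profile = profile_segments(products, "segment")
print(profile)
```

Variables labeled "similar" in every segment are the ones on which the segments do not differentiate, which is the situation described in the 10-variable example above.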

Thanks for reading

Date : 11.01.2021

Author : Mustafa Gencer (Data Scientist , TRENDIFY)
