Detect anomalies in customers' behaviour based on their specific recurrent shopping patterns

2018-02-06 09:13:23

I am working on my thesis but I am stuck on the last task to solve.

STEP 1) I have a dataset composed of three variables:

fidelity_card_ID: fidelity card associated to the purchases

shopping_date: day when the purchases were made

cluster: expresses the pattern of this shopping visit

Examples of cluster descriptions are: shopping for clothes, shopping for housecleaning products, shopping for a meal, shopping for weekly groceries, etc.

STEP 2) Each fidelity_card_ID has a unique profile in terms of cluster composition.

For example, 100% of the shopping visits made by fidelity_card_ID == 1 are clustered as "shopping for clothes". On the other hand, for fidelity_card_ID == 2, 99% of the shopping visits were clustered as "shopping for housecleaning products" and 1% were clustered as "shopping for a meal".
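To make the idea of a per-card profile concrete, here is a minimal sketch that computes the share of each cluster per fidelity_card_ID. The toy records and the function name `cluster_profiles` are made up for illustration; only the three column names come from the dataset described above.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (fidelity_card_ID, shopping_date, cluster)
visits = [
    (1, "2018-01-02", "clothes"),
    (1, "2018-01-09", "clothes"),
    (2, "2018-01-03", "housecleaning"),
    (2, "2018-01-10", "housecleaning"),
    (2, "2018-01-17", "housecleaning"),
    (2, "2018-01-24", "meal"),
]


def cluster_profiles(records):
    """Return {card_id: {cluster: share of that card's visits}}."""
    counts = defaultdict(Counter)
    for card_id, _date, cluster in records:
        counts[card_id][cluster] += 1

    profiles = {}
    for card_id, counter in counts.items():
        total = sum(counter.values())
        profiles[card_id] = {c: n / total for c, n in counter.items()}
    return profiles


profiles = cluster_profiles(visits)
```

In this toy data, card 1 is 100% "clothes", while card 2 is split between "housecleaning" and "meal".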

Question

STEP 3) What is the correct approach to develop a model to classify/predict/detect, for each fidelity_card_ID, those shopping visits that

  • You are looking at an unsupervised learning problem, i.e. your transactions do not have "regular" or "irregular" activity labels. Regularity is customer-dependent, so you can try to derive customer-specific regularity features, e.g. the most frequent category for that customer (and whether or not a new activity deviates from it), given the day of the week, the location of the customer, etc. You can then label some of your data (semi-supervised, because labelling all of it may not be feasible) and fit a single classifier. There is no easy shortcut here, I am afraid.

    Depending on your dataset, you can carry out novelty & outlier detection.

    Or you can look at one-class supervised learning.

    I am not going into more detail; there are plenty of threads on this website discussing these two.
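As a rough illustration of a customer-specific regularity feature, the sketch below flags visits whose cluster accounts for only a small share of that card's own history. This is a simple frequency baseline, not a full novelty-detection or one-class method; the function name `flag_unusual_visits`, the `min_share` threshold of 0.05 and the toy data are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def flag_unusual_visits(records, min_share=0.05):
    """Flag visits whose cluster is rare within that card's own history.

    records: iterable of (fidelity_card_ID, shopping_date, cluster).
    min_share: hypothetical cut-off; a visit is flagged when its cluster
    makes up less than this fraction of the card's visits.
    """
    records = list(records)
    counts = defaultdict(Counter)
    for card_id, _date, cluster in records:
        counts[card_id][cluster] += 1

    flagged = []
    for card_id, date, cluster in records:
        total = sum(counts[card_id].values())
        share = counts[card_id][cluster] / total
        if share < min_share:
            flagged.append((card_id, date, cluster, share))
    return flagged


# Toy data: card 1 only buys clothes; card 2 has 99 housecleaning
# visits and a single meal visit.
records = [(1, f"day-{i}", "clothes") for i in range(10)]
records += [(2, f"day-{i}", "housecleaning") for i in range(99)]
records.append((2, "day-99", "meal"))

flagged = flag_unusual_visits(records)
```

With these toy records, only the single "meal" visit of card 2 falls below the threshold. The per-visit share could also serve as one input feature, alongside day of the week or location, for a classifier or an outlier detector.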

    2018-02-06 10:49:28