Market Basket Analysis of Grocery and Retail dataset by Association Rules mining

satishrath185/Market-Basket-Analysis

Market-Basket-Analysis of

  1. Grocery Dataset
  2. Online Retail

Business Value

Market Basket Analysis is the analysis of customers' past buying behaviour to find out which products are bought together, that is, to find the associations between products. If a retailer's management knows these associations, associated products can be placed together in the shop. Alternatively, when a customer buys a product, the salesperson can offer the associated product.

We find these associations using association rule learning, a rule-based machine learning approach that discovers relationships between variables in a dataset. It is widely applied in the retail industry, including e-commerce.

Problem Statement

Determine the associations between products in a basket by analysing customers' purchase patterns across multiple items.

Data

Grocery data

Each row represents a transaction and each attribute represents a product. A value of 0 means the item was not purchased in that transaction; a value of 1 means it was.

Online Retail

Each row of data represents a transaction for a particular item and the attributes correspond to the following:

InvoiceNo : Unique identifier for transaction

StockCode : Unique identifier for the stock item being purchased

Description : Description of item

Quantity : Number of units purchased

InvoiceDate : Date of purchase

UnitPrice : Cost of one unit of the item

CustomerID : Unique Identifier for customer

Country : Country of transaction

Approach

  • Importing Necessary Dependencies

  • Loading Data

  • Data Exploration and Visualization

  • Data Processing

    • Data Cleaning
    • Transforming data to one transaction per row
    • One Hot Encoding of purchases made
  • Generating Association Rules

  • Refining the rules
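The data processing steps above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the sample rows are made up, the helper names `to_baskets` and `one_hot` are hypothetical, and dropping rows with a non-positive quantity or an empty description is an assumed cleaning rule.

```python
from collections import defaultdict

# Made-up line items in the Online Retail layout: (InvoiceNo, Description, Quantity).
line_items = [
    ("536365", "PAPER PLATES", 6),
    ("536365", "PAPER NAPKINS", 2),
    ("536366", "PAPER PLATES", 1),
    ("536367", "PAPER NAPKINS", 3),
    ("536367", "PAPER CUPS", -1),  # non-positive quantity, dropped by the assumed cleaning rule
]


def to_baskets(rows):
    """Clean the line items and group them into one set of items per invoice."""
    baskets = defaultdict(set)
    for invoice, description, quantity in rows:
        # Assumed cleaning rule: skip rows without a description or with quantity <= 0.
        if quantity <= 0 or not description.strip():
            continue
        baskets[invoice].add(description.strip())
    return dict(baskets)


def one_hot(baskets):
    """Encode each basket as a 0/1 list over the sorted item vocabulary."""
    items = sorted({item for basket in baskets.values() for item in basket})
    encoded = {
        invoice: [int(item in basket) for item in items]
        for invoice, basket in baskets.items()
    }
    return items, encoded


baskets = to_baskets(line_items)
items, encoded = one_hot(baskets)

print(items)
for invoice, row in encoded.items():
    print(invoice, row)
```

Each output row then has the same shape as the Grocery data: one transaction per row, one 0/1 column per item.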

Evaluation

To establish association rules between items we use the Apriori algorithm, which takes a bottom-up approach: frequent itemsets (groups of items bought together) are extended one item at a time, and the resulting candidate groups are tested against the available dataset. The process continues until no further extensions are found. It uses the concepts of support, confidence and lift to establish the rules.

Only rules whose support and confidence both exceed predefined minimum thresholds are taken into account.
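The bottom-up search and the threshold filter described above can be sketched as follows. This is a simplified, unoptimised illustration in plain Python rather than the implementation used in the repository; the function names `apriori` and `generate_rules`, the toy transactions, and the threshold values are all assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    current = {}
    for item in {item for t in transactions for item in t}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Extend by one item: join frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Test the surviving candidates against the dataset.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    lift = confidence / frequent[consequent]
                    rules.append((antecedent, consequent, s, confidence, lift))
    return rules


# Toy transactions, made up for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

frequent = apriori(transactions, min_support=0.4)
found_rules = generate_rules(frequent, min_confidence=0.7)
for antecedent, consequent, s, confidence, lift in found_rules:
    print(set(antecedent), "->", set(consequent), round(s, 2), round(confidence, 2), round(lift, 2))
```

The search stops as soon as a level produces no frequent itemsets, which is the "no further extensions" condition. The lookups `frequent[antecedent]` and `frequent[consequent]` are safe because every subset of a frequent itemset is itself frequent.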

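Support, confidence and lift are defined as follows. The support of an itemset is the fraction of transactions that contain it. The confidence of a rule A → B is the fraction of transactions containing A that also contain B. The lift of A → B is its confidence divided by the support of B.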
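In symbols, these are the standard definitions for a rule A → B. The notation is an assumption introduced here, not taken from the repository: N is the total number of transactions and t ranges over transactions.

```latex
\begin{align*}
\operatorname{support}(A \cup B) &= \frac{\lvert \{\, t : A \cup B \subseteq t \,\} \rvert}{N} \\[4pt]
\operatorname{confidence}(A \rightarrow B) &= \frac{\operatorname{support}(A \cup B)}{\operatorname{support}(A)} \\[4pt]
\operatorname{lift}(A \rightarrow B) &= \frac{\operatorname{confidence}(A \rightarrow B)}{\operatorname{support}(B)}
  = \frac{\operatorname{support}(A \cup B)}{\operatorname{support}(A)\,\operatorname{support}(B)}
\end{align*}
```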

Data Exploration and Visualization

Grocery Dataset

Top Sold Items

Association Rules

Online Retail Dataset

Processed Data

Association Rules

Selected Rules

Conclusion

Grocery Dataset

The rule states that people who bought other vegetables are also likely to buy root vegetables. The confidence of the rule is 46%, meaning that root vegetables appear in 46% of the transactions that contain other vegetables. The lift is 2.24, meaning root vegetables are 2.24 times as likely to appear in a transaction containing other vegetables as they would be if the two items were not associated. A lift of 1 indicates no association between the two items.

Online Retail Dataset

In this dataset, the rule from SET/20 RED RETROSPOT PAPER NAPKINS to SET/6 RED SPOTTY PAPER PLATES has a confidence of 80% and a lift of 6.03. In other words, in 80% of the transactions where the napkins were bought, the paper plates were bought as well.
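To make the lift values in both conclusions more concrete, the relation lift = confidence / support(consequent) can be inverted to recover the baseline share of transactions that contain the consequent. The sketch below only rearranges the figures quoted above; the helper name `implied_consequent_support` is hypothetical, and because the quoted confidence and lift are rounded, the results are approximate.

```python
def implied_consequent_support(confidence, lift):
    """Baseline support of the consequent implied by lift = confidence / support(consequent)."""
    return confidence / lift


# Figures quoted in the conclusions above (rounded, so the results are approximate).
grocery_baseline = implied_consequent_support(confidence=0.46, lift=2.24)
retail_baseline = implied_consequent_support(confidence=0.80, lift=6.03)

print(f"root vegetables overall: about {grocery_baseline:.1%} of transactions")
print(f"paper plates overall: about {retail_baseline:.1%} of transactions")
```

Read this way, the grocery rule raises the chance of seeing root vegetables from roughly one transaction in five to 46%, and the retail rule raises the chance of seeing the paper plates from roughly 13% to 80%.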
