KelvinJC/text-analytics

Text analytics

A case study on text processing and data transformation.

I took on this project because of the vital role text plays in business organisations: it is the medium for contracts, invoices, reports, logs and many other mission-critical artifacts.

I chose to analyse the messages exchanged between members of a group on WhatsApp, an extremely popular messaging platform. The objective was to gain insight into the group's growth and churn rate.

The project involved the following data transformation steps:

  • Sourcing unstructured data in the form of chat messages
  • Cleaning the data
  • Transforming it into a structured format, a pandas DataFrame
  • Running a descriptive analysis to yield insights on growth rate and churn
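
The cleaning and structuring steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes an export line layout of `DD/MM/YYYY, HH:MM - Sender: message`, which differs across devices and locales, and the names `LINE_RE` and `parse_chat` are hypothetical.

```python
import re
from datetime import datetime

# Assumed export line layout (it differs by device and locale):
#   "31/12/2021, 21:15 - Sender: message text"
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}), (?P<time>\d{2}:\d{2}) - (?P<body>.*)$"
)

def parse_chat(lines):
    """Turn raw export lines into a list of dict records."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match is None:
            # No timestamp: treat the line as a continuation of the previous message.
            if records:
                records[-1]["message"] += "\n" + line
            continue
        timestamp = datetime.strptime(
            f"{match['date']} {match['time']}", "%d/%m/%Y %H:%M"
        )
        sender, sep, text = match["body"].partition(": ")
        if not sep:
            # Lines without "Sender: " are assumed to be system events (joins, removals).
            sender, text = None, match["body"]
        records.append({"timestamp": timestamp, "sender": sender, "message": text})
    return records

# Hypothetical sample lines, not taken from the real chat.
sample = [
    "31/12/2021, 21:15 - Ada: Happy new year",
    "in advance!",
    "31/12/2021, 21:16 - Bola joined using this group's invite link",
]
records = parse_chat(sample)
```

A list of dicts like `records` can be passed straight to `pandas.DataFrame(records)` to obtain the structured table used for the analysis.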

Some of the questions I investigated include:

User activity

  • Volume of messages per hour
  • Volume of messages per day of the week and per month
  • Top 10 active members
  • Geographical spread of members
  • Number of messages by type
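
As an illustration of how the activity questions above might be answered once the messages are parsed, here is a small sketch using only the standard library. The `messages` list is made-up sample data, and the notebook itself may well use pandas grouping instead.

```python
from collections import Counter
from datetime import datetime

# Hypothetical parsed records: (timestamp, sender) pairs.
messages = [
    (datetime(2021, 12, 31, 21, 15), "Ada"),
    (datetime(2021, 12, 31, 21, 40), "Bola"),
    (datetime(2022, 1, 1, 9, 5), "Ada"),
    (datetime(2022, 1, 3, 9, 30), "Ada"),
]

# Volume of messages per hour of the day (0-23).
per_hour = Counter(ts.hour for ts, _ in messages)

# Volume per day of the week; weekday() returns 0 for Monday through 6 for Sunday.
per_weekday = Counter(ts.weekday() for ts, _ in messages)

# Volume per calendar month, keyed as "YYYY-MM".
per_month = Counter(ts.strftime("%Y-%m") for ts, _ in messages)

# Top 10 most active members by message count.
top_members = Counter(sender for _, sender in messages).most_common(10)
```

With a pandas DataFrame, the same counts could be obtained with `value_counts()` on the sender column or on components of the timestamp column such as `.dt.hour`.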

Growth rate

  • Members joined, year over year (YoY)
  • Members removed, YoY
  • Members churned, YoY
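
One possible way to derive these yearly figures is to classify the group's system messages and count them per year. The sketch below makes several assumptions: the English event wording in `EVENT_PATTERNS` (real exports vary by locale and app version), that "churned" means members who left on their own, and the sample `events` data. The helper names `classify` and `yoy_change` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed English wording of system events; real exports vary.
EVENT_PATTERNS = {
    "joined": re.compile(r" joined using this group's invite link$| added "),
    "removed": re.compile(r" removed "),
    "left": re.compile(r" left$"),  # treated here as churn
}

def classify(text):
    """Return the event type for a system message, or None if it is not one."""
    for event, pattern in EVENT_PATTERNS.items():
        if pattern.search(text):
            return event
    return None

# Hypothetical system messages: (timestamp, text) pairs.
events = [
    (datetime(2020, 3, 1, 10, 0), "Ada added Bola"),
    (datetime(2020, 5, 2, 11, 0), "Chidi joined using this group's invite link"),
    (datetime(2021, 1, 9, 8, 0), "Bola left"),
    (datetime(2021, 2, 4, 8, 0), "Ada removed Chidi"),
    (datetime(2021, 6, 7, 8, 0), "Dayo joined using this group's invite link"),
]

# Count events per (year, event type).
yearly = Counter()
for ts, text in events:
    event = classify(text)
    if event:
        yearly[(ts.year, event)] += 1

def yoy_change(event, year):
    """Fractional change versus the previous year; None if there is no baseline."""
    prev = yearly[(year - 1, event)]
    if prev == 0:
        return None
    return (yearly[(year, event)] - prev) / prev
```

For the sample data, `yoy_change("joined", 2021)` compares one join in 2021 against two in 2020.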

Dashboard

Tools and Libraries:

  • Jupyter Notebook
  • NumPy
  • pandas
  • Matplotlib
  • Regex
  • MS Excel

NB

As a cursory look at the notebook and dashboard will show, I took great care to keep users' personal data (such as phone numbers) from being displayed.
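
One way such masking could be done is with a regular expression that replaces most digits of anything resembling a phone number. This is only a sketch of the idea, not necessarily the approach used in the notebook. The pattern `PHONE_RE` is deliberately loose and is an assumption: it may also match other long digit runs, so it would need tuning against real data.

```python
import re

# Loose pattern: optional "+", then at least 8 digits that may be
# separated by spaces or hyphens. It can over-match other digit runs.
PHONE_RE = re.compile(r"\+?\d[\d\s-]{6,}\d")

def mask_phone_numbers(text, keep_last=2):
    """Replace phone-like numbers with asterisks, keeping the last few digits."""
    def _mask(match):
        digits = re.sub(r"\D", "", match.group())
        visible = digits[len(digits) - keep_last:]
        return "*" * (len(digits) - len(visible)) + visible
    return PHONE_RE.sub(_mask, text)

# Hypothetical example line, not taken from the real chat.
masked = mask_phone_numbers("+234 801 234 5678 joined")
```

Keeping the last two digits makes masked members distinguishable from one another in charts without exposing the full number; setting `keep_last=0` hides the number entirely.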
