My decision to embark on this project was informed by the vital role text plays in business organisations. It serves as the medium for storing contractual obligations, invoices, reports, logs and many other mission-critical artifacts.
I chose to analyse messages sent between users of WhatsApp, an extremely popular social media platform. The objective was to gain insight into the growth and churn rate of a group.
- Sourcing unstructured data in the form of social media chat messages
- Cleaning the data
- Transforming it into a structured Pandas DataFrame
- Descriptive analysis to yield insights on growth rate and churn
- Volume of messages per hour
- Throughput of messages per day of week, per month
- Top 10 active members
- Geographical spread of members
- Number of messages by type
- Number of members joined, YOY
- Members removed, YOY
- Members churned, YOY
- Jupyter Notebook
- NumPy
- Pandas
- Matplotlib
- Regex
- MS Excel
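
The cleaning and transformation steps listed above can be sketched with Python's `re` module and Pandas. This is a minimal illustration, not the notebook's actual code: the line format `31/12/2021, 22:15 - Alice: text` is an assumption (WhatsApp exports differ by device and locale), and the names `LINE_RE`, `parse_chat` and `messages_per_hour` are hypothetical.

```python
import re

import pandas as pd

# Assumed export line format (it varies by device and locale):
#   "31/12/2021, 22:15 - Alice: Happy new year"
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4}), (?P<time>\d{1,2}:\d{2}) - (?P<body>.*)$"
)


def parse_chat(lines):
    """Turn raw export lines into a DataFrame of timestamp, sender, message."""
    rows = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match is None:
            # No timestamp prefix: treat as a continuation of the previous message.
            if rows:
                rows[-1]["message"] += "\n" + line
            continue
        sender, sep, text = match["body"].partition(": ")
        if not sep:
            # No "sender: " part, so assume a system event (join, leave, removal).
            sender, text = None, match["body"]
        rows.append(
            {
                "timestamp": f"{match['date']} {match['time']}",
                "sender": sender,
                "message": text,
            }
        )
    df = pd.DataFrame(rows, columns=["timestamp", "sender", "message"])
    df["timestamp"] = pd.to_datetime(df["timestamp"], format="%d/%m/%Y %H:%M")
    return df


def messages_per_hour(df):
    """Count user messages (system events excluded) for each hour of the day."""
    user_msgs = df[df["sender"].notna()]
    return user_msgs.groupby(user_msgs["timestamp"].dt.hour).size()
```

With a DataFrame in this shape, most of the listed metrics reduce to one-line aggregations. For example, `df["sender"].value_counts().head(10)` would give the ten most active members. A system event whose text happens to contain `": "` would be misread as a user message by this sketch, so real data may need a stricter rule.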
As a look at the notebook and dashboard will show, I took great care to keep users' personal data (such as phone numbers) from being displayed.
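
One possible way to keep phone numbers out of displayed output is sketched below. It is an illustration only, not necessarily the approach taken in the notebook. The function names, the `Member 001` label scheme and the rough `PHONE_RE` pattern are all assumptions, and the phone number in the usage note is made up.

```python
import re

import pandas as pd

# Rough pattern for phone-like digit runs (an assumption; it may also
# match other long numbers, so tune it for real data).
PHONE_RE = re.compile(r"\+?\d[\d\s\-()]{6,}\d")


def pseudonymise_senders(df, column="sender"):
    """Return a copy of df with each distinct sender replaced by a stable label."""
    # factorize assigns 0, 1, 2, ... in order of first appearance; missing -> -1
    codes, _ = pd.factorize(df[column])
    out = df.copy()
    out[column] = [f"Member {c + 1:03d}" if c >= 0 else None for c in codes]
    return out


def mask_phone_numbers(text):
    """Replace phone-like substrings inside a message with a placeholder."""
    return PHONE_RE.sub("[phone]", text)
```

Applied to a chat DataFrame, `pseudonymise_senders(df)` keeps per-member counts intact while hiding who each member is, and `df["message"].map(mask_phone_numbers)` would scrub numbers quoted inside message bodies, turning `"Call me on +254 712 345678 tomorrow"` into `"Call me on [phone] tomorrow"`.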