CodeForces problemset data visualization and analysis

Problem Statement

The goal of this project is to gather problemset information from the CodeForces website. The scraped data (1200 records) is then used to explore the following statistics and correlations in a Tableau dashboard:

Category-wise average difficulty of the problems.
Total solve count per problem category.
Total solves across all contests.
Category-wise problem count.
Relation between problem categories and tags.
Tag-wise problem details.

To see the visualizations and findings, visit the public dashboards here [1-4, 5, 6].
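The category-wise metrics listed above can also be reproduced outside Tableau. The following is a minimal sketch, not the project's actual code: the field names `category`, `difficulty`, and `solved` are assumptions about the scraped columns, and the rows are illustrative values rather than real scraped data. It uses only the standard library so it runs without the project's environment; the same grouping could be done in pandas.

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows only; these are NOT taken from the scraped dataset.
# The field names are assumptions about what the scraper collects.
problems = [
    {"category": "A", "difficulty": 800, "solved": 50000},
    {"category": "A", "difficulty": 1000, "solved": 30000},
    {"category": "B", "difficulty": 1200, "solved": 20000},
    {"category": "B", "difficulty": 1400, "solved": 10000},
    {"category": "C", "difficulty": 1700, "solved": 4000},
]


def summarize_by_category(rows):
    """Group rows by category and compute the per-category metrics."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["category"]].append(row)

    summary = {}
    for category, items in grouped.items():
        summary[category] = {
            "avg_difficulty": mean(r["difficulty"] for r in items),
            "total_solved": sum(r["solved"] for r in items),
            "problem_count": len(items),
        }
    return summary


summary = summarize_by_category(problems)
for category in sorted(summary):
    print(category, summary[category])
```

Each entry of `summary` corresponds to one bar in the category-wise charts: average difficulty, total solves, and number of problems.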

Findings From Problem Difficulties:

Findings From Problem Tags:

Findings From Problem Categories and Tags:

Findings and Observations

In most cases, the difficulty of a category increases alphabetically. For instance, B is more difficult than A, and C is more difficult than both A and B. Categories with a numeric suffix, such as A1, B1, G2, and G3, do not follow this pattern.
As problem difficulty increases, the total solve count mostly decreases. The difficulty of each category can be read from the first observation above: categories A, B, and C are less difficult than G3, I, J, etc. As a result, A, B, C, and D have the highest solve counts.
Solve counts in contests depend mostly on problem difficulty rather than on the total number of contests held. Problems with lower difficulty get more solves. For example, A and B have the same contest count, but because A is less difficult than B, contestants solve A more often than the slightly harder B.
The category-wise problem count matches the number of contests held. Every contest has problems A and B, so both have the same count. C and D also occur in almost every contest, but the problemset does not list those problems in any serial order, so the same most probably holds for C and D as well. Categories such as G3, O, M, and N occur rarely.
The dashboard shows a list of tags and problem categories. From it, one can get a rough idea of which kinds of problems belong to which topics/tags. However, it is hard to draw a conclusion from the data because the results appear arbitrary.
One can get a clearer idea about an individual problem: its difficulty and its category. Difficulty in turn depends on the category (and vice versa); category A problems have lower difficulty than B, C, D, etc. Here too, no conclusion can be drawn about the tags, because problems of a single category carry multiple tags.
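The first observation (difficulty rising alphabetically, except for categories with a numeric suffix) can be checked programmatically. The sketch below is one possible way to do it, under the assumption that a category is a single uppercase letter optionally followed by digits; `split_category` and `follows_alphabetical_trend` are hypothetical helper names, and the averages in the example are illustrative, not measured values.

```python
import re

# Assumed category shape: one uppercase letter plus an optional number (e.g. "A", "G3").
CATEGORY_RE = re.compile(r"^([A-Z])(\d*)$")


def split_category(category):
    """Split a category such as "G3" into ("G", 3); "A" becomes ("A", None)."""
    match = CATEGORY_RE.match(category)
    if match is None:
        raise ValueError(f"unexpected category format: {category!r}")
    letter, digits = match.groups()
    return letter, (int(digits) if digits else None)


def follows_alphabetical_trend(avg_difficulty):
    """Return True if average difficulty strictly increases with the letter.

    Categories with a numeric suffix are skipped, because the observation
    says they do not follow the alphabetical pattern.
    """
    plain = sorted(c for c in avg_difficulty if split_category(c)[1] is None)
    values = [avg_difficulty[c] for c in plain]
    return all(low < high for low, high in zip(values, values[1:]))


# Illustrative averages only; not taken from the dashboard.
example = {"A": 900, "A1": 1500, "B": 1300, "C": 1700, "G3": 2400}
print(split_category("G3"))
print(follows_alphabetical_trend(example))
```

Feeding the function the real category averages from the cleaned dataset would show where, if anywhere, the alphabetical trend breaks.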

Build From Sources and Run the Code

Clone the repository: `git clone https://github.com/MdTanvirHossainTusher/CF_visualizer.git`
Open the repository in your preferred editor and change into the `CF_visualizer` directory.
Activate the virtual environment on Windows (PowerShell) by running these commands one after the other: `Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass` and then `.\venv\Scripts\activate`. If the `venv` directory does not exist yet, create it first with `python -m venv venv`.
Install Selenium, webdriver_manager, and pandas into the virtual environment: `pip install selenium webdriver_manager` and then `pip install pandas`.
Run the scraper: `python scraper.py`
To deactivate the virtual environment, run `deactivate`.

To run `notebooks/CF_VISUALIZER.ipynb` -

Open the file in Google Colab or Jupyter Notebook and run all the cells sequentially.

N.B.: After scraper.py runs successfully, you will get the data/cf_problem_details.csv file. After running notebooks/CF_VISUALIZER.ipynb, you will get the cleaned dataset as data/clean_cf_problem_details_dataset.csv.
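A quick way to confirm that both steps produced their output is to check the two files named above. The following sketch is an optional helper, not part of the repository: `count_data_rows` and `report_outputs` are hypothetical names, and it assumes each CSV starts with a header row.

```python
import csv
from pathlib import Path

# Output paths as named in this README, relative to the repository root.
EXPECTED_OUTPUTS = [
    "data/cf_problem_details.csv",
    "data/clean_cf_problem_details_dataset.csv",
]


def count_data_rows(path):
    """Return the number of rows after the header, or None if the file is missing."""
    path = Path(path)
    if not path.is_file():
        return None
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row (assumed to be present)
        return sum(1 for _ in reader)


def report_outputs(repo_root="."):
    """Map each expected output file to its data-row count (None if missing)."""
    return {
        name: count_data_rows(Path(repo_root) / name)
        for name in EXPECTED_OUTPUTS
    }


if __name__ == "__main__":
    for name, rows in report_outputs().items():
        status = "missing" if rows is None else f"{rows} rows"
        print(f"{name}: {status}")
```

Run it from the repository root; a file reported as missing means the corresponding step (scraper or notebook) has not completed yet.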
