metacritic_parsing

The main aim of this project is to provide a foundational comparison between the BeautifulSoup4 library and the Requests-HTML library for parsing websites.

BS4 has been widely used as the de facto library for parsing websites and HTML documents. It is known to be user-friendly and easy to pick up, even for beginners. As an alternative, Requests-HTML was released by the same people who made the requests library. The main advantages it boasts include full JavaScript support and async support.
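To give a feel for how similar the two APIs are for static pages, here is a minimal sketch that extracts the same fields with each library. The markup in `SAMPLE` and the function names are invented for illustration and are not taken from the scripts in this repository or from Metacritic itself.

```python
# Invented markup for illustration only; real Metacritic pages use
# different tags and class names.
SAMPLE = """
<ul>
  <li class="game"><span class="name">Example Game A</span><span class="score">91</span></li>
  <li class="game"><span class="name">Example Game B</span><span class="score">78</span></li>
</ul>
"""


def names_and_scores_bs4(html):
    """Return (name, score) pairs using BeautifulSoup4 CSS selectors."""
    # Imported here so the sketch loads even if only one library is installed.
    from bs4 import BeautifulSoup  # pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for item in soup.select("li.game"):
        name = item.select_one(".name").get_text(strip=True)
        score = int(item.select_one(".score").get_text(strip=True))
        pairs.append((name, score))
    return pairs


def names_and_scores_requests_html(html):
    """Return (name, score) pairs using Requests-HTML on an HTML string."""
    from requests_html import HTML  # pip install requests-html

    doc = HTML(html=html)
    pairs = []
    for item in doc.find("li.game"):
        name = item.find(".name", first=True).text
        score = int(item.find(".score", first=True).text)
        pairs.append((name, score))
    return pairs
```

Both functions take an HTML string, so the same downloaded page can be handed to either one when comparing them.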

In this project, we parse the Metacritic webpages that list the ratings of all video games in its records. There are 181 pages in total at the time of writing, but I have parsed only 100 of them for convenience. The scripts can easily be extended to cover all 181 pages. The webpages do not need JavaScript support, so the playing field is level. The data, consisting of the name, score, release date, and platform of each game, is stored in a CSV file after all the parsing is done.
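The overall flow described above (loop over the pages, collect every row, then write the CSV once at the end) could be sketched roughly as follows. This is an illustrative outline using only the standard library, not the code from `using_bs4.py` or `using_requestshtml.py`. The names `scrape`, `write_rows`, and the `fetch` and `parse` callables are hypothetical, and the column names are assumed from the field list above.

```python
import csv

# Column names assumed from the description: name, score, release date, platform.
FIELDS = ["name", "score", "release_date", "platform"]


def write_rows(rows, handle):
    """Write a header line followed by one CSV line per game."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def scrape(pages, fetch, parse, handle):
    """Parse every page first, then write all rows to the CSV in one go.

    fetch(page) -> HTML string for that page number (hypothetical callable).
    parse(html) -> list of dicts keyed by FIELDS (BS4 or Requests-HTML based).
    Returns the number of rows written.
    """
    rows = []
    for page in pages:
        rows.extend(parse(fetch(page)))
    write_rows(rows, handle)
    return len(rows)
```

With this shape, going from 100 pages to all 181 only means passing a larger range as `pages` (whether numbering starts at 0 or 1 depends on the site's URLs), and swapping libraries only means passing a different `parse` function. When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends.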

I have also made a small visualization of the obtained data in a Jupyter notebook.

Improvements:

The obvious one is parsing all 181 pages. Also, Scrapy is another very commonly used library for such tasks. It has added functionality that makes parsing multiple webpages easier, so it would be a natural addition to this comparison.

