milosz08/tv-archive-aggregator

TV archive aggregator

TV archive aggregator that scrapes content from telemagazyn.pl and indexes it with a more convenient search system.

Features

The main goal of this project is to improve the archiving of TV programs and to make it easier to search them for interesting content from a given time period.

This project consists of 3 sub-projects:

  • web-scrapper - scrapes data from the website and saves it in the DB (desktop Java Swing app),
  • data-server - REST API written in Spring Boot,
  • web-ui - web client written in React with the MUI component library.

Main features:

  • scrape content from telemagazyn.pl and save it in a defined structure in a MySQL database,
  • provide a web API for the existing web client or other clients (for example mobile),
  • provide a web UI for searching content by program type, TV channel or genre, with an advanced search system and additional data visualization tools.

Prerequisites

  • for the development environment:
    • Node v18 or higher (and the corresponding NPM installation),
    • JDK 17 or higher,
    • Docker and Docker Compose.
  • for the runtime environment:
    • JRE 17 or higher (only for the desktop web-scrapper app),
    • Docker and Docker Compose.
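
The development prerequisites above can be checked quickly from a terminal. The following is only a minimal sketch: the function names (major_of, check_min, check_prerequisites) are hypothetical and not part of this repository, and it assumes that node, javac, docker and docker-compose are expected on the PATH.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the repository.

# Print the major version from strings such as "v18.17.0" or "17.0.9".
major_of() {
  printf '%s\n' "$1" | sed 's/^v//' | cut -d. -f1
}

# Compare a detected major version ($2) with the required minimum ($3).
check_min() {
  if [ "$2" -ge "$3" ] 2>/dev/null; then
    echo "$1 OK ($2)"
  else
    echo "$1 missing or too old (need $3+)"
  fi
}

# Check the tools listed for the development environment.
check_prerequisites() {
  node_version=$(node --version 2>/dev/null || true)
  # JDK 9+ prints "javac <version>" on stdout; older JDKs end up as "too old".
  javac_version=$(javac -version 2>/dev/null | awk '{print $2}')
  check_min "Node" "$(major_of "$node_version")" 18
  check_min "JDK" "$(major_of "$javac_version")" 17
  for tool in docker docker-compose; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool OK"
    else
      echo "$tool missing"
    fi
  done
}
```

Calling check_prerequisites prints one line per tool, so a missing or outdated tool is visible before any build step is started.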

Clone and install

  1. Clone the repository to your computer:
$ git clone https://github.com/milosz08/tv-archive-aggregator
  2. Create Docker containers for data-server, web-ui and the MySQL database:
$ docker-compose up -d

This command should create 3 Docker containers:

Application              Port  Description
tv-scrapper-mysql-db     4850  MySQL database port
tv-scrapper-data-server  4851  REST API port
tv-scrapper-web-ui       4852  Web client port

  3. Build the web-scrapper desktop app and create its executable JAR file:
  • for a UNIX environment, type:
$ ./mvnw clean assembly:assembly
  • for a Windows environment, type:
.\mvnw.cmd clean assembly:assembly

This command creates the tv-scrapper-1.0.0.jar file in the .bin directory. All application logs are written to the logs directory. Optionally, you can create a .env file with database connection details (not required):

DB_HOST=localhost
DB_PORT=4850
DB_USERNAME=root
DB_PASSWORD=admin
DB_NAME=aggregator-db
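
To see how these values fit together, the sketch below reads an optional .env file and prints the MySQL JDBC URL they describe. It is an illustration only: jdbc_url is a hypothetical helper, and it assumes that the values listed above are the defaults used when no .env file exists. How the application itself loads the file is not shown in this README.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Reads DB_HOST, DB_PORT and DB_NAME from an env file (default: .env)
# and prints the matching MySQL JDBC URL.
jdbc_url() {
  env_file="${1:-.env}"
  # Assumed defaults: the values shown in this README.
  host=localhost
  port=4850
  name=aggregator-db
  if [ -f "$env_file" ]; then
    # Split each line on the first "="; also handle a last line without newline.
    while IFS='=' read -r key value || [ -n "$key" ]; do
      case "$key" in
        DB_HOST) host=$value ;;
        DB_PORT) port=$value ;;
        DB_NAME) name=$value ;;
      esac
    done < "$env_file"
  fi
  printf 'jdbc:mysql://%s:%s/%s\n' "$host" "$port" "$name"
}
```

Running jdbc_url without arguments in the directory that holds the .env file prints the URL for that file; DB_USERNAME and DB_PASSWORD are left out on purpose, since credentials are normally passed separately from the URL.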

Prepare develop environment

  1. Clone the repository via the git clone command (see the Clone and install section).
  2. Optionally, change the MySQL root password in the .env file:
TV_SCRAPPER_MYSQL_PASSWORD=admin
  3. Go to the root directory and run the MySQL database:
$ docker-compose up -d tv-scrapper-mysql-db

This command should initialize the MySQL database with two tables: tv_channels and tv_programs_data.

  4. Create .env files:
  • for a UNIX environment, via create-env.sh:
$ chmod +x create-env.sh
$ ./create-env.sh \
DB_HOST=localhost \
DB_PORT=4850 \
DB_USERNAME=root \
DB_PASSWORD=admin \
DB_NAME=aggregator-db

NOTE: Optionally, add the -d flag if you want to remove an already existing .env file.

  • for a Windows environment, via create-env.ps1:
.\create-env.ps1 -Arguments `
DB_HOST=localhost, `
DB_PORT=4850, `
DB_USERNAME=root, `
DB_PASSWORD=admin, `
DB_NAME=aggregator-db

NOTE: Optionally, add the -Overwrite flag if you want to remove an already existing .env file.

  5. That's it. Now you can run data-server, web-scrapper and web-ui in your favorite IDE (IntelliJ IDEA is recommended for the first two and Visual Studio Code for web-ui).
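
Before starting the applications, it can help to confirm that the database step really produced the two tables named above. The sketch below is one possible check, not something shipped with the project: missing_tables is a hypothetical helper, and the commented wiring assumes that the mysql client is available inside the tv-scrapper-mysql-db service and that the root/admin credentials and the aggregator-db name from this README are in use.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Table names taken from the README text.
EXPECTED_TABLES="tv_channels tv_programs_data"

# Reads table names (one per line) on stdin and prints every
# expected table that is absent from that list.
missing_tables() {
  found=$(cat)
  for table in $EXPECTED_TABLES; do
    printf '%s\n' "$found" | grep -qxF "$table" || printf '%s\n' "$table"
  done
}

# Possible wiring (assumes the mysql client exists in the container):
# docker-compose exec -T tv-scrapper-mysql-db \
#   mysql -uroot -padmin -N -e 'SHOW TABLES' aggregator-db | missing_tables
```

Empty output means both tables were found; any printed name points to a table that was not created.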

Tech stack

  • Java SE 17
  • Swing UI
  • Spring Boot 3
  • MySQL with Spring Data JDBC
  • React with TanStack Query
  • MUI components library

Author

Created by Miłosz Gilga. If you have any questions about this application, send a message to personal@miloszgilga.pl.

License

This software is licensed under the Apache 2.0 License.