A Python script that scrapes data from websites and stores it in a structured form

abhishek-2k2/WEBSCRAPING

Web Scraping Project: Flipkart and Amazon Data Extraction

This project uses Python to scrape laptop titles and prices from the Flipkart and Amazon websites and saves the extracted data as Excel (.xlsx) files for further analysis.

Introduction

This project showcases Python web scraping with the BeautifulSoup and requests libraries: requests fetches the search-result pages from Flipkart and Amazon, and BeautifulSoup parses the laptop titles and prices out of them. The data is then organized with pandas into structured Excel files for easy analysis.

Importing Libraries

Importing necessary libraries for web scraping and data handling:


import requests
from bs4 import BeautifulSoup
import pandas as pd
import openpyxl

Functions

Defining functions to extract data from Flipkart and Amazon:


# Function to extract data from Flipkart
def extract_from_flipkart(url, tag_title, tag_price):
    # Implementation code
    ...

# Function to extract data from Amazon
def extract_from_amazon(url, tag_title, tag_price):
    # Implementation code
    ...
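
The function bodies are elided above. As a rough illustration only, an extractor of this shape might look like the sketch below. It assumes that `tag_title` and `tag_price` are CSS selectors handed to BeautifulSoup's `select`, and that results accumulate in a dict of lists (suggested by `pd.DataFrame.from_dict(data)` in the Results section). The names `parse_listing` and `fetch_and_parse`, the `User-Agent` header, and the sample HTML are hypothetical and are not taken from the repository.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical accumulator: one list per column, the shape DataFrame.from_dict accepts.
data = {"title": [], "price": []}


def parse_listing(html, tag_title, tag_price, store):
    """Append every title/price pair found in `html` to `store`."""
    soup = BeautifulSoup(html, "html.parser")
    titles = [t.get_text(strip=True) for t in soup.select(tag_title)]
    prices = [p.get_text(strip=True) for p in soup.select(tag_price)]
    # zip() stops at the shorter list, so both columns always keep equal length.
    for title, price in zip(titles, prices):
        store["title"].append(title)
        store["price"].append(price)
    return store


def fetch_and_parse(url, tag_title, tag_price, store):
    """Download one results page and parse it (assumed flow, not the author's code)."""
    # Assumption: a browser-like User-Agent; some sites reject the library default.
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    return parse_listing(response.text, tag_title, tag_price, store)


# Offline demonstration on a hand-written snippet (no network access needed).
sample_html = """
<div class="item"><div class="name">Laptop A</div><div class="cost">49,990</div></div>
<div class="item"><div class="name">Laptop B</div><div class="cost">62,500</div></div>
"""
parse_listing(sample_html, "div.name", "div.cost", data)
print(data)
```

Keeping the parsing separate from the download makes the selector logic testable on saved HTML. Note that pairing titles and prices by position can misalign rows when a listing has no price; selecting each product card first and reading both fields from it would be more robust.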

Usage

Running the web scraping functions to fetch laptop data:


# Run web scraping on Flipkart
for i in range(2, 11):
    extract_from_flipkart(f"https://www.flipkart.com/search?q=laptops&page={i}", 'div._4rR01T', 'div._30jeq3._1_WHN1')

# Run web scraping on Amazon
for i in range(2, 11):
    extract_from_amazon(f"https://www.amazon.in/s?k=laptops&page={i}", 'span.a-size-medium.a-color-base.a-text-normal', 'span.a-price-whole')
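
Both loops walk result pages 2 through 10, since `range(2, 11)` excludes its upper bound. The sketch below shows one way the same loop could be organized, with URL construction split out and a pause between requests. The helpers `page_urls` and `scrape_pages` and the one-second delay are illustrative assumptions, not part of the project.

```python
import time


def page_urls(base, first=2, last=10):
    """Build the search-result URLs for pages first..last inclusive."""
    return [f"{base}&page={i}" for i in range(first, last + 1)]


def scrape_pages(urls, extract, tag_title, tag_price, delay=1.0):
    """Call `extract` on each URL, sleeping `delay` seconds between requests."""
    for url in urls:
        extract(url, tag_title, tag_price)
        time.sleep(delay)


flipkart_urls = page_urls("https://www.flipkart.com/search?q=laptops")
print(flipkart_urls[0])
print(len(flipkart_urls))
```

With the functions defined earlier, a call such as `scrape_pages(flipkart_urls, extract_from_flipkart, 'div._4rR01T', 'div._30jeq3._1_WHN1')` would reproduce the Flipkart loop while spacing out the requests.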

Results

Converting the extracted data into Excel (.xlsx) files:


# Convert data to Excel files
df_flipkart = pd.DataFrame.from_dict(data)
df_flipkart.to_excel("flipkart.xlsx", index=False)

df_amazon = pd.DataFrame.from_dict(data1)
df_amazon.to_excel("amazon.xlsx", index=False)
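
The README does not show how `data` and `data1` are built. The sketch below assumes each is a dict of equal-length lists and shows the conversion end to end, including an optional cleanup step that turns price strings into integers. The sample rows, the column names, and the output file name are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample in the assumed shape: a dict of equal-length lists.
data = {"title": ["Laptop A", "Laptop B"], "price": ["₹49,990", "₹62,500"]}

df = pd.DataFrame.from_dict(data)

# Strip everything except digits so prices can be sorted and averaged.
df["price"] = df["price"].str.replace(r"[^\d]", "", regex=True).astype(int)

# pandas writes .xlsx files through openpyxl, so that package must be installed.
out_path = os.path.join(tempfile.mkdtemp(), "laptops_sample.xlsx")
df.to_excel(out_path, index=False)
print(df)
```

Passing `index=False`, as the project does, keeps the DataFrame's row index out of the spreadsheet.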

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
