Zeeshanahmad4/NLP--Data-extraction-Microsoft-Word-documents-into-a-CSV

NLP--Data-extraction-NLP-Microsoft-Word-documents-into-a-CSV


NLP-Data Extraction

Microsoft Word to CSV (tested on more than 200 documents)

Table of Contents

About The Project

Demo

[Image: demo]

Code

[Image: code]

Output Data

[Image: output data]

Built With

Prerequisites

Installation

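  1. Clone the repo
git clone https://github.com/Zeeshanahmad4/NLP--Data-extraction-Microsoft-Word-documents-into-a-CSV.git
  2. Install the Python packages (the docx module is provided by the python-docx package; string is part of the Python standard library and needs no install)
pip install python-docx docxpy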

Usage

Extract data from Microsoft Word documents into a CSV file or an Excel file. The input is about 250 Word documents, each structured as a Microsoft Word table with repetitive titles. All documents are in French, and both the text and the hyperlinks in the documents must be preserved.
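The extraction described above can be sketched roughly as follows. This is not the repository's script (which relies on docx and docxpy); it is a minimal standard-library illustration that reads the table rows and hyperlink targets straight from the .docx archive. The function names `extract_table_rows` and `docx_folder_to_csv`, the output layout (file name, cell texts, then the row's links in a last column), and the choice to ignore nested tables are all assumptions made for the example.

```python
import csv
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespaces used inside a .docx archive.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"


def read_hyperlink_targets(archive):
    """Map relationship ids (e.g. 'rId5') to hyperlink URLs."""
    try:
        rels_xml = archive.read("word/_rels/document.xml.rels")
    except KeyError:  # the document has no relationships part
        return {}
    targets = {}
    for rel in ET.fromstring(rels_xml).iter(f"{{{PKG_REL}}}Relationship"):
        if rel.get("Type", "").endswith("/hyperlink"):
            targets[rel.get("Id")] = rel.get("Target")
    return targets


def extract_table_rows(docx_path):
    """Return one list per table row: the cell texts, then the row's links."""
    with zipfile.ZipFile(docx_path) as archive:
        targets = read_hyperlink_targets(archive)
        root = ET.fromstring(archive.read("word/document.xml"))
    rows = []
    for tr in root.iter(f"{{{W}}}tr"):
        cells = []
        for tc in tr.findall(f"{{{W}}}tc"):
            # Join the text runs of each paragraph, one paragraph per line.
            paragraphs = (
                "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
                for p in tc.iter(f"{{{W}}}p")
            )
            cells.append("\n".join(paragraphs).strip())
        links = [
            targets[h.get(f"{{{R}}}id")]
            for h in tr.iter(f"{{{W}}}hyperlink")
            if h.get(f"{{{R}}}id") in targets
        ]
        rows.append(cells + [" ".join(links)])
    return rows


def docx_folder_to_csv(folder, csv_path):
    """Write every table row of every .docx in `folder` to one CSV file."""
    count = 0
    # utf-8-sig adds a BOM so Excel shows French accents correctly.
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        for path in sorted(Path(folder).glob("*.docx")):
            for row in extract_table_rows(path):
                writer.writerow([path.name] + row)
                count += 1
    return count
```

Calling `docx_folder_to_csv("documents", "output.csv")` (the folder name is hypothetical) would produce one CSV line per table row. Documents with nested tables or merged cells would need extra handling that this sketch does not attempt.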

Included Files

  1. scriptpyfile.py
  2. script.ipynb
  3. output.csv

Roadmap

See the open issues for a list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.