Thanks for your interest in our library. I'm happy to explain where things stand to help you decide. So far, our main focus has been a crawl function that is robust, fast, and extracts the right information from a given URL, and I can say we have achieved that. The second part, our scraper module, is currently under development and will hopefully be available within two to three weeks. While the crawler focuses on a single URL, the scraper's goal is to traverse a website as a graph and extract all of its information in a neat, organized manner.
Right now, you can use the crawler to extract all the internal and external links from a page, decide which of those links you want to follow, and crawl them in turn. Alternatively, you can wait for the scraper module to be released. Either way, keep in mind that the library is still evolving and improves as more people use it and raise issues.
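To make the "crawl, collect links, crawl again" workflow concrete, here is a minimal sketch using only the Python standard library. It does not use our library's API: `fetch` is a hypothetical placeholder for whatever call returns a page's HTML, and `LinkCollector`, `split_links`, `crawl_site`, and `max_pages` are names made up for this illustration.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Dict, List, Tuple
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(base_url: str, html: str) -> Tuple[List[str], List[str]]:
    """Return (internal, external) absolute links found in html."""
    parser = LinkCollector()
    parser.feed(html)
    base_host = urlparse(base_url).netloc
    internal: List[str] = []
    external: List[str] = []
    for href in parser.hrefs:
        # Resolve relative links and drop "#fragment" suffixes.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        bucket = internal if parts.netloc == base_host else external
        if absolute not in bucket:
            bucket.append(absolute)
    return internal, external


def crawl_site(
    start_url: str,
    fetch: Callable[[str], str],  # hypothetical: returns HTML for a URL
    max_pages: int = 50,
) -> Dict[str, str]:
    """Breadth-first crawl that follows internal links only."""
    pages: Dict[str, str] = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        internal, _external = split_links(url, html)
        for link in internal:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

For a forum, the start URL would be the board index, and paginated thread URLs (for example `?page=2`) are followed like any other internal link because the query string is kept. In practice you would pass a `fetch` that wraps the crawler and add politeness delays and URL filters; those are left out of this sketch.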
So if you start using the project now, you will get very good support at this stage of the library. We learn from your projects and improve the library, and in return you receive our support. Feel free to try it and let us know how it goes; we'll help each other along the way. Thank you again.
I want to scrape a forum to collect data, possibly for LLM fine-tuning.
Forums have boards.
Boards have threads.
Threads include details like title, author, datetime, etc.
It would be better if it also supports pages.
Is this tool a good fit for this requirement?