Goose is non-functional in Python 3 #148

Open
fake-name opened this issue Sep 16, 2014 · 13 comments

Comments

@fake-name

The title is largely self-explanatory.

The primary limitation seems to be the reliance on BeautifulSoup 3, which has been EOL for quite a while now and really should be migrated away from.

@fake-name
Author

Actually, where is BeautifulSoup used at all? I can't find any reference to it in the codebase. It's being pulled in through lxml somewhere, somehow, despite no explicit mention of it anywhere.

Also, unittest sucks, and doesn't report anything informative when you hit an ImportError. You can apparently use nosetests to run the same tests with sane output.
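
If switching test runners is not an option, one way to surface the hidden ImportError is to import each test module up front and print the full traceback. This is only a minimal sketch: `find_import_errors` is an illustrative helper, not part of Goose or unittest, and the module names below are placeholders.

```python
import importlib
import traceback


def find_import_errors(module_names):
    """Try to import each module; return {name: traceback text} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:  # ImportError, SyntaxError, or anything raised at import time
            failures[name] = traceback.format_exc()
    return failures


if __name__ == "__main__":
    # Placeholder module names; substitute the project's real test modules.
    for name, tb in find_import_errors(["json", "no_such_test_module"]).items():
        print(f"--- {name} failed to import ---")
        print(tb)
```

Running this before the test suite shows which import is failing and where, instead of an opaque runner error.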

jieba can be replaced with jieba3k.

@fake-name
Author

Going through everything, it appears that the heavy dependency on soupparser is the problem. Patching in bs4 in place of bs3 at runtime is not workable, since lxml passes arguments to `__init__` that bs4 does not accept.
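
To illustrate why this kind of runtime patching breaks down, here is a self-contained sketch using stand-in modules. Everything in it is hypothetical: `legacy_soup`, `modern_soup`, `Soup`, and `legacy_option` are made-up names, not the real BeautifulSoup or lxml API, and the actual arguments lxml passes are not shown here. It only demonstrates the general failure mode: aliasing a module in `sys.modules` makes the import succeed, but the call still fails if the constructor signature differs.

```python
import sys
import types

# Stand-in for an old parser module whose class accepts a legacy keyword.
legacy = types.ModuleType("legacy_soup")


class LegacySoup:
    def __init__(self, markup, legacy_option=None):
        self.markup = markup


legacy.Soup = LegacySoup

# Stand-in for the newer module: same class name, stricter constructor.
modern = types.ModuleType("modern_soup")


class ModernSoup:
    def __init__(self, markup):
        self.markup = markup


modern.Soup = ModernSoup


def caller_written_for_legacy(markup):
    """Mimics library code that imports the old name and passes old arguments."""
    import legacy_soup  # resolved through sys.modules

    return legacy_soup.Soup(markup, legacy_option="x")


# Runtime patch: make `import legacy_soup` hand back the modern module.
sys.modules["legacy_soup"] = modern

try:
    caller_written_for_legacy("<p>hi</p>")
except TypeError as exc:
    print("patched import succeeded, but the call failed:", exc)
```

The import is satisfied by the alias, so the failure only appears at call time as a `TypeError`. Fixing that would mean changing the calling code, which in this case lives inside lxml.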

@fake-name
Author

I have unit tests working.

Ran 126 tests in 10.607s

FAILED (errors=54, failures=49)

Welp! Time to look at other text extractors.

Is there any timeline for Python 3 compatibility?

@hnykda

hnykda commented Nov 25, 2014

+1 for Python 3 support... Is there any schedule? Or do you not care at all?

@fake-name
Author

@kotrfa - It's not a direct equivalent, but I wound up using python-readability for text extraction. It works well enough.
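
For anyone taking the same route, a minimal sketch of text extraction with python-readability might look like the following. It assumes the package is installed (it is published on PyPI as `readability-lxml` and imported as `readability`); `extract_with_readability` and the sample HTML are illustrative only.

```python
def extract_with_readability(html):
    """Return (title, main-content HTML) for a page, using python-readability."""
    # Imported lazily so this file still loads when the package is missing.
    from readability import Document

    doc = Document(html)
    # summary() returns the cleaned main content as an HTML string, not plain text.
    return doc.title(), doc.summary()


if __name__ == "__main__":
    sample = (
        "<html><head><title>Example</title></head><body><article><p>"
        + "Some article text. " * 30
        + "</p></article></body></html>"
    )
    title, body_html = extract_with_readability(sample)
    print(title)
    print(body_html[:200])
```

Note that the result is still HTML, so a further step is needed to get plain text, which is part of why it is not a direct equivalent of Goose.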

@vetal4444

I prepared a PR to add py3 support: #220

@xanderdunn

+1 for this. Why is this still using Python 2!?

@hipoglucido

Still waiting for Python 3 support :)

@hnykda

hnykda commented Jun 7, 2016

I believe this project is dead. Use https://github.com/codelucas/newspaper instead, which is inspired by goose and supports Python 3 flawlessly.
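
A rough sketch of the basic newspaper workflow, for comparison with Goose. It assumes the `newspaper` package is installed and that the URL is reachable; `extract_with_newspaper` is an illustrative wrapper, and the URL in the usage section is a placeholder.

```python
def extract_with_newspaper(url):
    """Download and parse one article; return (title, plain text)."""
    # Imported lazily so this file still loads when the package is missing.
    from newspaper import Article

    article = Article(url)
    article.download()  # fetches the page over the network
    article.parse()     # extracts title, text, and other metadata
    return article.title, article.text


if __name__ == "__main__":
    # Placeholder URL; replace with a real article address.
    title, text = extract_with_newspaper("https://example.com/some-article")
    print(title)
    print(text[:200])
```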

@hipoglucido

Yep, I already knew about it, but I just wanted to compare the available tools. I will indeed use it. Thanks!

@LukeB42

LukeB42 commented Sep 5, 2016

Any plans to introduce Python 3 support to this project?


@lababidi
Copy link

Hi everyone, this may come off as self-promotion, but I went ahead and forked goose to work with Python 3: http://github.com/goose3/goose3 Enjoy!
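
A minimal usage sketch for the fork, assuming the `goose3` package is installed. `extract_with_goose3` and the sample HTML are illustrative only; passing raw HTML avoids needing network access.

```python
def extract_with_goose3(html):
    """Return (title, cleaned plain text) extracted from raw HTML by goose3."""
    # Imported lazily so this file still loads when the package is missing.
    from goose3 import Goose

    # The context manager lets Goose clean up its resources on exit.
    with Goose() as g:
        article = g.extract(raw_html=html)
        return article.title, article.cleaned_text


if __name__ == "__main__":
    sample = (
        "<html><head><title>Example</title></head><body><article><p>"
        + "Some article text. " * 30
        + "</p></article></body></html>"
    )
    title, text = extract_with_goose3(sample)
    print(title)
    print(text[:200])
```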
