Skip to content

yv/cwb3-python

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

38 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This is a Python wrapper to the low-level API of CQP which allows
you to access CQP corpora in the same way as Perl's CWB::CL

cwb3-python is aimed at exposing all the nice new features of
CWB 3.5 and later to users of either Python 2.7 or Python 3.3+.

To install the module, please use the standard
 python setup.py build
 sudo python setup.py install
command sequence.

If you use the newest version of CQP (CWB 3.0), you need to
change the value of the "extra_libs" variable in setup.py.

To give you an idea how this works, see the following sample:

--- 8< ---
from CWB.CL import Corpus

# open the corpus
corpus=Corpus('TUEPP')
# get sentences and words
sentences=corpus.attribute('s','s')
words=corpus.attribute('word','p')
postags=corpus.attribute('pos','p')
# retrieve offsets of the 1235th sentence (0-based)
s_1234=sentences[1234]

for w,p in zip(words[s_1234[0]:s_1234[1]+1],postags[s_1234[0]:s_1234[1]+1]):
    print "%s/%s"%(w,p)

--- 8< ---

In order to test the CWB.CL module's correct installation
independently of any CQP corpora, you can do a
 python -m doctest tests/idlist.txt
which should terminate with no output when everything is well.

About

Python interface to CWB 3.5+

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published