AzTreeBank is a syntactically annotated treebank for the Azerbaijani language

LocalDoc-Azerbaijan/AzTreeBank

AzTreeBank

AzTreeBank is a syntactically annotated treebank for the Azerbaijani language, following the Universal Dependencies guidelines.

Data Sources

The data in AzTreeBank was collected from a variety of sources, including:

  • Books
  • Wikipedia
  • News websites (sports, politics, and other topics)
  • Scientific and literary articles

Data Generation

The annotations in AzTreeBank were generated automatically, which gives broad coverage of the syntactic structures of Azerbaijani.

Authors

AzTreeBank was developed and is maintained by the LocalDoc team.

License

This dataset is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). You are free to share and adapt the material, provided you give appropriate credit and do not use it for commercial purposes.

Language

The corpus is entirely in Azerbaijani.

Statistics

  • Sentences: 75,225
  • Tokens: 1,167,589
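As a quick sanity check on these figures, the average sentence length can be derived from the two counts above. The snippet below is only a small illustration; the variable names are arbitrary and nothing beyond the two published totals is assumed.

```python
# Totals as stated in the Statistics section of this README.
SENTENCES = 75_225
TOKENS = 1_167_589

# Mean number of tokens per sentence across the whole treebank.
avg_tokens_per_sentence = TOKENS / SENTENCES

print(f"Average tokens per sentence: {avg_tokens_per_sentence:.2f}")
```

This works out to roughly 15.5 tokens per sentence.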

Annotation

The annotations include part-of-speech (POS) tags, morphological features, and syntactic dependency relations following the Universal Dependencies schema.
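The three annotation layers can be read with a few lines of Python. The sketch below assumes the data is distributed in the standard 10-column, tab-separated CoNLL-U format used by Universal Dependencies treebanks; this README does not state the file format, so treat that as an assumption. The sample sentence, its tags, and the names `Token`, `parse_feats`, and `parse_conllu` are hand-written for illustration and are not taken from AzTreeBank.

```python
from dataclasses import dataclass

# Hand-written illustrative sentence, NOT taken from AzTreeBank.
# Assumption: standard 10-column, tab-separated CoNLL-U layout.
SAMPLE = (
    "# text = Mən kitab oxuyuram.\n"
    "1\tMən\tmən\tPRON\t_\tNumber=Sing|Person=1\t3\tnsubj\t_\t_\n"
    "2\tkitab\tkitab\tNOUN\t_\tNumber=Sing\t3\tobj\t_\t_\n"
    "3\toxuyuram\toxu\tVERB\t_\tNumber=Sing|Person=1|Tense=Pres\t0\troot\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
    "\n"
)


@dataclass
class Token:
    id: int        # 1-based position in the sentence
    form: str      # surface word form
    lemma: str     # dictionary form
    upos: str      # universal POS tag
    feats: dict    # morphological features, e.g. {"Number": "Sing"}
    head: int      # id of the syntactic head (0 = root, -1 = unspecified)
    deprel: str    # dependency relation to the head


def parse_feats(field):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if field == "_":
        return {}
    return dict(pair.split("=", 1) for pair in field.split("|"))


def parse_conllu(text):
    """Return a list of sentences, each a list of Token objects."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():           # blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):       # sentence-level comment
            continue
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue                   # skip multiword ranges (1-2) and empty nodes (1.1)
        head = int(cols[6]) if cols[6].isdigit() else -1
        current.append(
            Token(int(cols[0]), cols[1], cols[2], cols[3],
                  parse_feats(cols[5]), head, cols[7])
        )
    if current:                        # file may lack a trailing blank line
        sentences.append(current)
    return sentences


sentences = parse_conllu(SAMPLE)
for tok in sentences[0]:
    print(tok.id, tok.form, tok.upos, tok.feats, tok.head, tok.deprel)
```

To read an actual file instead of the inline sample, pass its contents (opened with `encoding="utf-8"`) to `parse_conllu`. A dedicated CoNLL-U library may be a better fit for production use; the hand-rolled parser is used here only to keep the example dependency-free.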

Contact

For any inquiries or further information, please contact the LocalDoc team at v.resad.89@gmail.com.
