Japanese learning tools that use Wadoku export data. Written with TypeScript and Next.js.

ThorbenSchiller/nihongosensei

日本語先生 (Nihongosensei)

Live: nihongosensei.app

This project provides a collection of useful tools for learning Japanese.

Furigana

A simple generator that adds furigana to Japanese text via kuroshiro is available at nihongosensei.app/furigana.
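
Furigana is conventionally rendered in HTML with ruby elements. The following is a minimal sketch of that output step only. The Segment type and the toRubyHtml helper are hypothetical names chosen for illustration; they are not part of kuroshiro or of this project, and the readings here are supplied by hand rather than produced by an analyzer.

```typescript
// One piece of analyzed text: the surface form plus an optional kana reading.
// Hypothetical shape for illustration; not kuroshiro's internal representation.
interface Segment {
  surface: string;
  reading?: string;
}

// Escape characters that are significant in HTML text content.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap segments that carry a distinct reading in <ruby> markup and pass
// everything else through unchanged. The <rp> elements provide fallback
// parentheses for renderers without ruby support.
function toRubyHtml(segments: Segment[]): string {
  return segments
    .map(({ surface, reading }) =>
      reading && reading !== surface
        ? `<ruby>${escapeHtml(surface)}<rp>(</rp><rt>${escapeHtml(reading)}</rt><rp>)</rp></ruby>`
        : escapeHtml(surface)
    )
    .join("");
}

const html = toRubyHtml([
  { surface: "日本語", reading: "にほんご" },
  { surface: "を" },
  { surface: "学", reading: "まな" },
  { surface: "ぶ" },
]);
console.log(html);
```

In a real pipeline the segments would come from a morphological analyzer; only the kanji-bearing segments need a reading, so kana passes through untouched.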

Dictionary

A dictionary based on wadoku.de XML data is available.

Schema

For now, the dictionary stores each converted XML entry as JSON in a single entry table. The entry_map table holds additional text fields to enable text search, and the entry_ref table holds references between entries.

create table entry
(
    id         int unsigned                        not null
        primary key,
    entry_json json                                null,
    lastchange timestamp default CURRENT_TIMESTAMP not null on update CURRENT_TIMESTAMP,
    jlpt       tinyint unsigned                    null
)
    charset = utf8;

create table entry_map
(
    entry_id int                          not null,
    text     varchar(255) charset utf8mb4 not null,
    primary key (entry_id, text)
);

create fulltext index entry_map__text
    on entry_map (text);

create index entry_map__text_index
    on entry_map (text);

create table entry_ref
(
    target_id    int          not null,
    source_id    int          not null,
    type         varchar(255) not null,
    subentrytype varchar(255) null,
    primary key (target_id, source_id)
);

create index entry_ref__target_id
    on entry_ref (target_id);

Import Data

See https://github.com/nihongosensei/wadoku-export-reader

Wadoku XML Exports

See https://www.wadoku.de/downloads/xml-export/

Wadoku Data License

See https://www.wadoku.de/wiki/display/WAD/Wadoku.de-Daten+Lizenz

JLPT Levels

JLPT levels are imported from Wiktionary: https://en.wiktionary.org/wiki/Appendix:JLPT

Examples

Multiple Senses: 167612

Def and text inside a tr: 1707

With a ref inside a Sense: 273

Tr followed by a def: 208

Tr followed by a def with multiple Tr: 515

Multiple defs after a tr: 4029690

Usg with type and reg: 4151

Long list of senses: 8042046

With etym: 11712

With etym which has a ref: 8545

With etym which has a foreign word: 490814

With multiple etyms: 2516676

Usg with type HINT: 3778315

Usg with type TIME: 8444455

Def followed by text: 8444455

Usg on entry level: 5075870

Famn and title: 226081

Season word: 10000528

Many senses: 5260527

Verb with 2 doushi definitions: 2972828
