Support automatic fill-in of internal TOCs for non-English Languages #388

Open
wonhyeongseo opened this issue Jul 14, 2023 · 1 comment

Comments

@wonhyeongseo

Description

Hello @mishig25,

I am writing to raise a concern about the lack of support for automatic fill-in of internal TOCs (tables of contents) for non-English languages in the Hugging Face documentation.

For example, in the Vietnamese course, the heading Giới thiệu (Introduction) is converted to gii-thiu (good-natured) in the URL.
An English anchor should be required for every heading in the documentation, regardless of language. Without one, when people switch languages (for example en->vi), the current system does not take them to the same part of the documentation. This is troublesome for long API docs. Ideally, a user who changes only the language code in the URL should land directly on the section they want. Note: Right-to-left languages such as Arabic have a different linking structure, and I couldn't figure out where to put the custom anchor.
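To illustrate how an anchor like gii-thiu can come about, here is a minimal sketch. It assumes a slug rule that keeps only ASCII letters, digits, spaces and hyphens; I have not checked the rule the doc builder really uses, and `ascii_only_slug` is a hypothetical name used only for this example.

```python
import re
import unicodedata


def ascii_only_slug(heading: str) -> str:
    """Hypothetical slug rule (an assumption, not the doc builder's code):
    lowercase the heading, drop every character that is not an ASCII
    letter, digit, space or hyphen, then join the words with hyphens."""
    # Normalize to NFC so accented letters are single code points and are
    # dropped as a whole instead of leaving their base letter behind.
    lowered = unicodedata.normalize("NFC", heading).lower()
    kept = re.sub(r"[^a-z0-9 -]", "", lowered)
    return re.sub(r"[ -]+", "-", kept.strip())


print(ascii_only_slug("Giới thiệu"))    # gii-thiu
print(ascii_only_slug("Introduction"))  # introduction
```

Under this assumed rule the accented letters simply vanish, so the Vietnamese and English pages end up with different anchors for the same section.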

Please refer to huggingface/course#376 for details.

Expected Behavior

Ideally, the software should be language agnostic. The automatic TOC fill-in functionality should recognize headers in any language, extract them, and populate the TOC accordingly. As a first step, though, extracting the English anchors and automatically filling them into the other languages would be enough.

Current Behavior

As of now, when working with non-English languages, we have to add these custom anchors by hand, inside double brackets [[anchor]], while translating. If a heading is not handled this way, the internal TOC stays blank or fails to function. Contributors are left to create TOCs manually, which is tedious for long documents.
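For reference, the manual convention described above can be sketched as a small parser. This is only an illustration under the assumption that the custom anchor sits at the end of a Markdown heading line; `parse_heading` and the regular expression are hypothetical and not taken from the doc builder.

```python
import re
from typing import Optional, Tuple

# Assumed heading shape: "#"-style Markdown heading, optionally ending in
# a custom anchor written as [[anchor]].
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*(?:\[\[([^\[\]]+)\]\])?\s*$")


def parse_heading(line: str) -> Optional[Tuple[int, str, Optional[str]]]:
    """Return (level, title, custom_anchor) for a heading line, or None
    if the line is not a heading. custom_anchor is None when the heading
    carries no [[anchor]] suffix."""
    match = HEADING_RE.match(line)
    if match is None:
        return None
    hashes, title, anchor = match.groups()
    return len(hashes), title, anchor


print(parse_heading("## Giới thiệu[[introduction]]"))  # (2, 'Giới thiệu', 'introduction')
print(parse_heading("## Giới thiệu"))                  # (2, 'Giới thiệu', None)
print(parse_heading("plain paragraph"))                # None
```

A heading that returns None for the anchor is one a translator still has to annotate by hand today.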

Steps to Reproduce

  1. Navigate to https://huggingface.co/learn/nlp-course/vi/chapter1/1#gii-thiu
  2. Replace nlp-course/vi with nlp-course/en in the URL.

Possible Solution

This problem could potentially be resolved by inserting the custom English anchors at build time only, so translators would not need to add them to the source files.
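A rough sketch of what such a build-time step might look like follows. It rests on several assumptions that would need checking: the translated file has the same headings in the same order as the English file, headings are matched by position, and English anchors come either from an explicit [[anchor]] or from a simple ASCII slug. The names `fill_anchors`, `heading_indexes` and `english_slug` are hypothetical, and the sample Markdown is made up for illustration.

```python
import re
from typing import List

FENCE = "`" * 3  # Markdown code fence marker
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*(?:\[\[([^\[\]]+)\]\])?\s*$")


def english_slug(title: str) -> str:
    """Assumed slug rule for English titles (not the doc builder's own)."""
    kept = re.sub(r"[^a-z0-9 -]", "", title.lower())
    return re.sub(r"[ -]+", "-", kept.strip())


def heading_indexes(lines: List[str]) -> List[int]:
    """Indexes of heading lines, ignoring anything inside code fences."""
    in_fence = False
    found = []
    for i, line in enumerate(lines):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        elif not in_fence and HEADING_RE.match(line):
            found.append(i)
    return found


def fill_anchors(english_md: str, translated_md: str) -> str:
    """Append the English anchor to each translated heading that has no
    explicit [[anchor]], pairing headings by their position."""
    en_lines = english_md.splitlines()
    tr_lines = translated_md.splitlines()
    en_idx = heading_indexes(en_lines)
    tr_idx = heading_indexes(tr_lines)
    if len(en_idx) != len(tr_idx):
        raise ValueError("heading counts differ; cannot align by position")
    for e, t in zip(en_idx, tr_idx):
        _, en_title, en_anchor = HEADING_RE.match(en_lines[e]).groups()
        hashes, tr_title, tr_anchor = HEADING_RE.match(tr_lines[t]).groups()
        if tr_anchor is None:  # keep anchors translators already wrote
            anchor = en_anchor or english_slug(en_title)
            tr_lines[t] = f"{hashes} {tr_title}[[{anchor}]]"
    return "\n".join(tr_lines)


english = "# Introduction\n\nText\n\n## Setup[[setup-guide]]"
translated = "# Giới thiệu\n\nVăn bản\n\n## Cài đặt"
print(fill_anchors(english, translated))
```

Matching by position is the weakest assumption here: if a translation adds, drops or reorders a heading, the sketch raises an error rather than guessing, and a real implementation would need a better alignment strategy.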

I am hoping for a positive and prompt response. Please let me know if you need any further information. I intend to work on this issue with @jinnsp and @gabrielwithappy to send a PR.

Thank you for your attention to this matter.

@gabrielwithappy

gabrielwithappy commented Jul 14, 2023

Dear @wonhyeongseo
I do believe this will help many contributors! :-)
