Offloading the Hosting and Maintenance of the Non-English Docs #24984

Open
WestLangley opened this issue Nov 19, 2022 · 36 comments

Comments

@WestLangley
Collaborator

WestLangley commented Nov 19, 2022

I propose that the Non-English docs be hosted elsewhere -- and maintained by others.

Being responsible for ensuring the English and non-English docs remain in sync is not an efficient use of the three.js maintainers' time.

Furthermore, it is a burden that is only going to increase as more languages are supported and the docs become more complete.

This thread is for discussing how offloading the hosting and maintenance of the Non-English docs may best be accomplished.

It would be helpful if someone would be willing to take responsibility for this task. 😇

@takahirox
Collaborator

+1 for this

Being responsible for ensuring the English and non-English docs remain in sync is not an efficient use of the three.js maintainers' time.

I agree with this. I have been feeling the same.

@Mugen87
Collaborator

Mugen87 commented Nov 19, 2022

Being responsible for ensuring the English and non-English docs remain in sync

I think it's sufficient if we just state that the maintainers only manage the English docs from now on. In this way, we can still keep the other languages in the repository, which makes it easier to oversee the translation process and host everything.

@Mugen87
Collaborator

Mugen87 commented Nov 19, 2022

BTW: This discussion should also include the translations of the manual.

@Mugen87 Mugen87 added the Manual label Nov 19, 2022
@WestLangley
Collaborator Author

In this way, we can still keep the other languages in the repository.

No. That would be a poor decision from a management standpoint. The maintainers would then still be responsible for responding to bug reports and merge requests for the other languages.

@Mugen87
Collaborator

Mugen87 commented Nov 19, 2022

The maintenance in that regard is minor. I have no problem being personally responsible for this task.

@donmccurdy
Collaborator

donmccurdy commented Nov 21, 2022

For a mature project like three.js, high-quality translations (or internationalization / "i18n") feel very important, and I'd be sorry for that area of the project to become stale. Really grateful for the folks doing these translations. If we simply offload translation to "the community" and don't set up a good workflow for keeping things in sync, then I expect the translations may degrade quickly. Our hands-off approach to @types/three will not suffice here.

With @Mugen87, I am glad to be involved in this task in any way. This is important to me.

It may be that the current setup is not that great for translators either, and offloading (done well) might actually improve things. For example, a separate repository might allow maintainers of the translations to set up tooling that we haven't provided here, for quickly identifying:

  • new strings, not yet translated to language X
  • updated strings, changed since translation to language X
  • progress, % of documentation translated to language X
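
The three checks in the list above could be sketched roughly as follows. This is only an illustration under assumptions: the function name `translationStatus`, the key/value layout of the source strings, and the `sourceText` field (the English text a translation was made from) are all hypothetical and not part of any existing three.js tooling.

```javascript
// Hypothetical data layout (an assumption for this sketch):
//   source:     { key: currentEnglishText }
//   translated: { key: { sourceText: englishTextAtTranslationTime, text: translation } }
function translationStatus(source, translated) {
  const missing = [];   // new strings, not yet translated
  const outdated = [];  // English text changed since it was translated
  let upToDate = 0;

  for (const [key, text] of Object.entries(source)) {
    const entry = translated[key];
    if (entry === undefined) {
      missing.push(key);
    } else if (entry.sourceText !== text) {
      outdated.push(key);
    } else {
      upToDate++;
    }
  }

  const total = Object.keys(source).length;
  // Progress counts only strings that are translated and still current.
  const percent = total === 0 ? 100 : Math.round((upToDate / total) * 100);
  return { missing, outdated, percent };
}
```

Storing the English text alongside each translation is one simple way to detect changes; a hash of the text would serve the same purpose with less storage.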

I'd love to hear from people doing these translations: What works today? What doesn't? How do you feel having translations occur in a separate repository, possibly with more dedicated tooling to support that? Are there tools for this you've used before, and liked?

@WestLangley
Collaborator Author

I'd be sorry for that area of the project to become stale.

They are already stale.

@Mugen87
Collaborator

Mugen87 commented Feb 8, 2023

The maintenance work I do is straightforward right now because when a single documentation page is updated, the changes can often be copy/pasted with an editor (new code example, updated signature etc.). But granted it is not a scalable solution to copy each documentation page and translate the contents by hand. However, using a real translation system with placeholders in the documentation templates and a language database seems out of scope.

I'm not an expert in this field but would it be possible to avoid any manual translation and just ask the developer to use a tool e.g. like Google Translate? Are the results considered to be good enough now?

I've tested this today with some guides from the documentation and the results are about right (at least when you translate into German). I mean the tool translates some words that it shouldn't (like render target) and the structure of a sentence is sometimes a bit weird but at least you understand the actual content of the article.

I wonder if there is a way to improve the English docs so a translation tool can produce better results. Like preventing the translation of certain words or defining their syntax (e.g. that "render targets" should not be translated and is the plural of "render target"). Maybe it would be possible to tweak the results up to a point where it can be considered as "good enough" and a manual translation isn't necessary anymore.

@donmccurdy
Collaborator

donmccurdy commented Feb 8, 2023

However, using a real translation system with placeholders in the documentation templates and a language database seems out of scope.

I wish I knew a practical way to do this, but, yeah. I'm not sure we have the resources.

I wonder if there is a way to improve the English docs so a translation tool can produce better results

One option would be to change how we're using <code> blocks today. We use them for longform code blocks, but that's really what <pre><code> is for:

<pre><code>
import * as THREE from 'three';

...
</code></pre>

If we change that, we could start using <code/> for inline tokens instead, like:

<p>
  This guide explains how to use <code>GLTFLoader</code>.
</p>

Then we just need to prevent translation from affecting code tags, but limit syntax highlighting to <pre><code> blocks.
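
A rough sketch of that split, assuming a page script that walks the document after load. The function name `prepareCodeElements` is hypothetical; only the standard DOM calls `querySelectorAll`, `setAttribute`, `parentElement`, and `tagName` are relied on.

```javascript
// Marks every <code> element as non-translatable and returns only the
// ones nested directly in <pre>, which are the candidates for syntax
// highlighting. `root` is typically `document`.
function prepareCodeElements(root) {
  const toHighlight = [];
  for (const el of root.querySelectorAll('code')) {
    // Inline tokens and long code blocks are both protected from translation.
    el.setAttribute('translate', 'no');
    const parent = el.parentElement;
    // In HTML documents tagName is reported in upper case.
    if (parent !== null && parent.tagName === 'PRE') {
      toHighlight.push(el);
    }
  }
  return toHighlight;
}
```

In a browser this would be called as `prepareCodeElements( document )`, with the returned elements handed to whatever highlighter the page uses.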

@Mugen87
Collaborator

Mugen87 commented Feb 8, 2023

It seems doing this:

 <span translate="no">render target</span>

prevents Google Translate from translating the words into another language. I've tested this with Chrome and the built-in translating option. The translate attribute is not Google Translate specific so other tools should recognize it as well.
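
If the spans were added by a script instead of by hand, it might look like the sketch below. The helper names `escapeRegExp` and `wrapUntranslatable` and the term list are hypothetical, and the function works on plain text only: it makes no attempt to skip text that already sits inside tags or attributes.

```javascript
// Escapes characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wraps each occurrence of a listed term in <span translate="no">.
// Terms are matched longest first, so "render targets" is wrapped as a
// whole and not as "render target" followed by a stray "s".
function wrapUntranslatable(text, terms) {
  if (terms.length === 0) return text;
  const ordered = [...terms].sort((a, b) => b.length - a.length);
  const pattern = new RegExp(
    `\\b(?:${ordered.map(escapeRegExp).join('|')})\\b`,
    'gi'
  );
  return text.replace(pattern, (match) => `<span translate="no">${match}</span>`);
}
```

Matching is case-insensitive, and the original capitalization of each occurrence is kept in the output.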

@Mugen87
Collaborator

Mugen87 commented Feb 8, 2023

At least Google Translate already does not translate the code examples in the documentation. I guess this happens because the code blocks already have the translate="no" attribute set.

e.setAttribute( 'translate', 'no' );

@donmccurdy
Collaborator

Yeah, it may be more practical to use <code /> blocks inline, and let the script add translate="no", rather than using a span with attributes each time.

@Mugen87
Collaborator

Mugen87 commented Feb 8, 2023

Um, I'm just not sure if it's right to use the <code /> element like that. AFAIK, it's intended to mark up computer code in HTML files. We just want to mark specific text so that it is not processed by the translator. That sounds a bit like misuse to me. To me, the <span> tag with an additional attribute feels like the cleaner approach.

@donmccurdy
Collaborator

donmccurdy commented Feb 8, 2023

I only mean using it when we're referring to classes, methods, and variables. It's an inline element, not a block element, in normal usage:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/code

This helps with styling too, I'm often putting words like THREE.Group in italics to identify them, but we could use consistent styling with semantic markup instead.

But yeah, if we're trying to prevent a non-code word or phrase from being translated, then <span translate="no" /> is probably better.

@LeviPesin
Contributor

LeviPesin commented Feb 9, 2023

I'm not an expert in this field but would it be possible to avoid any manual translation and just ask the developer to use a tool e.g. like Google Translate? Are the results considered to be good enough now?

Depends on the output language -- e.g. for Russian it's generally good enough for a few words or a phrase, but a paragraph can have some problems. But it is understandable.

@Mugen87
Collaborator

Mugen87 commented Feb 9, 2023

But it is understandable.

This is what I'd like to clarify. Can we consider the quality of translating tools as "good enough"? And is there a way to improve the English docs such that the output is improved.

For example when I prevent the translation of the term "render target", the German translation does not know whether the term is singular or plural. The resulting structure of a sentence can sound strange then.

I wonder if an additional Glossary could help here: https://cloud.google.com/translate/docs/advanced/glossary

Has anybody already had experience with that technique?

@LeviPesin
Contributor

Should we actually prevent translation of terms like render targets? Maybe only proper names like algorithm names or class names (CSM, Mesh, etc.) should not be translated... I'm not sure how I would translate "render target" to Russian -- both a literal translation (like "цель рендера", which is more "the target of the rendering" and therefore wrong) and just leaving it untranslated seem like bad options to me. It may even be more appropriate to use transcription and "invent" a new term like "рендертаргет"...

@Mugen87
Collaborator

Mugen87 commented Feb 9, 2023

Should we actually prevent translation of terms like render targets?

That was discussed above. A Glossary could also help in this context. It seems a more effective way than just using <span translate="no" />. Just wondering if someone already used this feature.

@mrdoob
Owner

mrdoob commented Mar 15, 2023

Google Translate has a service for translating sites, but it seems to break pretty badly...

https://threejs-org.translate.goog/docs/?_x_tr_sl=en&_x_tr_tl=es&_x_tr_hl=en-US&_x_tr_pto=wapp#manual/en/introduction/Creating-a-scene

@AlexandreAllard
Contributor

AlexandreAllard commented Mar 15, 2023

This is what I'd like to clarify. Can we consider the quality of translating tools as "good enough"? And is there a way to improve the English docs such that the output is improved.

Hello, I'm the one who started translating the documentation into French. I feel that translation tools are absolutely not "good enough". In French they produce nonsense and confuse the reader. Recently one contributor translated some pages into French, and after 10 seconds I knew they were "Google Translated" because of all the weird sentences they contained.

ThreeJS has a large subset of specific words that can't be easily translated, that shouldn't be translated or that need a human interpretation to be translated, so I think that the documentation is doomed to be of poor/average quality if we use translating tools. A page like the "color management" one, cannot be translated without someone picking the best/understandable sentences/words.

But on the other hand, I understand that the automated translation may be a way of having a neat translation-workflow.

@AngyDev
Contributor

AngyDev commented Mar 15, 2023

Hello, I don't mind having a separate repository if it's easier to manage; that way the translations could be maintained by those who are actually interested in doing it. I don't think Google Translate is the best tool for automating translations: its output is not always clear, especially given the topic. Instead of relying on it, a system could be created so that, when changes are made to the English documentation, an issue is opened for every other translation, and those who want to can update their chosen language.
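
A minimal sketch of the "one issue per language" idea, assuming the English pages live under `docs/api/en/` and `docs/manual/en/` with sibling folders per language. That layout, the function name `translationTasks`, and the issue title format are assumptions for illustration; actually opening the issues (for example from a CI job) is left out.

```javascript
// Given the files changed in a commit and the supported language codes,
// returns one follow-up task per affected page and language.
function translationTasks(changedPaths, languages) {
  const tasks = [];
  for (const path of changedPaths) {
    // Only changes to English documentation pages trigger tasks.
    const match = path.match(/^(docs\/(?:api|manual))\/en\/(.+)$/);
    if (match === null) continue;
    const [, base, rest] = match;
    for (const lang of languages) {
      tasks.push({
        lang,
        path: `${base}/${lang}/${rest}`,
        title: `[${lang}] Update translation of ${rest}`,
      });
    }
  }
  return tasks;
}
```

Each returned task names the translated file that needs attention, so a translator can pick up only the entries for their own language.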

Another translator, better than Google Translate, is DeepL Translate. Have you heard of it?

@Mugen87
Collaborator

Mugen87 commented Mar 15, 2023

I've just mentioned Google Translate since I assume it is what Chrome is internally using when you open the context menu and use the translation option.

Does DeepL Translate offer a service that lets you translate entire web pages like in #24984 (comment)?

@AngyDev
Contributor

AngyDev commented Mar 16, 2023

Yes, there are APIs that offer document translation. The only catch is that there is a free plan and a pro plan; with the free one, it's possible to translate up to 500,000 characters per month.

@Mael-Kehl
Contributor

This is what I'd like to clarify. Can we consider the quality of translating tools as "good enough"? And is there a way to improve the English docs such that the output is improved.

Hello, I'm the one who started translating the documentation into French. I feel that translation tools are absolutely not "good enough". In French they produce nonsense and confuse the reader. Recently one contributor translated some pages into French, and after 10 seconds I knew they were "Google Translated" because of all the weird sentences they contained.

ThreeJS has a large subset of specific words that can't be easily translated, that shouldn't be translated or that need a human interpretation to be translated, so I think that the documentation is doomed to be of poor/average quality if we use translating tools. A page like the "color management" one, cannot be translated without someone picking the best/understandable sentences/words.

But on the other hand, I understand that the automated translation may be a way of having a neat translation-workflow.

Hello there. It seems I've been hit by a stray bullet in this discussion. I used DeepL on pages like the geometries and only needed to edit about 20% of the results because they were pretty consistent. However, in languages such as French you always have to check the translator's output because of the specific words you mentioned.
Based on @AngyDev's answer, I suggest using a default translator whose translations can be verified by a contributor. A check/validation icon could indicate that a page has been checked by a human, so readers can adjust how critically they read it.

@AlexandreAllard
Contributor

Hey mate, no offense intended; there is no need to justify yourself or your workflow for translating the documentation. ThreeJS is only love and geometries, nothing else haha.
My only point is that if, without speaking with you, I was able to tell that you had used an automated translation tool, then the output isn't natural or correct enough to be enjoyable for a user. That was my initial point.

@donmccurdy
Collaborator

donmccurdy commented Jul 24, 2023

Proposal:

I think it's a promising direction: however, a small but important blocker is described at the end.

@WSPluta

WSPluta commented Dec 13, 2023

Did you see how Godot manages translation and documentation? We should set up the same solution for three.js.

https://hosted.weblate.org/projects/godot-engine/godot-docs/

@WestLangley
Collaborator Author

This suggestion is not getting any traction. Closing due to lack of interest.

@Mugen87
Collaborator

Mugen87 commented Sep 8, 2024

Please let's not close the issue. It is an important one and I strongly support what has been suggested in #24984 (comment).

I think it is just a matter of resources that nobody is working on this issue at the moment. Many devs are currently focused on WebGPU related tasks.

@Methuselah96
Contributor

Methuselah96 commented Sep 8, 2024

I would love to help in any way I can, primarily if it makes it easier to maintain doc comments in the TypeScript definitions, since right now I have to manually copy/paste those and reformat them into a JSDoc. I'm not sure that we can fully generate TypeScript definitions directly from the JSDoc quite yet, but I think it could be very useful as a starting point for the TS types.

If the current proposal is acceptable to everyone, I'd be happy to start working on it. I can put together a proof-of-concept branch if that would be helpful. Are we okay with just adding JSDoc comments to start, or do we want to make sure the doc generation part works decently well first?

@Methuselah96
Contributor

Here's a quick proof-of-concept: #29357. In this case the types are simple enough that we could use the generated .d.ts without any modifications, which would be awesome.

@donmccurdy
Collaborator

I'm not sure that we can fully generate TypeScript definitions directly from the JSDoc quite yet, ...

Yes, this was the 'blocker' I mentioned in my comment – I'm not sure we could generate acceptable TypeScript definitions from JSDoc ... but I see no reason that generating documentation from JSDoc shouldn't work well. If that's still enough of a win to move forward, then let's do it.

I would also be grateful for a sanity-check from one or more people who are involved in contributing translations – does contributing translated strings into a more machine-readable file (.csv, .yml, or .sqlite?) sound like an improvement over the current structure of writing in .html files? I think it would be, because it's much easier to identify English source strings that have yet to be translated, or that have changed since the last translation ... but I'm not translating the content myself, and I want to be sure we're making that easier not harder... 😅

@Methuselah96
Contributor

Methuselah96 commented Sep 13, 2024

If I understand correctly, the main pain point is contributors and maintainers needing to keep the non-English docs in sync, so it seems like anything that reduces the need to update the docs in 7 different places would be an improvement.

@donmccurdy Do you have a plan for generating the English-language API docs from JSDoc and/or can I help getting that in place? It seems like it would be nice to have that ready before we start adding the JSDoc so that we're not adding to the duplication of docs. Or if we're fine with the duplication, I can start adding JSDoc, it would still be useful for three-ts-types.

@donmccurdy
Collaborator

@Methuselah96 I think that's a correct statement of the pain point, yes!

If we were confident about getting TS definitions too, I had been thinking of using ts-morph (I have some experience with this) to parse the definitions into a simple JSON representation of the three.js public API (for example, api.json). But there may be other/better tools for extracting a JSON representation of the API from the JSDoc, too.

Then, a script should extract all English-language strings for translation (docs.en.csv). Non-English language strings would appear in other files (docs.cn.csv). Finally, documentation would be built by a script that processes the JSON, looks up translations from the CSVs, and outputs HTML files for each language in a build step.
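
To make those three steps concrete, here is a small sketch. Everything in it is an assumption for illustration: the shape of `api.json`, the string keys, the CSV columns, and the `Widget` class are hypothetical, and the HTML output is not escaped or styled.

```javascript
// Hypothetical JSON representation of the public API (stand-in for api.json).
const api = {
  classes: [
    {
      name: 'Widget',
      description: 'An example class.',
      methods: [{ name: 'spin', description: 'Spins the widget.' }],
    },
  ],
};

// Step 1: collect every English string under a stable key.
function extractStrings(api) {
  const strings = new Map();
  for (const cls of api.classes) {
    strings.set(cls.name, cls.description);
    for (const method of cls.methods) {
      strings.set(`${cls.name}.${method.name}`, method.description);
    }
  }
  return strings;
}

// Step 2: serialize the strings as CSV (stand-in for docs.en.csv).
function toCsv(strings) {
  const quote = (value) => `"${value.replace(/"/g, '""')}"`;
  const rows = ['key,text'];
  for (const [key, text] of strings) {
    rows.push(`${quote(key)},${quote(text)}`);
  }
  return rows.join('\n');
}

// Step 3: build one page, falling back to English for untranslated keys.
function renderClass(cls, english, translated) {
  const t = (key) => translated.get(key) ?? english.get(key);
  const methods = cls.methods
    .map((m) => `<h3>${m.name}</h3><p>${t(cls.name + '.' + m.name)}</p>`)
    .join('');
  return `<h1>${cls.name}</h1><p>${t(cls.name)}</p>${methods}`;
}
```

The English fallback in step 3 is a design choice of this sketch: a partially translated page stays complete, at the cost of mixing languages until the missing strings are filled in.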

I think it would be good to have a proof-of-concept for the full process before we begin committing a lot of JSDoc to the three.js repository, but adding sample JSDoc for one or two files sooner seems OK to me if that's helpful.

@Methuselah96
Contributor

Methuselah96 commented Sep 15, 2024

Per this comment, I think we want to steer clear of defining types that are specific to TS (e.g., generics/type parameters or conditional types) in this repo, and focus only on how JSDoc can be used to generate docs. Then in three-ts-types I can use the JSDoc types as a starting point, and add TS-specific things like generics/type parameters (used in classes like EventDispatcher and BufferGeometry) and conditional types (used in TSLCore) as patches on top of that.
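
For illustration, JSDoc of that kind might look like the sketch below. `SimpleDispatcher` is a hypothetical class written for this example, not the actual three.js `EventDispatcher`; it uses only plain JSDoc types, and a types repository could layer something like a generic event-map parameter on top.

```javascript
/**
 * Hypothetical minimal event dispatcher, documented with plain JSDoc
 * types only (no generics or conditional types).
 */
class SimpleDispatcher {
  constructor() {
    /** @type {Map<string, Function[]>} */
    this.listeners = new Map();
  }

  /**
   * Registers a listener for an event type.
   *
   * @param {string} type - The event type to listen for.
   * @param {Function} listener - Called with the event object.
   * @returns {void}
   */
  addEventListener(type, listener) {
    const list = this.listeners.get(type) ?? [];
    list.push(listener);
    this.listeners.set(type, list);
  }

  /**
   * Calls every listener registered for the event's type.
   *
   * @param {{type: string}} event - The event to dispatch.
   * @returns {number} The number of listeners that were called.
   */
  dispatchEvent(event) {
    const list = this.listeners.get(event.type) ?? [];
    for (const listener of list) listener(event);
    return list.length;
  }
}
```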
