I think this is a good idea, but even among native English speakers, a lot of readers seem to ignore, disbelieve, or misunderstand the places where people warn about the importance and nuance of subtle details (e.g. why you should assume that none of the obvious ideas that come to mind for how to solve or do X will work by default). Those are exactly the kinds of things that are likely to get somewhat garbled in translation. (Note: I’m not asserting that all such warnings are true, but I do think they are important.) I’m not sure what kinds of disclaimers or warnings should be included with this kind of translated content, but I would definitely hope there are some.
One cheap test might be to translate into Dutch and then back to English with the same prompt twice. How garbled does the output end up? Does it elide important nuances?
(Though this might overestimate the quality given that the original pieces in English are likely in the training data. If you’re fluent in another language, I’d be quite curious about the results in the target language).
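For anyone who wants to try the round-trip test, here is a minimal sketch in Python. It assumes you supply your own `translate(text, source_lang, target_lang)` function wrapping whatever model and prompt you are using; that function, the language codes, and the helper names below are all hypothetical, not part of any particular library. It also reads "twice" as two full English-to-Dutch-to-English round trips, which is only one interpretation. The word-level similarity score is a crude proxy for garbling and will not catch a sentence that stays fluent while changing meaning, so the `dropped_terms` check on hand-picked phrases is probably the more useful half.

```python
from difflib import SequenceMatcher
from typing import Callable, List

# Hypothetical signature: (text, source_lang, target_lang) -> translated text.
# Plug in whatever model call and prompt you actually use.
Translator = Callable[[str, str, str], str]


def round_trip(text: str, translate: Translator, pivot: str = "nl",
               source: str = "en", rounds: int = 2) -> str:
    """Translate source -> pivot -> source, repeated `rounds` times."""
    current = text
    for _ in range(rounds):
        current = translate(current, source, pivot)
        current = translate(current, pivot, source)
    return current


def similarity(original: str, round_tripped: str) -> float:
    """Word-level similarity in [0, 1]; a rough proxy, not a quality metric."""
    return SequenceMatcher(None, original.split(), round_tripped.split()).ratio()


def dropped_terms(original: str, round_tripped: str,
                  key_terms: List[str]) -> List[str]:
    """Return the hand-picked terms present in the original but missing afterwards."""
    before = original.lower()
    after = round_tripped.lower()
    return [t for t in key_terms if t.lower() in before and t.lower() not in after]
```

To use it, pass a paragraph with a caveat you care about, plus a short list of the hedging phrases in it (say, "by default" or "not asserting"), and look at which ones `dropped_terms` reports as lost after the round trip.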
That is a sensible test! In my own testing, though, I just asked a co-worker who is fluent in Dutch to look over the results, and he was satisfied. There were some problems, but it was basically OK.
That’s good news!
Depending on how high a bar you want to meet, it may be worthwhile to also try with a language farther from English than Dutch is. If I really wanted to be thorough, I might try to take two bilingual people, give one the English and one the non-English, and ask them to discuss the results with one another and/or with you. Or, even better, give it to people who don’t speak English, and don’t know it was translated from English, and therefore can’t subconsciously use knowledge of English to correct translation problems. But for a lot of use cases I agree that these would be unnecessary.
As in, the warnings will get garbled, or the subtle details will? I agree the subtle details are more likely to be garbled, which would suck for, say, a young alignment researcher trying to deeply understand some translated article. Still seems plausibly worth it? Even more so if the person reading doesn’t intend to be a researcher or to gain a deep understanding.
(I hope this doesn’t show up as an answer to my own question...)
The latter, and for the rest, I agree on all counts.