Expansive translations: considerations and possibilities

ozziegooen18 Sep 2020 15:39 UTC

43 points

Distillation & Pedagogy Inferential Distance World Modeling World Optimization

A crowd probably best served by a wide variety of translations

TLDR: Language translation is a decent first step for written works, but the ideal looks more like an empathetic personal tutor. There’s a lot to do in-between, both in the near term with human labor, and the longer term with Machine Learning.

Epistemic Status: I’m not an experienced researcher in this field. I’ve listened to a few audiobooks on language and thought about the area, but I’m sure I’m failing to reference many key papers and books. I’m fairly uncertain about all of this, so I suggest taking my opinion very lightly (if at all) and thinking through the issue yourself. If you know of other materials I or other readers should know about, references would be appreciated.

Feedback Preferences: I’m sure I’m wildly wrong on many things. Feedback is highly appreciated. I will take little offense at rude comments, so go wild. That said, don’t expect long responses.

I’m not sure how to best write this, so I’ll divide things into a few vignettes.

Some rough statements which I think are misguided:

“Our book is available in 20 countries, so is accessible to 3 Billion people.”

“Once GPT-n can translate perfectly between languages, everyone will be able to communicate with each other”

“I’m not going to try to rephrase or re-explain this (technical) book, because you really should just read it directly”

“I don’t see why you wrote up those concepts, they were explained in more detail in a previous post”

Retellings and their skeptics:

There’s been an interesting trend recently of books that retell the ideas of other, older books. See:
How Proust Can Change Your Life
How Adam Smith Can Change Your Life
A Jane Austen Education: How Six Novels Taught Me About Love, Friendship, and the Things That Really Matter
And all the other books mentioned in this post.

I haven’t seen a better term for these, so I’ll refer to these as “retellings”.

More near to our community, Robert Wiblin has recently posted a piece on “Ugh fields”, which acts as a retelling of LessWrong posts from 10 years ago.

To some audiences, these retellings are not only a waste of time, but lossy explanations of superior sources. What if readers stop at the retellings and skip the sources, and are left with false impressions? Clearly the solution is to point readers to the source and skip the in-between. Maybe add a minor amount of context, but to be fair, the people with the time to attempt this are generally not capable of doing a good enough job to not cause net harm.

But why not stop there? Language translation is also a lossy process. Not only are languages famously challenging to translate, but sometimes substantive modifications are introduced. The Spanish Harry Potter translation changed a pet from a toad to a turtle.[1] Perhaps the ideal is to ask that people learn every language they are interested in reading content in, to ensure they are not misled by deceitful translations.

An expansive view of translation:

I’m going to stop here and get to postulates:

1. There’s a fuzzy line between language translation and retelling.

Just because two people speak English doesn’t mean they think in the same words. There’s a whole lot that goes into retelling that’s different from modifying the language.

2. There’s a fuzzy line between language translation and linguistic variety translation.

Linguistics has the concept of varieties; languages are one type, but so are dialects, styles, and several other categories I honestly wasn’t familiar with before working on this piece. Just as there can be translation between languages, it makes total sense to have translation between these varieties as well. Google Translate already supports a few specific dialects, but stops there.

3. There’s a fuzzy line between linguistic varieties, inferential distance, and worldviews.

Even if a translation matches one’s exact preferred language, dialect, register, lexicon, and style, they could be left with a distance of inference (or education) and worldview. With regards to inferential distance, specific topics could be expanded upon or contracted for different audiences. With regards to worldview (my quick word for “comprehensive set of beliefs”), topics could be discussed that best fit with a given worldview, even if there is some level of exclusion. The topics could also be presented with evidence for how they fit into one’s worldview.

4. Given that there are fuzzy lines between all of the above, it seems fair to conclude that translation along dimensions other than “language” is quite reasonable.

I think we can consider “retellings” as equivalent to a liberal definition of “translation”. How Proust Can Change Your Life can be viewed as a translation of Proust’s work for a specific cluster of modern audiences. This would indicate that we may have far too few works like this, not too many. Perhaps we could use a “How Proust Changed My Life, as a 60th-percentile-in-Math 10th Grader in Saint Mill’s Academy.”[2]

5. Even if the same effective message is recreated, there are aspects of its delivery that matter.

If Bill Gates rewrote Superintelligence in his own words, it would be a big deal, even if the writing was just as effective as that of Superintelligence. The fact that Bill Gates both took the time to write such a work, and took the risk and opportunity cost of publicizing it, is a valuable signal. (This is the point I’m least excited about, but wanted to point it out for completeness)

Why are modern translations so narrow?

So, if translation makes sense outside of language translation, why do book publishers stop at language translation?

The obvious answer is cost, but I think some less obvious answers are tradition and the difficulty of categorization. It would seem weird to have a specific translation of Harry Potter where the characters all spoke in a specific Tumblr vernacular or where Harry Potter grew up with Amish parents. Shakespeare’s plays use Early Modern English, but it would seem juvenile for us to read them in Modern English. If Wikipedia were to attempt a new language geared for “analytical philosophers”, I’m sure any definition and separation would be met with a fair bit of controversy.

Things are changing. First, some of us are okay being weird if the costs are justified. Most novel ideas seem weird at first, but there are still groups pushing them forward.

Second, Machine Learning is progressing rapidly. It seems possible that if ML could succeed in language translation, it could later get to completely personalized translation. Imagine a system where when you land on a Wikipedia page, it translates it into a version optimized for you at that time. The examples change to things in your life, and any concepts difficult for you get explained in detail. It would be like a highly cognitively empathetic personal teacher.

Expansive Translations and Power

I think we can call what I’m referring to “expansive translations”, or “highly expansive translations”, as distinct from narrow or language translations.

One intellectual criticism of expansive translations is that they could be used by powerful actors to manipulate culture in their favor. Expansive translation is a powerful tool and any power increase in malicious hands could produce disastrous outcomes. Perhaps The Crusades could have been avoided if religious figures weren’t allowed to stray from the original texts. Expansive translations allow for censorship when controlled by authorities. I think the crux here comes down to a more fundamental opinion on the potential of technology and intellectual progress. This gets messy, so I’m going to table this for now and return if there are readers who care about it.

Grab Bag of Related Thoughts

There are probably thousands of books and what amounts to billions of hours of teaching to explain the same small sets of religious teachings. I.e., I’m sure one could come up with subsets of Christian thought on which thousands of books and hours of local teaching (religious sermons and similar) were focused. I imagine that similar educational endeavors should expect proportional costs.
Whenever any new fad comes to Silicon Valley, it seems like everyone has to re-explain it. Search “What is Bitcoin?” or “What is Intermittent Fasting?” for examples. I remember being amused at digital currency magazines that would include multiple articles to define Bitcoin, in the same magazine. This might appear wasteful, and I’m sure it often is, but it seems to serve a bunch of purposes that are hard to get around.
From above, the “Chesterton’s Fence” thing to do here seems to be “have lots of explanations for different audiences”. This could mean things like having articles explaining even seemingly simple concepts on LessWrong and the EA Forum, for those audiences. Perhaps we should have our own articles describing intermittent fasting or Bitcoin.
In popular media, communicators seem to specialize in audiences more than topics. A “golf magazine” really writes anything of interest to a specific “golf interested community”, rather than everything about golfing to a broad set of audiences.
If one has a message they want to be spread widely, it would be near impossible to personally advocate it to all of these groups as well as existing communicators do. It could be better to try to partner with or encourage the existing communicators.
CFAR used the term “murphyjitsu” instead of “pre-mortem”, even though they are the same thing (I think). They knew this, but did it because “murphyjitsu” was preferable to their community. This used to really annoy me because I was worried that this would further disconnect them from the literature, but I’ve grown to support the decision. As long as the link between the two is clear enough, the benefit of using a custom definition seems much greater than the costs.
Right now one of the greatest challenges to expansive translation seems to be poor terminological coordination and capabilities. For example, Ayn Rand originally wanted to use the name “existentialism” for her work, but later changed it to “objectivism” as “existentialism” was already taken.[3] Perhaps it would have been better for her to call it “existentialism”, which would auto-translate to “existentialism(2)” when there could be sources of confusion.
I imagine it could be highly valuable to be able to experiment publicly with terminology. Right now definitional work that touches existing fields feels like stepping on their toes, but lots of important terms are a mess between academic fields. It would be great to have experimenters iterate and test out a bunch of options in limited settings, but in a systematic and intentional way.
I’m a big fan of YouTube summaries. I’ve learned almost all of my knowledge from textbooks (which are summaries of other sources), Wikipedia, and other non-source teaching methods. Asking that people read all the original sources is not at all a scalable solution to growing fields.
All of the reinterpretations of Shakespeare’s plays are other good examples of retellings. Maybe I should have started this post with those instead of those trite pop-lit examples.

[1] https://harrypotter.fandom.com/wiki/Trevor

[2] This brings to mind “Chicken soup for the X soul”

[3] https://en.wikipedia.org/wiki/Ayn_Rand#Atlas_Shrugged_and_Objectivism

Note: The image on the top is by Taylor Heery and was posted on Unsplash. Link here. I used it because the New York subway system is what I think of when I imagine a bunch of people with very different backgrounds meeting each other. Most speak English, but are diverse in a wide variety of ways that could hypothetically use a large set of customized translations.

What links here?

Misha Yagudin and Ozzie Gooen Discuss LLMs and Effective Altruism by Ozzie Gooen (EA Forum; 6 Jan 2023 22:59 UTC; 47 points)

mingyuan 18 Sep 2020 22:18 UTC
17 points
Interesting post! This comment is going to be more some random thoughts I had while reading than a proper response.
First, translation of Shakespeare is a super interesting arena. No Fear Shakespeare ‘translates’ the original plays into modern English, which I admit is a helpful idea, but there’s a problem with these beyond just the feeling of being juvenile: the ‘translations’ are often wrong, sometimes blatantly so. One such line I remember well is from Hamlet. The original is:
Horatio, or I do forget myself!
which became
Nice to see you again, Horatio—that is your name, right?
The No Fear Shakespeare version has clearly translated this as if it were, “Horatio—or do I forget myself?” (all punctuation in Shakespeare is arbitrary, so the important thing here is “I do” vs “do I”). The NFS version makes absolutely no sense in the context of the play, because Horatio is Hamlet’s best friend! At the time of this meeting they haven’t seen each other in a few months, but the reason Horatio is in Denmark at all is to be a companion to Hamlet, because they’re BFFs. Hamlet definitely fucking knows his name. My interpretation of the original line is something like “Welcome Horatio, who I would as soon forget as my own self,” i.e. basically the exact opposite of what NFS said.
---
Why are modern translations so narrow? What level of nuance would you like them to capture? A lot of the beauty—and a lot of the meaning—in Shakespeare is in the specific use of language (rhythm, imagery, sound, antithesis, repetition, et cetera). Arguably you just can’t get every iota of the meaning out of Shakespeare without the exact original words. So it seems like it necessarily has to be a spectrum. (Incidentally this reminds me of the question of whole brain emulation: at what level of resolution does the emulation have sufficient fidelity to ‘be’ you, if you can’t replicate the entire brain quark for quark?)
In translations of poetry—something I have amateur experience with—you have a lot of decisions to make. Do you try to preserve both the meter and the rhyme scheme? That already severely restricts your choice of words, making it harder to get across the explicit meaning. And then what about alliteration, assonance, in-rhymes? What about double (or triple) meanings? (when these latter take a backseat the omitted meaning is often added as a footnote or annotation). What if there’s cultural information that doesn’t carry over well to your intended audience? And if you’ve taken all the foregoing considerations into account, what’s the likelihood that it’s even possible for you to end up with something that feels true to the original (in the sense that poetry evokes certain images and emotions)?
There’s a great scene in Henry IV Part 2 that’s almost entirely sexual puns*. If you had to write a Spanish version of this scene, would it be better to preserve the literal (primary) meaning of each line? Or should you write a new scene that has the same basic plot, but where you’re at liberty to change the literal meanings of the lines so you can fit in more sexual puns that actually make sense in Spanish? From what I’ve seen most translators choose the former (though perhaps this is specific to Shakespeare due to the crazy reverence in which people hold his works). I think this is a mistake artistically, but also there’s clearly a trade-off here. Without incredibly intensive labor, you can’t have it both ways.
I don’t know much about ML so perhaps it would be a lot easier for it to solve these problems than it is for a human. But given the current strategy of training on corpora, and the fact that I doubt there are more than a handful of good examples of how to balance all of these considerations, I’m not sure how an ML system would learn to do it. But again, I’m not a computers person.
---
Although I was talking about poetry, I think a lot of the same broad considerations apply to the idea of personalized translations (e.g. of Wikipedia). I can see an ML system learning my idiolect just by listening to me 24/7, but learning where my inferential distances are (for lack of better phrasing) seems quite a lot harder? Though… perhaps I am just underestimating GPT-N?
---

*The link doesn’t do it justice at all, but just for example look at the words ‘stab’, ‘enter’, ‘lusty’, and ‘fist’, and the delightful line ‘do me, do me, do me your offices’.
What links here?
- mingyuan's comment on Sunzi’s《Methods of War》- Introduction by lsusr (18 Nov 2020 19:31 UTC; 12 points)
- ozziegooen 18 Sep 2020 22:53 UTC
  7 points
  No Fear Shakespeare ‘translates’ the original plays into modern English, which I admit is a helpful idea, but there’s a problem with these beyond just the feeling of being juvenile: the ‘translations’ are often wrong, sometimes blatantly so.
  Agreed that translations are often wrong, but I don’t think this is reason to give up on them! Translations between languages often fail, but I’m thankful we have them.
  
  The alternative to translation that I was taught in school about Shakespeare was to just give us the source and have us figure it out. I’m absolutely sure we did a terrible job at it, even worse than that bad translation. I don’t remember ever having a lesson on how to translate Early Modern English to Modern English. I think I barely understood how large the difference was, let alone interpreted it correctly.
  
  My knowledge on this topic comes from the Great Courses course “The Story of Human Language” by John McWhorter. Lecture 7 is great and goes into detail on the topic.
  
  Some quotes, transcribed here:
  “We don’t process Shakespeare as readily as we often suppose. With all humility I think there is a kind of mythology—a bit of a hoax—surrounding our reception of Shakespeare as educated people. And I will openly admit that, except when I have read a Shakespeare play—and this is particularly the tragedies—when I go and hear it, cold, at normal speed, I don’t understand enough to make the evening worth it.
  
  “I don’t like to admit it—I learned long ago that you’re not supposed to say so—but it’s true. And even as somebody who loves languages and is familiar with English and all its historical layers, I have seen The Tempest not once, not twice, but three times, never having gotten down to reading that particular play, I have never known what in the world was going on in that play.
  
  “And I seriously doubt if I am alone. And it’s not that the language is poetry. Poetry’s fine. It’s because Shakespeare in many ways was not writing in the language that I am familiar with. It’s been many many centuries and the language has changed.
  
  “One friend of mine said that the only time he had gone to Shakespeare and really genuinely understood it the way we understand a play by O’Neill or by Tony Kushner is when he saw Hamlet in France because it was in relatively modern French and he was very good at French.”
  - mingyuan 19 Sep 2020 1:22 UTC
    8 points
    Since we’re basically just on a Shakespeare tangent now, and I really like talking about Shakespeare—I was lucky to have an extremely thorough education in Early Modern English starting from a very young age (starting around 7, I think). Essentially, my theater did Shakespeare completely uncut, and before memorizing your lines you had to listen to cassette tapes where the founder of the theater took you through the full meaning of every single line. I think he recorded these with multiple sources open in front of him, and he’d already devoted decades of study to Shakespeare by the time I was born. And then school gave me a thorough education in literary analysis, and putting all that together, I claim I have a better understanding of Shakespeare than the vast majority of Shakespearean actors, and probably the majority of Shakespeare scholars as well. (I believe most professional Shakespearean actors have no fucking clue what they’re saying most of the time, and how in heck is the audience supposed to understand what’s going on if the actors don’t?)
    My vocabulary in Shakespearean English is more limited than my native English vocabulary, but I’d still say I’m comfortably fluent in Early Modern English, perhaps even better than I am at French. My friends say that it’s really fun to read through Shakespeare plays with me because they actually know what’s going on. Shakespeare is really funny! In addition to being really beautiful and moving and incredibly fun to act.
    Anyway, I’m sorry your school sucked and also that all schools suck. I wish I could give everyone the education in Shakespeare that was given to me. I have ideas on how to make that happen, but alas, doesn’t seem like a priority with the world the way it is.
- ozziegooen 18 Sep 2020 23:08 UTC
  2 points
  Why are modern translations so narrow? What level of nuance would you like them to capture?
  By narrow I mean they are aiming to provide language-to-language translation, but translation could hypothetically be done on a much more granular level. For instance, a translation that matches the very specific vernacular of some shared Dutch & Jamaican family with its own code words. And there’s no reason the semantics can’t be considerably changed. Maybe Hamlet could be adjusted to take place in whichever professional context a small community would be most likely to understand, and then presented as a post modern punk musical because that community really likes post modern punk musicals. Whatever works.
  
  One could argue that “liberal translations could never improve on the source, and therefore we need to force everyone to only use the source.” I disagree.
  
  In translations of poetry—something I have amateur experience with—you have a lot of decisions to make.
  Very true! There’s actually a lot of discussion of this around Harry Potter, which needed a lot of translations very quickly, and does have a fair bit of wordplay and the like. See here:
  https://en.wikipedia.org/wiki/Harry_Potter_in_translation
  I’m sure there must be a far greater deal of similar discussion around Biblical translations. See the entire field of Hermeneutics, for instance.
  
  That said, I’d note I’m personally interested in this for collective epistemic reasons. I think that the value of “a large cluster of people can better understand each other and thus do much better research and epistemic and moral thinking” is a bigger priority than doing this for artistic reasons, though perhaps it’s less interesting.
  - mingyuan 19 Sep 2020 1:32 UTC
    2 points
    For instance, a translation that matches the very specific vernacular of some shared Dutch & Jamaican family with its own code words. And there’s no reason the semantics can’t be considerably changed. Maybe Hamlet could be adjusted to take place in whichever professional context a small community would be most likely to understand, and then presented as a post modern punk musical because that community really likes post modern punk musicals. Whatever works.
    Yeah okay that is a far more radical definition of ‘translation’ than I was working with. I buy translating things into idiolects (like the Dutch + Jamaican family), but I’m still skeptical of the second half of that paragraph. The problem being, in order to translate Hamlet into a new context and format, you have to make decisions about what the point of Hamlet is. There’s a vague sense in which The Lion King is a version of Hamlet, but you obviously take away very different things from the two experiences. You’d have to have a very clear goal in mind when constructing your professional-context postmodern punk musical Hamlet, and the choice of that goal would make a huge difference to the end product.
    - ozziegooen 19 Sep 2020 10:12 UTC
      2 points
      You’d have to have a very clear goal in mind when constructing your professional-context postmodern punk musical Hamlet, and the choice of that goal would make a huge difference to the end product.
      Agreed. This is a radical definition.
      
      As translation gets more and more expansive, it becomes more difficult to ensure consistency and quality. But it also leads to a lot of value generation, so can often be worth it.
      
      Hamilton, the Musical, was arguably a retelling / “expansive translation” of the book, which itself was a summary of the original documents. I think most people who originally heard about the idea of Hamilton thought it could never work because of how weird (and expansive) it was. Not only was it presented for people who liked musicals, but it was sort of optimized to appeal specifically to communities of color. It doesn’t only translate the older dialects into modern English, but it converts it specifically to the vernacular and musical preferences of parts of Hip Hop culture.
      
      I’m a big fan of that. I’m sure a lot of information was lost along the way, but the value proposition of this dramatic reinterpretation is clear to many viewers.
      
      Not every potential translator will be as talented as Lin-Manuel Miranda, but the potential is still clear, and in the future we’ll have AI to help us.
Davidmanheim 30 Sep 2020 11:22 UTC
12 points
As the post notes, inferential distance relates to differing worldviews and life experiences. This was written to an audience that mostly understands what inferential distance has to do with different worldviews—how would you explain it to a different audience?

Well, a typical translation doesn’t try to bridge the gap between languages, it just picks something on the far side of the gap that seems similar to the one on the near side. But that leaves something out.
An example of this is in translations of Harry Potter, where Dumbledore’s password is translated into a local sweet. The UK version has “Sherbet Lemon” while the US version has “Lemon drop.” Are these the same? I assumed so, but actually it seems the UK version has a “fizzy sweet powder” on the inside. In Danish and Swedish, it’s (mis?)translated as lemon ice cream, which isn’t the same at all. And in Hebrew, it’s translated as Krembo, which doesn’t even get close to translating the meaning correctly. It’s supposed to be an “equivalent children’s dessert”, but the translation simply doesn’t work, because you can’t carry a Krembo around in your pocket, since it would melt. Does this matter? Well, there is a difference between a kindly old wizard who carries around a sucking candy and one who carries around a kind-of-big marshmallow dessert. But that’s beside the point: you don’t translate the life experience that growing up eating sherbet lemons gives you, you find an analogue.

The only way to translate a specific word or term accurately could be to provide so much background that the original point is buried, and the only way to translate an idea is to find an analogue that the reader already understands. And that’s why translation is impossible—but we do it anyways, and just accept that the results are fuzzy equivalents, and accept that worldviews are different enough that bridging the gap is impossible.
- ozziegooen 2 Oct 2020 11:39 UTC
  2 points
  Like that example a lot, thanks for the comment.
  
  Perhaps one could say that a complete translation of an English work would include a full description of English culture. This is kind of similar to complexity and Turing machines. Any program could be described in any programming language, by first fully describing the programming language in question.
  
  One point I’d bring up is to understand Harry Potter not as a necessary and complete work, but rather as the best method J.K.Rowling had of fulfilling her intentions using limited time and resources. It’s possible she didn’t care about “Sherbet Lemon”, but all she cared about was to raise some experience in the reader, as a way to optimize a greater pleasure. Perhaps a translator would realize this, and find some superior detail, both for people with other languages, and even for future English works.
  
  On the more intense end, it could be later identified that setting the plot around Magical Wizards is inferior to doing so in space stations, and many of the details are revised accordingly, but in ways that would maximize the benefits of the material.
  - Davidmanheim 5 Oct 2020 4:51 UTC
    4 points
    Agreed, and this reminds me of the observation that all of physics is contained in a single pebble; with enough understanding, you could infer all of physics from close observation of quantum effects, find gravity at a very small scale if you had sensitive enough instruments, know much of natural history, like the fact that earth has running water that made the stone smooth, that it must be in a universe more than a certain age given its composition, etc. With enough detail, any facet of a story requires effectively unlimited detail to fully understand.
    
    And that makes it clear that we don’t intend for every translation to be of unlimited depth, but the depth of the translation matters, and we trade off between depth of translation and accuracy. Translating Sherbet Lemon as Lemon Sorbet probably reflects a lack of understanding and an overly direct, literal-but-incorrect meaning, while translating it as Krembo might be a reasonable choice because of the context, but is not at all a literal translation.
Filipe Marchesini 19 Sep 2020 8:10 UTC
11 points
For me the idea of expansive translations is fantastic. Every time I read a new post on LessWrong that brings important information to the table, I think about translating it into Portuguese and bringing the information to the members of my tribe. But obviously I don’t think about translating literally, word for word, because I can see the loss of information that this would bring. I know exactly how I could write it in Portuguese so as to convey the sensations the original author intended, considering all the cultural nuances and inferential distances. When you really know more than one language you can see why and when it is a bad idea to translate literally.

So how could we improve an expansive translation system? Suppose I took this post from LessWrong and translated it into Portuguese. Then I would post the translation on a software platform for expansive translations of arbitrary sites: our new expansive-translations dot com, or our new Chrome extension.

Translators on the platform could give a score (from 0 to 10) of how good that translation looked for different translation formats: translation for children, translation for people with little or no math background, literal translations, translations focused on people with visual or auditory impairments, etc. People who come into contact with those translations could also give a grade for how easy it was to understand the subject matter.

Thus, we could create a market for expansive translations focused on people of different styles. For example, the system could consider that translations by people with similar mathematical/computational background to mine would probably please me more than expansive translations focused on a lay audience. Obviously this would depend on the type of subject matter, because I am a complete layman when it comes to various subjects, but in general the similarities of my profile with the translator’s profile could be a proxy for me to find good expansive translations. Also, the score I assign for each expansive translation can be used to understand what kind of expansive translation fits me more.

It would be interesting if I could even select an expansive translation of each category. Today I want to explain what bitcoin is to my grandmother, what would be the best way to do that? Surely expert translators for this kind of audience would know how to do it much better than me. I would select a specific category and see several expansive translations sorted by relevance (a metric that considered inferential distances, similar characteristics between the one who wrote and the one who reads, etc).
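As a rough illustration of the ranking idea above, here is a minimal Python sketch. Everything in it is hypothetical: the `Translation` record, the profile dictionaries (background levels from 0 to 1), the use of cosine similarity, and the 50/50 weighting are assumptions made for the example, not features of any existing platform.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Translation:
    translator: str
    category: str        # hypothetical label, e.g. "technical" or "lay audience"
    profile: dict        # translator's background levels, topic -> 0..1 (assumed)
    scores: list         # reader scores on the 0..10 scale from the comment


def cosine_similarity(a, b):
    """Cosine similarity between two sparse profiles (0.0 if either is empty)."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance(translation, reader_profile, similarity_weight=0.5):
    """Blend profile similarity with the mean reader score, both scaled to 0..1."""
    if translation.scores:
        mean_score = sum(translation.scores) / len(translation.scores) / 10
    else:
        mean_score = 0.0
    similarity = cosine_similarity(translation.profile, reader_profile)
    return similarity_weight * similarity + (1 - similarity_weight) * mean_score


def rank(translations, reader_profile, category):
    """Return translations in one category, most relevant to this reader first."""
    candidates = [t for t in translations if t.category == category]
    return sorted(candidates,
                  key=lambda t: relevance(t, reader_profile),
                  reverse=True)
```

Calling `rank(all_translations, my_profile, "lay audience")` would then give the sorted list for one category. The weighting is the main design choice: a higher `similarity_weight` favors translators whose background resembles the reader's, while a lower one favors translations that readers in general scored well.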

Each person reading an expansive translation could also assign a score to the post. I can imagine the many problems that such platforms could introduce, but having a diversity of expansive translations would help a lot and I would certainly use it often. For example, a market I would certainly pay to be part of is one of expansive translations of scientific articles. By hovering the mouse over a paragraph of an article a pop-up could appear indicating that there were 8 translators with 8 different expansive translations for the same paragraph. I could click on a (+) and then select the expansive translations I would like to read.

Certainly each translator can elaborate the ideas of that paragraph in different styles, considering different inferential distances from the reader, etc. Suppose I read three expansive translations among the eight. I could select which one pleased me the most. Then we would use machine learning to train a system that could predict which expansive translation in a given set I would identify with the most.
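To make the “select the one that pleased me most” signal concrete, here is a minimal sketch of learning from such picks. It treats each pick as a (chosen, rejected) pair and fits a small logistic model on the score difference, in the style of a Bradley-Terry model. The feature vectors, the hyperparameters, and the function names are all hypothetical, and a real system would need far more data and care.

```python
import math

def train_preference_weights(choices, n_features, epochs=200, lr=0.1):
    """Fit weights so chosen translations score higher than rejected ones.

    `choices` is a list of (chosen_features, rejected_features) pairs.
    Uses logistic loss on the score difference (a Bradley-Terry-style model).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for chosen, rejected in choices:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient ascent on log(sigmoid(margin)): step size is 1 - sigmoid(margin).
            step = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            w = [wi + lr * step * di for wi, di in zip(w, diff)]
    return w

def predict_best(w, candidates):
    """Return the index of the candidate with the highest predicted score."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in candidates]
    return scores.index(max(scores))

# Hypothetical features per translation:
# [uses_math_notation, uses_everyday_analogies, relative_length]
choices = [
    ([1.0, 0.0, 0.3], [0.0, 1.0, 0.8]),
    ([1.0, 0.0, 0.5], [0.0, 1.0, 0.4]),
    ([1.0, 1.0, 0.2], [0.0, 1.0, 0.2]),
]
weights = train_preference_weights(choices, n_features=3)
```

With these made-up picks the reader consistently prefers the math-heavy version, so the learned weight on the first feature ends up positive and `predict_best` favors math-heavy candidates.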

Maybe we could also offer optional microtransactions for good expansive translators. E.g., I select the best expansive translation and pay a few cents or microcents, as simply as clicking a like button in the corner of each expansive translation. This way we could ensure benefits and incentives for expansive translators to produce the best translations, as they could be rewarded by any reader, both in status and financially.

I can see a lot of ways in which we could monetize this system, so we could get more money to put into research and improve the system even more. Directly rewarding good translators is a way to ensure that we don’t lose the best candidates. I will stop my babble here, but there is a lot more I could say about this topic. It is a very interesting topic, ozziegooen. Also, I believe I could program this system myself. But let me know what you think.
- ozziegooen 19 Sep 2020 10:36 UTC
  3 points
  Parent
  Thanks so much Filipe, and I’m excited to see your thoughts on the topic. I think this kind of imagining is highly valuable.
  
  I don’t have much context about you personally, but from my engineering and entrepreneurial experience, my main piece of feedback would be that I get the sense that you think this might be a whole lot easier than I think it would be. Something like what you propose sounds very interesting, but I think this initial proposal would be challenging to do well without tons of money and time. I’ve seen my fair share of people start far overambitious projects, totally (though predictably) fail, and be heartbroken as a result.
  
  I think it’s worthwhile to do the following, but think about them in distinct buckets:
  1) Imagine what great systems would be like with near unbounded resources.
  2) Figure out what pragmatic steps we can take in the short term to get started.
  
  Both of these are valuable. All of my post was in the former camp, and I would suggest that your post mostly is as well.
  
  Some thoughts on the comment, in the vein of category (1):
  
  Translators in the platform could give a score (from 0 to 10) of how good that translation looked for different translation formats
  This is a minor point, but I would suggest a system where people rate how good the translation is for individual people (with many defined attributes), instead of trying to bucket things into different categories. Defining the categories is a really messy process that will leave artifacts. This is kind of a classic ML prediction sort of problem.
  Thus, we could create a market for expansive translations focused on people of different styles.
  I think that the current infrastructure for setting up markets in the regular ways is quite mediocre. Another option would be to hire a team of translators working full-time, but monitor and optimize their performance.
  ---
  On the topic of obtaining source data, using new content generation would be very expensive, and I could imagine it being difficult to do well. I think the word for “expansive translators” isn’t “translator”, but “communicator”, for instance, so the people to learn from are the popular communicators, not people with translation experience.
  
  I think there’s already a lot of content out there if you’re a bit creative. There are probably tens of thousands of “What is Bitcoin” posts on YouTube and other platforms aimed at a wide variety of audiences, combined with metrics for how popular these are. If you could find ways of learning from those, I would be more optimistic.
  
  Our new expansive-translations dot com, or our new Chrome extension.
  
  Arbital had features kind of like what I’m suggesting. They identified a need, but found it very challenging to get people to actually do the writing. I suggest checking out the comments from that thread to learn about their experiences.
  
  I’d be enthusiastic about making browser extensions to augment LessWrong in some key ways. It’s possible translation could start small, for example by replacing some key words with words the reader may know better (hopefully with hovers that indicate the replacement).
Aaro Salosensaari 24 Oct 2020 12:50 UTC
7 points
Maybe I am misreading, but in case not everyone is aware of them: in many ways the “expansive translation” concept sounds quite similar to critical editions / translations [1] and other similar scholarly annotated editions [2]. This kind of work usually includes more or less extensive commentaries (often presented as footnotes or endnotes) by later scholars that attempt to explain meaning and context of the original text that may be lost on their contemporary audience. (Critical editions also attempt to deal with cases where there are several differing or fragmented versions of the source text, or where the original language is very archaic or dead, which of course leads to issues with faithful translations.)

Sometimes the commentaries become influential in their own right. (Weirder things have also happened: according to one theory of its textual history, one day in China somebody shoehorned a historical novel into a terse ancient administrative logbook as “a commentary” to make it more interesting, as the original chronicle was attributed to the most famous of all scholars, thus creating a renowned classic [3].)

The point being, in scholarly traditions it is customary to distinguish between the text / translation and the commentaries. This acknowledges the impossibility of perfect translations (explanations of translation choices, at minimum, and other annotations more generally, are warranted) while also trying to retain a text with considerable fidelity to its source (for example, one with no large insertions to improve it for a contemporary audience), which is also useful.

[1] https://en.m.wikipedia.org/wiki/Textual_criticism

[2] https://en.m.wikipedia.org/wiki/Annotated_edition

[3] https://en.m.wikipedia.org/wiki/Zuo_zhuan
Vaughn Papenhausen 18 Sep 2020 19:31 UTC
4 points

Imagine a system where when you land on a Wikipedia page, it translates it into a version optimized for you at that time. The examples change to things in your life, and any concepts difficult for you get explained in detail. It would be like a highly cognitively empathetic personal teacher.

Hmm, something about this bothers me, but I’m not entirely sure what. At first I thought it was something about filter bubbles, but of course that can be fixed; just tune the algorithm so that it frames things in a way that is just optimally outside your intellectual filter bubble/comfort zone.

Now I think it’s something more like: it can be valuable to have read the same thing as other people; if everyone gets their own personalized version of Shakespeare, then people lose some of the connection they could have had with others over reading Shakespeare, since they didn’t really read the same thing. And also, it can be valuable for different people to read the same thing for another reason: different people may interpret a text in different ways, which can generate new insights. If everyone gets their own personalized version, we lose out on some of the insights people might have had by bouncing their minds off of the original text.

I guess this isn’t really a knockdown argument against making this sort of “personal translator” technology, since there’s no reason people couldn’t turn it off sometimes and read the originals, but nevertheless, we don’t have a great track record of using technology like this wisely and not overusing it (I’m thinking of social media here).
- ozziegooen 18 Sep 2020 19:46 UTC
  3 points
  Parent
  Thanks for the comment!
  With regard to being able to read “the same thing” as other people: I would of course agree this is one benefit of the current system. Any novel system will have downsides, and this is a downside for sure. I think the upsides are far more significant than this one downside, at least. Generally we don’t mind tutors or educational YouTube courses that are made to be particularly useful for small groups of people, even though these things do decrease the amount of standardization.
  we don’t have a great track record of using technology like this wisely and not overusing it
  Agreed. With great power comes great responsibility, and often we don’t use that responsibility that well. But two things:
  1) The upsides are really significant. If “being really good” at teaching people generic information is too powerful and scary to pursue, that doesn’t leave us much hope for other tech advancements.
  2) Even if it comes out to be net-negative, it could be useful to investigate further (like investigating if it is net-negative).
Jiro 25 Oct 2020 16:15 UTC
2 points
The fact that people have different understandings of the same texts and have to “translate” them across an inferential distance is a necessary evil. Just because something is a necessary evil doesn’t mean it’s good, and certainly doesn’t mean that we should be fine with deliberately creating more of it.