I have a general belief that internet epistemic hygiene norms should include that, when you quote someone, not only should you link to the source, but you should link to the highlight of that source. In general, if you highlight text on a webpage and right-click, you can “copy link to highlight” which when opened scrolls to and highlights that text. (Random example on Wikipedia.)
Further on this theme, archive.is has the interesting feature of constantly altering the URL to point to the currently highlighted bit of text, making this even easier. (Example, and you can highlight other bits of text to see it change.) Currently I overall don’t like it because I constantly highlight text while I’m reading it, and so am very annoyed by the URL constantly changing, but it’s plausible I’d get over this in time, and it’d be a good feature to add to LW.
The archive.is feature is also better because the normal “copy link to highlight” link can often be unwieldy and long. Also I recall it sometimes not working, probably because the highlight is too short or too long (I don’t quite understand the rules). archive.is just uses a start and end number for where in the text the highlight falls, so it always works and is never unwieldy.
Sadly, I just tried the normal “copy link to highlight” on LW, and when I clicked through, the page auto-refreshed, so the highlighted text flashed purple and then quickly disappeared. It would be good for us to change that, and maybe implement this feature.
I have misgivings about the text-fragment feature as currently implemented. It is at least now a standard, and Firefox can read text-fragment URLs (it just doesn’t conveniently allow creating them without a plugin or something), which was my biggest objection before; but its remaining limitations show that a lot of what the text-fragment ‘solution’ solves is the self-inflicted problem of many websites being too lazy to provide useful anchor IDs anywhere in the page. (I don’t know how often I go to link a section of a blog post, where the post is written in a completely standard hierarchical table-of-contents way, and the headers turn out to be… nothing but <h2>s with not an id= anywhere in sight.) We would be a lot better off if pages had more meaningful IDs and selecting text did something like picking the nearest preceding ID. (This could be implemented in LW2 or Gwern.net right now, incidentally. If the user selects some text, just search through the tree to find the first previous ID, and update the current browser-bar URL to URL#ID.)
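To make the "nearest preceding ID" idea concrete, here is a minimal sketch in Python. It is not how LW2 or Gwern.net work; it only illustrates the lookup on a static HTML string, using the standard-library parser. The names `IdIndexer`, `nearest_preceding_id`, and `url_for_selection` are made up for this example, and a real implementation would presumably run in the browser against the live DOM and the actual selection range rather than searching for the selected string.

```python
from html.parser import HTMLParser
from typing import Optional


class IdIndexer(HTMLParser):
    """Collect the page text plus every element id, in document order."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.text_parts: list[str] = []
        self.text_len = 0
        # (offset into the collected text where the element starts, id value)
        self.ids: list[tuple[int, str]] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append((self.text_len, value))

    def handle_data(self, data):
        self.text_parts.append(data)
        self.text_len += len(data)


def nearest_preceding_id(html: str, selected: str) -> Optional[str]:
    """Return the last id that starts at or before the selection, if any."""
    indexer = IdIndexer()
    indexer.feed(html)
    indexer.close()
    text = "".join(indexer.text_parts)
    # Assumption: the first occurrence of the string stands in for the selection.
    start = text.find(selected)
    if start < 0:
        return None
    best = None
    for offset, element_id in indexer.ids:
        if offset > start:
            break
        best = element_id
    return best


def url_for_selection(page_url: str, html: str, selected: str) -> str:
    """Build URL#ID for the selection, replacing any existing fragment."""
    base = page_url.split("#", 1)[0]
    element_id = nearest_preceding_id(html, selected)
    return f"{base}#{element_id}" if element_id else base
```

With a page like `<h2 id="intro">Intro</h2><p>Alpha beta.</p><h2 id="limits">Limits</h2><p>Gamma delta.</p>`, selecting "Gamma delta" resolves to the `limits` anchor, so two readers who select slightly different ranges in that section end up with the same URL.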
Hacking IDs onto an unwilling page, whose author neither knows nor cares nor can even find out what IDs are in use (or what they may be breaking by editing this or that word), is a recipe for long-term breakage: your archive.is example works simply because archive.is is an archive website, and the pages, in theory, never change (even though the original URLs certainly can, and often quite dramatically). That’s less true for LW comments or articles. There are also downstream effects: text-fragments are long and verbose and can’t be written by hand because they’re trying to specify arbitrary ranges which are robust to corruption, and they are unwieldy to search. (How does a tool handle different hash-anchors in a URL? Most choose to define them as unique URLs different from each other… so what happens when two users selecting from the same section inevitably wind up selecting slightly different text ranges every time, and every user has a unique text-fragment anchor? Now suddenly every URL is unique—no more useful backlinks, no more consolidated discussions of the same URL, etc. And if the URL content changes, you don’t get anything out of it. It’s now just a bunch of trailing junk causing problems forever, like all that ?utm_foo_bar junk.)
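For tools that key backlinks or discussions on the URL, one possible mitigation is to normalize links before comparing them. The sketch below is only an illustration under stated assumptions: `canonicalize` is a hypothetical helper, the `utm_` prefix check is my guess at what counts as tracking junk, and it assumes the text-fragment directive always begins at the `:~:` delimiter inside the fragment.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Delimiter that starts the fragment directive in text-fragment URLs.
FRAGMENT_DIRECTIVE = ":~:"


def canonicalize(url: str) -> str:
    """Drop text-fragment directives and utm_* parameters from a URL."""
    parts = urlsplit(url)
    # Keep an ordinary anchor that precedes the directive; drop the directive.
    fragment = parts.fragment.split(FRAGMENT_DIRECTIVE, 1)[0]
    # Assumption: any query key starting with "utm_" is tracking junk.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), fragment)
    )
```

Two links to the same page that differ only in the selected range then compare equal, while a plain `#section` anchor survives. Note that `urlencode` re-encodes the remaining query values, so this is lossy for unusual query strings.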
Somewhat like the fad for abusing # for the stupid #! JS thing (which pretty much everyone, Twitter included, came to regret), I worry that this is still a half-baked tech designed for a very narrow use case (Google’s convenience in providing search results) where we don’t know how well it will work in the wild long-term or what side-effects it will have. So I personally have been holding off on it and making a point of deleting those archive.is anchors.
“Copy link to highlight” is not available in Firefox. And while e.g. Bing search seems to automatically generate these “#:~:text=” links, I find they don’t work with any degree of consistency. And they’re even more affected by link rot than usual, since any change to the initial text (like a typo fix) will break that part of the link.
I like this idea. There’s always endless controversy about quoting out of context. I can’t recall seeing any previous specific proposals to help people assess the relevance of context for themselves.
Currently I overall don’t like it because I constantly highlight text while I’m reading it, and so am very annoyed by the URL constantly changing, but it’s plausible I’d get over this in time, and it’d be a good feature to add to LW.
Though if the text changes, then it degrades gracefully to just linking to the right webpage, which is the current norm.
The highlights are officially called “text fragments” and the syntax is described here: https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment/Text_fragments
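As I understand the syntax described there, the directive has the shape `#:~:text=[prefix-,]textStart[,textEnd][,-suffix]`. A minimal Python sketch of building such a link by hand follows; `text_fragment_url` and `_encode` are names invented for this example, and the encoding step reflects my reading that `-`, `,` and `&` inside a term need percent-encoding because they carry meaning in the syntax.

```python
from typing import Optional
from urllib.parse import quote


def _encode(term: str) -> str:
    # quote() leaves "-" alone, so encode it by hand; "," and "&" are
    # already covered because safe="" percent-encodes all reserved characters.
    return quote(term, safe="").replace("-", "%2D")


def text_fragment_url(
    page_url: str,
    text_start: str,
    text_end: Optional[str] = None,
    prefix: Optional[str] = None,
    suffix: Optional[str] = None,
) -> str:
    """Build page_url#:~:text=[prefix-,]textStart[,textEnd][,-suffix]."""
    parts = []
    if prefix:
        parts.append(_encode(prefix) + "-")
    parts.append(_encode(text_start))
    if text_end:
        parts.append(_encode(text_end))
    if suffix:
        parts.append("-" + _encode(suffix))
    # Replace any existing fragment rather than appending to it.
    base = page_url.split("#", 1)[0]
    return f"{base}#:~:text={','.join(parts)}"
```

For example, `text_fragment_url("https://example.com/post", "link rot")` gives `https://example.com/post#:~:text=link%20rot`. Passing only the first and last few words as `text_start` and `text_end` keeps the link short for a long quote, and it also means an edit in the middle of the passage should not break the match.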
You can add it as an opt-in feature.
I find them visually awful and disable them in settings. And avoid using archive.is because there’s no way to turn that off.
Not that I browse LW that much, in fairness.