Although most of the text produced by humans and published on the internet has the dynamic you describe, most of the hours of video watched on Youtube are produced by people making a living from those videos or expecting (realistically) to start making a living soon, and IMHO there’s tons of great information on YT (although it can be hard to find).
Unsure whether the dominance of youtube is due to youtube paying creators or due to it having a good recommender algorithm. Considering the amount of content I watch that’s just people talking, I’m fairly sure it’s mainly the algorithm.
Youtube is certainly dominant on measures like person-hours, but I’m not interested in that: I’m interested in how to make the internet more informative to the kind of people who will read this conversation (i.e., smart knowledgeable people). (So for example, how many people use YT to veg out at the end of the day is not what I’m interested in.)
I’m not seeing how YT’s recommendation algorithm is good for the kind of people who will read this. (More precisely, although there was an interval of a few months during which the algorithm was quite informative in my experience, that ended many months ago, and AFAIK there is now no good way to discover the great content on YT without wasting a lot of time wading through mostly-worthless content.) I doubt you mean the way the algorithm influences the kind of content that creators choose to create. (Creators tend to learn a lot about the algorithm because it has a large effect on their view count and consequently their revenue.)
Also, I’m confused about your “just people talking”: a huge fraction of what I consider great YT content is just people talking. I don’t find that strange or regrettable and fail to see how that supports your point.
The quality of the recommender is highly variable and depends on the… psychological resilience… of the subject. If it sees a way to melt you into a passive consumer, it will. Mine seems to have turned recently. I didn’t watch enough of the lecture videos in my watchlater. It sensed weakness.
The point was that even if they stopped paying people, many people would continue producing People Talking content. Which likely suggests that an algorithm of the same strength as the youtube algorithm, applied to written content, would create a compelling system for discovering interesting stuff. And for some people, supposedly, Google News is that, but I was never able to get it to recommend stuff I wanted, and it seems to have a hard bias for a closed list of newspapers (it’s never going to recommend something from a substack) which is creepy.
A lot depends on what you mean by “algorithm of the same strength”. Youtube is a closed loop—they know how much of what things you watched, what you searched on, what you responded to and didn’t respond to. And they use that information to pay content producers in proportion to “success” of the content via their algorithms. And the additional feedback loop of knowing what videos you’re watching allows them to charge more for ads you’re shown.
It’s VERY good at optimizing for what it measures (people’s willingness to watch targeted ads around what content). I’d argue that’s more about data acquisition than algorithmic power. I’d further argue that it’s absolutely not what I want to be optimized in noncommercial interactive discussion spaces.
It MAY BE what I want in curated, directed, long- and short-form text publication. I could see Substack evolving to a similar model (where in addition to subscriptions to authors, you have it recommend per-read or ad-supported articles). I’d love it if an engine could aggregate dozens of magazines and publishers into that model, but I don’t think most of the current participants will agree to that level of central control.
(Hmm, does the lay meaning of “algorithm” encompass the data, especially any ongoing recurring effects it would have? I think it must. An ML model is a product of its data.)
I think the trick with these systems is letting users talk back to the algorithm and help it out. Likes, or more meaningful signals of appreciation, help. Reddit got by without a recommender system because users were expected to essentially explicitly communicate all of their interests by subscribing to subreddits. Ranking is another way.
>an algorithm of the same strength as the youtube algorithm, applied to written content, would create a compelling system for discovering interesting stuff.
We have very different assessments. My guess is that the youtube recommendations algorithm used to be better than any of the engines for searching text because those engines cannot distinguish the high-quality textual pages from the low-quality pages (and because of the sheer number of low-quality pages). Although there is certainly a lot of bad information and temptations to waste your time on Youtube, it is easier for an engine to avoid them and surface the informative content (though Youtube is not even trying to do that anymore), basically because the vast majority of the temptations to waste your time on Youtube aren’t even trying to deceive anybody into thinking they’re informative.
Also the average youtube video that is sincerely trying to be informative rather than just entertaining is of distinctly higher quality IMHO than the average textual web page trying for the same thing.