[Deleted]
Past Account
[Deleted]
[Deleted]
[Deleted]
[Deleted]
Do you guys have an RSS feed I could subscribe to?
Just a thought, maybe it’s a useful perspective. It seems kind of like a game. You choose whether or not to insert your beliefs and they choose their preferences. In this case it just turns out that you prefer life in both cases. What would you do if you didn’t know whether or not you had an Alice/Bob and had to choose your move ahead of time?
Is it against the spirit to start now and use Saturday as the deadline? I have some journal entries I’d like to spin up, and a deadline would help!
What is GameB?
From context it’s presumably the 2009 outbreak, where roughly 60 million people got infected in the US.
[Deleted]
[Deleted]
[Deleted]
[Deleted]
The first couple of sentences seem reasonable, something I was thinking but didn’t comment. However, the rest of this seems needlessly aggressive. I’d almost recommend paring down to the limited critique and fleshing that out in more detail.
I’m interested in converting notes I have about a few topics into posts here. I was really trying to figure out why this would be a good use of my time. The notes are already rather readable to me. I thought about this for a while and it seems as though I’m explicitly interested in getting feedback on some of my thought processes. I’m aware of Goodhart’s law, so I know better than to have an empty plan to simply maximize my karma. However, on the other end, I don’t want to simply polish notes. If I were to constrain myself to only write about things I have notes on, then it seems I could once again explicitly try to maximize karma. In fact, if I felt totally safe doing this it’d be a fun game to try out, possibly even comment on. Of course, eventually, the feedback that I’d receive would start to warp what kinds of future contributions I’d make to the site, but this seems safe. Given all of this, I’d conclude I can explicitly maximize different engagement metrics, at least at first.
[Deleted]
Not really sure. If I were really going for it, I could do about 15-25 posts. I’m going back and forth on which metrics to use. This seems highly tied to what I actually want feedback on. What do you mean by Q&A?
If we’re taking the idea that arguments are paths in topological space seriously, I feel like conditioned language models are going to be really important. We already use outlines to effectively create regression data-sets to model arguments. It seems like modifying GPT-2 so that you can condition on start/end prompts would be incredibly helpful here. More speculatively, I think that GPT-2 is near the best we’ll ever get at next-word prediction. Humans use outline-like thinking much more often than is commonly supposed.
The maximum is over the domain. I’m not sure how your example is escaping from the hierarchy paradigm. I do consider the idea of having undetermined sub-tasks.
You seem concerned about why I choose to characterize the policy by how well it compresses the task. While it is possible to do a sort of ‘interleaving’ as you suggest, from a technical point of view it makes no difference, since compression transitions are assumed to be Markov. This translates to an assumption that planning ability depends only on what you currently have planned.
Practically speaking I should assume that the transitions are Markov and depend on both what has been planned and what has been executed. My argument rests on the idea that in expectation there’s no difference between the two strategies since what you plan for will in expectation match what happens.
The moment you start trying to build up a more complicated model it becomes clear that you can’t simply account for what has been executed in terms of a scalar (if you could, what I just said would be reinforced). In that case, you need to model how tasks are being created, prioritized, and executed. This is difficult to the point of being useless as a tool to understand what I was interested in.
I think we agree that the only way forward is to simply assume that this ‘meta’-policy can be invoked recursively. This is hard. Naively I’d hope for sub-task modularity/independence and additivity of the effectiveness of the ‘meta’-policy.
$\pi = \pi_1 \pi_2 = \pi_2 \pi_1$
$\Gamma(\pi) = \Gamma(\pi_1) + \Gamma(\pi_2)$
Hopefully, it’s clearer why it’s impossible to go further without a good model for how tasks are sub-divided. It’s all too easy to run into Zeno-like paradoxes where it’s either impossible to plan due to compounding sub-task overhead or you can slice up a task into infinitesimal dust. This is getting too long for a comment. I’ll leave it there.
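For what it’s worth, the two failure modes can be seen in a toy sketch. Everything here is an assumption for illustration and not part of the model discussed above: a task is a fixed amount of `work`, it is split evenly into `n_subtasks`, and each sub-task pays the same fixed planning `overhead`. The function name `total_cost` is hypothetical.

```python
def total_cost(work: float, n_subtasks: int, overhead: float) -> float:
    """Toy cost of finishing `work` when it is split evenly into sub-tasks.

    Assumption (illustrative only): every sub-task pays the same fixed
    planning overhead, regardless of how small the sub-task is.
    """
    if n_subtasks < 1:
        raise ValueError("need at least one sub-task")
    per_subtask_work = work / n_subtasks
    return n_subtasks * (per_subtask_work + overhead)


if __name__ == "__main__":
    # With positive overhead, finer slicing makes the total cost grow without bound.
    for n in (1, 10, 100, 1000):
        print(n, total_cost(10.0, n, overhead=0.5))

    # With zero overhead, the cost never changes, so nothing stops
    # the slicing from continuing down to "infinitesimal dust".
    for n in (1, 10, 100, 1000):
        print(n, total_cost(10.0, n, overhead=0.0))
```

With any positive overhead the total is `work + n_subtasks * overhead`, so planning gets swamped as the slicing gets finer; with zero overhead the cost is flat in `n_subtasks`, so the toy model gives no reason to ever stop dividing. A usable model of sub-division has to sit somewhere between those two extremes.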