Cycling in GANs/self-play?
James Camacho
I think having all of this in mind as you train is actually pretty important. That way, when something doesn’t work, you know where to look:
Am I exploring enough, or stuck always pulling the first lever? (free energy)
Is it biased for some reason? (probably the metric)
Is it stuck not improving? (step or batch size)
Weight-initialization isn’t too helpful to think about yet (other than avoiding explosions at the very beginning of training, and maybe a little for transfer learning), but we’ll probably get hyper neural networks within a few years.
I like this take, especially its precision, though I disagree in a few places.
conductance-corrected Wasserstein metric
This is the wrong metric, but I won’t help you find the right one.
the step-size effective loss potential critical batch size regime
You can lower the step-size and increase the batch-size as you train to keep the perturbation bounded. Like, sure, you could claim an ODE solver doesn’t give you the exact solution, but adaptive methods let you get within any desired tolerance.
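As a rough illustration (a toy sketch in my own notation, not anyone’s actual training code; the helper function and constants are made up), here is the kind of schedule I mean: the step size decays while the batch size grows, so each update’s perturbation stays bounded, just like an adaptive solver tightening its tolerance.

```python
# Toy sketch: `grad_estimate` and all constants are hypothetical placeholders.
# The point is only the schedule: lr ~ 1/sqrt(t) and batch ~ sqrt(t), so the
# size of each stochastic update stays bounded as training proceeds.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=10)  # toy "weights"

def grad_estimate(w, batch):
    # stand-in for a stochastic gradient computed on `batch`
    return w + batch.mean()

base_lr, base_batch = 1e-2, 32
for step in range(1, 1001):
    lr = base_lr / np.sqrt(step)                  # shrink the step size...
    batch_size = int(base_batch * np.sqrt(step))  # ...while growing the batch
    batch = rng.normal(size=batch_size)
    w -= lr * grad_estimate(w, batch)             # bounded perturbation per update
```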
for the weight-initialization distribution
This is another “hyper”parameter to feed into the model. I agree that, at some point, the turtles have to stop, and we can call that the initial weight distribution, though I’d prefer the term ‘interpreter’.
up to solenoidal flux corrections
Hmm… you sure you’re using the right flux? Not all boundaries of boundaries are zero, and GANs (and self-play) probably use a 6-complex.
If you “want to stop smoking” or “want to donate more” but do not, you are either deluding yourself, lacking intelligence, or preferring ignorance. Deluding yourself can make you feel happier about yourself. “I’m the kind of person who wants to help out other people! Just not the kind who actually does [but let’s not think about that].” Arguably, this is what you really prefer: to be happy, whether or not your thoughts are consistent with your behavior. If you are smart enough, and really want to get to the bottom of any inconsistencies you find yourself exhibiting, you will, and will no longer be inconsistent. You’ll either bite the bullet and say you actually do prefer the lung cancer over the shakes, or actually quit smoking.
Are the majority of rationalists deluded or dishonest? Absolutely. As I said in my post, utilitarianism is not well-defined, but most rationalists prefer running with the delusion.
There are also people who genuinely prefer others’ well-being over a marginal increase in theirs—mostly wealthy or ascetic folks—and I think this is the target audience of EA evangelism. However, a lot of people don’t genuinely prefer others’ well-being over a marginal increase in their own (or at least, the margin is pretty small), but these people still end up caught by Singer’s thought experiment, not realizing that the conclusions it leads them to (e.g. that they should donate to GiveWell) are inconsistent with their more fundamental values.
-
The ellipsis is, “genuinely prefer others’ well-being over a marginal increase in their own,” from the previous sentence.
-
They have to be smarter to recognize their actual beliefs and investigate what is consistent with them. They have to be more honest, because there is social pressure to think things like, “oh of course I care about others,” and hide how much or little they care.
-
I think the title is fine. The post mostly reads, “if you want a quantum analogue, here’s the path to take”.
Yeah, that was about the only sentence I read in the paper. I was wondering if you’d seen a theoretical justification (logos) rather than just an ethical appeal (ethos), but didn’t want to comb through the maths myself. By the way, fidelity won’t give the same posterior. I haven’t worked through the maths whatsoever, but I’d still put >95% probability on this claim.
Is there a reason they switched from divergence to fidelity when going quantum? You should want to get the classical Bayes’ rule in the limit as your density matrices become classical, and fidelity definitely doesn’t give you that.
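To spell out the consistency check I mean (my notation, not the paper’s): when the density matrices commute, i.e. are simultaneously diagonal, the update should act on the eigenvalues exactly like Bayes’ rule,

```latex
\rho \;=\; \sum_h p(h)\,\lvert h\rangle\langle h\rvert
\quad\xrightarrow{\ \text{observe } D\ }\quad
\rho' \;=\; \sum_h \frac{p(D \mid h)\,p(h)}{\sum_{h'} p(D \mid h')\,p(h')}\,\lvert h\rangle\langle h\rvert,
```

which is just p(h | D) = p(D | h) p(h) / p(D) applied along the diagonal.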
Please avoid the abbreviation “MWI” until you’ve at least written “many-worlds interpretation” once. I had to do a ctrl+f and go to the eighth occurrence of “MWI” before I could read your post, because all the information I was getting was that this is something like UDASSA, and MWI is some information- or decision-theory term that I don’t know, but need to in order to even make sense of the first paragraph.
Then why is it too difficult for you to write down one of those definitions or theories where your criticism makes any sense?
Words demarcate the boundaries of meanings. You seem to be claiming there is some undefinable quality to the word “truth” that is useful to us, i.e. some unmeaningful meaning. Believe in ephemeral qualities all you like, but don’t criticize me for missing out on some “truths” that are impossible to discover anyway.
Millions of years ago, the world was pretty much zero sum. Animals weren’t great at planning, such as going back for reinforcements or waiting months to take revenge, so fights were brief affairs determined mostly by physical prowess, which wasn’t too hard to predict ahead of time. It was relatively easy to tell when you could get away with bullying a weaker animal for food, instead of hunting for your own.
When humans come along, with tools and plans, there is suddenly much less common knowledge when you get into a fight. What allies does this other human have to call upon? What weapons have they trained in? If they’re running away, are they just weaker, or are they leading you into a trap? If you actually can win the fight, you should take it, but the variance has shot up due to the unknowns, so you need a higher expected chance of winning if you don’t want an unlucky roll to end your life. If you enter fights when you instinctively feel you can win, then you will evolve to lower this instinctual confidence.
You do know truth only means, “consistent with some set of assumptions (axioms)”? What does it mean to look for “true axioms”? That’s why I defer to useful ones.
I would say that I focus my thinking on the universes where I can get sensory input showing that the thinking is useful.
Re: this thread
The “guarantor” of your future is two things:
A belief that logic works.
Taking a Kolmogorov prior and Bayesian updating on new sense data.
Believing logic works has nothing to do with faith—it’s that you cannot do anything useful with the alternative. Then, once you’ve assumed logic and created maths, you just find the simplest explanations that fit with what you see. Will the future always be what you expect? No, but you can make claims with very high confidence, e.g. “in 99.99% of worlds where I receive the sense data I did, the Sun will actually rise tomorrow.”
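To make the “Kolmogorov prior” part concrete (standard Solomonoff-style shorthand, my notation): weight each hypothesis by the length of its shortest description and update on the sense data,

```latex
P(h) \;\propto\; 2^{-K(h)},
\qquad
P(h \mid D) \;=\; \frac{P(D \mid h)\,2^{-K(h)}}{\sum_{h'} P(D \mid h')\,2^{-K(h')}},
```

where K(h) is the Kolmogorov complexity of hypothesis h. The simplest explanations that fit what you see are exactly the hypotheses that retain a large posterior under this update.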
Just because you panic about the unknown does not mean the unknown will actually be a large factor in your reality.
Why do I believe this? Well, I’ve seen this evolution in Risk. Newer players will overattack, using all their troops on the first few turns to take territories and break bonuses. They slowly evolve into turtles, keeping all their troops in one stack blocked by their own territories so they can’t do anything even if they want to, and only ever attacking one territory at a time. This is where most players stop their evolution, because after learning zeroth-order heuristics like, “the world is scary, better be super conservative,” the only way to progress further is to start modelling conflicts more than zero turns ahead.
The purpose of “underdog bias” is nearly the opposite of your best guess. It is because conflicts are too complicated for most people to model, and optional to get into. Even after several million years of evolution making brains smarter, humans still usually fail to see more than zero turns ahead in very simple games like Risk (e.g., if I break his bonus, and he goes right after me… well I can break his bonus now! Let’s do it!). If you can’t accurately model the effects of starting a conflict, but you’re also prone to getting into conflicts you think you can win (thanks evolution), the best hack is to make you believe you won’t win.
Some goods can have freeriders, and some cannot. To prevent freeriders on the roads, you need some form of policing. A toll booth or a military could work. While it’s possible to form different governments for different goods, this can lead to fighting between the police forces. Eventually one wins, gains a monopoly on power, and becomes the “legitimate” government.
As to...
Why not have each person deciding whether they value roads enough to subscribe to a road [tax], or whether they value an educated public enough to contribute to that?
It’s because of freeriders. This is why we have someone else decide how much the roads or public education are helping each person, maybe by putting a tax on gasoline or on land around a school. I think if they overestimate how much value you’re getting out of the roads or schools, you should complain and ask them to change the tax code. For most areas, you’ll get more value than you paid in taxes, so you only have to spend mental energy when it becomes apparent that you’re not.
What if somebody doesn’t feel they get value out of foreign trade? Why should they pay? Similarly, if you own a ship and the Nazis might sink it, then why aren’t you paying to protect it, rather than demanding that everybody pay?
They shouldn’t. This can be solved directly by having tariffs (and for thousands of years, that is what was done). This feels obvious, and my guess is there’s some woke mind virus at work, something like, “taxes are a fungible pool of money that everyone gets an equal say in distributing.” If you don’t already believe that, and you’re trying to be the first person to collect taxes, you’ll collect them for a purpose, and refund any extra money, not find a new purpose for it.
what about the foreign trade value that came out of the goodwill PEPFAR and other USAID programs were creating?
Which is it? Are these countries very poor, where PEPFAR would be a huge percent of their GDP, or are they so rich that the goodwill generated exceeds the charity? Or, is it that they virtue signal to other, richer countries that America is a benevolent dictator, and it’s okay to keep the dollar hegemony? I think that is actually a really good reason to have USAID programs—it slows down other nations’ urgency to compete—but I also believe America’s hegemony has <10 years left. I think it’s still good to commit to goodwill, so that the next powers to be are more likely to be kind in return. That is the public good we’re funding, nothing else. Is it worth $20–40bn/year? Probably.
When it comes to foreign aid, the only consistent stance to have is: charity work, not government work.
Seeing that stuff as pure charity is deeply naive.
Deeply naive with a helping of arrogance. Why would you believe I didn’t consider goodwill, and then just decided it wasn’t worth it to add another few paragraphs going several rebuttals deeper? You’ll also find that I tend to respond to people with the same style of argumentation they employ. Such as, if you flippantly call something inconsistent, I’ll flippantly call it consistent.
I get really worried when people seize this much power this easily, especially in education. The field is rife with people reshaping education for hundreds of thousands or millions of students, in ways they believe will be positive, but that end up being massively detrimental.
The very fact you can have this much of an impact after only a few years and no track record or proof of concept points to the system being seriously unmeritocratic. And people who gain power in unmeritocratic systems are unlikely to do a good job with that power.
Does this mean you, in particular, should drop your work? Well, I don’t know you. I have no reason to trust you, but I also have no reason to trust the person who would replace you. What I would recommend is to find ways to make your system more meritocratic. Perhaps you can get your schools to participate in the AI Olympiad, and have the coaches for the best teams in the state give talks on what went well, and what didn’t. Perhaps you can ask professors at UToronto’s AI department to give a PD session on teaching AI. But, looking at the lineup from the 2024 NOAI conference, it looks like there’s no correlation between what gets platformed and what actually works.