Instead of saying “This sentence doesn’t have truth value 1, nor 1⁄2, nor 1⁄4, …” (which, even if infinitely long, would only work for countably many truth values), you could simply say “This sentence has truth value 0″, which is just as paradoxical, but the paradox also works for real-valued or hyperreal truth values.
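One way to spell out the paradox, assuming (my assumption, for illustration) that the embedded equality claim is itself evaluated crisply, as either fully true or fully false: let $v(S)$ be the real- or hyperreal-valued truth value of the sentence $S$ = “This sentence has truth value 0”. Then

$$v(S) = 0 \;\Rightarrow\; S \text{ is true} \;\Rightarrow\; v(S) = 1, \qquad v(S) > 0 \;\Rightarrow\; S \text{ is false} \;\Rightarrow\; v(S) = 0,$$

so no assignment of a truth value to $S$ is consistent.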
I tried to find other examples, but apparently only American English uses the American style, while all(?) other languages use the British style, which should probably be called the “Non-American” style.
I find definitions particularly confusing. Which of the following is logically correct?
1. “Oculist” means “eye doctor”.
2. “Oculist” means eye doctor.
3. Oculist means “eye doctor”.
4. Oculist means eye doctor.
I would say 1 is the least incorrect one, but it still has issues. The problem is that in the phrase “A means B”, A refers to a word, a string of letters, but B doesn’t refer to a string of letters. It seems to refer to the concept B expresses, i.e. to a meaning.
The problem with 1 seems to be that quotation can either refer to a word or to the meaning of a word. Let’s say double quotes refer to a word/phrase, while single quotes refer to the meaning of that phrase. Then the correct expression of the above would be this:
“Oculist” means ‘eye doctor’.
That seems to make perfect logical sense.[1]
[1] To clarify, there is a common distinction between
a) term / sign / symbol / word,
b) meaning / intension,
c) reference object / extension.
An unquoted term refers to c), a quoted term refers to either a) or b). Hence my double / single quote disambiguation.
Is there an interpretation of KL divergence which works for subjective probability (credence functions) where there is no concept of “true” or “false” distribution? And even for an objective interpretation, the term “cost” seems to be external to probability theory.
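For reference, the quantity in question, written here for discrete distributions P and Q:

$$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}.$$

The usual “cost” reading takes P to be the true distribution and the sum (with log base 2) to be the expected number of extra bits needed to encode samples from P using a code optimized for Q; the question above is what, if anything, replaces this reading when P and Q are both merely credences.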
From the standpoint of hedonic utilitarianism, assigning a higher value to a future with moderately happy humans than to a future with very happy AIs would indeed be a case of unjustified speciesism. However, in preference utilitarianism, specifically person-affecting preference utilitarianism, there is nothing wrong with preferring our descendants (who currently don’t exist) to be human rather than AIs.
PS: It’s a bit lame that this post had −27 karma without anybody providing a counterargument.
This is also why various artists don’t necessarily try to make Tolkien’s Orthanc, Barad-dûr, Angband, etc. look ugly, but rather imposing and impressive in some way. Even H.R. Giger’s biomechanical landscapes could be described as aesthetic. Or the crooked architecture in The Cabinet of Dr. Caligari (1920). Architecture is art, and art doesn’t have to be beautiful or pleasant, just interesting. But presumably nobody would like to actually live in a Caligari-like environment. (Except perhaps people in the goth subculture?)
I don’t think this is a fallacy. If it were, one of the most powerful and most common informal inference forms (IBE, a.k.a. Inference to the Best Explanation / abduction) would be inadmissible. That would be absurd. Let me elaborate.
IBE works by listing all the potential explanations that come to mind, subjectively judging how good they are (with explanatory virtues like simplicity, fit, internal coherence, external coherence, unification, etc.) and then inferring that the best explanation is probably correct. This involves the assumption that the probability is small that the true explanation is not among those which were considered. Sometimes this assumption seems unreasonable, in which case IBE shouldn’t be applied. That’s mostly the case if all the considered explanations seem bad.
However, in many cases the “grain of truth” assumption (the true explanation is within the set of considered explanations) seems plausible. For example, I observe the door isn’t locked. By far the best (least contrived) explanation I can think of seems to be that I forgot to lock it. But of course there is a near infinitude of explanations I didn’t think of, so who is to say there isn’t an unknown explanation which is even better than the one about my forgetfulness? Well, it just seems unlikely that there is such an explanation.
And IBE isn’t just applicable to common everyday explanations. For example, the most common philosophical justification that the external world exists is an IBE. The best explanation for my experience of a table in front of me seems to be that there is a table in front of me. (Which interacts with light, which hits my eyes, which I probably also have, etc.)
Of course, in other cases, applications of IBE might be more controversial. However, in practice, if Alice makes an argument based on IBE, and Bob disagrees with its conclusion, this is commonly because Bob thinks Alice made a mistake when judging which of the explanations she considered is the best. In which case Bob can present reasons which suggest that, actually, explanation x is better than explanation y, contrary to what Alice assumed. Alice might be convinced by these reasons, or not, in which case she can provide the reasons why she still believes that y is better than x, and so on.
In short, in many or even most cases where someone disagrees with a particular application of IBE, their issue is not with IBE itself, but what the best explanation is. Which suggests the “grain of truth” assumption is often reasonable.
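Here is a toy sketch of how I think of this quantitatively (all numbers, the scoring, and the function name ibe are made up for illustration): each considered explanation gets a subjective goodness score from the explanatory virtues, a fixed lump of probability is reserved for “some explanation I didn’t think of”, and the rest is split in proportion to the scores.

```python
# Toy model of Inference to the Best Explanation (illustrative only):
# goodness scores and the reserved probability mass are made-up subjective inputs.

def ibe(scores: dict[str, float], p_unconsidered: float = 0.05) -> dict[str, float]:
    """Turn goodness scores for the considered explanations into probabilities,
    reserving p_unconsidered for 'the true explanation was not considered'."""
    total = sum(scores.values())
    probs = {name: (1 - p_unconsidered) * s / total for name, s in scores.items()}
    probs["<unconsidered explanation>"] = p_unconsidered
    return probs

# Example: why is the door unlocked?
scores = {
    "I forgot to lock it": 8.0,        # simple, coheres with my known forgetfulness
    "A burglar picked the lock": 1.0,  # possible, but far more contrived
    "The lock broke by itself": 0.5,   # fits poorly with everything else I know
}
print(ibe(scores))
```

Disagreement then typically targets the scores (is forgetting really less contrived than a burglar?) or, more rarely, the size of the reserved lump, which is exactly the “grain of truth” assumption.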
“Most examples of bad reasoning, that are common amongst smart people, are almost good reasoning. Listing out all the ways something could happen is good, if and only if you actually list out all the ways something could happen”
Well, that’s clearly almost always impossible (there are almost infinitely many possible explanations for almost anything), so we can’t make an exhaustive list. Moreover, “should” implies “can”, so, by contraposition, if we can’t list them, it’s not the case that we should list them.
“…or at least manage to grapple with most of the probability mass.”
But that’s backwards. IBE is a method which assigns probability to the best explanation based on how good it is (in terms of explanatory virtues) and based on being better than the other considered explanations. So IBE is a specific method for coming up with probabilities. It’s not just stating your prior. You can’t argue about purely subjective priors (that would be like arguing about taste) but you can make arguments about what makes some particular explanation good, or bad, or better than others. And if you happen to think that the “grain of truth” assumption is not plausible for a particular argument, you can also state that. (Though the fact that this is rather rarely done in practice suggests it’s in general not such a bad assumption to make.)
Judging from the pictures, this could also be a quadratic fit.
Not sure whether you know this, but on Twitter roon mentioned that GPT-5 (non-thinking? thinking?) was optimized for creative writing. Eliezer dismissed an early story shared by Altman.
By the way, “It seems” and “arguably” seem a bit less defensive than “I think” (which is purely subjective). Arguably.
“I hear a lot of scorn for the rationalist style where you caveat every sentence with ‘I think’ or the like.”
I think e.g. Eliezer (in the Sequences) and Scott Alexander don’t hedge a lot, so this doesn’t necessarily seem like a rationalist style. I do it a lot myself, though I’m fairly sure it makes readability worse.
We don’t need to shave ahead of time anyway (we can do it when the pandemic is already here), so it doesn’t compete with mental resources now.
Impressive write-up! As a follow-up question, what’s currently your favorite (hypothetical) explanation for the actual main cause of high obesity rates? Some environmental contaminant? Something else?
Yes, with Hilbert proof systems, since those have axioms / axiom schemata. (In natural deduction systems there are only inference rules like Modus ponens, no logical axioms.) But semantically, a “primitive” identity symbol is commonly already interpreted to be the real identity, which would imply the truth of all instances of those axiom schemes. Though syntactically, for the proof system, you indeed still need to handle equality in FOL, either with axioms (Hilbert) or special inference rules (natural deduction).
In FOL, however, these syntactical rules are weaker than the full (semantic) notion of identity, because they only guarantee that “identical” objects have all first-order definable properties in common, which doesn’t cover all possible properties, and which also holds for weaker forms of equality (“first-order equivalence”).
Eliezer mentioned the predicate “has finitely many predecessors” as an example of a property that is only second-order definable. So two distinct objects could have all their first-order definable properties in common while not being identical. The first-order theory wouldn’t prove that they are different. The second-order definition of identity, on the other hand, ranges over all properties rather than over all first-order definable ones, so it captures real identity.
But I’m pretty sure all instances of the axiom schema

x = y → (φ(x) ↔ φ(y))

(for any FOL-definable predicate φ in our theory) are already implied if we, as is customary, assume as our logic FOL+identity, i.e. first-order predicate logic with a primitive logical predicate for identity. So in that case we don’t need the axiom schema. (And in second- and higher-order logic we need neither an axiom schema nor a primitive relation, because there identity is already definable in pure logic, without any theory, so we have a logical identity relation without having to add it as a primitive.) If we just assume FOL alone, adding this axiom schema to a particular first-order theory makes sense, but I conjecture that it is not equivalent to full (primitive or second-order definable) identity, similar to how the axiom schema of induction is not equivalent to the induction axiom, which requires second-order logic.
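For comparison, the induction analogy spelled out (my paraphrase: φ ranges only over first-order formulas of the language, while P ranges over arbitrary subsets of the domain):

$$\text{first-order schema: } \big(\varphi(0) \wedge \forall n\,(\varphi(n) \rightarrow \varphi(n{+}1))\big) \rightarrow \forall n\, \varphi(n) \quad \text{for each formula } \varphi$$

$$\text{second-order axiom: } \forall P\, \big((P(0) \wedge \forall n\,(P(n) \rightarrow P(n{+}1))) \rightarrow \forall n\, P(n)\big)$$

Since there are only countably many formulas but (over an infinite domain) uncountably many subsets, the schema pins down strictly less than the axiom; the conjecture is that the identity schema falls short of full identity in the same way.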
I don’t know about category theory, but identity can’t be defined in first-order logic, so it is usually added as a primitive logical predicate to first-order logic. Adding infinite axiom schemas of the sort you showed above doesn’t qualify as a definition, which has to be a finite biconditional. But identity is definable in pure second-order logic. Using Leibniz’s law, one can define

x = y :↔ ∀P (P(x) ↔ P(y)).

Intuitively, x and y are defined to be identical iff they have all their properties in common. Formally, there is always a subset of the domain in the range of P which, for any object x, contains only x (the singleton {x}). This guarantees that the definition cannot accidentally identify different objects.
“For instance, are the symbols ‘1’ = ‘2’ in first order logic? Depends on the axioms!”
Yes, and that is because in predicate logic different names (constant symbols) are allowed to refer to the same object or to different objects, while the same name always refers to the same object. This is similar to natural language, where one thing can have multiple names. Frege (the guy who came up with modern predicate logic) wrote about this in an 1892 paper:
Equality gives rise to challenging questions which are not altogether easy to answer. Is it a relation? A relation between objects, or between names or signs of objects? In my Begriffsschrift I assumed the latter. The reasons which seem to favour this are the following: a = a and a = b are obviously statements of differing cognitive value; a = a holds a priori and, according to Kant, is to be labelled analytic, while statements of the form a = b often contain very valuable extensions of our knowledge and cannot always be established a priori. The discovery that the rising sun is not new every morning, but always the same, was one of the most fertile astronomical discoveries. Even to-day the identification of a small planet or a comet is not always a matter of course. Now if we were to regard equality as a relation between that which the names ‘a’ and ‘b’ designate, it would seem that a = b could not differ from a = a (i.e. provided a = b is true). A relation would thereby be expressed of a thing to itself, and indeed one in which each thing stands to itself but to no other thing. What is intended to be said by a = b seems to be that the signs or names ‘a’ and ‘b’ designate the same thing, so that those signs themselves would be under discussion; a relation between them would be asserted. But this relation would hold between the names or signs only in so far as they named or designated something. It would be mediated by the connexion of each of the two signs with the same designated thing. But this is arbitrary. Nobody can be forbidden to use any arbitrarily producible event or object as a sign for something. In that case the sentence a = b would no longer refer to the subject matter, but only to its mode of designation; we would express no proper knowledge by its means. But in many cases this is just what we want to do. If the sign ‘a’ is distinguished from the sign ‘b’ only as object (here, by means of its shape), not as sign (i.e. not by the manner in which it designates something), the cognitive value of a = a becomes essentially equal to that of a = b, provided a = b is true. A difference can arise only if the difference between the signs corresponds to a difference in the mode of presentation of that which is designated. Let a, b, c be the lines connecting the vertices of a triangle with the midpoints of the opposite sides. The point of intersection of a and b is then the same as the point of intersection of b and c. So we have different designations for the same point, and these names (‘point of intersection of a and b’, ‘point of intersection of b and c’) likewise indicate the mode of presentation; and hence the statement contains actual knowledge.

It is natural, now, to think of there being connected with a sign (name, combination of words, letter), besides that to which the sign refers, which may be called the reference of the sign, also what I should like to call the sense of the sign, wherein the mode of presentation is contained. In our example, accordingly, the reference of the expressions ‘the point of intersection of a and b’ and ‘the point of intersection of b and c’ would be the same, but not their senses. The reference of ‘evening star’ would be the same as that of ‘morning star’, but not the sense. (...)

He argues that identity expressed between names is a relationship between their “modes of presentation” (senses/meanings) rather than between the names themselves or their reference object(s). This makes sense because different names can be synonymous or not, and only if they are can we infer equality; in the latter case, whether the referents are equal or unequal has to be established by other means.
Oh, apparently I misinterpreted the meaning of “bowing out”, as I’m not a native English speaker. Anyway, I just want to register my opinion that I thought the “not worth getting into?” react was good in itself.
Doesn’t “not worth getting into?” sound better than “bowing out”?
I’m probably also misunderstanding, but wouldn’t this predict that large production models prefer words starting with “a” and names starting with “I” (capital “i”)? Because these letters are, simultaneously, frequently-used words in English. Which makes it likely that the tokenizer includes the tokens ” a” and ” I” and that the model is incentivized to use them.
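A quick way to check the tokenizer side of this (a sketch, assuming the tiktoken package is installed and using the cl100k_base encoding as a stand-in for whatever the production models actually use):

```python
# Sketch: check whether " a" and " I" (with a leading space) are single tokens.
# Assumes `pip install tiktoken`; cl100k_base is just an illustrative encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for s in [" a", " I", " and", " apple"]:
    ids = enc.encode(s)
    print(f"{s!r}: {len(ids)} token(s), ids={ids}")
```

Even if the letters do get their own leading-space tokens, it remains a separate question whether that actually biases the model toward words and names beginning with them.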
A quite different possibility for defining a fuzzy/probabilistic subset relation:
Assume the sets A and B are events (sets of possible disjoint outcomes). Then A⊆B iff P(B∣A)=1. This suggests that a probabilistic/partial/fuzzy “degree of subsethood” of A in B is simply equal to the probability P(B∣A).
This value is 1 if A is completely inside B, reducing to conventional crisp subsethood, and 0 if A is completely outside B. It is 0.5 if A is “halfway” inside B. These seem like pretty intuitive properties for fuzzy subsethood.
Additionally, the value itself has a simple probabilistic interpretation—the probability that an outcome is in B given that it is in A.
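A minimal sketch of this measure for finite outcome sets, assuming for illustration that every outcome is equally likely (the helper name subsethood is mine):

```python
# Degree to which A is a subset of B, computed as P(B | A) under a
# uniform distribution over outcomes: |A ∩ B| / |A|.
def subsethood(A: set, B: set) -> float:
    if not A:
        raise ValueError("P(B | A) is undefined for an empty (zero-probability) A")
    return len(A & B) / len(A)

print(subsethood({1, 2}, {1, 2, 3}))  # 1.0 -> A fully inside B
print(subsethood({1, 2}, {2, 3}))     # 0.5 -> A "halfway" inside B
print(subsethood({1, 2}, {3, 4}))     # 0.0 -> A completely outside B
```

With non-uniform outcome probabilities, one would instead sum the probabilities of the outcomes in A∩B and divide by the probability of A.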