Adrià Garriga-alonso

Karma: 882

Adrià Garriga-alonso 9 Apr 2016 12:23 UTC
4 points
in reply to: Kuaiyu’s comment on: Attention! Financial scam targeting Less Wrong users
I would probably use better spelling in the messages. Poor spelling reduces the scammer’s credibility.

Adrià Garriga-alonso 14 May 2016 12:31 UTC
5 points
in reply to: ArgleBlargle’s comment on: 2016 LessWrong Diaspora Survey Analysis: Part One (Meta and Demographics)
Gratitude thread.

What a load of work, Ingres. Thank you for doing this.

Adrià Garriga-alonso 15 Jun 2016 17:10 UTC
5 points
in reply to: Elo’s comment on: Attention! Financial scam targeting Less Wrong users
Today on Hacker News there’s a research article about exactly this.

https://news.ycombinator.com/item?id=11909111

Makes me think that a possible method to mitigate spam would be to answer each email with an LSTM-generated blob of text, so the attackers are swarmed with false positives and cannot continue the attack. Of course, this would have to be implemented by the email provider.

Adrià Garriga-alonso 21 Jun 2016 7:55 UTC
1 point
in reply to: hairyfigment’s comment on: Attention! Financial scam targeting Less Wrong users
This is what I thought. But ChristianKl is right: it doesn’t need to. From the first false positive you’re already doing damage at almost no cost to you. Sure, your address will start to receive more spam, but it will be filtered like the spam you already receive.

But having it in the ISP, or as a really popular extension, would deal a big blow to spam.

Adrià Garriga-alonso 6 Oct 2016 9:50 UTC
0 points
in reply to: Houshalter’s comment on: Nick Bostrom says Google is winning the AI arms race

>I don’t think we are that far away from AGI.

At the very least 20 years. And yes, Alphabet is the closest, but in 20 years a lot of things can change.

Adrià Garriga-alonso 11 Dec 2016 16:03 UTC
0 points
in reply to: grinchler’s comment on: A sealed prediction
Almost 5 years now.

Adrià Garriga-alonso 10 Nov 2017 13:23 UTC
3 points
on: Announcing the AI Alignment Prize
Is it possible to enter the contest as a group? Meaning, can the article written for the contest have several coauthors?

Adrià Garriga-alonso 7 Aug 2018 10:31 UTC
3 points
in reply to: Rohin Shah’s comment on: A Gym Gridworld Environment for the Treacherous Turn
Would it be possible to just apply model-based planning and show the treacherous turn the first time?
Model-based planning is also AI, and we clearly have an available model of this environment.

Adrià Garriga-alonso 23 Sep 2018 12:43 UTC
4 points
on: Modes of Petrov Day
Typo in pg. 31 of the ceremony guide: “sir ead” → “is read”.

Adrià Garriga-alonso 22 Dec 2018 12:55 UTC
5 points
in reply to: Greg C’s comment on: The Craft & The Community—A Post-Mortem & Resurrection
Upvoting so that people see earlier that the project failed, and don’t have to spend a couple of hours reading the main article given how this turned out.

Adrià Garriga-alonso 7 Jan 2019 14:11 UTC
4 points
on: AI safety without goal-directed behavior
I usually think that logic-based reasoning systems are the canonical example of an AI without goal-directed behaviour. They just try to prove or disprove a statement, given a database of atoms and relationships. (Usually they’re restricted to statements that are decidable by construction, so that this is always possible.)
You can also frame their behaviour as a utility function: U(time, state) = 1 if you have correctly decided the statement at t ≤ time, 0 otherwise. But your statement that
>It seems possible to build systems in such a way that these properties are inherent in the way that they reason, such that it’s not even coherent to ask what happens if we “get the utility function slightly wrong”.
very much applies. I’m fairly sure you can specify the behaviour of _anything_, including “dumb” things like trousers, screwdrivers, rocks and saucepans, as a utility function + perfect optimization, even though for most things this is a very unhelpful way of thinking. Or at least human artifacts. E.g. a screwdriver optimizes “transmit the rotational force that is applied to you”, a rock optimizes “keep these molecules bound and respond to forces according to the laws of physics”.

Adrià Garriga-alonso 8 Jan 2019 19:05 UTC
3 points
in reply to: Rohin Shah’s comment on: AI safety without goal-directed behavior
>Yup. I actually made this argument two posts ago.
Ah, that’s good. I should probably read the rest of the sequence too.
>Though it’s not clear how you’d use a logic-based reasoning system to act in the world
The easy way to use them would be as they are intended: oracles that will answer questions about factual statements. Humans would still do the questioning and implementing here. It’s unclear how exactly you’d ask really complicated, natural-language-based questions (obviously, otherwise we’d have solved AI), but I think it serves as an example of the paradigm.

Adrià Garriga-alonso 9 Jan 2019 0:17 UTC
5 points
in reply to: ESRogs’s comment on: Reframing Superintelligence: Comprehensive AI Services as General Intelligence
Yes, though I’m fairly sure he’s talking about using trained neural networks to e.g. classify an image, which is known to be fairly cheap, rather than training them. In other words, he’s talking about using an AI service rather than creating one.
He also says that “Machine learning and human learning differ in their relationship to costs” which is also evidence for my interpretation: training is expensive, testing on one example is very cheap.

Adrià Garriga-alonso 9 Jan 2019 1:00 UTC
3 points
in reply to: avturchin’s comment on: Reframing Superintelligence: Comprehensive AI Services as General Intelligence
>This idea also excludes the robotic direction in AI development, which will anyway produce agential AIs.
Recursive self-improvement that makes the intelligence “super” quickly is what makes the misaligned utility actually dangerous, as opposed to dangerous like, say, a current-day automated assembly line.
A robot that self-improves would need to have the capacity to control its actuators and also to self-improve. Since neither of these capabilities directly depends on the other, each time one of them improves, the improvement is much more likely to be first demonstrated independently of an improvement in the other one.
Thus we’re likely to already have some experience with self-improving AI, or the recursively improved AI to help us, when we get to dealing with people wanting to build self-improving robots. Even though with advanced AI in hand to help we should maybe still start early on that, it seems more important to get the not-necessarily-and-also-probably-not-robotic AI right.

Adrià Garriga-alonso 6 Feb 2019 22:32 UTC
12 points
on: Complexity Penalties in Statistical Learning
Great post, thank you for writing it!
>By taking squares we are more forgiving when the model gets the answer almost right but much less forgiving when the model is way off
It’s really unclear why this would be better. I was going to ask for a clarification but I found something better instead.
The first justification for taking the least-squares estimator that came to mind is that it’s the maximum likelihood solution if you assume your errors E are independent and Gaussian. Under those conditions, the probability density of the data is $p(\{y_{i}\}_{i = 1, \dots, N} \mid \{x_{i}\}_{i = 1, \dots, N}, \sigma^{2}) = \prod_{i} \mathrm{Gaussian}(y_{i} \mid g(x_{i}), \sigma^{2})$. We take the product because the errors are independent for each point. If you take the logarithm (which is a transformation that preserves the maximum), this comes out to be $-\frac{1}{2 \sigma^{2}} \sum_{i} (y_{i} - g(x_{i}))^{2}$ plus a constant that doesn’t depend on g.
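Spelling out the logarithm step, as a sketch under the same assumptions (independent Gaussian errors with a shared variance $\sigma^{2}$, and N data points):

```latex
\begin{align*}
\log p(\{y_{i}\} \mid \{x_{i}\}, \sigma^{2})
  &= \sum_{i=1}^{N} \log \mathrm{Gaussian}(y_{i} \mid g(x_{i}), \sigma^{2}) \\
  &= \sum_{i=1}^{N} \left[ -\frac{1}{2} \log(2 \pi \sigma^{2})
     - \frac{(y_{i} - g(x_{i}))^{2}}{2 \sigma^{2}} \right] \\
  &= -\frac{1}{2 \sigma^{2}} \sum_{i=1}^{N} (y_{i} - g(x_{i}))^{2}
     - \frac{N}{2} \log(2 \pi \sigma^{2}).
\end{align*}
```

The last term does not depend on g, so maximising the likelihood over g is the same as minimising the sum of squared errors.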
It turns out there’s a stronger justification, which doesn’t require assuming the errors are Gaussian. The Gauss-Markov Theorem shows that least squares is the best linear unbiased estimator of the coefficients of g, when g is linear in the powers of the data, i.e. $x_{i}, x_{i}^{2}, x_{i}^{3}, \dots$, and the errors have zero mean, equal variance and are uncorrelated. That it’s unbiased means that, for several independent samples of data, the mean of the least-squares estimator will be the true coefficients of the polynomial you used to generate the data. The estimator is the best because it has the lowest variance under the above sampling scheme. If you assume you have a lot of data, by the central limit theorem, this is similar to lowest squared error in coefficient-space.
However, if you’re willing to drop the requirement that the estimator is unbiased, you can get even better error on average. The James-Stein estimator has less variance than least-squares, at the cost of biasing your estimates of the coefficients towards zero (or some other point). So, even within a certain maximum allowed degree of the polynomial, it is helpful to penalise complexity of the coefficients themselves.
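A minimal simulation sketch of the James-Stein effect, in its classic setting rather than the polynomial-regression one: estimating a mean vector from a single observation with unit-variance Gaussian noise. The dimension, the true mean `theta`, the number of trials and all function names are illustrative assumptions, not taken from the post.

```python
import random


def james_stein(x):
    """Shrink the observation x towards zero (assumes unit noise variance, len(x) >= 3)."""
    d = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1.0 - (d - 2) / norm_sq
    return [shrink * v for v in x]


def squared_error(estimate, truth):
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))


def compare_risks(theta, n_trials=2000, seed=0):
    """Average squared error of the raw observation vs. the James-Stein estimate."""
    rng = random.Random(seed)
    mle_total = 0.0
    js_total = 0.0
    for _ in range(n_trials):
        # One noisy observation of the true mean vector.
        x = [rng.gauss(t, 1.0) for t in theta]
        mle_total += squared_error(x, theta)  # unbiased estimate: x itself
        js_total += squared_error(james_stein(x), theta)
    return mle_total / n_trials, js_total / n_trials


theta = [0.5] * 10  # hypothetical true mean, fairly close to the shrinkage point
mle_risk, js_risk = compare_risks(theta)
print(f"unbiased estimate: {mle_risk:.2f}, James-Stein: {js_risk:.2f}")
```

With a true mean this close to zero, the shrunk estimate should show a clearly lower average squared error. The gain gets smaller as `theta` moves further from the point you shrink towards.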
On a related note, for some as yet unknown reason, neural networks that have way more parameters than is necessary in a single layer seem to generalize better than networks with few parameters. See for example “Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks”. It seems to be related to the fact that we’re not actually finding the optimal configuration of parameters with gradient descent, but only approximating it. So the standard statistical analysis that you outlined here doesn’t apply in that case.

Adrià Garriga-alonso 13 Feb 2019 23:54 UTC
1 point
on: Three Kinds of Research Documents: Clarification, Explanatory, Academic
What about short technical reports / forum posts like the original “All Mathematicians are Trollable: Divergence of Naturalistic Logical Updates”? It’s short, doesn’t have any references, doesn’t hold your hand with much background like academic articles do. On the other hand it contains more detailed information than “An Untrollable Mathematician Illustrated”.
Is this a clarification post because it’s the first example of an idea?

Adrià Garriga-alonso 14 Feb 2019 17:12 UTC
1 point
on: Allowing a formal proof system to self improve while avoiding Lobian obstacles.
I’ve spotted a mistake, I think: the relationship inside the first prover should be an if and only if relationship. This is because if $p_{2}$ says everything is a theorem, then $p_{2} [s] ⟹ p_{1} [s]$ trivially holds. Thus, the condition to give $p_{2}$ “proof privileges” should be $p_{1} [p_{2} [s] ⟺ p_{1} [s]]$ . There might still be some problems from the modal logic, I’ll check when I have some more time.
I can’t believe that I was the first one to spot this. The post also has very few upvotes. Did nobody that knows this stuff see this and spot the mistake immediately? Was the post dismissed to start with? (in my opinion unfairly, but perhaps not).

Adrià Garriga-alonso 18 Feb 2019 0:26 UTC
7 points
on: Is voting theory important? An attempt to check my bias.
After reading the title, my main objections to voting theory were:
- The theory is already understood well enough, what is hard is convincing existing institutions to change.
- Convincing existing institutions to change is really hard, so there’s not much point in advancing the theory
Though I do agree that public elections are a process by which a huge amount of resources gets allocated, in many of the world’s richest countries, so it’s an important problem.
You make some arguments against these two:
>The debate among reform activists between various voting methods (IRV, approval, Condorcet, score, STAR, etc.; as well as the understanding of proportional representation) has progressed substantially (emphasis mine) in the 20 years that I’ve been a part of it, and I think it can progress further.
Can you point to an example of this? Something the understanding of which has improved in the last 20 years. Also, are activists and academia one and the same? Is this improved understanding because new theory was developed, or because the activists started understanding theory that had already been developed in academia?
>Voting reform has happened before in various contexts, and it should be expected that sooner or later and somewhere or other it will happen again. Will it happen in the particular ways and places I’d like it to? There’s no way to be sure either way, but I’d say that the probability is certainly over 1%,
The key question is not whether it will happen somewhere with a probability of at least 1%, but rather whether you or the people you inspire can move the needle with a probability of 1 in a million (or less), in a place that’s sufficiently big (or even bigger). Are you thinking of influencing the US Gov in particular? That would certainly qualify as a big entity. What wins has the voting reform movement achieved in the past few years? (I suppose the #1 USA example is Fargo, in North Dakota, which passed approval voting.)
(There’s probably a lot of people unhappy with voting in some way, so if you can convince them that your proposal is going to make their group more powerful, maybe it’s not so hard).

Adrià Garriga-alonso 19 May 2019 21:18 UTC
4 points
on: Interpretations of “probability”
>Frequentist statistics were invented in a (failed) attempt to keep subjectivity out of science in a time before humanity really understood the laws of probability theory
I’m a Bayesian, but do you have a source for this claim? It was my understanding that Frequentism was mostly promoted by Ron Fisher in the 20th century, well after the work of Bayes.
Synthesised from Wikipedia:
While the first cited frequentist work (the weak law of large numbers, 1713, Jacob Bernoulli, Frequentist probability) predates Bayes’ work (edited by Price in 1763, Bayes’ Theorem), it’s not by much. Further, according to the article on “Frequentist Probability”, “[Bernoulli] is also credited with some appreciation for subjective probability (prior to and without Bayes theorem).”
The ones that pushed frequentism in order to achieve objectivity were Fisher, Neyman and Pearson. From “Frequentist probability”: “All valued objectivity, so the best interpretation of probability available to them was frequentist”. Fisher did other nasty things, such as using the fact that causality is really hard to soundly establish to argue that tobacco was not proven to cause cancer. But nothing indicates that this was done out of not understanding the laws of probability theory.
>AI scientists use the Bayesian interpretation
Sometimes yes, sometimes not. Even Bayesian AI scientists use frequentist statistics pretty often.
This post makes it sound like frequentism is useless, and that is not true. The concepts of a stochastic estimator for a quantity, its bias and its variance were developed by frequentists to look at real-world data. AI scientists use them to analyse algorithms like gradient descent or approximate Bayesian inference schemes, so the tools are definitely useful.
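To make “bias” and “variance” of an estimator concrete, here is a small sketch using a textbook example that is my own choice, not from the post: estimating the variance of a standard Gaussian from 5 samples, dividing either by n or by n - 1. The sample size, trial count and names are illustrative assumptions.

```python
import random
import statistics


def simulate(n_samples=5, n_trials=5000, seed=0):
    """Repeatedly draw small datasets and apply two variance estimators to each."""
    rng = random.Random(seed)
    biased, unbiased = [], []
    for _ in range(n_trials):
        data = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]  # true variance is 1
        biased.append(statistics.pvariance(data))   # divides by n
        unbiased.append(statistics.variance(data))  # divides by n - 1
    return biased, unbiased


biased, unbiased = simulate()
for name, estimates in [("divide by n", biased), ("divide by n - 1", unbiased)]:
    mean = statistics.mean(estimates)          # compare with the true value 1 to see the bias
    spread = statistics.pvariance(estimates)   # variance of the estimator itself
    print(f"{name}: mean {mean:.3f}, variance {spread:.3f}")
```

The divide-by-n estimator should average below the true value of 1 (it is biased) while fluctuating less from dataset to dataset (it has lower variance), which is the trade-off this vocabulary is meant to describe.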

Adrià Garriga-alonso 19 May 2019 21:21 UTC
1 point
in reply to: ryan_b’s comment on: Interpretations of “probability”
I clicked this because it seemed interesting, but reading the Q&A:
>In a typical game we consider, one player offers bets, another decides how to bet, and a third decides the outcome of the bet. We often call the first player Forecaster, the second Skeptic, and the third Reality.
How is this any different from the classical Dutch Book argument, that unless you maintain beliefs as probabilities you will inevitably lose money?