Unsure about this. Isn’t Qwen on the Chatbot Arena leaderboard, and isn’t it made by Alibaba?
sanxiyn
No. Traditionally, donors have no standing to sue a charity. From https://www.thetaxadviser.com/issues/2021/sep/donor-no-standing-sue-donor-advised-fund.html
California limits by statute the persons who can sue for mismanagement of a charitable corporation’s assets. The court found that the claims raised by Pinkert for breach of a fiduciary duty for mismanagement of assets were claims for breach of a charitable trust. The court determined that under California law, a suit for breach of a charitable trust can be brought by the attorney general of California...
The patent is not yet granted.
Someone from South Korea is extremely skeptical and wrote a long thread going into the paper’s details on why it must be 100% false: https://twitter.com/AK2MARU/status/1684435312557314048. Sorry, it’s in Korean, but we live in the age of miraculous, serviceable machine translation.
But it wasn’t until the 1940s and the advent of the electronic computer that they actually built a machine that was used to construct mathematical tables. I’m confused...
You are confused because that is not the reality. As you can read in Wikipedia’s entry on the difference engine, Scheutz built a difference engine derivative, sold it, and it was used to create logarithmic tables.
You must have read this while writing this article. It is prominent in the Wikipedia article in question and hard to miss. Why did you make this mistake? If it was deliberate misleading for narrative convenience, I am very disappointed. Yes, reality is rarely narratively convenient, but you shouldn’t lie about it.
My median estimate has been 2028 (so 5 years from now). I first wrote down 2028 in 2016 (so 12 years out at the time), and in the 7 years since, I have barely moved the estimate. Things roughly happened when I expected them to.
I am curious how this fine-tuning for function calling was done, because it is user controllable. In the OpenAI API, if you pass “none” as the function_call parameter, the model never calls a function. There seem to be one input bit and one output bit, for “you may want to call a function” and “I want to call a function”.
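Concretely, the “input bit” corresponds to a request fragment like the following sketch of the Chat Completions API (the get_weather function and its schema are made up for illustration):

```json
{
  "model": "gpt-4-0613",
  "messages": [{"role": "user", "content": "What is the weather in Seoul?"}],
  "functions": [
    {
      "name": "get_weather",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}}
      }
    }
  ],
  "function_call": "none"
}
```

With "function_call": "none" the model must answer in plain text; with "auto" the model itself flips the “output bit” by returning a function_call object instead of content.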
While I agree that being led by someone who is aware of AI safety is a positive sign, I note that OpenAI is led by Sam Altman, who similarly showed awareness of AI safety issues.
I did the obvious thing and it worked? I have a suspicion you haven’t tried hard enough, but indeed we all have comparative advantages.
Click the link, which is https://twitter.com/BAAIBeijing
Click the link on the first tweet, to the conference website, which is https://2023.baai.ac.cn/
The website is titled “2023 北京智源大会” (2023 Beijing Academy of Artificial Intelligence Conference); copy the title to the clipboard
Type https://www.bilibili.com/ into the address bar, everyone knows that’s where Chinese videos are
Paste “2023 北京智源大会” into the search box and press the enter key
Click and watch 麻省理工Max Tegmark教授: 把AI关在受控的笼子里 (2023北京智源大会开幕式主旨演讲), which translates to “MIT Professor Max Tegmark: Keeping AI in a Controlled Cage (opening keynote of the 2023 BAAI Conference)”
The parallelization part (data parallelism, tensor parallelism, pipeline parallelism, ZeRO) is completely standard. See Efficient Training on Multiple GPUs by Hugging Face for a standard description. The failure recovery part is relatively unusual.
That is trivial to program? For example, you could have an AutoGPT UI which lists pending tasks with icons next to them, where clicking a trashcan icon completely erases the task from the context. That doesn’t need any LLM-level help like LEACE.
What do you mean? Current LLMs are stateless. If an attempt to solve the task fails, just reset the history and retry.
There is no problem with an air gap. Public key cryptography is a wonderful thing. Let there be a license file, which is a signed statement of a hardware ID and the duration for which the license is valid. You need the private key to produce a license file, but the public key can be used to verify it. Publish a license server which can verify license files and can be run inside air-gapped networks. Done.
I note that this is how Falcon from Abu Dhabi was trained. To quote:
Falcon is a 40 billion parameters autoregressive decoder-only model trained on 1 trillion tokens. It was trained on 384 GPUs on AWS over the course of two months.
I think bow and arrow is powerful enough and gun is not necessary.
As an example of a question specific enough to be answerable by science, there is Is Pornography Use Associated with Sexual Difficulties and Dysfunctions among Younger Heterosexual Men? (2015). It begins:
Recent epidemiological studies reported high prevalence rates of erectile dysfunction (ED) among younger heterosexual men (≤40). It has been suggested that this “epidemic” of ED is related to increased pornography use. However, empirical evidence for such association is currently lacking.
The answer is no. As far as I know, this was among the first studies powerful enough to answer this question. Well done, science!
Of course, nobody listens to science. Compare the introduction above with another introduction written 4 years later, from Is Pornography Use Related to Erectile Functioning? (2019):
Despite evidence to the contrary, a number of advocacy and self-help groups persist in claiming that internet pornography use is driving an epidemic of erectile dysfunction (ED).
The shift in tone is palpable; you can just feel the researchers’ powerlessness about the situation.
Since the topic of chess was brought up: I think the right intuition pump is the endgame tablebase, not moves played by AlphaZero. A quote from Wikipedia about the KRNKNN mate-in-262 discovered by an endgame tablebase:
Playing over these moves is an eerie experience. They are not human; a grandmaster does not understand them any better than someone who has learned chess yesterday. The knights jump, the kings orbit, the sun goes down, and every move is the truth. It’s like being revealed the Meaning of Life, but it’s in Estonian.
I agree timescale is a good way to think about this. My intuition is that if high school math problems are 1, then IMO math problems are 100 (1e2) and typical research math problems are 10,000 (1e4). So IMO is exactly halfway, on a log scale! I don’t have first-hand experience with the hardest research math problems, but from what I have heard about their timescale, they seem to reach 1,000,000 (1e6). I’d rate typical practical R&D problems 1e3 and transformative R&D problems 1e5.
Edit: Using this scale, I rate GPT-3 at 1 and GPT-4 at 10. This suggests GPT-5 for IMO, which feels uncomfortable to me! Thinking about this, I think that while there is lots of 1-data and 10-data, there is considerably less 100-data and above, since most such things are not written down. But maybe that is an excuse and it doesn’t matter.
I kind of disagree. (I was on the South Korean IMO team.) I agree that IMO problems are in a category of tasks closer to research math than to high school math, but since IMO problems are intended to be solvable within a time limit, there is a (quite low, in an absolute sense) upper limit to their difficulty. Basically, the intended solution is no longer than a single page. Research math problems have no such limit and can be arbitrarily difficult, or have an arbitrarily long solution.
Edit: Apart from the time limit, length limit, and difficulty limit, another important aspect is that IMO problems are already solved, so they are known to be solvable. IMO problems are “Prove X”. Research math problems, even when stated as “Prove X”, are really “Prove or disprove X”, and sometimes this matters.
I don’t think the post is saying the result is not valuable. The claim is that it underperformed expectations. Stock prices fall when companies underperform expectations, even if they are profitable; that does not mean they made a loss.