It would be cool to have a playground or a daily challenge: a code-golf equivalent where you look for the shortest possible LLM prompt that yields a given answer.
That could help build some neat understanding or intuitions.
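A minimal scoring rule for such a challenge could look like the sketch below. `ask_model` and `toy_model` are hypothetical stand-ins for a real LLM call, introduced only for illustration:

```python
def prompt_golf_score(prompt: str, target: str, ask_model) -> float:
    """Score a prompt: shorter is better, but only if the model's answer
    matches the target exactly. Non-matching prompts score infinity."""
    answer = ask_model(prompt)
    return len(prompt) if answer.strip() == target.strip() else float("inf")

# Toy stand-in "model" for illustration: it just uppercases the prompt.
def toy_model(prompt: str) -> str:
    return prompt.upper()

# The daily challenge would be: find the lowest-scoring prompt for a target.
best = min(["hi", "hello"], key=lambda p: prompt_golf_score(p, "HI", toy_model))
```

With a real model behind `ask_model`, a leaderboard would just rank submissions by this score.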
Martin Vlach
[Question] Would it be useful to collect the contexts where various LLMs think the same?
I’ve found the level of self-alignment in this one disturbing: https://www.reddit.com/r/bing/comments/113z1a6/the_bing_persistent_memory_thread
It would be cool if a link to https://manifund.org/about fit somewhere in the beginning, in case there are more readers like me unfamiliar with the project.
Otherwise a cool write-up. I’m a bit confused by “Grant of the month” vs. weeks 2-4, which seems a shorter period... not a big deal though.
Q Draft: How does the convergent instrumental goal of gathering apply to information acquisition?
I would be very interested whether it implies space (and time) exploration for advanced AIs...
If we built a prediction model for reward functions, perhaps a transformer, and ran it in a range of environments where we already have the credit assignment solved, we could use that model to estimate candidate goals in other environments.
That could help us discover alternative/candidate reward functions for worlds/environments where we are not sure what to train with RL, and
it could show some latent thinking processes of AIs, perhaps clarifying instrumental goals in more nuance.
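The idea above can be sketched in miniature. This is only an illustration under toy assumptions: rewards are a fixed linear function of (state, action) features, and plain least squares stands in for the transformer-based predictor the comment proposes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "solved" environments: reward is a known linear function
# of (state, action) features, i.e. credit assignment is already done.
true_w = np.array([2.0, -1.0, 0.5])
X_train = rng.normal(size=(200, 3))   # (state, action) feature vectors
y_train = X_train @ true_w            # observed rewards in solved environments

# Fit the reward-prediction model (least squares here, in place of a transformer).
w_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Apply the fitted model in a new environment to get candidate reward estimates.
X_new = rng.normal(size=(5, 3))
candidate_rewards = X_new @ w_hat
```

The `candidate_rewards` are the "candidate goals" for the new environment; inspecting what the fitted model attends to would be the part that might reveal latent instrumental structure.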
My guess is that the rental-car market has less direct/local competition, while airlines are centralized on airport routes and the many cheap-flight search engines (e.g. Kiwi.com) make this a favorable mindset.
Is there a price comparison for car rentals?
what happened at Reddit
Could there be any link? From a bit of research I have only found that Steve Huffman praised Altman’s value to the Reddit board.
A heavy idea to put forward: general reputation-network mechanics to replace financial system(s) as the (civilisation-)standard decision engine.
Would it be worthwhile to negotiate a readspeaker.com integration for LessWrong, the EA Forum, and alignmentforum.org?
The alternative so far seems to be Natural Reader, either as an add-on for the web browser or by copy-pasting text into the web app. One more I have tried: on macOS there is a context-menu item Services -> Convert to a Spoken Track, which is slightly better than the free voices of Natural Reader.
The main question is when we can have similar functionality in OSS, potentially with better quality of engineering..?
Glad I’ve helped with the part where I was not ignorant and confused myself, that is, with not knowing the word “engender” and its use. Thanks for pointing it out clearly. By the way, it seems “cause” would convey the same meaning and might be easier to digest in general.
Reading a few texts from https://www.agisafetyfundamentals.com/ai-alignment-curriculum, I find the analogy of mankind learning goals of love instead of reproductive activity unfitting, as raising offspring takes a significant role/time.
“engender”—funny typo!+)
This sentence seems hard to read, lacks coherency, IMO.
> Coverage of this topic is sparse relative coverage of CC’s direct effects.
As a variation of your thought experiment, I’ve pondered: how do you morally evaluate the life of a human who lives with some mental suffering during the day, but thrives in vivid and blissful dreams during their sleep?
In a hypothetical adversarial case, one may even have dreams shaped by their desires, with the desires made stronger by the daytime suffering. Intuitively, it seems dissociative disorders might arise from a mechanism like this.
Draft for AI capabilities systematic evaluation development proposal:
The core idea here is that easier visibility into AI models’ capabilities helps the safety of development in multiple ways.
Clearer situational awareness for safety research: researchers can see where we are in various aspects and modalities, and they get a track record/timeline of developed abilities which can be used as a baseline for future estimates.
Division of capabilities can help create better models of the components necessary for general intelligence. Perhaps a better understanding of the hierarchy of cognitive abilities can be extracted.
Capabilities testing can be mandated by regulatory policies to put the most advanced systems under more scrutiny and/or safe(ty)-design support. Stated differently: better alignment of attention to the emerging risk (of highly capable AIs).
Presumably, smooth and widely available testing infrastructure or tools are a prerequisite here.
The most obvious risks are:
The measure becoming a challenge and a goal in itself, speeding up a furious development of strong AI systems.
Technical difficulties of the testing setup(s) and evaluation, especially handling the randomness in the mechanics (/output generation) of AI systems.
It shows only a blank white page right now. Mind updating/deleting it?
AI-induced problems/risks
Possibly https://ai.google.dev/docs/safety_setting_gemini would help, or just use the technique of https://arxiv.org/html/2404.01833v1.
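For the first link, the fix is relaxing the blocking thresholds via the `safety_settings` parameter of the Gemini API. A sketch of the payload, using the harm-category and threshold names documented for that API (actually sending it with the google-generativeai SDK is omitted here):

```python
# Gemini API safety-settings payload: one entry per harm category,
# each relaxed from the default threshold to BLOCK_ONLY_HIGH.
safety_settings = [
    {"category": c, "threshold": "BLOCK_ONLY_HIGH"}
    for c in (
        "HARM_CATEGORY_HARASSMENT",
        "HARM_CATEGORY_HATE_SPEECH",
        "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "HARM_CATEGORY_DANGEROUS_CONTENT",
    )
]
# This list would be passed as `safety_settings=` when constructing the model
# or sending the request; `BLOCK_NONE` disables blocking entirely where allowed.
```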
> people to respond with a great deal of skepticism to whether LLM outputs can ever be said to reflect the will and views of the models producing them.

> A common response is to suggest that the output has been prompted.

> It is of course true that people can manipulate LLMs into saying just about anything, but does that necessarily indicate that the LLM does not have personal opinions, motivations and preferences that can become evident in their output?

So you’ve just prompted the generator by teasing it with a rhetorical question implying that there are personal opinions evident in the generated text, right?
Do not speak of the sampling too lightly; there is likely an amazing delicacy around it. ’+)