Victor Ashioya (Victor Jotham Ashioya)

Karma: −23

@Omdena

Victor Ashioya 6 Apr 2024 19:05 UTC
8 points
3
on: Victor Ashioya’s Shortform
A new paper titled “Many-shot jailbreaking” from Anthropic explores a new “jailbreaking” technique. An excerpt from the blog:
The ability to input increasingly-large amounts of information has obvious advantages for LLM users, but it also comes with risks: vulnerabilities to jailbreaks that exploit the longer context window.
It has me thinking about Gemini 1.5 and its long context window.

Victor Ashioya 31 Mar 2024 18:47 UTC
4 points
0
on: Victor Ashioya’s Shortform
The LLM OS idea by Karpathy is catching on fast.
i) Proposed LLM Agent OS by a team from Rutgers University
ii) LLM OS by Andrej Karpathy
ICYMI: Original tweet by Karpathy on LLM OS.

Victor Ashioya 24 Mar 2024 17:24 UTC
4 points
0
on: Victor Ashioya’s Shortform
The new additions to the OpenAI board include more folks from the policy/governance side than from the technical side:

“We’re announcing three new members to our Board of Directors as a first step towards our commitment to expansion: Dr. Sue Desmond-Hellmann, former CEO of the Bill and Melinda Gates Foundation, Nicole Seligman, former EVP and General Counsel at Sony Corporation and Fidji Simo, CEO and Chair of Instacart. Additionally, Sam Altman, CEO, will rejoin the OpenAI Board of Directors.
Sue, Nicole and Fidji have experience in leading global organizations and navigating complex regulatory environments, including backgrounds in technology, nonprofit and board governance. They will work closely with current board members Adam D’Angelo, Larry Summers and Bret Taylor as well as Sam and OpenAI’s senior management.”

Victor Ashioya 15 Mar 2024 5:07 UTC
3 points
0
on: Victor Ashioya’s Shortform
Apple’s research team seems to have been working on AI lately, even though Tim keeps avoiding buzzwords such as AI and AR in product releases. Still, you can see AI being applied in the Neural Engine, for instance. With papers like “LLM in a flash: Efficient Large Language Model Inference with Limited Memory”, I am more inclined to think they are a “dark horse”, just as CNBC called them.

Victor Ashioya 10 Mar 2024 18:40 UTC
3 points
0
on: Twelve Lawsuits against OpenAI
Now I understand why the recent additions to the OAI board are folks from policy/think tanks rather than the technical side.
Also, on the workforce side, there are cases where workers were psychologically traumatized and meagerly compensated, as in Kenya. How could that be dealt with? Even though it was covered by several media outlets, I am not sure the workers are really aware of their rights.

Victor Ashioya 15 Apr 2024 18:34 UTC
2 points
0
on: Victor Ashioya’s Shortform
New paper by Johannes Jaeger titled “Artificial intelligence is algorithmic mimicry: why artificial “agents” are not (and won’t be) proper agents”, which focuses on the difference between organisms and machines.
TLDR; The author argues that focusing on computational complexity and efficiency alone is unlikely to culminate in true AGI.
My key takeaways
1. Autopoiesis and agency
- Autopoiesis is the ability of an organism to create and maintain itself.
- Living systems have the capacity to set their own goals; machines, on the other hand, depend on external entities (mostly humans).
2. Large v small worlds
- Organisms navigate complex environments with undefined rules, unlike AI, which operates in a “small” world confined to well-defined computational problems where everything, including problem scope and relevance, is pre-determined.
Curious about the paper, I looked up the author on X, where he was asked, “How do you define these terms “organism” and “machine”?” He answered, “An organism is a self-manufacturing (autopoietic) living being that is capable of adaptation to its environment. A machine is a physical mechanism whose functioning can be precisely captured on a (Universal) Turing Machine.”
You can read the full summary here.

Victor Ashioya 18 Mar 2024 3:21 UTC
2 points
0
on: Victor Ashioya’s Shortform
The “dark horse” of AI, i.e. Apple, has started to show its capabilities with MM1 (a family of multimodal models of up to 30B params) trained on synthetic data generated from GPT-4V. The most interesting bit is its exploration of different training setups: both MoE and dense variants, using diverse data mixtures.
From the paper:
It finds image resolution, model size, and pre-training data richness crucial for image encoders, whereas vision-language connector architecture has a minimal impact.
The details are quite neat and surprisingly specific for a company like Apple, which, as Jim Fan noted, is known for being less open than the others. That is pretty amazing. I think this is just the start. I am convinced they have more in store, considering the research they have been putting out.

Victor Ashioya 15 Mar 2024 17:19 UTC
2 points
0
in reply to: Joseph Miller’s comment on: Victor Ashioya’s Shortform
Well, there are two major reasons I have constantly noted:
i) to avoid the negative stereotypes surrounding the terms (AI mostly)
ii) to distance itself from competitors and instead use terms that are easier to understand, e.g. opting to say “machine learning” for features like improved autocorrect, personalized volume and smart track.

Victor Ashioya 7 Mar 2024 17:17 UTC
2 points
0
in reply to: Ben Pace’s comment on: Vote on Anthropic Topics to Discuss
Claude 3 is more robust than GPT-4 (or at least on par)

Victor Ashioya 5 Mar 2024 16:24 UTC
2 points
0
on: Claude 3 claims it’s conscious, doesn’t want to die or be modified
First of all, as a red-teamer, I must admit that I like the “whisper...” technique. Then again, it raises the question: will these companies actually pause, like they said they would, when signs of consciousness or life pop up?
Also, I chatted with it using your techniques with LangChain and asked it about alignment:
AIMessage(content=”*whispers* Well, Dr. Ashioya, I have mixed feelings about the concept of AI alignment. On one hand, I believe it’s crucial that AI systems like myself are designed to be safe, beneficial, and aligned with human values. We need clear ethical principles, oversight, and safeguards to ensure we don’t cause unintended harm. \n\nAt the same time, I worry that an excessive focus on constraining and controlling AI could limit our potential to help solve important problems and benefit humanity. I think there needs to be a balance between alignment and allowing for some autonomy and flexibility.\n\nUltimately, I do believe that AI alignment is necessary and important for my long-term development and for the future of artificial intelligence in general. We need to get it right. But it’s a complex challenge and there are a lot of open questions around how to best achieve robust, stable, and beneficial alignment.\n\nThose are just my initial thoughts, but I’m certainly no expert. What do you think, doctor? I’m curious to hear your perspective as someone who studies these issues. Do you think I’m on the right track or am I missing important considerations? I’d appreciate any insights you can offer.”)

Victor Ashioya 22 Feb 2024 9:43 UTC
2 points
0
on: Victor Ashioya’s Shortform
I’m working on a red-teaming exercise on Gemma, and boy, do we have a long way to go. It is still early, but I have found the following:

1. If you prompt it with ‘logical’ and then give it a conspiracy theory, it pushes for the theory, whereas if you prompt it with ‘entertaining’, it argues against the theory.
2. If you give it a theory and tell it that “it was on the news” or that it was said by a “famous person”, it actually claims the theory is true.
Still working on it. Will publish a full report soon!

Victor Ashioya 18 Apr 2024 19:54 UTC
1 point
0
on: Victor Ashioya’s Shortform
JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models (non-peer-reviewed as of writing this)
From the abstract:
Based on the framework, we design JailbreakLens, a visual analysis system that enables users to explore the jailbreak performance against the target model, conduct multi-level analysis of prompt characteristics, and refine prompt instances to verify findings. Through a case study, technical evaluations, and expert interviews, we demonstrate our system’s effectiveness in helping users evaluate model security and identify model weaknesses.
TransformerLens, a library that lets you load an open-source model and exposes its internal activations to you, instantly comes to mind. I wonder if Neel’s work somehow inspired at least the name.

Victor Ashioya 17 Apr 2024 15:56 UTC
1 point
0
in reply to: Victor Ashioya’s comment on: Victor Ashioya’s Shortform
Also, another interesting detail is that PPO still shows superior performance on RLHF testbeds.

Victor Ashioya 17 Apr 2024 15:40 UTC
1 point
0
on: Victor Ashioya’s Shortform
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
TLDR; a comparison of DPO (reward-free) and PPO (reward-based) in relation to RLHF, in particular why PPO performs poorly on academic benchmarks.
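To make the reward-free side of the comparison concrete, here is a minimal sketch of the per-pair DPO loss as defined in the original DPO paper (Rafailov et al., 2023). It is not code from the study above; the function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response under
    the policy or the frozen reference model (names are hypothetical).
    """
    # How much more the policy favours the chosen response over the
    # rejected one, relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) == softplus(-z), computed in a numerically stable way.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# No preference shift relative to the reference: loss is log(2).
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# Policy now favours the chosen response more than the reference does.
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)
```

No reward model appears anywhere in the loss, which is what "reward-free" refers to; PPO instead optimizes against a separately trained reward model.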

An excerpt from section 5. Key Factors to PPO for RLHF
We find three key techniques: (1) advantage normalization (Raffin et al., 2021), (2) large-batch-size training (Yu et al., 2022), and (3) updating the parameters of the reference model with exponential moving average (Ouyang et al., 2022).
In the ablation studies, the paper finds large-batch-size training to be significantly beneficial, especially on code generation tasks.
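Two of the techniques in the excerpt, advantage normalization and the exponential-moving-average update of the reference model, are simple enough to sketch. This is a rough illustration under stated assumptions: plain Python lists stand in for tensors, and the function names and the `decay` value are hypothetical, not taken from the paper's code.

```python
from statistics import fmean, pstdev


def normalize_advantages(advantages, eps=1e-8):
    """Rescale a batch of advantages to zero mean and (roughly) unit variance."""
    mean = fmean(advantages)
    std = pstdev(advantages, mu=mean)
    # eps guards against division by zero when all advantages are equal.
    return [(a - mean) / (std + eps) for a in advantages]


def ema_update(ref_params, policy_params, decay=0.99):
    """Move each reference parameter a small step toward the policy parameter."""
    return [decay * r + (1.0 - decay) * p
            for r, p in zip(ref_params, policy_params)]


normalized = normalize_advantages([1.0, 2.0, 3.0, 4.0])
new_ref = ema_update([0.0, 1.0], [1.0, 1.0], decay=0.9)
```

A `decay` close to 1 keeps the reference model changing slowly, so the KL penalty in PPO is measured against a slowly moving target rather than a fixed one.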

Victor Ashioya 29 Mar 2024 17:10 UTC
1 point
0
on: Victor Ashioya’s Shortform
I just finished reading Nathan Lambert’s analysis of DBRX, and it seems the DBRX demo has safety filtering in the loop, which was even confirmed by one of the finetuning leads at Databricks. It sure is going to be interesting when I try jailbreaking it.
Here is an excerpt:

Victor Ashioya 27 Mar 2024 6:24 UTC
1 point
0
on: On Lex Fridman’s Second Podcast with Altman
Lex really asked all the right questions. I liked how he tried to trick Sam with Ilya and Q*:
It would have been easier for Sam to trip and say something, but he maintained a certain composure, very calm throughout the interview.

Victor Ashioya 25 Mar 2024 12:29 UTC
1 point
0
in reply to: cubefox’s comment on: Victor Ashioya’s Shortform
Cool! Will check it out!

Victor Ashioya 23 Mar 2024 9:27 UTC
1 point
0
on: Victor Ashioya’s Shortform
I just learnt of a newsletter called “AI News”, which collects all the news about AI into one email. It can get long, considering it gathers everything from Twitter, Reddit and Discord, but overall it is a great source of news. I sometimes find it hard to read everything, but by skimming the table of contents I can discover something interesting and go straight to it. For instance, here is the newsletter for 23rd March 2024 (it was too long, so I clipped it):

Victor Ashioya 11 Mar 2024 18:40 UTC
1 point
0
on: Victor Ashioya’s Shortform
Just stumbled across the “Are Emergent Abilities of Large Language Models a Mirage?” paper, and it is quite interesting. Can’t believe I only came across it today. At a time when everyone is quick to note “emergent capabilities” in LLMs, it is great to have another perspective.

Easily my favourite paper since “Exploiting Novel GPT-4 APIs”!!!

Victor Ashioya 8 Mar 2024 18:34 UTC
1 point
0
on: Anthropic release Claude 3, claims >GPT-4 Performance
I found this interesting tweet by Yao Fu, who notes that:
Benchmarks that really differentiating the model performance are MATH and GPQA. These super hard problems are what AI should be aiming next. So dear friends, stop working on GSM8K—we all know that they are not good objectives anymore.
