Thinking Partners: Building AI-Powered Knowledge Management Systems

This illustration is a satire of how we use AI today

TL;DR: Knowledge work has four main stages: capturing, retrieving, processing, and sharing. Without proper systems, each stage has bottlenecks:

  • Capturing: We forget 50% of new information within a day, 90% within a week. Manual capture breaks flow

  • Retrieving: The little information we do store is scattered across folders and applications

  • Processing: Complex thinking requires more, and better-organized, context than we can gather manually

  • Sharing: We reconstruct the same context over and over—for knowledge sharing (papers, posts, presentations), routine communication (emails, messages), and for AI conversations

Personal knowledge management systems like Zettelkasten, Obsidian or Notion partially address those bottlenecks but often require unsustainable manual discipline to maintain.

I’m building tools (a memory system and interfaces connecting to it) that make capturing and managing knowledge effortless and rewarding, transforming scattered notes, conversations, and ideas into an active thinking partner—one that processes knowledge continuously and constructs intelligent context for any task.

Intro

Niklas Luhmann revolutionized sociology using his Zettelkasten (German for “slip box”), a paper-based system of interconnected notes. He published more than 70 books and hundreds of articles. But his achievement wasn’t only productivity; it was also intellectual reach. His Zettelkasten allowed him to think at a scale and complexity that would be hard to achieve using only biological memory.

In his essay Communicating with Slip Boxes, he described how “a kind of secondary memory will arise, an alter ego with who we can constantly communicate”. It surprised him by revealing unexpected connections between ideas separated by years, or that occurred to him in very different contexts. As he puts it, the slip box provided “combinatorial possibilities which were never planned, never preconceived.”

The Problem

Building a Zettelkasten like Luhmann’s requires decades of disciplined manual work. I don’t have the time or the discipline to build one the way he did, and neither do most researchers or knowledge workers. Even dedicated personal knowledge management enthusiasts struggle to maintain their systems consistently.

Knowledge work happens mostly in four stages: capturing, retrieving, processing, and sharing. Without proper systems, each stage has bottlenecks:

Capturing: Most knowledge disappears before we document it.

The forgetting curve shows we forget approximately 50% of new information within a day and 90% within a week if it isn’t captured. Insights from conversations, connections noticed while reading, arguments developed in discussions—most disappear before we document them. Manual capture systems require us to stop what we are doing to take notes, breaking flow and slowing thinking.

Retrieving: Finding saved information takes too long.

Even when we capture knowledge, retrieval is inefficient. Information is scattered across folders, applications, documents, voice notes, and messages, with no easy way to retrieve it. Retrieving insights or ideas often requires painstakingly checking multiple sources and losing precious time.

Processing: Complex thinking requires more context than we can gather.

Without proper knowledge management systems, complex thinking is almost impossible: synthesizing insights across dozens of sources, tracking how ideas evolved through months of conversations, seeing patterns that only emerge when concepts are juxtaposed in unexpected ways. The insights exist somewhere in our notes, but we can’t bring enough of them together simultaneously to do sophisticated thinking.

Sharing: We reconstruct the same context repeatedly.

Every time we communicate ideas, we manually reconstruct context from scratch. A blog post, a grant application, a student lecture, … all draw from the same underlying ideas but require hours to extract and re-sequence them differently. The first time might deepen your understanding, but the fifth time is probably just repetitive work.

Routine communication has the same problem. We lose hours weekly on emails and messages, hunting through Slack threads and email chains while context switching between tasks, manually reconstructing what’s relevant for each update or response.

AI conversations make this worse. Crafting effective prompts requires manually gathering and compiling the right context—work that takes time and expertise most people don’t have. Traditional RAG (Retrieval Augmented Generation) systems dump raw, unorganized documents into context, making them nearly useless for supporting complex thinking. Worse, more context actively degrades performance: research on 18 leading models shows systematic performance drops of 20-40 percentage points as context length increases, with models struggling most when critical information is buried in the middle.

To increase our individual capabilities, we need PKM (Personal Knowledge Management) tools that address these bottlenecks without requiring decades of manual discipline. Current solutions fail at this. Manual PKM systems like Notion, Obsidian, or even Zettelkasten require unsustainable discipline to maintain, and only partially solve the issues. AI assistants have poor memory and degrade with long context, and RAG worsens performance and doesn’t help with complex thinking.

What I’m building

I’m building a memory system inspired by the memex[1] and interfaces connecting to it.

The initial design of the memory system (It needs a name, and, to coin one at random, ‘noonet’[2] will do) will be based on the memex’s associative trails, the Zettelkasten[3], and other successful PKM systems. It will leverage LLMs to handle what previously required manual discipline: continuously processing what the interfaces feed it, summarizing content, linking concepts across contexts, surfacing unexpected connections, building hierarchies, and actively reorganizing as the user’s knowledge evolves.

The post-MVP vision is to evolve this into something more dynamic and closer to ALife[4], where knowledge structures adapt and self-organize in ways that go beyond explicit programming.

The interfaces fall into two categories:

  1. Capture: interfaces for capturing everything read, written, discussed, and worked on by the user. It will connect to existing tools (Google Drive, Notion, Slack, email, messaging apps) and continuously feed data to the noonet.

  2. Context building: interfaces that allow AIs or humans to extract information from the noonet for different tasks.
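To make the split concrete, here is a minimal sketch of the two interface categories as Python protocols. Since the architecture is not settled, everything here is an assumption for illustration: the names `Item`, `CaptureInterface`, `ContextInterface`, `Noonet`, `StaticCapture`, and `KeywordContext` are hypothetical, and the keyword matching is a naive stand-in for whatever the real context builder would do.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Item:
    """One captured piece of content (hypothetical shape)."""
    source: str  # e.g. "slack" or "email"
    text: str


class CaptureInterface(Protocol):
    """Anything that can pull new items from an existing tool."""
    def poll(self) -> list[Item]: ...


class ContextInterface(Protocol):
    """Anything that can turn a task description into context."""
    def build_context(self, task: str) -> str: ...


@dataclass
class Noonet:
    """Toy in-memory stand-in for the memory system."""
    items: list[Item] = field(default_factory=list)

    def ingest(self, capture: CaptureInterface) -> int:
        """Pull whatever the capture interface has and store it."""
        new_items = capture.poll()
        self.items.extend(new_items)
        return len(new_items)


class StaticCapture:
    """Fake capture interface that hands over a fixed batch once."""
    def __init__(self, items: list[Item]) -> None:
        self._pending = list(items)

    def poll(self) -> list[Item]:
        batch, self._pending = self._pending, []
        return batch


class KeywordContext:
    """Naive context builder: keeps items that share a word with the task."""
    def __init__(self, noonet: Noonet) -> None:
        self.noonet = noonet

    def build_context(self, task: str) -> str:
        words = set(task.lower().split())
        hits = [i for i in self.noonet.items
                if words & set(i.text.lower().split())]
        return "\n".join(f"[{i.source}] {i.text}" for i in hits)
```

The point of the sketch is only the boundary: capture interfaces push data in, context interfaces pull task-specific context out, and neither needs to know how the noonet organizes things internally.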

The system will adapt to different usage patterns. If the user doesn’t have time and needs quick information access, it will make it effortless. If the user wants to spend time exploring connections, having the system surface contradictions in their thinking, or point at fuzzy arguments that need clarifying, it will make that effortful thinking engaging and rewarding.

How it works in practice

The most immediate value will probably be automatic context reconstruction for AI interaction.

I haven’t settled on an exact architecture yet; I want to build this with early users and design partners, not in isolation. UX will be a key factor in whether the system is successful or not.

Here is an example of what it could look like:

You’ll connect your existing tools through a dashboard. The noonet will ingest and organize your information. You’ll be able to choose how much effort you want to invest in the organization of your knowledge, from fully automated (e.g. for people most interested in quick access to their knowledge and routine information sharing) to fully manual (e.g. for people most interested in doing complex thinking).

For the interfaces, I’m considering multiple options: MCP (Model Context Protocol) integration with ChatGPT/Claude for automatic context construction fed alongside your prompt; desktop extensions that surface relevant context as you work; and voice interfaces for capturing and retrieving information on the go. The goal of those interfaces will be to remove the friction of storing and mobilizing your knowledge.

For example, when an AI is connected (e.g. via an MCP interface) to their noonet, users will be able to make queries like:

  • “Help me craft a message asking [person] what they think about [idea] I had last week”

    The interface finds that idea in the noonet, analyzes how the user usually interacts with that person, identifies what they have already shared, and builds context to feed alongside their prompt so the AI can draft an appropriate message.

  • “Help me answer this email from [client]”

    The interface retrieves the full conversation history with that client, relevant project context from documents and Slack, any commitments or deadlines mentioned, client management strategies discussed during meetings, and constructs a prompt that helps the AI draft a response that’s accurate and contextually appropriate.

  • “Help me turn that debate I had about [concept] into a post”

    The interface retrieves all related context about the concept in a structured way, surfaces insights and potential links to other related ideas the user might not have noticed, and constructs a prompt that helps the AI see the full argument structure so it can help them write and think with clarity.

  • “I want to discuss [subject]”

    The interface retrieves the user’s abstractions and thinking about the subject (not just relevant facts), the debates they had about it, their arguments, and builds a prompt that preserves their conceptual framework, so the AI can engage with the user’s actual understanding rather than generic information.

A Note on Cognitive Effort

Automating knowledge work raises a legitimate concern: could it remove valuable cognitive effort? Research on “desirable difficulties” shows that effortful learning enhances retention. Struggling to articulate an idea clearly, for instance, deepens understanding in ways that passively re-reading a textbook doesn’t.

My goal is to augment cognition, not replace it. Wrestling with ideas builds understanding. Searching through Slack for that conversation from last month doesn’t. The system should automate mechanical overhead while supporting the thinking that actually matters.

The line between mechanical and valuable work isn’t always clear. For example, manually organizing notes can force you to think about relationships between ideas. I’m studying learning science and cognitive psychology to inform these design choices, and I will work with early users to identify which tasks genuinely help them think versus which are pure friction.

The system needs to be forgiving. If perfect discipline is required from day one, people won’t use it. Life happens—conferences, deadlines, exhaustion. The system should work automatically during these periods. When the user chooses to engage more deeply, they shouldn’t feel like they’re starting over or that this is conditional on them sustaining that level of engagement consistently. Deep engagement should multiply value, but baseline automation should still work when the user can’t maintain that engagement.

The system should follow a value curve: full automation delivers 50% of potential value, occasional engagement gets you to 80%, and consistent deep work reaches 95%. The last percentage points require disproportionate effort (akin to building a Zettelkasten), and that’s fine. Most people aren’t doing any of this systematically anyway. If these tools can nudge them toward better knowledge work practices while remaining useful when they can’t engage, I would consider that a success.

How this differs from existing tools

Tools like Roam, Obsidian and Notion require manual organization that most people can’t sustain. AI-powered tools have emerged to address this:

Mem.ai offers automatic tagging, collections, natural language search, and an AI assistant that can retrieve information from notes. It focuses on making stored information easily searchable.

Rewind.ai records everything on the screen and transcribes audio, storing it locally. It can search through this captured data via a prompt. It also has an AI integration that is essentially a RAG over the user’s digital history.

Both are retrieval systems optimized for storing and finding information. The architecture is: capture or store → index → search → retrieve.

What I’m building has a fundamentally different architecture. The noonet doesn’t just store information; it continuously processes it. Notes get summarized, linked to related concepts, and reorganized as understanding evolves. What gets stored isn’t the original content, but an actively maintained knowledge structure that represents it. When you query the system, you’re not retrieving static documents; you’re accessing processed knowledge: chains of thought, concepts, and so on.
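As a rough illustration of processing at ingest time rather than at query time, the sketch below links every new note to related older ones the moment it is added. It is a toy under stated assumptions: `Note`, `LinkedStore`, and the 0.2 threshold are hypothetical, and word overlap (Jaccard similarity) stands in for the LLM-based linking the noonet would actually rely on.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    note_id: int
    text: str
    links: set[int] = field(default_factory=set)

    def terms(self) -> set[str]:
        # Crude stand-in for concept extraction: lowercase words over 3 chars.
        return {w for w in self.text.lower().split() if len(w) > 3}


class LinkedStore:
    """Toy store that links each new note to related ones when it is added."""

    def __init__(self, threshold: float = 0.2) -> None:
        self.notes: dict[int, Note] = {}
        self.threshold = threshold  # assumed cut-off, not a tuned value

    def add(self, text: str) -> Note:
        note = Note(note_id=len(self.notes), text=text)
        for other in self.notes.values():
            union = note.terms() | other.terms()
            shared = note.terms() & other.terms()
            overlap = len(shared) / len(union) if union else 0.0
            if overlap >= self.threshold:
                # Links go both ways, so older notes gain connections too.
                note.links.add(other.note_id)
                other.links.add(note.note_id)
        self.notes[note.note_id] = note
        return note

    def neighbours(self, note_id: int) -> list[str]:
        """Texts of all notes linked to the given one, in insertion order."""
        return [self.notes[i].text for i in sorted(self.notes[note_id].links)]
```

Even in this toy, adding a note changes what an older note "knows" about: the structure is updated on write, which is the difference from a capture → index → search → retrieve pipeline.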

The interface layer adds another distinction: instead of dumping retrieved documents into AI context (the naive RAG approach that causes performance degradation), it constructs precise prompts tailored to specific cognitive tasks. This isn’t a better search interface; it’s a different paradigm. They are building memory aids; I am building a thinking partner.
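One simple way to picture the difference from dumping documents is sketched below: keep only the most relevant snippets that fit a budget, then order them so the strongest ones sit at the start and end of the context, where models tend to use information best. This is not the noonet’s actual method; the function names, the word-count budget, and the relevance scores are all assumptions made for illustration.

```python
def select_within_budget(
    snippets: list[tuple[float, str]], max_words: int
) -> list[tuple[float, str]]:
    """Greedily keep the highest-scoring snippets that fit the word budget."""
    kept: list[tuple[float, str]] = []
    used = 0
    for score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        n_words = len(text.split())
        if used + n_words <= max_words:
            kept.append((score, text))
            used += n_words
    return kept  # still sorted from most to least relevant


def edges_first(ranked: list[tuple[float, str]]) -> list[str]:
    """Put the best snippets at the start and end, the weakest in the middle."""
    front: list[str] = []
    back: list[str] = []
    for i, (_, text) in enumerate(ranked):
        (front if i % 2 == 0 else back).append(text)
    return front + back[::-1]


def build_prompt(task: str, snippets: list[tuple[float, str]], max_words: int) -> str:
    """Assemble a compact context block followed by the user's task."""
    ordered = edges_first(select_within_budget(snippets, max_words))
    context = "\n".join(f"- {text}" for text in ordered)
    return f"Context:\n{context}\n\nTask: {task}"
```

A real context builder would also need to restructure and summarize, not just select and reorder, but the sketch shows the basic move: less context, placed deliberately.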

Why Me

I invented state-of-the-art jailbreaking techniques and I intuitively understand LLM behaviors. I created Persona Modulation, the technique used in the NeurIPS 2023 SoLaR workshop paper “Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation”. I spent a year developing an improvement that broke all SOTA models automatically on every tested behavior in under 10 minutes, presented at the Paris AI Summit, Day 1. I was a SERI MATS 3 scholar researching LLM cognition and working on cyborgism, resulting in a new research direction called LLM Ethology. I worked as a contractor for METR (formerly ARC Evals) evaluating GPT-4 and other models before public release, I’m an Anthropic HackerOne red teamer, and I won GraySwan’s jailbreak championship with a single prompt. This matters for building knowledge tools because intelligent prompt engineering that avoids context rot requires understanding exactly how LLMs fail.

I’ve shipped production AI systems that companies pay for. I built the PRISM Eval platform from the ground up: over 60k lines of production code (front + back + infra + prompting) implementing autonomous jailbreak systems. Amazon AGI and other major companies under NDA paid for this. The platform is still running in production.

I can build complex systems fast. At 42 school, I ranked 2nd in my cohort of over 300 students on pure software engineering skills, while actively participating in 42AI (I was vice president in 2021-2022 and president in 2022-2023), teaching ML to 400+ students. Multiple hackathon wins: Apart Research AI Safety, LVMH OpenDare, Google Cloud Climate. I worked part-time for Outmind as their only NLP engineer for a year and a half during my studies, so I know how to design systems that connect to multiple knowledge sources.

I live with this problem daily. I’m an AI safety researcher. This isn’t a market opportunity I discovered through research; it’s friction that actively slows down my own work. I feel the pain of reconstructing context across conversations, synthesizing evolving arguments from literature, and losing connections between ideas. I’ll be my first user, though I’m committed to building with design partners to avoid anchoring purely on my own use cases.

I know how to iterate. I’ve attempted multiple startups over the past few years, including spending a year and a half as CTO in the Paris startup ecosystem. I’ve learned what not to do (premature scaling, building without users, ignoring market feedback, not putting enough effort into communication). This time I’m building with design partners from day one, validating ruthlessly, and staying lean enough to pivot completely if needed.

Who this is for

Right now, I’m building this for people doing cognitively demanding work under heavy time constraints. Researchers, writers, strategists, anyone who needs to synthesize information from many sources, and increase their ability to do complex thinking.

As the tool matures, the use cases will expand. Knowledge workers of all types face these bottlenecks. Product managers tracking conversations with users, consultants synthesizing client insights. Anyone working with information and communicating ideas could benefit from reducing those bottlenecks.

Where this could lead

The immediate value is personal augmentation, but these tools also enable natural extensions.

Avatars—Learning from others: A noonet could power AI avatars that don’t just access knowledge, but embody the thinking behind it—the arguments, reasoning patterns, and conceptual frameworks. You could create avatars from domain experts’ published work (papers, talks, documented reasoning) for interactive mentorship or debates. Collective knowledge bases like LessWrong, Wikipedia, or company documentation could also become conversational avatars that embody the community’s reasoning, letting you engage with entire bodies of knowledge.

Avatars—Scaling yourself: Your own noonet could power an avatar that handles interactions on your behalf. For routine communications, it could respond in your voice and escalate novel queries to you. Public figures could use this to offer “speak with me” interfaces where their avatar engages with thousands of people. The noonet would process these conversations as if they had them, then surface valuable insights later: “I noticed you are thinking about something your avatar discussed with [person] last week” or “These are the 10 most interesting discussions from this week.” This lets individuals reach far more people than direct interaction allows, creating connections between ideas and people that wouldn’t otherwise happen.

Interconnected networks: When multiple people have noonets, they could connect through explicit protocols. A team could share relevant context automatically instead of reconstructing it in meetings. A CEO could query patterns across organizational conversations. This requires careful design (explicit rules for information flow, strong privacy controls) but could enable collective intelligence where knowledge actually connects instead of staying trapped in individual minds.

AGI as network participants: The architecture doesn’t have to distinguish between human and AI nodes. An AGI could have its own noonet and participate in the coordination network as a peer. If we can’t prevent or slow down AGI development, having infrastructure where humans remain architecturally integral to collective intelligence might matter. This becomes especially relevant as multi-agent AGI systems start being developed, and we have to make a choice of engineering the human in the loop or not.

The personal tools need to work first; that’s the foundation. But the potential for human-AGI coordination into multi-agent systems is what genuinely excites me about this project. I’ll write more about that in future blog posts.

What I’m Looking For

I’m at the beginning of this project and need several things:

Design partners: I want to build this with early users, not in isolation. If you’re a researcher, writer, or knowledge worker who feels these pain points acutely, I want to talk. Understanding real workflows and needs will shape what gets built.

Feedback and critique: This vision has obvious failure modes I’m still thinking through. Privacy concerns (storing all the user’s data could create a massive attack surface), dependency (what happens when the system fails?), cognitive atrophy (genuine risk of outsourcing too much thinking), cost (compute requirements for continuous processing).

Funding and support: I’m planning to apply for small grants (e.g. Manifund) and startup programs to get runway. If you know funding sources that might be relevant, or if you’re interested in supporting this work, please reach out.

Collaborators: Eventually, this needs a strong team. Engineers who understand distributed systems and know how to handle strong privacy requirements, researchers who understand what cognitive work is valuable, developers who can design great user experiences, and product managers who make sure the product is always aligned with the users’ needs. I’m not hiring immediately, but I’m building relationships now.

I’m also planning to share more detailed thinking on LessWrong as I develop this further, and I’ll be looking to get embedded in the AI ecosystem in the Bay Area to learn from others working on related problems.

If any of this resonates, I want to hear from you, whether that’s interest in using an early version, critical feedback on the approach, or just thoughts on whether this direction seems valuable.

You can stay updated on this project at https://weavemind.ai, and you can reach out to me at contact@weavemind.ai.

Thanks a lot to Esben Kran, Hermes Bozec, Nicholas Miailhe, Pierre Peigné and Tom David for the feedback on the post and discussions that led to this idea.

  1. ^

    I recommend reading Vannevar Bush’s “As We May Think”; it is surprisingly relevant. I read it after having written this post, and it contains many observations and ideas similar to what I am aiming for. Here is where he coined the name:

    Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and, to coin one at random, ‘memex’ will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.

  2. ^

    noo- being the contracted form of nous (Ancient Greek for mind/intelligence)

  3. ^

    Despite sharing similar concepts and having been imagined around the same time, the two systems seem to be completely independent (source: Claude deep research).

  4. ^

    Artificial life: I will try grounding the design of the tools in how the brain and ecosystems effectively store, process and leverage information.