[Epistemic status: a software engineer and AI user, not an AI researcher]
I could not find a readily available book database that offers semantic search with embeddings.
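To be concrete about what I mean: the core of embedding-based semantic search is just nearest-neighbour lookup over description vectors. A minimal sketch, assuming the sentence-transformers library and a made-up three-book catalog (model name and data are placeholders):

```python
# Minimal sketch of embedding-based semantic book search.
# Assumes `pip install sentence-transformers numpy`; the catalog is made up.
import numpy as np
from sentence_transformers import SentenceTransformer

books = {
    "Dune": "A desert planet, political intrigue, and a messianic hero.",
    "The Left Hand of Darkness": "An envoy navigates an alien culture without fixed gender.",
    "Project Hail Mary": "A lone astronaut improvises science to save Earth.",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
titles = list(books.keys())
doc_vecs = model.encode([books[t] for t in titles], normalize_embeddings=True)

def search(query: str, k: int = 3) -> list[tuple[str, float]]:
    """Return the k catalog entries closest to the query in embedding space."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity, since vectors are normalized
    top = np.argsort(-scores)[:k]
    return [(titles[i], float(scores[i])) for i in top]

print(search("wholesome first-contact story about understanding another culture"))
```

A real version would swap the toy dict for a full catalog plus a vector index, but the mechanism is no more exotic than this.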
Amazon sells lots of books; wouldn’t it be useful for them to offer such a tool to their customers, so they can easily find books they like? What about Netflix for movies? Maybe even other kinds of products, like clothes or something?
It’s been a while since we got transformers, and it really feels like browsing latent spaces should be a thing by now. Three years since GPT-3.5, and the only app we have is still a chatbot? Maybe everyone thinks that’s perfectly normal, but I find it disturbing; it feels like a blind spot that should be explored.
Did we stumble on the optimal way to interact with AI first try? Do we lack creativity? Is it really really hard to build an app that uses AI? Are incentives not aligned? Did we forget how to make good software?
What do you think? What AI apps are surprisingly absent given current capabilities? Why is that?
A legalese-to-English translator for end-user license agreements. I have uploaded a few to various AIs and asked for summaries of the salient legal points, but they can’t extract things like “this gives ownership of your IP to the publisher if you agree.” Yet this seems like a really easy task for some of them. Also, this seems like something you could sell. I am really surprised no one has this yet.
A “dating app that hasn’t been enshittified” publishing app. Not the dating app, an app-publishing app; I’m just using dating apps as the topic to start. Basically, all dating apps are owned by two companies, and any new ones get bought up. These two firms modify all the apps to work in a way that prevents dating success, and upsell premium products that actually reduce effectiveness (for example, paying for super messages and premium status causes women to respond less often to messages, according to their own statistics). So: an app that takes the original version of eharmony or Match or whatever, makes small changes, creates new branding, and publishes it. Then periodically does it again, and again. As fast as the monopolies buy them up, this app makes and releases new ones, so there are always un-enshittified dating apps available. Honestly, I see this as much as an experiment to see how Match Group would respond to such a threat to its business model. But an app-publishing app is also a step towards a fully automated economy, and I think this particular step is achievable. Which is why I am surprised we don’t have an app-publishing app. That I know of.
More training tools. Any, really. For every “AI teaches you how to do this yourself” app, I see at least two orders of magnitude more “AI does this for you” apps. We need more teach-to-fish apps, not give-a-fish apps. And in line with the previous idea, a “teach people to interact IRL” app would be a great start. There are a ton of AI girlfriend apps; Xiaoice is junk and it has over 600 million users. Make one that generates dozens of AI personas, has you interact with them, and does NOT have them always be positive; instead they behave realistically based on your behavior. It could write targeted lessons to guide you through your own worst social faux pas and give you simulated practice in many different situations. Again, seems like the sort of thing there should be a lot more of.
A “turn navigation instructions into sonification” app. Instead of the GPS voice saying “turn right” or “continue for 1 mile,” have an app that takes that data and generates tones. Turn your head the wrong way and the tone indicates it. This is critical for blind navigation, for people who don’t speak the language the GPS is in, etc. It can also be far less confusing in many situations. There are a few apps that do this for specific circumstances, but one that simply takes GPS input, outputs directional tones, and adapts in real time to your movement should be within existing capabilities. I saw a proof of concept for this using Meta AI glasses, but it doesn’t seem to exist for real; it might have only been vaporware. Very surprising.
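For what it’s worth, the core mapping seems simple. Here’s a rough sketch, assuming you already get a target bearing from the GPS route and a compass heading from the phone or glasses (the tone parameters are arbitrary placeholders, and actual audio output is left to whatever synth library you prefer):

```python
# Sketch: map "where you should go" vs. "where you're facing" to a tone.
# Bearings are in degrees; audio playback itself is out of scope here.
def relative_bearing(target_bearing: float, heading: float) -> float:
    """Signed angle from the user's heading to the target, in (-180, 180]."""
    return (target_bearing - heading + 180.0) % 360.0 - 180.0

def direction_tone(target_bearing: float, heading: float, distance_m: float):
    """Return (pan, pitch_hz, beep_interval_s) for a directional cue."""
    rel = relative_bearing(target_bearing, heading)
    pan = max(-1.0, min(1.0, rel / 90.0))        # -1 = hard left, +1 = hard right
    pitch = 880.0 if abs(rel) < 15.0 else 440.0  # high tone when roughly on course
    interval = max(0.2, min(2.0, distance_m / 50.0))  # beep faster as you approach
    return pan, pitch, interval

# Facing north (0 degrees), next waypoint due east (90 degrees), 100 m away:
print(direction_tone(90.0, 0.0, 100.0))  # pans hard right, low tone, 2 s beeps
```

Re-running that a few times per second against live GPS and compass data is basically the whole app.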
An app that lets you upload pictures, news articles, and videos, and tell stories about a dead relative, and then makes an avatar of them. Yes, you can do this manually right now (Peter Diamandis made a pretty amazing avatar of himself), but that takes a lot of skill. An app that automates it should be within current capabilities. I’m honestly surprised that a company like ancestry.com doesn’t already offer this as a paid service.
Google Translate has a camera mode that applies translation in real time to anything you look at. What about an AI math app that looks at store prices and converts everything to a standard price-per-unit display in real time? Again, why doesn’t this already exist? It should be simple, and it seems high demand.
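The math side is trivial once OCR hands you the price and pack size. A toy sketch (the regex and unit table are simplifications; real shelf labels are much messier):

```python
# Sketch: normalize an OCR'd shelf label like "3.49 € / 250 g" to a price per kg.
import re

UNIT_TO_BASE = {"g": 0.001, "kg": 1.0, "ml": 0.001, "l": 1.0}  # base: kg or litres

def unit_price(label: str) -> float | None:
    """Return price per kg/l, or None if the label can't be parsed."""
    m = re.search(r"(\d+[.,]\d+)\s*€?\s*/\s*(\d+)\s*(g|kg|ml|l)\b", label.lower())
    if not m:
        return None
    price = float(m.group(1).replace(",", "."))
    amount = float(m.group(2)) * UNIT_TO_BASE[m.group(3)]
    return price / amount

print(unit_price("3.49 € / 250 g"))  # ~13.96 per kg
```

The hard parts are the OCR and the overlay rendering, and the Google Translate camera mode shows both are solved problems.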
When the Meta AI glasses first came out, students quickly pieced together an app that doxxed, in real time, everyone you looked at. That’s obviously not legal, and they never released it for that reason. However, consider a corporate app that identifies everyone in the company address book and links to Salesforce (or equivalent) to identify all your customers. Or one that goes through your contacts and reminds you of the name of the person who just walked up to you at a party. This has to be doable, because two students built the stronger version in one or two days.
I have a lot more, but they’re much more niche.
I believe this was just a call to PimEyes.
It was more than that, though PimEyes was involved. They also ran social media searches, reverse image searches, etc. Their demo showed full names, addresses, schools attended, and lists of relatives in real time.
After reading https://bessstillman.substack.com/p/please-be-dying-but-not-too-quickly, I’ve been wondering why no one has made a semantic search tool for clinical trials, or at least something that generates consistent tags. I’ve been wanting to make one but haven’t had time yet.
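For the “consistent tags” half, one way it could work is to constrain an LLM to a fixed tag vocabulary and filter its output back against that vocabulary. A hedged sketch (the tag list and model name are placeholders, and it assumes the openai Python client with an API key configured):

```python
# Sketch: consistent tagging of a clinical trial description against a fixed vocabulary.
from openai import OpenAI

TAG_VOCAB = ["solid tumor", "immunotherapy", "phase 1", "phase 2", "phase 3",
             "recruiting", "first-in-human", "pediatric"]

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def tag_trial(description: str) -> list[str]:
    """Ask the model to pick tags only from TAG_VOCAB, then filter to be safe."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Tag the clinical trial using ONLY these tags, "
                        f"comma-separated: {', '.join(TAG_VOCAB)}"},
            {"role": "user", "content": description},
        ],
    )
    raw = resp.choices[0].message.content or ""
    return [t.strip() for t in raw.split(",") if t.strip() in TAG_VOCAB]
```

Consistent tags plus an embedding index over trial descriptions would already be a big step up over keyword search.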
Deep Research gives you a good way to search for books. You can describe what you want out of a book and then it goes, does the research for you and brings you back an answer. What else do you want?
Deep research is an improvement over search engines, but it is still not great:
There is a difference between handing a prompt over to an LLM, waiting five minutes, and evaluating the answer without being able to customize anything, versus interactively browsing and filtering data. Browsing has much better discoverability: similar items are clustered so you can discover them, you can explore and discover new dimensions of the data, and you can tweak a bunch of knobs and filters with instant feedback until they are just right. It gives you the ability to iterate and refine the query hundreds of times very quickly. A domain where this difference is very crisp is music: you can hand a prompt over to an LLM and wait for it to generate a song, or you can be presented with various (AI) knobs to tweak and hear the soundtrack change live, iterating until you’re satisfied with the result. Maybe a mix between a chatbot and latent-space browsing could get the best of both worlds.
I’ve had no luck with some deep research runs: I am looking for wholesome and inspiring content to boost my meditation practice and, more generally, as a way to improve mood, but since it’s pretty specific, LLMs struggle with that. I was looking specifically for Chinese anime the other day and it recommended “A Gentle Dragon of 5000 Years Old, It Was Recognized as an Evil Dragon Without Any Cause”. I would not say it is wholesome: it has a little girl with psychotic/violent tendencies and a bunch of fighting. I suspect a latent space over books or videos would have a “wholesome” dimension I could filter for.
I don’t get where the assumption comes from that giving an LLM a prompt takes a different amount of time than tweaking a knob. Both are just different forms of user input.
While we’re on the topic of sound, there are plenty of audio effects that also take some time to update the file you are working on before you see the result.
There are AI models that give you very fast feedback, but the models that take longer give higher-quality answers. Whether you use a knob or a chat prompt has little to do with how fast you get your answer.
AI photo software like Leonardo has plenty of knobs, but those don’t give you faster image generation.
I wouldn’t use deep research for that. I would start by asking Gemini 2.5 Pro or ChatGPT 4.5: “I am looking for wholesome and inspiring content to boost my meditation practice and more generally as a way to improve mood; please ask me questions one at a time to clarify what I want and then go and search me some content.” Then I answer its questions. If it gives a suggestion that isn’t what I want, I explain why it isn’t what I want. After playing that game a few times, you might ask deep research to actually go more deeply into the issue.
On the other hand, here are cases where I used deep research in the last week:
1) I wanted shoes with a wide toe box that make it possible to land forefoot first; I also want them to be waterproof and to look decent when I’m on a date where a woman might infer something from my choice of shoes. I think it took 2-3 DeepSeek attempts until I landed on a good choice.
2) I was seeking delayed-release melatonin supplements that don’t have immediate release baked in, as I struggle to sleep through the night but not to fall asleep. Amazon.de for some reason didn’t have them. It found the correct immediate-release melatonin that could be ordered via a pharmacy.
Unfortunately, the online pharmacies don’t deliver as fast as Amazon does, so I went to an in-person pharmacy. They didn’t have what I wanted, but gave me a melatonin supplement that also included three other ingredients. Back at home, deep research let me review the evidence for the three other ingredients without a problem.
Some more ideas:
AI-enhanced text: fact checking, annotations, contextualization, alternative views, diagram/chart, summary, outline and quiz generation, sentiment-analysis wheel. All of that integrated into a PDF/ebook reader or browser, perhaps through inconspicuous underlines of various colors that stay out of the way until hovered.
Assisted/automated debate moderation: real-time bullet-point summary/tree of arguments, fact-checking, chart/data retrieval, back-of-the-envelope estimates, evaluation of argument strength, detection of fallacies.