How do I use AI to search for information on 1000 or so companies? This turns out to be harder than I thought. The difficulty can be expressed computationally: I’m requesting output tokens = Linear(input prompt), i.e. the tokens I pay for grow in step with the length of the list, because every company needs its own searches, page reads, and written answer, and that’s a lot of hard cash. This is interesting because to non-computer-scientist normies, this wrinkle is really not apparent. AIs can do basic agentic reasoning (like looping over a list of 1000 companies) and a bunch of searches, so surely you can combine them? Yes, it turns out, but it’ll cost you like $38 for one query.
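To make the linear scaling concrete, here is a back-of-the-envelope sketch. Every number in the example call is a made-up assumption (tokens per company, price per million tokens), not Ottogrid’s or any provider’s real pricing, and `estimate_cost` is just a name I picked for illustration.

```python
def estimate_cost(n_items, in_tokens_per_item, out_tokens_per_item,
                  usd_per_million_in, usd_per_million_out):
    """Rough cost in USD when every item triggers its own reads and answer.

    All inputs are assumptions you supply; the point is only that the
    total grows linearly with n_items.
    """
    input_cost = n_items * in_tokens_per_item * usd_per_million_in
    output_cost = n_items * out_tokens_per_item * usd_per_million_out
    return (input_cost + output_cost) / 1_000_000


# Hypothetical numbers: 1000 companies, 20k tokens of web pages read and
# 500 tokens written per company, at $1 / $4 per million tokens.
print(estimate_cost(1000, 20_000, 500, 1.0, 4.0))
```

With numbers like these, nearly all of the bill comes from the pages the model reads, not from what it writes, which is why cutting down the input side looks like the place to save.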
I went with Ottogrid.ai, which was fun, but expensive.
If I really had time to hack at this, I would try breadth passes followed by depth passes: Google “top 20 AI companies”, scrape the stats, then do narrower searches as time goes on. Is there a name for this algo? Idk. But people use it constantly, even for cleaning the kitchen and so on.
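A minimal sketch of what I mean by breadth then depth, assuming you can plug in your own lookups. `broad_sources` and `narrow_lookup` are hypothetical callables (a scraped "top 20" list page, a per-company search), not any real API. The cheap broad sources fill in whatever they can for many companies at once, and the expensive per-company lookup only runs for fields that are still missing.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, Optional[str]]


def breadth_then_depth(
    companies: List[str],
    fields: List[str],
    broad_sources: Iterable[Callable[[], Dict[str, Record]]],
    narrow_lookup: Callable[[str, List[str]], Record],
) -> Dict[str, Record]:
    # Start with every field unknown for every company.
    results: Dict[str, Record] = {
        name: {field: None for field in fields} for name in companies
    }

    # Breadth pass: each source is one call that covers many companies.
    for source in broad_sources:
        for name, record in source().items():
            if name not in results:
                continue
            for field in fields:
                if results[name][field] is None and record.get(field) is not None:
                    results[name][field] = record[field]

    # Depth pass: only companies with gaps get the expensive lookup,
    # and only for the fields that are still missing.
    for name, record in results.items():
        missing = [field for field in fields if record[field] is None]
        if not missing:
            continue
        found = narrow_lookup(name, missing)
        for field in missing:
            if found.get(field) is not None:
                record[field] = found[field]

    return results
```

The saving depends entirely on how much the broad sources cover: if one list page answers most fields for most companies, the per-company calls shrink to the leftovers.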
But has that not been done already? This seems like the first practical AI problem you’d solve once you had search-capable AI, or early agents, or early reasoning models.
Perhaps you could save some money by telling the AI to write a Python script that scrapes some information from the websites and converts it to plain text (much less data), and then using the AI to process that text?
The idea is that you pay for creating the Python code, and for processing its output, but you don’t pay for the Python code downloading and processing the data.
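A rough sketch of the "convert to plain text" step, using only the standard library’s `html.parser`. The class and function names are my own, and real pages (JavaScript-rendered content, odd encodings) would need more care than this.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and drops script/style contents."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = ("<html><head><style>p{color:red}</style></head>"
        "<body><h1>Acme</h1><p>Founded 1999</p></body></html>")
print(html_to_text(page))
```

The download itself could be done with `urllib.request.urlopen` on your own machine, so the model only ever sees the (much shorter) output of `html_to_text`.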
I’m not all that sure how AI search works. It searches and indexes the top 20 hits, or something like that? Is reading a webpage the expensive part? If so, then caching and context window management might matter a lot. Plain text might backfire if you actually lose table structure and stuff. You can probably ignore styles at least.
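On the table worry: one possible middle ground, sketched below with the standard library only, is to drop styles and scripts but keep each table row on one line with a separator between cells. The names are made up for illustration, and it ignores nested tables, `colspan`, and similar complications.

```python
from html.parser import HTMLParser


class TableAwareExtractor(HTMLParser):
    """Plain text, but each <tr> becomes one 'cell | cell' line."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._row = None    # list of finished cells while inside a <tr>
        self._cell = None   # list of text fragments while inside a cell
        self._skip = 0      # >0 while inside <script> or <style>

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join(self._cell))
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.lines.append(" | ".join(self._row))
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "tr":
            self._close_row()   # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip > 0:
            self._skip -= 1
        elif tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()

    def handle_data(self, data):
        text = data.strip()
        if self._skip or not text:
            return
        if self._cell is not None:
            self._cell.append(text)
        elif self._row is None:
            self.lines.append(text)


def html_to_text_with_tables(html: str) -> str:
    parser = TableAwareExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)
```

Saving the extracted text to disk keyed by URL would cover the caching half of the question, so a retry never re-downloads or re-reads the same page.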