Below are some scattered thoughts about AI, which I hope to update over time.
P.S. If you like models, building on RISC-V/FPGAs, and writing in C, and want to work on local inference, I'm hiring! lama@nuha.com
Last updated: 1 week, 4 days ago
- I largely think that LLM-based B2B AI tools are too early to show any real success/takeoff, and consumer AI products are the most likely to find early success. That being said, it does not mean that you shouldn't attempt to build a B2B AI tool right now, because there's value in setting up proprietary data sourcing and collection. However, the fruits of that labor are likely going to take longer.
- I've used a number of the AI coding tools, and while they're certainly impressive for simple tasks at some difficulty level N, they can't really produce valuable code at level N+1, even when given a very detailed description of how to solve the problem. The software engineers I've talked to have come to the same conclusion.
- It's interesting to me that, while everyone agrees data cleaning is the most important part of training, nobody actually seems to be investing in it? RLHF is more of a bandaid than anything.
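    For a sense of what "investing in data cleaning" even means at the smallest scale, here's a toy sketch. Everything here (the function name, the thresholds, the filters) is mine and purely illustrative; real pipelines add fuzzy dedup, language ID, and much more:

    ```python
    import hashlib
    import re

    def clean(docs):
        """Toy data-cleaning pass: exact dedup plus crude quality filters.

        Illustrates the shape of the work, not a production pipeline.
        """
        seen, kept = set(), []
        for doc in docs:
            text = re.sub(r"\s+", " ", doc).strip()  # normalize whitespace
            h = hashlib.sha256(text.lower().encode()).hexdigest()
            if h in seen:
                continue  # drop exact duplicates
            if len(text) < 20:
                continue  # drop near-empty docs
            letters = sum(c.isalpha() for c in text)
            if letters / len(text) < 0.5:
                continue  # drop symbol-heavy junk
            seen.add(h)
            kept.append(text)
        return kept
    ```

    Even this toy version makes the point: these decisions (what counts as a duplicate, what counts as junk) shape the model far more than a post-hoc RLHF pass can.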
- Edit: I'm coming back to this point after some time, and I think too much RLHF neuters the model responses in a bad way.
- Chat interfaces are a really boring UI, and prompts are a subpar UX. Why do users have to constantly find new ways to beg, cry, and threaten a computer to get the answers they want?
- I do, however, believe that language models wouldn't have received nearly as much attention as quickly without the chat UI. It was (at the time) a really important design choice/innovation that showed what language models were capable of in a really simple way, so much so that I knew 10-year-olds using ChatGPT in January 2023! The chat UI is still useful for some purposes, but I don't think it should be the end-all be-all for all interactions with language models.
- I think search engines' decline is going to be far slower than what people imagine. I still sometimes find it easier to just search for an answer to a query than to wait for an LLM to generate a response for me.
- Building auxiliary devices and marketing them as a replacement for an existing XYZ device is as good as lighting money on fire, and proves that a company doesn't understand its customers (especially if it's selling to the consumer market).
- The barrier to entry (i.e., training costs) will drop faster than you can blink. You can already train Llama2-7B-level models for $100k.
- We are in the late 1960s/early 1970s wave of personal computing as it relates to AI hardware. Everyone is over-investing in current boom-and-bust paradigms akin to the "mainframes will take over floors of every building!" mania from the 1970s, but that's all going to go to zero. Models today are very power-hungry, but nobody is pricing in the fact that models will have improved architectures that allow them to be less power-hungry over time and run on smaller hardware. (Even if investors are telling you they are, they actually aren't.) Anyone investing a huge amount of money into GPU data centers might as well be flushing it down the toilet.
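    One concrete reason "smaller hardware" is plausible: weight memory alone shrinks linearly with quantization bit width. A quick sketch, assuming a 7B dense model and counting weights only (KV cache and activations add more):

    ```python
    # Weight-only memory footprint of a 7B model at different precisions.
    # Illustrative arithmetic; ignores KV cache, activations, and overhead.
    params = 7e9
    footprint_gb = {bits: params * bits / 8 / 1e9 for bits in (16, 8, 4)}
    for bits, gb in footprint_gb.items():
        print(f"{bits}-bit weights: ~{gb:.1f} GB")
    # 16-bit: ~14.0 GB, 8-bit: ~7.0 GB, 4-bit: ~3.5 GB
    ```

    A model that needs a data-center GPU at fp16 fits on a laptop at 4-bit, and that's before any architectural improvements, which is exactly the kind of shift the GPU-data-center bets aren't pricing in.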
- In most cases, relying on synthetic data for training is garbage because the underlying model is trained on garbage data. I'm bearish on generating synthetic data for training out of models trained on reddit-speak.
- Anyone serious about building a consumer AI companion product has to build out their own hardware infrastructure/distribution.
- The hard part about building a consumer AI hardware company is not the hardware, nor the distribution, nor the scaling, much to many investors' confusion. It's building a product that resonates with consumers. Investors do not understand this point, and due to a lack of hardware experience, they focus on the wrong problems. This is what's going to cause many of them to be wiped out or to miss the next wave of breakout AI/hardware companies. The criterion by which you should be judging these companies is NOT a supply chain bet, but a PRODUCT bet.
- The current wave of humanoid robotics startups is going to fail because these companies lack the data collection infrastructure and hardware distribution needed to train their robots to do useful tasks. They will start with one or two small trial contracts from companies willing to give them a chance, but that effort/scale is not at the level needed to actually build a useful industrial product, which is sad, because some of the most talented roboticists in the world are working on these products.