Ryan Fox@ryanfox·Apr 05

GN🥃 all! Sorry for being absent here. My time has mostly been spent doing deep dives into AI and then trying to let my brain breathe from the information overload. It’s not easy! Anyway, I wanted to share some general frameworks and info about AI that I think are crucial for navigating our future, yet hardly anybody is paying attention to right now:

The Basics:


  1. LLMs (large language models like GPT-4) don’t connect to the internet on their own. They can be thought of as similar to CPUs: they don’t store your data. They’re essentially zip files of rules/principles for calculating with logic and reasoning.

  2. Computers up until this point have performed -Deterministic- compute:
    2 + 2 = 4
    A bit is a 1 or a 0

  LLMs perform -Probabilistic- compute:
    2 + 2 is probably equal to 4-ish. Maybe 3, maybe 5.

It’s a fundamentally different kind of computation. Our brains do both types, and now computers can too!
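
The contrast above can be sketched in a few lines of toy Python. This is purely illustrative, not how an LLM actually works internally: `probabilistic_add` just samples an answer from a made-up distribution, the way a model samples its next token.

```python
import random

def deterministic_add(a, b):
    # Classic compute: same inputs, same output, every single time.
    return a + b

def probabilistic_add(a, b, rng):
    # Toy stand-in for probabilistic compute: the right answer is only
    # the *most likely* output, sampled from a distribution.
    candidates = [a + b - 1, a + b, a + b + 1]
    weights = [0.05, 0.90, 0.05]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(42)
print(deterministic_add(2, 2))                            # always 4
print([probabilistic_add(2, 2, rng) for _ in range(5)])   # mostly 4s, occasionally 3 or 5
```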

The Cool Stuff:


  1. The cool stuff not enough people are talking about: Rebuilding “computers” with LLMs as the CPU:

If you download an LLM file and run it locally, you might be disappointed to find it has no memory. It won’t remember your name, and follow-up questions will fail. Every question or statement starts the “conversation” over. It would be like throwing math questions at a CPU: it computes each problem but has no memory to store the results. It’s rather uninteresting.

What makes ChatGPT so successful is that it re-feeds the entire printed conversation back into the LLM every time. It “fakes” a memory, but there are limits: the model’s context window can only hold so much.
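
Here’s a minimal sketch of that “faked” memory pattern. The `call_llm` function is a stub standing in for a real API call; the point is that the client re-sends the entire transcript on every turn, because the model itself remembers nothing.

```python
def call_llm(messages):
    # Stub: a real LLM would generate a reply conditioned on ALL messages.
    last_user = [m["content"] for m in messages if m["role"] == "user"][-1]
    return f"(reply to {last_user!r}, having seen {len(messages)} prior messages)"

history = []

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    reply = call_llm(history)  # the full transcript goes in every single time
    history.append({"role": "assistant", "content": reply})
    return reply

chat("My name is Ryan.")
print(chat("What's my name?"))  # the model "knows" the earlier turn only
                                # because we re-sent it ourselves
```

Notice the transcript grows on every turn — which is exactly why this approach eventually runs into limits.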

So how do we build LLMs with a memory? Well, quite a few teams are working on frameworks to chain things together, similar to how PC manufacturers assembled motherboards, graphics cards, hard drives, RAM, etc., and Apple/Microsoft/IBM developed operating systems.

Two teams I’ve really been intrigued by are LangChain and Pinecone. LangChain is a framework, or a way of putting components together…think of it as the full PC you might buy in a store. Pinecone is a vector database, which you can think of as a hard drive in this “PC.”
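
To make the “hard drive” analogy concrete, here’s a toy vector store in plain Python: documents go in as embedding vectors, and retrieval finds the stored vector most similar to a query vector. The embeddings here are made up for illustration; a real system like Pinecone does this at scale with approximate nearest-neighbor indexes.

```python
import math

def cosine(a, b):
    # Cosine similarity: how closely two vectors point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class ToyVectorStore:
    def __init__(self):
        self.items = []  # list of (doc_id, vector) pairs

    def upsert(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def query(self, vector, top_k=1):
        ranked = sorted(self.items, key=lambda it: cosine(it[1], vector), reverse=True)
        return [doc_id for doc_id, _ in ranked[:top_k]]

store = ToyVectorStore()
store.upsert("note-about-dogs", [0.9, 0.1, 0.0])
store.upsert("note-about-llms", [0.1, 0.9, 0.2])
print(store.query([0.0, 1.0, 0.1]))  # → ['note-about-llms']
```

In the real stack, an embedding model turns text into those vectors, so “querying the hard drive” means retrieving the stored text most semantically similar to what the user just said — that’s the long-term memory.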

Components in this stack:

  • Chains: The core of LangChain. Components (and even other chains) can be strung together to create chains.
  • Prompt templates: Templates for different types of prompts, like “chatbot”-style templates, ELI5 question-answering, etc.
  • LLMs: Large language models like GPT-3, GPT-4, LLaMA, LaMDA, BLOOM, etc
  • Indexing Utils: Ways to interact with specific data (embeddings, vectorstores, document loaders)
  • Tools: Ways to interact with the outside world (search, calculators, etc)
  • Agents: Agents use LLMs to decide what actions should be taken. Tools like web search or calculators can be used, and all are packaged into a logical loop of operations.
  • Memory: Short-term memory, long-term memory.
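
The agent idea above can be sketched without any framework at all. In this toy version a stub `decide` function plays the role of the LLM’s reasoning step, picking which tool to run; all names here are made up, and a real agent would loop, feeding each tool’s result back to the LLM until it decides it has an answer.

```python
TOOLS = {
    # Each "tool" is just a function the agent can call on the outside world.
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "search": lambda q: f"(pretend search results for {q!r})",
}

def decide(question):
    # Stub for the LLM's reasoning step: choose a tool and an input for it.
    if any(ch.isdigit() for ch in question):
        return "calculator", question
    return "search", question

def run_agent(question):
    tool_name, tool_input = decide(question)
    observation = TOOLS[tool_name](tool_input)
    return f"Used {tool_name}: {observation}"

print(run_agent("2 + 2"))  # → Used calculator: 4
```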

I’m going to leave it at this for now. If this has piqued your interest I’d recommend checking out this somewhat approachable introduction to LangChain and Pinecone:

Building The Future with LLMs, LangChain, & Pinecone
youtu.be/nMniwlGyX-c