Human vs AI Thinking: Probabilistic or Deterministic?
The Debate I Keep Seeing.
These days, all I see in my LinkedIn feed are intense arguments about AI and LLMs. The takes range from over-the-top salesy optimism ("AI agents are no longer experiments; your next 5 hires could cost $0"), to naysaying ("LLMs are at or near their peak… the 'game-changer' crowd is in for disappointment"), to downright fearmongering ("Your AI will confidently lie with your logo on it," or "AI will wipe out millions of jobs in the next 24 months"). There is also a middle-ground crowd, though few and far between, who say, "LLMs cannot be AGI by definition. An LLM is a static model; once trained, it does not change, unlike our brains."
I have been contemplating a similar question since I was first introduced to Large Language Models. I am not a data scientist, but during my initial foray into data science, one cardinal rule I learned is: "Don't use machine learning for problems that have a deterministic way of solving them."
Should We Use LLMs for Deterministic Tasks?
Large Language Models are fundamentally probabilistic in their core function, so the question is: should we even be using LLMs for more deterministic tasks like spelling "strawberry" or adding "2 + 2," or for more serious functions like reliably and consistently answering questions such as "What were my sales last quarter?" Maybe not, given the cardinal rule of not using a probabilistic system to solve a deterministic problem. Problem solved, right? So stop wasting your money on all these AI agents and stick to the tried-and-true systems.
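The contrast can be made concrete with a toy sketch. The function below is deterministic: same input, same output, every time. The sampler next to it is a made-up stand-in for LLM token sampling (the probability values are invented for illustration, not taken from any real model): it draws an answer from a distribution, so repeated calls can disagree.

```python
import random

# Deterministic: "2 + 2" has exactly one correct answer, and we always get it.
def add(a: int, b: int) -> int:
    return a + b

# Probabilistic (toy stand-in for next-token sampling): the answer is drawn
# from a distribution, so wrong outputs appear with nonzero probability.
def sample_answer(distribution: dict) -> str:
    tokens = list(distribution)
    weights = [distribution[t] for t in tokens]
    return random.choices(tokens, weights=weights)[0]

assert add(2, 2) == 4  # holds on every call

# Hypothetical next-token distribution for the prompt "2 + 2 =":
dist = {"4": 0.97, "5": 0.02, "22": 0.01}
answers = {sample_answer(dist) for _ in range(1000)}
# Across enough samples, the occasional wrong answer shows up.
```

The point is not that real LLMs get "2 + 2" wrong often; it is that a sampled answer can never carry the same guarantee as a computed one.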
Humans Are Probabilistic Too.
Not so fast. We humans, so adept at spelling and solving basic arithmetic as well as complex and abstract problems, have probabilistic hardware too. Neural activity has inherent noise and randomness, and neurotransmitter release is probabilistic by nature. At the computational level, human thinking shows clear probabilistic patterns. We make decisions under uncertainty and update our beliefs based on evidence (at least the more rational among us do), very Bayesian-style. In fact, we would be paralyzed if our thinking and reasoning systems were fully deterministic when faced with uncertainty. So, if human thinking and reasoning are indeed probabilistic, then how come we are so good at deterministic tasks? It turns out our "probabilistic determinism" is a result of our brain's hierarchical, modular architecture with redundancy and grounded, multi-modal training.
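The "Bayesian-style" belief updating mentioned above is just Bayes' rule applied repeatedly. A minimal sketch, with all numbers made up purely for illustration:

```python
# One Bayesian belief update: posterior P(H|E) = P(E|H) * P(H) / P(E).
def bayes_update(prior: float, likelihood: float, evidence_rate: float) -> float:
    return likelihood * prior / evidence_rate

# Hypothesis: it will rain today. Prior belief: 0.3.
# Evidence: dark clouds. Assume P(clouds | rain) = 0.9 and P(clouds) = 0.45.
posterior = bayes_update(prior=0.3, likelihood=0.9, evidence_rate=0.45)
# posterior is approximately 0.6: seeing clouds doubles the belief in rain.
```

Each new piece of evidence turns the current posterior into the next prior, which is the sense in which rational belief revision is probabilistic rather than deterministic.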
Where Does Human "Probabilistic Determinism" Stem From?
Hierarchical, Modular Architecture with Redundancy
When we try to spell "strawberry," we don't rely on generic probabilistic pattern matching; we have dedicated neural circuits for phonology, orthography, motor control, etc. These specialized subsystems have been reinforced millions of times with human feedback and are redundant, with the same information represented in overlapping ways across many neurons. These specialized neural circuits are so well trained that they are practically deterministic under normal conditions. In summary, human brains compile overlearned skills into near-deterministic routines. LLMs, however, lack this modularity: they're one large probabilistic blob, primarily predicting the next token. I believe agentic AI with access to external tools is already addressing this shortcoming of LLMs.
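The agentic fix can be sketched in a few lines. This is a hedged toy, not any real agent framework: the tool names and the `agent` function are hypothetical. The idea mirrors the brain's modularity described above: overlearned, deterministic sub-tasks go to dedicated tools, and the probabilistic model is only a fallback.

```python
# Deterministic "modules", analogous to specialized neural circuits.
TOOLS = {
    "spell": lambda word: " ".join(word.upper()),  # orthography module
    "add": lambda a, b: a + b,                     # arithmetic module
}

def agent(task: str, *args):
    """Route a request to a deterministic tool when one exists;
    only a task with no tool would fall through to the probabilistic model."""
    if task in TOOLS:
        return TOOLS[task](*args)  # same input, same output, every time
    raise NotImplementedError("probabilistic LLM fallback not sketched here")

print(agent("spell", "strawberry"))  # S T R A W B E R R Y
print(agent("add", 2, 2))            # 4
```

Real tool-calling agents work on the same principle: the model decides *which* tool to invoke, but the tool itself computes the answer deterministically.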
Grounded, Multi-Modal Learning in Humans
When we learned to add "2 + 2," we didn't learn it just by reading a book, but also by manipulating physical objects, seeing quantities, and hearing it across multiple contexts. Our learning is thus multi-modal, grounded in physical reality, and encoded redundantly across many of our neural subsystems. LLMs, by contrast, learn from text alone, so for an LLM "2 + 2" is just a token pattern without any grounded understanding of quantity. While vision-language models can process text and images together, their learning is often centered on correlations between image features and text descriptions rather than truly grounding concepts in sensory experience.
How to Introduce Grounded, Multi-Modal Learning in AI?
Embodied Learning and Robotics
Embodied AI and robotics are much closer to how humans learn. The idea is to give AI systems physical bodies to manipulate objects, sensors to experience cause and effect, and training through interaction rather than just observation. To draw an analogy: you can read every book about golf, understand the physics of ball flight, memorize the biomechanics of the perfect swing, and know all the theory about weight transfer and club path, but the first time you swing a club, you might miss the ball altogether (at least the less athletic among us).
Does Embodied Learning Really Matter? (Stephen Hawking as a Counterpoint)
Here you might ask: how important is the body for human learning? Consider Dr. Stephen Hawking: he did his best work after his body was severely disabled, when all he was left with was a brilliant, active mind. The counterpoint: Dr. Hawking had a normal childhood with about 20 years of sensorimotor experience. He learned to walk, manipulated physical objects, and experienced physics directly as a child.
Simulated Embodiment at Scale
LLMs have little to no grounding, primarily learning from text and pattern matching without an underlying world model. AI models may need grounded, multimodal learning to establish basic concepts and build from there. Projects like Teslaโs Optimus and various robotics labs are working on this. However, learning via robotics is slow and expensive. This is where simulated embodiment comes into play. Physics simulators like NVIDIA Isaac Sim let AI practice orders of magnitude faster than through physical robotic embodiment.
A Story Analogy (and a Spoiler)
To end on a lighter note: spoiler alert. Do not read further if you (for whatever reason) have yet to watch the movie Good Will Hunting.
I was rewatching the movie; every decade or so it presents me with a brand-new interpretation, and this time it suddenly occurred to me: Will possesses extraordinary intellect. He's read nearly every book there is to read and absorbed every fact and every theory, and yet he can't use his genius to build a meaningful life.
In many ways, an LLM is like Will Hunting: it has ingested the statistical patterns in human language but remains ungrounded, fluent without understanding, articulate without perception, intelligent without empathy. "There's a difference between knowing the path and walking the path."
Secondly, the dynamic between Will and Sean (Robin Williams) represents the kind of mentorship AI needs: human guidance that is emotional, not just technical. As I embark on this journey of embodied-AI learning, I intend to be a Sean for AI's Will Hunting, and I'll keep posting my learnings, philosophical musings, and realizations about this topic in this space.


