The AI Revolution’s Dirty Secret: LLMs can write poetry, but they can’t book your flight. They can explain quantum physics, but they can’t fill out a form. Action Model changes everything by training AI to actually do things, not just talk about them.
The Problem: LLMs Can’t Act
Language Models Were Never Trained to Use Platforms
LLM Training: Just Text
LLMs were trained on language—books, articles, websites. They never learned how to use platforms, only how to describe them.
99.9% Behind GUIs
The internet isn’t text or APIs—it’s graphical interfaces. Every button, form, and menu that humans navigate daily.
Can't Click or Navigate
LLMs weren’t trained to click, type, or navigate, so the interactive internet remains unusable for traditional AI.
LLMs vs LAMs: The Fundamental Difference
What LLMs Do
Generate and understand text:
- Predict the next word in a sequence
- Generate human-like responses
- Process and analyze text

Training data:
- Books, articles, journals
- Websites, blogs, social media
- Scientific papers
- Easy to scrape from the internet

Limitations:
- Cannot interact with GUIs
- Cannot perform actual actions
- Often hallucinate interface interactions
- Require APIs or integrations
Why APIs Aren’t the Solution
The API Myth: Less than 0.1% of web functionality is exposed via APIs. Major platforms like Instagram and Booking.com actively restrict or eliminate API access. APIs are built for developers, not users—and they never expose full functionality.
GUIs Are for Humans, APIs Are for Developers
API Limitations
- Restricted functionality
- Developer-focused
- Often blocked or rate-limited
- Requires technical knowledge
- Platform-specific integration
GUI Advantages
- Full platform functionality
- Human-friendly interaction
- Universal approach
- No integration needed
- Works everywhere
The Action Tree: Mapping the Interactive Internet

The Action Tree - How millions of user journeys create a map of every possible action
How User Journeys Become Intelligence
1
User Performs Task
A user completes a task naturally—booking a hotel, posting on social media, or managing emails—while the browser extension records.
2
Journey Recorded
Every click, type, and navigation is captured along with context: DOM elements, screenshots, and environmental state.
3
Path Mapped
The journey becomes a branch in the Action Tree, connecting with similar paths from other users.
4
Tree Grows
Millions of journeys interweave, creating a comprehensive map of how to complete any task on any platform.
5
LAM Navigates
When given a goal, the LAM traverses the Action Tree to find the optimal path, executing actions with human-like precision.
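The five steps above can be sketched as a minimal Action Tree: UI states as nodes, recorded user actions as edges weighted by how often the community performed them. All class, method, and selector names here are illustrative, not Action Model's actual implementation.

```python
# Minimal Action Tree sketch: states -> (action, next_state) edges,
# weighted by how many recorded journeys took that step.
from collections import defaultdict

class ActionTree:
    def __init__(self):
        # state -> {(action, next_state): times_observed}
        self.edges = defaultdict(dict)

    def record_step(self, state, action, next_state):
        """Merge one step of a recorded user journey into the tree."""
        key = (action, next_state)
        self.edges[state][key] = self.edges[state].get(key, 0) + 1

    def best_action(self, state):
        """Pick the most frequently observed action from this state."""
        if not self.edges[state]:
            return None
        (action, nxt), _ = max(self.edges[state].items(),
                               key=lambda kv: kv[1])
        return action, nxt

tree = ActionTree()
# Two users book a hotel the same way; their journeys reinforce one edge.
for _ in range(2):
    tree.record_step("home", "click #search", "results")
tree.record_step("home", "click #deals", "deals")
print(tree.best_action("home"))  # ('click #search', 'results')
```

Frequency-weighted edges are one simple way journeys "interweave": the more users who take a path, the more confidently the model can traverse it later.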
Training Data Requirements
The Complexity Challenge
LLM Training Data
Readily Available
- Trillions of tokens available online
- Books, articles, websites
- Can be scraped automatically
- Synthetic data generation possible
- Static text content
- Web crawlers
- Database dumps
- Public datasets
- API access
LAM Training Data
Must Be Created
- Requires active user participation
- Dynamic interaction sequences
- Context-dependent actions
- Platform-specific nuances
- Temporal relationships
- Browser extensions
- Desktop recording
- User journey tracking
- Active labeling
- Community contribution
The Training Process
From Individual Actions to Collective Intelligence
The Network Effect: Every user journey makes the model smarter. When one person books a flight on a new airline website, millions can now automate that same task. This is the power of community training.
Data Collection Methodology
| Component | What’s Captured | Purpose |
|---|---|---|
| DOM Elements | HTML structure, element IDs, classes | Identify clickable/interactive elements |
| Screenshots | Visual state at each step | Understand visual context and layout |
| Mouse Coordinates | Exact click positions | Precise action replay |
| Keyboard Input | Text entered, keys pressed | Form filling and navigation |
| URL Navigation | Page transitions and routes | Understand site structure |
| Network Requests | API calls and responses | Capture dynamic content |
| Timing Data | Delays and load times | Realistic action pacing |
| Error States | Failed attempts and recovery | Robust error handling |
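A single captured step might be represented roughly as follows. This is a hypothetical schema mirroring the table's columns; the field names and example values are invented for illustration, not the extension's actual wire format.

```python
# Hypothetical shape of one captured interaction event, mirroring the
# columns above: DOM, screenshot, coordinates, input, URL, timing, errors.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class ActionEvent:
    action: str                             # "click", "type", "navigate"
    dom_selector: str                       # e.g. "#submit-btn"
    screenshot_ref: str                     # hash of the screenshot at this step
    coords: Optional[Tuple[int, int]] = None  # (x, y) for clicks
    text_input: Optional[str] = None        # keyboard input, if any
    url: str = ""                           # page the action happened on
    elapsed_ms: int = 0                     # timing data for realistic pacing
    error: Optional[str] = None             # failed attempt / recovery info

event = ActionEvent(
    action="click",
    dom_selector="#book-now",
    screenshot_ref="sha256:ab12...",
    coords=(412, 388),
    url="https://example.com/hotels",
    elapsed_ms=230,
)
print(event.action, event.coords)
```

Bundling DOM selector, pixel coordinates, and a screenshot reference in one record is what lets replay fall back from semantic targeting (the selector) to visual targeting (the screenshot and coordinates) when a page changes.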
Community Training at Scale
The Resistance Builds Together
2M+ Trainers
Active community members training the LAM across millions of websites daily.
10B+ Actions
Individual actions recorded, labeled, and integrated into the Action Tree.
100K+ Platforms
Websites and applications mapped with complete workflow coverage.
Real-World Example: Multi-Platform Workflow
Complex Task: “Find trending news on X/Twitter, create a graphic in Canva, and post to Instagram”
How LAMs Execute Complex Chains
1
Understand Intent
Parse the user’s goal into a sequence of sub-tasks across multiple platforms.
2
Navigate to X/Twitter
Use the Action Tree to find the path: Open browser → Navigate to X → Login if needed
3
Find Trending Content
Click explore → Identify trending topics → Extract relevant content
4
Open Canva
Navigate to Canva → Select template → Insert extracted content
5
Create Graphic
Use design tools → Apply styling → Download image
6
Post to Instagram
Navigate to Instagram → Click create post → Upload image → Add caption → Publish
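The six-step chain above can be sketched as an ordered plan of per-platform sub-tasks. The platform names come from the example; the step strings and the `execute` helper are invented for illustration, since a real LAM would resolve each step against the Action Tree and drive a browser.

```python
# Illustrative decomposition of the multi-platform task into an ordered plan.
plan = [
    ("x.com",         ["open", "login_if_needed", "click explore",
                       "extract trending_topics"]),
    ("canva.com",     ["open", "select template", "insert content",
                       "apply styling", "download image"]),
    ("instagram.com", ["open", "click create_post", "upload image",
                       "add caption", "publish"]),
]

def execute(plan):
    """Walk the plan in order, recording each intended action."""
    log = []
    for platform, steps in plan:
        for step in steps:
            # A real LAM would traverse the Action Tree and act in the
            # browser here; this sketch only records the intent.
            log.append(f"{platform}: {step}")
    return log

log = execute(plan)
print(len(log), "actions across", len(plan), "platforms")
```

The key property is ordering: output from one platform (the extracted trending content, the downloaded image) becomes input to the next, so the sub-tasks cannot be reordered or parallelized naively.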
Why Community Training Wins
Millions of Perspectives
- Different workflows for same goal
- Cultural and regional variations
- Platform-specific optimizations
- Edge case coverage
The Training Paradox
Big Tech’s Dilemma: Training a LAM requires massive-scale user interaction data that even Google and Microsoft struggle to collect. Why? Because they can’t watch every user’s screen. But we can—with permission, transparency, and rewards.
Why Action Model Will Win
| Factor | Big Tech | Action Model |
|---|---|---|
| Data Collection | Limited to their platforms | Every website, every platform |
| User Incentive | None (they take your data) | Earn tokens for contribution |
| Training Speed | Slow, corporate processes | Rapid, community-driven |
| Coverage | Their ecosystem only | The entire internet |
| Ownership | Shareholders | Community members |
Technical Architecture
The Action Loop

How LAMs Make Decisions - The Action Loop in Practice
1
Observe Environment
Capture current screen state, DOM, and context
2
Search Action Tree
Find relevant paths based on current state and goal
3
Predict Next Action
Determine optimal next step with confidence scoring
4
Execute Action
Perform click, type, or navigation action
5
Verify Result
Check if action succeeded and goal is closer
6
Repeat or Complete
Continue loop until goal achieved or timeout
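The six-stage loop above can be written as a short control loop. This is a toy version under stated assumptions: the environment is a stub state machine standing in for a real GUI, the tree lookup is a plain dict, and error recovery is omitted. All names are illustrative.

```python
# Sketch of the observe -> search -> predict -> execute -> verify loop.
def action_loop(env, goal_state, tree, max_steps=10):
    for _ in range(max_steps):
        state = env.observe()               # 1. observe environment
        if state == goal_state:
            return True                     # 6. goal achieved
        suggestion = tree.get(state)        # 2-3. search tree, predict action
        if suggestion is None:
            return False                    # no known path from this state
        action, expected = suggestion
        env.execute(action)                 # 4. execute action
        if env.observe() != expected:       # 5. verify result
            return False                    # recovery omitted in this sketch
    return False                            # timeout

class ToyEnv:
    """Stub environment: a tiny state machine instead of a real browser."""
    def __init__(self):
        self.state = "home"
    def observe(self):
        return self.state
    def execute(self, action):
        transitions = {"click #search": "results", "click #book": "confirmed"}
        self.state = transitions.get(action, self.state)

tree = {"home": ("click #search", "results"),
        "results": ("click #book", "confirmed")}
print(action_loop(ToyEnv(), "confirmed", tree))  # True
```

The verify step is what separates this loop from blind macro replay: each action's outcome is checked against the expected next state before the loop continues, and the `max_steps` bound implements the timeout.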
Join the Training Revolution
Install Extension
Start training in 60 seconds and earn tokens for your contribution
Active Training
Label workflows for 100x rewards and higher impact
View Progress
Track your training contribution and earnings in real-time
The Future of AI Training
Projection: By 2026, the Action Tree will contain paths for every significant task on every major platform in every language. This isn’t just an AI model—it’s a complete map of human digital interaction.
What Happens Next
Phase 1: Platform Coverage (Current)
- Mapping major platforms
- Building core workflows
- Community growth

Phase 2: Deep Personalization
- Individual preferences
- Company-specific workflows
- Cultural adaptations

Phase 3: Universal Automation
- Any task, any platform
- Cross-platform chains
- Natural language to completion
You’re not just training an AI. You’re building the future of work. Train it. Own it. Control it.