D:\your_data
Sync local folders. Connect any AI via MCP, API, or CLI.
Like Dropbox, but for your AI agents.
$ diskd sync ~/Documents/contracts
$ diskd mcp serve
// The Problem
xLLM memory is volatile - lost every session
xAgent artifacts disappear after each run
xWorkflows not persisted
xCopy-paste docs into every chat
+Sync folders - files, artifacts, memory
+Agent outputs persist to drive
+Workflows and state survive sessions
+Any AI reads via MCP, API, CLI
// Features
The D: drive your AI has been missing
MCP Server
Model Context Protocol server. Connect Claude, GPT, or any MCP-compatible agent to your files.
Unix-Style Tools
ls, glob, grep, cat, vsearch. Like Unix commands, but for your AI. Pipe-friendly. Composable.
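A sketch of how these tools compose with standard Unix utilities, assuming each command writes plain text to stdout (paths illustrative):
# Count exact matches across a folder
$ diskd grep "NET 30" /contracts | wc -l
# Feed file listings into any downstream tool
$ diskd glob "**/*.pdf" | head -5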
Agent-Native CLI
Bash commands any agent can use. Claude Code, Cursor, Cline, or your own. No SDK needed.
Auto-Indexing
Upload files, indexing starts automatically. Vector embeddings + full-text. Ready to query.
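For example, assuming the upload syntax from the Quick Start below (paths illustrative):
# Upload kicks off vector + full-text indexing automatically
$ diskd upload ./reports/q4-summary.pdf /reports/
# Once indexed, query right away
$ diskd vsearch "quarterly revenue" /reports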
3-Way Hybrid Search
Semantic (vsearch) + keyword (grep) + BI queries (biquery) in one platform. No other system combines all three.
20+ Processors
PDF, DOCX, XLSX, images, audio, video, YouTube, GitHub repos. No ETL pipeline needed.
// What Makes Us Different
Not just storage. An AI-integrated knowledge platform.
Natural Language Query Routing
Ask in plain English. The system automatically picks the best search method.
"Find payment terms" → vsearch (semantic)
"Find NET 30" → grep (exact match)
"Sales over $1000" → biquery (SQL)
Cross-File BI Analysis
Query across Excel, CSV, and database files with natural language or SQL.
"Show me Q4 revenue by region"
→ Queries multiple spreadsheets, returns table
Vision AI OCR
Extract text from scanned PDFs and images using Claude Vision or GPT-4 Vision.
Scanned contracts, receipts, handwritten notes
→ Fully searchable and indexed
DriveDB - Databases in Your Drive
Create queryable SQL databases directly in your drive. Schema-enforced or schema-less.
Store agent memory, workflow state, structured data
→ Auto-indexed with vectors
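A sketch using the biquery tool from the Quick Start below; the agent_state table is hypothetical, and DriveDB creation syntax isn't shown on this page:
# Query structured workflow state stored in a DriveDB (hypothetical table)
$ diskd biquery "SELECT task, status FROM agent_state WHERE status = 'pending'"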
YouTube & Media Transcription
Index video content automatically. Search across transcripts semantically.
Paste YouTube URL or upload audio/video
→ Transcribed, chunked, searchable
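A sketch assuming media files upload the same way documents do (paths illustrative; how YouTube URLs are submitted isn't shown here):
# Upload a recording; transcription and chunking follow
$ diskd upload ./talks/keynote.mp4 /media/
# Search the transcript semantically
$ diskd vsearch "the speaker's argument about agent memory" /media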
GitHub Repository Indexing
Index entire codebases. Search across repositories with semantic understanding.
"Find authentication logic"
→ Finds relevant code across all repos
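In CLI terms, the example above might look like this (the /repos path is illustrative):
$ diskd vsearch "authentication logic" /repos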
// Why Not Model Memory?
The industry is moving from stateless Q&A to stateful intelligence.
“The real question isn’t whether to add memory, but where that memory lives and who manages it.”
Short-lived, app-specific memories don’t support multi-agent work, governance, or long-term reuse.
-Tied to one model (Claude, GPT, etc.)
-Client-side file management
-No multi-agent collaboration
-No audit trail or compliance
-Memory lost on model switch
+Model-agnostic - works with any LLM
+Cloud-native with on-prem option
+Multi-agent collaboration built-in
+Full audit trail: who, what, when
+Memory persists across models
// Built for Production
Memory as a service layer between agents and data
Multi-Tenancy
Org-level controls, user isolation, quota management. State shared across agents, systems, and organizations.
Audit & Compliance
Every operation logged by design. Who did what, when, and why. Session replay to restore context.
Multi-Agent Collab
Multiple agents work on shared projects. No conflict-resolution headaches. Sub-agents use Drive as memory.
Cross-File Reasoning
Extract data from DOCX, PDF, XLSX. Cross-file analysis. BI-grade visualizations from natural language.
Session Persistence
Agent sessions automatically persisted. Replay conversations. Transfer state across agents and systems.
Billions of Files
Infrastructure built to scale. S3 backend handles storage. Context attached via integrations or other agents.
// Architecture
Built for scale. Optimized for cost.
S3-Native Storage
Files stored in S3-compatible object storage. Works with AWS S3, MinIO, OVH, Backblaze. Pay only for what you store.
LSM-Based Index
Log-structured merge-tree for fast writes. Vector indexes built on FAISS. Optimized for append-heavy AI workloads.
Infinite Scale
Storage grows with your data. No pre-provisioning. Terabytes to petabytes. Index shards across workers.
Deploy Anywhere
>On-premise: Docker Compose or Kubernetes. Your data stays in your infra (see sketch below).
>AWS: One-click deploy to ECS/EKS. S3 for storage. RDS optional.
>OVH: Object Storage + Managed Kubernetes. EU data residency.
>Any S3: MinIO, Backblaze B2, Cloudflare R2. Bring your own.
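A minimal on-prem sketch using MinIO as the S3-compatible backend; how diskd is pointed at the MinIO endpoint is deployment-specific and not shown on this page:
# Run a local S3-compatible store (standard MinIO quick start)
$ docker run -d -p 9000:9000 minio/minio server /data
# Then serve MCP as in the Quick Start below
$ diskd mcp serve --port 3001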
Low TCO
// Get Started
Mount your drive in minutes
# 1. Upload files to your drive
# Upload via CLI
diskd upload ./contracts/*.pdf ./reports/
# Or via SDK
await sdk.drive.upload('./contracts/agreement.pdf')
# 2. Start MCP server for your LLM
# Start the MCP server
diskd mcp serve --port 3001
# Or add to Claude Desktop config:
{
  "mcpServers": {
    "diskd": {
      "command": "diskd",
      "args": ["mcp", "serve"]
    }
  }
}
# 3. Agent uses CLI tools (like Unix commands)
# List files in drive
$ diskd ls /contracts
# Find files by pattern
$ diskd glob "**/*.pdf"
# Read file content
$ diskd cat /contracts/agreement.pdf
# Exact text search (like grep)
$ diskd grep "NET 30" /contracts
# Semantic search (vector similarity)
$ diskd vsearch "payment terms and conditions" /contracts
# SQL query on structured data
$ diskd biquery "SELECT * FROM data WHERE amount > 1000"
# Works with any bash-capable agent
# Claude Code, Cursor, Cline, Aider - they all run bash
# Your agent just calls diskd commands like any Unix tool
# Example agent prompt:
"Search my contracts for payment terms using:
diskd vsearch 'payment terms' /contracts
Then summarize what you find."
// The Origin Story
In the early days of personal computing, the C: drive was the system. It was where the OS lived. Essential, but fragile. If the system crashed, you wiped C:.
But the D: drive... that was different.
The D: drive was yours. It was where you put the work. It was secondary storage, yet it was the primary source of value. It was persistent. It survived the crash. It survived the upgrade.
“We realized that the industry was obsessed with building a faster CPU, but nobody was building the Computer.”
Fast forward to the AI era. We have built incredible processors (LLMs). We treat them like gods, but they are just CPUs—fast, stateless, and forgetful. We feed them data through a thin straw of RAM (Context Window), and when the power cuts, they forget everything.
diskd is the D: Drive for AI.
Persistence.