THE MISSING DRIVE FOR AI

D:\your_data

Sync local folders. Connect any AI via MCP, API, or CLI.
Like Dropbox, but for your AI agents.

diskd://mount

$ diskd sync ~/Documents/contracts

$ diskd mcp serve

// The Problem

WITHOUT diskd

✗ LLM memory is volatile: lost every session

✗ Agent artifacts disappear after each run

✗ Workflows are not persisted

✗ Copy-paste docs into every chat

WITH diskd

✓ Sync folders: files, artifacts, memory

✓ Agent outputs persist to the drive

✓ Workflows and state survive sessions

✓ Any AI reads via MCP, API, or CLI

// Features

The D: drive your AI has been missing

MCP Server

Model Context Protocol server. Connect Claude, GPT, or any MCP-compatible agent to your files.

Unix-Style Tools

ls, glob, grep, cat, vsearch. Like Unix commands, but for your AI. Pipe-friendly. Composable.

Agent-Native CLI

Bash commands any agent can use. Claude Code, Cursor, Cline, or your own. No SDK needed.

Auto-Indexing

Upload files, indexing starts automatically. Vector embeddings + full-text. Ready to query.

3-Way Hybrid Search

Semantic (vsearch) + keyword (grep) + BI queries (biquery) in one platform. No other system combines all three.
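When several search methods each return a ranked list, the results have to be merged somehow. A standard technique for this is reciprocal rank fusion; the sketch below illustrates that idea under the assumption that it applies here (the function name and file names are hypothetical, and this is not diskd's documented merging behavior):

```python
def rrf_fuse(rankings: dict[str, list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists (e.g. from vsearch and grep) with
    reciprocal rank fusion: each document earns 1/(k + rank + 1) per list
    it appears in, so items ranked well by multiple methods rise to the top."""
    scores: dict[str, float] = {}
    for ranked in rankings.values():
        for rank, doc in enumerate(ranked):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# b.pdf appears high in both lists, so it wins overall.
merged = rrf_fuse({
    "vsearch": ["a.pdf", "b.pdf", "c.pdf"],
    "grep":    ["b.pdf", "c.pdf"],
})
print(merged)  # ['b.pdf', 'c.pdf', 'a.pdf']
```

The constant `k` damps the advantage of a single top rank, which is why fusion beats naively concatenating result lists.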

20+ Processors

PDF, DOCX, XLSX, images, audio, video, YouTube, GitHub repos. No ETL pipeline needed.

// What Makes Us Different

Not just storage. An AI-integrated knowledge platform.

NL

Natural Language Query Routing

Ask in plain English. The system automatically picks the best search method.

"Find payment terms" → vsearch (semantic)

"Find NET 30" → grep (exact match)

"Sales over $1000" → biquery (SQL)
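The three examples above can be approximated with a heuristic router. This is an illustrative sketch only; the function name and rules are hypothetical, not diskd's actual classifier:

```python
import re

def route_query(query: str) -> str:
    """Pick a search backend for a plain-English query.

    Hypothetical heuristics: quoted or ALL-CAPS literals suggest exact
    match (grep); numeric comparisons or aggregations suggest SQL
    (biquery); everything else falls through to semantic search (vsearch).
    """
    if re.search(r'"[^"]+"|\b[A-Z]{2,}\s*\d+\b', query):
        return "grep"      # exact literals like NET 30
    if re.search(r"\b(over|under|sum|average|total|by)\b.*\$?\d", query, re.I):
        return "biquery"   # numeric / aggregation queries map to SQL
    return "vsearch"       # default: semantic similarity

print(route_query("Find payment terms"))  # vsearch
print(route_query("Find NET 30"))         # grep
print(route_query("Sales over $1000"))    # biquery
```

A production router would more likely use an LLM or trained classifier, but the contract is the same: query in, backend name out.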

BI

Cross-File BI Analysis

Query across Excel, CSV, and database files with natural language or SQL.

"Show me Q4 revenue by region"

→ Queries multiple spreadsheets, returns table
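Conceptually, cross-file BI treats each spreadsheet as a table in one SQL engine. A minimal sketch with Python's built-in sqlite3, using made-up data (this is not diskd's implementation, just the shape of the idea):

```python
import csv, io, sqlite3

# Two in-memory "files" standing in for uploaded spreadsheets (hypothetical data).
north = "region,quarter,revenue\nNorth,Q4,1200\nNorth,Q3,900\n"
south = "region,quarter,revenue\nSouth,Q4,800\n"

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue INT)")
for blob in (north, south):
    rows = csv.DictReader(io.StringIO(blob))
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                    [(r["region"], r["quarter"], int(r["revenue"])) for r in rows])

# "Show me Q4 revenue by region" compiled to SQL:
for region, total in con.execute(
        "SELECT region, SUM(revenue) FROM sales WHERE quarter='Q4' GROUP BY region"):
    print(region, total)  # one row per region: North 1200, South 800 (row order not guaranteed)
```

The hard part a real system adds is the natural-language-to-SQL step and schema inference across heterogeneous files.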

AI

Vision AI OCR

Extract text from scanned PDFs and images using Claude Vision or GPT-4 Vision.

Scanned contracts, receipts, handwritten notes

→ Fully searchable and indexed

DB

DriveDB: Databases in Your Drive

Create queryable SQL databases directly in your drive. Schema-enforced or schema-less.

Store agent memory, workflow state, structured data

→ Auto-indexed with vectors
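The schema-enforced vs. schema-less distinction can be illustrated with plain sqlite3. This is a conceptual sketch only; the table names are hypothetical and nothing here is DriveDB's actual API:

```python
import json, sqlite3

con = sqlite3.connect(":memory:")

# Schema-enforced: workflow state with typed, NOT NULL columns.
con.execute("CREATE TABLE runs (id INTEGER PRIMARY KEY, step TEXT NOT NULL)")
con.execute("INSERT INTO runs (step) VALUES ('download')")

# Schema-less: a single JSON document column for arbitrary agent memory.
con.execute("CREATE TABLE memory (doc TEXT)")
con.execute("INSERT INTO memory VALUES (?)",
            (json.dumps({"goal": "summarize contracts"}),))

print(con.execute("SELECT step FROM runs").fetchone()[0])  # download
print(json.loads(con.execute("SELECT doc FROM memory").fetchone()[0])["goal"])
# summarize contracts
```

Schema enforcement catches malformed writes at insert time; the JSON column trades that safety for flexibility when agents don't know their data shape up front.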

YT

YouTube & Media Transcription

Index video content automatically. Search across transcripts with semantic search.

Paste YouTube URL or upload audio/video

→ Transcribed, chunked, searchable

GH

GitHub Repository Indexing

Index entire codebases. Search across repositories with semantic understanding.

"Find authentication logic"

→ Finds relevant code across all repos

// Why Not Model Memory?

The industry is moving from stateless Q&A to stateful intelligence.

“The real question isn’t whether to add memory, but where that memory lives and who manages it.”

Short-lived, app-specific memories don’t support multi-agent work, governance, or long-term reuse.

MODEL-SPECIFIC MEMORY

✗ Tied to one model (Claude, GPT, etc.)

✗ Client-side file management

✗ No multi-agent collaboration

✗ No audit trail or compliance

✗ Memory lost on model switch

DISKD: INFRASTRUCTURE LAYER

✓ Model-agnostic: works with any LLM

✓ Cloud-native with an on-prem option

✓ Multi-agent collaboration built in

✓ Full audit trail: who, what, when

✓ Memory persists across models

// Built for Production

Memory as a service layer between agents and data

Multi-Tenancy

Org-level controls, user isolation, quota management. State shared across agents, systems, and organizations.

Audit & Compliance

Every operation logged by design. Who did what, when, and why. Session replay to restore context.
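As a sketch of "logged by design", here is a minimal append-only audit record in Python. The field names mirror the copy above (who, what, when, why) but are hypothetical, not diskd's schema:

```python
import json, time

audit_log: list[str] = []  # append-only; a real system would persist this durably

def audited(user: str, action: str, target: str, reason: str = "") -> None:
    """Record who did what, to what, when, and why as one JSON line."""
    audit_log.append(json.dumps({
        "who": user, "what": action, "target": target,
        "why": reason, "when": time.time(),
    }))

audited("alice", "read", "/contracts/agreement.pdf", reason="quarterly review")
entry = json.loads(audit_log[-1])
print(entry["who"], entry["what"], entry["target"])
# alice read /contracts/agreement.pdf
```

JSON-lines audit logs are easy to replay in order, which is what makes session replay possible.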

Multi-Agent Collab

Multiple agents work on shared projects. No conflict-resolution headaches. Sub-agents use Drive as memory.

Cross-File Reasoning

Extract data from DOCX, PDF, XLSX. Cross-file analysis. BI-grade visualizations from natural language.

Session Persistence

Agent sessions automatically persisted. Replay conversations. Transfer state across agents and systems.

Billions of Files

Infrastructure built to scale. S3 backend handles storage. Context attached via integrations or other agents.

// Architecture

Built for scale. Optimized for cost.

S3-Native Storage

Files stored in S3-compatible object storage. Works with AWS S3, MinIO, OVH, Backblaze. Pay only for what you store.

LSM-Based Index

Log-structured merge-tree for fast writes. Vector indexes built on FAISS. Optimized for append-heavy AI workloads.
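The write path described above can be sketched in a few lines: writes land in a small in-memory table, which flushes to immutable sorted runs; reads check newest data first. This toy structure only illustrates why an LSM design favors append-heavy workloads; diskd's production index (FAISS vectors, sharding) is far more involved:

```python
class TinyLSM:
    """Toy log-structured merge index: cheap writes, newest-first reads."""

    def __init__(self, flush_at: int = 2):
        self.memtable: dict[str, str] = {}
        self.runs: list[list[tuple[str, str]]] = []  # immutable sorted runs, newest last
        self.flush_at = flush_at

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_at:
            # Flush is a sequential append of a sorted run: no in-place updates.
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key: str):
        if key in self.memtable:             # freshest data wins
            return self.memtable[key]
        for run in reversed(self.runs):      # then newest run to oldest
            for k, v in run:
                if k == key:
                    return v
        return None

db = TinyLSM()
db.put("a", "1"); db.put("b", "2"); db.put("a", "3")
print(db.get("a"))  # 3
```

A real LSM tree adds background compaction of runs and per-run indexes or bloom filters to keep reads fast; the essential property, writes as sequential appends, is what the sketch shows.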

Infinite Scale

Storage grows with your data. No pre-provisioning. Terabytes to petabytes. Index shards across workers.

Deploy Anywhere

> On-premise: Docker Compose or Kubernetes. Your data stays in your infra.

> AWS: One-click deploy to ECS/EKS. S3 for storage. RDS optional.

> OVH: Object Storage + Managed Kubernetes. EU data residency.

> Any S3: MinIO, Backblaze B2, Cloudflare R2. Bring your own.

Low TCO

$0.023/GB/month (S3)
10ms p99 search latency
100K docs/second ingest
0 vendor lock-in

// Get Started

Mount your drive in minutes

# 1. Upload files to your drive

# Upload via CLI
diskd upload ./contracts/*.pdf ./reports/

# Or via SDK
await sdk.drive.upload('./contracts/agreement.pdf')

# 2. Start MCP server for your LLM

# Start the MCP server
diskd mcp serve --port 3001

# Or add to Claude Desktop config:
{
  "mcpServers": {
    "diskd": {
      "command": "diskd",
      "args": ["mcp", "serve"]
    }
  }
}

# 3. Agent uses CLI tools (like Unix commands)

# List files in drive
$ diskd ls /contracts

# Find files by pattern
$ diskd glob "**/*.pdf"

# Read file content
$ diskd cat /contracts/agreement.pdf

# Exact text search (like grep)
$ diskd grep "NET 30" /contracts

# Semantic search (vector similarity)
$ diskd vsearch "payment terms and conditions" /contracts

# SQL query on structured data
$ diskd biquery "SELECT * FROM data WHERE amount > 1000"

# Works with any bash-capable agent

# Claude Code, Cursor, Cline, Aider - they all run bash
# Your agent just calls diskd commands like any Unix tool

# Example agent prompt:
"Search my contracts for payment terms using:
 diskd vsearch 'payment terms' /contracts
Then summarize what you find."

// The Origin Story

In the early days of personal computing, the C: drive was the system. It was where the OS lived. Essential, but fragile. If the system crashed, you wiped C:.

But the D: drive... that was different.

The D: drive was yours. It was where you put the work. It was secondary storage, yet it was the primary source of value. It was persistent. It survived the crash. It survived the upgrade.

“We realized that the industry was obsessed with building a faster CPU, but nobody was building the Computer.”

Fast forward to the AI era. We have built incredible processors (LLMs). We treat them like gods, but they are just CPUs—fast, stateless, and forgetful. We feed them data through a thin straw of RAM (Context Window), and when the power cuts, they forget everything.

diskd is the D: Drive for AI.
Persistence.

Mount the drive.

Your data deserves to survive the model.