draft-ai-search-video-tutorials
MediaJots Build Stack
Software and Tools Used
Software and Tools Used to Build MediaJots
This chapter is a simple, non-technical walkthrough of the main tools inside MediaJots: what each tool does, and why it matters.
Visual Legend
🟦 App shell / interface
🟩 Media & transcript pipeline
🟨 AI generation layer
🟪 Storage & smart search
pywebview (Desktop App Shell)
pywebview
- What it does: Turns the app into a real desktop window instead of a browser tab.
- Why MediaJots needs it: You can use MediaJots like a normal desktop app, while still getting a modern interface.
Browser tab UI -> 🟦 Desktop app window
less app-like more focused + cleaner experience
HTML, CSS, JavaScript (Frontend UI)
HTML, CSS, JavaScript
- What it does: Builds everything you click and see: tabs, buttons, forms, status messages, and results.
- Why MediaJots needs it: MediaJots has many steps, so it needs a clear and responsive interface.
SQLite (Local Persistence)
SQLite
- What it does: Saves your processed media, transcripts, settings, and search data on your own machine.
- Why MediaJots needs it: It keeps your history and results ready to reopen quickly, even without depending on a cloud database.
Without local storage: open app -> start over each time
With SQLite: open app -> continue where you left off 🟪
requests (Internet Calls)
requests
- What it does: Lets MediaJots talk to online services (for transcription and AI features).
- Why MediaJots needs it: Without this connection layer, MediaJots cannot send audio out or get AI responses back.
flowchart LR
A[MediaJots App] --> B[requests]
B --> C[AssemblyAI]
B --> D[OpenRouter]
sequenceDiagram
participant U as User Action
participant M as MediaJots
participant R as requests
participant S as Online Service
U->>M: Click Generate / Ask AI
M->>R: Create API call
R->>S: Send request
S-->>R: Return response
R-->>M: Parsed result
M-->>U: Show output
yt-dlp (Media Extraction)
yt-dlp
- What it does: Pulls usable audio/video data from supported links.
- Why MediaJots needs it: MediaJots must grab the media first before it can create transcripts or summaries.
AssemblyAI (Speech-to-Text)
AssemblyAI
- What it does: Converts speech from audio into written text.
- Why MediaJots needs it: The transcript is the base for everything else: summaries, OutScript, and search.
OpenRouter (LLM + Embeddings)
OpenRouter
- What it does: Connects MediaJots to AI models for summaries, structured output, and smarter search.
- Why MediaJots needs it: It is what makes the app "AI-powered" instead of just a basic transcript viewer.
flowchart LR
A[Transcript Text] --> B[OpenRouter]
B --> C[Summary]
B --> D[AI Answers]
Pydantic (Schema Validation)
Pydantic
- What it does: Checks that AI responses follow the expected format.
- Why MediaJots needs it: AI output can be messy; this keeps results reliable and prevents broken screens.
flowchart LR
A[OpenRouter Raw Output] --> B[Pydantic Schema Check]
B --> C{Valid JSON Shape?}
C -->|Yes| D[Use Result in UI]
C -->|No| E[Raise Validation Error]
E --> F[Return Safe Fallback Message]
sqlite-vec (Vector Similarity in SQLite)
sqlite-vec
- What it does: Adds fast "meaning-based" search over saved transcript chunks.
- Why MediaJots needs it: It helps AI Search find the most relevant moments quickly, not just exact keyword matches.
Keyword search: "exact word match"
AI search: "same idea, different words" 🟪
How These Tools Work Together
End-to-End Tool Chain in MediaJots
- App window + UI (
pywebview+ HTML/CSS/JS) lets you paste links, click actions, and review results. - yt-dlp grabs media from supported links.
- AssemblyAI turns speech into transcript text.
- OpenRouter creates summaries/answers, and Pydantic keeps that output clean.
- SQLite + sqlite-vec save everything and power fast History, Keyword Search, and AI Search.
In plain terms: these tools work together so MediaJots can take a link, turn it into readable knowledge, and help you find key moments later.
flowchart TD
A[Paste URL in MediaJots] --> B[Extract media with yt-dlp]
B --> C[Transcribe speech with AssemblyAI]
C --> D[Generate summary and AI answers via OpenRouter]
D --> E[Validate structured output with Pydantic]
E --> F[Save and search in SQLite + sqlite-vec]
