Video Transcript RAG Search

MediaJots Overview

What MediaJots Is

How it is built — Python backend + pywebview bridge + local ui/index.html frontend.
What it helps with — turning long audio/video into something you can read and work with: transcript, summary, and structured OutScript.
What you can do in-app — tabs for OutScript workflow, Podcast RSS ingest, Keyword Search, AI Search, History, and Settings.
Why it feels fast after first run — transcript packages, metadata, and read-state are cached in SQLite.

End-to-End Workflow in the OutScript Tab

Get Details — URL is normalized; media stream and metadata are resolved (title, duration, audio URL/ext, platform behavior).
Prepare/Transcribe — audio is staged locally, uploaded to AssemblyAI, and polled until transcript completion.
Hydrate transcript package — app stores raw transcript, sentence timing JSON, utterance/speaker-derived data, and cache metadata.
Generate with LLM — OpenRouter transforms transcript into OutScript JSON + markdown article + plain-language summary.
Reopen from cache — subsequent visits to the same source URL can load details/transcript/outscript directly from video_cache.

In practice, this means the first run does the heavy lifting, and repeat visits are much quicker.

details = api.get_video_details(video_url)
tx = api.transcribe_audio_url(src, audio_url, title, description, stage_id)
llm = api.process_transcript_with_llm(src, tx["text"], title, description, sentences_json, speaker_names_json)

Data Model and Persistence Behavior

app_settings keeps provider keys and model choice (assembly_ai_api_key, openrouter_api_key, openrouter_model).
video_cache keeps source URL, media metadata, transcript payloads, OutScript JSON, summary text, read state, and estimated time saved.
transcript_minute_index + transcript_minute_fts power exact-term minute search.
ai_search_chunks stores embedding chunks for semantic retrieval and RAG answers.
Cache behavior uses canonical URLs and upserts, so you do not end up with duplicate rows for the same media.
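
To make the canonical-URL upsert concrete, here is a minimal sketch using Python's built-in sqlite3 module. The canonical_url helper, its normalization rules, and the two-column table are assumptions made for illustration; the real video_cache schema and normalizer are not shown in this chapter.

```python
import sqlite3
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Hypothetical normalizer: lowercase scheme/host, drop fragment and trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def upsert_cache(conn: sqlite3.Connection, url: str, title: str) -> None:
    """Insert a cache row, or update the existing row for the same canonical URL."""
    conn.execute(
        """
        INSERT INTO video_cache (source_url, video_title)
        VALUES (?, ?)
        ON CONFLICT(source_url) DO UPDATE SET video_title = excluded.video_title
        """,
        (canonical_url(url), title),
    )


conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real table; only two columns are modeled here.
conn.execute("CREATE TABLE video_cache (source_url TEXT PRIMARY KEY, video_title TEXT)")

# Two spellings of the same media URL collapse into one row.
upsert_cache(conn, "HTTPS://Example.com/watch/?v=1#t=30", "First title")
upsert_cache(conn, "https://example.com/watch?v=1", "Updated title")
rows = conn.execute("SELECT source_url, video_title FROM video_cache").fetchall()
print(rows)
```

The uniqueness constraint on source_url is what lets the second call update instead of duplicating.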

OutScript Generation and Speaker-Aware Structure

Structured output — OutScript is validated with typed models and normalized before save.
Speaker-aware output — diarization hints and speaker-map reconciliation are applied before final rendering.
Safe fallback path — if strict model output fails, the app falls back to transcript-derived structure so you still get usable output.
Editable results — speaker names and derived markdown/sections can be saved back with save_processed_data.

Search Capabilities

Keyword Search (lexical):
Builds minute rows from sentence timestamps (or word-count fallback).
Uses FTS5 when available, with SQL LIKE fallback when not.
Returns seek offsets so users can jump back into transcript context.
AI Search (semantic + RAG):
Embeds minute chunks with OpenRouter embeddings.
Retrieves top similar chunks via sqlite-vec or Python cosine fallback.
Generates grounded answers with strict JSON validation and source citations.

Use both together: keyword search for exact terms, AI search for intent-level questions.

kw = api.search_keyword_minutes(query="vector", platform="YouTube", limit=300)
ai = api.ai_search_query("Where does the speaker explain vector fallback?")

Additional Platform Features

Podcast RSS ingest with episode listing and "Add to OutScript" handoff.
History tools with open/delete actions, bulk clear, and read tracking.
Read/time-saved metrics based on transcript/OutScript reading-time estimates.
Clear status and error messaging across extraction, transcription, indexing, and LLM calls.
Settings-driven controls including keyword and AI index rebuild triggers.

How This Chapter Connects to Later Chapters

Chapter 1 (this file) gives the system-level map.
Chapter 2 (How to Use MediaJots) is the practical user walkthrough.
Chapter 3 (Keyword Search) deep-dives lexical indexing and minute retrieval internals.
Chapter 4 (AI Search) deep-dives embeddings, vector retrieval, and RAG answer generation.

Use this chapter as the orientation layer before the feature-specific chapters.

How to Use MediaJots

Setup and Required Keys

Settings tab

Before running workflows, open the Settings tab and configure:

assembly_ai_api_key for transcription
openrouter_api_key for summary/OutScript and AI Search
openrouter_model for LLM generation

After saving settings, use the index buttons when needed:

Build Keyword Index
Build AI Search Index

This ensures search tabs are ready after you process content.

Process a Video End-to-End

Use the OutScript tab in this order:

Paste a supported video URL.
Click Get Details to resolve media metadata and stream URL.
Click Generate OutScript to run transcription + LLM processing.
Review transcript, summary, and OutScript output.
Click Save to persist final edited state.

The app automatically reuses cache when the same source URL is reopened.

details = api.get_video_details(video_url)
tx = api.transcribe_audio_url(src, audio_url, title, description, stage_id)
llm = api.process_transcript_with_llm(src, tx["text"], title, description, sentences_json, speaker_names_json)

Use Podcast RSS Input

If your content source is podcast-first:

Open the Podcast tab.
Paste the RSS feed URL.
Click Load Feed.
Choose an episode and click Add to OutScript.
Continue in the OutScript tab with the Get Details / Generate OutScript flow.

This bridges feed discovery and the same transcript workflow used for video URLs.

Review and Edit Speaker-Aware Output

After generation:

Verify transcript quality in the transcript panel.
Review summary for clarity and accuracy.
Inspect OutScript sections, timestamps, and speaker labeling.
Adjust speaker names where needed.
Save processed data to persist edits.

Use this step to make the OutScript accurate and easy to revisit later.

Run Keyword Search in Practice

Use Keyword Search when you need exact-term lookup:

Rebuild minute index (if content changed).
Enter query and optional platform filter.
Open a result via Open Transcript.
Continue analysis in OutScript tab at the relevant moment.

api.rebuild_keyword_index()
rows = api.search_keyword_minutes("speaker labels", "YouTube", 300)

Run AI Search in Practice

Use AI Search for intent-level questions:

Rebuild AI index after new transcript content is added.
Ask a natural-language question.
Read answer + citations.
Inspect retrieved source chunks for verification.

api.rebuild_ai_search_index()
answer = api.ai_search_query("What are the main optimization strategies discussed?")

Use History and Read Tracking

The History tab helps you manage processed assets:

reopen prior items quickly
delete individual transcript packages
clear all transcript cache when needed
track read state and estimated time saved

Mark transcripts as read when you finish reviewing to keep progress meaningful.

Recommended Daily Workflow

For consistent output quality:

Confirm settings/API keys.
Process new media in OutScript tab.
Edit speakers and save final output.
Rebuild Keyword + AI indexes.
Use searches to revisit key moments and reinforce understanding.
Mark transcripts as read to track your learning progress.

This sequence keeps your archive searchable and your learning progress easy to track.

Keyword Search

What Keyword Search Solves

Why Keyword Search Exists in MediaJots

Keyword Search is your fast lookup layer for media you already transcribed. Instead of scanning full transcripts, you jump straight to minute-level segments that match what you typed.

In day-to-day use, it solves three recurring problems:

Recall: "Where did this speaker mention X?"
Navigation: "Open the transcript near that moment."
Cross-item lookup: "Find matches across all cached videos/podcasts."

This is intentionally separate from AI Search:

Keyword Search is lexical (FTS/LIKE over text rows).
AI Search is semantic (embedding vectors + RAG answer generation).

This chapter is all about how that keyword path works.

result = api.search_keyword_minutes("speaker labels", "YouTube", 300)

Data Prerequisites and Source of Truth

What Must Exist Before Keyword Search Works

Keyword Search only indexes what is already in video_cache. If raw_transcript_text is empty, there is nothing to index.

The primary source query comes from AppDatabase.list_video_cache_for_keyword_index(), which returns:

source_url
video_title
video_description
sentences_json
raw_transcript_text
speaker_names_json
updated_at

So the indexing pipeline assumes transcript generation already happened in the OutScript workflow.

Important detail:

If a user has never transcribed media, index rebuild succeeds structurally but produces zero searchable minute rows.

Minute-Level Chunking Strategy

How MediaJots Converts a Transcript into Searchable Minute Rows

The core transformation lives in Api._minute_rows_for_item(item).

Preferred path: sentence timestamps

If sentences_json is available, MediaJots:

Iterates sentence records.
Reads each sentence text and start timestamp (milliseconds).
Buckets sentence text into minute = start // 60000.
Optionally prefixes speaker name/label ([Speaker X] ...) using speaker_names_json.
Joins all sentence text in the same minute into one transcript chunk.

For each minute bucket, it emits one row containing:

platform
title
description
minute label (h:mm)
updated timestamp
transcript chunk
source URL

Fallback path: no sentence timing

If sentence-level JSON is missing, it falls back to simple transcript slicing:

Splits raw_transcript_text into words.
Groups by 160 words per synthetic minute.
Emits rows with minute = index // 160.

That fallback keeps search usable even when timing metadata is imperfect.
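
The two chunking paths can be sketched as follows. This is not the body of Api._minute_rows_for_item; it is a simplified illustration that assumes each sentence record carries "text", "start" (milliseconds), and an optional "speaker" key, and that speaker names arrive as a plain dict. Those field names are assumptions.

```python
import json
from collections import defaultdict
from typing import Dict, List, Optional

WORDS_PER_MINUTE = 160  # fallback bucket size described above


def minute_label(minute: int) -> str:
    """Format a minute index as h:mm, e.g. 75 -> '1:15'."""
    return f"{minute // 60}:{minute % 60:02d}"


def minute_chunks(sentences_json: str, raw_text: str,
                  speaker_names: Optional[Dict[str, str]] = None) -> List[dict]:
    """Group transcript text into per-minute chunks (timestamps first, word count second)."""
    speaker_names = speaker_names or {}
    try:
        sentences = json.loads(sentences_json) if sentences_json else []
    except json.JSONDecodeError:
        sentences = []
    if not isinstance(sentences, list):
        sentences = []

    buckets = defaultdict(list)
    if sentences:
        # Preferred path: bucket each sentence by its start timestamp.
        for s in sentences:
            text = str(s.get("text", "")).strip()
            if not text:
                continue
            minute = int(s.get("start", 0)) // 60000
            speaker = s.get("speaker")
            if speaker:
                name = speaker_names.get(speaker, f"Speaker {speaker}")
                text = f"[{name}] {text}"
            buckets[minute].append(text)
    else:
        # Fallback path: every 160 words counts as one synthetic minute.
        words = raw_text.split()
        for i in range(0, len(words), WORDS_PER_MINUTE):
            buckets[i // WORDS_PER_MINUTE].append(" ".join(words[i:i + WORDS_PER_MINUTE]))

    return [
        {"minute": m, "minute_label": minute_label(m), "transcript": " ".join(parts)}
        for m, parts in sorted(buckets.items())
    ]


sample = json.dumps([
    {"text": "Welcome back.", "start": 1500, "speaker": "A"},
    {"text": "Let's talk vectors.", "start": 61000, "speaker": "B"},
])
timed = minute_chunks(sample, "", {"A": "Host"})
fallback = minute_chunks("", "word " * 200)
```

In the real pipeline each chunk would also carry platform, title, description, updated timestamp, and source URL, as listed above.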

Index Rebuild Flow (Backend API)

What Happens When You Click "Rebuild Minute Index"

UI triggers window.pywebview.api.rebuild_keyword_index(), mapped to Api.rebuild_keyword_index().

Backend flow:

Fetch all eligible transcript cache rows via list_video_cache_for_keyword_index().
Expand each item into minute rows using _minute_rows_for_item.
Persist rows with db.rebuild_transcript_minute_index(rows).
Fetch platform filter options with db.list_keyword_platforms().
Return structured status:

success flag
message (Indexed N minute row(s).)
indexed row count
platform list

rebuild_transcript_minute_index() is a full refresh, not incremental:

Deletes existing transcript_minute_index
Deletes existing transcript_minute_fts (when available)
Reinserts all normalized rows
Rebuilds FTS mirror rows aligned by rowid

Practically, this keeps results consistent after transcript edits or cache changes.

items = db.list_video_cache_for_keyword_index()
rows = [r for item in items for r in api._minute_rows_for_item(item)]
inserted = db.rebuild_transcript_minute_index(rows)

SQLite Storage Model and FTS Design

How Keyword Data Is Stored for Fast Lookup

MediaJots uses two synchronized SQLite tables for keyword retrieval:

transcript_minute_index (authoritative structured rows)
transcript_minute_fts (FTS5 virtual table for full-text matching)

Structured table (`transcript_minute_index`)

Stores:

metadata (platform, title, description, source_url, updated_at)
navigation fields (minute_start, minute_label)
searchable body (transcript)

minute_start stores the h:mm label converted to total minutes (for example, 1:15 becomes 75), which is what sorting and seek calculations use.

Full-text table (`transcript_minute_fts`)

Created with FTS5 when available and indexed on:

platform, title, description, transcript

Fields like source_url and minute_label are unindexed payload columns (kept for display/navigation joins).

The goal here is simple:

Use FTS when available for better speed and matching.
Keep a fallback path for environments where FTS is unavailable.
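
A rough sketch of the two-table layout and the full-refresh rebuild, using only Python's sqlite3 module. The column types, the label_to_minutes helper, and the rebuild function are illustrative assumptions; only the table names, the indexed columns, and the delete-then-reinsert behavior come from this chapter.

```python
import sqlite3


def label_to_minutes(label: str) -> int:
    """Convert an h:mm label to total minutes, e.g. '1:15' -> 75."""
    hours, minutes = label.split(":")
    return int(hours) * 60 + int(minutes)


def create_tables(conn: sqlite3.Connection) -> bool:
    """Create both tables; return False when this SQLite build lacks FTS5."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS transcript_minute_index (
               id INTEGER PRIMARY KEY,
               platform TEXT, title TEXT, description TEXT,
               minute_label TEXT, minute_start INTEGER,
               updated_at TEXT, transcript TEXT, source_url TEXT)"""
    )
    try:
        conn.execute(
            """CREATE VIRTUAL TABLE IF NOT EXISTS transcript_minute_fts USING fts5(
                   platform, title, description, transcript,
                   source_url UNINDEXED, minute_label UNINDEXED)"""
        )
        return True
    except sqlite3.OperationalError:
        return False  # search would then rely on the LIKE fallback


def rebuild(conn: sqlite3.Connection, rows: list, has_fts: bool) -> int:
    """Full refresh: delete everything, reinsert, mirror into FTS by rowid."""
    with conn:  # single transaction for the whole refresh
        conn.execute("DELETE FROM transcript_minute_index")
        if has_fts:
            conn.execute("DELETE FROM transcript_minute_fts")
        for r in rows:
            cur = conn.execute(
                "INSERT INTO transcript_minute_index "
                "(platform, title, description, minute_label, minute_start, "
                "updated_at, transcript, source_url) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                (r["platform"], r["title"], r["description"], r["minute_label"],
                 label_to_minutes(r["minute_label"]), r["updated_at"],
                 r["transcript"], r["source_url"]),
            )
            if has_fts:
                conn.execute(
                    "INSERT INTO transcript_minute_fts "
                    "(rowid, platform, title, description, transcript, "
                    "source_url, minute_label) VALUES (?, ?, ?, ?, ?, ?, ?)",
                    (cur.lastrowid, r["platform"], r["title"], r["description"],
                     r["transcript"], r["source_url"], r["minute_label"]),
                )
    return len(rows)


conn = sqlite3.connect(":memory:")
has_fts = create_tables(conn)
rows = [
    {"platform": "YouTube", "title": "Vectors 101", "description": "",
     "minute_label": "0:02", "updated_at": "2024-01-01",
     "transcript": "cosine distance explained", "source_url": "https://example.com/v1"},
    {"platform": "Podcast", "title": "Episode 7", "description": "",
     "minute_label": "1:15", "updated_at": "2024-01-02",
     "transcript": "speaker labels and diarization", "source_url": "https://example.com/p7"},
]
rebuild(conn, rows, has_fts)
inserted = rebuild(conn, rows, has_fts)  # a second rebuild replaces, never appends
total = conn.execute("SELECT COUNT(*) FROM transcript_minute_index").fetchone()[0]
```

Reusing the structured row's id as the FTS rowid is what makes the join between the two tables possible.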

Query Execution Path and Fallback Logic

How a Search Query Is Executed

UI call: window.pywebview.api.search_keyword_minutes(query, platform, limit)

Backend call chain:

Api.search_keyword_minutes(...) -> AppDatabase.search_transcript_minute_index(...)

Query behavior

query is trimmed text input.
platform is optional exact filter.
limit is clamped to [1, 2000] (UI typically passes 300).

Two query modes

With query text

Try FTS5:
transcript_minute_fts MATCH ?
Join to transcript_minute_index for full row data
If FTS unavailable, fallback to SQL LIKE across:
platform
title
description
transcript

Without query text

Return recent index rows (optionally filtered by platform), sorted and limited.

Response row shape

Each result includes:

platform, title, description
minute_label, updated_at
transcript, source_url
minute_start_minutes
computed seek_offset_ms = minute_start * 60 * 1000

That seek_offset_ms is the key that enables jump-to-moment navigation.

SELECT i.title, i.minute_label, i.transcript
FROM transcript_minute_fts f
JOIN transcript_minute_index i ON i.id = f.rowid
WHERE transcript_minute_fts MATCH ?
LIMIT 300;
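
The SQL above covers the FTS5 path. The sketch below illustrates the other pieces described in this section: limit clamping, the LIKE fallback, the optional platform filter, and the seek offset calculation. The function names and the exact SQL are assumptions; the ORDER BY clause follows the ordering described under "Result ordering" later in this chapter.

```python
import sqlite3

COLUMNS = ("platform", "title", "description", "minute_label",
           "minute_start", "updated_at", "transcript", "source_url")


def clamp_limit(limit: int) -> int:
    """Keep the limit inside [1, 2000]."""
    return max(1, min(2000, int(limit)))


def search_like(conn: sqlite3.Connection, query: str,
                platform: str = "", limit: int = 300) -> list:
    """LIKE-based search; an empty query returns recent rows instead."""
    query = query.strip()
    where, params = [], []
    if query:
        # Note: % and _ inside the user query are not escaped in this sketch.
        like = f"%{query}%"
        where.append("(platform LIKE ? OR title LIKE ? "
                     "OR description LIKE ? OR transcript LIKE ?)")
        params += [like] * 4
    if platform:
        where.append("platform = ?")
        params.append(platform)

    sql = f"SELECT {', '.join(COLUMNS)} FROM transcript_minute_index"
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += " ORDER BY updated_at DESC, title, minute_start LIMIT ?"
    params.append(clamp_limit(limit))

    results = []
    for row in conn.execute(sql, params):
        item = dict(zip(COLUMNS, row))
        item["seek_offset_ms"] = item["minute_start"] * 60 * 1000
        results.append(item)
    return results


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transcript_minute_index (id INTEGER PRIMARY KEY, platform TEXT, "
    "title TEXT, description TEXT, minute_label TEXT, minute_start INTEGER, "
    "updated_at TEXT, transcript TEXT, source_url TEXT)"
)
conn.executemany(
    "INSERT INTO transcript_minute_index (platform, title, description, minute_label, "
    "minute_start, updated_at, transcript, source_url) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("YouTube", "Talk A", "", "0:02", 2, "2024-02-01",
         "We fall back to a vector scan.", "https://example.com/a"),
        ("Podcast", "Show B", "", "0:05", 5, "2024-03-01",
         "Vector search in practice.", "https://example.com/b"),
        ("YouTube", "Talk A", "", "0:00", 0, "2024-02-01",
         "Welcome and introductions.", "https://example.com/a"),
    ],
)
hits = search_like(conn, " vector ", platform="YouTube", limit=5000)
recent = search_like(conn, "")
```

Here hits contains the single YouTube row at minute 0:02, and recent lists every row with the newest media first.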

Frontend UX and Interaction Lifecycle

Keyword Search UI Behavior in `ui/index.html`

The keywordView tab contains:

rebuild button (keywordRefreshBtn)
platform filter dropdown (keywordPlatformFilter)
search input (keywordQueryInput)
search button (keywordSearchBtn)
result count + results list

Rebuild lifecycle

rebuildKeywordIndex():

disables rebuild button
shows "Building minute index…"
calls backend rebuild API
refreshes platform filter options
clears current result list
restores button state

Search lifecycle

searchKeywordIndex():

validates non-empty query
shows "Searching keyword index…"
calls backend search API
renders rows with highlighted query matches
updates count and status text

Result item rendering

renderKeywordResults() shows for each row:

title
platform + minute + updated time
description snippet
transcript snippet
action button: Open Transcript

When you click Open Transcript, openKeywordInWorkflow(source_url, seek_offset_ms) switches to the main video workflow and opens the media near that minute.

Seek and Navigation Integration

How Keyword Search Connects Back to OutScript Workflow

Keyword Search is not a standalone viewer; it is a discovery layer that routes you back into the main media/transcript workflow.

openKeywordInWorkflow(sourceUrl, seekOffsetMs) does the following:

Stores pending seek offset (pendingOutscriptSeekMs).
Activates the main "video" tab.
Writes source_url into the workflow URL input.
Calls getVideoDetails() to load cached/extracted media state.
Lets downstream workflow components scroll or seek to that timestamp.

This creates a smooth handoff:

Search result -> exact source media -> contextual transcript/outscript navigation.

The practical UX is: find first, then deep-read in the right place.

Reliability, Edge Cases, and Performance Notes

Practical Edge Cases and Why the Design Holds Up

1) Missing timing metadata

When sentence timestamps are unavailable, the word-count fallback still lets indexing run. Precision is lower, but discoverability remains.

2) Environments without SQLite FTS5

Search automatically falls back to LIKE queries. That keeps functionality available across more Python/SQLite builds, with some trade-off in performance and matching quality.

3) Speaker labeling quality

If diarization labels exist, minute chunks include bracketed speaker markers. This improves query utility for terms like names or role-specific references.

4) Rebuild consistency

Index rebuild uses full replace. That avoids stale fragments when transcript content changes.

5) Result ordering

Rows are sorted by:

newest updated_at first,
then title,
then minute order.

So you see recent media first, while preserving timeline order inside each item.

End-to-End Execution Trace

Full Request Trace: From Button Click to Clickable Result

User transcribes media in OutScript workflow (data lands in video_cache).
User opens Keyword Search tab.
User clicks Rebuild Minute Index.
Backend expands transcript content to minute rows and rebuilds SQLite index tables.
User enters query + optional platform filter.
Backend executes FTS5 (or LIKE fallback) and returns matching minute rows.
UI renders highlighted snippets and counts.
User clicks Open Transcript on a match.
App reopens source media in workflow context with minute seek offset.

That loop is the heart of Keyword Search in MediaJots: index once, query fast, jump to context.

AI Search

What AI Search Is Designed to Do

Why AI Search Exists Alongside Keyword Search

AI Search in MediaJots is the semantic layer for your transcript archive. Instead of exact text matching, it finds conceptually related chunks and then builds a grounded answer with citations.

The feature solves questions like:

"What are the main arguments about topic X across my archive?"
"Where did speakers discuss this idea, even if they used different wording?"
"Give me a concise answer and show which transcript chunks support it."

Compared to Keyword Search:

Keyword Search: lexical lookup over minute text.
AI Search: vector similarity + LLM response constrained to retrieved sources.

Core Architecture and Components

High-Level AI Search Stack in MediaJots

AI Search is split into three layers:

Data preparation + embeddings (api.py + ai_search.py)
Vector/chunk storage + nearest-neighbor retrieval (db.py)
RAG answer generation + UI rendering (ai_search.py + ui/index.html)

Main backend entry points

Api.rebuild_ai_search_index()
Api.ai_search_status()
Api.ai_search_query(question)

External model/API usage

OpenRouter embeddings endpoint for chunk vectors
OpenRouter chat endpoint for final grounded JSON answer

Persistence layer

ai_search_chunks SQLite table stores chunk metadata + serialized vectors
Optional sqlite-vec acceleration if available
Pure-Python cosine fallback if vector extension is unavailable

That design keeps AI Search working both in high-performance setups and in more minimal environments.

status = api.ai_search_status()
result = api.ai_search_query("How is cosine fallback handled?")

Index Build Prerequisites and Source Rows

What Data AI Search Reuses from the Transcript Pipeline

Just like Keyword Search, AI Search starts from transcript cache entries you already generated.

Api.rebuild_ai_search_index() calls:

db.load_settings() to fetch API key/model settings
db.list_video_cache_for_keyword_index() to fetch transcript-bearing media rows

It expands each item through _minute_rows_for_item(item) (the same minute-chunking logic used by keyword indexing). So both search systems are built on the same chunk foundation.

Input guarantees before a useful AI index can be built:

OpenRouter API key exists in settings
At least one cached item has transcript text
Minute rows can be generated from sentence timestamps or fallback chunking

If prerequisites fail, the API returns a clear message instead of silently building a broken partial index.

Chunk Construction and Embedding Generation

How MediaJots Builds Embedding Inputs

Inside rebuild_ai_search_index():

For each minute row, read:

title
transcript chunk body
minute label
source URL/platform

Create chunk_text from transcript body (trimmed).
Build embedding input as:

title + "\n\n" + chunk_text

Trim text length to model-safe limits (<= 7500 chars).

Why include title + chunk text together:

It injects topical context into embedding space.
Similar questions can retrieve chunks that might not repeat all core terms in the body itself.

Embedding generation is batched in fetch_embeddings_openrouter():

model: openai/text-embedding-3-small (default)
batch size: 32
timeout controls + robust error handling
strict vector count validation (must align 1:1 with inputs)

If anything mismatches (or request fails), rebuild stops with a clear error payload.

vectors, err = fetch_embeddings_openrouter(api_key, texts, model="openai/text-embedding-3-small")
inserted = db.replace_ai_search_chunks(db_rows)
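
The input construction, batching, and count check can be sketched without any network access. The two constants match the values quoted in this chapter; build_embedding_input, batched, embed_all, and fake_embed are hypothetical names, and fake_embed is only a stand-in for the real embeddings call.

```python
from typing import Callable, Iterator, List

MAX_EMBED_CHARS = 7500
EMBED_BATCH_SIZE = 32


def build_embedding_input(title: str, chunk_text: str) -> str:
    """Title, blank line, trimmed chunk body, cut to the character limit."""
    text = title + "\n\n" + chunk_text.strip()
    return text[:MAX_EMBED_CHARS]


def batched(texts: List[str], size: int = EMBED_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` inputs."""
    for i in range(0, len(texts), size):
        yield texts[i:i + size]


def fake_embed(batch: List[str]) -> List[List[float]]:
    """Stand-in embedder (assumption): one tiny vector per input, no API call."""
    return [[float(len(text))] for text in batch]


def embed_all(texts: List[str],
              embed_fn: Callable[[List[str]], List[List[float]]] = fake_embed) -> List[List[float]]:
    """Embed batch by batch and insist on a 1:1 match between inputs and vectors."""
    vectors: List[List[float]] = []
    for batch in batched(texts):
        result = embed_fn(batch)
        if len(result) != len(batch):
            raise ValueError(f"expected {len(batch)} vectors, got {len(result)}")
        vectors.extend(result)
    return vectors


minute_rows = [("Episode 1", f"  chunk number {i}  ") for i in range(70)]
texts = [build_embedding_input(title, body) for title, body in minute_rows]
batch_sizes = [len(b) for b in batched(texts)]
vectors = embed_all(texts)
```

With 70 inputs the batches come out as 32, 32, and 6, and a short batch response would raise before anything is written to the index.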

Vector Storage and Retrieval Strategy

How Embeddings Are Stored and Queried

Embeddings are persisted in ai_search_chunks with metadata per chunk:

source_url
seek_offset_ms
minute_label
platform
title
chunk_text
embedding (JSON serialized float list)
updated_at

Rebuild behavior in replace_ai_search_chunks(rows) is full replacement:

delete all old rows
insert normalized new rows

This keeps the index consistent after transcript updates.

Retrieval path: nearest chunks

search_ai_similar_chunks(query_embedding, k=10):

If sqlite-vec is usable:
compute cosine distance in SQL (vec_distance_cosine)
order ascending distance
Else:
load vectors into Python
compute cosine distance manually
sort and take top-k

Returned rows always include distance + navigation metadata used by the UI and citation rendering.
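
For the Python fallback path, a minimal sketch of cosine-distance ranking over JSON-serialized vectors looks like this. It is not the body of search_ai_similar_chunks; the row shape and the helper names are assumptions, and the zero-vector handling is a choice made for this example.

```python
import json
import math
from typing import List


def cosine_distance(a: List[float], b: List[float]) -> float:
    """1 - cosine similarity; zero-length vectors count as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def top_k_chunks(query_embedding: List[float], rows: List[dict], k: int = 10) -> List[dict]:
    """Score every stored chunk against the query and keep the k closest."""
    scored = []
    for row in rows:
        vector = json.loads(row["embedding"])  # stored as a JSON float list
        scored.append({**row, "distance": cosine_distance(query_embedding, vector)})
    scored.sort(key=lambda r: r["distance"])  # ascending: closest first
    return scored[:k]


stored = [
    {"title": "a", "minute_label": "0:00", "embedding": json.dumps([1.0, 0.0])},
    {"title": "b", "minute_label": "0:01", "embedding": json.dumps([0.0, 1.0])},
    {"title": "c", "minute_label": "0:02", "embedding": json.dumps([1.0, 1.0])},
]
top = top_k_chunks([1.0, 0.0], stored, k=2)
```

This scans every row, which is why the chapter notes that sqlite-vec matters more as the archive grows.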

Question Flow and Query Embedding

What Happens When User Asks an AI Search Question

UI calls window.pywebview.api.ai_search_query(question).

Backend validation steps in Api.ai_search_query():

Ensure OpenRouter key exists.
Ensure question text is non-empty.
Ensure AI index has chunks (count_ai_search_chunks() > 0).

Then:

Embed the user question via fetch_embeddings_openrouter(api_key, [question]).
Retrieve top 10 nearest chunks via db.search_ai_similar_chunks(...).
Hydrate speaker labels in chunk text (hydrate_bracketed_diarization_tags) using cached speaker hints from video_cache.
Build ranked source objects with:

rank number
URL/title/platform/minute metadata
chunk text
distance

If no similar chunks are found, the call exits early with a clear "No similar chunks found" style message.
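
The guard order and the ranked source objects can be sketched as below. The function names, message wording, and dictionary keys are illustrative assumptions, not the strings or shapes used by Api.ai_search_query().

```python
from typing import List, Optional


def check_preconditions(api_key: str, question: str, chunk_count: int) -> Optional[str]:
    """Return a user-facing message for the first failed guard, or None when all pass."""
    if not api_key.strip():
        return "Add an OpenRouter API key in Settings."
    if not question.strip():
        return "Enter a question first."
    if chunk_count <= 0:
        return "The AI index is empty. Rebuild the AI Search index."
    return None


def rank_sources(rows: List[dict]) -> List[dict]:
    """Number retrieved chunks from 1 so citations can refer to them."""
    return [
        {
            "rank": rank,
            "source_url": row["source_url"],
            "title": row["title"],
            "platform": row["platform"],
            "minute_label": row["minute_label"],
            "chunk_text": row["chunk_text"],
            "distance": row["distance"],
        }
        for rank, row in enumerate(rows, start=1)
    ]


retrieved = [
    {"source_url": "https://example.com/a", "title": "Talk A", "platform": "YouTube",
     "minute_label": "0:02", "chunk_text": "cosine fallback", "distance": 0.12},
    {"source_url": "https://example.com/b", "title": "Show B", "platform": "Podcast",
     "minute_label": "0:05", "chunk_text": "vector search", "distance": 0.31},
]
problem = check_preconditions("", "What is discussed?", 5)
ok = check_preconditions("sk-example", "What is discussed?", 5)
sources = rank_sources(retrieved)
```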

RAG Prompting and Structured Answer Contract

How the Final AI Answer Is Generated and Validated

Answer synthesis is handled by run_ai_search_rag(...) in ai_search.py.

Prompt construction

Retrieved chunks are serialized into numbered blocks:

SOURCE [1] ...
SOURCE [2] ...
…

Prompt rules force grounding:

answer only from supplied sources
admit insufficient evidence when needed
return strict JSON with:
answer (string)
citations (list of source numbers)
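
A small sketch of how numbered source blocks and grounding rules might be assembled into one prompt. The header layout, the rule wording, and the function names are assumptions; only the SOURCE [n] numbering and the answer/citations contract come from this section.

```python
from typing import List

# Hypothetical rule text; the real prompt wording is not shown in this chapter.
RULES = (
    "Answer only from the SOURCES below. "
    "If they do not contain enough evidence, say so. "
    'Return strict JSON: {"answer": string, "citations": [source numbers]}.'
)


def format_sources(sources: List[dict]) -> str:
    """Serialize retrieved chunks into numbered SOURCE blocks."""
    blocks = []
    for number, src in enumerate(sources, start=1):
        header = f"SOURCE [{number}] {src['title']} | {src['platform']} | {src['minute_label']}"
        blocks.append(header + "\n" + src["chunk_text"].strip())
    return "\n\n".join(blocks)


def build_rag_prompt(question: str, sources: List[dict]) -> str:
    """Rules first, then the numbered sources, then the question."""
    return f"{RULES}\n\n{format_sources(sources)}\n\nQUESTION: {question.strip()}"


example_sources = [
    {"title": "Talk A", "platform": "YouTube", "minute_label": "0:02",
     "chunk_text": "We fall back to cosine distance in Python."},
    {"title": "Show B", "platform": "Podcast", "minute_label": "0:05",
     "chunk_text": "Vector search needs an index rebuild."},
]
prompt = build_rag_prompt("How is the fallback handled? ", example_sources)
print(prompt)
```

Numbering the blocks from 1 in rank order is what lets the citations list map straight back to retrieved chunks.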

Schema enforcement

MediaJots uses a Pydantic model (AiSearchAnswerPayload) and sends JSON schema constraints to OpenRouter response_format.

If provider rejects strict schema mode:

code retries with generic json_object mode

Returned text is then:

parsed from raw model output
schema-validated
citation-normalized to in-range source IDs only

Only validated output is treated as success.

This greatly reduces malformed responses and keeps citation mapping predictable.

parsed = parse_llm_json_payload(raw)
validated = AiSearchAnswerPayload.model_validate(parsed)
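
The parse, validate, and normalize steps can be illustrated with the standard library alone. This sketch deliberately swaps the Pydantic model for manual type checks so it runs without third-party packages; parse_and_validate and its error labels are hypothetical and do not reproduce parse_llm_json_payload or AiSearchAnswerPayload.

```python
import json
from typing import Optional, Tuple


def parse_and_validate(raw: str, source_count: int) -> Tuple[Optional[dict], Optional[str]]:
    """Return (payload, None) on success or (None, error_label) on failure."""
    # Step 1: pull the outermost JSON object out of the raw model text.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None, "parse_failed"
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None, "parse_failed"

    # Step 2: check the contract (answer is a string, citations is a list).
    if not isinstance(data, dict):
        return None, "schema_mismatch"
    answer, citations = data.get("answer"), data.get("citations")
    if not isinstance(answer, str) or not isinstance(citations, list):
        return None, "schema_mismatch"

    # Step 3: keep only unique, in-range integer source numbers.
    cleaned = []
    for c in citations:
        is_int = isinstance(c, int) and not isinstance(c, bool)
        if is_int and 1 <= c <= source_count and c not in cleaned:
            cleaned.append(c)
    return {"answer": answer, "citations": cleaned}, None


raw_output = 'Here you go:\n{"answer": "Cosine fallback runs in Python.", "citations": [2, 7, 2, 0]}'
payload, error = parse_and_validate(raw_output, source_count=3)
bad_json = parse_and_validate("not json at all", source_count=3)
bad_schema = parse_and_validate('{"answer": 5, "citations": []}', source_count=3)
```

Keeping parse failures and schema failures as separate labels mirrors the distinction the debug log is meant to expose.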

Frontend UX: Status, Sources, and Citations

How AI Search Appears in the Desktop UI

In ui/index.html, AI Search tab behavior includes:

rebuild index action (rebuild_ai_search_index)
run query action (ai_search_query)
status text updates for each phase
answer panel
source list panel
citation panel
sqlite-vec capability indicator pill

When query succeeds, UI shows:

human-readable answer text
ranked source chunks used in retrieval
citations that map to source numbers

If query fails, the UI still shows fallback source/citation panels instead of going blank.

That makes it easier to debug retrieval quality vs generation quality.

Error Handling and Safety Guarantees

Why AI Search Fails Predictably (and Recoverably)

The pipeline includes explicit guards at every step:

missing API key -> immediate actionable message
empty question -> prompt user input
empty index -> instruct rebuild
embedding request failure -> abort with network/model error detail
no top chunks -> explain retrieval miss
malformed LLM JSON -> parse-failed response, debug log written
schema mismatch -> strict validation failure, debug log written

Debug logging hooks (write_llm_debug_log) help inspect raw LLM output and distinguish:

transport/API errors
parse/JSON formatting failures
schema contract violations

This layered validation matters because AI Search combines multiple probabilistic systems, and each one needs to fail loudly enough for fast recovery.

Performance Characteristics and Tuning Levers

Where Latency Comes From and What Can Be Tuned

Primary cost centers:

Embedding generation during rebuild (network + model throughput)
Vector similarity retrieval (fast with sqlite-vec, slower with Python fallback)
Final chat completion for RAG answer

Main tuning levers visible in code:

embedding batch size (EMBED_BATCH_SIZE = 32)
max embed chars (MAX_EMBED_CHARS = 7500)
retrieval depth (top 10 chunks)
chat model from settings (openrouter_model)

Trade-off notes:

More retrieved chunks increase context coverage but can dilute relevance.
Smaller chunks improve precision but may lose narrative context.
sqlite-vec availability materially improves retrieval speed at larger archive sizes.

End-to-End AI Search Execution Trace

Full Flow: From Rebuild Click to Cited Answer

User clicks Rebuild AI Search Index.
Backend reads transcript cache rows and expands minute chunks.
Chunks are embedded via OpenRouter embeddings API.
Existing ai_search_chunks table is replaced with fresh vectors + metadata.
User asks a natural-language question.
Question is embedded into vector space.
Top similar chunks are retrieved (sqlite-vec or Python cosine path).
Ranked chunk set is passed to RAG prompt as numbered SOURCES.
OpenRouter chat model returns structured JSON answer + citations.
Backend validates and normalizes output.
UI renders answer, sources, and citation mapping.

This is the core AI Search contract in MediaJots: semantic retrieve, grounded generate, validated cite.

How to install MediaJots on your computer

Download and install PyCharm.

Unzip the project and open it in PyCharm.

Run the following command in the Terminal inside PyCharm:

pip install -r requirements.txt

Then start the app:

python app.py

Now follow the instructions in the “How to Use MediaJots” chapter.
