Personal Knowledge Base
This workflow captures YouTube transcripts, indexes them with qmd, searches them with BM25 or hybrid retrieval, and extends the same index with your own documents.
Initialize
bgng init
bgng init --api-key sd_live_xxxxx
The init command validates the SupaData API key, creates the workspace directories, writes the default transcripts and notes collections, initializes qmd, and creates an empty Markdown queue.
Capture transcripts
One at a time:
bgng url "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
In bulk:
bgng queue add "https://www.youtube.com/watch?v=A"
bgng queue add "https://youtu.be/B"
cat urls.txt | bgng queue add -
bgng queue list
bgng url batch-process
The batch processor rewrites the queue after each completed URL, so an interrupted run leaves completed items in Processed and the in-flight item still pending.
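The interruption behavior described above can be sketched in a few lines. This is a minimal illustration, not bgng's actual implementation: the queue layout (a Markdown file with "## Pending" and "## Processed" headings), the function names, and the write-then-rename step are all assumptions made for the example.

```python
import os
import tempfile
from pathlib import Path


def render_queue(pending, processed):
    """Render the queue as Markdown (hypothetical layout, not bgng's format)."""
    lines = ["## Pending"]
    lines += [f"- {url}" for url in pending]
    lines += ["", "## Processed"]
    lines += [f"- {url}" for url in processed]
    return "\n".join(lines) + "\n"


def write_atomically(path, text):
    """Write to a temp file in the same directory, then rename over the target."""
    path = Path(path)
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.replace(tmp, path)


def process_queue(path, pending, fetch):
    """Process URLs in order, persisting the queue after each success."""
    pending = list(pending)
    processed = []
    while pending:
        fetch(pending[0])  # may raise; the queue on disk is untouched if it does
        processed.append(pending.pop(0))
        write_atomically(path, render_queue(pending, processed))
    return processed
```

Because the file is only rewritten after a URL succeeds, a failure in `fetch` leaves that URL in the pending section on disk.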
Search and retrieve
bgng search "rust ownership"
bgng query "how do borrow checkers work"
bgng get "2026-04-13/my-video"
bgng get "#a3f2c1" --from 20 --lines 50
Use search (BM25) for exact terms and fast lookups. Use query (hybrid) for semantic questions, paraphrases, and ambiguous language.
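To see why a keyword search suits exact terms but not paraphrases, here is a small sketch of Okapi BM25 scoring. It assumes search is the BM25 path mentioned in the overview. The whitespace tokenizer, the parameter values (k1=1.5, b=0.75), and the sample documents are illustrative choices, not what qmd or bgng use internally.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # no literal match, no credit (synonyms are invisible)
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "rust ownership moves values between bindings",
    "the borrow checker enforces aliasing rules",
    "vim motions for faster editing",
]
print(bm25_scores("rust ownership", docs))
print(bm25_scores("who owns a value", docs))
```

The first query scores only the document that contains its literal terms. The second, a paraphrase, scores zero everywhere because none of its tokens appear verbatim. That gap is what a semantic or hybrid query is meant to close.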
Import local content
Register a directory:
bgng import ~/Documents/Notes --collection notes
bgng import ~/Projects/docs --pattern "**/*.{md,txt}"
Import a single file:
bgng import ./meeting-notes.md --collection notes
Single files are copied into ~/.bgng/<collection>/<basename> before indexing. Directories are registered in place and are not copied.
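The two import modes can be summarized with a small path helper. This is a sketch of the rule stated above, not code from bgng: `import_target` is a hypothetical name, and the `root` parameter is an assumption added so the example can be tested.

```python
from pathlib import Path


def import_target(source, collection, root=Path.home() / ".bgng"):
    """Return the path a source would be indexed from (hypothetical helper)."""
    source = Path(source).expanduser()
    if source.is_dir():
        return source  # directories are registered where they live
    return root / collection / source.name  # single files are copied here


print(import_target("./meeting-notes.md", "notes"))
```

One consequence of this layout is that two single files with the same basename, imported into the same collection, map to the same destination path.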
Manage collections and context
bgng collection list
bgng collection add code ~/Projects/awesome --pattern "**/*.ts"
bgng collection add archive ~/Old --no-default
bgng context global "Personal knowledge base of transcripts, notes, and project docs"
bgng context set transcripts /ThePrimeagen "Primeagen videos about Rust, vim, and productivity"
Context descriptions travel with search results and help downstream LLMs interpret matches.
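As a rough illustration of how descriptions might be attached to a result, the sketch below collects the global description plus any path-scoped description whose prefix contains the matched document. The prefix-matching rule, the data layout, and the function name are assumptions for the example. The source only states that descriptions accompany search results.

```python
def contexts_for(collection, doc_path, global_context, path_contexts):
    """Collect the descriptions that apply to one result, broadest first.

    path_contexts maps (collection, path_prefix) -> description.
    """
    found = [global_context] if global_context else []
    matches = []
    for (coll, prefix), text in path_contexts.items():
        if coll != collection:
            continue
        # Match whole path segments only, so /a does not match /ab/file.md.
        if doc_path == prefix or doc_path.startswith(prefix.rstrip("/") + "/"):
            matches.append((prefix, text))
    # Shorter prefixes are more general, so list them first.
    found += [text for _, text in sorted(matches, key=lambda m: len(m[0]))]
    return found


path_contexts = {
    ("transcripts", "/ThePrimeagen"): "Primeagen videos about Rust, vim, and productivity",
}
print(contexts_for(
    "transcripts",
    "/ThePrimeagen/rust-talk.md",
    "Personal knowledge base of transcripts, notes, and project docs",
    path_contexts,
))
```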
Maintain the index
bgng status
bgng reindex
bgng reindex --force
bgng reindex -c notes
Use --force when you suspect stale or corrupt embeddings. It re-embeds all matching documents and can be slow on large collections.
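The difference between a normal and a forced reindex can be pictured with a content-hash check. This sketch assumes that a plain reindex skips documents whose text has not changed since the last embedding. The source does not state how bgng or qmd decide this, so treat the hashing scheme and the names here as hypothetical.

```python
import hashlib


def needs_embedding(docs, stored_hashes, force=False):
    """Return the ids of documents to (re-)embed.

    docs maps id -> current text; stored_hashes maps id -> the hash
    recorded when the document was last embedded.
    """
    selected = []
    for doc_id, text in docs.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if force or stored_hashes.get(doc_id) != digest:
            selected.append(doc_id)
    return selected


docs = {"a.md": "unchanged text", "b.md": "new text"}
stored = {"a.md": hashlib.sha256(b"unchanged text").hexdigest()}
print(needs_embedding(docs, stored))              # only the changed document
print(needs_embedding(docs, stored, force=True))  # everything
```

A check like this cannot notice an embedding that is corrupt while its source text is unchanged, which is the kind of case a forced re-embed covers.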