Overview

The Vector Database is the backbone of AI-powered understanding in the platform.
It stores semantic embeddings of video transcripts, sections, and summaries, allowing the system to reason about content, answer contextual questions, and behave like an intelligent tutor.

We use Qdrant as the vector database engine. Each collection corresponds to a specific layer of understanding of a video — from short transcript snippets to abstracted concept summaries.

Collections

Collection Name                  | Description                                                                                                   | Dimensions | Distance
transcript-short-segments        | Short dialog-level transcript fragments with timestamps and sequence indices. Ideal for pinpointing exact spoken moments. | 1536 | Cosine
transcript-segments-with-context | Longer contextual transcript blocks (≈300–500 tokens) that help AI understand topic flow and meaning.         | 1536 | Cosine
video-metadata                   | AI-generated semantic metadata for each video — titles, summaries, and keywords optimized for semantic search. | 1536 | Cosine
video-section-summaries          | Section-level summaries functioning as a table of contents for each video, with start/end times and topic keywords. | 1536 | Cosine
video-concept-summaries          | Extracted key concepts from sections, summarized and annotated for concept-level reasoning.                   | 1536 | Cosine
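
All five collections use cosine distance over 1536-dimensional embeddings. As a minimal stdlib-only sketch of the similarity measure Qdrant applies internally (toy 3-dimensional vectors here; real embeddings are 1536-dimensional):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for identical direction, 0.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors for illustration; production embeddings have 1536 dimensions.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```

Cosine distance compares embedding direction rather than magnitude, which is why it is the standard choice for text embeddings.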

Example Entries

1. Short Transcript Segments

Compact segments representing exact spoken lines, tagged with timing and sequence information.

{
"contentId": "2259175",
"text": "and we wanna make everything for free.",
"start": 1665.56,
"end": 1668.08,
"segmentIndex": 513
}
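
To illustrate how these payloads support pinpointing spoken moments, here is a self-contained sketch: each stored point pairs an embedding with a payload in the shape shown above, and a nearest-neighbour search returns the timestamps of the best match. The toy 3-dimensional vectors, the second payload, and the brute-force scan are all illustrative stand-ins for the real 1536-dimensional embeddings and Qdrant's ANN index.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Points as (vector, payload); payloads follow the transcript-short-segments schema.
points = [
    ([0.9, 0.1, 0.0], {"contentId": "2259175", "text": "and we wanna make everything for free.",
                       "start": 1665.56, "end": 1668.08, "segmentIndex": 513}),
    ([0.0, 0.2, 0.9], {"contentId": "2259175", "text": "(hypothetical earlier line)",
                       "start": 12.0, "end": 14.5, "segmentIndex": 3}),
]

def search(query_vector, limit=1):
    """Brute-force nearest-neighbour scan; Qdrant does this with an ANN index."""
    ranked = sorted(points, key=lambda p: cosine(query_vector, p[0]), reverse=True)
    return [payload for _, payload in ranked[:limit]]

best = search([1.0, 0.0, 0.0])[0]
print(best["start"], best["end"])  # 1665.56 1668.08
```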

2. Transcript Segments with Context

Larger segments containing multiple spoken lines or coherent ideas.

{
"contentId": "2261053",
"text": "God said it to me first. He said, leave stuff better than you found it, Jackie, because I was such a people pleaser. It was so hard for me to disappoint people that in order for me to actually ever leave a friendship, leave a community, leave a job, I would have to get so ticked off. I would have to get so mad. I would have to let it build up and build up and build up and build up and build up and build up until I didn't care if they hated me or not. Friends, that's not the way to do it. It's OK to say, hey, this job's not working out for me anymore. Doesn't work out with my lifestyle. It's not the hours I want to work. It's not the field I want to work in. It's OK to say this is not the right fit for me anymore. It's OK to say that. It's OK to say, hey, I really loved this spiritual community when I was a new believer. I love the way that you taught the Bible. It was really important in my discipleship. But now I have young kids and you don't have a Sunday school and I really just need to find somewhere else or I, you know, I'm a single and y'all don't care about us. You don't have to say it that way. I've been doing a lot of Facebook posts and some Instagram stuff that I'm sure is",
"start": 1321.6,
"end": 1398.08,
"segmentIndex": 16
}
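
One plausible way such ≈300–500-token blocks could be assembled from the short segments is sketched below. The word-count token estimate and the grouping rule are assumptions for illustration, not the platform's actual pipeline.

```python
def build_context_blocks(segments, target_tokens=400):
    """Group short transcript segments into larger context blocks of roughly
    target_tokens tokens, carrying the combined start/end timestamps along."""
    blocks, current, count = [], [], 0
    for seg in segments:
        current.append(seg)
        count += len(seg["text"].split())  # crude token estimate: one word ~ one token
        if count >= target_tokens:
            blocks.append({
                "text": " ".join(s["text"] for s in current),
                "start": current[0]["start"],
                "end": current[-1]["end"],
            })
            current, count = [], 0
    if current:  # flush the trailing partial block
        blocks.append({
            "text": " ".join(s["text"] for s in current),
            "start": current[0]["start"],
            "end": current[-1]["end"],
        })
    return blocks

# Five synthetic 150-word segments -> one 450-word block plus one 300-word block.
segs = [{"text": " ".join(["word"] * 150), "start": float(i), "end": float(i + 1)}
        for i in range(5)]
blocks = build_context_blocks(segs)
print(len(blocks))  # 2
```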

3. Video Metadata

AI-generated semantic data about each video — well suited to semantic search and recommendations.

{
"contentId": "2276203",
"title": "Navigating Love and Relationships: Insights from Ask Jackie",
"summary": "In this engaging session, Jackie addresses various relationship dynamics...",
"keywords": ["relationships", "dating advice", "communication", "attachment styles", "single parents"]
}

4. Video Section Summaries

Section-level descriptions to enable semantic navigation within long videos.

{
"contentId": "2276062",
"sectionIndex": 10,
"start": 2076.28,
"end": 2270.58,
"summary": "The speaker emphasizes the importance of supporting individuals as they grow...",
"keywords": ["mindset work", "timelines", "vision", "goal setting", "actionable steps"]
}
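
Because each section summary carries start/end times, the collection can serve as a table of contents: given a playback timestamp, look up the section that contains it. A minimal sketch (the sectionIndex-9 neighbour is hypothetical, added only to show the lookup choosing between sections):

```python
def section_at(sections, t):
    """Return the section whose [start, end) interval contains timestamp t, else None."""
    for section in sections:
        if section["start"] <= t < section["end"]:
            return section
    return None

# Payloads follow the video-section-summaries schema (summaries shortened here).
sections = [
    {"sectionIndex": 9, "start": 1800.0, "end": 2076.28, "summary": "..."},
    {"sectionIndex": 10, "start": 2076.28, "end": 2270.58,
     "summary": "The speaker emphasizes the importance of supporting individuals..."},
]

hit = section_at(sections, 2100.0)
print(hit["sectionIndex"])  # 10
```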

5. Video Concept Summaries

Concept-level extractions — distilled ideas from video sections.

{
"contentId": "2261900",
"concept": "Attracting Participants",
"summary": "The focus on creating compelling video and sales scripts underscores the necessity of attracting participants for challenges...",
"keywords": ["attraction", "participants", "challenges"],
"conceptIndex": 3
}

Purpose

These collections collectively enable:

  • Semantic search and content retrieval
  • AI tutoring and reasoning about course materials
  • Cross-video knowledge linking
  • Future extensions, such as question answering and personalized learning guidance

Each video is fully indexed, segmented, and summarized — allowing the AI to navigate knowledge at multiple levels of granularity.

Public API Status

Currently, the Vector Database is not directly exposed via public API endpoints.
The only available endpoint (/api/chatembed/:group_id) initializes or retrieves AI chat sessions that leverage these vectorized insights.
Additional endpoints for semantic querying and search are planned for future releases.
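
As a hedged sketch of addressing that endpoint: only the /api/chatembed/:group_id path pattern comes from this document; the base URL, HTTP method, authentication, and response shape are unspecified here and would need to be confirmed against the platform's API documentation.

```python
from urllib.parse import quote

def chat_embed_url(base_url: str, group_id: str) -> str:
    """Build the /api/chatembed/:group_id URL. base_url is a placeholder host;
    method, auth headers, and response shape are not documented here."""
    return f"{base_url.rstrip('/')}/api/chatembed/{quote(str(group_id), safe='')}"

print(chat_embed_url("https://example.com/", "2276203"))
# https://example.com/api/chatembed/2276203
```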