Ticket #06cb21

completed

Problem

Build a semantic text matching engine. Ignore the raw video, extract the transcript, and look for deep semantic overlap. Upgrade to 6GB. Support multiple Whisper requests, or create a way to sequence the work rather than running it in parallel. Upgrade from whisper-tiny to whisper-base.
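One possible way to sequence transcription work instead of running it in parallel is a single-worker queue. The sketch below is only an illustration: `SequentialTranscriber` and `fake_transcribe` are hypothetical names, and the real Whisper call is replaced by a placeholder so that only the ordering behavior is shown.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class SequentialTranscriber:
    """Accepts many transcription requests but runs them one at a time."""

    def __init__(self, transcribe: Callable[[str], str]) -> None:
        # `transcribe` is a hypothetical callable wrapping the real ASR call.
        self._transcribe = transcribe
        # A single worker thread means jobs run in submission order,
        # so only one transcription holds memory at any moment.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def submit(self, audio_path: str) -> "Future[str]":
        # Returns immediately; the job waits its turn in the queue.
        return self._executor.submit(self._transcribe, audio_path)

    def shutdown(self) -> None:
        # Blocks until every queued job has finished.
        self._executor.shutdown(wait=True)


def fake_transcribe(audio_path: str) -> str:
    # Stand-in for a Whisper call (assumption); returns a placeholder string.
    return f"transcript of {audio_path}"


transcriber = SequentialTranscriber(fake_transcribe)
futures = [transcriber.submit(p) for p in ["a.wav", "b.wav", "c.wav"]]
results = [f.result() for f in futures]
transcriber.shutdown()
```

Callers can submit as many requests as they like; each one gets a future, and the jobs themselves never overlap.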

Creator

Marcus

Priority

1

Project Name

zfrika

Strategy

2-step matching workflow
=====================
1a. Audio extraction and transcription (the words).
1b. Use an ASR (automatic speech recognition) model, such as OpenAI Whisper.
1c. Process the video's audio track and convert the spoken words into a clean text string.
2a. Text embedding (the meaning).
2b. Pass the generated text string through a dedicated text embedding model to output a vector representation of the conversation. Use one of these models: Xenova/all-MiniLM-L6-v2 or Xenova/bge-small-en-v1.5. These models specialize in analyzing sentence and paragraph structure to determine precise semantic alignment.
CLIP (Contrastive Language-Image Pre-training)
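Once each transcript has an embedding vector, the matching step could compare vectors with cosine similarity. The ticket does not name a similarity metric, so cosine similarity is an assumption here, and the vectors below are made-up toy values rather than real model output. `cosine_similarity` and `best_match` are hypothetical helper names.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    query: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the candidate id with the highest cosine similarity to the query."""
    if not candidates:
        raise ValueError("no candidates to compare against")
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in candidates.items()]
    return max(scored, key=lambda pair: pair[1])


# Toy 3-dimensional vectors standing in for transcript embeddings (assumption).
query_vec = [1.0, 0.0, 0.0]
library = {
    "video_a": [0.9, 0.1, 0.0],
    "video_b": [0.0, 1.0, 0.0],
}
match_id, match_score = best_match(query_vec, library)
```

In this toy data, `video_a` points in nearly the same direction as the query, so it is the one returned.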

Next Steps

Create a duplicate environment (done). 1. Move this project to an independent website.

Ticket Information

Ticket ID: 6a30a9bb685af5392a06cb21
Date Initiated: 6/16/2026, 8:33:00 AM
Date Resolved: 6/22/2026, 3:47:00 PM
Status: completed
Created: 6/15/2026, 6:41:15 PM
Last Updated: 6/22/2026, 3:48:01 PM