AI Search / Digital Asset Management
Asset Manager
Semantic and keyword search across video, audio, images and documents — with AI descriptions, transcripts and relevance scoring built in.
The challenge
Teams accumulate large libraries of mixed media — video, audio, images and documents — but keyword-only search misses assets whose relevant content lives in the pixels or the spoken audio rather than the filename. Finding "the clip where the product tour opens on the dashboard" is nearly impossible when nothing was ever tagged or transcribed. The goal was a single library where any asset is discoverable by meaning, not just by exact-match metadata.
Our approach
The system pairs traditional keyword search with semantic search over AI-generated multimodal embeddings, then layers automated enrichment on top: every ingested asset is described by an LLM, and audio and video are also transcribed so that their spoken content becomes searchable text. A Svelte 5 front end presents a unified library with filtering by type, format, tag and metadata, relevance scoring, and a detail view for previewing and editing each asset.
How it works
Ingestion via S3 and EventBridge
Assets land in S3; EventBridge events trigger the pipeline, with a batch ingestion service for bulk loads. Go Lambda microservices handle each stage independently.
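A minimal sketch of the first pipeline stage, assuming the standard "Object Created" event that S3 publishes to EventBridge. It is written in Python for brevity, although the services described here are Go Lambdas. The names `IngestJob` and `job_from_event`, and the extension table, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

# Assumed mapping from file extension to media type (illustrative, not exhaustive).
MEDIA_TYPES = {
    ".mp4": "video", ".mov": "video",
    ".mp3": "audio", ".wav": "audio",
    ".jpg": "image", ".png": "image",
    ".pdf": "document", ".docx": "document",
}


@dataclass(frozen=True)
class IngestJob:
    bucket: str
    key: str
    media_type: str
    needs_transcript: bool  # True for audio and video only


def job_from_event(event: dict) -> Optional[IngestJob]:
    """Turn an EventBridge S3 event into an ingest job, or None if it should be skipped."""
    if event.get("source") != "aws.s3" or event.get("detail-type") != "Object Created":
        return None
    detail = event["detail"]
    key = detail["object"]["key"]
    media_type = MEDIA_TYPES.get(PurePosixPath(key).suffix.lower())
    if media_type is None:
        return None  # unsupported file type: ignore rather than fail the pipeline
    return IngestJob(
        bucket=detail["bucket"]["name"],
        key=key,
        media_type=media_type,
        needs_transcript=media_type in {"audio", "video"},
    )
```

Returning `None` for events the pipeline does not care about keeps each stage independent: a later stage only ever receives jobs it knows how to handle.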
AI enrichment
An asset processor calls AWS Bedrock to generate a description and a multimodal embedding for each asset. Audio and video are sent to Amazon Transcribe, and a transcript processor turns the output into searchable text.
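The transcript step can be sketched as follows, assuming the JSON layout that Amazon Transcribe writes for a completed job (`results.items` with `pronunciation` and `punctuation` entries). This is an illustrative Python version, not the project's Go code; `transcript_segments` is a hypothetical name, and splitting on sentence-ending punctuation is an assumed strategy.

```python
from __future__ import annotations

SENTENCE_END = {".", "?", "!"}


def transcript_segments(transcribe_output: dict) -> list[dict]:
    """Group Transcribe word items into sentence-level segments with timestamps."""
    segments: list[dict] = []
    words: list[str] = []
    start = end = None

    def flush() -> None:
        nonlocal words, start, end
        if words:
            segments.append({"start": start, "end": end, "text": " ".join(words)})
        words, start, end = [], None, None

    for item in transcribe_output["results"]["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "pronunciation":
            if start is None:
                start = float(item["start_time"])
            end = float(item["end_time"])
            words.append(content)
        else:
            # Punctuation items carry no timestamps; attach them to the previous word.
            if words:
                words[-1] += content
            if content in SENTENCE_END:
                flush()
    flush()  # keep any trailing words that were not closed by punctuation
    return segments
```

Keeping the start and end time on each segment means a search hit can point at the moment in the audio or video where the matching words were spoken, not just at the asset as a whole.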
Indexing
Metadata and enrichment results are stored in DynamoDB, while vector embeddings are written to an OpenSearch vector index to power semantic retrieval.
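One way to picture the split between the two stores, as a hedged Python sketch: the enriched record is divided into a metadata item (destined for DynamoDB) and a vector document (destined for OpenSearch). The field names, the function names and the embedding dimension are assumptions. The index body uses the `knn_vector` field type from the OpenSearch k-NN plugin.

```python
def knn_index_body(dimension: int) -> dict:
    """Index settings and mapping for an OpenSearch k-NN vector index."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "asset_id": {"type": "keyword"},
                "embedding": {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def split_record(asset: dict, dimension: int) -> tuple:
    """Split an enriched asset into (metadata_item, vector_doc)."""
    embedding = asset["embedding"]
    if len(embedding) != dimension:
        # A mismatched vector would be rejected by the index, so fail early.
        raise ValueError(f"expected {dimension} dimensions, got {len(embedding)}")
    # Everything except the vector goes to the metadata store.
    metadata_item = {k: v for k, v in asset.items() if k != "embedding"}
    # The vector store keeps only the id and the embedding; the id joins the two.
    vector_doc = {"asset_id": asset["asset_id"], "embedding": embedding}
    return metadata_item, vector_doc
```

Sharing `asset_id` across both stores lets a semantic query return ids from the vector index and then fetch descriptions, transcripts and metadata from the metadata store.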
Hybrid search
A dedicated search API serves both keyword and semantic (embedding) queries with relevance scoring, so users can toggle between literal matching and meaning-based results.
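The difference between the two modes can be illustrated with a small in-memory Python sketch. It assumes cosine similarity as the semantic score and term overlap as the keyword score; the source does not specify either formula, and a deployed system would delegate the semantic branch to the OpenSearch vector index rather than scan assets in a loop. All names are hypothetical.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear literally in the text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def search(assets, mode, query="", query_vec=None, limit=10):
    """Return (asset_id, score) pairs, best first, for the chosen mode."""
    if mode == "keyword":
        scored = [(a["asset_id"], keyword_score(query, a["text"])) for a in assets]
    elif mode == "semantic":
        scored = [(a["asset_id"], cosine(query_vec, a["embedding"])) for a in assets]
    else:
        raise ValueError(f"unknown mode: {mode}")
    hits = [hit for hit in scored if hit[1] > 0]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:limit]
```

In this sketch a keyword query for "dashboard" only matches assets whose text contains that word, while a semantic query ranks every asset by how close its embedding is to the query embedding, which is what lets untagged assets surface.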
Browse, filter and refine
The Svelte 5 + Vite client lets users filter by type, format, tags and metadata, and browse an asset grid of video, audio, image and document tiles.
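The filter semantics can be sketched independently of the UI framework. The Python below is illustrative only (the client itself is Svelte), and it assumes that filters combine with AND and that every selected tag must be present; the asset field names are hypothetical.

```python
def matches(asset, *, media_type=None, fmt=None, tags=(), metadata=None) -> bool:
    """True if the asset satisfies every filter that was supplied."""
    if media_type and asset["type"] != media_type:
        return False
    if fmt and asset["format"] != fmt:
        return False
    # Every requested tag must be on the asset (subset test).
    if not set(tags) <= set(asset.get("tags", [])):
        return False
    asset_meta = asset.get("metadata", {})
    return all(asset_meta.get(k) == v for k, v in (metadata or {}).items())


def filter_assets(assets, **filters):
    """Keep only the assets that pass all supplied filters, preserving order."""
    return [a for a in assets if matches(a, **filters)]
```

Calling `filter_assets(assets)` with no filters returns the full grid, so the same function covers both the unfiltered library view and any combination of active filters.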
Detail and editing
Selecting an asset opens a detail view with a preview, the editable AI-generated description, the transcript for audio and video, full metadata, and a relevance indicator. JWT authentication protects access.
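For readers unfamiliar with JWT checks, here is a minimal Python sketch using only the standard library. It assumes HS256 (a shared-secret HMAC signature) and an `exp` claim; the source does not state which algorithm or claims the system uses, and a production service would normally rely on a maintained JWT library instead of hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # JWTs strip base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Create an HS256-signed token (used here only to exercise verify)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    mac = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    return f"{header}.{payload}.{_b64url(mac.digest())}"


def verify(token: str, secret: bytes, now=None) -> dict:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    # A token without an exp claim is treated as expired in this sketch.
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```

The signature is checked before the payload is trusted, so tampered or expired tokens are rejected before any asset data is returned.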
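Tech stack
Svelte 5 · Vite · Go · AWS Lambda · Amazon S3 · Amazon EventBridge · AWS Bedrock · Amazon Transcribe · DynamoDB · OpenSearch · JWT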
Results
The system turns an untagged, mixed-media library into one that is searchable by meaning: assets are automatically described, audio and video are transcribed, and a single query can surface results by keyword or by semantic similarity across every media type. Enrichment happens automatically on ingestion, so the library stays searchable without manual tagging.
Assets indexed
Avg. search latency
Transcription hours processed
Semantic-search precision (vs. keyword baseline)
Ingestion throughput / hour
Manual tagging time saved
Metrics to be populated with the project owner’s real figures.


