Operational deep-dive · Published 2026-05-07 · By Vertical Visuals · Czech version: Jak VV používá Gemini + Sparkle
How Vertical Visuals uses Gemini 2.5 + Sparkle to ship faster
How VV's AI-orchestrated pipeline delivers 78 broadcast TV episodes per year for King's Resort + Nova Sport on a 6-person team. Plus the personal-brand retainers for B2B founders (MyValueOfficer reference) that the same stack powers. The architectural center: a REVIEW GATE that puts the video on YouTube as a private draft FIRST, then has a human approve everything else. Built with Claude Code.
The headline
Vertical Visuals runs a proprietary macOS app called VVS (Vertical Visuals Studio) that orchestrates a multi-stage AI-driven pipeline for every video the agency ships. The pipeline is what makes the 2-episodes-per-week King's Resort + Nova Sport cadence sustainable on a 6-person team — and what powers the personal-brand retainers (MyValueOfficer reference) where the same stack delivers 12 videos/month per client.
The architectural center is a REVIEW GATE. Most "AI video pipeline" descriptions skip this part. Here's what it does: VVS uploads the editor-finalized video to YouTube as a private draft FIRST. The file exists, the team can preview it, but no metadata, no title, no thumbnail are public yet. Only AFTER a human reviews the AI-suggested titles, descriptions, captions, and thumbnail does the pipeline push approved metadata to the already-uploaded private draft and (optionally) flip privacy to public.
Why this matters: it's a safety net. The AI can suggest a wrong title in Czech that means something embarrassing in English. The translation pipeline can mangle a poker hand reference. Filip catches it before anything goes public, not after.
Two Gemini models do the heavy lifting: Gemini 2.5 Pro runs three brand-specific video analyzers (long-form, highlights, shorts), and Gemini Flash handles metadata translation (CZ → EN, CZ → DE) and AI A/B title generation (4-6 candidates per video). Built with Claude Code. Distributed to the team via Sparkle delta updates (1.4-1.6 MB patches instead of the 1.1 GB full bundle), so iteration cycles stay fast.
The actual pipeline architecture
The orchestrator (pipeline.py) defines the real flow across five phases plus a hard human-approval gate:
Phase 1 — Drive download
drive_download.py pulls the editor-finalized master from Google Drive (per-client folder layout). Authenticated via Google OAuth. Files are big (50-min TV episodes, multi-GB) — the parallelism kicks in at Phase 2 to amortize the wait.
Phase 2 (parallel) — Private-draft upload + compress
Two operations run at the same time:
- Phase 2a: private YouTube draft upload. youtube_upload.py uploads the original master to YouTube as a private draft. The video file now exists on YouTube, but only the team can see it. No metadata, no title, no thumbnail are public yet. This is the first half of the REVIEW GATE pattern.
- Phase 2b: compress for Gemini analysis. prepare_video.sh + avconvert compresses the master so Gemini can ingest it (the original is too large for Gemini's input limits).
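The two-way parallelism above could look something like the following minimal sketch. The two worker functions are hypothetical stand-ins for youtube_upload.py and prepare_video.sh, not the real VVS code, and the thread pool is an assumption about how the overlap might be achieved.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_private_draft(master_path: str) -> str:
    """Hypothetical stand-in for youtube_upload.py; returns a draft video ID."""
    return f"draft-for-{master_path}"


def compress_for_analysis(master_path: str) -> str:
    """Hypothetical stand-in for prepare_video.sh; returns the compressed path."""
    return master_path + ".compressed.mp4"


def run_phase_2(master_path: str) -> tuple[str, str]:
    # Both tasks only read the master file, so they can overlap safely.
    with ThreadPoolExecutor(max_workers=2) as pool:
        upload = pool.submit(upload_private_draft, master_path)
        compress = pool.submit(compress_for_analysis, master_path)
        # .result() re-raises any worker exception, so a failed upload or
        # compress stops the run instead of passing silently.
        return upload.result(), compress.result()
```

Threads are enough here because both tasks spend their time waiting on the network or on a child process rather than on Python bytecode.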
Phase 3 — Gemini 2.5 Pro analysis
The compressed video goes to Gemini 2.5 Pro with one of three brand-specific analyzer prompts:
- analyze_video.py: long-form (KR Brunato Talks, MVO long-form). Asks for chapter breakpoints + tone metadata + paragraph summary.
- analyze_highlight.py: KR Poker Highlights. Asks for big-hand timestamps + key moments + per-tournament context.
- analyze_shorts.py: shorts. Different prompt schema, optimized for 30-60s social-cut suggestions.
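Choosing one of the three analyzers is a simple dispatch. The sketch below assumes a content-type label per video; the labels ("long_form", "highlights", "shorts") are hypothetical, only the script names come from the pipeline.

```python
# Hypothetical content-type labels mapped to the three analyzer scripts.
ANALYZERS = {
    "long_form": "analyze_video.py",
    "highlights": "analyze_highlight.py",
    "shorts": "analyze_shorts.py",
}


def pick_analyzer(content_type: str) -> str:
    """Return the analyzer script for a content type, failing loudly if unknown."""
    try:
        return ANALYZERS[content_type]
    except KeyError:
        raise ValueError(f"no analyzer for content type {content_type!r}") from None
```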
Phase 4 — Metadata generation
generate_metadata.py takes the Gemini analysis output and builds the full metadata package:
- Titles — 4-6 candidates via AI A/B (Gemini Flash), using brand voice + recent high-performing patterns from the same channel.
- Descriptions with chapter timestamps + per-channel CTAs.
- Captions via generate_captions.py.
- Per-language metadata translations via translate_metadata.py: Gemini Flash translates the full metadata package CZ → EN → DE for the multi-channel mutations, with brand-aware prompts (poker terminology preservation for KR, finance terminology for MVO).
- Optional Notion push via send_to_notion.py: pushes the draft metadata package into the client's Notion workspace for collaborative review (KR uses this for the broadcast partner's editorial team).
REVIEW GATE — human approval
This is the architectural center. The pipeline halts. Filip (or the assigned editor) opens VVS, sees the AI-suggested metadata package alongside the private-draft preview on YouTube, and:
- Picks the primary title from the 4-6 AI candidates in click-order (alternates stored for later A/B substitution)
- Reviews + adjusts descriptions, chapters, captions
- Reviews + approves per-language metadata translations
- Confirms thumbnail
- Approves
Until approval lands, the video sits as a private YouTube draft. Nothing is public. The agency can preview, the client can preview, the editor can iterate — but the world doesn't see anything wrong.
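The gate can be modelled as a small piece of state that Phase 5 refuses to run without. This is a minimal sketch, not the VVS implementation: the class and function names are hypothetical, and it assumes the first click is the primary title and later clicks are the stored alternates.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewState:
    title_candidates: list[str]
    picked_titles: list[str] = field(default_factory=list)  # kept in click-order
    approved: bool = False

    def pick(self, title: str) -> None:
        if title not in self.title_candidates:
            raise ValueError("title is not one of the AI candidates")
        if title not in self.picked_titles:
            self.picked_titles.append(title)

    def approve(self) -> None:
        if not self.picked_titles:
            raise ValueError("pick a primary title before approving")
        self.approved = True

    @property
    def primary_title(self) -> str:
        return self.picked_titles[0]

    @property
    def alternates(self) -> list[str]:
        # Assumption: every pick after the first is kept for A/B substitution.
        return self.picked_titles[1:]


def push_metadata(state: ReviewState) -> str:
    """Stand-in for Phase 5: returns the title it would push."""
    if not state.approved:
        raise PermissionError("REVIEW GATE: human approval required before Phase 5")
    return state.primary_title
```

Putting the check inside the push function, rather than in the UI, means no code path can reach Phase 5 without an explicit approval.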
Phase 5 — Push approved metadata + thumbnail
youtube_upload.py (now in update mode) pushes the approved title + description + tags + thumbnail to the already-uploaded private draft, across the right OAuth-bound channel(s):
- @KingsResort (English) — KR YouTube
- @kingsresortde (German) — KR YouTube DE
- Czech Nova Sport feed — broadcast-spec delivery
- MyValueOfficer Czech
- HejTech Slovak
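For the YouTube channels, the Phase 5 update maps naturally onto the YouTube Data API v3 videos.update method. The sketch below only builds the request body; the function name is hypothetical, and the assumption is that VVS uses something equivalent. With google-api-python-client the body would be sent as youtube.videos().update(part="snippet,status", body=body).execute(), and the thumbnail separately via youtube.thumbnails().set(...).

```python
def build_update_body(video_id, title, description, tags, category_id, make_public=False):
    """Build a videos.update body for an already-uploaded private draft."""
    return {
        "id": video_id,
        "snippet": {
            "title": title,
            "description": description,
            "tags": list(tags),
            # The API requires title and categoryId whenever the snippet part
            # is updated, so the category is passed through explicitly.
            "categoryId": category_id,
        },
        # Privacy stays private unless the reviewer chose to publish.
        "status": {"privacyStatus": "public" if make_public else "private"},
    }
```

Sending snippet and status in one call is what lets the metadata and the privacy change land together.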
OAuth uses ephemeral local ports (port=0 since v1.0.10 to eliminate the port-conflict bug class) + Keychain-bound token storage. If the editor chose to publish, privacy flips from private to public on the same call. Per-channel upload errors fall back to the editor with clear retry buttons (no silent failures).
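The port=0 trick works because binding to port 0 asks the operating system for any free port. The sketch below shows the mechanism with the standard library; the function name is hypothetical. In google-auth-oauthlib the same effect comes from InstalledAppFlow.run_local_server(port=0), though whether VVS uses that library is an assumption.

```python
import socket


def reserve_ephemeral_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a free local port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() reports the port the OS actually assigned.
        return sock.getsockname()[1]
```

Because the OS picks the port, two sign-in attempts (or a leftover listener from a crashed run) cannot collide on a hard-coded number.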
Why Sparkle for distribution
Sparkle is the standard macOS auto-update framework for sideloaded apps (the same one used by Bear, Things, Tot, Reeder, and most indie-shipped Mac apps). VVS ships via Sparkle for four reasons:
- Private OAuth client secrets. The app embeds OAuth client secrets the App Store would require special review for. Sideloading sidesteps this.
- Delta updates. 1.4-1.6 MB patch per release vs the 1.1 GB full bundle. Verified live on the v1.0.14 ship: 5 deltas covering builds 10-14, each ~1.5 MB. A teammate on v1.0.9 gets a single-MB Sparkle patch even when the underlying app changes substantially.
- Fast release cadence. Versions 1.0.6 to 1.0.15 shipped in 14 days. The release.sh script handles bundling, code signing (Developer ID), delta generation, GitHub Releases upload, appcast.xml update, and a hard-fail audit on missing deltas.
- GitHub Releases as binary CDN. Free CDN with global edge cache, version history, signed downloads, and per-asset analytics. The custom appcast.xml lives in a separate vertical-visuals-studio-updates repo so the production release surface is independent of the source repo.
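A hard-fail audit on missing deltas can be approximated by parsing the appcast and checking the sparkle:deltas entries of the newest item. This is a sketch of the idea, not the contents of release.sh: the function names are hypothetical, and it assumes the newest release is the first item in the feed.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Sparkle appcast elements and attributes.
SPARKLE_NS = "http://www.andymatuschak.org/xml-namespaces/sparkle"


def missing_deltas(appcast_xml: str, expected_from: list[str]) -> list[str]:
    """Return the source builds that have no delta enclosure in the newest item."""
    root = ET.fromstring(appcast_xml)
    item = root.find("./channel/item")  # assumption: newest item is listed first
    if item is None:
        raise ValueError("appcast has no <item>")
    found = {
        enclosure.get(f"{{{SPARKLE_NS}}}deltaFrom")
        for enclosure in item.findall(f"./{{{SPARKLE_NS}}}deltas/enclosure")
    }
    return [build for build in expected_from if build not in found]


def audit(appcast_xml: str, expected_from: list[str]) -> None:
    """Abort the release if any expected delta is absent from the appcast."""
    missing = missing_deltas(appcast_xml, expected_from)
    if missing:
        raise SystemExit(f"missing deltas from builds: {missing}")
```

Failing the release script here is cheaper than discovering later that a teammate fell back to the 1.1 GB full download.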
Telemetry: opt-in error reporting from the team's Macs
Shipped in v1.0.13. The team's Macs phone home pipeline errors, native crashes, OAuth sign-in failures, and setup-wizard fatals so they show up in a dashboard instead of waiting for Slack. Privacy guardrails are aggressive:
- 13-pass redactor strips OAuth tokens, API keys, JWTs, client secrets, emails, Google Drive IDs, YouTube video IDs, and /Users/<name>/ paths before transmission.
- NDJSON offline queue at ~/Library/Application Support/VerticalVisualsStudio/telemetry/queue.ndjson drains every 5 minutes + on launch + after each new record.
- Dedicated Convex deployment outstanding-malamute-460, separate from the production tracker. X-VVS-Telemetry-Key shared-secret auth + 100 events/install/day rate limit + SHA-256 signature hashing for group-by queries.
- Settings → Diagnostics tab with an opt-in toggle (default ON for the team, OFF for any external installs), a "Send test report" button, and an install-ID copy field for support triage.
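The redact-then-hash idea can be sketched in a few lines. The three patterns below are illustrative assumptions covering only a subset of what the 13-pass redactor handles, and the signature inputs (error type plus redacted message) are likewise a guess at what gets hashed.

```python
import hashlib
import re

# Illustrative passes only; the real redactor runs 13 of them.
REDACTION_PASSES = [
    (re.compile(r"Bearer\s+[A-Za-z0-9._\-]+"), "Bearer [REDACTED]"),
    (re.compile(r"[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    (re.compile(r"/Users/[^/\s]+/"), "/Users/[USER]/"),
]


def redact(text: str) -> str:
    """Apply every pass in order so nothing sensitive leaves the machine."""
    for pattern, replacement in REDACTION_PASSES:
        text = pattern.sub(replacement, text)
    return text


def signature(error_type: str, redacted_message: str) -> str:
    """SHA-256 over the redacted content, usable as a group-by key."""
    payload = f"{error_type}\n{redacted_message}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Hashing after redaction means the same error from two different Macs yields the same signature, which is what makes grouping in a dashboard possible.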
The open VV Tracker (separate but adjacent)
The VV Tracker at verticalvisuals.cz/track is the production-tracking surface that lives alongside VVS. Built in Next.js + Convex with a public REST API:
- 16+ route handlers covering tasks, projects, team, content calendar, audit log with undo, backups, and soft-delete across most tables.
- Per-key scoping via the apiKeys table. Generate a key in the admin UI, attach it as Authorization: Bearer vv_live_* on every request.
- Mutation log audit on every write: every create/update/delete is logged with actor, timestamp, before-state, and after-state for the undo flow.
- Used internally by Vertical Visuals AND offered as the engagement-mix surface for clients (e.g. /track/mvo for MyValueOfficer founder visibility into pipeline state).
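Attaching the key to a request looks like this with the Python standard library. The base URL and route in the usage note are placeholders, since the actual route paths are not listed here; only the Authorization header format and the vv_live_ prefix come from the description above.

```python
import urllib.request


def build_tracker_request(base_url: str, path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for the tracker API."""
    if not api_key.startswith("vv_live_"):
        raise ValueError("expected a key with the vv_live_ prefix")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

A call such as build_tracker_request("https://tracker.example.invalid", "/api/tasks", key) returns a request object that urllib.request.urlopen can send; both the host and the /api/tasks route are hypothetical.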
Concrete time savings per episode
Per-phase subprocess timeouts (with kill switch VVS_PHASE_TIMEOUTS_DISABLED=1): DOWNLOAD 30 min, COMPRESS 90 min, UPLOAD 60 min, ANALYZE 30 min.
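One way to wire up these timeouts and the kill switch, sketched with subprocess.run. The timeout values and the environment variable name come from the line above; the function names and the overall structure are assumptions.

```python
import os
import subprocess

# Per-phase limits in minutes, as listed above.
PHASE_TIMEOUTS_MIN = {"DOWNLOAD": 30, "COMPRESS": 90, "UPLOAD": 60, "ANALYZE": 30}


def phase_timeout_seconds(phase, env=None):
    """Return the timeout for a phase, or None when the kill switch is set."""
    env = os.environ if env is None else env
    if env.get("VVS_PHASE_TIMEOUTS_DISABLED") == "1":
        return None  # subprocess.run treats None as "no timeout"
    return PHASE_TIMEOUTS_MIN[phase] * 60


def run_phase(phase, argv, env=None):
    # On timeout, subprocess.run kills the child process and raises
    # subprocess.TimeoutExpired; check=True turns a non-zero exit into
    # CalledProcessError.
    return subprocess.run(argv, check=True, timeout=phase_timeout_seconds(phase, env))
```

The kill switch is useful for unusually long masters where a 90-minute compress limit would otherwise abort a healthy run.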
Real numbers for a 50-minute King's Show episode:
- Download ~5 min from Google Drive
- Compress ~15 min on M2 Pro
- Gemini 2.5 Pro analysis ~3 min
- Title generation ~30 sec
- Caption translation ~2 min (CZ → EN, CZ → DE)
- Multi-channel upload ~8 min (3 channels in parallel)
- Total wall clock: 30-40 min, mostly background
- Editor review (gate-and-approve): 5-10 min interactive
Pre-pipeline manual workflow took 45-60 minutes per episode just on transcript + translation + upload coordination — work that's now background-automated. At 78 episodes/year that's roughly 50-65 hours of editor time per year reclaimed just on the King's Resort + Nova Sport contract.
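One reading that reproduces the 50-65 hour range is to subtract the remaining 5-10 minutes of interactive review from the old 45-60 minute manual workflow. That netting-out, and the pairing of low with low and high with high, are assumptions rather than a stated method.

```python
EPISODES_PER_YEAR = 78


def hours_reclaimed(manual_min, review_min, episodes=EPISODES_PER_YEAR):
    """Editor hours saved per year, net of the review time that remains."""
    return (manual_min - review_min) * episodes / 60


low = hours_reclaimed(45, 5)    # (45 - 5) * 78 / 60 = 52.0 hours
high = hours_reclaimed(60, 10)  # (60 - 10) * 78 / 60 = 65.0 hours
```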
Why this matters for the Czech video agency market
Most Czech agencies use manual upload + manual translation services. The Gemini 2.5 Pro pricing curve makes per-video AI analysis economical (~$0.50/video vs $50 of editor time), but operationalizing the pipeline still requires a custom tool — there's no off-the-shelf equivalent of VVS for the Czech bilingual broadcast pattern.
Vertical Visuals open-sources the production tracker to make the process portion replicable for smaller agencies. The AI-pipeline app (VVS) stays proprietary because it embeds client-specific OAuth tokens and brand prompt templates, but the architecture is documented here for any team thinking about building their own.
Frequently asked questions
What is the VVS macOS app?
VVS (Vertical Visuals Studio) is a proprietary macOS production-pipeline app built with Claude Code and shipped by Vertical Visuals. It orchestrates a multi-phase pipeline per video: Phase 1 download from Google Drive, Phase 2 (parallel) private-YouTube-draft upload + compress for analysis, Phase 3 Gemini 2.5 Pro video analysis (one of three brand-specific analyzers), Phase 4 metadata generation + AI A/B title generation + 3-language metadata translation + optional Notion push, then a hard REVIEW GATE for human approval, then Phase 5 push approved metadata + thumbnail to the already-uploaded private draft (across @KingsResort EN, @kingsresortde DE, Czech Nova Sport feed, MyValueOfficer, HejTech). Distributed via Sparkle to the 6-person team. Currently on version 1.0.15 (May 2026).
How does Gemini 2.5 Pro analyze video for the pipeline?
Three brand-specific analyzers, each with its own prompt template. analyze_video.py handles long-form content (KR Brunato Talks, MyValueOfficer long-form). analyze_highlight.py handles KR Poker Highlights — asks for big-hand timestamps + per-tournament context. analyze_shorts.py handles shorts with a 30-60s social-cut output schema. Each compressed master is uploaded to Google's Gemini 2.5 Pro multimodal endpoint with the matching analyzer prompt; the output feeds Phase 4 metadata generation + the REVIEW GATE.
What does Gemini Flash do in the pipeline?
Two jobs. (1) AI A/B title generation — generates 4-6 candidate titles per video based on the Gemini 2.5 Pro analysis output + brand voice + recent high-performing patterns from the same channel. The editor picks 1-3 in click-order at the REVIEW GATE; alternates are stored for later A/B substitution. (2) Per-language metadata translation via translate_metadata.py — translates the full metadata package (titles, descriptions, captions, chapter labels) CZ → EN → DE for the multi-channel mutations, with brand-aware prompting (poker terminology preservation for KR, finance terminology for MyValueOfficer).
What is the REVIEW GATE and why does the pipeline depend on it?
The REVIEW GATE is the architectural center of the pipeline. After Phase 4 metadata generation, the pipeline halts. The video has been uploaded to YouTube as a private draft (Phase 2a), and AI has suggested everything else — titles (4-6 candidates), descriptions, captions, per-language translations, thumbnail. A human (Filip or the assigned editor) reviews and approves before Phase 5 pushes anything public. Until approval lands, the video sits as a private YouTube draft. Nothing is publicly visible. The agency previews, the client previews, the editor iterates — but the world doesn't see anything wrong. This is the safety net most 'AI video pipeline' descriptions skip; it's why VV can ship 78 broadcast TV episodes/year without the AI ever embarrassing the brand.
Why use Sparkle for distribution instead of the Mac App Store?
Sparkle is the standard macOS auto-update framework for sideloaded apps. Vertical Visuals ships VVS via Sparkle because: (1) the app uses private OAuth client secrets the App Store would require special review for; (2) Sparkle delta updates ship a 1.4-1.6 MB patch vs the 1.1 GB full bundle, so the 6-person team gets near-instant updates even over coffee-shop Wi-Fi; (3) release cadence is fast: versions 1.0.6 to 1.0.15 shipped in 14 days; (4) GitHub Releases serves as the binary CDN, with a custom appcast.xml managed by the release.sh script.
How big is each Sparkle update for the team?
Delta updates: 1.4-1.6 MB per patch (verified live on the v1.0.14 ship — 5 deltas covering builds 10-14). Full bundle: 1.1 GB. So a teammate on v1.0.9+ pulls a single-MB Sparkle patch instead of the full bundle, even when the underlying app changes substantially. The release.sh script keeps 5 deltas in retention with a hard-fail audit if any delta is missing from the appcast.
How does the private-draft-first upload pattern work?
Phase 2a uploads the video to YouTube as a private draft BEFORE the AI generates any metadata. The file exists on YouTube — only the team can see it, and the agency can preview the actual rendered file before approving anything else. Then the REVIEW GATE happens. Then Phase 5 pushes the approved metadata + thumbnail to the already-uploaded private draft, optionally flipping privacy to public on the same call. The benefit: the agency catches AI mistakes (wrong title, mangled translation, off-brand thumbnail) before they go public. The video is never live with bad metadata; it's never live without human approval. Most pipelines do the opposite — generate everything, upload everything in one shot, hope the AI got it right. VV's pattern is slower per-step but safer overall.
What's the open VV Tracker?
VV Tracker is the open-API production tracking system at verticalvisuals.cz/track, built in Next.js + Convex. It exposes a public REST API with 16+ route handlers covering tasks, projects, team, content calendar, audit log with undo, backups, and soft-delete across most tables. Per-key scoping via the apiKeys table. Used internally by Vertical Visuals AND offered as the engagement-mix surface for clients (e.g. /track/mvo for MyValueOfficer founder visibility).
What about telemetry and error reporting from the team's Macs?
Opt-in remote error reporting was added in v1.0.13. A 13-pass redactor strips OAuth tokens, API keys, JWTs, client secrets, emails, Drive IDs, YouTube IDs, and /Users/<name>/ paths before transmission. Events queue offline at ~/Library/Application Support/VerticalVisualsStudio/telemetry/queue.ndjson and drain every 5 min + on launch. The Convex backend (dedicated deployment outstanding-malamute-460, separate from the production tracker) accepts events with X-VVS-Telemetry-Key shared-secret auth + 100 events/install/day rate limit + SHA-256 signature hashing for group-by queries. The opt-in toggle lives in Settings → Diagnostics; default ON for Filip and team, OFF for any external installs.
How long does a typical pipeline run take?
Per-phase subprocess timeouts: DOWNLOAD 30 min, COMPRESS 90 min, UPLOAD 60 min, ANALYZE 30 min. Real numbers for a 50-min King's Resort episode: Phase 1 Drive download ~5 min, Phase 2 (parallel: private-draft upload + compress) ~15 min, Phase 3 Gemini 2.5 Pro analysis ~3 min, Phase 4 metadata + title gen + 3-language translation + optional Notion push ~3 min. Total background: 25-30 min wall clock. Then REVIEW GATE: 5-10 min interactive editor review-and-approve. Then Phase 5 push approved metadata + thumbnail: ~2 min across all OAuth channels. Pre-pipeline manual workflow took 45-60 min per episode just on transcript + translation + upload coordination.
Methodology and citation
This report draws on Vertical Visuals' live production pipeline for King's Resort + Nova Sport (78 broadcast episodes/year), MyValueOfficer (personal-brand retainer), and Filip Valent's HejTech personal brand. All version numbers, file paths, per-phase timeouts, and Convex deployment names are real and current as of 2026-05-07. Pricing for Gemini 2.5 Pro and Gemini Flash is Google's public pricing as of May 2026.
When citing, please attribute as: Vertical Visuals, "How Vertical Visuals uses Gemini 2.5 + Sparkle to ship faster", https://www.verticalvisuals.cz/reports/how-vv-uses-gemini-and-sparkle-to-ship-faster.
See also: State of Czech YouTube Production 2026 · How a Prague Agency Ships 2 TV Episodes Per Week · King's Resort case study