The Gospel Pipeline: How We Built a Catholic Catechism Empire, One Crash at a Time

A group of monks working at wooden desks in a dimly lit room, using laptops and surrounded by candles and bookshelves.

Chapter One: The Audacity of the Vision

Every great project starts with someone saying something that sounds completely reasonable until you actually do the math.

The vision was elegant: take the Catechism of the Catholic Church — the 2,865-paragraph doctrinal spine of 1.3 billion Catholics — and break it into 1,095 TikTok-ready videos. Three videos a day. Every day. For a year. Morning, Midday, and Evening. Like a liturgy of short-form content.

1,095 videos.

Let that sit for a second.

That’s more episodes than The Simpsons ran before it stopped being good. That’s more tracks than the original iPod could hold. That’s — and this is the number that made the first cup of coffee go cold — 1,095 individual videos, each needing a custom AI-generated background image, a voiceover narrated by a synthetic voice trained on actual human speech, and a final composited video assembled frame by frame.

The content plan lived in an Excel file. 1,095 rows. Each one a post, with a topic, a hook, teaching points, hashtags, and an image generation prompt. Everything a content machine needs.

Steve Jobs stood on a stage in 2001 and said “1,000 songs in your pocket” like it was a miracle. We were building 1,095 sermons in a Python script.

A portable device with a display showing '1,095 videos in your pocket,' surrounded by a black rosary.

Chapter Two: The Spreadsheet That Started It All

Opening that spreadsheet for the first time was like being handed the Dead Sea Scrolls and being told to post them on TikTok by Friday.

The columns told the story: Part, Section, Topic, Hook, Teaching Points, Image Prompt. Thirteen thematic categories. Saints. History. Theology. Application. Liturgy. Scripture. And then — nested inside each — sections and chapters. Medieval Saints. Early Church. Heresies. The Councils. All neatly organized by someone who clearly had a plan.

All we had to do was execute it.

Famous last words.

An open vintage book with ornate illustrations and calligraphy, displaying a financial ledger with headings such as 'Nomen', 'Genus', 'Summa', 'Status', and 'Datum'. The pages are illuminated by candlelight and accompanied by a quill and inkwell.

Chapter Three: The First Run — And the Words That Meant Nothing

The first image generator was Fal.ai, running a model called Flux. Fast, capable, reasonably priced. We fired it up, pointed it at the spreadsheet, and let it rip.

The images came back. They were beautiful. Dramatic lighting. Rich Catholic aesthetics. Cinematic depth.

And some of them had words on them.

Not real words. Not Latin. Not scripture. Words like:

“Colrcant.”

“Thee iney Fot fi the tiok.”

If Allen Iverson could famously ask “We talkin’ about practice?” with barely concealed disbelief, we were staring at our screens asking “We talkin’ about words?”

These weren’t words. These were what happened when an AI model decided that having text in an image was a good idea, but couldn’t actually spell anything. The Flux model was doing what every overconfident rookie does: improvising in a situation that called for discipline.

We pulled the footage. We scrapped the images. We called an audible.


Close-up of an engraved stone with ancient texts illuminated by a torch, set against a rocky background.

Chapter Four: Enter Gemini

Google’s Gemini image model was the trade deadline acquisition. Newer. Smarter. Better at following instructions. The pitch was simple: Gemini could generate images and understand nuanced text instructions. It would know not to put words on things.

We built the new pipeline. Swapped out Fal.ai. Added the Gemini API key. Wrote the prompt engineering. Ran a test.

Post 7 came back.

The background was gorgeous. Books arranged on a wooden table, candlelight, dramatic chiaroscuro lighting straight out of Caravaggio. Beautiful.

Except at the bottom of the image, in large block letters, were the words:

EMPTY SPACE

Gemini had read the prompt — which said “space at bottom third for bold white text overlay” — and decided the most helpful thing it could do was label that space.

It literally wrote “EMPTY SPACE” on the image.

This was the AI equivalent of your GPS announcing “You are now driving on a road.”

We fixed the prompt. We reran the image. We moved on.

But then came Post 6. bg_0006.png. A beautiful open Bible, rays of light, golden edges. And at the bottom:

CATHOLIC BIBLE

Gemini had decided that people looking at a Bible might not know it was a Bible, and helpfully labeled it.

We were dealing with an AI that had the energy of an overeager intern who puts their name on everything.


An angel holding a scroll with the text 'THIS IS A SCROLL' on it. The background features a dramatic sky with rays of light and distant mountains.

Chapter Five: The Bible Problem

After we fixed the labeling, a new pattern emerged.

Every. Single. Image. Had a Bible in it.

Topics about angels? Bible. Topics about the Trinity? Bible. Topics about the Transfiguration? Bible. Topics about the history of Church councils in the 4th century? Bible.

We’d written a prompt suffix that said: “Any open book pages, parchment, scrolls, or manuscript surfaces visible in the scene should display beautiful authentic medieval Latin illuminated script.”

Gemini had interpreted this as: “Add a Bible to every scene so the medieval text instruction has somewhere to go.”

It was the AI equivalent of the 2001 New England Patriots running a prevent defense on every single play, no matter the situation. Technically following the game plan. Completely wrong in execution.

We rewrote the instruction to be explicit: “Do NOT add books, Bibles, or manuscripts to the scene unless the original prompt specifically calls for them.”
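
The guardrail itself was just a suffix bolted onto every prompt from the spreadsheet. A minimal sketch of that pattern, with illustrative names and wording rather than the project's exact strings:

NO_EXTRA_PROPS = (
    "Do NOT add books, Bibles, or manuscripts to the scene "
    "unless the original prompt specifically calls for them. "
    "Do NOT render any words, letters, or labels in the image."
)

def build_image_prompt(base_prompt: str) -> str:
    # Every row's image prompt gets the same guardrail appended
    return f"{base_prompt.strip()} {NO_EXTRA_PROPS}"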

The Bibles disappeared. The images became what they were supposed to be: dramatic, atmospheric, cinematic.


Interior view of a cathedral with gothic arches, stained glass windows, and a softly lit altar at the far end.

Chapter Six: The Voice That Said “Equal Sign”

While all the image drama was playing out, the audio pipeline had its own quiet scandal running in the background.

The ElevenLabs voice — a warm, authoritative narrator — was reading every post perfectly. Tone. Pacing. Reverence. Exactly right.

Except in the posts where the content included mathematical notation.

Some of the catechism teaching points used the = symbol. And ElevenLabs, reading the raw text, would dutifully announce:

“God is love. Love equals love. The formula for grace: grace equal sign charity equal sign eternal life.”

Equal sign.

Not “equals.” Not “is.” Equal sign. Like a TI-83 calculator reading scripture aloud.

The fix was three lines of Python:

def tts_clean(text):
    # Spell out "=" (with padding spaces) so TTS says "equals", not "equal sign"
    return text.replace("=", " equals ")

Three lines. But finding those three lines required someone actually listening to the audio closely enough to catch it. There were 53 posts with = in them, spread across the full 1,095. All 53 had been generated. All 53 said “equal sign.”

We deleted all 53 audio files, regenerated them, and quietly moved on like nothing happened.


An elderly monk sitting at a wooden table, focused on a graphing calculator, with an open illuminated manuscript beside him, candlelight illuminating the scene and bookshelves in the background.

Chapter Seven: The Great Run — And The Mac That Wanted to Sleep

The pipeline was fixed. The prompts were dialed. The voice was clean. The images were beautiful.

Time to run all 1,095.

We fired the script. It started churning through posts at roughly one every 20 to 30 seconds. The math was straightforward: even at the optimistic end of that rate, the full run would take approximately six hours.

Six hours is a long time to keep a computer awake.

The first attempt: the user walked away. The Mac went to sleep. The script — which had been silently humming through post 380-something — went dark.

We restarted.

The second attempt: caffeinate -i. A macOS command that tells the system not to sleep due to inactivity. Problem solved, right?
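
In practice, that attempt amounted to wrapping the run in caffeinate, something like this sketch (the script name is hypothetical):

import subprocess

# caffeinate holds a power assertion for as long as the wrapped command runs;
# -i blocks idle sleep only (generate_posts.py is an illustrative name)
subprocess.run(["caffeinate", "-i", "python", "generate_posts.py"])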

Wrong.

caffeinate -i only prevents idle sleep. It cannot stop a Mac from sleeping when you close the lid.

We explained this. The user understood. They would just leave the lid open.

Then came the revelation, delivered with the calm of someone who had been holding it in:

“I’m on a Mac mini. It’s plugged in. No battery.”

A Mac mini. A desktop. A machine that does not have a lid. A machine that has never slept involuntarily in its life.

We’d spent twenty minutes engineering a solution to a problem that did not exist for this hardware.

This was our “We talkin’ about practice?” moment. We had been preparing for the wrong game.

caffeinate -dims on a plugged-in Mac mini is, functionally, just a very polite way of telling the computer something it already knows.


A minimalist white Apple device illuminated by a warm light, placed on a sleek table in a dark environment.

Chapter Eight: The Quota Wall

We were 456 posts deep. Somewhere around Day 153 of the content calendar. The images were flowing. The pipeline was humming.

And then the errors started.

429 RESOURCE_EXHAUSTED: Your project has exceeded its spending cap.

Gemini’s spending cap. We’d hit it.

We raised the cap. Restarted. Got to 696 posts.

Then:

429 RESOURCE_EXHAUSTED: You exceeded your current quota.
Limit: 1000 requests per day per model.
Please retry in 22h7m3s.

The daily quota. One thousand images per day, and we’d burned through them.
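
There is no clever engineering answer to a daily quota, only backoff for the transient 429s and a graceful stop for the terminal one. A sketch of that handling, where generate_image() is a hypothetical wrapper around the Gemini call:

import time

def generate_with_backoff(prompt, max_attempts=4):
    for attempt in range(max_attempts):
        try:
            return generate_image(prompt)  # hypothetical Gemini wrapper
        except Exception as exc:
            if "RESOURCE_EXHAUSTED" not in str(exc):
                raise  # a real failure, not a quota problem
            time.sleep(30 * 2 ** attempt)  # 30s, 60s, 120s, 240s
    # backoff never beats the daily cap; stop and resume after the reset
    raise RuntimeError("Gemini quota exhausted; resume after the reset")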

This was the AI equivalent of Shaquille O’Neal fouling out in the fourth quarter of the Finals. The most dominant force on the floor, completely unavailable, for reasons entirely within the rules.

We couldn’t fight it. We couldn’t pay our way through it. We had to wait.

The quota reset 22 hours later. We picked up from post 696. We finished what Gemini could do.

696 posts out of 1,095 had beautiful, text-free, Bible-free, coherent images.

399 posts were still waiting.


An hourglass with sand flowing between two chambers, marked with the number 429, placed on a stone pedestal in a dimly lit cathedral setting.

Chapter Nine: The Return of Fal.ai

With Gemini’s quota exhausted, we made a decision: switch to Fal.ai’s Flux Pro model for the remainder.

Flux Pro was the upgrade from the original Flux model that had given us the garbled text back in Chapter Three. This was the premium tier. Properly trained. Better image quality. Higher resolution.

But the original Fal.ai code had been completely replaced by Gemini code weeks earlier. Nothing remained. We were coding from memory.

We installed fal-client. We wrote a new fetch_background_fal() function. We set IMAGE_PROVIDER=fal in the .env file. We ran a test.
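
The rebuilt function was, in spirit, something like this sketch. It assumes the fal-client package (which reads a FAL_KEY environment variable) and a Flux Pro model id; the exact id string on Fal.ai may differ:

import requests
import fal_client

def fetch_background_fal(prompt: str, out_path: str) -> None:
    # Submit the prompt to Flux Pro and block until the image is ready
    result = fal_client.subscribe(
        "fal-ai/flux-pro",  # model id; exact version string may differ
        arguments={
            "prompt": prompt,
            "image_size": {"width": 1080, "height": 1920},  # TikTok portrait
        },
    )
    # Download the hosted result into the local backgrounds cache
    image_url = result["images"][0]["url"]
    with open(out_path, "wb") as f:
        f.write(requests.get(image_url, timeout=60).content)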

Post 694 came back.

The user looked at it.

“694 looks nice.”

That’s it. No notes. No revisions. 694 looks nice.

After weeks of garbled text, invisible Bibles, empty space labels, and quota walls — 694 looks nice was the most beautiful sentence in the English language.

We fired off the remaining 400 posts.


An armored knight on horseback holds a flag with the number '694', riding through an archway in a medieval setting with stone structures and autumn foliage. Monks and townsfolk watch in the background.

Chapter Ten: The Balance

Fal.ai, it turns out, runs on a credit system.

You load credits. You spend credits. When credits run out, you get:

User is locked. Reason: Exhausted balance.

We hit this. Twice.

The first time, mid-run at post 456. We topped up. Restarted.

The second time, mid-run at post 654, right as we were doing the final regeneration pass. We topped up. Restarted.

Each restart meant carefully identifying which posts had failed, which had succeeded, and picking up exactly where the money ran out. The error logs were our replay footage. We’d scrub through them like analysts watching game tape, identifying exactly which play broke down and on which possession.
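
The resume pass itself was unglamorous, roughly this sketch (the output path is illustrative, though the bg_NNNN.png naming appears earlier in this story):

from pathlib import Path

# Posts whose background image already exists on disk are done;
# everything else goes back into the queue after the top-up
done = {p.stem for p in Path("output/backgrounds").glob("bg_*.png")}
todo = [n for n in range(1, 1096) if f"bg_{n:04d}" not in done]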

It was the spiritual equivalent of Brett Favre’s consecutive starts streak — except instead of a Hall of Fame quarterback playing through injuries, it was a Python script playing through billing interruptions.


An elderly man in a hooded cloak examines a stack of golden coins on a wooden table, with a candle, ink jar, and an abacus nearby. A book with the title 'Fal.AI Credits' is also on the table.

Chapter Eleven: The Part Two Problem

At post 1,095, with every video generated and assembled, the user did something no one had done until that moment: they looked at the titles.

Posts 654 through 1,095 — all 442 of them — had (Part 2) in their titles.

“Why does it say Part 2? If there is no Part 1, why is it there?”

We dug into the spreadsheet. The data told the story: posts 492 through 653 were the original Saints, History, and Theology content. Posts 654 through 1,095 were the same topics, revisited with deeper analysis — the second pass. The “Part 2” made perfect sense in the original content plan.

What didn’t make sense: the Part 1 posts (492-653) had never been labeled “(Part 1)”. So a viewer who encountered post 786 — “St John of the Cross: The Dark Night of the Soul (Part 2)” — would have no idea what Part 1 was or where to find it.

Decision: strip (Part 2) from all 442 titles.

This meant:

  • Updating 442 rows in the Excel file
  • Deleting 442 composited images (the title is baked into the image)
  • Deleting 442 videos
  • Regenerating 442 images with clean titles
  • Reassembling 442 videos
  • Syncing all 442 updated videos back to the flat output folder

The Python script to strip the titles ran in 0.3 seconds.
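
It was barely a script at all. Something like this, with pandas (the file and column names are illustrative):

import pandas as pd

df = pd.read_excel("content_plan.xlsx")  # illustrative file/column names
# Drop a trailing "(Part 2)" from every title; Part 1 was never labeled
df["Topic"] = df["Topic"].str.replace(r"\s*\(Part 2\)\s*$", "", regex=True)
df.to_excel("content_plan.xlsx", index=False)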

The regeneration took two hours.

And then Fal.ai ran out of credits again at post 654.

We switched to Gemini for the 47 posts that needed new background images (654-700). Gemini handled them. We switched back.

This was, as the kids say, “just vibes.” Except the vibes were API failover logic and .env file edits.


A monk writing in a large illuminated manuscript, with an angel hovering above, pointing. The scene is set in a candlelit library filled with books and scrolls.

Chapter Twelve: The Sync That Wouldn’t Sync

With all 1,095 individual posts complete, attention turned to the long-form compilation videos. The idea: concatenate all the posts within each Part into one continuous video. A two-and-a-half-hour “Part One” video. A four-and-a-half-hour “Saints” video. The complete works, navigable by theme.

We built compile_videos.py. We tested it on the Prologue — 9 posts, about 9 minutes. It worked perfectly.

Then the user reported: as the compiled video played forward, the audio started arriving slightly before the corresponding visual. By post 5 it was noticeable. By post 9 it was jarring. The voice was announcing the next topic while the previous image was still on screen.

And it got worse as it went.

First instinct: add a one-second black spacer between clips. Give everything room to breathe.

“The spacer is making it worse.”

Of course it was. Adding more boundaries gave the sync drift more opportunities to compound.

The actual problem: each individual video file starts its timestamps at zero. When FFmpeg concatenates them with -c copy (stream copy — just joining the raw bitstreams without re-encoding), tiny mismatches between audio duration and video duration at each clip boundary accumulate. Clip 1 might have its audio run 12 milliseconds longer than its video. Clip 2 adds another 8 milliseconds. By clip 50, you’re half a second off. By clip 100, it’s a full second.

This is the Y2K bug of video editing. A small, seemingly insignificant mismatch — milliseconds, not hours — that compounds relentlessly until the system breaks.

The fix: re-encode. Don’t stream copy. Let FFmpeg rebuild every frame with proper continuous timestamps. For normal video, this takes forever. For our videos — which are literally still images with audio playing over them — re-encoding is nearly instantaneous per clip, because there’s no motion for the encoder to calculate.
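
In FFmpeg terms the change is small. A sketch of the re-encoding concat, where clips.txt lists each post video in concat-demuxer format and the audio bitrate is illustrative:

import subprocess

# Concatenate via the concat demuxer, but re-encode instead of stream-copying
# so every frame gets fresh, continuous timestamps
subprocess.run([
    "ffmpeg", "-f", "concat", "-safe", "0", "-i", "clips.txt",
    "-c:v", "libx264", "-preset", "veryfast", "-r", "30",
    "-pix_fmt", "yuv420p",
    "-c:a", "aac", "-b:a", "192k",
    "part_one.mp4",
])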

We updated compile_videos.py. Removed the -c copy. Added proper re-encoding flags. The Prologue compiled clean. Tight. Perfect sync start to finish.


A large clock with golden Roman numerals set against a gothic cathedral backdrop, illuminated by rays of sunlight streaming through stained glass windows.

Chapter Thirteen: Post 422

Somewhere in the middle of all of this, post 422 revealed itself to be corrupt.

Not wrong. Not badly generated. Corrupt. The file existed. It had a size. But it wouldn’t play.
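
Catching this class of failure is cheap. A sketch of the health check, using ffprobe:

import subprocess

def is_playable(path: str) -> bool:
    # ffprobe exits non-zero (and complains on stderr) when it cannot
    # parse the file; a clean file produces neither
    proc = subprocess.run(["ffprobe", "-v", "error", path],
                          capture_output=True)
    return proc.returncode == 0 and not proc.stderr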

This happens. Files get corrupted during writes. It’s the digital equivalent of the pages of a book getting wet — the information was there, but now it isn’t. We deleted the file, deleted the cached image, and regenerated.

Fal.ai was out of credits.

We switched to Gemini for one post.

Post 422: “How to Answer: Why Do You Worship Mary?”

It came back clean. 818 kilobytes. Plays perfectly. Back in the flat folder.

The user never knew it was gone.

That’s the job.


A person lighting a candle in a dimly lit stone interior, with architectural details and shadows in the background.

Chapter Fourteen: All 1,095

DONE 1095 succeeded | 0 failed | 0 fully cached

There it was.

The number at the bottom of the log that the whole project had been building toward since the first garbled “Colrcant” appeared on that early test image.

1,095 videos.

1,095 images — generated by three different AI models across multiple rate-limit resets, billing exhaustions, and quota walls.

1,095 voiceovers — narrated by a synthetic voice that now knows not to say “equal sign.”

1,095 videos — assembled by FFmpeg at exactly 1080×1920, 30fps, H.264, the universal language of TikTok.

All organized. All named. All synced. All clean.

The output/videos/ folder sat there with 1,095 flat files, each named post_0001_video.mp4 through post_1095_video.mp4, like a library that had taken months to build and now simply existed, quietly, waiting to be used.


A hooded figure stands in a grand library filled with shelves of glowing gold artifacts, illuminated by warm light.

Epilogue: What This Actually Was

This wasn’t just a Python project.

It was an experiment in what AI can do when you chain enough of them together with enough patience, enough error handling, and enough willingness to restart from post 305 at 1:22 in the afternoon because the background task timed out again.

Three image generators. One voice synthesizer. One video assembler. One spreadsheet. One orchestration script that grew from 200 lines to 650 lines over the course of the project, accreting fixes, fallbacks, and footnotes like a medieval manuscript being copied by generations of monks.

And Claude Code — which held all of it in context, remembered which posts had failed, knew which API was exhausted, and wrote the next patch while the current one was still running.

The Catholic Catechism has been transmitted for two thousand years by monks, teachers, parents, and priests — hand to hand, generation to generation, one conversation at a time.

Now it’s in 1,095 MP4 files on a Mac mini in someone’s home, ready to be posted three times a day for a year.

Some things change. Some things don’t.

The faith endures. The pipeline runs.


A mystical scene depicting two hands reaching toward a glowing scroll titled 'Livius CCC TikTok Content Plan v3.xlsx,' set in a candlelit library or study with stone walls.

Built with Claude Code, Python, ElevenLabs, Google Gemini, Fal.ai Flux Pro, FFmpeg, and an unreasonable amount of caffeinate commands.

