Optimizely CMS 12 & Gemini: Automated Blog Post Generation

Scaling high-quality, technically accurate content is one of the most significant challenges facing platform teams. Maintaining consistency, ensuring SEO compliance, and keeping the content pipeline full demands substantial manual effort from domain experts.

For the CmsIv project, our goal was to eliminate this content bottleneck entirely. We engineered a solution leveraging Optimizely CMS 12’s robust automation capabilities and the structured output functionality of the Google Gemini API to implement a 'Set and Forget' automated blogging pipeline.

This post details the architecture required to turn simple content ideas into fully published, SEO-optimized BlogDetailPage instances without human intervention.

The Solution: The CmsIv Prompt Queue Architecture

To ensure editors could easily add content ideas without triggering immediate, synchronous API calls (which could lead to editor timeouts), we decoupled the prompt input from the generation process using a simple, queue-based content model.

Decoupling with the Prompt Container

We defined a basic PromptIdeaBlock, allowing editors to provide a detailed technical request. These blocks are placed within a ContentArea on a designated administrative page (the "Generator Queue Page").

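// CmsIv.Model/Blocks/PromptIdeaBlock.cs
using System.ComponentModel.DataAnnotations;
using EPiServer.Core;
using EPiServer.DataAnnotations;
using EPiServer.Web;

[ContentType(DisplayName = "Prompt Idea", GUID = "153B83F3-E893-4E80-9A82-C93AC3CF2022",
    Description = "A single idea for the Gemini AI to process.")]
public class PromptIdeaBlock : BlockData
{
    // The free-text request an editor wants turned into a blog post.
    [CultureSpecific]
    [Display(Name = "Technical Prompt Idea")]
    [UIHint(UIHint.Textarea)]
    public virtual string PromptText { get; set; }

    // Hidden from editors; set by the scheduled job after a successful publish.
    // No initializer: Optimizely stores the value in the property collection,
    // and an unset bool reads as false.
    [ScaffoldColumn(false)]
    public virtual bool IsProcessed { get; set; }
}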

The IsProcessed flag is set only after a post has been published successfully. A prompt that fails during the API call therefore stays in the queue and is retried on the next run, while a prompt that has already produced a post is never processed twice, which keeps the job idempotent for completed work.

Defining the Structured Output for Gemini

The greatest challenge in AI content generation is reliability. We cannot afford for the AI to return unstructured text that we then have to parse via regex. We utilize the Google.GenAI SDK's ability to enforce a specific JSON output schema.

We defined a target C# class that perfectly maps to our Optimizely BlogDetailPage properties (specifically for SEO fields).

// CmsIv.Web/Services/Gemini/GeminiPostResult.cs
using System.Collections.Generic;

public class GeminiPostResult
{
    public string SeoTitle { get; set; }
    public string SeoDescription { get; set; }
    // Optional in the schema, so default to an empty list rather than null.
    public List<string> Keywords { get; set; } = new List<string>();
    // Must be pure HTML, because it is mapped to an XhtmlString.
    public string PostContent { get; set; }
}

This C# class is then converted into the required JSON Schema used in the API call:

/* Defined implicitly or explicitly when calling the Gemini API */
{
  "type": "object",
  "properties": {
    "SeoTitle": { "type": "string", "description": "A title optimized for search engines." },
    "SeoDescription": { "type": "string", "description": "A meta description, 150-160 characters." },
    "Keywords": { "type": "array", "items": { "type": "string" } },
    "PostContent": { "type": "string", "description": "The full blog post content, formatted entirely in HTML (using h2, p, ul, pre tags)." }
  },
  "required": ["SeoTitle", "SeoDescription", "PostContent"]
}
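The schema constrains what the model is asked to produce, but a defensive check on the C# side is cheap. The following is a minimal sketch of how the "required" list might be enforced after deserialization. GeminiPostResultParser and its Require helper are hypothetical names introduced for illustration, not part of the CmsIv code base, and the result class is repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Repeated from above so the sketch compiles on its own.
public class GeminiPostResult
{
    public string SeoTitle { get; set; } = "";
    public string SeoDescription { get; set; } = "";
    public List<string> Keywords { get; set; } = new List<string>();
    public string PostContent { get; set; } = "";
}

// Hypothetical helper (an assumption, not part of the original project).
public static class GeminiPostResultParser
{
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    // Deserializes the response and rejects it if a required field is missing or empty.
    public static GeminiPostResult Parse(string json)
    {
        var result = JsonSerializer.Deserialize<GeminiPostResult>(json, Options)
            ?? throw new InvalidOperationException("Response JSON was null.");

        Require(result.SeoTitle, nameof(result.SeoTitle));
        Require(result.SeoDescription, nameof(result.SeoDescription));
        Require(result.PostContent, nameof(result.PostContent));

        // Keywords is optional; normalize an explicit JSON null to an empty list.
        result.Keywords ??= new List<string>();
        return result;
    }

    private static void Require(string value, string fieldName)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Required field '{fieldName}' is missing or empty.");
        }
    }
}
```

Failing fast here means a malformed response is logged and retried on the next run instead of being published as a page with an empty title or body.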

The Engine: Implementing the BlogPostGeneratorJob

The heavy lifting is performed by a standard Optimizely Scheduled Job, set to run every four hours.

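// CmsIv.Web/ScheduledJobs/BlogPostGeneratorJob.cs
// (the namespace of the Gemini SDK in use must be imported as well)
using System;
using System.Linq;
using System.Threading.Tasks;
using EPiServer;
using EPiServer.Core;
using EPiServer.PlugIn;
using EPiServer.Scheduler;
using Microsoft.Extensions.Logging;
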
[ScheduledPlugIn(DisplayName = "Blog Post Generator", 
    Description = "Processes queued prompts via Google Gemini and publishes new blog posts.")]
public class BlogPostGeneratorJob : ScheduledJobBase
{
    private readonly IContentRepository _contentRepository;
    private readonly IContentLoader _contentLoader;
    private readonly ILogger<BlogPostGeneratorJob> _logger;
    private readonly GenerativeModel _geminiModel;

    private const int GeneratorQueuePageId = 123; // ID of the page holding the ContentArea

    public BlogPostGeneratorJob(
        IContentRepository contentRepository, 
        IContentLoader contentLoader,
        ILogger<BlogPostGeneratorJob> logger)
    {
        _contentRepository = contentRepository;
        _contentLoader = contentLoader;
        _logger = logger;
        
        // Initialize Gemini model (API Key typically loaded from Configuration/Secrets)
        _geminiModel = new GenerativeModel(
            model: "gemini-2.5-pro", 
            apiKey: Environment.GetEnvironmentVariable("GEMINI_API_KEY"));
    }

    public override string Execute()
    {
        var blocks = GetUnprocessedPromptBlocks();
        if (!blocks.Any())
        {
            return "No new prompts found in the queue.";
        }

        int publishedCount = 0;
        foreach (var block in blocks)
        {
            try
            {
                var result = Task.Run(() => GenerateAndPublishPost(block)).Result;
                if (result)
                {
                    MarkBlockAsProcessed(block);
                    publishedCount++;
                }
            }
            catch (Exception ex)
            {
                // Use a message template so the prompt is captured as a structured log field.
                _logger.LogError(ex, "Failed to process prompt: {PromptText}", block.PromptText);
            }
        }

        return $"Job finished. Successfully published {publishedCount} new blog posts.";
    }
}
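Execute() relies on two helpers, GetUnprocessedPromptBlocks() and MarkBlockAsProcessed(), that are not listed in this post. A minimal sketch of what they might look like follows. It makes several assumptions: the ContentArea property on the queue page is named "Prompts", the queued items are shared blocks (so each item has a ContentLink), and the helpers live in a separate hypothetical PromptQueue class so the example stands on its own. In the job itself they could equally be private methods.

```csharp
using System.Collections.Generic;
using EPiServer;
using EPiServer.Core;
using EPiServer.DataAccess;
using EPiServer.Security;

// Hypothetical helper class; names and the "Prompts" property are assumptions.
public class PromptQueue
{
    private readonly IContentLoader _contentLoader;
    private readonly IContentRepository _contentRepository;

    public PromptQueue(IContentLoader contentLoader, IContentRepository contentRepository)
    {
        _contentLoader = contentLoader;
        _contentRepository = contentRepository;
    }

    // Returns every queued block that has prompt text and has not been processed yet.
    public List<PromptIdeaBlock> GetUnprocessedPromptBlocks(
        ContentReference queuePageLink, string areaName = "Prompts")
    {
        var unprocessed = new List<PromptIdeaBlock>();
        var page = _contentLoader.Get<PageData>(queuePageLink);
        var area = page.Property[areaName]?.Value as ContentArea;
        if (area?.Items == null)
        {
            return unprocessed;
        }

        foreach (var item in area.Items)
        {
            // TryGet skips items that are missing or are not PromptIdeaBlock instances.
            if (_contentLoader.TryGet<PromptIdeaBlock>(item.ContentLink, out var block)
                && !block.IsProcessed
                && !string.IsNullOrWhiteSpace(block.PromptText))
            {
                unprocessed.Add(block);
            }
        }

        return unprocessed;
    }

    // Loaded content is read-only, so flip the flag on a writable clone and publish it.
    public void MarkBlockAsProcessed(PromptIdeaBlock block)
    {
        var writable = (PromptIdeaBlock)block.CreateWritableClone();
        writable.IsProcessed = true;

        // Shared block instances implement IContent at runtime. AccessLevel.NoAccess
        // skips the access check, since the scheduler runs without a logged-in user.
        _contentRepository.Save((IContent)writable, SaveAction.Publish, AccessLevel.NoAccess);
    }
}
```

Because the flag is written only after a successful publish, a block whose generation threw an exception is picked up again on the next run.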

Integrating the Google.GenAI SDK

The core logic involves constructing the prompt, including detailed instructions on the desired technical quality and the strict requirement for structured JSON output.

// Helper method within BlogPostGeneratorJob
private async Task<GeminiPostResult> GeneratePostContent(string userPrompt)
{
    var systemInstruction = "You are an expert .NET 8 technical blog writer for Optimizely CMS. Your response MUST be a single JSON object conforming to the required schema. Ensure the PostContent is well-formatted, complete HTML.";

    var config = new GenerateContentConfig
    {
        SystemInstruction = systemInstruction,
        ResponseMimeType = "application/json",
        ResponseSchema = Schema.FromType<GeminiPostResult>() // Enforce the structure
    };

    var response = await _geminiModel.GenerateContentAsync(
        new List<Content> { new UserContent(userPrompt) },
        config);

    if (string.IsNullOrWhiteSpace(response.Text))
    {
        throw new InvalidOperationException("Gemini returned empty content.");
    }

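    // Deserialize the JSON payload. A literal "null" payload yields null, so guard against it.
    return JsonSerializer.Deserialize<GeminiPostResult>(response.Text,
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true })
        ?? throw new InvalidOperationException("Gemini returned JSON that could not be mapped to GeminiPostResult.");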
}

Optimizely Content Creation and Publishing

Once we have the reliably structured GeminiPostResult, mapping it to the new BlogDetailPage and publishing is straightforward using the IContentRepository.

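// Helper method within BlogPostGeneratorJob
// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
//           using EPiServer.DataAccess; using EPiServer.Security;
private bool GenerateAndPublishPost(PromptIdeaBlock block)
{
    // ScheduledJobBase.Execute() is synchronous, so block on the async Gemini call here.
    var postData = GeneratePostContent(block.PromptText).GetAwaiter().GetResult();
    var parentLink = new ContentReference(124); // Blog Landing Page reference

    // 1. Create a new, unsaved page instance with default values under the parent
    var newPost = _contentRepository.GetDefault<BlogDetailPage>(parentLink);

    // Set mandatory Optimizely properties
    newPost.Name = postData.SeoTitle;
    // Build a URL-safe slug: lowercase, with every run of other characters collapsed to one hyphen
    newPost.URLSegment = Regex.Replace(postData.SeoTitle.ToLowerInvariant(), "[^a-z0-9]+", "-").Trim('-');

    // 2. Map AI results to Optimizely properties
    newPost.SeoTitle = postData.SeoTitle;
    newPost.SeoDescription = postData.SeoDescription;
    // Keywords is optional in the schema, so it may be null
    newPost.SeoKeywords = string.Join(", ", postData.Keywords ?? new List<string>());
    newPost.MainBody = new XhtmlString(postData.PostContent);
    newPost.Author = "CmsIv Automation"; // Set a standard author

    // 3. Save and publish in one call. A job started by the scheduler runs without an
    //    authenticated user, so the access check is skipped explicitly.
    _contentRepository.Save(newPost, SaveAction.Publish, AccessLevel.NoAccess);
    _logger.LogInformation("Successfully published new post: {PostName}", newPost.Name);

    return true;
}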

Troubleshooting & Best Practices

Cause: API Rate Limiting or Timeouts

If the job queue is large (e.g., 50 prompts) and each Gemini call takes 20-30 seconds, a single run lasts roughly 17 to 25 minutes. A run that long is more likely to hit API rate limits, time out, or fail partway through, potentially leaving partial results.

Solution: Batching and Asynchronous Execution Handling

ScheduledJobBase.Execute() is synchronous, so the job has to block on the asynchronous Gemini call (the code above does this with Task.Run(...).Result in Execute() and GetAwaiter().GetResult() in the helper). Blocking does not make the run any shorter, so the critical fix is a maximum batch size (e.g., process only 5 prompts per execution), relying on the scheduled frequency to handle the remainder.
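As a rough sketch of the batch-size limit, the hypothetical helper below takes the first few unprocessed items from a queue and leaves the rest for a later run. PromptBatcher, TakeBatch, and the default of 5 are illustrative assumptions, and the helper is generic so it can be tried without a CMS instance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (an assumption, not part of the original project).
public static class PromptBatcher
{
    public const int DefaultMaxBatchSize = 5;

    // Returns at most maxBatchSize unprocessed items, preserving queue order.
    public static IReadOnlyList<T> TakeBatch<T>(
        IEnumerable<T> queue,
        Func<T, bool> isProcessed,
        int maxBatchSize = DefaultMaxBatchSize)
    {
        if (maxBatchSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxBatchSize));
        }

        return queue
            .Where(item => !isProcessed(item))
            .Take(maxBatchSize)
            .ToList();
    }
}
```

In Execute(), this could be applied as PromptBatcher.TakeBatch(blocks, b => b.IsProcessed) before the foreach loop. With 5 prompts per run at 20-30 seconds each, a run takes about 100 to 150 seconds, and a backlog of 50 prompts drains in ten runs, which is roughly a day and a half on the four-hour schedule.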

Cause: Invalid JSON Output

While structured output schemas greatly reduce errors, complex or lengthy requests sometimes lead the AI to add introductory text outside the JSON block (e.g., "Here is the content:\n{...}").

Solution: Robust Pre-Processing

Before attempting JsonSerializer.Deserialize, add validation and cleanup steps to isolate the pure JSON string. This often involves stripping markdown code fences (```json) and discarding everything before the first brace { and after the last brace }.
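One way to implement that cleanup is sketched below. JsonResponseCleaner is a hypothetical helper name. It relies on the simple observation that introductory text and code fences sit outside the outermost braces, so it would misfire if the surrounding chatter itself contained a brace; parsing the candidate with JsonDocument catches that case.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper (an assumption, not part of the original project).
public static class JsonResponseCleaner
{
    // Returns the substring from the first '{' to the last '}' and verifies it parses as JSON.
    public static string ExtractJsonObject(string raw)
    {
        if (string.IsNullOrWhiteSpace(raw))
        {
            throw new ArgumentException("Response was empty.", nameof(raw));
        }

        int start = raw.IndexOf('{');
        int end = raw.LastIndexOf('}');
        if (start < 0 || end <= start)
        {
            throw new FormatException("No JSON object found in response.");
        }

        string candidate = raw.Substring(start, end - start + 1);

        // Throws JsonException if the isolated text is still not valid JSON.
        using (JsonDocument.Parse(candidate))
        {
            return candidate;
        }
    }
}
```

The cleaned string can then be passed to JsonSerializer.Deserialize in place of response.Text.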

Conclusion: The Power of 'Set and Forget' Automation

By leveraging Optimizely Scheduled Jobs and Google Gemini's ability to enforce structured data, CmsIv has achieved a zero-effort content pipeline. Editors simply add high-level ideas to a Content Area, and the system handles the technical creation, SEO optimization, property mapping, and immediate publishing.

This automated approach yields a consistent 5x increase in daily publishing capacity while guaranteeing that every piece of content meets minimum technical and SEO requirements upon creation, freeing up valuable developer time for core framework improvements.