\
When building a custom markdown renderer, one of the most frustrating issues is code blocks that break HTML rendering. Your code suddenly loses all line breaks and gets wrapped in the wrong HTML tags.
\ Here's what happens:
Before (Broken):
<pre class="language-python"> <code>import requestsimport pandas as pdimport xml.etree.ElementTree as ET</code> </pre> <p class="text-gray-700"> <code>root = ET.fromstring(login_response.text)</code> </p>
\ After (Fixed):
<pre class="language-python"> <code>import requests import pandas as pd import xml.etree.ElementTree as ET root = ET.fromstring(login_response.text)</code> </pre>
The problem occurs in the markdown processing order. Here's the sequence that breaks everything:
// ❌ WRONG: This breaks code blocks function markdownToHtml(markdown: string): string { // Step 1: Replace code blocks with placeholders html = html.replace(codeBlockRegex, (match, language, code) => { const placeholder = `__CODE_BLOCK_${index}__` return placeholder }) // Step 2: Restore code blocks immediately codeBlockPlaceholders.forEach((placeholder, index) => { const codeBlockHtml = `<pre><code>${code}</code></pre>` html = html.replace(placeholder, codeBlockHtml) }) // Step 3: Process paragraphs (THIS BREAKS EVERYTHING!) html = html.replace(/^(.*)$/gim, '<p class="text-gray-700">$1</p>') }
The Problem: After restoring code blocks, the paragraph processing runs and wraps everything in <p> tags, including your already-rendered code blocks.
// ❌ THIS LINE IS THE CULPRIT html = html.replace(/^(.*)$/gim, '<p class="text-gray-700">$1</p>')
This regex pattern matches every line and wraps it in <p> tags, even lines that are already inside <pre> tags.
The fix is to protect code blocks from paragraph processing using a double-layer placeholder system.
// ✅ RIGHT: Detect code blocks and replace with placeholders const codeBlockRegex = /```([a-zA-Z0-9_-]*)[\s]*\n([\s\S]*?)\n```/g const codeBlockPlaceholders = [] html = html.replace(codeBlockRegex, (match, language, code) => { const placeholder = `__CODE_BLOCK_${codeBlockPlaceholders.length}__` codeBlockPlaceholders.push({ match, language, code }) return placeholder })
// ✅ RIGHT: Convert code block placeholders to protected placeholders const protectedPlaceholders = [] html = html.replace(/(__CODE_BLOCK_\d+__)/g, (match) => { const placeholder = `__PROTECTED_${protectedPlaceholders.length}__` protectedPlaceholders.push({ placeholder, content: match }) return placeholder })
// ✅ RIGHT: Now process paragraphs - protected placeholders are safe html = html .replace(/\n\n/g, '</p><p class="text-gray-700">') .replace(/\n/g, '<br />') .replace(/^(.*)$/gim, '<p class="text-gray-700">$1</p>')
// ✅ RIGHT: Restore protected placeholders first protectedPlaceholders.forEach(({ placeholder, content }) => { html = html.replace(placeholder, content) }) // ✅ RIGHT: Finally restore code blocks AFTER all processing codeBlockPlaceholders.forEach((placeholder, index) => { const { language, code } = placeholder const codeBlockHtml = `<pre class="language-${language}"><code>${code}</code></pre>` const placeholderText = `__CODE_BLOCK_${index}__` html = html.replace(placeholderText, codeBlockHtml) })
Even with the correct processing order, you need proper CSS to preserve line breaks:
/* ✅ RIGHT: This preserves line breaks */ .prose-pre { white-space: pre !important; /* Critical for line breaks */ word-wrap: normal !important; overflow-x: auto !important; } .prose-pre code { white-space: pre !important; /* Also critical */ background: transparent !important; font-family: monospace !important; }
__CODE_BLOCK_X__ → __PROTECTED_X__ → Final HTMLwhite-space: pre ensures line breaks are respected// Wrong: Code blocks get processed before paragraphs codeBlockPlaceholders.forEach(/* restore code blocks */) html = html.replace(/^(.*)$/gim, '<p>$1</p>') // This breaks code blocks!
// Wrong: Optional newline can cause parsing issues const codeBlockRegex = /```(\w+)?\n?([\s\S]*?)```/g // Right: Force newline after language const codeBlockRegex = /```([a-zA-Z0-9_-]*)[\s]*\n([\s\S]*?)\n```/g
/* Wrong: Missing whitespace preservation */ .prose pre { @apply bg-gray-900; } /* Right: Explicit whitespace preservation */ .prose-pre { white-space: pre !important; }
After implementing the solution, test with complex code blocks:
def test_function(): # This should have proper line breaks result = some_complex_operation( param1="value1", param2="value2" ) return result
Check the generated HTML - it should look like:
<pre class="language-python"> <code>def test_function(): # This should have proper line breaks result = some_complex_operation( param1="value1", param2="value2" ) return result</code> </pre>
The key to fixing markdown code block line break issues is:
white-space: pre is essentialThis approach ensures that code blocks are completely isolated from paragraph processing and maintain their formatting integrity.


