About This Tool
Converting HTML back to Markdown is a frequent need when migrating content between platforms, cleaning up exported web pages, or reformatting articles for documentation systems that use Markdown. Content management systems, email templates, and web pages store content as HTML, but developers and writers prefer working with Markdown because it is cleaner, more portable, and easier to version control. Manually stripping HTML tags and reformatting content is tedious and error-prone, especially with nested lists, tables, and code blocks.
This HTML to Markdown converter handles all common HTML elements: headings (h1 through h6), paragraphs, bold and italic text, links, images, ordered and unordered lists, code blocks, inline code, blockquotes, horizontal rules, tables, and line breaks. Unknown or unsupported tags are stripped while their text content is preserved, so no text is lost even though the styling is. The output is clean, readable Markdown ready for use in GitHub, GitLab, Notion, or any Markdown-based platform.
Supported HTML Tags
The converter handles these HTML elements and maps them to their Markdown equivalents:
- <h1> to <h6>: Converted to # Heading through ###### Heading
- <p>: Converted to paragraphs separated by blank lines
- <strong> and <b>: Converted to **bold**
- <em> and <i>: Converted to *italic*
- <a href="...">: Converted to [text](url)
- <img src="..." alt="...">: Converted to ![alt](src)
- <ul>/<li>: Converted to - item
- <ol>/<li>: Converted to 1. item
- <pre><code>: Converted to fenced code blocks with triple backticks
- <code>: Converted to `inline code`
- <blockquote>: Converted to > quoted text
- <hr>: Converted to ---
- <table>: Converted to pipe-separated Markdown tables
- <br>: Converted to line breaks
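To make the mapping above concrete, here is a minimal Python sketch of a rule table for a few of the simpler tags. It is an illustration only: the names INLINE_RULES, HEADING, and convert_simple_tags are hypothetical, the patterns assume attribute-free lowercase tags (and src before alt on images), and the tool's actual rules are not documented here.

```python
import re

# Illustrative rule table (an assumption, not the converter's actual rules).
# Each entry is (compiled pattern, replacement template).
INLINE_RULES = [
    (re.compile(r"<(strong|b)>(.*?)</\1>", re.S), r"**\2**"),
    (re.compile(r"<(em|i)>(.*?)</\1>", re.S), r"*\2*"),
    (re.compile(r"<code>(.*?)</code>", re.S), r"`\1`"),
    (re.compile(r'<img\s+src="([^"]*)"\s+alt="([^"]*)"\s*/?>'), r"![\2](\1)"),
    (re.compile(r'<a\s+href="([^"]*)">(.*?)</a>', re.S), r"[\2](\1)"),
    (re.compile(r"<hr\s*/?>"), "---"),
]

# Headings need a computed prefix, so they use a function instead of a template.
HEADING = re.compile(r"<h([1-6])>(.*?)</h\1>", re.S)


def convert_simple_tags(html_text: str) -> str:
    """Apply the heading rule, then each inline rule in order."""
    html_text = HEADING.sub(
        lambda m: "#" * int(m.group(1)) + " " + m.group(2).strip(),
        html_text,
    )
    for pattern, template in INLINE_RULES:
        html_text = pattern.sub(template, html_text)
    return html_text


print(convert_simple_tags('<a href="https://example.com"><strong>docs</strong></a>'))
```

Regular expressions are enough for flat, well-formed input like this; lists, tables, and deeply nested markup generally need a real HTML parser.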
How the Conversion Process Works
The converter processes HTML in multiple passes to handle both block-level and inline elements correctly:
- Step 1 - Preserve code blocks: Content inside <pre> tags is extracted and set aside to prevent processing of its contents. HTML entities inside code are unescaped back to their original characters.
- Step 2 - Block elements: Headings, blockquotes, horizontal rules, tables, and lists are converted to their Markdown equivalents. Each block type has specific conversion rules.
- Step 3 - Inline elements: Bold, italic, links, images, inline code, and strikethrough formatting within text are converted.
- Step 4 - Cleanup: Remaining HTML tags are stripped, HTML entities are unescaped, and excessive blank lines are condensed.
This multi-pass approach ensures nested formatting (like bold text inside a list item inside a blockquote) is handled correctly.
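The four steps can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not the tool's implementation: html_to_markdown and the @@CODEn@@ placeholder format are invented for the example, and only headings, horizontal rules, paragraphs, bold, italic, and inline code are covered.

```python
import html
import re

FENCE = "`" * 3  # triple backtick, built here to keep the example readable


def html_to_markdown(source: str) -> str:
    code_blocks: list[str] = []

    # Step 1: set <pre> blocks aside behind placeholders (hypothetical format)
    # so later passes cannot touch their contents.
    def stash(match: re.Match) -> str:
        code = re.sub(r"</?code[^>]*>", "", match.group(1))
        code_blocks.append(FENCE + "\n" + html.unescape(code).strip("\n") + "\n" + FENCE)
        return f"\n\n@@CODE{len(code_blocks) - 1}@@\n\n"

    text = re.sub(r"<pre[^>]*>(.*?)</pre>", stash, source, flags=re.S | re.I)

    # Step 2: block elements (headings, horizontal rules, paragraphs only).
    text = re.sub(
        r"<h([1-6])(?:\s[^>]*)?>(.*?)</h\1>",
        lambda m: "\n\n" + "#" * int(m.group(1)) + " " + m.group(2).strip() + "\n\n",
        text,
        flags=re.S | re.I,
    )
    text = re.sub(r"<hr\s*/?>", "\n\n---\n\n", text, flags=re.I)
    text = re.sub(r"<p(?:\s[^>]*)?>(.*?)</p>", r"\n\n\1\n\n", text, flags=re.S | re.I)

    # Step 3: inline elements.
    text = re.sub(r"<(strong|b)>(.*?)</\1>", r"**\2**", text, flags=re.S | re.I)
    text = re.sub(r"<(em|i)>(.*?)</\1>", r"*\2*", text, flags=re.S | re.I)
    text = re.sub(r"<code>(.*?)</code>", r"`\1`", text, flags=re.S | re.I)

    # Step 4: strip leftover tags, unescape entities, condense blank lines.
    text = re.sub(r"<[^>]+>", "", text)
    text = html.unescape(text)
    text = re.sub(r"\n{3,}", "\n\n", text).strip()

    # Put the preserved code blocks back in place of their placeholders.
    for index, block in enumerate(code_blocks):
        text = text.replace(f"@@CODE{index}@@", block)
    return text


sample = "<h2>Title</h2><p>Some <b>bold</b> &amp; <i>plain</i> text.</p><pre><code>a &lt; b</code></pre>"
print(html_to_markdown(sample))
```

Setting the code aside first matters because of Step 4: once entities in the code are unescaped, a line such as a < b could otherwise be mistaken for markup by the tag-stripping pass.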
Common Use Cases
Developers and content creators convert HTML to Markdown in many scenarios:
- CMS migration: Moving blog posts from WordPress or Drupal to static site generators like Hugo, Jekyll, or Astro that use Markdown files
- Documentation: Converting internal wiki pages or HTML documentation to Markdown for storage in Git repositories
- Email cleanup: Stripping HTML formatting from email content to create clean text or Markdown versions
- Web scraping: Converting scraped HTML content into readable Markdown for analysis or archiving
- Note-taking: Reformatting web articles for import into Markdown-based note apps like Obsidian or Bear
Tips for Clean Conversions
Follow these practices to get the best results:
- Clean up the HTML first: Remove unnecessary <div> wrappers, inline styles, and class attributes. The converter strips these, but cleaner input produces cleaner output.
- Check nested lists: Deeply nested HTML lists may not convert perfectly. Review the output for correct indentation and nesting levels.
- Verify code blocks: Code inside <pre><code> blocks preserves whitespace and unescapes HTML entities. Check that special characters in code examples rendered correctly.
- Handle edge cases: Some HTML patterns (like tables with merged cells or complex layouts using CSS Grid) do not have Markdown equivalents. These are converted to their closest approximation.
- Review links and images: Verify that all URLs were extracted correctly, especially those containing query parameters or special characters.
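The link and image review can be partly automated. The sketch below is one possible approach, assuming you have both the original HTML and the converted Markdown as strings; UrlCollector and missing_urls are hypothetical helper names, not part of the converter. It uses Python's standard html.parser module to collect every href and src, then reports any URL that does not appear verbatim in the Markdown.

```python
from html.parser import HTMLParser


class UrlCollector(HTMLParser):
    """Collect href values from <a> tags and src values from <img> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        wanted = {"a": "href", "img": "src"}.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def missing_urls(source_html: str, markdown: str) -> list[str]:
    """Return URLs present in the HTML but absent from the Markdown output."""
    collector = UrlCollector()
    collector.feed(source_html)
    collector.close()
    return [url for url in collector.urls if url not in markdown]


page = '<a href="https://example.com/?a=1&amp;b=2">x</a> <img src="pic.png" alt="p">'
print(missing_urls(page, "[x](https://example.com/?a=1) ![p](pic.png)"))
```

HTMLParser decodes entities in attribute values, so an href written with &amp; is compared in its decoded form. A truncated query string, as in the usage line above, shows up in the returned list.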
Frequently Asked Questions
What happens to HTML tags that Markdown does not support?
Unsupported tags are stripped and their text content is kept. For example, <span class="highlight">text</span> becomes just text. No content is lost, but the styling information is removed since Markdown has no equivalent.
Does this converter handle inline CSS styles?
Can I convert an entire web page to Markdown?
How are HTML entities handled?
Common HTML entities such as &amp;, &lt;, &gt;, &quot;, and &nbsp; are converted back to their original characters. This ensures the Markdown output contains readable text rather than encoded entities.
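For reference, entity decoding of this kind is available in Python's standard library as html.unescape. The short sketch below shows it on a few common entities. The decode_entities wrapper and its keep_nbsp option are assumptions made for illustration; whether the converter turns a non-breaking space into a regular space is not stated here.

```python
import html


def decode_entities(text: str, keep_nbsp: bool = False) -> str:
    """Decode HTML entities; optionally turn non-breaking spaces into plain spaces."""
    decoded = html.unescape(text)
    if not keep_nbsp:
        # Assumption: a plain space is usually more convenient in Markdown source.
        decoded = decoded.replace("\u00a0", " ")
    return decoded


print(decode_entities("Tom &amp; Jerry &lt;3 &quot;cheese&quot;"))
```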