MarkDown Syntax Highlighter


05-02-2015 11:35 by depado

Syntax Highlighter


One of the main issues I had with the Markdown interpreter Misaka was that there are only a few examples online showing how to define a custom block renderer. Here is how I managed it.


Pygments can syntax-highlight many languages. When no language is defined, the code is rendered as plain text, without the CSS classes or inline styles that give keywords, operators, etc. their colors.

# -*- coding: utf-8 -*-

import misaka
from misaka import HtmlRenderer, SmartyPants
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters.html import HtmlFormatter
from pygments.util import ClassNotFound


class HighlighterRenderer(HtmlRenderer, SmartyPants):

    def block_code(self, text, lang):
        # Only wrap the output in the styled div when a real language was found.
        has_syntax_highlite = False
        if not lang:
            lang = 'text'
        try:
            lexer = get_lexer_by_name(lang, stripall=True)
            if lang != 'text':
                has_syntax_highlite = True
        except ClassNotFound:
            # Unknown language name: fall back to the plain-text lexer.
            lexer = get_lexer_by_name('text', stripall=True)

        formatter = HtmlFormatter()
        return "{open_block}{formatted}{close_block}".format(
            open_block="<div class='code-highlight'>" if has_syntax_highlite else '',
            formatted=highlight(text, lexer, formatter),
            close_block="</div>" if has_syntax_highlite else ''
        )

    def table(self, header, body):
        return "<table class='table table-bordered table-hover'>" + header + body + "</table>"

markdown_renderer = misaka.Markdown(
    HighlighterRenderer(flags=misaka.HTML_ESCAPE | misaka.HTML_HARD_WRAP | misaka.HTML_SAFELINK),
    extensions=misaka.EXT_FENCED_CODE | misaka.EXT_NO_INTRA_EMPHASIS | misaka.EXT_TABLES | misaka.EXT_AUTOLINK | misaka.EXT_SPACE_HEADERS | misaka.EXT_STRIKETHROUGH | misaka.EXT_SUPERSCRIPT
)
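
The lexer fallback used in block_code can be tried on its own, without Misaka. The snippet below is only a sketch: the helper name pick_lexer is made up for illustration and is not part of Pygments or of the renderer above. It relies on get_lexer_by_name raising ClassNotFound for an unknown language name.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name
from pygments.util import ClassNotFound


def pick_lexer(lang):
    """Hypothetical helper: return (lexer, highlighted) for a fence language.

    Falls back to the plain-text lexer when the language is missing or
    unknown, mirroring the logic of block_code above.
    """
    name = lang or "text"
    try:
        return get_lexer_by_name(name, stripall=True), name != "text"
    except ClassNotFound:
        # Unknown language name: render as plain text, no highlighting.
        return get_lexer_by_name("text", stripall=True), False


if __name__ == "__main__":
    for lang in ("python", "no-such-language", None):
        lexer, highlighted = pick_lexer(lang)
        print(lang, "->", lexer.name, "| highlighted:", highlighted)
    print(highlight("x = 1", pick_lexer("python")[0], HtmlFormatter()))
```

Catching ClassNotFound instead of using a bare except keeps unrelated errors (a typo in the renderer, for instance) visible instead of silently rendering plain text.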

Now this renderer is Bootstrap-compliant, meaning for example that tables are displayed with Bootstrap classes instead of as raw HTML tables. Even though the code is quite simple, it is really efficient. I ran some benchmarks on the edit and new article pages: the AJAX request for the preview takes a total of 4 ms. Isn't this a really efficient parser? Additionally, it was previously used only to render blog posts. Now the markdown_renderer is available app-wide, allowing me to run Markdown processing on any kind of data and not only on article content.

The syntax highlighting system using Pygments generates HTML. The "easiest" way to add colors is to tell the HtmlFormatter to generate inline styles. The advantage of that technique is that the right colors are directly in the generated HTML, with no need to include an additional stylesheet. This advantage is also a disadvantage, though. This GitHub Repository contains several CSS files that you can include to style the output, which allows customization. Plus, inline styles can multiply the size of the generated HTML (because the generated inline styles are not reused).
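
To make the two options concrete, here is a small sketch that renders the same snippet both ways with Pygments. The sample code and the ".code-highlight" selector passed to get_style_defs are assumptions chosen to match the wrapper div used by the renderer above; noclasses=True is the HtmlFormatter option that switches to inline styles.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name

code = "def add(a, b):\n    return a + b\n"
lexer = get_lexer_by_name("python")

# Class-based output: tokens carry short CSS classes, colors live in a stylesheet.
class_html = highlight(code, lexer, HtmlFormatter())

# Inline output: every styled token repeats its own style="..." attribute.
inline_html = highlight(code, lexer, HtmlFormatter(noclasses=True))

# Stylesheet for the class-based output, scoped to an assumed wrapper selector.
stylesheet = HtmlFormatter().get_style_defs(".code-highlight")

if __name__ == "__main__":
    print(class_html)
    print(inline_html)
    print(len(class_html), len(inline_html), len(stylesheet))
```

Printing the lengths on a real post is a quick way to see how much weight each approach adds to the page, since the stylesheet is sent once while inline styles are repeated in every code block.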

Click on the eye icon on the top-right border of this post to look at the raw markdown post. That will give you an example of the syntax accepted by this parser.