Force Sphinx to interpret Markdown in Python docstrings instead of reStructuredText
Question:
I’m using Sphinx to document a python project. I would like to use Markdown in my docstrings to format them. Even if I use the recommonmark
extension, it only covers the .md
files written manually, not the docstrings.
I use autodoc
, napoleon
and recommonmark
in my extensions.
How can I make sphinx parse markdown in my docstrings?
Answers:
Sphinx’s Autodoc extension emits an event named autodoc-process-docstring
every time it processes a doc-string. We can hook into that mechanism to convert the syntax from Markdown to reStructuredText.
Unfortunately, Recommonmark does not expose a Markdown-to-reST converter. It maps the parsed Markdown directly to a Docutils object, i.e., the same representation that Sphinx itself creates internally from reStructuredText.
Instead, I use Commonmark for the conversion in my projects. Because it’s fast — much faster than Pandoc, for example. Speed is important as the conversion happens on the fly and handles each doc-string individually. Other than that, any Markdown-to-reST converter would do. M2R2 would be a third example. The downside of any of these is that they do not support Recommonmark’s syntax extensions, such as cross-references to other parts of the documentation. Just the basic Markdown.
To plug in the Commonmark doc-string converter, make sure that package is installed (pip install commonmark
) and add the following to Sphinx’s configuration file conf.py
:
import commonmark
def docstring(app, what, name, obj, options, lines):
md = 'n'.join(lines)
ast = commonmark.Parser().parse(md)
rst = commonmark.ReStructuredTextRenderer().render(ast)
lines.clear()
lines += rst.splitlines()
def setup(app):
app.connect('autodoc-process-docstring', docstring)
Meanwhile, Recommonmark was deprecated in May 2021. The Sphinx extension MyST, a more feature-rich Markdown parser, is the replacement recommended by Sphinx and by Read-the-Docs. With MyST, one could use the same "hack" as above to get limited Markdown support. Though in February 2023, the extension Sphinx-Autodoc2 was published, which promises full (MyST-flavored) Markdown support in doc-strings, including cross-references.
A possible alternative to the approach outlined here is using MkDocs with the MkDocStrings plug-in, which would eliminate Sphinx and reStructuredText entirely from the process.
Building on @john-hennig answer, the following will keep the restructured text fields like: :py:attr:
, :py:class:
etc. . This allows you to reference other classes, etc.
import re
import commonmark
py_attr_re = re.compile(r":py:w+:(``[^:`]+``)")
def docstring(app, what, name, obj, options, lines):
md = 'n'.join(lines)
ast = commonmark.Parser().parse(md)
rst = commonmark.ReStructuredTextRenderer().render(ast)
lines.clear()
lines += rst.splitlines()
for i, line in enumerate(lines):
while True:
match = py_attr_re.search(line)
if match is None:
break
start, end = match.span(1)
line_start = line[:start]
line_end = line[end:]
line_modify = line[start:end]
line = line_start + line_modify[1:-1] + line_end
lines[i] = line
def setup(app):
app.connect('autodoc-process-docstring', docstring)
I had to extend the accepted answer by john-hen to allow multi-line descriptions of Args:
entries to be considered a single parameter:
def docstring(app, what, name, obj, options, lines):
wrapped = []
literal = False
for line in lines:
if line.strip().startswith(r'```'):
literal = not literal
if not literal:
line = ' '.join(x.rstrip() for x in line.split('n'))
indent = len(line) - len(line.lstrip())
if indent and not literal:
wrapped.append(' ' + line.lstrip())
else:
wrapped.append('n' + line.strip())
ast = commonmark.Parser().parse(''.join(wrapped))
rst = commonmark.ReStructuredTextRenderer().render(ast)
lines.clear()
lines += rst.splitlines()
def setup(app):
app.connect('autodoc-process-docstring', docstring)
The current @john-hennig is great, but seems to be failing for multi-line Args:
in python style. Here was my fix:
def docstring(app, what, name, obj, options, lines):
md = "n".join(lines)
ast = commonmark.Parser().parse(md)
rst = commonmark.ReStructuredTextRenderer().render(ast)
lines.clear()
lines += _normalize_docstring_lines(rst.splitlines())
def _normalize_docstring_lines(lines: list[str]) -> list[str]:
"""Fix an issue with multi-line args which are incorrectly parsed.
```
Args:
x: My multi-line description which fit on multiple lines
and continue in this line.
```
Is parsed as (missing indentation):
```
:param x: My multi-line description which fit on multiple lines
and continue in this line.
```
Instead of:
```
:param x: My multi-line description which fit on multiple lines
and continue in this line.
```
"""
is_param_field = False
new_lines = []
for l in lines:
if l.lstrip().startswith(":param"):
is_param_field = True
elif is_param_field:
if not l.strip(): # Blank line reset param
is_param_field = False
else: # Restore indentation
l = " " + l.lstrip()
new_lines.append(l)
return new_lines
def setup(app):
app.connect("autodoc-process-docstring", docstring)
I’m using Sphinx to document a python project. I would like to use Markdown in my docstrings to format them. Even if I use the recommonmark
extension, it only covers the .md
files written manually, not the docstrings.
I use autodoc
, napoleon
and recommonmark
in my extensions.
How can I make sphinx parse markdown in my docstrings?
Sphinx’s Autodoc extension emits an event named autodoc-process-docstring
every time it processes a doc-string. We can hook into that mechanism to convert the syntax from Markdown to reStructuredText.
Unfortunately, Recommonmark does not expose a Markdown-to-reST converter. It maps the parsed Markdown directly to a Docutils object, i.e., the same representation that Sphinx itself creates internally from reStructuredText.
Instead, I use Commonmark for the conversion in my projects. Because it’s fast — much faster than Pandoc, for example. Speed is important as the conversion happens on the fly and handles each doc-string individually. Other than that, any Markdown-to-reST converter would do. M2R2 would be a third example. The downside of any of these is that they do not support Recommonmark’s syntax extensions, such as cross-references to other parts of the documentation. Just the basic Markdown.
To plug in the Commonmark doc-string converter, make sure that package is installed (pip install commonmark
) and add the following to Sphinx’s configuration file conf.py
:
import commonmark
def docstring(app, what, name, obj, options, lines):
md = 'n'.join(lines)
ast = commonmark.Parser().parse(md)
rst = commonmark.ReStructuredTextRenderer().render(ast)
lines.clear()
lines += rst.splitlines()
def setup(app):
app.connect('autodoc-process-docstring', docstring)
Meanwhile, Recommonmark was deprecated in May 2021. The Sphinx extension MyST, a more feature-rich Markdown parser, is the replacement recommended by Sphinx and by Read-the-Docs. With MyST, one could use the same "hack" as above to get limited Markdown support. Though in February 2023, the extension Sphinx-Autodoc2 was published, which promises full (MyST-flavored) Markdown support in doc-strings, including cross-references.
A possible alternative to the approach outlined here is using MkDocs with the MkDocStrings plug-in, which would eliminate Sphinx and reStructuredText entirely from the process.
Building on @john-hennig answer, the following will keep the restructured text fields like: :py:attr:
, :py:class:
etc. . This allows you to reference other classes, etc.
import re
import commonmark
py_attr_re = re.compile(r":py:w+:(``[^:`]+``)")
def docstring(app, what, name, obj, options, lines):
md = 'n'.join(lines)
ast = commonmark.Parser().parse(md)
rst = commonmark.ReStructuredTextRenderer().render(ast)
lines.clear()
lines += rst.splitlines()
for i, line in enumerate(lines):
while True:
match = py_attr_re.search(line)
if match is None:
break
start, end = match.span(1)
line_start = line[:start]
line_end = line[end:]
line_modify = line[start:end]
line = line_start + line_modify[1:-1] + line_end
lines[i] = line
def setup(app):
app.connect('autodoc-process-docstring', docstring)
I had to extend the accepted answer by john-hen to allow multi-line descriptions of Args:
entries to be considered a single parameter:
def docstring(app, what, name, obj, options, lines):
wrapped = []
literal = False
for line in lines:
if line.strip().startswith(r'```'):
literal = not literal
if not literal:
line = ' '.join(x.rstrip() for x in line.split('n'))
indent = len(line) - len(line.lstrip())
if indent and not literal:
wrapped.append(' ' + line.lstrip())
else:
wrapped.append('n' + line.strip())
ast = commonmark.Parser().parse(''.join(wrapped))
rst = commonmark.ReStructuredTextRenderer().render(ast)
lines.clear()
lines += rst.splitlines()
def setup(app):
app.connect('autodoc-process-docstring', docstring)
The current @john-hennig is great, but seems to be failing for multi-line Args:
in python style. Here was my fix:
def docstring(app, what, name, obj, options, lines):
md = "n".join(lines)
ast = commonmark.Parser().parse(md)
rst = commonmark.ReStructuredTextRenderer().render(ast)
lines.clear()
lines += _normalize_docstring_lines(rst.splitlines())
def _normalize_docstring_lines(lines: list[str]) -> list[str]:
"""Fix an issue with multi-line args which are incorrectly parsed.
```
Args:
x: My multi-line description which fit on multiple lines
and continue in this line.
```
Is parsed as (missing indentation):
```
:param x: My multi-line description which fit on multiple lines
and continue in this line.
```
Instead of:
```
:param x: My multi-line description which fit on multiple lines
and continue in this line.
```
"""
is_param_field = False
new_lines = []
for l in lines:
if l.lstrip().startswith(":param"):
is_param_field = True
elif is_param_field:
if not l.strip(): # Blank line reset param
is_param_field = False
else: # Restore indentation
l = " " + l.lstrip()
new_lines.append(l)
return new_lines
def setup(app):
app.connect("autodoc-process-docstring", docstring)