CodeToRestSphinx.py - a Sphinx extension to translate source code to reST¶
This module enables Sphinx to read source files by converting the source code to reST before passing the file on to Sphinx. The overall design:
Monkeypatch Sphinx to include source files in the build, keeping the source file’s extension intact. (Sphinx strips the extension of reST files.)
When Sphinx reads a source file, check to see if the file’s extension is intact. If so, it’s a source file; translate it to reST then pass it on to Sphinx.
Imports¶
These are listed in the order prescribed by PEP 8.
Standard library¶
Third-party imports¶
This was deprecated in Sphinx v5.1.0.
The exception FiletypeNotFoundError was deprecated in Sphinx v2.4.0 by moving it from sphinx.io to sphinx.errors.
Local application imports¶
Utility¶
exclude_small_files¶
Python requires __init__.py files; these are often small or even empty. Provide a function to exclude these from the docs in order to reduce noise. Invoke this from a setup function in a Sphinx extension.
Therefore, this function excludes small files matching the given glob from a Sphinx build.
The Sphinx application object.
A glob pattern specifying which files should be excluded if they are empty (or no larger than the maximum size given below).
Optional; the maximum size of an ignorable file, in bytes. Defaults to 0.
This returns a function which will be called by the config-inited event.
The path must start in the srcdir.
This is slightly inefficient, since it doesn’t use the existing excludes to avoid searching already-excluded values.
Paths must be relative to the srcdir.
Connect this to the config-inited Sphinx event.
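The file search described above can be sketched with the standard library alone. This is a minimal illustration, not the extension's code: the name find_small_files is hypothetical, and the sketch only collects the matching paths (relative to the source directory, as the text requires) rather than registering them with Sphinx.

```python
from pathlib import Path


def find_small_files(srcdir, pattern, max_size=0):
    """Return srcdir-relative POSIX paths of the files matching ``pattern``
    whose size is at most ``max_size`` bytes.

    Hypothetical helper for illustration; not part of Sphinx or CodeChat.
    """
    root = Path(srcdir)
    return sorted(
        # Report paths relative to the source directory.
        path.relative_to(root).as_posix()
        for path in root.glob(pattern)
        if path.is_file() and path.stat().st_size <= max_size
    )
```

In an extension, a handler connected to the config-inited event could add the returned paths to the project's exclude patterns. As noted above, this approach searches the whole tree, including paths that are already excluded.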
source-read event¶
Create a logger for issuing warnings during the build process.
The source-read event occurs when a source file is read. If it’s code, this routine changes it into reST or Markdown.
app: The Sphinx application object.
docname: The name of the document that was read. It contains a path relative to the project directory and (typically) no extension.
A list whose single element is the contents of the source file.
See if it’s an extension we should process.
See if source_file matches any of the globs.
On a match, pass the specified lexer alias.
Do this after checking the CodeChat_lexer_for_glob list, since this will raise an exception on failure.
Translate code to reST or Markdown.
    if is_markdown_docname(app.config, docname):
        source[0] = code_to_markdown_string(source[0], lexer=lexer)
        markup = "Markdown"
    else:
        source[0] = code_to_rest_string(source[0], lexer=lexer)
        source[0] = add_highlight_language(source[0], lexer)
        markup = "reST"
    logger.info(
        "Converted as {} using the {} lexer.".format(markup, lexer.name)
    )
except (KeyError, pygments.util.ClassNotFound) as e:
We don’t support this language.
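The glob lookup that precedes the translation step can be sketched as follows. This assumes that CodeChat_lexer_for_glob maps glob patterns to lexer aliases and that fnmatch-style matching is acceptable; both are assumptions for illustration, and the helper name lexer_alias_for is hypothetical.

```python
from fnmatch import fnmatch


def lexer_alias_for(source_file, lexer_for_glob):
    """Return the lexer alias of the first glob that matches
    ``source_file``, or None when no glob matches.

    ``lexer_for_glob`` is assumed to be a dict of glob -> lexer alias.
    """
    for glob, alias in lexer_for_glob.items():
        if fnmatch(source_file, glob):
            return alias
    # No match: the caller falls back to guessing the lexer from the file
    # name, which may raise an exception for unsupported languages.
    return None
```

For example, a mapping such as {"*.gradle": "groovy"} would select the groovy lexer for any Gradle build file, while other files fall through to the default lookup.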
Return True if the supplied docname is source code.
See docname.
If the docname’s extension doesn’t change when asking for its full path, then it’s source code. Normally, the docname of foo.rst is foo; only for source code is the docname of foo.c also foo.c. Look up the name and extension using doc2path.
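The extension comparison can be sketched without Sphinx by passing in the lookup function. Here doc2path is any callable that maps a docname to its file path; the function name is_source_docname and this calling convention are assumptions made for illustration.

```python
import os.path


def is_source_docname(docname, doc2path):
    """Return True if ``docname`` refers to source code.

    ``doc2path`` is a stand-in callable mapping a docname to its path.
    """
    docname_ext = os.path.splitext(docname)[1]
    path_ext = os.path.splitext(doc2path(docname))[1]
    # reST: "foo" -> "foo.rst" (extension changes).
    # Source code: "foo.c" -> "foo.c" (extension unchanged).
    return docname_ext == path_ext
```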
Return True if the supplied docname is Markdown; False means reST.
The Sphinx config object.
See docname.
Get the second extension: given a file named a.foo.bar, produce [".foo"]; given a.bar, produce [].
See if this is a recognized Markdown extension.
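One way to obtain the second extension is to slice the list of suffixes that pathlib provides. In the sketch below, the helper names and the shape of the source_suffix mapping (suffix to file type name) are assumptions for illustration, not the extension's confirmed implementation.

```python
from pathlib import PurePosixPath


def second_extension(docname):
    """Return the second-to-last extension as a list of zero or one items."""
    # For "a.foo.bar", suffixes is [".foo", ".bar"]; the slice keeps [".foo"].
    # For "a.bar", suffixes is [".bar"]; the slice is empty.
    return PurePosixPath(docname).suffixes[-2:-1]


def is_markdown(docname, source_suffix):
    """Return True if the second extension is registered as Markdown.

    ``source_suffix`` is assumed to map suffixes to file type names.
    """
    ext = second_extension(docname)
    return bool(ext) and source_suffix.get(ext[0]) == "markdown"
```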
Monkeypatch¶
Sphinx doesn’t naturally look for source files. Simply adding all supported source file extensions to conf.py’s source_suffix doesn’t work, since foo.c and foo.h will now both be seen as the docname foo, making them indistinguishable. See also my post on sphinx-users.
path2doc patch¶
For source files, make their docname the same as the file name; for reST files, allow Sphinx to strip off the extension as before. This patch accomplishes this. It comes from sphinx.project.Project, line 79 and following in Sphinx 7.2.6.
Return the docname for the filename if the file is a document.
filename should be absolute or relative to the source directory.
try:
    return self._path_to_docname[filename]  # type: ignore[index]
except KeyError:
    if os.path.isabs(filename):
        with contextlib.suppress(ValueError):
            filename = os.path.relpath(filename, self.srcdir)
    for suffix in self.source_suffix:
        if os.path.basename(filename).endswith(suffix):
            return path_stabilize(filename).removesuffix(suffix)
The following code was added.
the file does not have a docname
Avoid recomputing the value of this variable by defining it globally.
Return True if the provided filename is a source code language CodeChat supports.
type: (str) -> bool
Initialize this if necessary.
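The effect of the patched path2doc can be illustrated with a stand-alone function. This is a simplified sketch, not the patched Sphinx method: it takes the suffix list and a language-support predicate as parameters (both hypothetical) instead of reading them from the project object. It requires Python 3.9 or later for str.removesuffix.

```python
import os.path


def path2doc(filename, source_suffixes, is_supported_source):
    """Return the docname for ``filename`` (relative to the source
    directory), or None if the file is not a document."""
    filename = filename.replace(os.path.sep, "/")
    # reST-style files: strip the registered suffix, as Sphinx does.
    for suffix in source_suffixes:
        if os.path.basename(filename).endswith(suffix):
            return filename.removesuffix(suffix)
    # Added behavior: a supported source file keeps its extension.
    if is_supported_source(filename):
        return filename
    # The file does not have a docname.
    return None
```

Because foo.c and foo.h keep their extensions, they map to distinct docnames instead of colliding on foo.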
doc2path patch¶
Next, the way docnames get transformed back to a full path needs to be fixed for source files. Specifically, a docname might be the source file, without adding an extension. This code comes from sphinx.project.Project of Sphinx 7.2.6.
Return the filename for the document name.
If absolute is True, return as an absolute path. Else, return as a relative path to the source directory.
Three lines of code added here – check for the no-extension case.
Backwards compatibility: the document does not exist
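The no-extension case can be illustrated with a stand-alone lookup that probes the file system. This is a simplified sketch under stated assumptions, not the patched Sphinx method: the parameters srcdir and source_suffixes are passed explicitly, and the file-existence probing is an illustrative choice.

```python
import os.path


def doc2path(docname, srcdir, source_suffixes, absolute=True):
    """Return the file name for ``docname``; absolute if requested."""
    candidates = [docname + suffix for suffix in source_suffixes]
    # Added behavior: the docname may already be the file name.
    candidates.append(docname)
    for candidate in candidates:
        if os.path.isfile(os.path.join(srcdir, candidate)):
            filename = candidate
            break
    else:
        # Backwards compatibility: the document does not exist.
        filename = docname + source_suffixes[0]
    return os.path.join(srcdir, filename) if absolute else filename
```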
get_filetype patch¶
The get_filetype function raises an exception if it can’t determine the type of a file. Patch it to also recognize source code as reST. This was taken from sphinx.util, version 7.2.6.
If default filetype (None), considered as restructuredtext.
The following code was added.
This was the existing code.
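The patched behavior can be sketched as a stand-alone function. The exception class below is a local stand-in for Sphinx's FiletypeNotFoundError, and the is_supported_source parameter is a hypothetical predicate added so the example runs without Sphinx or CodeChat installed.

```python
import os.path


class FiletypeNotFoundError(Exception):
    """Local stand-in for the Sphinx exception of the same name."""


def get_filetype(source_suffix, filename, is_supported_source):
    """Return the file type for ``filename``; ``source_suffix`` maps
    suffixes to file type names (or None for the default)."""
    for suffix, filetype in source_suffix.items():
        if os.path.basename(filename).endswith(suffix):
            # If default filetype (None), considered as restructuredtext.
            return filetype or "restructuredtext"
    # Added behavior: supported source code is treated as reST.
    if is_supported_source(filename):
        return "restructuredtext"
    # Existing behavior: unknown files raise.
    raise FiletypeNotFoundError(filename)
```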
Per the where to patch docs, patch this where the get_filetype function is used, not where it’s defined:
The function sphinx.io.get_filetype was deprecated in Sphinx v2.4.0; it was renamed to sphinx.util.get_filetype instead. Sphinx uses sphinx.deprecation._ModuleWrapper to perform deprecation. Since get_filetype is used in sphinx.io, we need to monkeypatch inside it, hence the _module (a member of the _ModuleWrapper).
In these versions, get_filetype is used in sphinx.io. It’s no longer deprecated, but removed; therefore, a direct monkeypatch works.
In current Sphinx, get_filetype is used in several places:
Current Sphinx (7.2.6) doesn’t need this.
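The "patch where it is used" rule cited above can be demonstrated with two throwaway modules. The module names util_demo and io_demo are hypothetical stand-ins built at run time; they are not Sphinx modules.

```python
import types

# Hypothetical module that defines the function.
util_demo = types.ModuleType("util_demo")
exec("def get_filetype(name):\n    return 'original'\n", util_demo.__dict__)

# Hypothetical module that uses it. The assignment below mimics
# ``from util_demo import get_filetype`` inside io_demo.
io_demo = types.ModuleType("io_demo")
io_demo.get_filetype = util_demo.get_filetype
exec("def read(name):\n    return get_filetype(name)\n", io_demo.__dict__)


def patched(name):
    return "patched"


# Patching the defining module leaves io_demo's own binding untouched.
util_demo.get_filetype = patched
result_after_patching_definition = io_demo.read("foo.c")

# Patching the name inside the using module takes effect.
io_demo.get_filetype = patched
result_after_patching_use = io_demo.read("foo.c")
```

The first call still reaches the original function because io_demo holds its own reference to it; only rebinding the name in io_demo changes what read calls.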
Correct naming for the “show source” option¶
The following function corrects the extension of source files in the “source” link. By default, Sphinx (in sphinx.builders.html.StandaloneHTMLBuilder.get_doc_context) creates a sourcename by appending a file’s extension to the value returned by doc2path. For non-source files, doc2path’s return value contains no extension, so this works fine. However, for source files, doc2path’s return value contains an extension, so that appending the extension to source files produces a doubled extension (.py.py, for example).
See app.
The canonical name of the page being rendered, that is, without the .html suffix and using slashes as path separators.
The name of the template to render; this will be ‘page.html’ for all pages from reST documents.
A dictionary of values that are given to the template engine to render the page and can be modified to include custom values. Keys must be strings.
A doctree when the page is created from a reST document; None when the page is created from an HTML template alone.
The extension Sphinx uses optionally includes the html_sourcelink_suffix.
Only provide the rename if necessary.
Take off the second of the double extensions.
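The removal of a doubled extension can be sketched as a pure string operation. The function name fix_double_extension and the default link suffix ".txt" are assumptions for illustration; the actual handler works on the template context rather than on a bare string.

```python
import os.path


def fix_double_extension(sourcename, sourcelink_suffix=".txt"):
    """Return ``sourcename`` with a doubled extension collapsed,
    keeping any trailing source-link suffix."""
    tail = ""
    # The name optionally ends with the source-link suffix.
    if sourcelink_suffix and sourcename.endswith(sourcelink_suffix):
        sourcename = sourcename[: -len(sourcelink_suffix)]
        tail = sourcelink_suffix
    root, ext = os.path.splitext(sourcename)
    # Only rename if necessary: "foo.py.py" splits into ("foo.py", ".py").
    if ext and root.endswith(ext):
        return root + tail
    return sourcename + tail
```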
Extension setup¶
This routine defines the entry point called by Sphinx to initialize this extension.
See app.
Ensure we’re using a new enough Sphinx using require_sphinx.
Use the source-read event hook to transform source code to reST before Sphinx processes it.
Add the CodeChat.css style sheet using add_css_file.
Add the CodeChat_lexer_for_glob config value. See add_config_value.
Use the html-page-context event to correct the extension of source files.
An ugly hack: we need to get to the Config object after conf.py’s values have been loaded. They aren’t loaded yet, so we store the config object to access it later when it is loaded.
Return extension metadata.