CodeToPretext.py - a module to translate source code to PreTeXt ¶

The API lists two functions which convert source code into PreTeXt. It relies on Implementation to classify the source as code or comment, then _generate_pretext to convert this to PreTeXt.

Imports ¶

These are listed in the order prescribed by PEP 8.

Standard library ¶

import html
from io import StringIO
import re
 

Local application imports ¶

from .SourceClassifier import source_lexer, get_lexer, _debug_print
 
 

API ¶

The following routines provide easy access to the core functionality of this module.

This function converts a string containing source code to PreTeXt, preserving all indentations of both source code and comments. To do so, the comment characters are stripped from CodeChat-formatted comments and all code is placed inside fenced code blocks.

def code_to_pretext_string(

code_str: the code to translate to PreTeXt.

    code_str,

See options.

    **options,
):

Use a StringIO to capture writes into a string.

    output_ptx = StringIO()
    ast_syntax_error, classified_lines = source_lexer(
        code_str, get_lexer(code=code_str, **options)
    )

Remove the newline from a syntax error, so that source line numbers match exactly with the lines output by this function.

    if ast_syntax_error:
        output_ptx.write(
            f"<warning>CodeChat parse error: {html.escape(ast_syntax_error, False)[:-1]}</warning>"
        )
    output_ptx.write("<program><input>")
    _generate_pretext(classified_lines, output_ptx)
    output_ptx.write("</input></program>")
    return output_ptx.getvalue()
 
 

code_to_pretext_file ¶

Convert a source file to a PreTeXt file.

def code_to_pretext_file(

Path to a source code file to process.

    source_path,

Path to a destination PreTeXt file to create. It will be overwritten if it already exists. If not specified, it is source_path.ptx.

    pt_path=None,

Encoding to use for the input file. The default of None detects the encoding of the input file.

    input_encoding="utf-8",

Encoding to use for the output file.

    output_encoding="utf-8",

See options.

    **options,
):

Provide a default rst_path.

    if not pt_path:
        pt_path = source_path + ".ptx"
    with open(source_path, encoding=input_encoding) as fi:
        code_str = fi.read()

If not already present, provide the filename of the source to help in identifying a lexer.

    options.setdefault("filename", source_path)
    rst = code_to_pretext_string(code_str, **options)
    with open(pt_path, "w", encoding=output_encoding) as fo:
        fo.write(rst)
 
 

Converting classified code to PreTeXt ¶

A regex to match part of an opening XML tag in a comment context, allowing for leading whitespace. At this time, a comment must always begin with a <p> tag and end with a closing </p> tag.

xml_partial_opening_tag_regex = re.compile(r"^\s*<p\s*>")
xml_closing_tag_regex = re.compile(r"</p\s*>\s*$")

_generate_pretext ¶

Generate PreTeXt from the classified code. To do this, create a state machine, where current_type defines the state. When the state changes, exit the previous state, then enter the new state.

def _generate_pretext(

An iterable of (type, string) pairs, one per line.

    classified_lines,

A file-like output to which the pretext source is written.

    out_file,
):

Determine the number of characters produced when a newline is written.

    newline_chars = out_file.write("\n")

Undo this write.

    out_file.seek(out_file.tell() - newline_chars, 0)
 

Keep track of the current type. Begin with neither comment nor code.

    current_type = -2
 

Keep track of the current line number.

    line = 1

    for type_, string in classified_lines:
        _debug_print(
            "type_ = {}, line = {}, string = {}\n".format(type_, line, [string])
        )
 

See if there’s a change in state.

        if current_type != type_:

Exit the current state.

            _exit_state(current_type, string, out_file, newline_chars)
 

Enter the new state.

Code state:

            if type_ == -1:

Nothing is needed.

                pass

Comment state:

            else:

Add an opening <p> if there’s no XML tag at the beginning of this string.

                if not re.search(xml_partial_opening_tag_regex, string):
                    string = "<p>" + string
 

Escape characters in code; pass comments through with appropriate leading whitespace.

        out_file.write(
            html.escape(string, False) if type_ == -1 else " " * type_ + string
        )
 

Update the state.

        current_type = type_
        line += 1
 

When done, exit the last state.

    _exit_state(current_type, string, out_file, newline_chars)
 
 

_exit_state ¶

Output text produced when exiting a state. Supports _generate_pretext.

def _exit_state(

The type (classification) of the last line.

    type_,

One line from the program being translated.

    string,

See out_file.

    out_file,

The number of character written by a newline.

    newline_chars,
):

Code state: do nothing.

    if type_ == -1:
        pass

Comment state: emit a closing </p> tag if it wasn’t provided.

    elif type_ >= 0:
        if not re.search(xml_closing_tag_regex, string):

Append it to the end of the string by backing up one character (the existing newline at the end of the current line).

            out_file.seek(out_file.tell() - newline_chars, 0)
            out_file.write("</p>\n")

Initial state. Nothing needed.

    else:
        pass

CodeToPretext.py - a module to translate source code to PreTeXt ¶

Imports ¶

Standard library ¶

Third-party imports ¶

Local application imports ¶

API ¶

code_to_pretext_string ¶

code_to_pretext_file ¶

Converting classified code to PreTeXt ¶

_generate_pretext ¶

_exit_state ¶

CodeChat

Navigation

Related Topics