Invocation and output of Thrases   

[Note 10-02-02: thrases still has to be ported to Python 3. It has been called "finished forever" somewhere, but the global namespace might yet be changed to the usual scope handling.]

Save thrases.py anywhere you can import it from. Then you can tell your Python interpreter (Python 2.2-2.7):

from thrases import Template
x = Template(filename, sep='|', incode='iso8859-1')
x.render(locals())
my_page_as_unicodeobj = x.out.getvalue()

incode defaults to None. Provide it if and only if you want unicode objects as output from x.render(); it must name the encoding you used when writing the template 'filename' to disk. However, when thrases renders (parts of) websites, the output of render() must be a string with a concrete encoding. In that case try to use that one encoding for all input to thrases: not only for the template file, but also for the strings your calling code passes to the template. Thrases will then not return unicode, but leave your encoding alone. Thrases works much faster with encoded strings.
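The two modes can be pictured with a small sketch. It is not thrases code: read_template is a hypothetical helper, written for current Python 3, that decodes the raw template bytes only when an encoding is named and otherwise leaves them alone.

```python
def read_template(data, incode=None):
    """Decode raw template bytes only when an encoding is named."""
    if incode:
        return data.decode(incode)
    return data


# Bytes as they would be stored on disk in iso8859-1 (sample text, not a real file).
raw = u"R\u00fcckseite".encode("iso8859-1")

as_text = read_template(raw, incode="iso8859-1")   # decoded text
as_bytes = read_template(raw)                      # untouched bytes
```

With incode the caller gets text and has to encode it again before sending it anywhere; without it the bytes pass through unchanged, which is the cheaper path the paragraph above recommends for websites.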

Instead of 'locals()' any analogous dict of variable names and values will do. If you pass 'locals()', the template has full access to the variables therein. In projects where templates might be changed by external web designers, you had better build such a dict yourself, containing just the needed variables and their values. Note that code objects such as functions can be passed this way too. Thus there is no urgent need for Python's import statement in the mini Python that thrases can interpret.
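A minimal sketch of building such a dict, assuming nothing about thrases itself: template_namespace and page_context are hypothetical names, and the variables are made up for illustration.

```python
def template_namespace(source, names):
    """Return a new dict holding only the listed names from `source`."""
    missing = [n for n in names if n not in source]
    if missing:
        raise KeyError("not available for the template: " + ", ".join(missing))
    return dict((n, source[n]) for n in names)


def page_context():
    title = "Postcards"
    rows = [("Schwaz", 1930)]
    db_password = "secret"   # must never reach a template
    # Hand over a copy of only what the template needs, instead of locals().
    return template_namespace(locals(), ["title", "rows"])


context = page_context()
```

The result would then be passed as x.render(context). Since render() stores the template object under the key 'q' in the dict it receives, working on a copy also keeps that entry out of the caller's own namespace.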

The template provided by filename (a file anywhere on your disk) can have any text format of your choice. You can also pass the template itself as a string: if thrases doesn't find a file called 'filename', it will interpret that argument as a string or unicode object containing the template.

There is a second way of getting output from a template: you can pass an open file object to Template.render() through the keyword argument fd. In that case x.render(locals(), fd=my_file) does not collect the result in a buffer, but writes it directly to my_file. This brings remarkable improvements in speed, and it should also work with the sockets of HTTP servers.
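The idea behind fd can be sketched in a few lines. render_parts is a hypothetical stand-in, not the render() of thrases: it writes to the given target if there is one and to a buffer of its own otherwise.

```python
import io


def render_parts(parts, fd=None):
    """Write all parts to fd, or to a fresh in-memory buffer if fd is None."""
    out = fd if fd is not None else io.StringIO()
    for part in parts:
        out.write(part)
    return out


parts = ["<div>", "text", "</div>"]

buffered = render_parts(parts).getvalue()   # result collected in memory

target = io.StringIO()                      # stands in for an open file
render_parts(parts, fd=target)              # result written straight to it
```

Writing directly to the target saves building the complete result in memory first, which is where the gain in speed comes from.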

When a file was given to the constructor, the template can re-read that file with the method x.reinit(), which first checks the file's modification time. The same is available as the keyword argument reinit of render(), as in x.render(locals(), fd=my_file, reinit=True); its default is False.
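The reload check can be sketched as follows. ReloadingText is a hypothetical class for illustration; unlike thrases, which compares against the time of construction, this sketch remembers the modification time seen at the last load.

```python
import os
import tempfile


class ReloadingText(object):
    """Keeps a file's text and re-reads it when the file has become newer."""

    def __init__(self, filename):
        self.filename = filename
        self.load()

    def load(self):
        with open(self.filename) as f:
            self.text = f.read()
        self.loaded_at = os.stat(self.filename).st_mtime

    def reinit(self):
        """Reload only if the modification time moved past the last load."""
        if os.stat(self.filename).st_mtime > self.loaded_at:
            self.load()
            return True
        return False


handle, path = tempfile.mkstemp(suffix=".tpl")
os.close(handle)
with open(path, "w") as f:
    f.write("first")

src = ReloadingText(path)
unchanged = src.reinit()                 # file untouched: nothing to do

with open(path, "w") as f:
    f.write("second")
newer = src.loaded_at + 10
os.utime(path, (newer, newer))           # force a clearly newer timestamp
changed = src.reinit()

os.remove(path)
```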

   Syntax   

Thrases reserves the separator of its phrases for its own use; you cannot use thrases when you need that character sequence for anything else. There is no escaping mechanism (however, you could easily do appropriate replacements before and after rendering). In the sequel '~~', the default separator, will denote it. First thrases does the text replacements for '#insert filename~~' (more below). Then the input is split at its double tildes, which have a meaning similar to the semicolon in Python or C. Thrases decides how to handle the parts by applying regular expressions and Python's compile() to test their syntax. For each of them one of the following three actions will be performed:

1. A phrase that is not Python is written to the output literally.
2. A Python expression is evaluated, and its result is written to the output.
3. A Python statement is executed and renders nothing.

A fourth case for phrases are Python-style comments, which start with '#' (leading white space allowed) and extend to the next double tilde. Thus '#insert filename~~' is not a comment, because it is not a complete phrase: it need not have a leading double tilde (more below).

There is an exception to the second case: function calls. A Python function call is assumed to be 'silent'. Thus str(float(2)) would not render anything. In such cases a workaround with an assignment must be used: ~~s=str(float(2))~~s~~ will render the value of the sole s.
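How compile() can serve as a syntax test is shown in the following sketch. classify is a hypothetical, much simplified function: thrases itself applies several regular expressions first and treats function calls and conditional phrases specially, all of which is left out here.

```python
def classify(phrase):
    """Guess what kind of phrase this is, using compile() as a syntax test."""
    if phrase == "":
        return "dedent"
    text = phrase.strip()
    if text.startswith("#"):
        return "comment"
    try:
        compile(text, "<phrase>", "eval")    # parses as an expression?
        return "expression"
    except SyntaxError:
        pass
    try:
        compile(text, "<phrase>", "exec")    # parses as a statement?
        return "statement"
    except SyntaxError:
        return "text"                        # not Python: render literally


phrases = ["<p>", "name", "</p>", "n = 3", "# a note", ""]
kinds = [classify(p) for p in phrases]
```

Nothing is executed at this point; compile() only parses, so an expression using an undefined name still counts as an expression.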

The mini Python understood by thrases is format-free. Not as completely as C is, but the whitespace around your phrases can have any shape you want. Inside the Python phrases Python syntax applies, including the usual use of the backslash. The empty phrase, consisting of four consecutive tildes, dedents Python by one level. 'Negative' indentation levels cannot be reached: you could write '~~~~~~~~~~~~~~~~~~~~' to get, most probably, back to the 'root' level.
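The bookkeeping of levels can be sketched as below. This is a simplification under a stated assumption: in this hypothetical levels function a block header such as 'if a:' has to be a phrase of its own, whereas thrases also finds headers inside a phrase by splitting at colons.

```python
def levels(phrases):
    """Track the indentation level phrase by phrase."""
    level = 0
    trace = []
    for phrase in phrases:
        if phrase == "":
            level = max(0, level - 1)    # empty phrase: dedent, never below root
            continue
        text = phrase.strip()
        trace.append((level, text))
        if text.endswith(":"):
            level += 1                   # a block header opens a new level
    return level, trace


template = "if a:~~x~~~~y~~~~~~~~"
final_level, trace = levels(template.split("~~"))
```

The surplus tildes at the end do no harm: the level simply stays at the root.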

   Inserting Files   

As mentioned above, '#insert filename~~' will be replaced literally by the content of the file filename. This can be written anywhere in the template. The file has to be provided either with its path or by its basename only. In the latter case the search starts in the directory of the template file or, if the template was passed as a string, in the directory of the calling code.

The file to be inserted is searched for first in this starting directory and then, if not found, in its parent directories in ascending order. Thus a file containing function definitions, common titles or footers, or anything like this for a group of files can be saved in a directory above them and still be referenced by its basename only.
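The upward search can be sketched like this. find_upwards is a hypothetical helper and the directory names are made up; the loop in thrases is written differently but follows the same idea.

```python
import os
import shutil
import tempfile


def find_upwards(start_dir, basename):
    """Look for basename in start_dir, then in each parent directory in turn."""
    directory = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(directory, basename)
        if os.path.isfile(candidate):
            return candidate
        parent = os.path.dirname(directory)
        if parent == directory:          # reached the filesystem root
            raise IOError(basename + " not found")
        directory = parent


# A throwaway tree: <root>/site/header.htm and <root>/site/pages/
root = tempfile.mkdtemp()
pages = os.path.join(root, "site", "pages")
os.makedirs(pages)
expected = os.path.join(os.path.abspath(root), "site", "header.htm")
with open(expected, "w") as f:
    f.write("<h1>Header</h1>")

found = find_upwards(pages, "header.htm")   # found one level above 'pages'
shutil.rmtree(root)
```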

#insert C:\\project\\templates\\filename~~

(for Windows here) will be completely replaced by the bare content of 'C:\project\templates\filename'. Thus, in the frequent case that the file contains Python (like defs of thrases functions), its content should end with a double tilde. But portions of plain text can be inserted this way too. This is why '#insert ...~~' is not a complete phrase in thrases: it is 'open' to the front. You can do something like:

Sylvia's homepage is #insert C:\\project\\sites\\maintainers\\Sylvia;~~ where you can find ...

and also the more normal:

On <a href="#insert C:\\project\\sites\\maintainers\\Sylvia~~">Sylvia's homepage</a> you can find ...

to paste a piece of plain text into the plain text of your template. This would not work with comments: if no double tilde (followed by arbitrary white space) precedes the '#', it and the following characters are rendered.
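The purely textual nature of the replacement can be sketched as follows. expand_inserts and the pattern are assumptions made for illustration, and a dict stands in for the file system; thrases uses a similar but not identical regular expression and reads real files.

```python
import re

# Hypothetical pattern: '#insert', white space, a name, then the separator.
INSERT = re.compile(r"#insert\s+(.+?)~~")


def expand_inserts(template, read_file):
    """Replace every '#insert name~~' by whatever read_file(name) returns."""
    # A function as replacement keeps backslashes in the content untouched.
    return INSERT.sub(lambda m: read_file(m.group(1).strip()), template)


files = {"sylvia_url": "http://example.org/sylvia"}   # made-up content

template = ('On <a href="#insert sylvia_url~~">Sylvia\'s homepage</a> '
            'you can find ...')
expanded = expand_inserts(template, files.__getitem__)
```

Because this happens before the template is split into phrases, the inserted text may itself contain phrases and separators.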

   The python in thrases   

This mini Python is complete except for commands or functions that affect namespaces. There is no "class" statement in thrases, no "import", and neither "exec" nor "eval". But you have "def", nested too, and the creation of objects by assignment. Not by iterable unpacking, however: you can unpack of course, but the variables on the left side must already exist. And for functions the *-syntax for passing arguments is not supported. In the body of every function declared by "def" you have the full syntax of thrases with the three possible actions described above.

Another main restriction concerns the scope of variables declared in the template: they are all global to the template (without needing any 'global my_variable' statements deeper in the execution stack). The exception is function arguments, which stay local as usual. All variables and functions you declare explicitly anywhere in the template's code are attached to an instance self.usr, which is an attribute of your template object. Its class is empty (declared as 'class o: pass') and has no other purpose. Thus you are completely free in the choice of your variable names and safe from any side effects on the variables of the calling code.
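The effect of attaching template variables to an otherwise empty object can be sketched as below. qualify is a hypothetical one-pass variant using lookarounds, not the repeated substitution thrases performs, and the prefix 'usr.' stands in for 'q.usr.'. The sketch would also rewrite names inside string literals, which a real implementation has to avoid.

```python
import re


class Namespace(object):
    """Empty class; its instances only carry the template's variables."""
    pass


def qualify(source, names, prefix="usr."):
    """Prefix each bare occurrence of the given names with `prefix`."""
    for name in names:
        # Not preceded by '.' or a word character, not followed by one.
        pattern = r"(?<![.\w])" + re.escape(name) + r"(?!\w)"
        source = re.sub(pattern, prefix + name, source)
    return source


usr = Namespace()
caller_vars = {"usr": usr}

code = qualify("count = 1\ncount = count + 1", ["count"])
exec(code, caller_vars)      # runs 'usr.count = 1' and 'usr.count = usr.count + 1'
```

After the run the value lives on usr only; the dict of the calling code has gained no variable called count.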

On error handling: the try-and-except mechanisms involved in the choice between actions 1, 2 and 3 mentioned above will frequently make thrases render an erroneous Python expression literally to the output. The same holds for all Python statements. For exceptions raised while rendering, Python's standard messages have been slightly extended: on syntax errors, the last 20 lines before the line where the error is detected are printed from an intermediate representation of the template (q.exin), which is much easier than printing the related lines of the template at that point of processing. For all other exception types the last 10 rendered strings in the resulting deque (if existent) are printed additionally.
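The fallback to literal output can be sketched as follows. render_phrase is a hypothetical function that calls eval() at render time; thrases instead generates try/except/else code once, at construction. As with any use of eval(), this is only suitable for templates you trust.

```python
def render_phrase(phrase, namespace):
    """Render the value of an expression, or the phrase itself if that fails."""
    try:
        value = eval(phrase.strip(), {}, namespace)
    except Exception:
        return phrase            # erroneous or not Python: render literally
    return str(value)


data = {"price": 21}

ok = render_phrase("price * 2", data)
unknown = render_phrase("missing_name", data)   # NameError at evaluation
markup = render_phrase("<br>", data)            # SyntaxError at parsing
```

The drawback the paragraph above hints at is visible here: a typo in a variable name does not raise anything, it just shows up in the output.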

   Example   

An example from production code (as SciTE shows it; editors will have no problems with HTML formatting because of thrases). It renders just a <div> element, which is destined to be integrated into an external site. Thus CSS must be used inline here, but by templating we can still avoid repetitions.

# -*- coding: iso8859-1 -*-
~~#insert auXion_header.htm~~
<div align="center">
<div style = "text-align: left;
              padding:0px;
              background:lightcyan;
              width:80%; border:2px;
              border-style:ridge;
              border-color:#000080;
              font-family:Georgia, Sylfaen, Times New Roman;">
<br>~~
st = line[c_ict['lang']].strip()~~
if st[-1]==',': st = st[0:-1]~~~~
<div style="text-align:center; font-size:17pt;">~~st~~</div><br>~~
if kurz1:<div ~~largeLine~~>~~kurz1~~</div>~~~~
addedlabel('Genaues Stempeldatum: ', format_date(line[c_ict['dtstamp']]))~~
addedlabel('Erhaltung (preservation): ', line[c_ict['zustand']])~~
if line[c_ict['ftsize']]:
        sf = line[c_ict['ftarr']]~~
        if sf in ('geteilt', 'ungeteilt'):
            <div ~~Label~~>Format: </div>
            <div ~~Data~~>~~line[c_ict['ftsize']]~~, ~~sf~~e Rückseite</div>~~~~
        else:
            addedlabel('Format:', line[c_ict['ftsize']])~~     
            addedlabel('Rückseite:',sf)~~~~~~

addedlabel('Verlag (publisher): ', line[c_ict['verlag']])~~
addedlabel('Künstler (artist): ', line[c_ict['artist']])~~<br>~~


if line[c_ict['english']]:<div ~~largeLine~~>~~line[c_ict['english']]~~</div>~~
        addedlabel('Exact date of postmark:', format_date(line[c_ict['dtstamp']]))~~
        
        sfts = line[c_ict['ftsize']]~~
        if sfts=='klein': sfts='small'~~~~ elif sfts=='groß': sfts='large'~~~~~~
        addedlabel('Format: ', sfts)~~
        
        sfta = line[c_ict['ftarr']]~~
        if sfta=='geteilt': sfta = 'divided'~~~~ elif sfta=='ungeteilt': sfta='for address only'~~~~
        elif sfta.upper()=='RKP': sfta='no postcard'~~~~~~
        addedlabel('Back', sfta)~~</div>~~~~</div></div>

The second line inserts another thrases-formatted file:

Label = 'style ="float: left; padding: 4px 0px 0px 10px; width: 180px; "'~~
Label += 'font-family: Verdana; font-size: 10pt;"'~~
Data = 'style = " font-family: Sylfaen, Georgia; font-size: 13pt;"'~~
largeLine = 'style=" background: #f8d0d0; margin: 0px; width: 100%; text-align:center;'~~
largeLine += 'border-width: 3px 0px 3px 0px; border-style:solid; border-color: #ffffff;"'~~
largeLine += 'font-size:14pt; padding: 6px 0px;"'~~
def format_date(dte):
      if len(dte)>4:
          dte = dte[6:8]+'.'+dte[4:6]+'.'+dte[:4]~~
          if dte[3]=='0': dte = dte[:3]+dte[4:]~~~~
          if dte[0]=='0': dte=dte[1:]~~~~~~
      return dte~~~~
       
def addedlabel(SSS, TTT):
  if TTT: <div ~~Label~~>~~SSS~~ </div><div ~~Data~~>~~TTT~~</div>~~~~~~

(For those who try to understand this completely: format_date() is not used for the concrete div shown below, which doesn't come from a data set with a given "Genaues Stempeldatum (exact date of postmark)". It depends on premises about those data and can be neglected here. c_ict is a dict which maps column names in the underlying database to column numbers in ListView widgets. It is used throughout the project and makes adapting to changes in the database very easy.)

Thrases renders something like the following from this template and data sets describing historical picture postcards. It needs so many "ifs", and thus templating, because nearly all db fields could be empty. The function addedlabel() is central here, and its best place is in a template, not in the calling code.


Ein Kind stellt eine brennende Kerze an ein Kruzifix im Schnee

x um 1930-40
Erhaltung (preservation):
I
Format:
groß, geteilte Rückseite
Verlag (publisher):
Selbstverlag M. Spötl, Schwaz
Künstler (artist):
M. Spötl

A child puts a burning candlelight to a crucifix in the snow
Format:
large
Back:
divided

   Conclusion, code   

The __init__() might possibly be slow, but thrases does the rendering very fast. I replaced a Mako template in production code by an equivalent for thrases. At that time a unicode object was returned, and still thrases seemed to render faster. With encoded strings and rendering onto a file descriptor thrases is by far the fastest templating engine in the world, I hope.

[Note 10-02-05: This has probably changed with some of the recent template engines like Tenjin. Still there is this theoretical maximum: the time to copy the mere strings plus the time the Python interpreter needs for executing the included statements. And the overhead of thrases is small.]

It was great fun to make thrases. And here is the code. Good luck!

# Written by Joost Behrends
# and placed in the public domain

from StringIO import StringIO
from collections import deque
from inspect import currentframe, getouterframes
import re, os, sys, time

class TemplateError(Exception):
    """Exception raised by generated code from a template for thrases"""
    pass

class StringProtectingSplitter():
    """ 
    Designed for a split(), which splits only outside strings.
    Not a string method, because of unicode. Unlike split() a deque is returned.
    """ 
    r0 = 'u?""".*?"""'; r1 = "u?'''.*?'''"
    r2 = '[ru]?"[^"]*"'; r3 = "[ru]?'[^']*'"
    rxString = re.compile('(?:'+r0+'|'+r1+'|'+r2+'|'+r3+')')    
    
    def __init__(q, s):
        q.splitchars = s
        q.wd = len(s)

    def split(q, sarg):
        Ivalls = [t.span() for t in q.rxString.finditer(sarg)]
        lix = 0; ix = sarg.find(q.splitchars)
        result = deque()
        while ix > -1:
            while Ivalls and ix >= Ivalls[0][1]: del Ivalls[0]
            if Ivalls and ix < Ivalls[0][0] or not Ivalls:
                result.append(sarg[lix:ix]); lix = ix+q.wd
            ix = sarg.find(q.splitchars, ix+q.wd)
        if lix < len(sarg): result.append(sarg[lix:])
        return result
      
rsID = '[a-zA-Z_][a-zA-Z_0-9]*'; rxID = re.compile(rsID)
rsID0 = '^'+rsID; rxID0 = re.compile(rsID0)
rxFunc = re.compile(rsID0 + '\(.*\)$')
rxAssignment = re.compile('(' +rsID0+ ')([^=]*=\\s*[^=\\s]+)+')
rxCondPhrase = re.compile('(?:def|for|if|while|else|elif|try|except|finally)(?:\\s+.*?)?:')
        # def is in rxCondPhrase too: for correct splitting; MEMO: elif is "exclusive"
rxDeFrase = re.compile('^(?:def\\s+)('+rsID+')\\s*\(((?:\\s*'+rsID+'.*?)*)\)\\s*:')

def dummyfunc(*x): pass

class Template():
    """
    __init__ () reads its template from filename into a string
    and converts it to a representation of valid python code,
    called exin, which render() will pass to exec().

    Input will be split by sep into phrases. Three things can happen to them:
    - The phrase can be transformed into the command writing it to output.
      This happens to non-python.
    - The phrase is provided for evaluation by python, then the result
      will get the same transformation. This happens to python expressions.
    - Python statements are written unchanged to exin.
    compile is used for syntax checking only.

    render() simply passes exin to exec(), which then evaluates the actual
    values of the args.
    """

    def reinit(q):
        if os.stat(q.filename).st_mtime > q.constructed:
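            # pass incode by keyword: positionally it would land in 'sep'
            # (a custom sep is not stored, so reinit falls back to the default)
            q.__init__(q.filename, incode=q.incode)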
       
    def wripy(q, s, lv, silent=False):
        for name in q.Vars.keys():
            if not name in q._protectedVars:
                s = '@'+s+'@'
                while q.Vars[name].search(s): s = q.Vars[name].sub('\\1q.usr.\\2\\3', s)
                s=s[1:-1]
        if silent: q.exin.write('\t'*lv + s +'\n')
        elif q.incode: q.exin.write('\t'*lv +'OUT.write('+unicode(s, q.incode)+')\n')
        else: q.exin.write('\t'*lv +'OUT.write('+ s +')\n')

    def wrirect(q, s, lv):
        if isinstance(s, str) and q.incode: s = unicode(s, q.incode)
        if s.find('"""') > -1 or s.endswith('"'):
            lix=0; ix=s.find('"')
            while ix > -1:
                q.exin.write('\t'*lv + 'OUT.write("""' +s[lix:ix]+ '""")\n')
                q.exin.write('\t'*lv + """OUT.write('"')\n""")
                lix=ix+1; ix=s.find('"', lix)
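            # write the remainder as an output call too, not as raw text
            if s[lix:]: q.exin.write('\t'*lv + 'OUT.write("""' +s[lix:]+ '""")\n')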
        else:
            q.exin.write('\t'*lv + 'OUT.write("""' +s+ '""")\n')

    def __init__(q, filename, sep='~~', incode = None):
        q.incode = incode     # needed in q.wrirect()
        q.filename = filename; q.constructed = time.time()  
            # both only for reloading
        q.exin = StringIO()
        q.exin.write('OUT = q.out\n')
        q.Vars = {}; q.subscribedFuncs = {}
        q.protectedVars = {}; q._protectedVars = []
            # _protectedVars is the sum of the lists of protectedVars.values()
            # for levels greater or equal than the actual
        ByColon = StringProtectingSplitter(':')
        BySep = StringProtectingSplitter(sep)
        lv=0

        if len(filename)>256 or not os.path.exists(filename):
            q.inp = filename
            q.tmplName = ' input string '
            q.templateInputIsFile = False
        else:
            f = open(filename, 'rb'); q.inp = f.read(); f.close()
            if q.incode: q.inp = q.inp.decode(incode)
            q.tmplName = filename # just for messages of exceptions
            q.templateInputIsFile = True
        rxInsert = re.compile('#insert\\s+([^;]+)~~')
        for matching in rxInsert.finditer(q.inp):
            filename = matching.group(1)
            if not os.sep in filename:
                if q.templateInputIsFile: dir = os.path.dirname(q.tmplName)
                else: dir = os.path.dirname(getouterframes(currentframe())[1][1])
                if not filename in os.listdir(dir):
                    while os.sep in os.path.dirname(dir) or '/' in os.path.dirname(dir):
                        dir = os.path.dirname(dir)
                        if filename in os.listdir(dir): break
                        if os.sep not in os.path.dirname(dir) and \
                        '/' not in os.path.dirname(dir) or \
                        sys.platform == 'win32' and dir[1:3]==':\\':
                            raise IOError(filename + ' not found')
                            break
                filename = dir + os.sep + filename
            f = open(filename, 'r')
            q.inp = rxInsert.sub(f.read(), q.inp, 1); f.close()

        class o: pass
        q.usr = o()
        subqueue = deque(); noCond = False
        PhraseQueue = deque(q.inp.split(sep))
        while len(PhraseQueue):
            phrase = PhraseQueue.popleft()
            thrase = phrase.strip() # t for 'test'

            if thrase.startswith('#'): pass
            elif not phrase:
                if lv: lv -= 1
                for i in q.subscribedFuncs.keys():
                    if lv <= i:
                        q.exin.write('\t'*lv +'q.usr.'+q.subscribedFuncs[i] +'='+ \
                                                       q.subscribedFuncs[i] +'\n')
                        del q.subscribedFuncs[i]
                        q._protectedVars = \
                            q._protectedVars[:len(q._protectedVars)-len(q.protectedVars[i])]
                        q.protectedVars[i] = []
                    # all this happens only 'once'

            elif thrase.startswith(('[',']','/','+','*','(',')','%','<','>','"',"'",'=')) \
            or thrase=='':
                q.wrirect(phrase, lv)

            elif thrase == 'pass': q.wripy(thrase, lv, silent=True)
                
            elif thrase in ('break', 'continue') or \
            thrase.startswith(('assert', 'del', 'raise')):
                q.wripy('try:', lv, silent = True)
                q.wripy(thrase, lv+1, silent = True)
                q.wripy('except:', lv, silent = True)
                q.wrirect(phrase, lv+1)
                            
            elif thrase.startswith(('return', 'yield')):
                try:
                    compile('def t(): '+thrase, '', 'exec')
                    q.wripy(thrase, lv, silent=True)
                except:
                    q.wrirect(phrase, lv)
                    
            elif rxFunc.match(thrase):
                try:
                    compile(thrase, '', 'exec')
                    q.wripy(thrase, lv, silent = True)
                except:
                    q.wrirect(phrase, lv)
          
            elif not noCond and thrase.find(':') > -1:
                # care for nested conditional statements
                subqueue = ByColon.split(phrase)
                while len(subqueue) > 1:
                    s = subqueue.popleft()
                    thrase = s.strip() + ':'
                    if thrase.startswith('else') or thrase.startswith('elif'):
                        for_test = 'if x: pass\n' + thrase + ' pass'
                    elif thrase.startswith('except'):
                        for_test = 'try:pass\n' + thrase + ' pass'
                    elif thrase.startswith('finally'):
                        for_test = 'try:pass\nexcept:pass\n' + thrase + ' pass'
                    else:
                        for_test = thrase + ' pass'
                    try:
                        compile(for_test, '', 'exec')
                    except:
                        subqueue.appendleft(s+':'+subqueue.popleft())
                    else:
                        q.tNP = rxDeFrase.match(thrase)
                        if q.tNP:
                            name = q.tNP.group(1)
                            setattr(q.usr, name, dummyfunc)
                            q.wripy(thrase, lv, silent=True)
                            q.Vars[name] = \
                                re.compile('(.*[^\.a-zA-Z_0-9])('+ name+ ')(\\W.*)')
                            q.protectedVars[lv] = q.tNP.group(2).split(',')
                            q.protectedVars[lv] = \
                                filter(lambda x:rxID0.search(x), q.protectedVars[lv])
                            q.protectedVars[lv] = \
                                map(lambda x:rxID0.search(x).group(0), q.protectedVars[lv])
                            q._protectedVars += q.protectedVars[lv]
                            q.subscribedFuncs[lv] = name
                            lv += 1
                            phrase = phrase[len(s)+1: ]
                        elif rxCondPhrase.match(thrase):
                            q.wripy(thrase, lv, silent=True); lv += 1
                            phrase = phrase[len(s)+1: ]
                        else:
                            subqueue.appendleft(s+':'+subqueue.popleft())
                            # cares for slices
                else:
                    PhraseQueue.appendleft(phrase)
                    noCond = True
                    continue  # avoid 'noCond = False' at the end of the loop

            elif rxAssignment.match(thrase):
                try:
                    compile(thrase, '', 'exec')
                except:
                    q.wrirect(phrase, lv)
                else:
                    name = rxAssignment.search(thrase).group(1)
                    if name not in q._protectedVars:
                        q.wripy('q.usr.'+thrase, lv, silent=True)
                        if not hasattr(q.usr, name): setattr(q.usr, name, '')
                        q.Vars[name] = re.compile('(.*[^\\.a-zA-Z_0-9])('+ name+ ')(\\W.*)')
                    else:
                        q.wripy(thrase, lv, silent=True)
            else:
                try:
                    compile(thrase, '', 'eval')
                except:
                    q.wrirect(phrase, lv)
                else:
                    q.wripy('try:', lv, silent = True)
                    q.wripy('q.tNP = '+ thrase, lv+1, silent = True)
                    q.wripy('except:', lv, silent = True)
                    q.wrirect(phrase, lv+1)
                    q.wripy('else:', lv, silent = True)
                    q.wripy('q.tNP', lv+1)
            noCond = False
                                
    def render(q, locs, fd = None, reinit = False):
        """
        The dict locs gives render() full access to the variables therein.
        To warn still more explicitly: This means, that python code in the template
        can change the caller's locals(), when
        the argument for locs in the render call is locals().
        """
        if reinit and q.templateInputIsFile: q.reinit()
#        print q.exin.getvalue()
        if fd: q.out = fd
        else: q.out = StringIO()
        locs['q'] = q
        try:
            exec q.exin.getvalue() in locs
        except SyntaxError:
            import sys, traceback
            l = q.exin.getvalue().split('\n')
            t0, t1, dummy = sys.exc_info(); del dummy; t1 = repr(t1)
            message = '\n--<>-- ' + q.tmplName + ' --<>--' \
                    +  '\n--<>-- SyntaxError --<>--\n'
            try:
                i = int(re.findall('(\\d+)', t1, re.DOTALL)[0])
            except:
                message += t1
            else:
                ll = []; k = 20
                for j in range(min(i+1, len(l)-1), -1, -1):
                    if l[j].find('if q.out and q.out[-1]==";": q.out.pop()') == -1:
                        ll.insert(0, l[j])   # was ss[j]: ss is undefined here
                    k -= 1
                    if not k: break
                message += '\n'.join(ll[0:20-k])
            finally:
                raise TemplateError(message); return
        except:
            import sys
            t0, t1, dummy = sys.exc_info(); del dummy
            message = '\n--<>-- ' + q.tmplName + ' --<>--' \
                      + '\n' + repr(t1) + '\n'
            raise TemplateError(message); return