From 890e960b57697538512b45f676ef82a260c3d473 Mon Sep 17 00:00:00 2001 From: Sei Lisa Date: Fri, 13 Mar 2015 06:38:35 +0100 Subject: [PATCH] Implement Lazy List reading. Update docs according to last changes (in FS too). Adds a new tree node type, SUBIDX, which hopefully should never appear in actual output. If it does, it's prefixed with the string (MISSING TYPE) as a cue to the programmer. --- README.md | 52 +++++++++++------ lslopt/lslfoldconst.py | 13 +++++ lslopt/lsloutput.py | 3 + lslopt/lslparse.py | 128 +++++++++++++++++++++++++++-------------- 4 files changed, 135 insertions(+), 61 deletions(-) diff --git a/README.md b/README.md index 27c1b79..c29ba8b 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,7 @@ The aim of this program is to act as a filter that performs the optimizations au It also implements several syntax extensions to help improving the readability of scripts and the productivity of the programmer. It works well when combined with a C preprocessor such as _Boost::Wave_ (the one embedded in Firestorm) or `cpp`. -Firestorm does already incorporate an optimizer. However it is limited to removing unused global variables and functions, and does so by simple string analysis, not by syntactic analysis. In contrast, the program presented here does full syntax analysis and implements many more optimizations, including removing unused locals, simplifying many expressions, removing dead code, and more. +Firestorm does already incorporate an optimizer. However it is limited to removing unused global variables and functions, and does so by simple string analysis, not by syntactic analysis (e.g. if a local variable with the same name as a global is defined, the global isn't optimized out even if not used). In contrast, the program presented here does full syntax analysis and implements many more optimizations, including removing unused locals, simplifying many expressions, removing dead code, and more. 
## Syntax extensions @@ -123,42 +123,40 @@ These extensions are implemented for compatibility with the syntax extensions or ### `switch()` statements -Enables use of C-like `switch` statements. These produce very awkward code, hard to optimize, and evaluate the argument multiple times (as many as `case` labels are present). - -In Firestorm, this feature is implemented as a search-and-replace of text, with no regards for the semantic meaning of the text, so a fragment like this: -``` -llOwnerSay("Don't use switch(), just in case it doesn't work as you expect: it is buggy."); -``` -produces this unexpected output: -``` -[08:26:32] Object: Don't use {if(() == ( it doesn't work as you expect))jump c6hFK7; -, just in @c6hFK7; - it is buggy. -``` -This optimizer performs proper syntax analysis and processes the above example correctly. +Enables use of C-like `switch` statements. These produce very awkward code, hard to optimize, and the argument is evaluated multiple times (as many as `case` labels are present). The syntax of the `switch` statement as implemented, has two restrictions over its C counterpart: 1. `case` labels can't appear in nested blocks. That's because they are replaced by LSL labels, and as discussed in the *Multiple labels with the same name* section above, label scope rules prevent their visibility in an outer block, so once converted to labels, the corresponding `jump` instructions would not be able to find them. This limitation means that [Duff's device](https://en.wikipedia.org/wiki/Duff's_device) or similar constructs can't be implemented with this optimizer. 2. `switch()` needs to be followed by a block, not by a single statement; for example, whiile this works in C, it won't work in this optimizer: `switch(1) case 1: break;`. The reason is that `case` is treated by this parser as a statement, rather than as a label, making `break` be outside the `switch`. 
This limitation is probably only of theoretical importance and will not have any practical implication, since single-statement `switch` clauses are of no practical use (known to the author). +As an extension, and for compatibility with Firestorm, if there is a block beginning right after a `case` or `default` statement, the colon is optional. For example, all these are valid: +``` + switch (x) { case 1: ; default: ; } + switch (x) { case 1 {} default {} } +``` + ### Lazy lists -That's how Firestorm calls an extended syntax for assigning values to individual list elements. The syntax is: +That's what Firestorm calls an extended syntax for accessing individual list elements with a subindex. + +#### Assignment + +The syntax for assignment is: ``` mylist[index] = value; ``` -It is designed to be a shortcut for this: +which is designed to be roughly a shortcut for this: ``` mylist = llListReplaceList(mylist, [value], index, index); ``` -However, the implementation includes creating a function that performs the replacement. The function is called `lazy_list_set`. In Firestorm, this function can't be user-overriden: if you define your own, you'll get a duplicate identifier error. In this optimizer, if you define a function with this prototype: +However, the implementation includes creating a function that performs the replacement. The function is called `lazy_list_set`. It can be overridden by the user. If you define a function with this prototype: ``` list lazy_list_set(list target, integer index, list value) ``` -then the optimizer will use yours rather than defining it twice. +which returns the list with the element replaced, then the optimizer will use yours rather than defining it twice. -For compatibility with Firestorm, when the index is greater than the number of elements in the list, the intermediate values are filled with integer zeros. If you don't want that, you may have a reason to override it. 
But best is to stay away from this syntax altogether, as the counterpart of using `mylist[index]` to read an element doesn't work, because the system doesn't know what type to extract it as (i.e. which of the `llList2XXX` functions to use to do the extraction). +For compatibility with Firestorm, when the index is greater than the number of elements in the list, the intermediate values are filled with integer zeros. If you don't want that, you may have a reason to override it. Note that the value of the assignment as an expression is the whole list, not the element. For example, this will fail because it's assigning a list to an integer: ``` @@ -168,6 +166,22 @@ But this will work: ``` list a; integer b; a[5] = b = 4; ``` +#### Reading + +The syntax for reading an element is the same as for assigning, but it returns no type, therefore a type cast is mandatory. For example: + +``` + list a; integer b = (integer)a[3]; +``` +That is converted at parsing time to: +``` + list a; integer b = llList2Integer(a, 3); +``` +If the type it's cast to is list, it needs two parameters (starting and ending index), not one: +``` + list a; a = (list)a[3, 3]; +``` +That is a requirement of the underlying `llList2List` function used in this case. ## Using the program diff --git a/lslopt/lslfoldconst.py b/lslopt/lslfoldconst.py index 9207aeb..0bc8f9b 100644 --- a/lslopt/lslfoldconst.py +++ b/lslopt/lslfoldconst.py @@ -1029,6 +1029,19 @@ class foldconst(object): node['StSw'] = True return + if nt == 'SUBIDX': + # Recurse to every child. It's SEF if all children are. 
+ idx = 0 + issef = True + while idx < len(child): + self.FoldTree(child, idx) + if 'SEF' not in child[idx]: + issef = False + idx += 1 + if issef: + node['SEF'] = True + return + if nt == ';': node['SEF'] = True return diff --git a/lslopt/lsloutput.py b/lslopt/lsloutput.py index 6950f57..6efa3ed 100644 --- a/lslopt/lsloutput.py +++ b/lslopt/lsloutput.py @@ -298,6 +298,9 @@ class outscript(object): if nt == 'EXPRLIST': return self.OutExprList(child) + if nt == 'SUBIDX': + return '(MISSING TYPE)' + self.OutExpr(child[0]) + '[' + self.OutExprList(child[1:]) + ']' + assert False, 'Internal error: expression type "' + nt + '" not handled' # pragma: no cover def OutCode(self, node): diff --git a/lslopt/lslparse.py b/lslopt/lslparse.py index 9c91db4..a7c6cc5 100644 --- a/lslopt/lslparse.py +++ b/lslopt/lslparse.py @@ -193,6 +193,9 @@ class parser(object): unicode:'STRING_VALUE', Key:'KEY_VALUE', Vector:'VECTOR_VALUE', Quaternion:'ROTATION_VALUE', list:'LIST_VALUE'} + TypeToExtractionFunction = {'integer':'llList2Integer', + 'float':'llList2Float', 'string':'llList2String', 'key':'llList2Key', + 'vector':'llList2Vector', 'rotation':'llList2Rot', 'list':'llList2List'} # Utility function def GenerateLabel(self): @@ -619,19 +622,20 @@ class parser(object): | LIST_VALUE | TRUE | FALSE | vector_literal | rotation_literal | list_literal | PRINT '(' expression ')' | IDENT '(' expression_list ')' | lvalue '++' | lvalue '--' | assignment %if allowed + | IDENT '[' expression ']' '=' expression %if lazylists + | IDENT '[' expression ']' %if lazylists | lvalue vector_literal: '<' expression ',' expression ',' expression '>' rotation_literal: '<' expression ',' expression ',' expression ',' expression '>' list_literal: '[' optional_expression_list ']' - assignment: xlvalue '=' expression | lvalue '+=' expression + assignment: lvalue '=' expression | lvalue '+=' expression | lvalue '-=' expression | lvalue '*=' expression | lvalue '/=' expression | lvalue '%=' expression | lvalue '|=' 
expression %if extendedassignment | lvalue '&=' expression %if extendedassignment | lvalue '<<=' expression %if extendedassignment | lvalue '>>=' expression %if extendedassignment - xlvalue: lvalue | IDENT '[' expression ']' %if lazylists lvalue: IDENT | IDENT '.' IDENT """ tok0 = self.tok[0] @@ -741,6 +745,68 @@ class parser(object): raise EParseTypeMismatch(self) typ = sym['Type'] lvalue = {'nt':'IDENT', 't':typ, 'name':name, 'scope':sym['Scope']} + + # Lazy lists + if self.lazylists and tok0 == '[': + self.NextToken() + if typ != 'list': + raise EParseTypeMismatch(self) + idxexpr = self.Parse_optional_expression_list() + self.expect(']') + self.NextToken() + if self.tok[0] != '=' or not AllowAssignment: + return {'nt':'SUBIDX', 't':None, 'ch':[lvalue] + idxexpr} + + # Lazy list assignment + if len(idxexpr) != 1: + raise EParseFunctionMismatch(self) + if idxexpr[0]['t'] != 'integer': + raise EParseTypeMismatch(self) + idxexpr = idxexpr[0] + self.NextToken() + expr = self.Parse_expression() + rtyp = expr['t'] + # Define aux function if it doesn't exist + # (leaves users room for writing their own replacement, e.g. 
+ # one that fills with something other than zeros) + if 'lazy_list_set' not in self.symtab[0]: + self.PushScope() + paramscope = self.scopeindex + params = (['list', 'integer', 'list'], + ['L', 'i', 'v']) + self.AddSymbol('f', 0, 'lazy_list_set', Loc=self.usedspots, + Type='list', ParamTypes=params[0], ParamNames=params[1]) + self.AddSymbol('v', paramscope, 'L', Type='list') + self.AddSymbol('v', paramscope, 'i', Type='integer') + self.AddSymbol('v', paramscope, 'v', Type='list') + #self.PushScope() # no locals + + # Add body (apologies for the wall of text) + # Generated from this source: + ''' +list lazy_list_set(list L, integer i, list v) +{ + while (llGetListLength(L) < i) + L = L + 0; + return llListReplaceList(L, v, i, i); +} + ''' + self.tree[self.usedspots] = {'ch': [{'ch': [{'ch': [{'ch': [{'ch': [{'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'L'}], 'nt': 'FNCALL', 't': 'integer', 'name': 'llGetListLength'}, {'scope': paramscope, 'nt': 'IDENT', 't': 'integer', 'name': 'i'}], 'nt': '<', 't': 'integer'}, {'ch': [{'ch': [{'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'L'}, {'ch': [{'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'L'}, {'nt': 'CONST', 't': 'integer', 'value': 0}], 'nt': '+', 't': 'list'}], 'nt': '=', 't': 'list'}], 'nt': 'EXPR', 't': 'list'}], 'nt': 'WHILE', 't': None}, {'ch': [{'ch': [{'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'L'}, {'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'v'}, {'scope': paramscope, 'nt': 'IDENT', 't': 'integer', 'name': 'i'}, {'scope': paramscope, 'nt': 'IDENT', 't': 'integer', 'name': 'i'}], 'nt': 'FNCALL', 't': 'list', 'name': 'llListReplaceList'}], 'nt': 'RETURN', 't': None, 'LIR': True}], 'nt': '{}', 't': None, 'LIR': True}], 't': 'list', 'pnames': params[1], 'scope': 0, 'pscope': paramscope, 'nt': 'FNDEF', 'ptypes': params[0], 'name': 'lazy_list_set'} + self.usedspots += 1 + #self.PopScope() # no locals + self.PopScope() + + if expr['t'] is None: + raise 
EParseTypeMismatch(self) + if expr['t'] != 'list': + expr = {'nt':'CAST', 't':'list', 'ch':[expr]} + + return {'nt':'=', 't':'list', 'ch':[lvalue, { + 'nt':'FNCALL', 't':'list', 'name':'lazy_list_set', + 'scope':0, + 'ch':[lvalue.copy(), idxexpr, expr] + }]} + if tok0 == '.': self.NextToken() self.expect('IDENT') @@ -756,8 +822,8 @@ class parser(object): raise EParseTypeMismatch(self) return {'nt':'V++' if tok0 == '++' else 'V--', 't':lvalue['t'], 'ch':[lvalue]} if AllowAssignment and (tok0 in self.assignment_toks - or self.extendedassignment and tok0 in self.extassignment_toks - or self.lazylists and tok0 == '['): + or self.extendedassignment + and tok0 in self.extassignment_toks): if tok0 == '[': if lvalue['nt'] != 'IDENT': raise EParseSyntax(self) @@ -783,44 +849,6 @@ class parser(object): if tok0 != '*=' or typ == 'float': expr = self.autocastcheck(expr, typ) rtyp = typ - # Lazy list handler - if tok0 == '[': - # Define aux function if it doesn't exist - # (leaves users room for writing their own replacement, e.g. 
- # one that fills with something other than zeros) - if 'lazy_list_set' not in self.symtab[0]: - self.PushScope() - paramscope = self.scopeindex - params = (['list', 'integer', 'list'], - ['L', 'i', 'v']) - self.AddSymbol('f', 0, 'lazy_list_set', Loc=self.usedspots, - Type='list', ParamTypes=params[0], ParamNames=params[1]) - self.AddSymbol('v', paramscope, 'L', Type='list') - self.AddSymbol('v', paramscope, 'i', Type='integer') - self.AddSymbol('v', paramscope, 'v', Type='list') - #self.PushScope() # no locals - - # Add body (apologies for the wall of text) - # Generated from this source: - ''' -list lazy_list_set(list L, integer i, list v) -{ - while (llGetListLength(L) < i) - L = L + 0; - return llListReplaceList(L, v, i, i); -} - ''' - self.tree[self.usedspots] = {'ch': [{'ch': [{'ch': [{'ch': [{'ch': [{'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'L'}], 'nt': 'FNCALL', 't': 'integer', 'name': 'llGetListLength'}, {'scope': paramscope, 'nt': 'IDENT', 't': 'integer', 'name': 'i'}], 'nt': '<', 't': 'integer'}, {'ch': [{'ch': [{'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'L'}, {'ch': [{'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'L'}, {'nt': 'CONST', 't': 'integer', 'value': 0}], 'nt': '+', 't': 'list'}], 'nt': '=', 't': 'list'}], 'nt': 'EXPR', 't': 'list'}], 'nt': 'WHILE', 't': None}, {'ch': [{'ch': [{'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'L'}, {'scope': paramscope, 'nt': 'IDENT', 't': 'list', 'name': 'v'}, {'scope': paramscope, 'nt': 'IDENT', 't': 'integer', 'name': 'i'}, {'scope': paramscope, 'nt': 'IDENT', 't': 'integer', 'name': 'i'}], 'nt': 'FNCALL', 't': 'list', 'name': 'llListReplaceList'}], 'nt': 'RETURN', 't': None, 'LIR': True}], 'nt': '{}', 't': None, 'LIR': True}], 't': 'list', 'pnames': params[1], 'scope': 0, 'pscope': paramscope, 'nt': 'FNDEF', 'ptypes': params[0], 'name': 'lazy_list_set'} - self.usedspots += 1 - #self.PopScope() # no locals - self.PopScope() - - return {'nt':'=', 't':'list', 
'ch':[lvalue, { - 'nt':'FNCALL', 't':'list', 'name':'lazy_list_set', - 'scope':0, - 'ch':[lvalue.copy(), idxexpr, - {'nt':'LIST','t':'list', 'ch':[expr]}] - }]} # Lots of drama for checking types. This is pretty much like # addition, subtraction, multiply, divide, etc. all in one go. @@ -958,6 +986,22 @@ list lazy_list_set(list L, integer i, list v) else: expr = self.Parse_unary_postfix_expression(AllowAssignment = False) basetype = expr['t'] + if self.lazylists and basetype is None and expr['nt'] == 'SUBIDX': + fn = self.TypeToExtractionFunction[typ] + sym = self.FindSymbolFull(fn) + if sym is None: + # in the unlikely event that the underlying function is not + # defined in builtins.txt, throw a syntax error (making a + # new exception just for this seems overkill, and throwing + # an unknown identifier error would be confusing) + raise EParseSyntax(self) + fnparamtypes = sym['ParamTypes'] + subparamtypes = [x['t'] for x in expr['ch']] + if fnparamtypes != subparamtypes: + raise EParseFunctionMismatch(self) + return {'nt':'FNCALL', 't':sym['Type'], 'name':fn, 'scope':0, + 'ch':expr['ch']} + if typ == 'list' and basetype in self.types \ or basetype in ('integer', 'float') and typ in ('integer', 'float', 'string') \ or basetype == 'string' and typ in self.types \