Sunday, June 8, 2014

Weekly Log: June 2 - June 8

This week I looked at the Java Compiler Compiler (JavaCC) classes used for infix parsing, in order to add the selector function and vector constructor that the arrays package needs. JavaCC handles both lexical analysis and parsing.

Lexical analysis splits a sequence of characters into tokens. We describe the tokens with regular expressions, from which JavaCC constructs an automaton to match them, and we define the grammar rules that the parser applies to the resulting token sequence. For example, suppose we receive "3+3" and define the following rules:

<Number> ::= [1-9][0-9]*
<Op> ::= "+" | "-" | "*" | "/"
<Expression> ::= <Number> <Op> <Number>

Then, that particular string will turn into:

(NUMBER 3) (OP +) (NUMBER 3)

and this sequence will be turned into an expression.
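
To make the token stream concrete, here is a minimal hand-written sketch of the same idea in plain Java. It is not what JavaCC generates; the class name TinyLexer and its methods are made up for illustration, and it assumes the input contains no whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical hand-written tokenizer for the three rules above.
// JavaCC would generate its own token manager from the token definitions.
class TinyLexer {
    // Group "num" follows <Number>, group "op" follows <Op>.
    private static final Pattern TOKEN =
        Pattern.compile("(?<num>[1-9][0-9]*)|(?<op>[+\\-*/])");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Only look for a token that starts exactly at the current position.
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at position " + pos);
            }
            if (m.group("num") != null) {
                tokens.add("(NUMBER " + m.group("num") + ")");
            } else {
                tokens.add("(OP " + m.group("op") + ")");
            }
            pos = m.end();
        }
        return tokens;
    }

    // <Expression> ::= <Number> <Op> <Number>
    static boolean isExpression(List<String> tokens) {
        return tokens.size() == 3
            && tokens.get(0).startsWith("(NUMBER")
            && tokens.get(1).startsWith("(OP")
            && tokens.get(2).startsWith("(NUMBER");
    }
}
```

Calling TinyLexer.tokenize("3+3") yields the three tokens listed above, and isExpression accepts that list because it has the shape NUMBER OP NUMBER.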

For the selector and vector, I had to define four tokens:

< LEFT_BRACES : "{" >

< RIGHT_BRACES : "}" >

< LEFT_BRACKET : "[" >

< RIGHT_BRACKET : "]" >



The parsing rules are defined as follows:

For vector:
< LEFT_BRACES > node ( "," node )* < RIGHT_BRACES >
< LEFT_BRACES > < RIGHT_BRACES >

For selector:
id < LEFT_BRACKET > node < RIGHT_BRACKET >

where node can be anything JSBML supports (numbers, expressions, strings, and so on).
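
The two rules can be sketched as a small recursive-descent recognizer in plain Java. This is only an illustration under simplifying assumptions, not the JavaCC grammar itself: the class name ArraysSyntaxSketch is made up, and "node" is reduced to integers, identifiers, function calls, parentheses, and the four arithmetic operators, which is far less than JSBML actually supports.

```java
// Hypothetical recognizer; the simplified "node" grammar is an assumption.
class ArraysSyntaxSketch {
    private final String s;
    private int pos = 0;

    private ArraysSyntaxSketch(String s) { this.s = s; }

    /** Returns true if the whole input is a well-formed node. */
    static boolean isValid(String input) {
        ArraysSyntaxSketch p = new ArraysSyntaxSketch(input);
        try {
            p.node();
            p.skipSpaces();
            return p.pos == input.length();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // node := term (('+' | '-') term)*
    private void node() {
        term();
        while (peek() == '+' || peek() == '-') { pos++; term(); }
    }

    // term := factor (('*' | '/') factor)*
    private void term() {
        factor();
        while (peek() == '*' || peek() == '/') { pos++; factor(); }
    }

    // factor := '-' factor | number | '(' node ')' | vector | id [call | selector]
    private void factor() {
        char c = peek();
        if (c == '-') {
            pos++;
            factor();
        } else if (Character.isDigit(c)) {
            while (pos < s.length() && Character.isDigit(s.charAt(pos))) pos++;
        } else if (c == '(') {
            pos++;
            node();
            expect(')');
        } else if (c == '{') {                 // vector: "{" [node ("," node)*] "}"
            pos++;
            list('}');
        } else if (Character.isLetter(c)) {
            while (pos < s.length() && Character.isLetterOrDigit(s.charAt(pos))) pos++;
            if (peek() == '(') {               // function call such as cos(x)
                pos++;
                list(')');
            } else if (peek() == '[') {        // selector: id "[" node "]"
                pos++;
                node();
                expect(']');
            }
        } else {
            throw new IllegalStateException("unexpected input at " + pos);
        }
    }

    // Parses an optional comma-separated list of nodes followed by the closing character.
    private void list(char close) {
        if (peek() == close) { pos++; return; }
        node();
        while (peek() == ',') { pos++; node(); }
        expect(close);
    }

    private void expect(char c) {
        if (peek() != c) throw new IllegalStateException("expected " + c + " at " + pos);
        pos++;
    }

    // Skips spaces and returns the next character, or '\0' at end of input.
    private char peek() {
        skipSpaces();
        return pos < s.length() ? s.charAt(pos) : '\0';
    }

    private void skipSpaces() {
        while (pos < s.length() && s.charAt(pos) == ' ') pos++;
    }
}
```

With this sketch, ArraysSyntaxSketch.isValid("{1, 2, 3}") and isValid("Y[1]") return true, while a trailing comma, a selector without an identifier, or an empty selector is rejected because factor() finds no node where the rule requires one.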

For example, these are all valid expressions:
  • {}
  • {1, 2, 3}
  • {{1, 2, 3}, {4, 5, 6}}
  • {cos(x), sin(y)}
  • Y[1]
  • Y[3*1-(31-67)+5]
These are not valid:
  • {1, 3,} 
  • [i]
  • y[]
However, the parser still accepts things such as:
  • Y[-1], which references a negative index
  • {{1},{1,2}}, a vector of vectors of different sizes
  • and many others
The parser does not reject these, but we would like to be able to tell whether the user has constructed a valid model. Therefore, I would like to write a piece of code that validates a model (only the arrays constructs, not all of SBML core). My next task is to come up with a list of validation rules and figure out how to implement them in the code.
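
As a starting point, the two problem cases listed above could be checked along these lines. This is only a rough sketch: the rules themselves are still undecided, the class ArraysValidationSketch is hypothetical, and it models a vector as a nested java.util.List rather than using any JSBML classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checks; the real validation rules are not defined yet.
class ArraysValidationSketch {
    /** Assumed rule: a constant selector index must not be negative. */
    static boolean isValidIndex(int index) {
        return index >= 0;
    }

    /**
     * Returns the dimensions of a nested vector, e.g. [2, 3] for {{1,2,3},{4,5,6}},
     * or null if sibling elements have different shapes (a ragged vector).
     * Anything that is not a List is treated as a scalar with an empty shape.
     */
    static List<Integer> shape(Object node) {
        if (!(node instanceof List)) {
            return new ArrayList<>();
        }
        List<?> vector = (List<?>) node;
        List<Integer> inner = null;
        for (Object element : vector) {
            List<Integer> elementShape = shape(element);
            // Ragged if a child is ragged or differs from its siblings.
            if (elementShape == null || (inner != null && !inner.equals(elementShape))) {
                return null;
            }
            inner = elementShape;
        }
        List<Integer> result = new ArrayList<>();
        result.add(vector.size());
        if (inner != null) {
            result.addAll(inner);
        }
        return result;
    }
}
```

Under these assumed rules, isValidIndex(-1) is false and shape returns null for {{1},{1,2}}, so both cases would be reported instead of silently accepted.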
