Monday, June 30, 2014

Weekly Log: June 23 - June 29

I continued implementing validation for the arrays package in JSBML. Over the past week, I wrote a compiler that evaluates the expression held in an ASTNode object, so that I can determine whether any index math goes out of bounds. The compiler accepts a map from ids to values, where each entry corresponds to an array dimension of the array object being referenced. By changing the value bound to an id in the map, I can compute the lower and upper bounds of the index into a given array.
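As a rough illustration of the idea (not the JSBML compiler itself), the sketch below evaluates a tiny expression tree against an id-to-value map. The tuple-based tree, the `evaluate` function, and the id `d0` are hypothetical names chosen for this example only.

```python
import operator

# Hypothetical mini-AST: a number, an id (string), or a tuple (op, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(node, values):
    """Evaluate an expression tree, looking ids up in the `values` map."""
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):
        return values[node]  # raises KeyError if the id is not bound
    op, left, right = node
    return OPS[op](evaluate(left, values), evaluate(right, values))


# Index math "d0 + 1", where d0 ranges over a dimension of size 4 (0..3).
index_math = ("+", "d0", 1)
size = 4
low = evaluate(index_math, {"d0": 0})
high = evaluate(index_math, {"d0": size - 1})
```

Rebinding `d0` to the first and last position of the dimension gives the two end points of the index, which can then be compared against the size of the referenced array.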

It is important to note that only statically computable indices are allowed. This means that if an id appears in the math, the object it refers to must be constant. Even with this restriction, determining whether an index goes out of bounds is not easy. Right now, I assume that all index math is monotonically increasing, so I only need to evaluate the function at its end points. However, this assumption does not always hold. The math may contain selectors, piecewise functions, trigonometric functions, and so on, so the extreme values can occur anywhere between the end points. In general, this means that every point must be evaluated.
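To see why the end points are not enough, here is a small sketch comparing the two strategies on a non-monotonic index function. The function names and the example math `i * (4 - i)` are made up for illustration; indices are assumed to be valid when they lie in `0 .. target_size - 1`.

```python
def endpoint_bounds(f, size):
    """Bounds assuming f is monotonic: only the first and last points are evaluated."""
    values = (f(0), f(size - 1))
    return min(values), max(values)


def exhaustive_bounds(f, size):
    """Bounds obtained by evaluating f at every position of the dimension."""
    values = [f(i) for i in range(size)]
    return min(values), max(values)


def in_bounds(bounds, target_size):
    low, high = bounds
    return 0 <= low and high < target_size


def index_math(i):
    # Non-monotonic: rises to a peak at i = 2 and falls again.
    return i * (4 - i)


by_endpoints = endpoint_bounds(index_math, 5)
by_all_points = exhaustive_bounds(index_math, 5)
```

For a referenced array of size 4, the end-point check accepts this index math while the exhaustive check rejects it, because the peak at `i = 2` produces index 4.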

As of revision 20642, I have done validation for the following rules:

10206, 10207, 10208, 10209, 10210, 20103, 20104, 20107, 20108, 20111, 20112, 20114, 20202, 20204, 20205, 20302, 20305, 20307, 20308.

There are two rules that say Dimension and Index objects can only have the attribute arrayDimension set to 0, 1, or 2, but since this limitation will likely be removed in a later version, I am not checking these rules.

This week, I am going to do more testing of the validation. I also need to implement one more validation rule, which checks that the arguments of binary and n-ary operations match in number of dimensions (rule 10211). Finally, I am going to start thinking about the pseudocode for the flattening routine.

Wednesday, June 25, 2014

Monday, June 23, 2014

Weekly Log: June 16 - June 22

The C++ SBML library, libsbml, is quite extensive in terms of validation, and it would be infeasible to replicate everything libsbml validates in the time frame I have. So my focus is only on the arrays package.

This week I started coding the arrays package validation. The validation section of the arrays specification groups the rules into categories. For instance, there are lists of rules that apply to extended SBase objects, dimension objects, index objects, and math, among others. I tried to be as consistent as possible with the validator in libsbml. The validator I am implementing works as follows: there is a validator for each validation category (sbase, dimension, index...), and each validator checks a set of constraints, where each constraint checks a certain validation rule (e.g. a dimension size should point to a valid parameter that is scalar, constant, and has a non-negative integer value). If a constraint is violated, an error is reported. The error describes what is wrong with the document being checked. Each validator returns a list of errors. The arrays validator calls the validator for each validation category.
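The structure described above can be sketched roughly as follows. This is only an outline of the design, not the JSBML or libsbml code: the class names, the dictionary-based model, and the label `dimension-size` are all hypothetical, and the example constraint omits the scalar check for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Constraint:
    rule_id: str                            # placeholder label, not a real rule number
    check: Callable[[dict], Optional[str]]  # returns an error message, or None if satisfied


class Validator:
    """Checks one validation category (e.g. dimensions) against its constraints."""

    def __init__(self, constraints: List[Constraint]):
        self.constraints = constraints

    def validate(self, model: dict) -> List[str]:
        errors = []
        for constraint in self.constraints:
            message = constraint.check(model)
            if message is not None:
                errors.append(f"{constraint.rule_id}: {message}")
        return errors


def dimension_size_is_valid(model: dict) -> Optional[str]:
    """Each dimension size must name a constant parameter with a non-negative integer value."""
    for dimension in model["dimensions"]:
        size_id = dimension["size"]
        parameter = model["parameters"].get(size_id)
        if parameter is None:
            return f"size '{size_id}' does not point to a parameter"
        value = parameter["value"]
        if not parameter["constant"] or value < 0 or value != int(value):
            return f"parameter '{size_id}' must be a constant non-negative integer"
    return None


def validate_arrays(model: dict, validators: List[Validator]) -> List[str]:
    """Top-level validator: runs every category validator and collects the errors."""
    errors = []
    for validator in validators:
        errors.extend(validator.validate(model))
    return errors


dimension_validator = Validator([Constraint("dimension-size", dimension_size_is_valid)])
good = {"parameters": {"n": {"value": 3, "constant": True}}, "dimensions": [{"size": "n"}]}
bad = {"parameters": {"n": {"value": -1, "constant": True}}, "dimensions": [{"size": "n"}]}
```

Calling `validate_arrays(good, [dimension_validator])` yields an empty list, while the `bad` model produces one error message for the negative size.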

So far, these are the things being validated:
  • Only certain SBase objects can have a listOfDimensions;
  • For listOfDimensions and listOfIndices, the array dimension of each object in the list should be unique;
  • For listOfDimensions and listOfIndices, having an object with array dimension n implies that there are objects with array dimensions 0..n-1;
  • Dimension size should point to a parameter. This parameter should be a scalar with a constant non-negative integer value;
  • The first argument of a selector function should point to a vector or an array object;
  • Vectors should be regular, not ragged.
There is still some validation left to do, and this will keep me busy for some time.
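Two of the checks in the list above are easy to sketch on plain Python data. The helpers below are illustrative only: array dimensions are assumed to be given as a list of integers, and vectors as nested lists.

```python
def array_dimensions_ok(array_dims):
    """Array dimensions must be unique and cover 0..n-1 with no gaps."""
    return sorted(array_dims) == list(range(len(array_dims)))


def shape(vector):
    """Return the shape of a nested list as a tuple, or None if it is ragged."""
    if not isinstance(vector, list):
        return ()  # a scalar has an empty shape
    child_shapes = {shape(child) for child in vector}
    if None in child_shapes or len(child_shapes) > 1:
        return None  # a child is ragged, or the children disagree in shape
    inner = child_shapes.pop() if child_shapes else ()
    return (len(vector),) + inner


def is_regular(vector):
    return shape(vector) is not None
```

For example, `[0, 2]` fails the first check because dimension 1 is missing, and `[[1], [1, 2]]` is reported as ragged because its two rows have different lengths.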

Monday, June 16, 2014

Weekly Log: June 9 - June 15

What is a good model? What is a bad model? What is a valid model? These questions are raised whenever a new package is proposed. Every package has its own specification, which contains a lot of useful information. For example, it describes the purpose of the package, what it contains, how it is constructed, and what you can and cannot do with it. So far, I have implemented the arrays package in JSBML in accordance with the latest specification. Though you can construct models with the arrays package, JSBML does not enforce that they are valid. This means that nothing prevents you from constructing invalid models. At this point you may ask yourself what is allowed and what is not allowed in the arrays package.

A valid model is one that does not violate any of the validation rules listed in the spec. Last week I worked mainly on documentation, since the arrays package lacked validation rules. I updated the arrays specification and wrote a list of validation rules. For example, the arrays package is currently limited to at most three array dimensions. This means that if you add a species with four dimensions, you violate the specification. If you are interested in knowing what the arrays validation rules are, let me know. The arrays package is still in development, so the spec is not yet finalized and is subject to change. Hopefully, my work helps the community reach a consensus.


It is quite difficult to enforce the validity of SBML models. A modeling tool may allow the user to do something that violates the specification, and nothing prevents a user from going into the SBML file and changing something that completely ruins the model. Therefore, I am implementing a validator that conforms to the validation rules and automatically checks the validity of a model. More details to come.

Sunday, June 8, 2014

Weekly Log: June 2 - June 8

This week I looked at the Java Compiler Compiler (JavaCC) classes for the infix parsing of the selector function and vector constructor necessary for the arrays package. We can do lexical analysis and parsing using JavaCC.

In lexical analysis, we tokenize a sequence of characters. After the input is tokenized, it is parsed according to a grammar that we define; JavaCC constructs an automaton for matching the regular expressions that describe the tokens. For example, if we receive "3+3" and we define rules such as:

<Number> ::= [1-9][0-9]*
<Op> ::= "+" | "-" | "*" | "/"
<Expression> ::= <Number> <Op> <Number>

Then, that particular string will turn into:

(NUMBER 3) (OP +) (NUMBER 3)

and this sequence will be turned into an expression.
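The same two-step process can be mimicked in a few lines of Python. This is only an analogy using Python's `re` module, not JavaCC; the names `TOKEN_SPEC`, `tokenize`, and `is_expression` are invented for the example.

```python
import re

# Token definitions mirroring the <Number> and <Op> rules above.
TOKEN_SPEC = [("NUMBER", r"[1-9][0-9]*"), ("OP", r"[+\-*/]")]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Split the input into (kind, text) pairs, failing on unknown characters."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def is_expression(tokens):
    """Match the rule <Expression> ::= <Number> <Op> <Number>."""
    return [kind for kind, _ in tokens] == ["NUMBER", "OP", "NUMBER"]


tokens = tokenize("3+3")
```

Here `tokens` holds the NUMBER, OP, NUMBER sequence shown above, and `is_expression(tokens)` confirms that it matches the expression rule.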

For the selector and vector, I had to define four tokens:

< LEFT_BRACES : "{" >

< RIGHT_BRACES : "}" >

< LEFT_BRACKET : "[" >

< RIGHT_BRACKET : "]" >



The parsing rules are defined as follows:

For vector:
< LEFT_BRACES > node (, node)* < RIGHT_BRACES >
< LEFT_BRACES >  < RIGHT_BRACES >

For selector:
id < LEFT_BRACKET > node < RIGHT_BRACKET >

where node can be anything JSBML supports (number, expressions, strings...).
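A hand-written recursive-descent recognizer gives a feel for how these two rules behave. It is a simplified sketch rather than the JavaCC grammar: `node` is restricted here to unsigned integers, ids, vectors, and selectors, so arithmetic and function calls inside a node are deliberately not covered, and all names are hypothetical.

```python
import re


class Parser:
    def __init__(self, text):
        # Numbers, ids, or any single non-space symbol; whitespace is skipped.
        self.tokens = re.findall(r"\d+|[A-Za-z_]\w*|\S", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def node(self):
        token = self.peek()
        if token == "{":
            self.vector()
        elif token is not None and re.fullmatch(r"\d+", token):
            self.pos += 1
        elif token is not None and re.fullmatch(r"[A-Za-z_]\w*", token):
            self.pos += 1
            if self.peek() == "[":  # selector: id [ node ]
                self.expect("[")
                self.node()
                self.expect("]")
        else:
            raise SyntaxError(f"unexpected token {token!r}")

    def vector(self):
        # vector: { node (, node)* }  or the empty vector { }
        self.expect("{")
        if self.peek() != "}":
            self.node()
            while self.peek() == ",":
                self.expect(",")
                self.node()
        self.expect("}")


def is_valid(text):
    parser = Parser(text)
    try:
        parser.node()
        return parser.peek() is None  # all input must be consumed
    except SyntaxError:
        return False
```

Under these assumptions, `is_valid("{{1, 2, 3}, {4, 5, 6}}")` and `is_valid("Y[1]")` succeed, while a trailing comma, a bare bracket, or an empty selector is rejected.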

For example, these are all valid math expressions:
  • {}
  • {1, 2, 3}
  • {{1, 2, 3}, {4, 5, 6}}
  • {cos(x), sin(y)}
  • Y[1]
  • Y[3*1-(31-67)+5]
These are not valid:
  • {1, 3,} 
  • [i]
  • y[]
However, you can still do things such as:
  • Y[-1], which references a negative index
  • {{1},{1,2}}, a vector of vectors of different sizes
  • and many others
The parser does not prevent these, but we would like to be able to tell whether the user has constructed a valid model. Therefore, I would like to write a piece of code that validates a model (not all of SBML core, just the arrays package). My next task is to come up with a list of validation rules and figure out how to implement them in the code.

Monday, June 2, 2014

Weekly Log: May 26 - June 1

For the past week, I have been playing with the math for the arrays package. To fully incorporate the math into JSBML, two steps are needed: implementing the functions/types in JSBML and implementing the infix parsing of the math. The math extension for the arrays package is the following:

constructors: vector
element reference operator: selector
qualifier components: bvar, lowlimit, uplimit, condition
sum product operators: sum, product
quantifier operators: forall, exists
statistics operators: mean, sdev, variance, median, mode, moment, momentabout

Right now I am going to focus on vector and selector; the remaining math is left for future releases. Vector, as the name suggests, is used to create a vector. In JSBML, a vector node is an ordered collection of ASTNodes. To get an element of an array or vector, the selector function is needed. The selector function takes an array/vector and the index math that references a certain element.
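The intended behaviour of the two functions can be pictured with nested Python lists standing in for vector nodes. This is a sketch of the semantics only, assuming one index per dimension and zero-based indexing; the `selector` helper is not the JSBML implementation.

```python
def selector(array, *indices):
    """Pick one element of a nested list, applying one index per dimension."""
    element = array
    for index in indices:
        # Reject negative and too-large indices explicitly; Python would
        # otherwise wrap negative indices around to the end of the list.
        if not 0 <= index < len(element):
            raise IndexError(f"index {index} is out of bounds")
        element = element[index]
    return element


matrix = [[1, 2, 3], [4, 5, 6]]  # plays the role of {{1, 2, 3}, {4, 5, 6}}
picked = selector(matrix, 1, 2)
```

With these definitions `picked` is the last element of the second row, and `selector(matrix, -1)` raises an `IndexError` instead of silently wrapping around.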

Vector and selector are already implemented. Selector was done by one of the mentors, Nico Rodriguez, and I did the vector. To create a vector ASTNode, you can specify vector(0, 1, 2, 3, 4), which is equivalent to the following MathML:

  <vector>
    <cn type="integer"> 0 </cn>
    <cn type="integer"> 1 </cn>
    <cn type="integer"> 2 </cn>
    <cn type="integer"> 3 </cn>
    <cn type="integer"> 4 </cn>
  </vector>
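The mapping from a list of integers to this MathML is mechanical, as the following sketch shows. It uses Python's standard `xml.etree.ElementTree` rather than JSBML's own MathML writer, and the function name `vector_to_mathml` is made up for the example.

```python
import xml.etree.ElementTree as ET


def vector_to_mathml(values):
    """Build a <vector> element with one integer <cn> child per value."""
    vector = ET.Element("vector")
    for value in values:
        cn = ET.SubElement(vector, "cn", {"type": "integer"})
        cn.text = f" {value} "
    return ET.tostring(vector, encoding="unicode")


mathml = vector_to_mathml([0, 1, 2, 3, 4])
```

The resulting string contains the same elements as the listing above, written on a single line without indentation.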

The vector constructor was tested and it seems to be working fine. Once I got this to work, I moved on to infix parsing. The code is there already, but it needs more testing. More details to come.