This chapter describes LME (Lightweight Matrix Engine), the interpreter for numerical computing used by Sysquake.
An LME program, or a code fragment typed at a command line, is composed of statements. A statement can be a simple expression, a variable assignment, or a programming construct. Statements are separated by commas, semicolons, or line ends. A line end has the same meaning as a comma, unless the line ends with a semicolon. When a simple expression or an assignment is followed by a comma (or a line end), its result is displayed to the standard output; when it is followed by a semicolon, no output is produced. The separator which follows a programming construct does not matter.
When typed at the command line, the result of a simple expression is assigned to the variable ans; this makes it easy to reuse intermediate results in successive expressions.
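As a minimal sketch of these display rules at the command line (the variable names are arbitrary and chosen only for illustration):

```
1 + 2                   % followed by a line end: the result is displayed and assigned to ans
ans * 10                % reuses the previous result (3), giving 30
x = 5;                  % followed by a semicolon: nothing is displayed
y = x + 1, z = 2 * y;   % y is displayed (comma), z is not (semicolon)
```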
A statement can span several lines, provided all the lines but the last one end with three dots. For example,
    1 + ...
    2
is equivalent to 1 + 2. After the three dots, the remainder of the line, as well as empty lines and lines which contain only spaces, are ignored.
Unless it is part of a string enclosed between single ticks, a single percent character or two slash characters mark the beginning of a comment, which continues until the end of the line and is ignored by LME. When a line ends with continuation characters, the comment must come after them.
    a = 2;   % comment at the end of a line
    x = 5;   // another comment
    % comment spanning the whole line
    b = ...  % comment after the continuation characters
      a;
    a = 3%   no need to put spaces before the percent sign
    s = '%'; % percent characters in a string
Comments may also be enclosed between /* and */; in that case, they can span several lines.
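A short sketch of a comment enclosed between /* and */, followed by an ordinary statement (the variable name is arbitrary):

```
/* This comment spans
   several lines and is ignored by LME. */
a = 2;
```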
Pragmas are directives for the LME compiler. They can be placed at the same locations as LME statements, i.e. on separate lines or between semicolons or commas. They have the following syntax:
_pragma name arguments
where name is the pragma name and arguments are additional data whose meaning depends on the pragma.
Currently, only one pragma is defined. Pragmas with unknown names are ignored.
Name | Arguments | Effect |
---|---|---|
line | n | Set the current line number to n |
_pragma line 120 sets the current line number as reported by error messages or used by the debugger or profiler to 120. This can be useful when the LME source code has been generated by processing another file, and line numbers displayed in error messages should refer to the original file.
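The sketch below shows what the output of a hypothetical preprocessor might look like. The source file name model.txt, the variable names, and the line numbers are assumptions made for illustration only.

```
% generated from model.txt (hypothetical source file)
_pragma line 12
gain = 2.5;
_pragma line 30
y = gain * sin(0.1);
```

With such pragmas, an error in either statement would be reported with a line number counted from the value set by the preceding pragma instead of its position in the generated code.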
Functions are fragments of code which can use input arguments as parameters and produce output arguments as results. They can be built into LME (built-in functions), loaded from optional extensions, or defined with LME statements (user functions).
A function call is the action of executing a function, possibly with input and/or output arguments. LME supports several syntaxes:
    fun
    fun()
    fun(in1)
    fun(in1, in2,...)
    out1 = fun...
    (out1, out2, ...) = fun...
    [out1, out2, ...] = fun...
    [out1 out2 ...] = fun...
Input arguments are enclosed in parentheses. They are passed to the called function by value, which means that they cannot be modified by the called function. When a function is called without any input argument, the parentheses may be omitted.
Output arguments are assigned to variables or parts of variables (structure field, list element, or array element). A single output argument is specified to the left of an equal sign. Several output arguments must be enclosed in parentheses or square brackets (arguments can simply be separated by spaces when they are enclosed in brackets). Parentheses and square brackets are equivalent as far as LME is concerned; parentheses are preferred in LME code, but square brackets are available for compatibility with third-party applications.
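The call syntaxes above can be sketched with the built-in functions pi and size; the variable names are arbitrary.

```
A = [1, 2, 3; 4, 5, 6];
x = pi;              % no input argument: parentheses omitted
x = pi();            % same call with empty parentheses
s = size(A);         % one output argument
(m, n) = size(A);    % two output arguments between parentheses
[m, n] = size(A);    % same call with square brackets
[m n] = size(A);     % inside brackets, spaces can replace commas
```

After any of the last three calls, m is 2 and n is 3.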
In some cases, a simpler syntax can be used when the function has only literal character strings as input arguments. The following conditions must be satisfied:
In that case, the following syntax is accepted; left and right columns are equivalent.
Command syntax | Function call syntax |
---|---|
fun str1 | fun('str1') |
fun str1 str2 | fun('str1','str2') |
fun abc,def | fun('abc'),def |
Arguments can also be quoted strings; in that case, they may contain spaces, tabs, commas, semicolons, and escape sequences beginning with a backslash (see below for a description of the string data type). Quoted and unquoted arguments can be mixed:
Command syntax | Function call syntax |
---|---|
fun 'a bc\n' | fun('a bc\n') |
fun str1 'str 2' | fun('str1','str 2') |
This command syntax is especially useful for functions which accept well-known options represented as strings, such as format loose.
Libraries are collections of user functions, identified in LME by a name. Typically, they are stored in a file whose name is the library name with a ".lml" suffix (for instance, library stdlib is stored in file "stdlib.lml"). Before a user function can be called, its library must be loaded with the use statement. use statements have an effect only in the context where they are placed, i.e. in a library, or the command-line interface, or a Sysquake SQ file; this way, different libraries may define functions with the same name provided they are not used in the same context.
In a library, functions can be public or private. Public functions may be called from any context which uses the library, while private functions are visible only from the library they are defined in.
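As an illustration, here is a sketch of a small library, assuming it is stored in a file named "geom.lml" and that the keywords public and private mark the beginning of sequences of public and private function definitions. The library name and both function names are hypothetical.

```
% file geom.lml (hypothetical library)

public

function a = circlearea(r)
  % public: can be called from any context which uses geom
  a = pi * squared(r);

private

function y = squared(x)
  % private: visible only from inside geom
  y = x * x;
```

A context which needs circlearea would then load the library first:

```
use geom
a = circlearea(2)
```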
The basic type of LME is the two-dimensional array, or matrix. Scalar numbers and row or column vectors are special kinds of matrices. Arrays with more than two dimensions are also supported. All the elements of an array have the same type; the numerical types are described in the table below. Two non-numerical types exist for character arrays and logical (boolean) arrays. Cell arrays, which contain composite types, are described in a section below.
Type | Description |
---|---|
double | 64-bit IEEE number |
complex double | Two 64-bit IEEE numbers |
single | 32-bit IEEE number |
complex single | Two 32-bit IEEE numbers |
uint32 | 32-bit unsigned integer |
int32 | 32-bit signed integer |
uint16 | 16-bit unsigned integer |
int16 | 16-bit signed integer |
uint8 | 8-bit unsigned integer |
int8 | 8-bit signed integer |
uint64 | 64-bit unsigned integer |
int64 | 64-bit signed integer |
64-bit integer numbers are not supported by all applications on all platforms.
These basic types can be used to represent many mathematical objects:
Unless a conversion function is used explicitly, numbers are represented by double or complex values. Most mathematical functions accept as input any type of numerical value and convert it to double; they return a real or complex value according to their mathematical definition.
Basic element-wise arithmetic and comparison operators accept integer types directly ("element-wise" means the operators + - .* ./ .\ and the functions mod and rem, as well as operators * / \ with a scalar multiplicand or divisor). If their arguments do not have the same type, they are converted to the type of the argument which comes first in the following order:
double > uint64 > int64 > uint32 > int32 > uint16 > int16 > uint8 > int8
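A minimal sketch of mixed-type arithmetic; the results noted in the comments are those expected from the order above, and the variable names are arbitrary.

```
a = 5uint8;
b = 1000int16;
c = a + b;      % int16 precedes uint8 in the order above
class(c)        % expected: int16
d = a + 2.5;    % double precedes every integer type
class(d)        % expected: double
```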
Functions which manipulate arrays (such as reshape which changes their size or repmat which replicates them) preserve their type.
To convert arrays to numerical, char, or logical arrays, use functions + (unary operator), char, or logical respectively. To convert between numerical types, use functions double, single, uint8, or the similar functions named after the other integer types.
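These conversions can be sketched as follows (variable names are arbitrary):

```
s = 'abc';
n = +s;                     % numerical array of character codes: 97, 98, 99
c = char([72, 105]);        % character array 'Hi'
b = logical([0, 2, -1]);    % false, true, true
u = uint8(200);             % 8-bit unsigned integer
d = double(u);              % back to double
```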
Double and complex numbers are stored as floating-point numbers, whose finite accuracy depends on the number magnitude. During computations, round-off errors can accumulate and lead to visible artifacts; for example, 2-sqrt(2)*sqrt(2), which is mathematically 0, yields -4.4409e-16. Integers whose absolute value is smaller than 2^52 (about 4.5e15) have an exact representation, though.
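A common way to cope with round-off errors is to compare results with a small tolerance instead of testing for equality. The sketch below uses the built-in eps; the factor 4 is an arbitrary choice made for this example.

```
r = 2 - sqrt(2) * sqrt(2);   % -4.4409e-16 instead of 0
r == 0                       % false
abs(r) < 4 * eps             % true: r is zero within a few units of round-off
2^52 + 1 - 2^52              % exactly 1: integers of this size are stored exactly
```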
Literal double numbers (constant numbers given by their numerical value) have an optional sign, an integer part, an optional fractional part following a dot, and an optional exponent. The exponent is the power of ten which multiplies the number; it is made of the letter 'e' or 'E' followed by an optional sign and an integer number. Numbers too large to be represented by the floating-point format are changed to plus or minus infinity; numbers too small are changed to 0. Here are some examples (numbers on the same line are equivalent):
    123 +123 123. 123.00 12300e-2
    -2.5 -25e-1 -0.25e1 -0.25e+1
    0 0.0 -0 1e-99999
    inf 1e999999
    -inf -1e999999
Literal integer numbers may also be expressed in hexadecimal with prefix 0x, in octal with prefix 0, or in binary with prefix 0b. The four literals below all represent 11, stored as double:
0xb 013 0b1011 11
Literal integer numbers stored as integers and literal single numbers are followed by a suffix which specifies their type, such as 2int16 for the number 2 stored as a two-byte signed number or 0x300uint32 for the number whose decimal representation is 768 stored as a four-byte unsigned number. All the integer types are valid, as well as single. This syntax gives the same result as a call to the corresponding function (e.g. 2int16 is the same as int16(2)), except when the integer number cannot be represented exactly with a double: the argument of the function is a literal double, so it is rounded to the nearest value which can be represented with a double before it is converted, while the literal with a suffix keeps its exact value. Compare the expressions below:
Expression | Value |
---|---|
uint64(123456789012345678) | 123456789012345696 |
123456789012345678uint64 | 123456789012345678 |
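A short sketch of typed literals, checked with function class (variable names are arbitrary):

```
a = 2int16;         % same value and type as int16(2)
b = 0x300uint32;    % 768 stored as a 32-bit unsigned integer
c = 1.5single;      % single-precision number
d = 0b1011;         % 11 stored as double (no suffix)
class(a)            % int16
class(b)            % uint32
class(c)            % single
class(d)            % double
```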
Literal complex numbers are written as the sum or difference of a real number and an imaginary number. Literal imaginary numbers are written as double numbers with an i or j suffix, like 2i, 3.7e5j, or 0xffj. Functions i and j can also be used when there are no variables of the same name, but should be avoided for safety reasons.
The suffixes for single and imaginary can be combined as isingle or jsingle, in this order only:
    2jsingle
    3single + 4isingle
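To illustrate why the literal suffix is safer than functions i and j, here is a sketch where a variable named i hides the function (variable names are chosen for the example):

```
z = 3 + 4i;       % literal complex number
abs(z)            % 5
i = 2;            % a variable named i now hides function i
w1 = 3 + 4 * i    % 11: i is the variable
w2 = 3 + 4i       % still the complex number 3+4i
```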
Command format is used to specify how numbers are displayed.
Strings are stored as arrays (usually row vectors) of 16-bit unsigned numbers. Literal strings are enclosed in single quotes:
    'Example of string'
    ''
The second string is empty. For special characters, the following escape sequences are recognized:
Character | Escape seq. | Character code |
---|---|---|
Null | \0 | 0 |
Bell | \a | 7 |
Backspace | \b | 8 |
Horizontal tab | \t | 9 |
Line feed | \n | 10 |
Vertical tab | \v | 11 |
Form feed | \f | 12 |
Carriage return | \r | 13 |
Single tick | \' | 39 |
Single tick | '' (two ') | 39 |
Backslash | \\ | 92 |
Hexadecimal number | \xhh | hh |
Octal number | \ooo | ooo |
16-bit UTF-16 | \uhhhh | Unicode UTF-16 code |
For octal and hexadecimal representations, up to 3 (octal) or 2 (hexadecimal) digits are decoded; the first non-octal or non-hexadecimal digit marks the end of the sequence. The null character can conveniently be encoded with its octal representation, \0, provided it is not followed by octal digits (it should be written \000 in that case). It is an error when another character is found after the backslash. Single ticks can be represented either by a backslash followed by a single tick, or by two single ticks.
Depending on the application and the operating system, strings can directly contain Unicode characters encoded as UTF-8, or MBCS (multibyte character sequences). 16-bit characters encoded with \uhhhh escape sequences are always accepted and handled correctly by all built-in LME functions (low-level input/output to files and devices which are byte-oriented is an exception; explicit UTF-8 conversion should be performed if necessary).
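The escape sequences described above can be sketched as follows (variable names are arbitrary):

```
s1 = 'It''s';             % single tick written as two single ticks
s2 = 'It\'s';             % same string with a backslash escape
t  = 'a\tb\n';            % 4 characters: a, tab, b, line feed
a3 = '\x41\101\u0041';    % 'AAA': hexadecimal, octal, and \u escapes for code 65
z  = '\0001';             % null character (\000) followed by the digit 1
length(t)                 % 4
```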
Lists are ordered sets of other elements. They may be made of any type, including lists. Literal lists are enclosed in braces; elements are separated with commas.
{1,[3,6;2,9],'abc',{1,'xx'}}
Lists can be empty:
{}
The purpose of lists is to collect any kind of data which can be assigned to variables or passed as arguments to functions.
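As a brief sketch, list elements are accessed with an index between braces; the variable name is arbitrary.

```
lst = {1, [3,6;2,9], 'abc', {1,'xx'}};
length(lst)    % 4
lst{3}         % 'abc'
lst{4}{2}      % 'xx', second element of the nested list
```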
Cell arrays are arrays whose elements (or cells) contain data of any type. They differ from lists only by having more than one dimension. Most functions which expect lists also accept cell arrays; functions which expect cell arrays treat lists of n elements as 1-by-n cell arrays.
To create a cell array with 2 dimensions, cells are written between braces, where rows are separated with semicolons and row elements with commas:
{1, 'abc'; 27, true}
Since the use of braces without semicolon produces a list, there is no direct way to create a cell array with a single row, or an empty cell array. Most of the time, this is not a problem since lists are accepted where cell arrays are expected. To force the creation of a cell array, the reshape function can be used:
reshape({'ab', 'cde'}, 1, 2)
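A sketch of how the size and elements of such cell arrays can be inspected (variable names are arbitrary):

```
c = {1, 'abc'; 27, true};
size(c)                             % 2 2
c{2,1}                              % 27
r = reshape({'ab', 'cde'}, 1, 2);   % cell array with a single row
size(r)                             % 1 2
```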
Like lists and cell arrays, structures are sets of data of any type. While list elements are ordered but unnamed, structure elements, called fields, have a name which is used to access them. There are two ways to make structures: with the struct function, or by setting each field in an assignment. s.f refers to the value of the field named f in the structure s. Usually, s is the name of a variable; but unless it is in the left part of an assignment, it can be any expression.
    a = struct('name', 'Sysquake', ...
      'os', {'Windows', 'Mac OS X', 'Linux'});
    b.x = 200;
    b.y = 280;
    b.radius = 90;
    c.s = b;
With the assignments above, a.os{3} is 'Linux' and c.s.radius is 90.
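Fields can also be tested and enumerated. The sketch below assumes the functions isfield and fieldnames, and builds its own small structure so that it can be run separately.

```
b.x = 200;
b.y = 280;
isfield(b, 'x')         % true
isfield(b, 'radius')    % false
fieldnames(b)           % list of the field names 'x' and 'y'
b.x = b.x + 1;          % b.x is now 201
```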
Function references are equivalent to the name of a function together with the context in which they are created. Their main use is as argument to other functions. They are obtained with operator @.
Inline and anonymous functions encapsulate executable code. They differ only in the way they are created: inline functions are made with function inline, while anonymous functions have special syntax and semantics where the values of variables in the current context can be captured implicitly without being listed as argument. Their main use is as argument to other functions.
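The three kinds of function objects can be sketched as follows. Function feval is used here to call them, and the variable names are arbitrary.

```
f = @sin;                      % function reference
feval(f, pi/2)                 % 1
a = 2;
g = @(x) a * x + 1;            % anonymous function which captures the value of a
feval(g, 3)                    % 7
h = inline('x^2 + 1', 'x');    % inline function with argument x
feval(h, 2)                    % 5
```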
Sets are represented with numerical arrays of any type (integer, real or complex double or single, character, or logical), or lists or cell arrays of strings. Each member corresponds to an element of the array or list. All set-related functions accept sets with repeated values, which are always reduced to unique values with function unique. They implement membership test, union, intersection, difference, and exclusive or. Numerical sets can be mixed; the result has the same type as when mixing numerical types in array concatenation. Numerical sets and lists or cell arrays of strings cannot be mixed.
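A sketch of these operations on two small numerical sets, assuming the usual function names unique, union, intersect, setdiff, setxor, and ismember:

```
a = [3, 1, 2, 3, 1];
b = [2, 5];
unique(a)          % 1 2 3
union(a, b)        % 1 2 3 5
intersect(a, b)    % 2
setdiff(a, b)      % 1 3
setxor(a, b)       % 1 3 5
ismember(2, a)     % true
```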
Objects are the basis of Object-Oriented Programming (OOP), an approach to programming which puts the emphasis on encapsulated data with a known programmatic interface (the objects). Two OOP languages in common use today are C++ and Java.
The exact definition of OOP varies from person to person. Here is what it means when it relates to LME:
Here is an example of the use of polynom objects, which (as can be guessed from their name) contain polynomials. Statement use classes imports the definitions of methods for class polynom and others.
    use classes;
    p = polynom([1,5,0,1])
      p =
        x^3+5x^2+1
    q = p^2 + 3 * p / polynom([1,0])
      q =
        x^6+10x^5+25x^4+2x^3+13x^2+15x+1
LME identifies channels for input and output with non-negative integer numbers called file descriptors. File descriptors correspond to files, devices such as serial ports, network connections, etc. They are used as input argument by most functions related to input and output, such as fprintf for formatted data output or fgets for reading a line of text.
Note that the description below applies to most LME applications. For some of them, files, command prompts, or standard input are irrelevant or disabled; and standard output does not always correspond to the screen.
At least four file descriptors are predefined:
Value | Input/Output | Purpose |
---|---|---|
0 | Input | Standard input from keyboard |
1 | Output | Standard output to screen |
2 | Output | Standard error to screen |
3 | Output | Prompt for commands |
You can use these file descriptors without calling any opening function first, and you cannot close them. For instance, to display the value of pi, you can use fprintf:
    fprintf(1, 'pi = %.6f\n', pi);
    pi = 3.141593
Some functions implicitly use one of these file descriptors. For instance, disp displays a value to file descriptor 1, and warning displays a warning message to file descriptor 2.
File descriptors for files and devices are obtained with specific functions. For instance fopen is used for reading from or writing to a file. These functions have as input arguments values which specify what to open and how (file name, host name on a network, input or output mode, etc.), and as output argument a file descriptor. Such file descriptors are valid until a call to fclose, which closes the file or the connection.
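The life cycle of a file descriptor can be sketched as follows. The file name notes.txt is hypothetical, and the mode strings 'w' and 'r' are assumed to select writing and reading.

```
fd = fopen('notes.txt', 'w');       % open for writing, get a file descriptor
fprintf(fd, 'value = %d\n', 42);    % write a line of text
fclose(fd);                         % fd is no longer valid after this call

fd = fopen('notes.txt', 'r');       % open the same file for reading
txt = fgets(fd);                    % read one line of text
fclose(fd);
fprintf(1, '%s', txt);              % copy the line to standard output
```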
When an error occurs, the execution is interrupted and an error message explaining what happened is displayed, unless the code is enclosed in a try/catch block. The whole error message can look like
    > factor({2})
    Wrong type (stdlib:primes:164) 'ones'
    -> stdlib:factor:174
The first line contains an error message, the location in the source code where the error occurred, and the name of the function or operator involved. Here stdlib is the library name, primes is the function name, and 164 is the line number in the file which contains the library. If the function where the error occurs is called itself by another function, the whole chain of calls is displayed; here, primes was called by factor at line 174 in library stdlib.
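To keep such an error from interrupting execution, the call can be enclosed in a try/catch block. This sketch assumes function lasterr to retrieve the message of the last error; the variable name is arbitrary.

```
use stdlib
try
  f = factor({2});                               % wrong argument type
catch
  fprintf(2, 'factor failed: %s\n', lasterr);    % report to standard error
end
```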
Here is the list of errors which can occur. For some of them, LME attempts to solve the problem itself, e.g. by allocating more memory for the task.