Monday 27 February 2012

Standard Library (1 of 5). Basic Datatypes

In this post I give an overview of Libretto's basic datatypes.

Type String

Strings are instances of predefined class String.

Regular Strings

Libretto supports regular strings – character sequences enclosed in double quotes (Quotation mark, U+0022). For instance,
"It's a string" 
The empty string is denoted by "". For regular strings the following escape sequences are defined:

\n       line feed
\r       return
\t       horizontal tab
\\       backslash
\"       double quote
\b       backspace
\f       form feed
\{       opening curly bracket
\uXXXX   Unicode character, e.g. \u754C
\xXX     hexadecimal byte, e.g. \x8A

Note that in Libretto the single quote ' introduces string symbols (see section String Symbols).

Quote Brackets

Regular strings use the same symbol to open and to close a string. This complicates character escaping and rules out nested strings. For this reason Libretto also introduces strings enclosed in distinct opening and closing quotation marks:

  • “...” (U+201C, U+201D),
  • ‘...’ (U+2018, U+2019) and
  • «...» (U+00AB, U+00BB).

These pairs of symbols are called quote brackets. Using quote-bracketed strings is good programming style: because the opening and closing marks differ, such strings have a structure that simplifies string handling and allows nesting:
““Hello”, he said”
«The word «acme» originates from Greek»
«Don't say “goodbye”»
The quote brackets “...”, ‘...’ and «...» are equivalent. Each pair allows multi-line strings:
«“Hello”, he said.
Don't say “goodbye”»

“The 
word 

«
acme
» 
originates 
from 
Greek”
The quote brackets have ASCII equivalents:

String   Unicode           ASCII equivalent
“…”      U+201C, U+201D    ^"…"^
‘…’      U+2018, U+2019    ^'…'^
«…»      U+00AB, U+00BB    <<…>>

For quote-bracketed strings the following escape sequences are defined: \”, \“, \\, \«, \», \‘, \’ and \{.
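
Distinct opening and closing marks allow nesting because a reader can simply count depth. The following Python sketch illustrates the idea. It is not Libretto and not the actual Libretto lexer: the function name read_bracketed, and the decision to count only brackets of the same kind as the opener, are assumptions made for illustration.

```python
# Map each opening quote bracket to its closing partner.
PAIRS = {"\u201c": "\u201d", "\u2018": "\u2019", "\u00ab": "\u00bb"}


def read_bracketed(text, start=0):
    """Return the body of the quote-bracketed string that opens at text[start].

    Assumption: only brackets of the same kind as the opener affect nesting.
    """
    opener = text[start]
    closer = PAIRS[opener]
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == opener:
            depth += 1
        elif ch == closer:
            depth -= 1
            if depth == 0:
                return text[start + 1:i]
        i += 1
    raise ValueError("unterminated quote-bracketed string")


print(read_bracketed("«The word «acme» originates from Greek»"))
```

With a single quote character for both ends, as in regular strings, no such depth counter is possible, which is why the inner quotes would have to be escaped.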

String Handling Methods

The predefined class String supports a number of operations over strings including concatenation
“aaa” + “bbb”  //  “aaabbb”
the repetition function
“abc” * 3  //  “abcabcabc”
the length function
“aaa”.length  //  3
the substring function, which returns the index of the first matching substring, or () if there is no match
“abcd”.substring(«bc»)  //  1
“abcd”.substring(“123”)  //  ()
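
For readers more familiar with mainstream languages, here is a rough Python counterpart of the four operations above. It is only an analogy: the helper substring_index is a hypothetical name, and it returns None where Libretto returns the empty sequence ().

```python
def substring_index(text, sub):
    """Return the index of the first occurrence of sub, or None if absent."""
    pos = text.find(sub)  # str.find returns -1 when sub does not occur
    return pos if pos >= 0 else None


print("aaa" + "bbb")                   # concatenation
print("abc" * 3)                       # repetition
print(len("aaa"))                      # length
print(substring_index("abcd", "bc"))   # index of the first match
print(substring_index("abcd", "123"))  # no match
```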

Parametric Strings

Including expressions in curly brackets provides string parametrization. For instance,
«1 + 2 = {1 + 2}»
To evaluate this parametric string, the do method is applied:
«1 + 2 = {1 + 2}».do()  //  “1 + 2 = 3”
Thus, a parametric string is handled in Libretto as a nullary function. We can use regular syntactic sugar:
«1 + 2 = {1 + 2}»()  //  “1 + 2 = 3”
«1 + 2 = {1 + 2}»!  //  “1 + 2 = 3”
Parametric strings are context dependent – like other Libretto expressions:
(1, 2, 3). «{$} plus 3 equals {$ + 3}»!
  // “1 plus 3 equals 4”
  // “2 plus 3 equals 5”
  // “3 plus 3 equals 6”
Libretto evaluates parametric strings lazily: a parametric string is handled as an ordinary string until the method do is applied to it. This has many advantages (e.g. parametric strings can serve as patterns and anonymous functions). Laziness also supports nested evaluation, one level per application of do:
«{«{1+2}»} = {1+2}»  //  “{«{1+2}»} = {1+2}”
«{«{1+2}»} = {1+2}»!  //  “{1+2} = 3”
«{«{1+2}»} = {1+2}»!!  //  “{1+2} = 3”!    ->    “3 = 3”
These features make parametric strings handy as web page patterns.
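
The lazy behaviour can be imitated in Python with a small function that evaluates only the outermost curly-bracket expressions on each call. This is a sketch under stated assumptions, not how Libretto is implemented: the name do mirrors the Libretto method, the context value is exposed under the hypothetical name it (standing in for $), and Python's eval is used for brevity, so it must never be given untrusted input.

```python
def do(template, context=None):
    """Evaluate the outermost {...} expressions of template exactly once."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        # An escaped opening bracket is copied literally.
        if ch == "\\" and i + 1 < len(template) and template[i + 1] == "{":
            out.append("{")
            i += 2
            continue
        if ch != "{":
            out.append(ch)
            i += 1
            continue
        # Find the matching closing bracket by counting depth.
        depth = 1
        j = i + 1
        while j < len(template) and depth > 0:
            if template[j] == "{":
                depth += 1
            elif template[j] == "}":
                depth -= 1
            j += 1
        if depth != 0:
            raise ValueError("unbalanced curly brackets")
        expr = template[i + 1:j - 1]
        out.append(str(eval(expr, {}, {"it": context})))
        i = j
    return "".join(out)


nested = "{'{1+2}'} = {1+2}"
print(do(nested))      # one level evaluated
print(do(do(nested)))  # two levels evaluated
for n in (1, 2, 3):
    print(do("{it} plus 3 equals {it + 3}", n))
```

Until do is called, the template is just a string, which is what makes it usable as a reusable pattern.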

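Note that the similarity between parametric strings and anonymous functions is limited, because parametric strings do not form closures. An anonymous function sees the x visible where it was defined, whereas a parametric string sees the x visible where it is evaluated: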
object x extends “a”

def f(fun) {
  var x = 1
  “x equals ”.fun! 
}

f(%{$ + x})  //  “x equals a”
f(“{$ + x}”)  //  “x equals 1”
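
The same contrast can be reproduced in Python, purely as an analogy. In this sketch the names demo, closure and late are hypothetical; the lambda closure captures x lexically, while late evaluates the text "x" against whatever scope the caller passes in.

```python
def demo():
    x = "a"  # the x visible where both functions are defined

    def f(fun):
        x = 1  # a different x, local to f
        return fun("x equals ", {"x": x})

    # A real closure: x is resolved in the defining scope.
    closure = lambda prefix, _scope: prefix + str(x)
    # Late evaluation: x is looked up in the scope supplied at call time.
    late = lambda prefix, scope: prefix + str(eval("x", {}, scope))

    return f(closure), f(late)


print(demo())
```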
An opening curly bracket can be escaped to prevent it from being the beginning of an evaluated expression:
«\{1+2} = {1+2}»!  //  “{1+2} = 3”
Parametric strings are convenient as pretty printing forms. Let us define
fix class Person(name: String, hasChild: Person*)
fix persons = 
    Person(“Ann”) as ann. 
      Person(“Paul”) as paul.
        (ann, paul, Person(“John”, ann, paul))
Now we can find the pairs of parents with their children:
persons as p. hasChild. «{p.name} is {name}’s parent»!
  // “John is Ann’s parent”
  // “John is Paul’s parent”

String Symbols

Libretto does not support special entities similar to atoms in Erlang, symbol literals in Scala, or Ruby’s symbols. Instead, Libretto offers string symbols. To keep them compact, a special notation based on the single quote is introduced:
 'identifier
For instance,
'Mon  'Tue  'Sun  'hello   'Device_1001a
Such strings are called string symbols. They are defined as the instances of the class StringSymbol, which is a subclass of String:
class StringSymbol extends String
String symbols can be handled as ordinary strings:
if ('hello == “hello”) “yes” else “no”  //  “yes”
Another feature of string symbols is that they are interned: equal string symbols are represented in memory by the same object. This makes them more efficient and saves memory. For arbitrary strings interning is not guaranteed. The operator eq compares objects by reference:
if ('hello eq 'hello) “yes” else “no”  //  “yes”
if (“hello” eq “hello”) “yes” else “no”  //  “no”
if ('hello eq “hello”) “yes” else “no”  //  “no”
Libretto code based on string symbols is more compact and readable than code using ordinary strings. String symbols are convenient for some metaprogramming tasks, as well as for introducing enumeration types in pattern matching. For instance,
day match {
  case 'Mon => “Monday”
  case 'Tue => “Tuesday”
  case 'Fri => “Friday”
}
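
Interning, as described above for string symbols, also exists in Python, which makes for a convenient analogy. The sketch below is Python rather than Libretto; sys.intern is a real standard-library function, while the helper make is a hypothetical name used only to build strings at run time.

```python
import sys


def make(word):
    # Build the string at run time so it is not a shared compile-time constant.
    return "".join(list(word))


a = make("hello")
b = make("hello")

# Equality by value holds for any two equal strings.
print(a == b)  # True

# Interned copies of equal strings are guaranteed to be the same object.
sym_a = sys.intern(a)
sym_b = sys.intern(b)
print(sym_a is sym_b)  # True
```

Here == plays the role of Libretto's ==, and is plays the role of eq.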

Type Int

Integer literals have type Int. For example,
  123  -20  1  73850
Integers can also be written in hexadecimal and binary formats:
0xFE  //  hexadecimal corresponding to decimal 254
0b11111110  //  binary corresponding to decimal 254
Basic arithmetic operations are defined on integers including
+  -  *  div   mod 
Arithmetic expressions can occur in paths:
fix class Book(title: String, price: Int)

{
  fix books = (Book(“t1”, 30), Book(“t2”, 20), Book(“t3”, 40)) 
  books.(price * 3)  //  90  60  120
}
This query multiplies each book price by 3. The parentheses are used to specify the evaluation order, because books.price * 3 is interpreted as (books.price) * 3. Note that the last expression results in the same sequence:
books.price * 3  //  90  60  120
This happens because the operation is applied to every value of the field price. More generally, an operation is applied to every combination of its operands' values, so this feature should be used with care:
{
  fix x = (1,2,3,4,5,6,7,8,9,10)
  (x*x*x*x).size  //  10000
}
In practice, such problems are rare.
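
The size of 10000 is easiest to see by spelling out the combinations. The following Python sketch is an analogy only, and it assumes that a Libretto operation on sequences ranges over every combination of operand values, as the example above suggests.

```python
from itertools import product

# books.price * 3: one operand has three values, the other has one.
prices = [30, 20, 40]
tripled = [p * 3 for p in prices]
print(tripled)  # [90, 60, 120]

# x*x*x*x: each of the four operands independently ranges over ten values.
x = range(1, 11)
products = [a * b * c * d for a, b, c, d in product(x, repeat=4)]
print(len(products))  # 10000
```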

Libretto can work with big integers. For instance,
def Int fact = if (this == 0) 1 else this * (this - 1).fact

33.fact  //  8683317618811886495518194401280000000
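
The value above can be cross-checked in any language with arbitrary-precision integers. A minimal Python version of the same recursive definition (Python, not Libretto):

```python
def fact(n):
    """Recursive factorial; Python integers grow as needed."""
    return 1 if n == 0 else n * fact(n - 1)


print(fact(33))  # 8683317618811886495518194401280000000
```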

Type Real

Floating point literals have type Real. Reals are also accompanied by standard arithmetic operations.
123.5     -10.2e3       116.1 + 15.3        10.1 div 4.2

Type Boolean

In Libretto any value distinct from () is interpreted as true, whereas () denotes false and means failure. This is a key feature of Libretto, on which the path evaluation mechanism is based. Thus, in principle, class Boolean is not necessary in Libretto. On the other hand, the explicit use of logical constructs is good programming style and provides standard values for relations like ==. Therefore the Libretto standard library defines two functions, true and false:
def true = %true
def false = ()
Now we get:
5 == 5  //  true
5 == 3  //  ()

Class List

Lists encapsulate sequences so that they can be handled as separate objects. The class List is defined as follows:
class List(var contents: Any*) {
  def do(index: Int) = contents(index)
  def size = contents.size
}

def (x:List)=(y: Any) {x& = y}
def (x:List)+=(y:Any) {x& += y}
def (x:List).=(y:Any) {x& .= y}
def (x:List)-- {x& --}
def (x:List)+(y:List) = List() {& = x&; & += y&}
def (x:List)*(n:Int) = List() as ls {1..n. ls& += x&}

def (x)::(y:List) {y .= x}
def %:: undo(x: List) = (x(0), List(x(1..)))

def (y:List):::(x) {y += x}
def %::: undo(x: List) = (List(x(0..(x.size-2))), x(x.size-1))
etc.

Examples of lists:
List(1,2,3,4,5), List(List(“a”), “b”, List(1, List(“c”, “d”))), List()
The second list is nested – lists can be the elements of other lists. The third list is empty.

Each list is an indexing function, which enumerates its elements from 0:
List(“a”, “b”, “c”, “d”, “e”)(3)  //  “d”
List indexing is based on the indexing of the field contents.

Since lists are frequently used, some bits of syntactic sugar are defined on them. In particular, field-contents syntactic sugar (see section Field contents and &) can be applied to lists:
List(1,2,3,4,5).contents  //  1 2 3 4 5
is equivalent to
List(1,2,3,4,5)&  //  1 2 3 4 5
Second, lists can be displayed in square brackets:
List(1,2,3,4,5)  is equivalent to  [1,2,3,4,5]

List(List(“a”), “b”, List(1, List(“c”, “d”))) 
                           is equivalent to [[“a”],“b”,[1,[“c”, “d”]]]

List()   is equivalent to   []
The operator :: attaches a value to a list as the first element:
1 :: [2,3,4,5]  //  [1,2,3,4,5]
[“a”] :: []  //  [[“a”]]
The method undo is defined on ::, so this operator can be used in pattern matching:
x :: y = [1,2,3,4,5]
x  //  1
y  //  [2,3,4,5]
The operator ::: attaches a value to a list as the last element. As an example, let us introduce a naive definition of the list reverse function:
def List reverse {
  case [] => []
  case h :: t => t.reverse ::: h
}

[1,2,3,4,5].reverse  //  [5,4,3,2,1]
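
The roles of ::, ::: and undo in this definition can be sketched in Python. This is an analogy, not Libretto: the names cons, uncons and snoc are hypothetical, and unlike the Libretto operators shown earlier, which update the list, these helpers return new lists.

```python
def cons(head, tail):
    """Counterpart of head :: tail."""
    return [head] + tail


def uncons(lst):
    """Counterpart of matching h :: t against a non-empty list."""
    return lst[0], lst[1:]


def snoc(init, last):
    """Counterpart of init ::: last."""
    return init + [last]


def reverse(lst):
    """Naive reverse: reverse the tail, then append the head."""
    if not lst:
        return []
    head, tail = uncons(lst)
    return snoc(reverse(tail), head)


print(reverse([1, 2, 3, 4, 5]))  # [5, 4, 3, 2, 1]
```

The definition is called naive because every step appends to the end of a list, so the total work grows quadratically with the list length.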
A bracket notation can be used for context typing:
def [] append(snd: []) {
  case [] => snd
  case h :: t => h :: t.append(snd)
}
def [] head = this(0)
Here [] denotes the type of lists with arbitrary elements. List-element typing is not allowed in Libretto.

Note that reverse, append and head are element functions. Unlike sequences, which determine the basic semantics of Libretto, List is an ordinary class. In particular, lists can be the elements of a sequence:
([1,2], [“a”, [“b”, “c”], “d”], []).size  //  2 3 0
Compare two nearly identical functions, one of which is an element function and the other a collection function:
def []  head1 = this(0)
def []* head2 = this(0)

([1,2], [“a”, “b”, 1], [[1]]).head1  //  1  “a”  [1]
([1,2], [“a”, “b”, 1], [[1]]).head2  //  [1,2]
In the first expression the element function head1 is applied to each context list. In the second expression the collection function head2 is applied to the sequence of lists as a whole.
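
In Python terms, the difference is roughly that between mapping a function over a sequence and applying it to the sequence itself. The sketch below is an analogy, with a Python tuple standing in for a Libretto sequence; the names head1 and head2 follow the definitions above.

```python
seq = ([1, 2], ["a", "b", 1], [[1]])

# Element function: applied to each list in the sequence separately.
head1 = [lst[0] for lst in seq]
print(head1)  # [1, 'a', [1]]

# Collection function: applied once, to the sequence as a whole.
head2 = seq[0]
print(head2)  # [1, 2]
```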

A collection function
def* list = [this]
encapsulates the context sequence in a list:
(1,2,3,4,5).list  //  [1,2,3,4,5]
It is based on the flatness of sequences:
[1,2,3,4,5] == [(1,2,3,4,5)] == [(1,2),3,(4,5)]
The function list is inverse to &:
[1,2,3,4,5]&.list  //  [1,2,3,4,5]
(1,2,3,4,5).list&  //  (1,2,3,4,5)
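
Flatness can be modelled in Python by treating tuples as sequences that dissolve into their surroundings, while lists remain separate objects. This is a sketch of the idea only, and the names flatten and to_list are hypothetical.

```python
def flatten(*items):
    """Flatten nested tuples (sequences) but leave lists intact."""
    out = []
    for item in items:
        if isinstance(item, tuple):
            out.extend(flatten(*item))
        else:
            out.append(item)
    return out


def to_list(*items):
    """Counterpart of the collection function list: wrap the flat sequence."""
    return flatten(*items)


print(to_list((1, 2), 3, (4, 5)))  # [1, 2, 3, 4, 5]
print(to_list([1, 2], (3, 4)))     # [[1, 2], 3, 4]
```

The second call shows why lists are needed at all: they are the only way to keep a group of values together inside a sequence.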
One more example:
def Int fact = if (this == 0) 1 else this * (this-1).fact
def Int* listOfFact = fact.list

(1,2,3,4,5).listOfFact  //  [1, 2, 6, 24, 120]

Class Any

Any is the root class in the class hierarchy of Libretto. Every object is an instance of this class, and the methods associated with this class are applicable to any object. Programmers can define their own external methods on Any:
def Any double = (this, this)
(5,6,7).double  //  5 5 6 6 7 7
Since the default context type of a function is Any, this definition is equivalent to the following:
def double = (this, this) 
For collection functions the situation is similar: the definition
def Any* doubleC = (this, this)
is equivalent to
def * doubleC = (this, this)
Now,
(5,6,7).doubleC  //  5 6 7 5 6 7
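
The two results differ only in where the duplication happens. A Python analogy, with a list standing in for a Libretto sequence and with the names double and double_c mirroring the functions above:

```python
def double(seq):
    """Element function: each element is duplicated in place."""
    return [y for x in seq for y in (x, x)]


def double_c(seq):
    """Collection function: the whole sequence is duplicated."""
    return list(seq) + list(seq)


print(double([5, 6, 7]))    # [5, 5, 6, 6, 7, 7]
print(double_c([5, 6, 7]))  # [5, 6, 7, 5, 6, 7]
```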
The Libretto standard library supports (among others) the following methods defined on Any:

  • toString returns the string representation of an object
  • hashCode returns the hash code of an object
  • == compares two objects by value
  • eq compares two objects by reference
  • clone creates a new copy of an object
