YSC4230 Lecture 5
Arrays
How to represent arrays?
- static
- array contains direct address
- dynamic
- array contains indirect address
arrays are just collections of address pointers.
Multi-dimensional arrays
We need to know the size of these.
OCaml, Java have a size of array, during runtime (Waste more space).
movq -8(%rax) %rdx // load size, %rax is pointing at A[0] so we need to go back a word (-8).
cmpq %rdx %rcx // compare idx to bound
j l __ok // jump if ok
callq __err_oob
__ok:
movq (%rax, %rcx, 8) dest
Strings
C strings lie in text segment, so they are read only, so they can be shared.
Tagged Datatypes
Datatypes may be recursive, in that case we can use pointer, because it is known size.
In implementations which have length stored allow us to have O(1) access.
How to compile switch?
break instructions compiled to a jump to the next label after the switch blk.
%tag = ⟦e⟧
br label %l1
l1: %cmp1 = icmp eq %tag, $tag1
br %cmp1 label %b1, label %l
b1: ⟦s1⟧
br label %l2
l2: %cmp2 = icmp eq %tag, $tag2
br %cmp2 label %b2, label %l
b2: ⟦s2⟧
br label %l
lN: %cmpN = icmp eq %tag, $tagN
br %cmpN label %bN, label %merg
bN: ⟦sN⟧
br label %merg
merge:
Pattern matching
Bind variables. Patterns can nest.
Strategy:
- Flatten nested patterns into matches against one constructor at a time.
- Compile match against the tags of the datatype for C-style switches.
- Code for each branch must copy data from e to variables bound in the patterns.
LLVM Datatypes
LLVM IR: uses types to describe the structure of data.
data t = void
| i1 | i8 | i64
| [<#elts> x t]
| fty
| {t1, t2, ...}
| t* (pointers)
| %Tident (named type)
array types are monomorphic.
getelementptr
Type Casting
bitcast
Regular Expressions
- ɛ, empty string
- ‘a’, ordinary character
- R1 | R2, alternatives, stands for choice of R1 or R2
- R1R2, concatenation, stands for R1, followed by R2.
- R*, Kleene star, stands for zero or more repetitions of R
Extensions
- “foo”, a string, requivalnet to ‘f’ ‘o’ ‘o’
- R+, stands for one or more repetitions of R, equivalent to RR*
- R?, Zero or one occurrences of R, equivalent to (E|R)
- [‘a’-‘z’] One of a or b or c or … z, equivalent to (a|b|…|z)
- [^‘0’-‘9’] Any char except 0 through 9 (expensive)
- R as x Name the string matched by R as x.