What is Flex?

Flex is used for lexical analysis e.g. identifying words in a sentence / tokens in source code.

We specify valid tokens, and provide names for the important ones e.g. if, else etc…

This is done via rules files in flex. Specifying rules is done via its Rules section in the file using regex.

Common REGEX

x      - the character ”x”
"x"    - an ”x”, even if x is an operator.
\x     - an ”x”, even if x is an operator.
[xy]   - the character x or y.
[x -z] - the characters x, y or z.
[^x]   - any character but x.
.      - any character but newline.
^x     - an x at the beginning of a line.
<y>x   - an x when Flex is in start condition y.
x$     - an x at the end of a line.
x?     - an optional x.
x*     - 0,1,2, ...  instances of x.
x+     - 1,2,3, ...  instances of x.
x|y    - an x or a y.
(x)    - an x.
x/y    - an x but only if followed by y.
{xx}   - the translation of xx from the definitions section.
x{m,n} - m through n occurrences of x

It then specifies what to do with it in C source code.

Specifying names is done via its Definitions section in the file.

Defining DIGIT

DIGIT [0-9]

They also allow helper functions in C (via Declarations and User subroutines).

Exact syntax of a rules file:

%{
Declarations
}%
Definitions
%%
Rules
%%
User subroutines

References

CS143 Handout PA1