panthema / 2007 / flex-bison-cpp-example
Funny Drawing with 'C++' 'FLEX' and a Bison

Flex Bison C++ Template/Example

TalkBox
Gravatar Yogesh:

Hi , When I'm using bison 3.0.4 and except 1 complilation error everything is fine . Issue is 'yyloc' not declared. Do I need to use some option of bison to generate decleartion of 'yyloc'

Gravatar Russel:

Hey man 2015 here. Thanks a bunch :) I will be sure to credit you in my source files.

Gravatar Timo:

@3li: Sorry, I dont have an example with multiple instances. I would look at the duplicate symbols and try to make them static or put them into different namespaces.

Gravatar 3li:

Hi, Great work ! I use it for some work, and now I need to link multiple different lexer-parser; You said that is easy, but it results in a lot of redefinitions in the generated scanners codes. Do you have an example with multiple pairs ?

Gravatar Hong:

Hey, thanks. You really saved my day. This is a great project for me newbie to Flex&Bison.

BTW, your code still compiles and runs well in Visual Studio 2013!

Gravatar Red:

Hey, thanks. You really saved my day. This is a great project for us aficionados of Flex&Bison.

BTW, your code still compiles and runs well in Visual Studio 2010!

Gravatar Lily:

your flexLexer.h is diff with mine. My flexLexer.h is from flex-2.5.4.
So I can`t complile it used your flexLexer.h

Gravatar Timo:

I'm not so knowledgeable on windows stuff, but I believe there are flex more up-to-date flex executables available, like 2.5.35.

Gravatar Dev:

Is it possible to integrate this example in Visual studio environment? I tried using this example in Visual Studio 2008. It builds successfully, but when I try to generate a different scanner using command line then it fails to generate scanner.cc saying "flex: could not create". I am using flex 2.5.4

Gravatar Name:

Hello, I have a problem with the error handling. I want a central location where both LexerError and parser error are handled. Is there a way to do this?

Gravatar Robert:

I created an example of Flex and Bison in C++ based in part on this - thanks for the info! If your interested in checking out my code it can be seen at http://www.thewaffleshop.net/2010/03/06/flex-bison-cpp-example/

Gravatar Sergey:

thanks. it is a good work!

Gravatar wilm:

Thanks, this is great. I had dealt with this problem quite a while back. I came up with a similar design using a node like structure with a virtual method. It looks like your driver class is a cleaner implementation than mine. My edges were not nearly as clean, plus I was using flex/bison in C mode. Thanks again this should save me lots of headache.

Gravatar Timo:

Thanks for the notice. Version 0.1.4 fixes that.

Gravatar brucer42:

doesn't work with bison-2.4.1, does work with bison-2.3. botheration

Gravatar djeauh:

Very good tutorial. Thanks!

Gravatar Wolfgang:

Hi, really fine explanation, and exactly what I was searching for! I need to extend C-code file-parser to C++-code iostream and string parsing :-) Thanks for that!
Rgds,
Wolfgang

Gravatar Jason:

Hey, great tutorial. Builds just great on Mac, but I can't find where it dumps the executable (exprtest). Any ideas?

Gravatar Timo:

The example builds well in Ubuntu Hardy, both without flex / bison and with the two tools installed. Regenerated parser and scanner source code also compiles ok.

flex and bison are not called directly in the Makefiles, they are triggered by automake's magic. It works by simply putting the scanner.ll and parser.yy files into the program_SOURCES rule inside Makefile.am. Calling flex and bison yourself in a custom makefile is straight-forward parameter fiddeling.

Gravatar Name:

Dude, I cant build your project in Ubuntu? I cant find bison anywhere in your makefile




Posted on 2007-08-20, last updated 2009-09-05 by Timo Bingmann at Permlink.

Summary

This code template can be used to integrate a Flex scanner and Bison parser pair into a modern C++ program. These two universal tools are very difficult to incorporate into a good C++ design. The template utilizes both Flex and Bison in C++ mode and their output are encapsulate into classes. Some tricky code is used to bind the two classes together. Thus the lexer and parser become fully re-entrant, as all state variables are contained in the class objects. Furthermore multiple different lexer-parser pairs can easily be linked into one binary, because they have different class names and/or are located in a different namespace.

Why use these old tools? Well, they are here to stay and they work well. But most important, the code generated by Flex and Bison requires no compile-time dependencies, because they generate fully autonomous source code. So far I have not found any modern parser generator which outputs independent code. It is even possible to compile the generated source on Windows with Visual C++ 2005.

The implemented grammar is a standard infix-notation calculator, which even supports some variables. This application is not the focus of the template, it is there as a starting-point for you to insert your own grammar.

See the README below for more information and pointers into the code.

For the changes between versions see the corresponding weblog entries for 0.1.3 and 0.1.4 or the ChangeLog below.

Downloads

Flex Bison C++ Example 0.1.4 (current) released 2009-09-05
Source code archive: Download flex-bison-cpp-example-0.1.4.tar.bz2 (245kb)
MD5: 9138d558a6af8de798947f3d86e2430f
  Browse doxygen documentation online
Browse source online

See bottom of this page for older downloads.

License

The parts of the example code written by myself are released into the public domain or, at your option, under the Do What The Fuck You Want To Public License (WTFPL). There are some special GPL license exceptions for the included source files from the Bison and Flex distributions. All in all you can just copy all the code and don't have to credit me at all.

Git Repository

The git repository containing all sources and packages is available by running
git clone https://github.com/bingmann/flex-bison-cpp-example.git

README

Website / License

The current example package can be downloaded from http://idlebox.net/2007/flex-bison-cpp-example/

The following just mean you can copy the example code into your program or use it for whatever purpose without crediting me (though I would really like it if you did):

The parts of the example code written by myself are released into the public domain or, at your option, under the Do What The Fuck You Want To Public License (WTFPL), which can be found in the file COPYING. There are some special GPL license exceptions for the included source files from the Bison and Flex distributions.

The idea and method of this example is based on code from http://ioctl.org/jan/bison/

Why Use These Old Tools?

Well, they are here to stay and they work well. These days there are much more sophisticated C++ parser generation frameworks around:

All these libraries do good jobs when you need to generate parsers for more difficult grammars.

However if you write a program with one of the frameworks above, then your users need that parser framework installed to compile your program. But Flex and Bison require no compile-time dependencies, because they generate fully autonomous source code. (And Flex and Bison are installed almost everywhere.) So far I have not found any modern parser generator which outputs independent code.

It is even possible to compile the generated source with Visual C++ on Windows (worked with 8.0 aka 2005). Flex and Bison need not be installed on the windows machine. The source package includes a VC++ solution and two project files.

Source and Generated Files

The src directory contains the following source files. Note that some of them are automatically generated from others.






So if you wish to create a program using a C++ Flex lexer and Bison parser, you need to copy the following files:

Namespace and Library

The scanner, parser and driver classes are located within the example namespace. When coding a larger program, I believe it is most convenient to put all scanner/parser source files into a separate directory and build a static library. Then all parts of the parser are located in a separate namespace and directory.

Code Overview

This is a brief overview of the code's structure. Further detailed information is contained in the doxygen documentation, comments in the source and ultimately in the code itself.

Scanner

The input stream is first converted by the lexical scanner into tokens. The scanner is defined by the list of regular expressions in scanner.ll . From this file Flex generates the file scanner.cc, which mainly contains a function called yylex(). This function returns the next token for the parser. For a C++ scanner the yylex() function is contained in a class, which is named yyFlexLexer by default. It is declared in the FlexLexer.h and is derived from the abstract FlexLexer class.

By defining the macro yyFlexLexer => ExampleFlexLexer in scanner.h, the default name of the scanner class is changed. Furthermore to extend yylex()'s parameter list, the class example::Scanner is derived from the ExampleFlexLexer class. It is mainly a forwarding class. By defining the macro YY_DECL, the yylex() function generated by Flex is renamed to example::Scanner::lex().

Another change to the default Flex code is that the token type is changed from int to the enum example::Parser::token defined by parser.

Parser

Bison's support for C++ is much more sophisticated. In C++ mode it generates a class named example::Parser, which is located in parser.cc and declared in parser.h . The header file also defines the scanner tokens, which must be returned by the Flex scanner's regular expression rules. Bison's C++ skeleton also installs the three .hh files, which contain utility classes required by the parser.

In the example calculator the Bison code constructs a calculation node tree. The tree's nodes are derived from CalcNode and are evaluated to output the parsed expression's result.

Driver

The example::Driver class brings the two components scanner and parser classes together. It is the context parameter of the parser class. The hook between scanner object and parser object is done by defining the yylex() macro to be "driver.lexer->lex". This way the Bison parser requests the next token from the scanner object contained within the driver.

The example::Driver object can be accessed by the Bison actions. Therefore it will contain a reference to the data classes filled by the parser's rules. In the example it contains a reference to the CalcContext. Thus a refernce to a CalcContext must be given to the constructor of example::Driver. This CalcContext object will be filled with the parsed data.

To initiate parsing the example::Driver class contains the three functions example::Driver::parse_stream(), example::Driver::parse_file() and example::Driver::parse_string().

Example Calculator

The example lexer and grammar is a simple floating point arithmetic calculator. It follows the usual operator precedence rules (sometimes called BODMAS or PEMDAS): Parentheses, Exponentation, Multiplication/Division, Addition/Subtraction.

Besides these simple arithmetic operators, the program also supports variables. These can be assigned a value and used in subsequent expressions.

It can be started interactively and will process expressions entered on the console. The expression's parse tree is printed and then evaluated. Here some examples:

$ ./exprtest
Reading expressions from stdin
input: 4 * 1.5 + 3 * (2 ^ 4 - 4)
tree:
+ add
  * multiply
    4
    1.5
  * multiply
    3
    - subtract
      ^ power
        2
        4
      4
evaluated: 42
input: v = (2 ^ 4 - 4)
Setting variable v = 12
input: 3.5 * a
input:1.6: Unknown variable "a"
input: 3.5 * v
tree:
* multiply
  3.5
  12
evaluated: 42
input: 5 + * 6
input:1.4: syntax error, unexpected '*'
input: 5 + (4 * 4
input:1.10-9: syntax error, unexpected end of file, expecting ')'

The exprtest can also be used to process text files containing expressions. Within the file each line is parsed as an expression. Multiple expressions can be put into one line by terminating them with a semicolon ';'. The exprtest outputs a parse tree for each parsed non-assignment line.


v = (2 ^ 4 - 4); e = 2.71828

4 * 1.5 + 3 * v

6 * (2 * 2) ^ 2 / 2

The above example file (included as exprtest.txt) can be processed by calling ./exprtest exprtest.txt. The program outputs the following evaluation:

Setting variable v = 12
Setting variable e = 2.71828
Expressions:
[0]:
tree:
+ add
  * multiply
    4
    1.5
  * multiply
    3
    12
evaluated: 42
[1]:
tree:
/ divide
  * multiply
    6
    ^ power
      * multiply
        2
        2
      2
  2
evaluated: 48

ChangeLog

2009-09-05 - Timo Bingmann - v0.1.42008-10-23 - Timo Bingmann - v0.1.42008-08-03 - Timo Bingmann - v0.1.2

Older Downloads

Flex Bison C++ Example 0.1.3 released 2008-10-23
Source code archive: Download flex-bison-cpp-example-0.1.3.tar.bz2 (242kb)
MD5: 80ebfaa321aef9c871670f1823b071b3
  Browse doxygen documentation online
Browse source online

Flex Bison C++ Example 0.1.2 released 2008-08-03
Source code archive: Download flex-bison-cpp-example-0.1.2.tar.bz2 (241kb)
MD5: 265c45d4e47818bc368ea765f9841da6
  Browse doxygen documentation online
Browse source online

Flex Bison C++ Example 0.1 released 2007-08-20
Source code archive: Download flex-bison-cpp-example-0.1.tar.bz2 (248kb)
MD5: abba99e37c6790a4f1a914fc13611258
  Browse doxygen documentation online
Browse source online