| <!DOCTYPE html>
 | |
| <html><head>
 | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0">
 | |
| <meta http-equiv="content-type" content="text/html; charset=UTF-8">
 | |
| <link href="sqlite.css" rel="stylesheet">
 | |
| <title>The Lemon LALR(1) Parser Generator</title>
 | |
| <!-- path= -->
 | |
| </head>
 | |
| <body>
 | |
| <div class=nosearch>
 | |
| <a href="index.html">
 | |
| <img class="logo" src="images/sqlite370_banner.gif" alt="SQLite" border="0">
 | |
| </a>
 | |
| <div><!-- IE hack to prevent disappearing logo --></div>
 | |
| <div class="tagline desktoponly">
 | |
| Small. Fast. Reliable.<br>Choose any three.
 | |
| </div>
 | |
| <div class="menu mainmenu">
 | |
| <ul>
 | |
| <li><a href="index.html">Home</a>
 | |
| <li class='mobileonly'><a href="javascript:void(0)" onclick='toggle_div("submenu")'>Menu</a>
 | |
| <li class='wideonly'><a href='about.html'>About</a>
 | |
| <li class='desktoponly'><a href="docs.html">Documentation</a>
 | |
| <li class='desktoponly'><a href="download.html">Download</a>
 | |
| <li class='wideonly'><a href='copyright.html'>License</a>
 | |
| <li class='desktoponly'><a href="support.html">Support</a>
 | |
| <li class='desktoponly'><a href="prosupport.html">Purchase</a>
 | |
| <li class='search' id='search_menubutton'>
 | |
| <a href="javascript:void(0)" onclick='toggle_search()'>Search</a>
 | |
| </ul>
 | |
| </div>
 | |
| <div class="menu submenu" id="submenu">
 | |
| <ul>
 | |
| <li><a href='about.html'>About</a>
 | |
| <li><a href='docs.html'>Documentation</a>
 | |
| <li><a href='download.html'>Download</a>
 | |
| <li><a href='support.html'>Support</a>
 | |
| <li><a href='prosupport.html'>Purchase</a>
 | |
| </ul>
 | |
| </div>
 | |
| <div class="searchmenu" id="searchmenu">
 | |
| <form method="GET" action="search">
 | |
| <select name="s" id="searchtype">
 | |
| <option value="d">Search Documentation</option>
 | |
| <option value="c">Search Changelog</option>
 | |
| </select>
 | |
| <input type="text" name="q" id="searchbox" value="">
 | |
| <input type="submit" value="Go">
 | |
| </form>
 | |
| </div>
 | |
| </div>
 | |
| <script>
 | |
| function toggle_div(nm) {
 | |
| var w = document.getElementById(nm);
 | |
| if( w.style.display=="block" ){
 | |
| w.style.display = "none";
 | |
| }else{
 | |
| w.style.display = "block";
 | |
| }
 | |
| }
 | |
| function toggle_search() {
 | |
| var w = document.getElementById("searchmenu");
 | |
| if( w.style.display=="block" ){
 | |
| w.style.display = "none";
 | |
| } else {
 | |
| w.style.display = "block";
 | |
| setTimeout(function(){
 | |
| document.getElementById("searchbox").focus()
 | |
| }, 30);
 | |
| }
 | |
| }
 | |
| function div_off(nm){document.getElementById(nm).style.display="none";}
 | |
| window.onbeforeunload = function(e){div_off("submenu");}
 | |
| /* Disable the Search feature if we are not operating from CGI, since */
 | |
| /* Search is accomplished using CGI and will not work without it. */
 | |
| if( !location.origin || !location.origin.match || !location.origin.match(/http/) ){
 | |
| document.getElementById("search_menubutton").style.display = "none";
 | |
| }
 | |
| /* Used by the Hide/Show button beside syntax diagrams, to toggle the */
 | |
| function hideorshow(btn,obj){
 | |
| var x = document.getElementById(obj);
 | |
| var b = document.getElementById(btn);
 | |
| if( x.style.display!='none' ){
 | |
| x.style.display = 'none';
 | |
| b.innerHTML='show';
 | |
| }else{
 | |
| x.style.display = '';
 | |
| b.innerHTML='hide';
 | |
| }
 | |
| return false;
 | |
| }
 | |
| </script>
 | |
| </div>
 | |
| <div class=fancy>
 | |
| <div class=nosearch>
 | |
| <div class="fancy_title">
 | |
| The Lemon LALR(1) Parser Generator
 | |
| </div>
 | |
| <div class="fancy_toc">
 | |
| <a onclick="toggle_toc()">
 | |
| <span class="fancy_toc_mark" id="toc_mk">►</span>
 | |
| Table Of Contents
 | |
| </a>
 | |
| <div id="toc_sub"><div class="fancy-toc1"><a href="#overview">1. Overview</a></div>
 | |
| <div class="fancy-toc2"><a href="#lemon_source_files_and_documentation">1.1. Lemon Source Files And Documentation</a></div>
 | |
| <div class="fancy-toc1"><a href="#advantages_of_lemon">2. Advantages of Lemon</a></div>
 | |
| <div class="fancy-toc2"><a href="#use_of_lemon_within_sqlite">2.1. Use of Lemon Within SQLite</a></div>
 | |
| <div class="fancy-toc2"><a href="#lemon_customizations_especially_for_sqlite">2.2. Lemon Customizations Especially For SQLite</a></div>
 | |
| <div class="fancy-toc1"><a href="#history_of_lemon">3. History Of Lemon</a></div>
 | |
| </div>
 | |
| </div>
 | |
| <script>
 | |
| function toggle_toc(){
 | |
| var sub = document.getElementById("toc_sub")
 | |
| var mk = document.getElementById("toc_mk")
 | |
| if( sub.style.display!="block" ){
 | |
| sub.style.display = "block";
 | |
| mk.innerHTML = "▼";
 | |
| } else {
 | |
| sub.style.display = "none";
 | |
| mk.innerHTML = "►";
 | |
| }
 | |
| }
 | |
| </script>
 | |
| </div>
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| <h1 id="overview"><span>1. </span>Overview</h1>
 | |
| 
 | |
| <p>The SQL language parser for SQLite is generated using a code-generator
 | |
| program called "Lemon".  The Lemon program reads a grammar of the input
 | |
| language and emits C-code to implement a parser for that language.
 | |
| 
 | |
| 
 | |
| </p><h2 id="lemon_source_files_and_documentation"><span>1.1. </span>Lemon Source Files And Documentation</h2>
 | |
| 
 | |
| <p>Lemon does not have its own source repository.  Rather, Lemon consists
 | |
| of a few files in the SQLite source tree:
 | |
| 
 | |
| </p><ul>
 | |
| <li><p>
 | |
|      <a href="https://sqlite.org/src/doc/trunk/doc/lemon.html">lemon.html</a> →
 | |
|      The original detailed usage documentation and programmer's reference
 | |
|      for Lemon.
 | |
| </p></li><li><p>
 | |
|      <a href="https://sqlite.org/src/file/tool/lemon.c">lemon.c</a> → The source code
 | |
|      for the utility program that reads a grammar file and generates 
 | |
|      corresponding parser C-code.
 | |
| </p></li><li><p>
 | |
|      <a href="https://sqlite.org/src/file/tool/lempar.c">lempar.c</a> → A template
 | |
|      for the generated parser C-code.  The "lemon" utility program reads this
 | |
|      template and inserts additional code in order to generate a parser.
 | |
| </p></li></ul>
 | |
| 
 | |
| <h1 id="advantages_of_lemon"><span>2. </span>Advantages of Lemon</h1>
 | |
| 
 | |
| <p>Lemon generates an LALR(1) parser.  Its operation is similar to the
 | |
| more familiar tools <a href="https://en.wikipedia.org/wiki/Yacc">Yacc</a> and
 | |
| <a href="https://en.wikipedia.org/wiki/GNU_bison">Bison</a>, but Lemon adds important
 | |
| improvements, including:
 | |
| 
 | |
| </p><ul>
 | |
| <li><p>
 | |
|      The grammar syntax is less error prone, using symbolic names for
 | |
|      semantic values rather than the "$1"-style positional notation
 | |
|      of Yacc.
 | |
| </p></li><li><p>
 | |
|      In Lemon, the tokenizer calls the parser.  Yacc operates the other
 | |
|      way around, with the parser calling the tokenizer.  The Lemon
 | |
|      approach is reentrant and threadsafe, whereas Yacc uses global 
 | |
|      variables and is therefore neither.  Reentrancy is especially
 | |
|      important for SQLite since some SQL statements make recursive calls
 | |
|      to the parser.  For example, when parsing a CREATE TABLE statement,
 | |
|      SQLite invokes the parser recursively to generate an INSERT statement
 | |
|      to make a new entry in the <a href="schematab.html">sqlite_schema</a> table.
 | |
| </p></li><li><p>
 | |
|      Lemon has the concept of a non-terminal destructor that can be
 | |
|      used to reclaim memory or other resources following a syntax error
 | |
|      or other aborted parse.
 | |
| </p></li></ul>
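<p>The push-style control flow described in the list above can be sketched in C.
This is a toy illustration under stated assumptions, not code generated by Lemon:
the names ToyParser, toyInit, toyPush, and toyParseString, the token codes, and the
"grammar" (numbers separated by plus signs) are all hypothetical.  It only aims to
show that the tokenizer drives the parser one token at a time, and that all parser
state lives in a caller-owned struct rather than in global variables.</p>

```c
/* Hypothetical token codes for a toy "NUM (PLUS NUM)*" language. */
enum { TK_END = 0, TK_NUM = 1, TK_PLUS = 2 };

/* All parser state lives in this struct; nothing is global. */
typedef struct ToyParser {
  int expectNum;   /* 1 when the next token must be a number */
  long sum;        /* Running value of the expression */
  int nErr;        /* Number of syntax errors seen */
  int done;        /* 1 once TK_END has been accepted */
} ToyParser;

static void toyInit(ToyParser *p){
  p->expectNum = 1;
  p->sum = 0;
  p->nErr = 0;
  p->done = 0;
}

/* Push one token into the parser.  The tokenizer calls this routine. */
static void toyPush(ToyParser *p, int tok, long val){
  if( p->nErr || p->done ) return;      /* Ignore input after error or end */
  if( p->expectNum ){
    if( tok==TK_NUM ){
      p->sum += val;
      p->expectNum = 0;
    }else{
      p->nErr++;
    }
  }else{
    if( tok==TK_PLUS ){
      p->expectNum = 1;
    }else if( tok==TK_END ){
      p->done = 1;
    }else{
      p->nErr++;
    }
  }
}

/* A minimal tokenizer that drives the parser.  Returns 1 on success. */
static int toyParseString(ToyParser *p, const char *z){
  toyInit(p);
  while( *z ){
    if( *z==' ' ){
      z++;
    }else if( *z>='0' && *z<='9' ){
      long v = 0;
      while( *z>='0' && *z<='9' ){ v = v*10 + (*z - '0'); z++; }
      toyPush(p, TK_NUM, v);
    }else if( *z=='+' ){
      toyPush(p, TK_PLUS, 0);
      z++;
    }else{
      p->nErr++;                        /* Unrecognized character */
      z++;
    }
  }
  toyPush(p, TK_END, 0);                /* Signal end of input */
  return p->nErr==0 && p->done;
}
```

<p>Because each ToyParser is independent, a caller could begin a second parse while
the first is still in progress, which is the property the recursive CREATE TABLE
case above depends on.</p>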
 | |
| 
 | |
| <h2 id="use_of_lemon_within_sqlite"><span>2.1. </span>Use of Lemon Within SQLite</h2>
 | |
| 
 | |
| <p>Lemon is used in two places in SQLite.
 | |
| 
 | |
| </p><p>The primary use of Lemon is to create the SQL language parser.
 | |
| A grammar file (<a href="https://sqlite.org/src/file/src/parse.y">parse.y</a>) is
 | |
| compiled by Lemon into parse.c and parse.h.  The parse.c file is
 | |
| incorporated into the <a href="amalgamation.html">amalgamation</a> without further modification.
 | |
| 
 | |
| </p><p>Lemon is also used to generate the parser for the query pattern
 | |
| expressions in the <a href="fts5.html">FTS5</a> extension.  In this case, the input grammar
 | |
| file is <a href="https://sqlite.org/src/file/ext/fts5/fts5parse.y">fts5parse.y</a>.
 | |
| 
 | |
| </p><h2 id="lemon_customizations_especially_for_sqlite"><span>2.2. </span>Lemon Customizations Especially For SQLite</h2>
 | |
| 
 | |
| <p>One of the advantages of hosting code generator tools as part of
 | |
| the project is that the tools can be optimized to serve specific needs of
 | |
| the overall project.  Lemon has benefited from this effect. Over the years,
 | |
| the Lemon parser generator has been extended and enhanced to provide
 | |
| new capabilities and improved performance to SQLite.  Enhancements to
 | |
| Lemon that were designed specifically for use
 | |
| by SQLite include:
 | |
| 
 | |
| </p><ul>
 | |
| <li><p>
 | |
| Lemon has the concept of a "fallback" token.
 | |
| The SQL language contains a large number of keywords and these keywords
 | |
| have the potential to collide with identifier names.
 | |
| Lemon has the ability to designate some keywords as being able to
 | |
| "fallback" to an identifier.  If the keyword appears in the input token
 | |
| stream in a context that would otherwise be a syntax error, the token
 | |
| is automatically transformed into its fallback before the syntax error
 | |
| is raised.  This feature allows the parser to be very forgiving of
 | |
| reserved words used as identifiers, which is a problem that comes up
 | |
| frequently in the SQL language.
 | |
| 
 | |
| </p></li><li><p>
 | |
| In support of the <a href="testing.html#mcdc">100% MC/DC testing</a> goal for SQLite, 
 | |
| the parser code generated by Lemon has no unreachable branches,
 | |
| and contains extra (compile-time selected) instrumentation useful
 | |
| for measuring test coverage.
 | |
| 
 | |
| </p></li><li><p>
 | |
| Lemon supports conditional compilation of grammar file rules, so that
 | |
| a different parser can be generated depending on compile-time options.
 | |
| 
 | |
| </p></li><li><p>
 | |
| As a performance optimization, reduce actions in the Lemon input grammar
 | |
| are allowed to contain comments of the form "/*A-overwrites-Z*/" to indicate
 | |
| that the semantic value "A" on the left-hand side of the rule is allowed
 | |
| to directly overwrite the semantic value "Z" on the right-hand side.
 | |
| This simple optimization reduces the number of stack operations in the
 | |
| push-down automaton used to parse the input grammar, and thus improves
 | |
| performance of the parser.  It also makes the generated code a little smaller.
 | |
| </p></li></ul>
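<p>The fallback idea from the list above can be sketched in C.  This is a simplified
model under stated assumptions, not the code Lemon generates: the token codes, the
aFallback table, and the functions isAcceptable and resolveToken are hypothetical,
and a four-entry table of expected tokens stands in for the real parser's action
lookup.  The sketch shows only the order of operations: try the token as given,
and if that would be a syntax error, retry with its fallback before reporting
the error.</p>

```c
/* Hypothetical token codes; a real grammar defines its own. */
enum { TK_ID = 1, TK_SELECT = 2, TK_KEY = 3, TK_FROM = 4, N_TOKEN = 5 };

/* aFallback[t] is the token that t may turn into, or 0 if it has none.
** In this toy table only KEY may fall back to ID. */
static const int aFallback[N_TOKEN] = { 0, 0, 0, TK_ID, 0 };

/* Toy stand-in for the parser's action lookup.  States 0..3 expect
** SELECT, ID, FROM, ID in that order. */
static int isAcceptable(int state, int tok){
  static const int aExpect[4] = { TK_SELECT, TK_ID, TK_FROM, TK_ID };
  return state>=0 && state<4 && aExpect[state]==tok;
}

/* Return the token code the parser should actually use in the given
** state, or 0 if the token is a syntax error even after fallback. */
static int resolveToken(int state, int tok){
  if( isAcceptable(state, tok) ) return tok;
  if( tok>0 && tok<N_TOKEN ){
    int fb = aFallback[tok];
    if( fb!=0 && isAcceptable(state, fb) ) return fb;
  }
  return 0;
}
```

<p>In this model, the keyword KEY is accepted wherever an identifier is expected,
while SELECT in the same position remains a syntax error because it has no
fallback.</p>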
 | |
| 
 | |
| <p>The parsing of SQL statements is a significant consumer of CPU cycles 
 | |
| in any SQL database engine.  Ongoing efforts to optimize SQLite have caused
 | |
| the developers to spend a lot of time tweaking Lemon to generate faster
 | |
| parsers.  These efforts have benefited all users of the Lemon parser generator,
 | |
| not just SQLite.  But if Lemon had been a separately maintained tool, it
 | |
| would have been more difficult to make coordinated changes to both SQLite
 | |
| and Lemon, and as a result not as much optimization would have been
 | |
| accomplished.  Hence, the fact that the parser generator tool is included
 | |
| in the source tree for SQLite has turned out to be a net benefit for both
 | |
| the tool itself and for SQLite.
 | |
| 
 | |
| </p><h1 id="history_of_lemon"><span>3. </span>History Of Lemon</h1>
 | |
| 
 | |
| <p>Lemon was originally written by D. Richard Hipp (also the creator of SQLite)
 | |
| while he was in graduate school at Duke University between 1987 and 1992.
 | |
| The original creation date of Lemon has been lost, but was probably sometime
 | |
| around 1990.  Lemon generates an LALR(1) parser.  There was a companion 
 | |
| LL(1) parser generator tool named "Lime", but the source code for Lime
 | |
| has been lost.
 | |
| 
 | |
| </p><p>The Lemon source code was originally written as separate source files,
 | |
| and only later merged into a single "lemon.c" source file.
 | |
| 
 | |
| </p><p>The author of Lemon and SQLite (Hipp) reports that his C programming
 | |
| skills were greatly enhanced by studying John Ousterhout's original
 | |
| source code to Tcl.  Hipp discovered and studied Tcl in 1993.  Lemon
 | |
| was written before then, and SQLite afterwards.  There is a clear
 | |
| difference in the coding styles of these two products, with SQLite seeming
 | |
| to be cleaner, more readable, and easier to maintain.
 | |
| </p>
 |