<!DOCTYPE html>
<html><head>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<link href="sqlite.css" rel="stylesheet">
<title>The Lemon LALR(1) Parser Generator</title>
<!-- path= -->
</head>
<body>
<div class=nosearch>
<a href="index.html">
<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite" border="0">
</a>
<div><!-- IE hack to prevent disappearing logo --></div>
<div class="tagline desktoponly">
Small. Fast. Reliable.<br>Choose any three.
</div>
<div class="menu mainmenu">
<ul>
<li><a href="index.html">Home</a>
<li class='mobileonly'><a href="javascript:void(0)" onclick='toggle_div("submenu")'>Menu</a>
<li class='wideonly'><a href='about.html'>About</a>
<li class='desktoponly'><a href="docs.html">Documentation</a>
<li class='desktoponly'><a href="download.html">Download</a>
<li class='wideonly'><a href='copyright.html'>License</a>
<li class='desktoponly'><a href="support.html">Support</a>
<li class='desktoponly'><a href="prosupport.html">Purchase</a>
<li class='search' id='search_menubutton'>
<a href="javascript:void(0)" onclick='toggle_search()'>Search</a>
</ul>
</div>
<div class="menu submenu" id="submenu">
<ul>
<li><a href='about.html'>About</a>
<li><a href='docs.html'>Documentation</a>
<li><a href='download.html'>Download</a>
<li><a href='support.html'>Support</a>
<li><a href='prosupport.html'>Purchase</a>
</ul>
</div>
<div class="searchmenu" id="searchmenu">
<form method="GET" action="search">
<select name="s" id="searchtype">
<option value="d">Search Documentation</option>
<option value="c">Search Changelog</option>
</select>
<input type="text" name="q" id="searchbox" value="">
<input type="submit" value="Go">
</form>
</div>
</div>
<script>
function toggle_div(nm) {
  var w = document.getElementById(nm);
  if( w.style.display=="block" ){
    w.style.display = "none";
  }else{
    w.style.display = "block";
  }
}
function toggle_search() {
  var w = document.getElementById("searchmenu");
  if( w.style.display=="block" ){
    w.style.display = "none";
  } else {
    w.style.display = "block";
    setTimeout(function(){
      document.getElementById("searchbox").focus()
    }, 30);
  }
}
function div_off(nm){document.getElementById(nm).style.display="none";}
window.onbeforeunload = function(e){div_off("submenu");}
/* Disable the Search feature if we are not operating from CGI, since */
/* Search is accomplished using CGI and will not work without it. */
if( !location.origin || !location.origin.match || !location.origin.match(/http/) ){
  document.getElementById("search_menubutton").style.display = "none";
}
/* Used by the Hide/Show button beside syntax diagrams, to toggle the */
/* visibility of the diagram. */
function hideorshow(btn,obj){
  var x = document.getElementById(obj);
  var b = document.getElementById(btn);
  if( x.style.display!='none' ){
    x.style.display = 'none';
    b.innerHTML='show';
  }else{
    x.style.display = '';
    b.innerHTML='hide';
  }
  return false;
}
</script>
</div>
<div class=fancy>
<div class=nosearch>
<div class="fancy_title">
The Lemon LALR(1) Parser Generator
</div>
<div class="fancy_toc">
<a onclick="toggle_toc()">
<span class="fancy_toc_mark" id="toc_mk">►</span>
Table Of Contents
</a>
<div id="toc_sub"><div class="fancy-toc1"><a href="#overview">1. Overview</a></div>
<div class="fancy-toc2"><a href="#lemon_source_files_and_documentation">1.1. Lemon Source Files And Documentation</a></div>
<div class="fancy-toc1"><a href="#advantages_of_lemon">2. Advantages of Lemon</a></div>
<div class="fancy-toc2"><a href="#use_of_lemon_within_sqlite">2.1. Use of Lemon Within SQLite</a></div>
<div class="fancy-toc2"><a href="#lemon_customizations_especially_for_sqlite">2.2. Lemon Customizations Especially For SQLite</a></div>
<div class="fancy-toc1"><a href="#history_of_lemon">3. History Of Lemon</a></div>
</div>
</div>
<script>
function toggle_toc(){
  var sub = document.getElementById("toc_sub")
  var mk = document.getElementById("toc_mk")
  if( sub.style.display!="block" ){
    sub.style.display = "block";
    mk.innerHTML = "▼";
  } else {
    sub.style.display = "none";
    mk.innerHTML = "►";
  }
}
</script>
</div>

<h1 id="overview"><span>1. </span>Overview</h1>

<p>The SQL language parser for SQLite is generated using a code-generator
program called "Lemon".  The Lemon program reads a grammar of the input
language and emits C-code to implement a parser for that language.

</p><h2 id="lemon_source_files_and_documentation"><span>1.1. </span>Lemon Source Files And Documentation</h2>

<p>Lemon does not have its own source repository.  Rather, Lemon consists
of a few files in the SQLite source tree:

</p><ul>
<li><p>
     <a href="https://sqlite.org/src/doc/trunk/doc/lemon.html">lemon.html</a> →
     The original detailed usage documentation and programmer's reference
     for Lemon.
</p></li><li><p>
     <a href="https://sqlite.org/src/file/tool/lemon.c">lemon.c</a> → The source code
     for the utility program that reads a grammar file and generates
     corresponding parser C-code.
</p></li><li><p>
     <a href="https://sqlite.org/src/file/tool/lempar.c">lempar.c</a> → A template
     for the generated parser C-code.  The "lemon" utility program reads this
     template and inserts additional code in order to generate a parser.
</p></li></ul>


<h1 id="advantages_of_lemon"><span>2. </span>Advantages of Lemon</h1>

<p>Lemon generates an LALR(1) parser.  Its operation is similar to the
more familiar tools <a href="https://en.wikipedia.org/wiki/Yacc">Yacc</a> and
<a href="https://en.wikipedia.org/wiki/GNU_bison">Bison</a>, but Lemon adds important
improvements, including:

</p><ul>
<li><p>
     The grammar syntax is less error-prone, using symbolic names for
     semantic values rather than the "$1"-style positional notation
     of Yacc.
</p></li><li><p>
     In Lemon, the tokenizer calls the parser.  Yacc operates the other
     way around, with the parser calling the tokenizer.  The Lemon
     approach is reentrant and threadsafe, whereas Yacc uses global
     variables and is therefore neither.  Reentrancy is especially
     important for SQLite since some SQL statements make recursive calls
     to the parser.  For example, when parsing a CREATE TABLE statement,
     SQLite invokes the parser recursively to generate an INSERT statement
     to make a new entry in the <a href="schematab.html">sqlite_schema</a> table.
</p></li><li><p>
     Lemon has the concept of a non-terminal destructor that can be
     used to reclaim memory or other resources following a syntax error
     or other aborted parse.
</p></li></ul>
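
<p>The "tokenizer calls the parser" arrangement can be sketched with a
small hand-written example.  The code below is not Lemon output, and the
names SumParser, sum_parser_init, sum_parser_push, sum_parse, and the
token codes are all hypothetical, chosen only for this illustration.
It shows the two properties described above: the tokenizer loop drives
the parser one token at a time, and all parser state lives in a
caller-owned object rather than in global variables.  Parsers generated
by Lemon expose a comparable push-style interface through the
ParseAlloc(), Parse(), and ParseFree() routines.</p>

```c
/* Hand-written push-style parser for sums such as "1+2+3".
** This is NOT Lemon output; it only illustrates the calling convention. */
#include <ctype.h>
#include <stdlib.h>

enum { TK_END = 0, TK_NUM = 1, TK_PLUS = 2 };  /* hypothetical token codes */

/* All parser state lives in a caller-owned object, never in globals. */
typedef struct SumParser {
  int  expectNum;   /* 1 if the next token must be a number */
  int  failed;      /* Set to 1 after the first syntax error */
  int  done;        /* Set to 1 once TK_END has been accepted */
  long total;       /* Running semantic value */
} SumParser;

void sum_parser_init(SumParser *p){
  p->expectNum = 1;
  p->failed = 0;
  p->done = 0;
  p->total = 0;
}

/* The tokenizer "pushes" one token at a time into the parser. */
void sum_parser_push(SumParser *p, int tok, long val){
  if( p->failed || p->done ) return;
  if( p->expectNum ){
    if( tok==TK_NUM ){
      p->total += val;
      p->expectNum = 0;
    }else{
      p->failed = 1;
    }
  }else if( tok==TK_PLUS ){
    p->expectNum = 1;
  }else if( tok==TK_END ){
    p->done = 1;
  }else{
    p->failed = 1;
  }
}

/* Tokenizer loop: it owns the control flow and calls the parser.
** Returns 0 and writes *pOut on success, or -1 on a syntax error. */
int sum_parse(const char *z, long *pOut){
  SumParser p;
  sum_parser_init(&p);
  while( *z ){
    if( isspace((unsigned char)*z) ){
      z++;
    }else if( isdigit((unsigned char)*z) ){
      char *zEnd;
      long v = strtol(z, &zEnd, 10);
      sum_parser_push(&p, TK_NUM, v);
      z = zEnd;
    }else if( *z=='+' ){
      sum_parser_push(&p, TK_PLUS, 0);
      z++;
    }else{
      return -1;    /* Unknown character */
    }
  }
  sum_parser_push(&p, TK_END, 0);
  if( p.failed || !p.done ) return -1;
  *pOut = p.total;
  return 0;
}
```

<p>Because each SumParser object is independent, a caller can start a
second parse (by calling sum_parse() or by initializing another
SumParser) while an earlier one is still in progress, which is the
property that recursive parser invocations rely on.</p>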

<h2 id="use_of_lemon_within_sqlite"><span>2.1. </span>Use of Lemon Within SQLite</h2>

<p>Lemon is used in two places in SQLite.

</p><p>The primary use of Lemon is to create the SQL language parser.
A grammar file (<a href="https://sqlite.org/src/file/src/parse.y">parse.y</a>) is
compiled by Lemon into parse.c and parse.h.  The parse.c file is
incorporated into the <a href="amalgamation.html">amalgamation</a> without further modification.

</p><p>Lemon is also used to generate the parser for the query pattern
expressions in the <a href="fts5.html">FTS5</a> extension.  In this case, the input grammar
file is <a href="https://sqlite.org/src/file/ext/fts5/fts5parse.y">fts5parse.y</a>.

</p><h2 id="lemon_customizations_especially_for_sqlite"><span>2.2. </span>Lemon Customizations Especially For SQLite</h2>

<p>One of the advantages of hosting code generator tools as part of
the project is that the tools can be optimized to serve specific needs of
the overall project.  Lemon has benefited from this effect. Over the years,
the Lemon parser generator has been extended and enhanced to provide
new capabilities and improved performance to SQLite.  A few of the
enhancements to Lemon that were designed specifically for use
by SQLite include:

</p><ul>
<li><p>
Lemon has the concept of a "fallback" token.
The SQL language contains a large number of keywords and these keywords
have the potential to collide with identifier names.
Lemon has the ability to designate some keywords as being able to
"fall back" to an identifier.  If the keyword appears in the input token
stream in a context that would otherwise be a syntax error, the token
is automatically transformed into its fallback before the syntax error
is raised.  This feature allows the parser to be very forgiving of
reserved words used as identifiers, which is a problem that comes up
frequently in the SQL language.

</p></li><li><p>
In support of the <a href="testing.html#mcdc">100% MC/DC testing</a> goal for SQLite,
the parser code generated by Lemon has no unreachable branches,
and contains extra (compile-time selected) instrumentation useful
for measuring test coverage.

</p></li><li><p>
Lemon supports conditional compilation of grammar file rules, so that
a different parser can be generated depending on compile-time options.

</p></li><li><p>
As a performance optimization, reduce actions in the Lemon input grammar
are allowed to contain comments of the form "/*A-overwrites-Z*/" to indicate
that the semantic value "A" on the left-hand side of the rule is allowed
to directly overwrite the semantic value "Z" of the left-most symbol on
the right-hand side.
This simple optimization reduces the number of stack operations in the
push-down automaton that implements the generated parser, and thus improves
performance of the parser.  It also makes the generated code a little smaller.
</p></li></ul>
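
<p>The fallback behavior described in the list above can be sketched in
a few lines of C.  This is a simplified illustration and not the code
that Lemon generates: the token codes, the two parser states, and the
functions is_acceptable() and resolve_token() are hypothetical, and the
choice of ABORT and KEY as keywords that fall back to ID is made only
for the example.  In a real Lemon grammar, fallbacks are declared with
the %fallback directive.</p>

```c
/* Sketch of fallback-token resolution.  Not Lemon's generated code. */

/* Hypothetical token codes.  Code 0 is reserved to mean "no token". */
enum { TK_ID = 1, TK_SELECT, TK_FROM, TK_ABORT, TK_KEY, TK_COUNT };

/* aFallback[t] is the token that t may fall back to, or 0 for none.
** In this example ABORT and KEY may be treated as identifiers, but
** SELECT and FROM may not. */
static const int aFallback[TK_COUNT] = {
  0,      /* (unused)  */
  0,      /* TK_ID     */
  0,      /* TK_SELECT */
  0,      /* TK_FROM   */
  TK_ID,  /* TK_ABORT  */
  TK_ID   /* TK_KEY    */
};

/* Hypothetical parser states. */
enum { ST_START, ST_WANT_NAME };

/* Return 1 if token tok is legal in the given state, else 0. */
static int is_acceptable(int state, int tok){
  switch( state ){
    case ST_START:     return tok==TK_SELECT;
    case ST_WANT_NAME: return tok==TK_ID;
    default:           return 0;
  }
}

/* Return the token code the parser should act on: tok itself if it is
** legal in this state, its fallback if only the fallback is legal, or
** 0 if the input is a genuine syntax error. */
int resolve_token(int state, int tok){
  if( is_acceptable(state, tok) ) return tok;
  if( tok>0 && tok<TK_COUNT ){
    int fb = aFallback[tok];
    if( fb!=0 && is_acceptable(state, fb) ) return fb;
  }
  return 0;
}
```

<p>With these definitions, resolve_token(ST_WANT_NAME, TK_ABORT) yields
TK_ID, so the keyword is accepted as a name, whereas
resolve_token(ST_WANT_NAME, TK_FROM) yields 0 because FROM has no
fallback and the syntax error stands.</p>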

<p>The parsing of SQL statements is a significant consumer of CPU cycles
in any SQL database engine.  Ongoing efforts to optimize SQLite have caused
the developers to spend a lot of time tweaking Lemon to generate faster
parsers.  These efforts have benefited all users of the Lemon parser generator,
not just SQLite.  But if Lemon had been a separately maintained tool, it
would have been more difficult to make coordinated changes to both SQLite
and Lemon, and as a result not as much optimization would have been
accomplished.  Hence, the fact that the parser generator tool is included
in the source tree for SQLite has turned out to be a net benefit for both
the tool itself and for SQLite.

</p><h1 id="history_of_lemon"><span>3. </span>History Of Lemon</h1>

<p>Lemon was originally written by D. Richard Hipp (also the creator of SQLite)
while he was in graduate school at Duke University between 1987 and 1992.
The original creation date of Lemon has been lost, but was probably sometime
around 1990.  Lemon generates an LALR(1) parser.  There was a companion
LL(1) parser generator tool named "Lime", but the source code for Lime
has been lost.

</p><p>The Lemon source code was originally written as separate source files,
and only later merged into a single "lemon.c" source file.

</p><p>The author of Lemon and SQLite (Hipp) reports that his C programming
skills were greatly enhanced by studying John Ousterhout's original
source code for Tcl.  Hipp discovered and studied Tcl in 1993.  Lemon
was written before then, and SQLite afterwards.  There is a clear
difference in the coding styles of these two products, with SQLite seeming
to be cleaner, more readable, and easier to maintain.
</p>