271 lines
11 KiB
HTML
271 lines
11 KiB
HTML
<!DOCTYPE html>
|
|
<html><head>
|
|
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
|
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
|
|
<link href="sqlite.css" rel="stylesheet">
|
|
<title>The Lemon LALR(1) Parser Generator</title>
|
|
<!-- path= -->
|
|
</head>
|
|
<body>
|
|
<div class=nosearch>
|
|
<a href="index.html">
|
|
<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite" border="0">
|
|
</a>
|
|
<div><!-- IE hack to prevent disappearing logo --></div>
|
|
<div class="tagline desktoponly">
|
|
Small. Fast. Reliable.<br>Choose any three.
|
|
</div>
|
|
<div class="menu mainmenu">
|
|
<ul>
|
|
<li><a href="index.html">Home</a>
|
|
<li class='mobileonly'><a href="javascript:void(0)" onclick='toggle_div("submenu")'>Menu</a>
|
|
<li class='wideonly'><a href='about.html'>About</a>
|
|
<li class='desktoponly'><a href="docs.html">Documentation</a>
|
|
<li class='desktoponly'><a href="download.html">Download</a>
|
|
<li class='wideonly'><a href='copyright.html'>License</a>
|
|
<li class='desktoponly'><a href="support.html">Support</a>
|
|
<li class='desktoponly'><a href="prosupport.html">Purchase</a>
|
|
<li class='search' id='search_menubutton'>
|
|
<a href="javascript:void(0)" onclick='toggle_search()'>Search</a>
|
|
</ul>
|
|
</div>
|
|
<div class="menu submenu" id="submenu">
|
|
<ul>
|
|
<li><a href='about.html'>About</a>
|
|
<li><a href='docs.html'>Documentation</a>
|
|
<li><a href='download.html'>Download</a>
|
|
<li><a href='support.html'>Support</a>
|
|
<li><a href='prosupport.html'>Purchase</a>
|
|
</ul>
|
|
</div>
|
|
<div class="searchmenu" id="searchmenu">
|
|
<form method="GET" action="search">
|
|
<select name="s" id="searchtype">
|
|
<option value="d">Search Documentation</option>
|
|
<option value="c">Search Changelog</option>
|
|
</select>
|
|
<input type="text" name="q" id="searchbox" value="">
|
|
<input type="submit" value="Go">
|
|
</form>
|
|
</div>
|
|
</div>
|
|
<script>
|
|
function toggle_div(nm) {
|
|
var w = document.getElementById(nm);
|
|
if( w.style.display=="block" ){
|
|
w.style.display = "none";
|
|
}else{
|
|
w.style.display = "block";
|
|
}
|
|
}
|
|
function toggle_search() {
|
|
var w = document.getElementById("searchmenu");
|
|
if( w.style.display=="block" ){
|
|
w.style.display = "none";
|
|
} else {
|
|
w.style.display = "block";
|
|
setTimeout(function(){
|
|
document.getElementById("searchbox").focus()
|
|
}, 30);
|
|
}
|
|
}
|
|
function div_off(nm){document.getElementById(nm).style.display="none";}
|
|
window.onbeforeunload = function(e){div_off("submenu");}
|
|
/* Disable the Search feature if we are not operating from CGI, since */
|
|
/* Search is accomplished using CGI and will not work without it. */
|
|
if( !location.origin || !location.origin.match || !location.origin.match(/http/) ){
|
|
document.getElementById("search_menubutton").style.display = "none";
|
|
}
|
|
/* Used by the Hide/Show button beside syntax diagrams, to toggle the */
|
|
function hideorshow(btn,obj){
|
|
var x = document.getElementById(obj);
|
|
var b = document.getElementById(btn);
|
|
if( x.style.display!='none' ){
|
|
x.style.display = 'none';
|
|
b.innerHTML='show';
|
|
}else{
|
|
x.style.display = '';
|
|
b.innerHTML='hide';
|
|
}
|
|
return false;
|
|
}
|
|
</script>
|
|
</div>
|
|
<div class=fancy>
|
|
<div class=nosearch>
|
|
<div class="fancy_title">
|
|
The Lemon LALR(1) Parser Generator
|
|
</div>
|
|
<div class="fancy_toc">
|
|
<a onclick="toggle_toc()">
|
|
<span class="fancy_toc_mark" id="toc_mk">►</span>
|
|
Table Of Contents
|
|
</a>
|
|
<div id="toc_sub"><div class="fancy-toc1"><a href="#overview">1. Overview</a></div>
|
|
<div class="fancy-toc2"><a href="#lemon_source_files_and_documentation">1.1. Lemon Source Files And Documentation</a></div>
|
|
<div class="fancy-toc1"><a href="#advantages_of_lemon">2. Advantages of Lemon</a></div>
|
|
<div class="fancy-toc2"><a href="#use_of_lemon_within_sqlite">2.1. Use of Lemon Within SQLite</a></div>
|
|
<div class="fancy-toc2"><a href="#lemon_customizations_especially_for_sqlite">2.2. Lemon Customizations Especially For SQLite</a></div>
|
|
<div class="fancy-toc1"><a href="#history_of_lemon">3. History Of Lemon</a></div>
|
|
</div>
|
|
</div>
|
|
<script>
|
|
function toggle_toc(){
|
|
var sub = document.getElementById("toc_sub")
|
|
var mk = document.getElementById("toc_mk")
|
|
if( sub.style.display!="block" ){
|
|
sub.style.display = "block";
|
|
mk.innerHTML = "▼";
|
|
} else {
|
|
sub.style.display = "none";
|
|
mk.innerHTML = "►";
|
|
}
|
|
}
|
|
</script>
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
<h1 id="overview"><span>1. </span>Overview</h1>
|
|
|
|
<p>The SQL language parser for SQLite is generated using a code-generator
|
|
program called "Lemon". The Lemon program reads a grammar of the input
|
|
language and emits C-code to implement a parser for that language.
|
|
|
|
|
|
</p><h2 id="lemon_source_files_and_documentation"><span>1.1. </span>Lemon Source Files And Documentation</h2>
|
|
|
|
<p>Lemon does not have its own source repository. Rather, Lemon consists
|
|
of a few files in the SQLite source tree:
|
|
|
|
</p><ul>
|
|
<li><p>
|
|
<a href="https://sqlite.org/src/doc/trunk/doc/lemon.html">lemon.html</a> →
|
|
The original detailed usage documentation and programmers reference
|
|
for Lemon.
|
|
</p></li><li><p>
|
|
<a href="https://sqlite.org/src/file/tool/lemon.c">lemon.c</a> → The source code
|
|
for the utility program that reads a grammar file and generates
|
|
corresponding parser C-code.
|
|
</p></li><li><p>
|
|
<a href="https://sqlite.org/src/file/tool/lempar.c">lempar.c</a> → A template
|
|
for the generated parser C-code. The "lemon" utility program reads this
|
|
template and inserts additional code in order to generate a parser.
|
|
</p></li></ul>
|
|
|
|
<h1 id="advantages_of_lemon"><span>2. </span>Advantages of Lemon</h1>
|
|
|
|
<p>Lemon generates an LALR(1) parser. Its operation is similar to the
|
|
more familiar tools <a href="https://en.wikipedia.org/wiki/Yacc">Yacc</a> and
|
|
<a href="https://en.wikipedia.org/wiki/GNU_bison">Bison</a>, but Lemon adds important
|
|
improvements, including:
|
|
|
|
</p><ul>
|
|
<li><p>
|
|
The grammar syntax is less error prone - using symbolic names for
|
|
semantic values rather that the "$1"-style positional notation
|
|
of Yacc.
|
|
</p></li><li><p>
|
|
In Lemon, the tokenizer calls the parser. Yacc operates the other
|
|
way around, with the parser calling the tokenizer. The Lemon
|
|
approach is reentrant and threadsafe, whereas Yacc uses global
|
|
variables and is therefore neither. Reentrancy is especially
|
|
important for SQLite since some SQL statements make recursive calls
|
|
to the parser. For example, when parsing a CREATE TABLE statement,
|
|
SQLite invokes the parser recursively to generate an INSERT statement
|
|
to make a new entry in the <a href="schematab.html">sqlite_schema</a> table.
|
|
</p></li><li><p>
|
|
Lemon has the concept of a non-terminal destructor that can be
|
|
used to reclaim memory or other resources following a syntax error
|
|
or other aborted parse.
|
|
</p></li></ul>
|
|
|
|
<h2 id="use_of_lemon_within_sqlite"><span>2.1. </span>Use of Lemon Within SQLite</h2>
|
|
|
|
<p>Lemon is used in two places in SQLite.
|
|
|
|
</p><p>The primary use of Lemon is to create the SQL language parser.
|
|
A grammar file (<a href="https://sqlite.org/src/file/src/parse.y">parse.y</a>) is
|
|
compiled by Lemon into parse.c and parse.h. The parse.c file is
|
|
incorporated into the <a href="amalgamation.html">amalgamation</a> without further modification.
|
|
|
|
</p><p>Lemon is also used to generate the parser for the query pattern
|
|
expressions in the <a href="fts5.html">FTS5</a> extension. In this case, the input grammar
|
|
file is <a href="https://sqlite.org/src/file/ext/fts5/fts5parse.y">fts5parse.y</a>.
|
|
|
|
</p><h2 id="lemon_customizations_especially_for_sqlite"><span>2.2. </span>Lemon Customizations Especially For SQLite</h2>
|
|
|
|
<p>One of the advantages of hosting code generator tools as part of
|
|
the project is that the tools can be optimized to serve specific needs of
|
|
the overall project. Lemon has benefited from this effect. Over the years,
|
|
the Lemon parser generator has been extended and enhanced to provide
|
|
new capabilities and improved performance to SQLite. A few of the
|
|
specific enhancements to Lemon that are specifically designed for use
|
|
by SQLite include:
|
|
|
|
</p><ul>
|
|
<li><p>
|
|
Lemon has the concept of a "fallback" token.
|
|
The SQL language contains a large number of keywords and these keywords
|
|
have the potential to collide with identifier names.
|
|
Lemon has the ability to designate some keywords has being able to
|
|
"fallback" to an identifier. If the keyword appears in the input token
|
|
stream in a context that would otherwise be a syntax error, the token
|
|
is automatically transformed into its fallback before the syntax error
|
|
is raised. This feature allows the parser to be very forgiving of
|
|
reserved words used as identifiers, which is a problem that comes up
|
|
frequently in the SQL language.
|
|
|
|
</p></li><li><p>
|
|
In support of the <a href="testing.html#mcdc">100% MC/DC testing</a> goal for SQLite,
|
|
the parser code generated by Lemon has no unreachable branches,
|
|
and contains extra (compile-time selected) instrumentation useful
|
|
for measuring test coverage.
|
|
|
|
</p></li><li><p>
|
|
Lemon supports conditional compilation of grammar file rules, so that
|
|
a different parser can be generated depending on compile-time options.
|
|
|
|
</p></li><li><p>
|
|
As a performance optimization, reduce actions in the Lemon input grammar
|
|
are allowed to contain comments of the form "/*A-overwrites-Z*/" to indicate
|
|
that the semantic value "A" on the right-hand side of the rule is allowed
|
|
to directly overwrite the semantic value "Z" on the left-hand side.
|
|
This simple optimization reduces the number of stack operations in the
|
|
push-down automaton used to parse the input grammar, and thus improve
|
|
performance of the parser. It also makes the generated code a little smaller.
|
|
</p></li></ul>
|
|
|
|
<p>The parsing of SQL statements is a significant consumer of CPU cycles
|
|
in any SQL database engine. On-going efforts to optimize SQLite have caused
|
|
the developers to spend a lot of time tweaking Lemon to generate faster
|
|
parsers. These efforts have benefited all users of the Lemon parser generator,
|
|
not just SQLite. But if Lemon had been a separately maintained tool, it
|
|
would have been more difficult to make coordinated changes to both SQLite
|
|
and Lemon, and as a result not as much optimization would have been
|
|
accomplished. Hence, the fact that the parser generator tool is included
|
|
in the source tree for SQLite has turned out to be a net benefit for both
|
|
the tool itself and for SQLite.
|
|
|
|
</p><h1 id="history_of_lemon"><span>3. </span>History Of Lemon</h1>
|
|
|
|
<p>Lemon was originally written by D. Richard Hipp (also the creator of SQLite)
|
|
while he was in graduate school at Duke University between 1987 and 1992.
|
|
The original creation date of Lemon has been lost, but was probably sometime
|
|
around 1990. Lemon generates an LALR(1) parser. There was a companion
|
|
LL(1) parser generator tool named "Lime", but the source code for Lime
|
|
has been lost.
|
|
|
|
</p><p>The Lemon source code was originally written as separate source files,
|
|
and only later merged into a single "lemon.c" source file.
|
|
|
|
</p><p>The author of Lemon and SQLite (Hipp) reports that his C programming
|
|
skills were greatly enhanced by studying John Ousterhout's original
|
|
source code to Tcl. Hipp discovered and studied Tcl in 1993. Lemon
|
|
was written before then, and SQLite afterwards. There is a clear
|
|
difference in the coding styles of these two products, with SQLite seeming
|
|
to be cleaner, more readable, and easier to maintain.
|
|
</p>
|