Definition
A regular grammar is a formal grammar whose production rules are restricted to forms that generate exactly the class of regular languages. In a right‑linear regular grammar, each production has the form A → aB or A → a (or A → ε), where A and B are non‑terminal symbols, a is a terminal symbol, and ε denotes the empty string. A left‑linear regular grammar uses productions of the form A → Ba or A → a (or A → ε). The languages produced by either right‑linear or left‑linear grammars coincide with the regular languages.
Overview
Regular grammars constitute the lowest level of the Chomsky hierarchy of formal grammars. They are equivalent in expressive power to deterministic and nondeterministic finite automata (DFAs and NFAs) and to regular expressions. This equivalence allows regular grammars to be used for specifying lexical syntax in compilers, for pattern‑matching utilities, and for modeling simple sequential processes. Because regular grammars cannot express nested or hierarchical structures, they are less powerful than context‑free or context‑sensitive grammars.
Etymology / Origin
The adjective “regular” in this context derives from “regular language,” a term introduced by Stephen Coleman Kleene in the 1950s when he formalized the notion of regular sets of strings using what are now called regular expressions. The concept of a regular grammar emerged shortly thereafter as a grammatical characterization of the same class of languages, aligning with Kleene’s work on automata theory.
Characteristics
| Feature | Description |
|---|---|
| Production forms | Right‑linear: A → aB, A → a, A → ε; Left‑linear: A → Ba, A → a, A → ε |
| Equivalence | Generates exactly the regular languages; equivalent to finite automata and regular expressions |
| Determinism | A regular grammar can be converted to a deterministic finite automaton (DFA) if it is right‑linear and no two productions for the same non‑terminal share the same terminal on the right‑hand side |
| Closure properties | Languages generated by regular grammars are closed under union, concatenation, Kleene star, complement, and intersection with regular languages |
| Limitations | Cannot generate languages requiring counting beyond a fixed bound (e.g., {aⁿbⁿ |
| Conversion algorithms | Systematic procedures exist to transform a regular grammar into an equivalent DFA/NFA and vice versa; similar transformations connect regular grammars with regular expressions |
| Variants | Mixed linear grammars (allowing both left‑ and right‑linear rules) may generate non‑regular languages; pure linear forms remain regular |
Related Topics
- Regular language – the class of languages accepted by finite automata and describable by regular expressions.
- Finite automaton – abstract machine (deterministic or nondeterministic) equivalent in power to regular grammars.
- Regular expression – algebraic notation for describing regular languages; transformable to and from regular grammars.
- Context‑free grammar – a more expressive grammar type that can generate languages with nested structures, such as balanced parentheses.
- Chomsky hierarchy – classification of formal grammars into regular, context‑free, context‑sensitive, and recursively enumerable types.
- Lexical analysis – phase of compilation where regular grammars (or regular expressions) define token patterns.
This entry summarizes the established understanding of regular grammars within theoretical computer science and formal language theory.