Why Sponsor Oils? | blog | oilshell.org

Three Great Videos About Regex Derivatives

2020-12-23

I usually don't post videos here, but the first one in this series got a positive reaction on Reddit, and there are two more.

I've been interested in implementing simple and efficient regex matching with the derivatives technique for a couple years. But I haven't had time to play with code that implements it, or read the original papers more carefully.

These videos helped me understand the technique despite the fact I haven't done math in over a decade. Others apparently felt the same way. The explanation is simple but not dumbed down.

Table of Contents
Context
How Does This Relate to Shell?
EasyTheory Videos
Derivative of a Regex?! (Brzozowski Derivative)
Example
Regex to DFA Directly Using Brzozowski Derivatives (No NFA)
Why Call This Operation a Derivative?
More Info

Context

For context, see the two July 2020 posts tagged #eggex, particularly this part of the second post, on derivatives.

Summary: A 2009 paper revived this regex matching technique. There have been many implementations since, though no popular ones.

How Does This Relate to Shell?

EasyTheory Videos

Derivative of a Regex?! (Brzozowski Derivative)

This video explains what a derivative is using the recursive definition of regular expressions.

 

Example

This one walks through an example of a regex and its derivative.

 

Regex to DFA Directly Using Brzozowski Derivatives (No NFA)

This one shows how to directly convert a regular expression to a DFA with derivatives. This is interesting because DFAs can be executed efficiently. See the Reddit thread for more color on this.

 

Why Call This Operation a Derivative?

I don't know, and the videos don't explain it! I assume there is some connection between algebra and calculus.

The Reddit thread has some speculation on this. Leave a comment if you have ideas.

I recently explained that languages are sets, not algorithms. And MathOverflow says that derivatives can be expressed in set theory. But I don't know if that's a useful way to think about it.

More Info

Ad: I'm looking for help with Oil, which is written in Python, shell, and C++. See our README.md.