cat on shelf?

This note describes a parsing algorithm for a slightly unusual query syntax. I will explain how to discover it by taking a series of small steps starting from traditional logic program syntax (skip to the first section if you haven’t seen that before).

summary. The point of the new syntax is that it requires fewer binary join variables (variables that show up in exactly two positions). For instance, if I want to know about all the cats that are on shelves1, I might write this Prolog query:

cat(X), on(X, Y), shelf(Y)

but in the new syntax, we can write

cat on shelf

This relies on two ideas: allow predicates to come out of order

that way on can show up in-between its arguments

and allow unary predicates to be used like variables

so instead of using a unary predicate application like cat X to pick out each cat, and then plugging X somewhere else in the query, we just write cat where we want it to go.

Prolog Syntax

I want to know about all the cats that are on shelves:

cat(X), on(X, Y), shelf(Y)

Each section will begin with this example written in a different way.

ML-flavored Prolog Syntax

cat X, on X Y, shelf Y

similar to syntax of Standard ML, Haskell.

Record Arities, Elide Commas

: cat/1 shelf/1 on/2

cat X on X Y shelf Y

Arity refers to the number of parameters that a predicate needs to be applied to, and the number of columns in the corresponding relation. So every cat tuple contains one value, which we take to represent the cat in question, and on has arity two because it relates two things to each other (the “supported” and the “supporter”).

You could call a set of predicates and their arities a schema. With a schema, and assuming that each predicate is fully applied, we know where the commas should be, so we could omit them. Commas are still useful for readability.

Allow Unary Predicates in Place of Variables

on cat shelf

This is the first substantial step. We read this as some cat is on some shelf, where the variables representing those entities have been elided. We need to infer them. We do so with this algorithm:

  1. Prepare an output, a stack, and the input starting with left-most word. The stack and output contain items: predicates that have been applied to zero or more variables.
  2. If the stack contains at least two items, and the top item is unary: generate a fresh variable and apply the top two items to it.
    1. If the stack now contains any nullary (fully applied) items, move them to the output.
  3. Otherwise, push the next word from the input onto the stack.

Note that when we say unary predicate we mean both a predicate that starts out unary, and any higher-arity predicate that has been partially applied. Here is how the algorithm processes the example:

; output | stack | input
------------------------
(0)  . | . | on cat shelf
(2)  . | on | cat shelf
(2)  . | on cat | shelf
(1)  cat X | on X | shelf
(2)  cat X | on X shelf | .
(1)  cat X, on X Y, shelf Y | . | .

The final result is a series of fully applied predicates.

(aside if you are familiar with the jargon: Prolog programs have a verbosity problem compared to functional programs; it’s kind of like being forced to write in ANF form. This algorithm is similar to the ANF translation.)

Final Step: Allow Argument Re-ordering

cat X, X on Y, shelf Y

or simply

cat on shelf

We use the same algorithm, but modify step (1):

  1. (modified) If the stack contains at least two items, and either of the top two items is unary: generate a fresh variable and apply the top two items to it.

Here’s a more elaborate example that involves a ternary word:

: telescope/1 with/2 sees/3

; cat sees cat with telescope

(0)  . | . | cat sees cat with telescope
(2)  . | cat | sees cat with telescope
(2)  . | cat sees | cat with telescope
(1)  cat X | sees X | cat with telescope
(2)  cat X | sees X cat | with telescope
(1)  cat X, cat Y | sees X Y | with telescope
(2)  cat X, cat Y | sees X Y with | telescope
(1)  cat X, cat Y, sees X Y T | with T | telescope
(2)  cat X, cat Y, sees X Y T | with T telescope | .
(1)  cat X, cat Y, sees X Y T, with T Z, telescope Z | . | .

Linguistic aside: we’re thinking of sees/3 as a relation that marks an instance of seeing; the instance is captured by T. The subject and object (X, Y) are also captured by sees/3, but other optional modifiers (the instrument, in this example) can be attached using other predicates.

Final notes:

https://cutfree.net


  1. cat on shelf
    ↩︎