Type Inference
This programming assignment is to be completed with the same team as the type checker assignment.
Part I: Generating Type Constraints
Following the lecture notes, derive type constraints for this language:
<expr> ::= <num> | true | false | {+ <expr> <expr>} | {- <expr> <expr>} | {* <expr> <expr>} | {iszero <expr>} | {bif <expr> <expr> <expr>} | <id> | {with {<id> <expr>} <expr>} | {rec {<id> <expr>} <expr>} | {fun {<id>} <expr>} | {<expr> <expr>} | tempty | {tcons <expr> <expr>} | {tempty? <expr>} | {tfirst <expr>} | {trest <expr>}
The novelty of this language is that the list operations are now polymorphic; that is, you can create lists of numbers or booleans.
Note: The right hand side of the rec binding does not have to be a
syntactic function. However, you may assume that the rec-bound identifier only
appears under a {fun ...}
in the right hand side of the binding. In
other words, the following expressions are legal:
{rec {f {fun {x} {f x}}} ...} {rec {f {with {y 4} {fun {x} {f y}}}} ...}
while the following are not legal:
{rec {f f} ...} {rec {f {+ 1 f}} ...}
Adapt your parser to parse this language. Then, write a function called
generate-constraints
which consumes a parsed expression of this
language and returns a list of constraints (of the type defined in Part
II). The correspondence between type constraints and the terms in Part II is as
follows:
-
The constants are the base types: they should contain either
the symbol
'number
or'boolean
. - The variables correspond to the expressions and identifiers whose types you wish to learn. Thus, they should contain either an expression, or a symbol representing an identifier. You may assume all identifiers are bound exactly once; i.e., there are no unbound identifiers, and no name is ever bound more than once.
- The constructors are the type constructors. They should
contain either:
- the symbol
'->
and a list of two arguments, or - the symbol
'listof
and a list of one argument.
- the symbol
In some cases, you may need to a fresh identifier when defining
constraints. The Scheme function gensym
returns a unique
identifier on every call.
Part II: Unification
Implement the unification algorithm from the lecture notes. Call the function
unify
. The algorithm
should work for a generic term representation, in which a term is one of:
- a constant, which contains a symbol,
- a variable, which can contain any value, or
- a constructor of the form C(t1, ..., tn), where C is a symbol and the ti are a list of sub-terms.
In addition, you will need data types for representing a constraint (a pair of terms) and substitution (a variable and a term). The unification algorithm will consume a list of constraints and produce a list of substitutions. As in the type checker assingment, your unification function must consume an error function to call with an appropriate message if one of the following errors is found:
- The unification of two terms is impossible.
- The occurs check fails.
Finally, when comparing variables for equality, use Scheme’s built-in
eq?
function. For symbols, it behaves exactly as
symbol=?
; for other values, it compares them for identity (like
Java’s ==
comparison). We will rely on identical variables
being deemed equivalent by eq?
when solving the constraints
generated in the following section.
Part III: Inferring Types
To infer the type of a program, parse it, generate constraints, and unify the constraints. The result will be a list of substitutions; by looking up the subsitution for the entire expression, you can access its type.
To implement this, your code needs to define a function,
infer-type
, which consumes a concrete representation of the
program (as given above), and produces either an error string or a
representation of the inferred type. Represent types concretely as:
<type> ::= number | boolean | (listof <type>) | (<type> -> <type>) | <string>
where strings are used to represent type variables. For example, the
type of length
would be:
((listof "a") -> number)
Extra Credit
For a very small amount of extra credit, write a program in this language for
which your algorithm infers the type
(“a” -> “b”)
. You
shouldn’t attempt this problem until you’ve fully completed the
assignment.
What Not To Do
You do not need to implement an interpreter for this language.
You do not need to implement let-based polymorphism.