www

Unnamed repository; edit this file 'description' to name the repository.
Log | Files | Refs | README

adt.hl.rkt (10815B)


      1 #lang hyper-literate typed/racket/base
      2 @(require scribble-enhanced/doc
      3           scribble-math
      4           racket/sandbox
      5           scribble/example
      6           (for-label typed/racket/base
      7                      phc-toolkit
      8                      phc-toolkit/untyped-only
      9                      datatype
     10                      "ctx.hl.rkt"
     11                      "tagged.hl.rkt"
     12                      "tagged-supertype.hl.rkt"
     13                      "structure.hl.rkt"
     14                      "constructor.hl.rkt"
     15                      "variant.hl.rkt"
     16                      "adt-init.rkt"))
     17 @doc-lib-setup
     18 
     19 @title[#:style (with-html5 manual-doc-style)
     20        #:tag "adt"
     21        #:tag-prefix "phc-adt/adt"]{Algebraic Data Types}
     22 
     23 @(chunks-toc-prefix
     24   '("(lib phc-adt/scribblings/phc-adt-implementation.scrbl)"
     25     "phc-adt/adt"))
     26 
     27 @(table-of-contents)
     28 
     29 @section{A note on polysemy}
     30 
     31 The name ``constructor'' usually designates two things:
     32 @itemlist[
     33  @item{A tagged value, like the ones created or accessed using the
     34   @racket[constructor] macro defined in
     35   @secref["constructor" #:tag-prefixes '("phc-adt/constructor")]}
     36  @item{A constructor function or macro for some kind of data structure, which
     37   is the function or macro used to create instances of that data structure.}]
     38 
     39 Since this could lead to ambiguities, we clarify by saying ``constructor'' in
     40 the former case, and ``builder'' or ``builder function'' in the latter case.
     41 
     42 @section[#:tag "ADT|introduction"]{Introduction}
     43 
     44 We define variants (tagged unions), with the following constraints:
     45 
     46 @; TODO: put a short usage example here
     47 
     48 @itemlist[
     49  @item{A constructor is described by a tag name, followed by zero or more
     50   values. Likewise, a tagged structure is described by a tag name, followed by
     51   zero or more field names, each field name being mapped to a value.}
     52  @item{Two different variants can contain the same constructor or tagged
     53   structure, and it is not possible to create an instance of that constructor or
     54   tagged structure that would belong to one variant but not the other.}
     55  @item{Constructors and tagged structures are "anonymous": it is not necessary
     56   to declare a constructor or tagged structure before creating instances of it,
     57   expressing its type, or using it as a match pattern.}
     58  @item{Constructors types and tagged structures types are "interned": two
     59   constructors with the same tag name have the same type, even if they are used
     60   in different files. The same applies to two tagged structures with the same
     61   tag name and field names: even if they are used in different files, they have
     62   the same type.}]
     63 
     64 The @racketmodname[datatype] package by Andrew Kent also
     65 implements Algebraic Data Types. The main differences are that unlike our
     66 library, data structures have to be declared before they are used (they are
     67 not "anonymous"), and a given constructor name cannot be shared by multiple
     68 unions, as can be seen in the example below where the second
     69 @tc[define-datatype] throws an error:
     70 
     71 @(define tr-evaluator
     72    (parameterize ([sandbox-output 'string]
     73                   [sandbox-error-output 'string])
     74      (make-evaluator 'typed/racket)))
     75 @examples[#:eval tr-evaluator
     76           (require datatype)
     77           
     78           (define-datatype Expr
     79             [Var (Symbol)]
     80             [Lambda (Symbol Expr)]
     81             [App (Expr Expr)])
     82           
     83           ;; define-datatype: variant type #<syntax:11:3 Var> already bound
     84           ;; in: Simple-Expr
     85           (eval:error
     86            (define-datatype Simple-Expr
     87              [Var (Symbol)]
     88              [Lambda (Symbol Expr)]))]
     89 
     90 @section{Remembered data types and pre-declarations}
     91 
     92 This library works by remembering all the constructors and all the tagged
     93 structures across compilations. More precisely, each constructor's tag name is
     94 written to a file named @filepath{adt-pre-declarations.rkt} in the same
     95 directory as the user code. The tag name and list of fields of each tagged
     96 structure is also written in the same file.
     97 
     98 The generated @racket{adt-pre-declarations.rkt} file declares a
     99 @racket[struct] for each tagged structure and constructor, so that all user
    100 files which @racket[require] the same @racket{adt-pre-declarations.rkt} will
    101 share the same @racket[struct] definitions.
    102 
    103 User files which make use of the @racketmodname[phc-adt] should include a call
    104 to @racket[adt-init] before using anything else. The @racket[adt-init] macro
    105 @racket[require]s the @filepath{adt-pre-declarations.rkt} file, and records
    106 the lexical context of that @racket[require], so that the other macros
    107 implemented by this library can fetch the pre-declared @racket[struct] types
    108 from the correct lexical scope. The @filepath{ctx.hl.rkt} file takes care of
    109 recording that lexical scope, while @filepath{adt-init.rkt} performs the
    110 initialisation sequence (creating the @filepath{adt-pre-declarations.rkt} file
    111 if it does not exist, loading the pre-declared @racket[struct] types from
    112 @filepath{adt-pre-declarations.rkt}, and using a utility from
    113 @filepath{ctx.hl.rkt} to record the lexical context).
    114 
    115 @chunk[<require-modules>
    116        (require "ctx.hl.rkt")]
    117 
    118 @section{The initialisation process}
    119 
    120 The initialisation process can be somewhat complex: the directives
    121 @racket[(remember-output-file "adt-pre-declarations.rkt")],
    122 @racket[(set-adt-context)] and @racket[(require "adt-pre-declarations.rkt")]
    123 have to be inserted in the right order, and the file
    124 @filepath{adt-pre-declarations.rkt} has to be initialised with the appropriate
    125 contents when it does not exist. The @racket[adt-init] macro defined in
    126 @filepath{"adt-init.rkt"} takes care of these steps.
    127 
    128 @chunk[<require-modules>
    129        (require "adt-init.rkt")]
    130 
    131 The generated @filepath{adt-pre-declarations.rkt} file will call the
    132 @racket[pre-declare-all-tagged-structure-structs] macro defined in
    133 @filepath{tagged-structure-low-level.hl.rkt}.
    134 
    135 @section{Tagged structures, untagged structures, constructors, and variants}
    136 
    137 We first define a low-level interface for tagged structures in the
    138 @filepath{tagged-structure-low-level.hl.rkt} file. This low-level interface
    139 includes for-syntax functions for expressing the type of tagged structures,
    140 creating builder functions for them, as well as match patterns. It also includes
    141 means to access the value of a given field on any tagged structure which
    142 contains that field. The @filepath{tagged.hl.rkt} file provides syntactic sugar
    143 for this low-level interface, and defines the @racket[tagged] identifier, which
    144 acts as a type expander, match expander and macro. The macro can be used to
    145 create builder functions which return instances of tagged structures, or to
    146 directly create such instances.
    147 
    148 @chunk[<require-modules>
    149        (require "tagged.hl.rkt")]
    150 
    151 The @filepath{"tagged-supertype.hl.rkt"} file defines a few operations
    152 implementing some form of "static duck typing": As a type expander,
    153 @racket[(tagged-supertype fieldᵢ …)] expands to the union type of all tagged
    154 structures containing a superset of the given set of fields. As a match
    155 expander, @racket[(tagged-supertype [fieldᵢ patᵢⱼ …] …)] expands to a match
    156 pattern which accepts any tagged structure with a superset of the given set of
    157 fields, as long as the value of each @racket[fieldᵢ] matches against all of the
    158 corresponding @racket[patᵢⱼ …].
    159 
    160 @chunk[<require-modules>
    161        (require "tagged-supertype.hl.rkt")]
    162 
    163 We then define untagged structures, which are tagged structures with the
    164 @racket[untagged] tag name. Untagged structures can be used conveniently when
    165 the tag name is not important and the goal is simply to map a set of field names
    166 to values. The @filepath{structure.hl.rkt} file defines the @tc[structure]
    167 type expander, match expander and macro. The @tc[structure] identifier acts as
    168 a simple wrapper around @racket[tagged] which supplies @racket[untagged] as the
    169 tag name.
    170 
    171 @chunk[<require-modules>
    172        (require "structure.hl.rkt")]
    173 
    174 Constructors are defined as tagged structures containing a single field, called
    175 @racket[values]. The @racket[constructor] macro, defined in
    176 @filepath{"constructor.hl.rkt"} accepts a rich syntax for creating constructor
    177 instances containing multiple values, associated with the tag name. The values
    178 are aggregated in a list, which is stored within the @racket[values] field of
    179 the tagged structure used to implement the constructor. The @racket[constructor]
    180 identifier is therefore nothing more than syntactic sugar for @racket[tagged].
    181 It relies on the @racketmodname[xlist] library, which provides a rich syntax for
    182 expressing the complex list types, as well as the corresponding match pattern.
    183 
    184 @chunk[<require-modules>
    185        (require "constructor.hl.rkt")]
    186 
    187 For convenience, we write a @tc[variant] form, which is a thin wrapper against
    188 the union type of several constructors and tagged structures,
    189 @tc[(U constructor-or-tagged …)].
    190 
    191 @chunk[<require-modules>
    192        (require "variant.hl.rkt")]
    193 
    194 Finally, we directly include the row polymorphism features from
    195 @filepath{tagged-structure-low-level.hl.rkt}:
    196 
    197 @chunk[<require-modules>
    198        (require "tagged-structure-low-level.hl.rkt")]
    199 
    200 @;{Finally, we define a @tc[uniform-get] form, which can
    201 operate on @tc[tagged] structures. We also wrap the plain 
    202 @tc[structure] form so that it instead returns a tagged
    203 structure, using a common tag for all plain structures. This
    204 allows us to rely on the invariant that @tc[uniform-get]
    205 always operates on data with the same shape (a constructor
    206 whose single value is a promise for a structure)@note{This
    207  avoids the risk of combinatorial explosion for the input
    208  type of @racket[uniform-get], when accessing a deeply nested
    209  field: allowing
    210  @racket[(U structure
    211             (constructor structure)
    212             (constructor (Promise structure)))]
    213  would result in a type of size @${n⁴}, with @${n} the depth
    214  of the accessed field.}}
    215 
    216 @chunk[<*>
    217        (begin <require-modules>)
    218        (provide adt-init
    219                 tagged
    220                 tagged?
    221                 define-tagged
    222                 TaggedTop
    223                 TaggedTop?
    224                 tagged-supertype
    225                 tagged-supertype*
    226                 
    227                 structure
    228                 structure?
    229                 define-structure
    230                 StructureTop
    231                 StructureTop?
    232                 structure-supertype
    233                 
    234                 constructor
    235                 constructor?
    236                 define-constructor
    237                 ConstructorTop
    238                 ConstructorTop?
    239                 
    240                 variant
    241                 define-variant
    242 
    243                 constructor-values
    244                 uniform-get
    245 
    246                 split
    247                 #;(for-syntax split/type)
    248                 merge
    249                 #;(for-syntax merge/type)
    250                 with+
    251                 with!
    252                 with!!
    253                 #;(for-syntax tagged-struct-id?))]