adt.hl.rkt (10815B)
1 #lang hyper-literate typed/racket/base 2 @(require scribble-enhanced/doc 3 scribble-math 4 racket/sandbox 5 scribble/example 6 (for-label typed/racket/base 7 phc-toolkit 8 phc-toolkit/untyped-only 9 datatype 10 "ctx.hl.rkt" 11 "tagged.hl.rkt" 12 "tagged-supertype.hl.rkt" 13 "structure.hl.rkt" 14 "constructor.hl.rkt" 15 "variant.hl.rkt" 16 "adt-init.rkt")) 17 @doc-lib-setup 18 19 @title[#:style (with-html5 manual-doc-style) 20 #:tag "adt" 21 #:tag-prefix "phc-adt/adt"]{Algebraic Data Types} 22 23 @(chunks-toc-prefix 24 '("(lib phc-adt/scribblings/phc-adt-implementation.scrbl)" 25 "phc-adt/adt")) 26 27 @(table-of-contents) 28 29 @section{A note on polysemy} 30 31 The name ``constructor'' usually designates two things: 32 @itemlist[ 33 @item{A tagged value, like the ones created or accessed using the 34 @racket[constructor] macro defined in 35 @secref["constructor" #:tag-prefixes '("phc-adt/constructor")]} 36 @item{A constructor function or macro for some kind of data structure, which 37 is the function or macro used to create instances of that data structure.}] 38 39 Since this could lead to ambiguities, we clarify by saying ``constructor'' in 40 the former case, and ``builder'' or ``builder function'' in the latter case. 41 42 @section[#:tag "ADT|introduction"]{Introduction} 43 44 We define variants (tagged unions), with the following constraints: 45 46 @; TODO: put a short usage example here 47 48 @itemlist[ 49 @item{A constructor is described by a tag name, followed by zero or more 50 values. Likewise, a tagged structure is described by a tag name, followed by 51 zero or more field names, each field name being mapped to a value.} 52 @item{Two different variants can contain the same constructor or tagged 53 structure, and it is not possible to create an instance of that constructor or 54 tagged structure that would belong to one variant but not the other.} 55 @item{Constructors and tagged structures are "anonymous": it is not necessary 56 to declare a constructor or tagged structure before creating instances of it, 57 expressing its type, or using it as a match pattern.} 58 @item{Constructors types and tagged structures types are "interned": two 59 constructors with the same tag name have the same type, even if they are used 60 in different files. The same applies to two tagged structures with the same 61 tag name and field names: even if they are used in different files, they have 62 the same type.}] 63 64 The @racketmodname[datatype] package by Andrew Kent also 65 implements Algebraic Data Types. The main differences are that unlike our 66 library, data structures have to be declared before they are used (they are 67 not "anonymous"), and a given constructor name cannot be shared by multiple 68 unions, as can be seen in the example below where the second 69 @tc[define-datatype] throws an error: 70 71 @(define tr-evaluator 72 (parameterize ([sandbox-output 'string] 73 [sandbox-error-output 'string]) 74 (make-evaluator 'typed/racket))) 75 @examples[#:eval tr-evaluator 76 (require datatype) 77 78 (define-datatype Expr 79 [Var (Symbol)] 80 [Lambda (Symbol Expr)] 81 [App (Expr Expr)]) 82 83 ;; define-datatype: variant type #<syntax:11:3 Var> already bound 84 ;; in: Simple-Expr 85 (eval:error 86 (define-datatype Simple-Expr 87 [Var (Symbol)] 88 [Lambda (Symbol Expr)]))] 89 90 @section{Remembered data types and pre-declarations} 91 92 This library works by remembering all the constructors and all the tagged 93 structures across compilations. More precisely, each constructor's tag name is 94 written to a file named @filepath{adt-pre-declarations.rkt} in the same 95 directory as the user code. The tag name and list of fields of each tagged 96 structure is also written in the same file. 97 98 The generated @racket{adt-pre-declarations.rkt} file declares a 99 @racket[struct] for each tagged structure and constructor, so that all user 100 files which @racket[require] the same @racket{adt-pre-declarations.rkt} will 101 share the same @racket[struct] definitions. 102 103 User files which make use of the @racketmodname[phc-adt] should include a call 104 to @racket[adt-init] before using anything else. The @racket[adt-init] macro 105 @racket[require]s the @filepath{adt-pre-declarations.rkt} file, and records 106 the lexical context of that @racket[require], so that the other macros 107 implemented by this library can fetch the pre-declared @racket[struct] types 108 from the correct lexical scope. The @filepath{ctx.hl.rkt} file takes care of 109 recording that lexical scope, while @filepath{adt-init.rkt} performs the 110 initialisation sequence (creating the @filepath{adt-pre-declarations.rkt} file 111 if it does not exist, loading the pre-declared @racket[struct] types from 112 @filepath{adt-pre-declarations.rkt}, and using a utility from 113 @filepath{ctx.hl.rkt} to record the lexical context). 114 115 @chunk[<require-modules> 116 (require "ctx.hl.rkt")] 117 118 @section{The initialisation process} 119 120 The initialisation process can be somewhat complex: the directives 121 @racket[(remember-output-file "adt-pre-declarations.rkt")], 122 @racket[(set-adt-context)] and @racket[(require "adt-pre-declarations.rkt")] 123 have to be inserted in the right order, and the file 124 @filepath{adt-pre-declarations.rkt} has to be initialised with the appropriate 125 contents when it does not exist. The @racket[adt-init] macro defined in 126 @filepath{"adt-init.rkt"} takes care of these steps. 127 128 @chunk[<require-modules> 129 (require "adt-init.rkt")] 130 131 The generated @filepath{adt-pre-declarations.rkt} file will call the 132 @racket[pre-declare-all-tagged-structure-structs] macro defined in 133 @filepath{tagged-structure-low-level.hl.rkt}. 134 135 @section{Tagged structures, untagged structures, constructors, and variants} 136 137 We first define a low-level interface for tagged structures in the 138 @filepath{tagged-structure-low-level.hl.rkt} file. This low-level interface 139 includes for-syntax functions for expressing the type of tagged structures, 140 creating builder functions for them, as well as match patterns. It also includes 141 means to access the value of a given field on any tagged structure which 142 contains that field. The @filepath{tagged.hl.rkt} file provides syntactic sugar 143 for this low-level interface, and defines the @racket[tagged] identifier, which 144 acts as a type expander, match expander and macro. The macro can be used to 145 create builder functions which return instances of tagged structures, or to 146 directly create such instances. 147 148 @chunk[<require-modules> 149 (require "tagged.hl.rkt")] 150 151 The @filepath{"tagged-supertype.hl.rkt"} file defines a few operations 152 implementing some form of "static duck typing": As a type expander, 153 @racket[(tagged-supertype fieldᵢ …)] expands to the union type of all tagged 154 structures containing a superset of the given set of fields. As a match 155 expander, @racket[(tagged-supertype [fieldᵢ patᵢⱼ …] …)] expands to a match 156 pattern which accepts any tagged structure with a superset of the given set of 157 fields, as long as the value of each @racket[fieldᵢ] matches against all of the 158 corresponding @racket[patᵢⱼ …]. 159 160 @chunk[<require-modules> 161 (require "tagged-supertype.hl.rkt")] 162 163 We then define untagged structures, which are tagged structures with the 164 @racket[untagged] tag name. Untagged structures can be used conveniently when 165 the tag name is not important and the goal is simply to map a set of field names 166 to values. The @filepath{structure.hl.rkt} file defines the @tc[structure] 167 type expander, match expander and macro. The @tc[structure] identifier acts as 168 a simple wrapper around @racket[tagged] which supplies @racket[untagged] as the 169 tag name. 170 171 @chunk[<require-modules> 172 (require "structure.hl.rkt")] 173 174 Constructors are defined as tagged structures containing a single field, called 175 @racket[values]. The @racket[constructor] macro, defined in 176 @filepath{"constructor.hl.rkt"} accepts a rich syntax for creating constructor 177 instances containing multiple values, associated with the tag name. The values 178 are aggregated in a list, which is stored within the @racket[values] field of 179 the tagged structure used to implement the constructor. The @racket[constructor] 180 identifier is therefore nothing more than syntactic sugar for @racket[tagged]. 181 It relies on the @racketmodname[xlist] library, which provides a rich syntax for 182 expressing the complex list types, as well as the corresponding match pattern. 183 184 @chunk[<require-modules> 185 (require "constructor.hl.rkt")] 186 187 For convenience, we write a @tc[variant] form, which is a thin wrapper against 188 the union type of several constructors and tagged structures, 189 @tc[(U constructor-or-tagged …)]. 190 191 @chunk[<require-modules> 192 (require "variant.hl.rkt")] 193 194 Finally, we directly include the row polymorphism features from 195 @filepath{tagged-structure-low-level.hl.rkt}: 196 197 @chunk[<require-modules> 198 (require "tagged-structure-low-level.hl.rkt")] 199 200 @;{Finally, we define a @tc[uniform-get] form, which can 201 operate on @tc[tagged] structures. We also wrap the plain 202 @tc[structure] form so that it instead returns a tagged 203 structure, using a common tag for all plain structures. This 204 allows us to rely on the invariant that @tc[uniform-get] 205 always operates on data with the same shape (a constructor 206 whose single value is a promise for a structure)@note{This 207 avoids the risk of combinatorial explosion for the input 208 type of @racket[uniform-get], when accessing a deeply nested 209 field: allowing 210 @racket[(U structure 211 (constructor structure) 212 (constructor (Promise structure)))] 213 would result in a type of size @${n⁴}, with @${n} the depth 214 of the accessed field.}} 215 216 @chunk[<*> 217 (begin <require-modules>) 218 (provide adt-init 219 tagged 220 tagged? 221 define-tagged 222 TaggedTop 223 TaggedTop? 224 tagged-supertype 225 tagged-supertype* 226 227 structure 228 structure? 229 define-structure 230 StructureTop 231 StructureTop? 232 structure-supertype 233 234 constructor 235 constructor? 236 define-constructor 237 ConstructorTop 238 ConstructorTop? 239 240 variant 241 define-variant 242 243 constructor-values 244 uniform-get 245 246 split 247 #;(for-syntax split/type) 248 merge 249 #;(for-syntax merge/type) 250 with+ 251 with! 252 with!! 253 #;(for-syntax tagged-struct-id?))]