mm-ADT Logo

mm-ADT is a dual licensed AGPL3/commercial open source project that offers software engineers, computer scientists, mathematicians, and others in the software industry a royalty-based OSS model. The Turing Complete mm-ADT virtual machine (VM) integrates disparate data technologies via algebraic composition, yielding synthetic data systems that have the requisite computational power and expressivity for the problems they were designed to solve. As an economic model, each integration point offers the respective development team access to the revenue streams generated by any for-profit organization leveraging mm-ADT.

Virtual Machine Components

diag f9bcfd96975e3103c088523d50854fca

The mm-ADT VM is the integration point for the following data processing technologies:

  • Programming Languages: Language designers can create custom languages or develop parsers for existing languages that compile to mm-ADT VM bytecode (binary machine code) or unicode (text assembly code).

  • Processing Engines: Processor developers enable their push- or pull-based execution engines to be programmed by any mm-ADT language for evaluation via mm-ADT’s abstract execution model for processing data structures via single-machine, multi-threaded, agent-based, distributed near-time, and/or cluster-oriented, batch-analytic processors.

  • Storage Systems: Storage engineers can embed their systems using model-ADTs expressed in mm-ADT’s dependent type system that enable the lossless encoding of key/value store, document store, wide-column store, graph store, relational store, and other novel or hybrid structures such as hypergraph, docu-graph, and quantum data structures.

The mm-ADT VM enables the intermingling of any language, any processor, and any storage system, that can faithfully implement the core language semantics (types and values), processor semantics (process trace graphs), and/or storage semantics (data structure streams).

mm-ADT Theory

mm-ADT Function

Every mm-ADT program denotes a single unary function that maps an obj of type \$S\$ (start) to an obj of type \$E\$ (end) with the function signature

\[ f: S \rightarrow E. \]

The complexities of mm-ADT are realized in the definition of an obj (which includes both types and values) and the internal structure of an \$f\$-program (which is a composition of nested curried functions). The sole purpose of this documentation is to make salient the various algebraic structures that are operationalized to ultimately yield the mapping \$f : S \rightarrow E\$.

mm-ADT Algebras

Component Algebra

Storage

obj magma

Language

inst monoid

Processor

type ringoid

Each of the three aforementioned mm-ADT VM components has an associated algebra that best describes its structures and processes and which serve as each component’s particular interpretation of mm-ADT. The storage algebra is a magma with a carrier set containing all mm-ADT objects (obj) and a non- associative, binary "juxtapostion" operator. The language algebra is a is a free monoid capturing the serially composition of parameterized instructions (inst) in the mm-ADT instruction set architecture. The processor algebra is a polynomial ringoid (poly) used for compilation and evaluation of mm-ADT types (or programs). These algebras represent the particular perspective that each component has on a shared data structure called the obj graph which realizes a ringoid-embedding of a monoidal Cayley graph. This graph is the medium by which algebraic actions effectively couple these components over the course of a computation: specification, compilation, and evaluation.

The primary purpose of this documentation is to explain these algebras, specify their relationship to one another and demonstrate how they are manipulated by mm-ADT technologies. Data system engineers will learn how to integrate their technology so end users may compose their work with others' to create synthetic data systems tailored to their problem’s particular computational requirements.

mm-ADT Technology

mm-ADT Console

The mm-ADT VM provides a REPL console for users to evaluate mm-ADT programs written in any mm-ADT language. The reference language distributed with the VM is called mmlang. mmlang is a low-level, functional language that is in near 1-to-1 correspondence with the underlying VM architecture — offering it’s users Turing-Complete expressivity when writing programs and an interactive teaching tool for studying the mm-ADT VM.

~/mm-adt bin/mmadt.sh
                                _____ _______
                           /\  |  __ |__   __|
 _ __ ___  _ __ ___ _____ /  \ | |  | | | |
| '_ ` _ \| '_ ` _ |_____/ /\ \| |  | | | |
| | | | | | | | | | |   / ____ \ |__| | | |
|_| |_| |_|_| |_| |_|  /_/    \_\____/  |_|
                                 mm-adt.org
mmlang>

A simple console session is presented below, where the parser expects programs written in the language specified left of the > prompt. All the examples contained herein are presented using mmlang.

mmlang> 1
==>1
mmlang> 1+2
==>3
mmlang> 1[plus,2]
==>3

mmlang Syntax and Semantics

The context-free grammar for mmlang is presented below. mmlang is best understood as a free ringoid whose elements denote mm-ADT programs which act, as modules, on the instruction monoid implemented by every mm-ADT processor.

obj   ::= (type | value)q
value ::= vbool | vint | vreal | vstr
vbool ::= 'true' | 'false'
vint  ::= [1-9][0-9]*
vreal ::= [0-9]+'.'[0-9]*
vstr  ::= "'" [a-zA-Z]* "'"
type  ::= ctype | dtype
ctype ::= 'bool' | 'int'  | 'real' | 'str' | poly | '_'
poly  ::= lst | rec | inst
q     ::= '{' int (',' int)? '}'
dtype ::= ctype q? ('<=' ctype q?)? inst*
sep   ::= ';' | ',' | '|'
lst   ::= '(' obj (sep obj)* ')' q?
rec   ::= '(' obj '->' obj (sep obj '->' obj)* ')' q?
inst  ::= '[' op(','obj)* ']' q?
op    ::= 'a' | 'add' | 'and' | 'as' | 'combine' | 'count' | 'eq' | 'error' |
          'explain' | 'fold' | 'from' | 'get' | 'given' | 'groupCount' | 'gt' |
          'gte' | 'head' | 'id' | 'is' | 'last' | 'lt' | 'lte' | 'map' | 'merge' |
          'mult' | 'neg' | 'noop' | 'one' | 'or' | 'path' | 'plus' | 'pow' | 'put' |
          'q' | 'repeat' |'split' | 'start' | 'tail' | 'to' | 'trace' | 'type' | 'zero'

Language Structures

compiling languages

The mm-ADT VM is written in Scala. It contains a fluent program builder API called mmscala. Most compiler designers should avoid mmscala as direct access to the JVM and the full JDK can create unsafe mm-ADT programs. Instead, mmlang should be the target language for most compilers. Moreover, all compilers targeting mmlang can be used across different mm-ADT VM implementations.

The recommended target language for all higher-level language compilers is mmlang.
Products and Coproducts

Category theory is the study of structure via manipulations that expose (or generate) other structures. Two important category theoretic concepts used throughout this section are products and coproducts.

product

A product is any object defined in terms of it’s accessible component objects. That is, from a single object, via \$\pi_n\$ projection morphisms, the product is decomposed into it’s constituent parts.

coproduct

A coproduct is any object defined in terms the component objects used to construct it. That is, from many objects, via \$\iota_n\$ injection morphisms, a coproduct can be composed from constituent parts.

Along with these decomposition (and composition) morphisms, there exists an isomorphism between any two products (or coproducts) should they project (or inject) to the same component objects. That is, product and coproduct equality are defined via component equality.

The Obj

obj type value q

Everything that can be denoted in mmlang is an obj. Within the VM and outside the referential purview of an interfacing language, every obj is the product of

  1. An object that is either a type or a value and

  2. A quantifier specifying the "amount" of objects being denoted.

\[ \texttt{obj} = (\texttt{type} + \texttt{value}) \times \texttt{q}. \]

obj type q value q

This internal structure is well-defined as an algebraic ring. The ring axioms specify how the internals of an obj are related via two binary operators: \$\times\$ and \$\+\$ . One particular axiom states that products both left and right distribute over coproducts. Thus, the previous formula can be rewritten as \[ \texttt{obj} = (\texttt{type} \times \texttt{q}) + (\texttt{value} \times \texttt{q}). \] This representation, which is isomorphic to the previous, states two distinct kinds of mm-ADT objs: quantified types and quantified values. This is the obj meta-model.

mmlang> int              (1)
==>int
mmlang> 1                (2)
==>1
mmlang> int{5}           (3)
==>int{5}
mmlang> 1{5}             (4)
==>1{5}
mmlang> {'a','b','a'}    (5)
==>'b'
==>'a'{2}
1 A single int type.
2 A single int value of 1.
3 Five int types.
4 Five 1 int values.
5 A str stream containing 'a','b', and 'a'.
The stream is a foundational concept in mm-ADT. It’s algebraic properties are discussed at length when discussing values. In the meantime, note that a stream is an obj. A stream’s type is the type of it’s constituent values. A stream’s quantifier is the sum of the quantifiers of it’s constituent values. Streams are unordered and do not nest in that {1,{2,3},4} is equivalent to {1,2,3,4}.

Types and Values

Many modern programming environments make a sharp distinction between types and values, where each is predominately the focus of either compilation (types) or evaluation (values). In mm-ADT, instructions operate on both types and values. Two notable consequences of computable types and values are

  1. Compilation is the evaluation of a program (a type) with a type argument (\$ \tt{type} \rightarrow \tt{type} \rightarrow \tt{type} \$) and

  2. Types serve as lambda (or anonymous) functions when a value is applied (\$ \tt{value} \rightarrow \tt{type} \rightarrow \tt{value} \$).

mmlang> int => int[is,[gt,0]]                   (1)
==>int{?}<=int[is,bool<=int[gt,0]]
mmlang> 5 => int{?}<=int[is,bool<=int[gt,0]]    (2)
==>5
1 The int-type is applied to the int[is,[gt,0]]-type to yield an int{?}-type denoting either 0 or 1 int (compilation).
2 The type bool<=int[gt,0] serves as a lambda that yields true or false for every incoming int value (evaluation).

Some interesting conceptual blurs arise from the co-existence of types and values during compilation and evaluation. The particulars of the ideas in the table below will be discussed over the course of the documentation.

Table 1. Consequences of Type/Value Integration
structure A structure B unification

type

program

a program is a "complicated" type.

compilation

evaluation

compilations are type evaluations, where a compilation error is a "type runtime" error.

type

value

quantifiers expand the cardinality of values and constrain the cardinality of types.

type

variable

(non-dependent) types refer to values across contexts and variables refer to values within a context.

type

AST

a single intermediate representation is used in compilation, optimization, and evaluation.

type

function

functions are (dependent) types with values generated at evaluation.

state

trace

types and values both encode state information in their process traces.

classical

quantum

quantum computing is classical computing with a unitary matrix quantifier ring.

canonical

atomics

atomic values and canonical types are both "tokens" by which all other constructs are built.

The Type

An obj was previously defined as \[ \texttt{obj} = (\texttt{type} \times \texttt{q}) + (\texttt{value} \times \texttt{q}). \]

That equation is not an axiom, but a theorem. Its truth can be deduced from the equations of the full axiomatization of obj. In particular, for types, they are defined relative to other types. Types are a coproduct of either a

  1. canonical type (ctype): a base/fundamental type, or a

  2. derived type (dtype): a product of a type and an instruction (inst).

The ctypes are nominal types. There are five ctypes:

  1. bool: denotes the set of booleans — \$ \mathbb{B} \$.

  2. int: denotes the set of integers — \$ \mathbb{Z} \$.

  3. real: denotes the set of reals — \$ \mathbb{R} \$.

  4. str: denotes the set of character strings — \$ \Sigma^\ast \$.

  5. poly: denotes the set of polynomials (composites) — \$ \tt{obj}^n \$.

type product

The dtypes are structural types whose recursive definition's base case is a ctype realized via a chain of instructions (inst) that operate on types to yield types. Formally, the type coproduct is defined as

\[ \begin{split} \texttt{type} &=\;& (\texttt{bool} + \texttt{int} + \texttt{real} + \texttt{str} + \texttt{poly}) + (\texttt{type} \times \texttt{inst}) \\ \texttt{type} &=\;& \texttt{ctype} + (\texttt{type} \times \texttt{inst}) \\ \texttt{type} &=\;& \texttt{ctype} + \texttt{dtype}, \end{split} \]

where each component of the coproduct also has a respective quantifier as previously defined for all objs.

Every obj has an associated quantifier. The coupling of objects and quantifiers may not always be apparent, especially when not discussing quantifiers in particular. When the typographical representation of an obj lacks an associated quantifier, the quantifier is unity. More specifically, the real 1.35{1} is written 1.35.

A dtype has two product projections. The type projection denotes the domain and the instruction projection denotes the function, where the type product as a whole, relative to the aforementioned component projections, is the range. \[ \begin{split} \tt{type} &=\;& (\tt{type} &\;\times\;& \tt{inst}) &\;+\;& \tt{ctype} \\ \text{“range} &=\;& (\text{domain} &\;\text{and}\;& \text{function}) &\;\text{or}\;& \text{base"} \end{split} \]

The implication of the dtype product is that mm-ADT types are generated inductively by applying instructions from the mm-ADT VM’s instruction set architecture (inst). The application of an inst to a type (ctype or dtype) yields a dtype that is a structural expansion of the previous type.

ctype dtype

For example, int is a ctype denoting a single int value from the set of all integers. When int is applied to the instruction [is>0], the dtype int{?}<=int[is>0] is formed, where [is>0] is syntactic sugar for [is,[gt,0]]. This dtype is a refinement type that restricts int to only those int values greater than zero — i.e., a natural number \$\mathbb{N}^+\$. In terms of the "range = domain and function" reading, when an int (domain) is applied to [is>0] (function), the result is either an int greater than zero or no int at all (range).

int isgt0

The diagram above is an instance of a structure that is core to various aspects of mm-ADT including type checking, type inference, compiler optimization, garbage collection, and more. It has a name. The obj graph for which the subgraph concerned with type specification is called the type graph such that \[ G_{\texttt{obj}} = G_{\texttt{type}} \cup G_{\texttt{value}}. \] The obj graph will be studied at length from the perspective of Cayley graphs, where the obj graph is understood as a ring-embedded monoidal Cayley graph.

Type Structure

Cayley Graphs
cayley graph

A Cayley graph is a graphical encoding of the structure of a group. If \$\langle A, \cdot, I \rangle \$ is a group with carrier set \$A\$, binary operator \$\cdot : (A \times A) \to A\$, and generating set \$I \subseteq A\$ then the graph \$G = (V,E)\$ with vertices \$V = G\$ and edges \$E = (A \times (A \times I))\$ is the Cayley graph of the group. The directed edge \$a \to_i b\$ states that the vertices \$a,b \in A\$ are connected by an edge labeled with the element \$i \in I\$. Thus, \$a \to_i b\$ captures the binary operation \$a \cdot i = b\$.

When constructed in full, a Cayley graph’s vertices are the group elements and its edges represent the set of possible transitions from any one element to the next given the generators. When lazily constructed, a Cayley graph encodes the history of a group computation, where the current element has an incoming edge from the previous element. A generalized Cayley Graph is a graph with respective vertex/edge structure, but for other magmas such as monoids and semigroups.

obj full

The full obj structure thus far is diagrammed on the right. On the left are some example mmlang expressions.

mmlang> int                                          (1)
==>int
mmlang> int{2}                                       (2)
==>int{2}
mmlang> int{2}[is>0]                                 (3)
==>int{0,2}<=int{2}[is,bool{2}<=int{2}[gt,0]]
mmlang> int{2}[is>0][mult,[neg]]                     (4)
==>int{0,2}<=int{2}[is,bool{2}<=int{2}[gt,0]][mult,int{0,2}[neg]]
1 A ctype denoting a single integer.
2 A ctype denoting two integers.
3 A dtype denoting zero, one, or two integers greater than 0.
4 A dtype extending the previous type by their negative complement.

The diagram below captures the salient features of a type, where the three projections encode a type’s

  1. Type signature: the ctype specification of a type’s domain and range (via the \$ \pi_\tt{domai\n} \$ and \$ \pi_\tt{rang\e} \$ projections), and

  2. Type definition: an instruction sequence specifying a process on the domain obj (via the \$ \pi_\tt{i\nsts} \$ projection).

type signature definition
Type Signature

Every mm-ADT type can be generally understood as a function that maps an obj of one type to an obj of another type. A type signature specifies the source and target of this mapping, where the domain is the source type, and the range is the target type. Both the domain and range type specifications include a respective quantifier denoted {q} in mmlang. The general pattern of a type signature is

range{q}<=domain{q}.

In common mathematical vernacular, if the function \$f\$ has a domain of \$X\$ and a range of \$Y\$, then it’s signature is denoted \$f: X \to Y\$. Furthermore, given quantifiers from a ring \$Q\$, the function signature would be denoted \$f: X \times Q \to Y \times Q\$.
mmlang Expression Description
mmlang> int<=int
==>int

From the perspective of "type-as-function," An mm-ADT int is a no-op on the set of integers. Given any integer, int returns that integer. In mmlang, when the domain and range are the same, the <= and repeated type are not displayed. That is int<=int is more concisely displayed as int.

mmlang> int{1}
==>int

In most programming languages, a value can be typed int as in

val x:int = 10.

Such declarations state that the value referred to by x is a single element within the set of integers. The concept of a "single element" is captured in mm-ADT by the obj quantifier, where a unit quantifier is not displayed in mmlang. That is, int{1} is more concisely displayed as int.

mmlang> int{5}
==>int{5}

int{5} is a type referring to 5 integers. As a point of comparison, int refers to a single integer. This is why int is syntactic sugar for int{1} in mmlang.

mmlang> int{0,5}
==>int{0,5}
mmlang> int{0,5}<=int{0,5}
==>int{0,5}

Quantifiers must be elements from a ring with unity. In the previous examples, the chosen quantifier was the integer ring \$\langle \mathbb{Z},+,\times \rangle\$. In this example, the carrier set is two integers and represents uncertainty as to the number of elements being referred to. int{0,5} is a type referring to either 0, 1, 2, 3, 4, or 5 integers.

mmlang> int<=bool
language error: bool is not an int

Types that are fully specified by their type signature are always canonical types. Therefore, bool<=int is meaningless as there are no instructions to map an int to a bool. This example does not assume an underlying model. When model-ADTs are defined, it is possible for bool<=int to yield a result.

Type Definition

Types and values both have a ground that exists outside of the mm-ADT virtual machine within the hosting environment (e.g. the JVM). The ground of the mm-ADT value 2 is the JVM primitive 2L (a Java long). The ground of the mm-ADT type int is the JVM class java.lang.Long. When the instruction [plus,4] is applied to the mm-ADT int value 2, a new mm-ADT int value is created whose ground is the JVM value 6L. When [plus,4] is applied to the mm-ADT int type, a new type is created with the same java.lang.Long ground. Thus, the information that distinguishes int from int[plus,4] is in the remembrance of the instruction that was applied to int. For a type, this history is called the type definition and is a subgraph of the larger type graph (which is a subgraph of an even larger obj graph).

In practice, the string representation of a value is its ground and the string representation of a type is its trace.
type value trace
mmlang> 2[plus,4][is>0]
==>6
mmlang> 2[plus,4][is>0][trace]
==>(2;[plus,4];6;[is,true])
mmlang> 2[plus,4][is>0][type]
==>int{?}<=int[plus,4][is,bool<=int[gt,0]]

Both types and values exist in a larger graph called the obj graph for which a type’s type graph is a component. In theory, the complete history of an mm-ADT program (from compilation to execution) is stored in this graph. However, in practice, the mm-ADT VM removes those traces (particular paths through the graph) that are no longer required by the program. This process is called trace retraction and is the mm-ADT equivalent of garbage collection.

type value trace short In the diagram above, the type vertices are elements of a free algebra called the inst monoid. However, in order to present more complex diagrams, vertex labels can be shortened to type’s canonical range type. With this convention, there is no loss of information. The full definition can be unambiguously deduced by concatenating the instructions encountered on the edges of the inverted path from the current range vertex to the root domain vertex (i.e. the base canonical type of the type induction). The above diagram’s more concise representation is on the right. All subsequent diagrams will follow this convention.
Type Quantification
Initial and Terminal Objects
initial

A category may have an initial and/or terminal object.

An initial object \$S\$ is the domain of a set of morphism \$S \rightarrow E_n\$. Initial objects, via their morphisms, generate all the objects of the category. If there is an initial object, then it is unique in that if there is another initial object, it has the same diagrammatic topology — all outgoing morphisms and no incoming morphisms save the identity. Thus, besides labels, two initials are isomorphic.

terminal

A terminal object \$E\$ is the range of a set of morphisms \$S_n \rightarrow E\$. Terminal objects subsume all other objects in the category in that all other objects \$S_n\$ can be morphed into the terminal object, but the terminal object can not be morphed into any other object. Similar to initials, should another terminal exist, the two terminal are isomorphic in that they both have the same number of incoming morphisms and no outgoing morphisms (save the identity).

 

Table 2. Quantifier Symbols in mmlang
usage sugar unsugared

none/nothing

{0}

{0,0}

some/just

{1,1}

exact

{x}

{x,x}

least [1]

{x,}

{x,max}

most [1]

{,x}

{min,x}

option/maybe

{?}

{0,1}

given [1]

{+}

{1,max}

any [1]

{*}

{1,max}

In order to quantify the amount of values denoted by a type, every mm-ADT type has an associated quantifier q denoted {q} in mmlang. Quantifiers are typically integers, but can be any element from an ordered algebraic ring with unity (e.g. integers, reals in \$ \mathbb{R}, \mathbb{R}^2, \mathbb{R}^3, \ldots, \mathbb{R}^n \$, unitary matrices, etc.). While integer quantifiers signify "amount," other quantifiers such as unitary matrices used in the representation of a quantum wave function, "amount" is a less accurate description as values can be negative where objs interact with constructive and destructive interference.

The default quantifier ring of the mm-ADT VM is \[ \langle \mathbb{N}^+ \times \mathbb{N}^+, \ast, + \rangle, \] where \$(1,1)\$ is the multiplicative identity (unity) and \$(0,0)\$ is the additive identity. The \$\ast\$ and \$ +\$ binary operators are pairwise integer multiplication and addition, respectively. In mmlang if an obj quantifier is not displayed, then the quantifier is assumed to be the unity of the ring, or {1,1}. Moreover, if a single value is provided, it is assumed to be repeated where {n} is shorthand for {n,n}. Thus, int is int{1} is int{1,1}.

One quantifier serves an important role in mm-ADT as both the additive identity and multiplicative annihilator — {0}. All objs quantified with the respective quantifier ring’s annihilator are isomorphic to the initial object.

Types such as int{0} and int{0}<=int[is>0&&<0] are isomorphic due to their quantifiers both being {0}. Typically, throughout the documentation, both will be referred to simply as obj{0} or {0} (the zero object).

Common programming concepts that are usually captured by typeclasses or functors (via lifting) are expressed in mm-ADT via quantification.

int                                   // a single int    (some)
int{?}<=int[is>0]                     // 0 or 1 int      (option)
int{0}<=int[is,false]                 // 0 ints          (none)
int{4}<=int{2}[_,_]                   // 4 ints          (exact)
int{*}<=[=db][get,'people'][age]      // 0 or any ints   (any)

An mm-ADT program is a type. The mmlang parser converts a textual representation of a type into a type obj. The mm-ADT VM encodes a type obj as a path within the larger mm-ADT type graph (a Cayley graph). The type’s graph is traversed and rewritten by different automata from the mm-ADT compiler and ultimately, the evaluating processor. While a type’s graph is a graphical encoding of the monoidal structure of a particular subset of obj, it is also a specification of a data flow pipeline that realizes elements of the type (i.e. computed resultant values). From the vantage point of the latter perspective, various techniques from the field of abstract interpretation are leveraged. Or particular import to this section, the expressions above are rewritten below by a compiler automata performing type and quantifier inference.

mmlang> int
==>int
mmlang> int[is>0]
==>int{?}<=int[is,bool<=int[gt,0]]
mmlang> int[is,false]
mmlang>
mmlang> int{2}[_,_]
==>int{4}<=int{2}-<(int,int)>-

Mono Types

type inst 0 1

bool

&& || - !

false

true

int

* + - > < >= =<

0

1

real

* + - > < >= =<

0.0

1.0

str

+ > < >= =<

''

The mm-ADT type system can be partitioned into mono types (monomials) and poly types (polynomials). The mono types are atomic and there are 4 of them: bool, int, real, and str. The associated table presents the typical operators (sugared instructions) that can be applied to each mono. The table also includes the additive (0) (additive) and multiplicative (1) identity values of each mono. The remaining subsections will present examples of each mono type that also highlight important mm-ADT concepts. '''

Bool Type

The type bool has two values: true and false. The 0-value is false and the 1-value is true, where [zero] and [one] yield values regardless of their input.

mmlang> bool[zero]
==>false
mmlang> bool[one]
==>true
mmlang> true[zero]
==>false
mmlang> true[one]
==>true
mmlang> false[zero]
==>false
mmlang> false[one]
==>true

bool zero one In mm-ADT, types and values are simply vertices in the obj graph. A polymorphic constant inst such as [zero] and [one] require only the domain’s type to produce a respective constant value. Consequently, the distinction between compilation and evaluation blur as "compilation" can produce values and thus, the program may be complete before being "evaluated." The example below further elucidates the phenomena, where the [trace] instruction maps an obj to its path in the Cayley obj graph. Note that [zero] transitions the process from the type graph to the value graph, which are both connected subgraphs of the larger obj graph. The final instruction, [and,bool,bool], is compiled to [and,false,false] which results in false. Thus, a single "compilation" automata is able to derive the expected result set. From an abstract interpretation perspective, regardless of the progam’s input being true or false, the range is always false.

mmlang> bool[or,bool,bool][zero][and,bool,bool]
==>false
mmlang> bool[or,bool,bool][zero][and,bool,bool][trace]
==>(bool;[or,bool,bool];bool;[zero];false;[and,false,false])

The instructions [and] and [or] create bool dtypes from type. Their type structures are

bool{q}<=obj{q}[and,bool*]
bool{q}<=obj{q}[or, bool*].

The domain obj becomes the domain of the respective bool arguments and the resultant (con/dis)junction of the bool arguments determines the range bool.

mmlang> int[and,>5,<8]
==>bool<=int[and,bool<=int[gt,5],bool<=int[lt,8]]
mmlang> {9,6}[and,>5,<8]
==>false
==>true
mmlang> {9,6}[or,>5,<8]
==>true{2}
Int Type

The mm-ADT type int refers to the set of all integers commonly referred to as \$\mathbb{Z}\$ in the mathematics literature. Integers are the prototypical example of an algebraic ring and thus, where 0 ("zero") and 1 ("one") derive their name. However, note that 0 and 1 generalize the integer elements 0 and 1 with numerous concrete, behavioral realizations in other carrier sets (with respective operators) besides int. If it were not for the success of the integers, 0 would perhaps be commonly refered to as annihilator and 1, unit. In a similar vein, mm-ADT pays homage to the integers with [zero] and [one] being the instructions by which 0 and 1, respectively, are extracted from those mm-ADT types that yield them such as int with 0 and 1.

mmlang> bool[zero]
==>false
mmlang> bool[one]
==>true
mmlang> int[zero]
==>0
mmlang> int[one]
==>1
mmlang> real[zero]
==>0.0
mmlang> real[one]
==>1.0
mmlang> str[zero]
==>''

The integers have two binary operators known as * ("multiplication") and + ("addition"), where \$\mathbb{Z},\ast \rangle\$ is a monoid and \$\langle \mathbb{Z},+\rangle\$ is an abelian group.

  1. Monoid Axioms (for multiplication)

    1. \$(a * b) * c = a * (b * c)\$ (associative)

    2. \$a * \mathbf{1} = a\$ (identity)

  2. Abelian Group Axioms (for addition)

    1. \$(a + b) + c = a + (b + c)\$ (associative)

    2. \$a + \mathbb{0} = a\$ (identity)

    3. \$a + b = b + a\$ (commutative)

    4. \$a + -a = \mathbf{0}\$ (inverse)

An mm-ADT int is currently encoded using 64-bits. In computer engineering, this is a long value. A later release will dynamically resize the amount of bits used for the encoding such that arbitrary large (both positive and negative) can be represented.
Real Type
Str Type

Anonymous Type

The type bool⇐int[gt,10] has a range of bool and a domain of int. When the type is written int[gt,10], the range is deduced by a compiling automata that applies int to [gt,10] to yield bool. In fact, the range of int[gt,10] is an anonymous type (called an anon for brevity) and is denoted _ (or with no character in most situations). An anon is a type that has not been grounded to a base type.

mmlang> bool<=int[gt,10]    (1)
==>bool<=int[gt,10]
mmlang> _<=int[gt,10]       (2)
==>bool<=int[gt,10]
mmlang> int[gt,10]          (3)
==>bool<=int[gt,10]
1 The domain and range of the type are fully specified.
2 A type with a specified domain of int and a specified range of _.
3 An mmlang sugar where if no range is specified, and it differs from the domain, then _ is assumed.

Anons can also be used to specify types without a domain.

mmlang> int{?}<=int[is,bool<=int[gt,10]]     (1)
==>int{?}<=int[is,bool<=int[gt,10]]
mmlang> int[is,int[gt,10]]                   (2)
==>int{?}<=int[is,bool<=int[gt,10]]
mmlang> int[is,_[gt,10]]                     (3)
==>int{?}<=int[is,bool<=int[gt,10]]
mmlang> int[is,[gt,10]]                      (4)
==>int{?}<=int[is,bool<=int[gt,10]]
1 The domain and range of the outer and inner nested type are fully specified.
2 The nested type has a specified domain of int.
3 The nested type has an unspecified domain of _.
4 An mmlang sugar where if no domain is specified, _ is assumed.

Poly Types

Rings

A ring is an algebraic structure \$ \langle A,*,+,1,0 \rangle \$, where \$A\$ is the carrier set, \$\*: A \times A \rightarrow A\$ is the multiplicative operator, \$\+: A \times A \rightarrow A \$ is the additive operator, \$1 \in A\$ is the multiplicative identity, and \$0 \in A\$ is the additive identity. A ring satisfies the following axioms, where \$ a,b,c \in A \$.

common notation terminology mmlang notation

\$(a + b) + c = a + (b + c)\$

additive associativity

((a,b),c) == (a,(b,c))

\$0 + a = a + 0 = a\$

additive identity

({0},a) == (a,{0}) == a

\$a - a = a + (-a) = 0\$

additive inverses

(a,a{-1}) == {0}

\$a + b = b + a\$

additive commutativity

(a,b) == (b,a)

\$(a * b) * c = a * (b * c)\$

multiplicative associativity

((a;b);c) == (a;(b;c))

\$1 * a = a * 1 = a\$

multiplicative identity

({1};a) == (a;{1}) == a

\$(a + b) * c = (a * c) + (b * c)\$

multiplicative left distributivity

((a,b);c) == ((a;c),(b;c))

\$a * (b + c) = (a * b) + (a * c)\$

multiplicative right distributivity

(a;(b,c)) == ((a;b),(a;c))

common notation deduction mmlang notation

\$a + b = a + c => b = c\$

factors

\$a + b = 0 => a = -b & b = -a\$

unique inverse

\$-(a+b) = (-a) + (-b)\$

distributivity

(a,b){-1} == (a{-1},b{-1})

\$-(-a) = a\$

distributivity

(a{-1}){-1} == a

\$a*0 = 0 = 0*a\$

annihilator

(a;{0}) == {0} == ({0};a)

\$a * (-b) = -a * b = -(a * b)\$

factors

(a;b{-1}) == (a{-1};b) == (a;b){-1}

\$(-a) * (-b) = a * b\$

factors

(a{-1};b{-1}) == (a;b)

Deep Dive 1. Poly Constructs in mmlang
  1. ( ) is a polynomial constructor.

  2. lst is a polynomial with terms indexed by int.

  3. rec is a polynomial with terms indexed by obj.

  4. , is a polynomial term deliminator denoting parallel compose.

  5. | is a polynomial term deliminator denoting parallel choose.

  6. ; is a polynomial term combinator denoting serial compose.

  7. -< is sugar for [split] copying the lhs obj across all polynomial terms.

  8. >- is sugar for [merge] joining polynomial terms into a single obj.

  9. = is sugar for [combine] composing polynomial terms pair-wise.

The reason for the [split] sugar symbol -<, is that it represents one wire (-) splitting into many (<). Likewise, the reason for >- being the [merge] sugar symbol is it represents many wires merging (>) into one (-). Finally, [combine] has a sugar of = which represents parallel wires being operated on independently.

A polynomial is a linear combination of terms composed of coefficients and indeterminates typically expressed as \[ p = q_0 x^0 + q_1 x^1 + q_2 x^2 + \ldots + q_n x^n, \] where \$q_i\$ is a coefficient, \$x^i\$ is an indeterminate, \$q_i x^i\$ is a term, and the terms are linearly combined via +.

In mm-ADT, the set \$\tt{poly} \subset \tt{obj}\$ are polynomials which, in mmlang, are expressions of the form

(x0{q0},x1{q1},x2{q2},…​,xn{qn})

where qi is a quantifier (coefficient), xi is an object (indeterminate), xi{qi} is an obj (term), and the objs are linearly combined via ,. This particular poly is called a ,-poly that captures the typical realization of a polynomial. As a data structure, a ,-poly is a multi-set or bag. As an algebraic structure, a ,-poly is a polynomial ring that faithfully encodes the mm-ADT processor’s additive and multiplicative type ringoid magmas as a free abelian group \$\langle F(\tt{obj}),+\rangle\$ and a non-free monoid \$\langle \tt{obj},\ast \rangle\$, respectively.

poly columns

The illustration (or suggestive diagram) on the left depicts an element of a free ring. There are four multiplicative monoid compositions rooted at \$a\$ and going up via a chain of \$\ast\$-compositions and there is a single additive abelian group element diagrammed left-to-right. As horizontal and vertical compositions, each depicts element(s) of a free magma. A poly enables one magma to remain free (unevalated) and the other to be non-free (evaluated). In particular, as a classic polynomial ring, ,-poly maintains the free additive group separate from the non-free multiplicative monoid such that the diagram’s vertical \$\ast\$-compositions would "collapse" yielding 4 isolated terms (objs) specified in mmlang as

(abc{q0},ad{q1},abe{q2},adeb{q3}).

If the multiplicative monoid compositions yielded values (evaluations) as opposed to types (compilations) and \$x_0 = a(bc)\$, \$x_1 = ad\$, \$x_2 = a(be)\$, and \$x_3 = a(d(eb))\$, then the resultant ,-poly structure would be

(x0{q0},x1{q1},x2{q2},x3{q3}).

For visual simplicity, quantifiers were not diagrammed. Furthermore, the reason that every term of the multiplicative monoid only has a single quantifier is due to the universal commutativity of coefficients theorem of stream ring theory. This theorem is used in mm-ADT compiler optimization and will be discussed at length in a later section of the documentation.

In general, an mm-ADT poly is used to decouple two algebraic structures such that one is free and the other non-free. From the perspective of applied computing, the free algebras are delayed, unevaluated structures and the non-free algebras are eager, evaluated structures. There are three kinds of poly. Each draws their terms from the same obj carrier set and are distinguished, in mmlang, by their respective term separator which denotes a particular "additive" binary operator: , (orThen), | (xorThen), and ; (andThen), where , has a free additive abelian group, | a free additive group, and ; a free semicommutative multiplicative monoid. The general form of each kind of poly is expressed below in mmlang.

(x0{q0},x1{q1},…​,xn{qn})
(x0{q0}|x1{q1}|…​|xn{qn})
(x0{q0};x1{q1};…​;xn{qn})

symbol name structure description illustration

,-poly

orThen

polynomial ring

nondeterministic parallel branching and products

comma poly example

|-poly

xorThen

monoid ring

deterministic if/else-branching and co-products

pipe poly example

;-poly

andThen

trace monoid

type rewriting and meta-programming

semi poly example

Lst and Rec

A poly is an obj defined by the following mmlang grammar fragment.

sep   ::= ';' | ',' | '|'
lst   ::= '(' obj? (sep obj)* ')' q?
rec   ::= '(' (obj '->' obj)? (sep obj '->' obj)* ')' q?
poly  ::= lst | rec

Like obj, there are value-polys and type-polys. A value-poly is a collection data structure. A type-poly is a branching data process. mm-ADT provides two structures that are generally useful when working with poly: a lst (list) and a rec (record).

poly kinds

  • lst

    • A lst containing all values is a list.

    • A lst containing a type is a copy branch.

  • rec

    • A rec containing all values is a record.

    • A rec containing a type is a predicate branch.

\[ \begin{split} \texttt{poly} &= \texttt{lst} &+ \texttt{rec} \\ \texttt{poly} &= (\texttt{,-lst} + \texttt{|-lst} + \texttt{;-lst}) &+ (\texttt{,-rec} + \texttt{|-rec} + \texttt{;-rec}) \end{split} \]

mmlang> ('a','b','c')                                               (1)
==>('a','b','c')
mmlang> ('a'->1,'b'->2,'c'->3)                                      (2)
==>('a'->1,'b'->2,'c'->3)
mmlang> ([plus,'a'],[plus,'b'],[plus,'c'])                          (3)
==>(_[plus,'a'],_[plus,'b'],_[plus,'c'])
mmlang> ([is,[eq,'a']]->1,[is,[eq,'b']]->2,[is,[eq,'c']]->3)        (4)
==>(_[is,_[eq,'a']]->int,_[is,_[eq,'b']]->int,_[is,_[eq,'c']]->int)
1 A three term value ,-lst.
2 A three term value ,-rec.
3 A three term type ,-lst.
4 A three term type ,-rec.

In the examples below, the value 3 is [split] (-<) across the respective poly terms.

mmlang> 3-<([plus,1],[mult,2])     (1)
==>(4,6)
mmlang> 3-<([plus,1]|[mult,2])     (2)
==>(4|)
mmlang> 3-<([plus,1];[mult,2])     (3)
==>(4;8)
1 A ,-lst type yielding a ,-lst value via (3+1,3*2) (orThen).
2 A |-lst type yielding a |-lst value via (3+1,obj{0}) (xorThen).
3 A ;-lst type yielding a ;-lst value via (3+1,3+1*2) (andThen).

The resultant polys can then have their terms [merged] (>-) into a single int{1,2} strm.

mmlang> 3-<([plus,1],[mult,2])>-   (1)
==>4
==>6
mmlang> 3-<([plus,1]|[mult,2])>-   (2)
==>4
mmlang> 3-<([plus,1];[mult,2])>-   (3)
==>8

The "split-merge" pattern -<()>- is common enough that a unified instruction exists.

mmlang> 3[[plus,1],[mult,2]]       (1)
==>4
==>6
mmlang> 3[[plus,1]|[mult,2]]       (2)
==>4
mmlang> 3[[plus,1];[mult,2]]       (3)
==>8

Finally, the most compact sugar forms of the running "split-merge" example is presented below. Given the verbosity of branching expressions, all subsequent examples will typically leverage as many sugar constructs as possible to ensure readability.

mmlang> 3[+1,*2]                   (1)
==>4
==>6
mmlang> 3[+1|*2]                   (2)
==>4
mmlang> 3[+1;*2]                   (3)
==>8
The [split] (-<) instruction creates a poly for each incoming obj. This poly denotes coupled parallel streams, where each subsequent instruction operates on each term of the poly in a lock-step manner. However, if there are no intermediate instructions between [split] and [merge] (typically specified with [combine]), then this is semantically equivalent to decoupled parallel streams whereby whenever an obj is made available by a branch, it can be immediately placed on the outgoing merged stream. This holds because streams are always unordered.
orThen ,-poly

A ,-poly is a polynomial ring with commutative addition defined as

\[ \texttt{t0{q0},t1{q1}} = \begin{cases} \texttt{t0\{q0+q1\}} & \text{if } \texttt{t0==t1}, \\ \texttt{t0{q0},t1{q1}} & \text{otherwise.} \end{cases} \]

If the indeterminates of two terms are equal, then they can be merged by summing their coefficients. In the lexicon of mm-ADT, if two types are equal, then their quantifiers can be summed using the additive operator of the respective quantifier ring. Within the category of the inst monoid, the following mmlang examples highlight the salient features of |-poly addition.

mmlang> 'mar'[[plus,'ko'],[plus,'ko']]      (1)
==>'marko'{2}
mmlang> 'mar'[[plus,'ko']{2}]               (2)
==>'marko'{2}
mmlang> 'mar'[plus,'ko']{2}                 (3)
==>'marko'{2}
mmlang> 'marko'{2}                          (4)
==>'marko'{2}
1 The 'mar' str is copied as input to two [plus,'ko'] branches.
2 Branch aggregation by coefficient summation is possible because the two branches are equal.
3 With only one "branch," the expression can be extracted from [branch].
4 The expression has reached a fixpoint and thus, is solved.

In the category poly, the previous mmlang examples are written as follows.

mmlang> 'mar'[[plus,'ko'],[plus,'ko']]
==>'marko'{2}
mmlang> mmlang> 'mar'[[plus,'ko']{2}]
==>bool{2}<=mmlang[gt,'mar']-<(bool{2}<=bool[plus,'ko']{2})>-
mmlang> [[start,'mar'][plus,'ko'],[start,'mar'][plus,'ko']]
==>'marko'{2}

The last expression demonstrates a property of all rings: multiplication both right and left distributes over addition. \[ \begin{split} a(b+c) &=& \; ab + ac \\ (b+c)a &=& \; ba + ca \end{split} \]

mmlang> 'a'[+'b',+'c']
==>'ab'
==>'ac'
mmlang> ['b','c']+'a'
==>'ba'
==>'ca'

Two ,-poly terms can be added if their indeterminates are equal. For the monotypes such as bool, int, real, and str, equality is based on the grounded value of the type. Equality ignores quantification and trace history.

mmlang> 'a'=='a'
==>true
mmlang> 'a'=='ab'
==>false
mmlang> 6==6
==>true
mmlang> 6==6{10}
==>true
mmlang> 6{20}==6{10}
==>true{20}

Every type is either a ctype or a dtype. A ctype does not have a trace. A dtype’s trace is the list of instructions rooted at a ctype. Type equality is based on ctype and trace equality.

mmlang> (int)==(int)
==>true
mmlang> (int)==(int[plus,10])
==>true
mmlang> (int[plus,10])==(int[plus,10])
==>true

In the example below, the two traces are different, but the semantics of the expression are the same. Thus, as a polynomial, these two terms should be combined. This is accomplished via a particular type of rewrite system for type rewrites. Given a specification of type equivalences (as defined by their trace), int[plus,10][plus,0] is rewritten to int[plus,10]. At which point the two types are equivalent and can be merged accordingly.

mmlang> (int[plus,10])==(int[plus,10][plus,0])
==>true
xorThen |-poly

A |-poly is a polynomial ring with non-commutative addition defined as

\[ \texttt{t0{q0}|t1{q1}} = \begin{cases} \texttt{t0{q0}} & \text{if } x⇒\texttt{t0{q0}} \notin \texttt{obj{0}}, \\ \texttt{t1{q1}} & \text{if } x⇒\texttt{t1{q1}} \notin \texttt{obj{0}}, \\ \texttt{obj{0}} & \text{otherwise.} \end{cases} \]

mmlang> 'mar'[[plus,'ko']|[plus,'io']]          (1)
==>'marko'
mmlang> 'mar'[[plus,'ko']{0}|[plus,'io']]       (2)
==>'mario'
mmlang> 'mar'[[plus,'ko']{0}|[plus,'io']{0}]    (3)
mmlang>
1 The first branch does not yield obj{0} so 'ko' is added to 'mar'. The polynomial reduces to [plus,'ko'].
2 The first branch does yield an obj{0} and the second does not so 'io' is added to 'mar'. The polynomial reduces to [plus,'io'].
3 Both branches yield an obj{0}. The polynomial reduces to obj{0}.
mmlang> 'mar'[[plus,'ko']|[plus,'io']]
==>'marko'
mmlang> 'mar'[[plus,'ko']{0}|[plus,'io']]
==>'mario'
mmlang> 'mar'[[plus,'ko']{0}|[plus,'io']{0}]
mmlang>

Every |-poly non-commutative addition because the order in which the terms/branches are evaluated determines the result of the computation. While ,-poly implements union, |-poly implements coalesce.

andThen ;-poly

A ;-poly is a polynomial ring with semi-commutative "addition," where "addition" is ring multiplication and thus, is monoid multiplication and monoid multiplication is only guaranteed commutative for the filter-subring. Polynomials with this structure are called monoid rings. In mm-ADT, monoid multiplication is standard type composition and is defined as

\[ \texttt{t0{q0};t1{q1} = t0;t1\{q0*q1\}} \]

mmlang> 'mar'[[plus,'k']{2}|[plus,'o']{3}]  (1)
==>'mark'{2}
mmlang> 'mar'[[plus,'k']{2}|[plus,'o']{0}]  (2)
==>'mark'{2}
1 The two branches are serially composed to create a single "branch" with a quantifier that is the product of the two original branch quantifiers.
2 The two branches are serially composed, but the second branch has a 0-quantifier and thus, the resolution goes to obj{0} as 2 * 0 = 0.
mmlang> 'mar'[[plus,'k']{2}|[plus,'o']{3}]
==>'mark'{2}
mmlang> 'mar'[[plus,'k']{2}|[plus,'o']{0}]
==>'mark'{2}

The [compose] instruction is a higher-order instruction that yields the same result as fundamental instruction composition/concatenation.

mmlang> 'mar'[plus,'k']{2}[plus,'o']{3}
==>'marko'{6}
mmlang> 'mar'[plus,'k']{2}[plus,'o']{0}
mmlang>

Moreover, a poly can be inspected using standard list instructions. Below demonstrates that the ;-poly[merge] (>-) is equivalent to ;-poly[last].

mmlang> 'mar'-<([plus,'k'];[plus,'o'])
==>('mark';'marko')
mmlang> 'mar'-<([plus,'k'];[plus,'o'])>-
==>'marko'
mmlang> 'mar'-<([plus,'k'];[plus,'o'])[last]
==>'marko'
Poly Domain Instructions

A poly has various interpretations including: list, vector, array, and program. The table below presents the set of instructions whose domain is poly.

inst arg(s) range style description mmlang vpoly example

[plus]

poly

poly

list

term expression concatenation

mmlang> ('a','b')+('c','d')
==>('a','b','c','d')

[mult]

poly

poly

vector

dot product that is distributive over addition via FOIL

mmlang> ('a','b')*('c','d')
==>(('a','c'),('a','d'),('b','c'),('b','d'))

[hmult]

poly

poly

vector

Hadamard pairwise product

mmlang> ('a','b')=('c','d')
==>('c','d')

[head]

obj

list

the first term of the polynomial

mmlang> ('a','b')[head]
==>'a'

[tail]

poly

list

the polynomial without the first term

mmlang> ('a','b')[tail]
==>('b')

[last]

obj

list

the last term of the polynomial

mmlang> ('a','b')[last]
==>'b'

[get]

int

obj

array

term access by index

mmlang> ('a','b','c').1
==>'b'
mmlang> ('a','b','c').3
language error: poly index is out of bounds: 3

[put]

int,obj

poly

array

term insertion term by index

mmlang> ('a','b','c')[put,1,'ab']
==>('a','ab','b','c')
mmlang> ('a','b','c')[put,5,'e']
==>('a','b','c','e')
Split and Merge
inst arg(s) range description mmlang vpoly example

[split]

poly

poly

mmlang> -<('a','b','c')
==>('a','b','c')

[merge]

obj

mmlang> -<('a','b','c')>-
==>'a'
==>'b'
==>'c'

poly op mmlang example

;-apply

-<(;)

mmlang> 6-<(_;_)
==>(6;6)
mmlang> 6-<(_+1;_>6)
==>(7;true)

,-copy

-<(,)

mmlang> 6-<(_,_)
==>(6,6)
mmlang> 6-<(_>0,_>8)
==>(true,false)

|-pick

-<(|)

mmlang> 6-<(_|_)
==>(6|)
mmlang> 6-<(_>0|_>8)
==>(true|)
poly op mmlang example

;-return

(;)>-

mmlang> 6-<(_;_)>-
==>6
mmlang> 6-<(_+1;_>6)>-
==>true

,-merge

(,)>-

mmlang> 6-<(_,_)>-
==>6{2}
mmlang> 6-<(_>0,_>8)>-
==>true
==>false

|-coalesce

(|)>-

mmlang> 6-<(_|_)>-
==>6
mmlang> 6-<(_+1|_>6)>-
==>7

Poly Patterns
Records

The examples thus far have used the lst version of poly. The slots of a lst poly are indexed by int (ordered from 0 to \$n\$). For complex poly structures such as nested polys or polys with many slots, working with int indices is cumbersome and error prone. The rec polys alleviate the problem while providing added expressivity. The slots of a rec are indexed by an arbitrary obj key.

A lst is analogous to a list or array and a rec is analogous to a record or map.

There are three primary use cases for rec.

  1. Labeling poly indices.

    mmlang> ('marko',29)
    ==>('marko',29)
    mmlang> ('marko',29).0                    (1)
    ==>'marko'
    mmlang> ('marko',29).1
    ==>29
    mmlang> ('name'->'marko','age'->29)
    ==>('name'->'marko','age'->29)
    mmlang> ('name'->'marko','age'->29).name  (2)
    ==>'marko'
    mmlang> ('name'->'marko','age'->29).age
    ==>29
    1 The slots of a "person" lst are accessed with int values.
    2 The slots of a "person" rec are accessed with str values.
  2. Accessing multiple slots at a time.

    ('a'->1,'b'->2,'c'->3,'d'->4)
    ('a'->1,'b'->2,'c'->3,'d'->4)[get,is>'a']
    ('a'->1,'b'->2,'c'->3,'d'->4)[get,is>'b']
  3. Branching with predicates.

    mmlang> {1,2,3}-<(is==2 -> 'name' | is>2 -> 'is' | int -> 'my')
    ==>(1->'my')
    ==>(2->'name')
    ==>(3->'is')
    mmlang> {1,2,3}-<(is==2 -> 'name' | is>2 -> 'is' | int -> 'my')>-
    ==>'my'
    ==>'name'
    ==>'is'

The poly recs are ordered. Even though slots can be uniquely identified by their obj-key, equality is dependent on position.

mmlang> (1->'a',2->'b')==(1->'a',2->'b')
==>false
mmlang> (1->'a',2->'b')==(2->'b',1->'a')
==>false

Order-based equality ensures the semantics of ;-rec and |-rec, which are non-commutative. The rec that is generated from a -< split has both the keys and the slots resolved according to the rules of juxtaposition . If two keys yield the same result, then their slots are merged as specified by the poly summation operator.

rec orders
mmlang> 23-<(is>10->'a',int->'b')
==>(23->str{2})
mmlang> 23-<(is>10->'a'|int->'b')
==>(23->'a')
mmlang> 23-<(is>10->'a';int->'b')
==>(23->'b')
Lists
Stream Containers
{0,1}-Boolean Matrices

Hadamard pair-wise product can be used to filter specific terms out of a poly in a manner analogous to \$\{0,1\}\$-boolean matrices in linear algebra. As values do not pass through values, [hmult] is provided a tpoly where the slots to filter have a {0} quantification (e.g., obj{0}) and the slots to keep should maintain an identity (e.g. [id] or [noop]).

mmlang> ('a','b','c')=(obj{0},obj{0},str[id])   (1)
==>(,,'c')
mmlang> ('a','b','c')=(,,str[id])               (2)
==>(,,'c')
mmlang> ('a','b','c')=(,,_[id])                 (3)
==>(,,'c')
mmlang> ('a','b','c')=(,,_)                     (4)
==>(,,'c')
1 A fully typed \$\{0,1\}\$-polynomial.
2 An empty slot is mmlang sugar for obj{0}.
3 The anonymous type _[id] is compiled to str[id].
4 The anonymous type _ is compiled to str (i.e. str[noop]).

The identity matrix is a \$\{0,1\}\$-matrix that when multiplied using standard matrix product, the result is equivalent to [id].

\[ \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{pmatrix} \cdot \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{pmatrix} \]

However, with pair-wise product, only the main diagonol remains.

\[ \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{pmatrix} \bullet \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 9 \\ \end{pmatrix} \]

((1,2,3),(4,5,6),(7,8,9))=(=(_,,),=(,_,),=(,,_))
((1,2,3),
 (4,5,6),
 (7,8,9))
    =
 (=(_,,),
  =(,_,),
  =(,,_))
Meta-Programming

Identity Types

If \$\langle M,\cdot,e\rangle\$ is a monoid, where \$e \in M\$ is the identity element and there exists an element \$e' \in M\$ that also acts as an identity such that for every \$ x \in M \$, \$x \cdot e = x\$ and \$x \cdot e' = x\$, then because \$e \cdot e' = e\$ and \$e \cdot e' = e'\$, it is the case that \$e = e \cdot e' = e'\$ and \$e = e'\$. Thus, every monoid has a unique identity. However, in a free monoid, where element composition history is preserved, it is possible to record \$e\$ and \$e'\$ as distinctly labeled elements even though their role in the non-free monoid’s binary composition are the same — namely, that they both act as identities.

It is through distinct identities that mm-ADT supports dynamic computational state (e.g. variables) and static type specifications (e.g. definitions). In particular, both these programming idioms are supported by [to] and [define], respectively.

Variables
linear variable example
mmlang> 9 => int[plus,1]<x>[plus,2][mult,x]         (1)
==>120
mmlang> 9 => int[plus,1]<x>[plus,2][mult,x][trace]  (2)
==>(9;[plus,1];10;<x>;10;[plus,2];12;[mult,10])
mmlang> int[plus,1]<x>[plus,2][mult,x][explain]     (3)
==>'
int[plus,1]<x>[plus,2][mult,int<.x>]

instruction       domain      range state
------------------------------------------
[plus,1]          int    =>   int
[plus,2]          int    =>   int    x->int
[mult,int<.x>]    int    =>   int    x->int
'
1 The <x> instruction stores the value 10 in the value’s path through the obj graph.
2 The [trace] instruction provides the automaton’s path through the obj graph as ;-lst.
3 The [explain] instruction details the scope of state variables.
Definitions

Higher Order Types

In mm-ADT, every type \$A\$ is a space, where the points of the space are mm-ADT objs (types and values) and the structure distinguishing \$A\$ from being a common set is the inst relations linking the objs. Every mm-ADT type defines a directed labeled multigraph called a type graph. If \$x,y:A\$ are two objs in \$A\$, then the path \$x \to_{A} y\$ denotes a continuous walk that starts at \$x\$ and ends at \$y\$ proving that \$x\$ is "\$A\$-reduced" to \$y\$. Each step of that walk is an instruction forming an edge in the type graph. For instance, the type

int[is,[eq,int]]

describes a reflexive graph with vertices are in int and edges labeled [is,[eq,int]]. There are other mm-ADT types (an infinite amount in fact) that are related to int[is,[eq,int]] by a graph homomorphism. A few such types are itemized below.

int[plus,0]
int[mult,1]
int[plus,1][plus,-1]
int[plus,2][plus,-2]
int[plus,3][plus,-3]
...

Note a common pattern. The type int[plus,x:int][plus,x:int[neg]] captures an infinite number of more specific types that are homomorphic to int[is,[eq,int]]. Higher order type are defined using mm-ADT polys.

(int[is,[eq,int]])<=(int[plus,x:int][plus,x:int[neg]])

In category theory, a type-to-type mapping that preserves structure is called a functor. From the perspective of mm-ADT, two new "ctypes" are linked in a domain/range <=-relation within a higher-order identity type with respective signature. The identity is apparent in that there are no instructions required to coerce an obj of the domain to an obj of the range. This is analogous to int<=int, save that the mapping is not predicated on isomorphism, but on a broader relationship called a surjective homomorphism (a reduction). Any type containing only a type signature ()<=() says that the domain type is equal to the range type such that the domain can be substituted for the range.

[define,(int[is,[eq,int]])<=(int[plus,x][plus,x[neg]])][int][plus,10][plus,-10]
type to type type

The type int[is,[eq,int]] has an isomorphic image in int, where the reflexive self-loop paths in int[is,[eq,int]] are contracted to 0-length paths. The isomorphism realizes int as a classic set without structure because the type int has no instructions and thus, no type graph edges.

mmlang> [define,int<=(int[is,[eq,int]])]
        [define,(int[is,[eq,int]])<=(int[plus,x:int][plus,x:int[neg]])]
        [int][plus,10][plus,-10]
==>int

The composition of two structure preserving type morphisms yields a new type—​namely, int<=(int[plus,x:int][plus,x:int[neg]]).

Deep Dive 2. mm-ADT Types are Graphs and their Values Realize Paths

In the set theoretic interpretation of types, the mm-ADT type

int[is>0]

would be considered a reasonable type — it is the set of integers greater than 0. This type defines a subtype of int using a predicate (\$ \tt{g\t}:\mathbb{N} \times \mathbb{N} \rightarrow \mathbb{B}\$). On the other hand, the type

int[plus,1]

would be considered unreasonable as it is a transformation function, not a characteristic function. However, in mm-ADT, both are types. An obj is a member of a particular type if and only if the type’s function (as algorithmically defined by its instructions) does not map the obj to the terminal obj{0}. If \[ f: A \rightarrow B \] is the function of type B<=A[f], then \[ f(a) = \begin{cases} b\{*\} & \text{if $a$ is a value of the type}, \\ b\{ 0\} & \text{otherwise.} \end{cases} \]

An mm-ADT type graph denotes a type whose values are those objs that realize a path in that graph.

Language Algebras

The mm-ADT VM is a computing machine founded on a non-commutative polynomial ring called a stream ring. The practical mental-model that mm-ADT purports is that of streams of data flowing through functions (instructions) composed serially (*) and in parallel (+). Moreover, these streams can be split and merged via the product and coproduct of streams of data whose size and dynamics are regulated by the coefficients (or quantifiers) of the respective polynomial ring. The physical manifestation of this algebra is manipulated by mmlang. However, there are two other intervening algebras (at the processor and storage levels), where ultimately, the mmlang ringoid realizes a correspondence by means of a ring action upon a magma.

  1. Obj Magma (storage): The set of all objs along with a single, non-associative binary juxtaposition operator (denoted with a blank space).

    \[ \mathbf{Obj} = \langle \texttt{obj}, \;\; \rangle \]

  2. Inst Monoid (language): A nested monoid with an associative, non-commutative binary compose operator that corresponds to the processor’s instruction pipeline.

    \[ \mathbf{Inst} = \langle \texttt{inst}, *, 1, 0 \rangle \]

  3. Type Ringoid (processor): A generalization of the inst monoid that supports the composition of polynomials for the construction of serial (*), parallel (+), and parallel choice (|) pipelines, where \$ \mathbf{Inst} \subset \mathbf{Trace} \$. However, there exists an information preserving automorphism from \$\mathbf{Trace}\$ to \$\mathbf{Inst}\$ (a self embedding). This generalization provides greater flexibility for expressing a wider range of common computational patterns.

\[ \textbf{Trace} = \langle \texttt{inst} \cup \texttt{poly}, *, +, |, 1, 0 \rangle \]

Obj Magma

A magma is a possibly associative algebraic structure with a single binary operator. Let \[ \mathbf{Obj} = \langle \texttt{obj}, \;\; \rangle \] be a magma with obj denoting the set of all quantified mm-ADT objects and \$ : \tt{obj} \times \tt{obj} \rightarrow \tt{obj}\$ the binary juxtaposition operated (denoted by a blank space). There are four types of juxtaposition:

  1. A type juxtaposed to a value yields the value whose quantifier is multiplied by the type’s quantifier.

    t1{q0}<=t0[a][b] v2{q1} = v2{q0*q1}

  2. A type juxtaposed to a type yields a type whose domain is the left type’s domain and whose range is the right type’s range, where instructions are concatenated (juxtaposed) and respective quantifiers multiplied.

    t1{q0}<=t0[a][b] t2{q2}<=t1[c][d] = t2{q0*q1}<=t0[a][b][c][d]

  3. A value juxtaposed to a type yields an obj (typically a value) that is the application of the type to the value.

    v0{q0} t2{q1}<=t1[a][b] = b(a(v0)){q0*q1}

  4. A value juxtaposed to a value yields the right hand value with quantifiers multiplied.

    v0{q0} v1{q1} = v1{q0*q1}

This is the simplest algebraic structure describing mm-ADT. This interpretation of mm-ADT pushes the rules of branching and its additive effects on quantification to the type and thus, to the respective [branch] and [choose] instructions contained therein. The one non-associative context that renders the algebra a magma is the three element juxtaposition of type value type, where it is generally true that

(t0 v1) t2 \$ne\$ t0 (v1 t2).

Inst Monoid

Monoids

A monoid is a structure of the form \$\langle A,\ast \rangle\$, where \$A\$ is the carrier set closed under the associative binary operator \$\ast: A \times A \rightarrow A\$ with \$1 \in A\$ being the identity such that for every \$a,b,c \in A\$, \$(a \ast b) \ast c = a \ast (b \ast c)\$ and \$a \ast 1 = 1 \ast a = a\$.

mm-ADT types serve numerous roles which are typically realized by many sorts of objects in other programming environments. The reason for this singular use is quite literally because the mm-ADT type system (and value system) is inductively generated from the mm-ADT VM’s instruction set architecture which is the generating set of the free inst monoid

\[ \langle F(\texttt{obj}), \ast, \texttt{inst}, \texttt{[id]} \rangle, \]

where obj is the carrier set, \$\ast: \tt{obj} \times \tt{i\nst} \to \tt{obj} \$ is the binary instruction application operator, \$\tt{i\nst} \subset \tt{obj}\$ are parameterized instructions, and \$\tt{[id]} \in \tt{i\nst}\$ is the identity element. While the richest algebra describing mm-ADT is a ring with unity, it is important to note that this "instruction-application" interpretation provides enough expressivity to faithfully capture all possible mm-ADT computations. For this reason, the inst monoid is a syntactic monoid as any language capable of generating its elements can express any mm-ADT VM computation.

Monoidal Cayley Graph

The generative Cayley graph encoding of the inst monoid is the mm-ADT VM’s primary data structure. If obj is the set of all vertices and inst \$\subset\$ obj is the generating set, then the inst monoid acts on the obj graph where \[ V = \texttt{obj}, \;\;\; I = \texttt{inst}, \;\;\; E \subseteq V \times I \times V. \] The edge \$(a,i,b) \in E\$, notated as \$a\to_{i}b\$, has the corresponding mmlang denotation: b<=a[i].

mmlang> int                                         (1)
==>int
mmlang> int[plus,1]                                 (2)
==>int[plus,1]
mmlang> int[plus,1][mult,[plus,10]]                 (3)
==>int[plus,1][mult,int[plus,10]]
mmlang> int[plus,1][mult,[plus,10]][explain]        (4)
==>'
int[plus,1][mult,int[plus,10]]

instruction            domain      range state
-----------------------------------------------
[plus,1]               int    =>   int
[mult,int[plus,10]]    int    =>   int
 [plus,10]              int   =>    int
'
1 mm-ADT obj graph has an identity for each ctype (mult-rooted Cayley graph).
2 \$\tt{i\nt} \to_{[\tt{plus,1}]} \tt{i\nt}'\$ (or int<=int[plus,1]) is the Cayley source, label, and target.
3 An inst can be nested. Labels denoting trees of instructions.
4 The domain/range (source/target) of \$a \to_i b\$ edges in the Cayley graph via [explain].
Deep Dive 3. An Instruction Only Free Monoid

The mm-ADT obj graph has 5 roots serving as the 1 identity for each of the 5 ctypes. There is a single root for all of mm-ADT in obj{0}. The x<=[start,x] instruction is an initial instruction in that it returns it’s arguments regardless of the input obj (or lack thereof).

\[ \texttt{[start,x]}: \texttt{obj{0}} \to \texttt{x} \]

An mm-ADT program is a type defined recurssively as \$ \tt{type} = (\tt{type} \times \tt{i\nst})\$. As such, an mm-ADT program is not a composition of instructions. The way in which a dtype can be ground to a ctype using only instructions is via the [start] instruction. A single pass through the mm-ADT compiler yields a type trace based at a ctype.

start ctypes
mmlang> [start,int]
==>int
mmlang> [start,int][plus,5]
==>int[plus,5]
mmlang> [start,int][plus,5][gt,10]
==>bool<=int[plus,5][gt,10]

The [noop] instruction is the only mm-ADT instruction that does not alter the state of the obj trace graph and thus, the state of the computation. The instruction [id], on the other hand, does.

mmlang> [start,int][plus,5][gt,10][noop]
==>bool<=int[plus,5][gt,10]
mmlang> [start,int][plus,5][gt,10][id]
==>bool<=int[plus,5][gt,10][id]
The Foundational obj Graph
cayley example

Every aspect of an mm-ADT computation from composition to evaluation is realized on this graph:

  1. Composition: The point-free style of mmlang is a function of a source vertex following by a series of instructions that yield intermediate vertices along the way.

  2. Compilation: A path in the Cayley graph represents a program. By altering the head of that path with a type, the path is re-evaluated compiling the program with (potentially) a different path through the Cayley graph.

  3. Rewrite: The vertices can be "merged" using [rewrite] instructions that specify a domain pattern that is equivalent to a range pattern.

  4. Optimization: Every instruction in inst has an associated cost. Rewrites create a superposition of programs. A weighted shortest path calculation from domain to range is a simple technique for choosing an efficient execution plan.

  5. State: Variable bindings, type definitions, and rewrite rule are encoded in instructions ([to], [define], [rewrite] respectively). When the current monoid operation requires historic state information, the co-Cayley graph is searched for previously encoded inst terms.

  6. Evaluation: When a program path is prefixed with a value, a computation takes place whereby the binary * operator no longer operates as a free algebra, but collapses two elements to one. The path’s tail element is the result of the computation.

Subgraph Algebras

While the Cayley graph encoding respects the axioms of a monoid, certain subgraphs have richer structure. Examples include:

  1. Filter instructions \$(\tt{fil\ter} \subset \tt{i\nst})\$: is an idempotent commutative monoid.

  2. Map instructions \$(\tt{map}^{-1} subset \tt{i\nst})\$: is the subset of map instructions that are invertible. This subgraph is generated by a group.

  3. Poly instructions \$(\tt{poly} \subset \tt{i\nst})\$: lifts subgraphs into a higher-order vertex. This directed hypergraph, where many vertices link to many vertices, is how [rewrite] yields endofunctors that link disparate areas of the graph not explicit in the inst monoid.

Type Ringoid

Stream Ring Theory

Stream ring theory is a ring algebra defined by the direct product of a function and coefficient ring, where every function/coefficient pair is an element of the carrier of a polynomial stream ring. The algebra is useful in asynchronous distributed computing environments that primarily enjoy embarrassingly parallel processing, but where, at certain space and time synchronization points, large amounts of data need to be co-located for processing. A proto-version of the stream ring algebra was realized in the distributed graph computing framework Apache TinkerPop.

mm-ADT adopts the algebra of stream ring theory, where functions are instructions and coefficients are quantifiers. mm-ADT’s type system is realized as a multi-sorted extension of the algebra.

mm-ADT’s type system is founded on a multi-sorted ring with unity called the mm-ADT type ringoid. An algebraic ring is composed of a multiplicative monoid \$\langle A,\ast,1 \rangle\$ and a commutative additive group \$\langle A,+,0\rangle\$ that share the same carrier set \$A\$. A ringoid generalizes the mathematics of a ring to support the ring axioms on multi-sorted carriers \$A,B,C\$, etc. (i.e. typed structures). The mm-ADT type ringoid generator is the set of all ctypes and single instruction dypes covering inst, where, with the ringoid’s multiplicative operator (*) and commutative additive operator (+), when faithfully applied according to sort, induce the set of all possible mm-ADT types.

  • The additive operator + is structurally encoded using a ,-poly, where each slot of the polynomial is an independent parallel type.

  • The multiplicative operator * is structurally encoded using a ;-poly, where each slot of the polynomial is a dependent serial type.

  • The additive identity 0 is the polymorphic anonymous ctype _{0}.

  • The multiplicative identity 1 is the polymorphic anonymous ctype _{1}, which in mmlang is simply denoted _.

The algebra underlying most type theories operate as a semiring(oid), where the additive component is a monoid as opposed to an invertible group. In mm-ADT, the elements of the additive component can be inverted by their corresponding negative type (or negative obj in general). Thus, mm-ADT realizes an additive groupoid, where, for example, the ,-poly [int{1},int{-1}] merges to int{0} which is isomorphic to the initial obj{0}.
The Free Poly Monoid

Every corresponding mmlang expression makes use of poly-types. In mmlang, a poly can be denoted as a structure via [ ] (a value) or as a process via < > (a type). The example 3-slot int ,-poly below has a domain of int and a range of int{1,3}. This branch structure will product one, two, or three ints given a single int.

mmlang> int[int[is>0],int[is<0],int]
==>int{1,3}<=int-<(int{?}<=int[is,bool<=int[gt,0]],int{?}<=int[is,bool<=int[lt,0]],int)>-
op poly inst meta

\$\ast\$

;

[compose]

[mult]

\$+\$

,

[branch]

[plus]

\$o+\$

|

[choose]

[plus]

An mm-ADT poly (polynomial) is an element of either a serial (;), parallel (,), or choice (|) free trace monoid that is left-adjoint to a respective \$langle \tt{obj},\tt{[compose] \rangle\$, \$langle \tt{obj},\tt{[branch]} \rangle\$, or \$langle \tt{obj}, \tt{[cho\ose]} \rangle\$ non-free, reductive monoid. The nabla functions, \$\nabla: \text{-poly} \rightarrow \tt{obj} \$, folds a free poly structure into an obj{*} via the applications of the reducing monoid’s operator in inst such that \[ \begin{split} \nabla^;(\texttt{poly}) &=& \texttt{[compose}, \texttt{poly}^{q_0}_0, \texttt{poly}^{q_1}_1,\ldots, \texttt{poly}^{q_n}_n,\texttt{]} \text{ with } q=\prod_{i<n} q_i, \\ \nabla^,(\texttt{poly}) &=& \texttt{[branch}, \texttt{poly}^{q_0}_0, \texttt{poly}^{q_1}_1,\ldots, \texttt{poly}^{q_n}_n,\texttt{]} \text{ with } q=\sum_{i<n} q_i, \\ \nabla^|(\texttt{poly}) &=& \texttt{[choose}, \texttt{poly}^{q_0}_0, \texttt{poly}^{q_1}_1,\ldots, \texttt{poly}^{q_n}_n,\texttt{]} \text{ with } q=(\min_{i<n}(q_i),\max_{i<n}(q_i)), \end{split} \] where \$\nabla\$ is realized as the codiagonal >- (i.e. [merge]) instruction. the resultant obj is quantified within the specified q range. This is the general solution to deriving the type quantifier during compilation and can be further refined using instruction semantics. The following diagrams specify three monoid homomorphisms that couple the poly to inst monoids such that the underlying monoidal processes of the mm-ADT VM can be directly manipulated by any mm-ADT language that supports poly — e.g., within mmlang.


free monoid homomorphisms

Both the -poly and \$\langle \tt{obj},- \rangle\$ monoids derive their carrier sets from the initial set of mm-ADT objs. The injective delta functions, \$\Delta:\tt{obj} \rightarrow \text{-poly}\$, yield the generators of the -poly monoid’s, where \[ \begin{split} \Delta^{;}(\texttt{obj}) &=& [\texttt{obj};], \\ \Delta^{,}(\texttt{obj}) &=& [\texttt{obj},], \\ \Delta^{|}(\texttt{obj}) &=& [\texttt{obj}|], \end{split} \] and \$\Delta\$ the diagonal -< (i.e. [split]) instruction. Likewise, [compose], [branch], and [choose] are the instruction representations of the composition of objs. If \[ \begin{split} U: M \rightarrow \textbf{Set} \end{split} \] maps a monoid to its carrier set, then the diagrams below commute, where the universal property of monoid mappings is realized as the "lifted" poly syntactic category encoding of the mm-ADT instruction set architecture.

monoid homomorphisms

The following diagrams detail the operational semantics of the mm-ADT VM with respects to the coupling between the corresponding free and reductive monoids. The dashed lines provide a disentangled, parallel (product) view of the respective double-line compositions. Finally, the squiggly line joining the two parallel morphisms in the |-poly diagram makes clear that, unlike the branches in ,-poly these branches are coupled to ensure the proper sum type (disjoint union) semantics of the either coproduct.

;-poly \[ \left(\Delta^; \circ \left(f \ast g\right) \circ \nabla^; \right) \] andThen/compose-chain
dependent slots
product
\$A \ast B \implies C\$

semi delta nabla

,-poly \[ \left(\Delta^, \circ \left(f+g\right) \circ \nabla^,\right) \] copy/clone-branching
independent slots
product
\$A+A \implies 2A\$

comma delta nabla

|-poly \[ \left(\Delta^| \circ \left(f \oplus g\right) \circ \nabla^|\right) \] either/choice-branching
dependent slots
coproduct
\$A \oplus A \implies A\$

pipe delta nabla

Free Type Ringoid

The two ;,-poly monoids serve as components of a universal algebra that implements the stream ring algebra — the foundational algebra of the mm-ADT VM. The two poly monoids form a polynomial ring and thus, a syntactic free algebra that enables mm-ADT metaprogramming by way of a ring homomorphism from the structural ;,-poly free ring to the corresponding reductive procedural ring such that \[ ;,-\texttt{poly} \xrightarrow{\;\;\;\nabla^{;}\;\;\;} \langle \texttt{obj},\texttt{[compose][branch]}, \rangle \] where the branch component is, in fact, a group with (a{q},a{-q}) == {0} and [branch,a{q},a{-q}] == {0}.

mmlang> -<('a'{2},'a'{-2})
==>('a'{2},'a'{-2})
mmlang> -<('a'{2},'a'{-2})>-
==>'a'{0}

The |-poly can be appended to the ;,-poly ring, where it serves as an idempotent variation of the additive group commonly used to denote sum types.

In terms of the mm-ADT VMs component architecture, this ring homomorphism maps a language to a processor through the common communication medium of objs provided by storage. All mm-ADT compliant components are faithful to obj and the stream ring axioms that bind them regardless of their particular ring encoding. In this way, the mm-ADT VM remains agnostic to the specifics of the component implementations and thus, mm-ADT supports the creation of synthetic data systems.

ring homomorphisms

Storage Structures

Processor Structures

The programs of the mm-ADT virtual machine are types. From a set of canonical types (ctypes), derived types (dtypes) of arbitrary complexity can be constructed using instructions from the VM’s instruction set architecture. Every mm-ADT type has a corresponding diagrammatic representation that is isomorphic to a directed labeled type graph composed of type-vertices and instruction-edges.

A program’s type graph is the intermediate representation used by the mm-ADT VM to not only link types (encode), but also to compile them (transform/optimize). At execution time, values propagate through the type graph and generate a parallel, homomorphic image of the types as values in the value graph, where the resultant structure of an mm-ADT computation is the obj graph, where

\[ \texttt{obj} = (\texttt{type} \times \texttt{q}) + (\texttt{value} \times \texttt{q}). \]

process stack

Type composition, compilation, and evaluation are carried out by mm-ADT compliant processors. Processors ground the mm-ADT VM to the underlying physical computer (whether on a single machine, via multiple threads, or across compute cluster), where, at the bottom of this process stack, the natural world’s physics provides the baseline dynamics (the fundamental ground of the computation).

This section details the specifics of the relationships between types, values, and processors.

Processes

Processors are used in the following three situations:

  1. Composition: (type inference).

  2. Compilation: (type optimization). (fix point).

  3. Evaluation: (type enumeration).

Algebraic Actions

A thread of execution in the VM maintains two primary references: 1.) a reference to an obj and 2.) a reference to an inst. Via the juxtaposition of obj and inst another obj is realized. When a VM process refers to the obj \$a\$ and the instruction inst \$i\$, then the VM will perform one of the following operations:

  1. If the vertex \$a\$ has no outgoing \$i\$-labeled edge, then the VM applies \$a\$ to \$i\$ to yield the edge \$a \to_i b\$ and arrive at \$b\$ (compute).

  2. If the vertex \$a\$ has an outgoing \$i\$-labeled edge, then the VM traverses the edge \$a \to_i b\$ to arrive at \$b\$ (memoize).

The first situation is computing via function evaluation (save space). The second situation leverages memoization to avoid recomputing (save time). These two situations offer three high-level perspectives on the obj graph.

  1. Mathematically: The obj graph has an infinite number of obj vertices connected to each other by edges labeled from the infinite instruction set architecture inst. From this perspective, computing is traversal (i.e. look-up) as the obj graph is fully materialized.

  2. Theoretically: The obj graph is manifested as computations proceed where, if \$a \in \tt{obj}\$ and \$i \in \tt{i\nst}\$, then any binary operation \$ai\$ that has already been evaluated already exists in the obj graph and as such, can be traversed.

  3. Physically: The obj graph is a dynamic entity that expands and contracts given the resource constraints of the underlying physical machines supporting its manifestation across all levels of the memory hierarchy, where garbage collection prunes the graph and computation grows the graph.

Type Specification

An mm-ADT program in an mm-ADT type. In the type graph (a subgraph of the obj graph) a type is denoted by a vertex (an ungrounded vertex). That vertex is the type’s range. The type’s definition is encoded in the directed, deterministic path that ends at a vertex with no outgoing edges. The resultant vertex is a root vertex and is the type’s domain. If the type definition’s path length is 0, then the domain and the range are equal, and the type is a ctype (a canonical type). If the path length is greater than 0, then the directed binary edges of the path are labeled with instructions from inst. This construction is abstractly represented in the diagram below.

type path

The type graph forms the central structure upon by which various VM processes are enacted. These processes include type/program specification, compilation, optimization, and ultimately, via a homomorphism from the type graph to the value graph, evaluation. Given finite computing resources, the type graph does not exist eternally in a static form ready-made. No, instead, subgraphs of it must be generated. This is accomplished via an action of inst monoid on the set inst* (the Kleene star closure of inst). For instance, in mmlang the user juxtaposes a ctype (domain) and an inst to construct a dtype. That dtype is juxtaposed with another inst to yield another dtype so forth until a desired type is reached.

\[ \texttt{range} = ((((((\texttt{domain} \cdot \texttt{inst}_0) \cdot \texttt{inst}_1) \cdot \texttt{inst}_2) \ldots) \cdot \texttt{inst}_{n-2}) \cdot \texttt{inst}_{n-1}). \]

In general, the action of an inst on a type is the function \[ \texttt{inst}: T \to T, \] where if \$a \in \tt{i\nst}\$, then \[ a(x) = xa. \]

Said plainly, instructions in inst act on types by concatenating themselves to the type definition. Thus, algebraically, a type is an element of the free inst monoid rooted at a ctype.

Type Compilation

Type Optimization

Type Evaluation

A type compiles a type. A type evaluates a value. The inst monoid’s type specification action yields an element in the free inst monoid, which, in the obj graph, is realized as a path from a range vertex to a domain vertex. In the example obj graph encoding below, the range vertex is the type path's source and the domain vertex is the path’s target.

type path

During type evaluation, the type path is reversed to form the co-type path, where the domain vertex is the source and the range vertex is the target.

co type path

If \$x \in \tt{value}\$, then \$x\$ is propagated along the co-type path, where the \$\tt{domai\n}\$ and \$\tt{rang\e}\$ types perform runtime type checking and the instructions transform the source \$x\$ value at each step into the resultant \$y\$ value.

\[ y = ((((((((x \cdot \texttt{domain}) \cdot \texttt{inst}_0) \cdot \texttt{inst}_1) \cdot \texttt{inst}_2) \ldots) \cdot \texttt{inst}_{n-2}) \cdot \texttt{inst}_{n-1}) \cdot \texttt{range}). \]

mmlang> int[plus,1][plus,2][plus,3]                                   (1)
==>int[plus,1][plus,2][plus,3]
mmlang> int[plus,1][plus,2][plus,3][path]                             (2)
==>(int;int[plus,1];int[plus,1][plus,2];int[plus,1][plus,2][plus,3])
mmlang> 1=>int=>[plus,1]=>[plus,2]=>[plus,3]=>int                     (3)
==>7
mmlang> 1=>int[plus,1][plus,2][plus,3][path]                          (4)
==>(1;2;4;7)
mmlang> 1=>int[plus,1][plus,2][plus,3][path]>-                        (5)
==>7
1 An int<=int type with a type path length of 5.
2 The co-type path of the previous type encoded in a ;-poly.
3 The step-wise => evaluation of the co-type path.
4 The step-wise => evaluation of the co-type path chambered in a ;-poly.
5 The evaluation of the ;-poly simply returns the last path value.

In the mmlang example above, the step-wise => evaluation of the co-type path is in one-to-one correspondence with the mm-ADT VM’s execution plan. The mm-ADT algebras are particular constraints on the most general algebraic specification of mm-ADT: the obj magma.

1=>int=>[plus,1]=>[plus,2]=>[plus,3]=>int

Type Checking
Instruction Evaluation

Every mm-ADT instruction denotes a unary function, but mm-ADT instructions themselves may contain zero, one, or multiple sub-expressions as arguments. At the mm-ADT type-level, mm-ADT instructions are \$n\$-ary computable relations, where through currying and stream semantics, ultimately, unary functions are realized.

n-Ary Instructions

Instructions that have no arguments and which map one input to one output are nullary instructions. For example, [neg] (negative/negate) is a nullary instruction in the type int[neg] denoting the unary function \[ \begin{array}. \texttt{neg} &:& \mathbb{N} \rightarrow \mathbb{N} \\ \texttt{neg}(x) &\mapsto& -x. \end{array} \]

The unary instruction [plus,2] in int[plus,2] is evaluated by the processor as the unary function \[ \begin{array}. \texttt{plus_2} &:& \mathbb{N} \rightarrow \mathbb{N} \\ \texttt{plus_2}(x) &\mapsto& x + 2. \end{array} \]

Instructions can have arguments that are dependent on the incoming obj (i.e. the unary function argument). For instance, the unary instruction [plus,[mult,3]] in int[plus,int[mult,3]] denotes the unary function \[ \begin{array}. \texttt{plus_mult_3} &:& \mathbb{N} \rightarrow \mathbb{N} \\ \texttt{plus_mult_3}(x) &\mapsto& x + (x * 3). \end{array} \]

Finally, as example instruction when the domain and range differ, [gt,[plus,[id]]] in \[ \tt{bool⇐int[gt,int[plus,int[id]]]} \] denotes the unary function \[ \begin{array}. \texttt{gt_plus_id} &:& \mathbb{N} \rightarrow \{\texttt{true} \cup \texttt{false}\} \\ \texttt{gt_plus_id}(x) &\mapsto& x > (x + x). \end{array} \]

n-Ary Relations

Instruction Classes

Map
Filter
Trace
Branch

Processor Implementations

Reference

mmlang Grammar

obj   ::= (type | value)q
value ::= vbool | vint | vreal | vstr
vbool ::= 'true' | 'false'
vint  ::= [1-9][0-9]*
vreal ::= [0-9]+'.'[0-9]*
vstr  ::= "'" [a-zA-Z]* "'"
type  ::= ctype | dtype
ctype ::= 'bool' | 'int'  | 'real' | 'str' | poly | '_'
poly  ::= lst | rec | inst
q     ::= '{' int (',' int)? '}'
dtype ::= ctype q? ('<=' ctype q?)? inst*
sep   ::= ';' | ',' | '|'
lst   ::= '(' obj (sep obj)* ')' q?
rec   ::= '(' obj '->' obj (sep obj '->' obj)* ')' q?
inst  ::= '[' op(','obj)* ']' q?
op    ::= 'a' | 'add' | 'and' | 'as' | 'combine' | 'count' | 'eq' | 'error' |
          'explain' | 'fold' | 'from' | 'get' | 'given' | 'groupCount' | 'gt' |
          'gte' | 'head' | 'id' | 'is' | 'last' | 'lt' | 'lte' | 'map' | 'merge' |
          'mult' | 'neg' | 'noop' | 'one' | 'or' | 'path' | 'plus' | 'pow' | 'put' |
          'q' | 'repeat' |'split' | 'start' | 'tail' | 'to' | 'trace' | 'type' | 'zero'

The following language axioms have variables:

variable range

o1, o2, …​

objs

t1, t2, …​

types

v1, v2, …​

values

[a], [b], …​

instructions

{q1}, {q2}, …​

quantifiers

id(t1)                    := t1<=t1[id]
domain(t1)                := t1
range(t1)                 := t1
domain(t2<=t1[a])         := t1
range(t2<=t1[a])          := t2

The quantifiers of an mmlang program are elements of a ring with unity called the quantifier ring. In the equations to follow, * and + refer to the respective monoid and group operations the quantifier ring.

  • The obj magma juxtapositions, where instructions [a], [b], etc. act as functions on values and as free monoid concatenation on types. The particulars objs that are yielded juxtaposing an obj to the left of a type are forthcoming.

v1{q1}                v2{q2}               := v2{q1*q2}
v1{q1}                t2{q2}<=t1[a][b]     := b(a(v1)){q1*q2}
t2{q1}<=t1[a][b]      t3{q2}<=t2[c][d]     := t3{q1*q2}<=t1[a][b][c][d]
t2{q1}<=t1[a][b]      v1{q2}               := v1{q1*q2}
  • The inst instructions denote functions that operate on objs. The particular of each instruction is forthcoming. The branch instructions denote flow control operations that mirror the type ringoid where [compose] is *, [branch] is +, and [choose] is |.

v1{q1}[compose,t2{q2}<=t1[a][b],t3{q3}<=t2[c][d]]]     := t3{q1*q2*q3}<=t1[a][b][c][d](v1)
v1{q1}[branch,t2{q2}<=t1[a][b],t2{q3}<=t1[c][d]]       := t2{q1*q2}<=t1[a][b](v1),t2{q1*q2}<=t1[c][d](v1)
v1{q1}[branch,t2{q2}<=t1[a][b],t2{q3}<=t1[a][b]]       := t2{q1*(q2+q3)}<=t1[a][b][c][d](v1)
v1{q1}[choose,t2{q2}<=t1[a][b],t2{q3}<=t1[c][d]]       := t2{q1*q2}<=t1[a][b](v1) | t2{q1*q2}<=t1[c][d](v1)
v1{q1}[repeat,[obj{?}->{0},obj{?}->type,n]]            := ...
  • The polys are the free algebra of the branch operations encoded in mmlang itself. Thus, the inst branch rules above are mirrored in the poly compositions below.

(t2{q1}<=t1[a][b];t3{q2}<=t2[c][d])  := t3{q1*q2}[a][b][c][d]
(t2{q1}<=t1[a][b],t2{q2}<=t1[c][d])  := t2{q1+q2}<=t1[branch,t2{q1}<=t1[a][b],t2{q1}<=t1[c][d]]
(t2{q1}<=t1[a][b],t2{q2}<=t1[a][b])  := t2{q1+q2}<=t1[a][b]
(t2{q1}<=t1[a][b]|t2{q2}<=t1[c][d])  := t2{min(q1,q2),max(q1,q2)}<=t1[choose,t2{q1}<=t1[a][b],t2{q2}<=t1[c][d]]
  • The non-branching inst instructions are the functional primitives that are composed to create mm-ADT programs.

obj    type                     := operation or equivalence
--------------------------------------------------------
o1     bool<=obj[a,o2]          := o1.test(o2)
bool1  bool<=bool[and,bool2]    := bool1 & bool2
o1{q1} int<=obj[count]          := q1
o1     bool<=obj[eq,o2]         := o1 == o2
t1     str<=obj[explain]        ...
...
  • The type-oriented inst instructions specify the typing rules, where the _ symbol refers to the anonymous type. These equations are presented using the obj magma’s juxtaposition.

t1    _                     := t1
t1    _[a][b]               := b(a(t1))<=t1[a][b]
o1    [a,_]                 := true
o1    [a,o1]                := true
o1{0} [a,{0}]               := true
o1    [a,t1]                := t1(o1) != {0}

Instructions

The mm-ADT VM instruction set architecture is presented below, where the instructions are ordered by their classification and within each classification, they are ordered alphabetically.

Table 3. Instruction classes
Class Signature Description

Branch

y{*}<=x[branch]

Split objs across instructions.

Filter

x{?}<=x[filter]

Remove objs from an evaluation

Flatmap

y{*}<=x[flatmap]

Split objs across objects.

Initial

obj{0}<=x[initial]

Finalize objs in an evaluation

Map

y<=x[map]

Move objs between objects

Reduce

y<=x{*}[reduce]

Fold objs down to a single obj

SideEffect

x<=x[sideeffect]

Global state mutations

Terminal

x{*}<=obj{0}[terminal]

Generate objs for evaluation

Trace

y<=x[trace]

Traverser the obj graph (reflection)

Branch Instructions

The branch instruction support the splitting and merging of parallel, data flows, where each branch is a type and, depending on the particular branch instruction, some types, all types, or only those types given a predicate are evaluated.

[compose]
[branch]
[choose]

Filter Instructions

Map Instructions

Reduce Instructions

Trace Instructions


1. Applicable to quantifier rings with an total order over the carrier.