Type System

Implements a simple, dynamic type system for API generation.

author: Anthony Scopatz <scopatz@gmail.com>

Introduction

This module provides a suite of tools for denoting, describing, and converting between various data types and the types coming from various systems. This is achieved by providing canonical abstractions of various kinds of types:

  • Base types (int, str, float, non-templated classes)
  • Refined types (even or odd ints, strings containing the letter ‘a’)
  • Dependent types (templates such as arrays, maps, sets, vectors)

All types are known by their name (a string identifier) and may be aliased with other names. However, the string id of a type is not sufficient to fully describe most types. The system here implements a canonical form for all kinds of types. This canonical form is itself hashable, being comprised only of strings, ints, and tuples.

Canonical Forms

First, let us examine the base types and the forms that they may take. Base types are fiducial. The type system itself may not make any changes (refinements, template filling) to types of this kind. They are basically a collection of bits. (The job of ascribing meaning to these bits falls on someone else.) Thus base types may be referred to simply by their string identifier. For example:

'str'
'int32'
'float64'
'MyClass'

Aliases to these – or any – type names are given in the type_aliases dictionary:

type_aliases = {
    'i': 'int32',
    'i4': 'int32',
    'int': 'int32',
    'complex': 'complex128',
    'b': 'bool',
    }
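As a minimal sketch of how such an alias table might be consulted (the helper name resolve_alias is hypothetical and is not part of xdress), a lookup can follow alias links until it reaches a name that is not itself an alias:

```python
# Alias table copied from the example above.
type_aliases = {
    'i': 'int32',
    'i4': 'int32',
    'int': 'int32',
    'complex': 'complex128',
    'b': 'bool',
}


def resolve_alias(name, aliases=type_aliases):
    """Follow alias links until a non-alias name is reached.

    The `seen` set guards against accidental alias cycles.
    """
    seen = set()
    while name in aliases and name not in seen:
        seen.add(name)
        name = aliases[name]
    return name


print(resolve_alias('i4'))       # int32
print(resolve_alias('MyClass'))  # MyClass (not an alias, returned unchanged)
```

Names that are not aliases pass through untouched, so the helper is safe to apply to any type name.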

Furthermore, length-2 tuples are used to denote a type and the name or flag of its predicate. A predicate is a function or transformation that may be applied to verify, validate, cast, coerce, or extend a variable of the given type. A common usage is to declare a pointer or reference of the underlying type. This is done with the string flags ‘*’ and ‘&’:

('char', '*')
('float64', '&')

If the predicate is a positive integer, then this is interpreted as a homogeneous array of the underlying type with the given length. If this length is zero, then the tuple is often interpreted as a scalar of this type, equivalent to the type itself. The length-0 scalar interpretation depends on context. Here are some examples of array types:

('char', 42)  # length-42 character array
('bool', 1)   # length-1 boolean array
('f8', 0)     # scalar 64-bit float

Note

Length-1 tuples are converted to length-2 tuples with a 0 predicate, i.e. ('char',) will become ('char', 0).
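The predicate rules above can be illustrated with a small sketch. Both function names are hypothetical, and the English descriptions are only one possible reading; the real interpretation of a length-0 predicate depends on context, as noted above.

```python
def normalize_predicate(t):
    """Return a length-2 (name, predicate) tuple; a missing predicate becomes 0."""
    if len(t) == 1:
        return (t[0], 0)
    name, pred = t
    return (name, pred)


def describe(t):
    """Give a rough English reading of a (name, predicate) tuple."""
    name, pred = normalize_predicate(t)
    if pred == '*':
        return 'pointer to ' + name
    if pred == '&':
        return 'reference to ' + name
    if isinstance(pred, int) and pred > 0:
        return 'length-{0} array of {1}'.format(pred, name)
    if pred == 0:
        return 'scalar ' + name
    # Any other predicate (e.g. a refinement name) is reported as-is.
    return '{0} with predicate {1!r}'.format(name, pred)


print(describe(('char', 42)))     # length-42 array of char
print(describe(('float64', '&'))) # reference to float64
print(describe(('f8',)))          # scalar f8
```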

The next kind of type is the refinement type, or refined type. A refined type is a sub-type of another type that restricts in some way what constitutes a valid element. For example, if we first take all integers, the set of all positive integers is a refinement of the original. Similarly, starting with all possible strings, the set of all strings starting with ‘A’ is a refinement.

In the system here, refined types are given their own unique names (e.g. ‘posint’ and ‘astr’). The type system has a mapping (refined_types) from all refinement type names to the names of their super-type. The user may refer to refinement types simply by their string name. However, the canonical form of a refinement type uses the refinement as the predicate of the super-type in a length-2 tuple, as above:

('int32', 'posint')  # refinement of integers to positive ints
('str', 'astr')      # refinement of strings to str starting with 'A'

It is these refinement types that give the second index in the tuple its ‘predicate’ name. Additionally, the predicate is used to look up the converter and validation functions when doing code generation or type verification.
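A minimal sketch of this lookup follows, assuming a refined_types mapping that holds the two example refinements from the text. The function expand_refinement is hypothetical and only handles plain string names.

```python
# Hypothetical entries matching the 'posint' and 'astr' examples above.
refined_types = {
    'posint': 'int32',
    'astr': 'str',
}


def expand_refinement(name, refined=refined_types):
    """Turn a refinement name into (super-type, refinement).

    Names that are not refinements are returned unchanged.
    """
    if name in refined:
        return (refined[name], name)
    return name


print(expand_refinement('posint'))  # ('int32', 'posint')
print(expand_refinement('int32'))   # int32
```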

The last kind of types are known as dependent types or template types, similar in concept to C++ template classes. These are meta-types whose instantiation requires one or more parameters to be filled in by further values or types. Dependent types may nest with themselves or other dependent types. Fully qualifying a template type requires the resolution of all dependencies.

Classic examples of dependent types include the C++ template classes. These take other types as their dependencies. Other cases may require only values as their dependencies. For example, suppose we want to restrict integers to various ranges. Rather than creating a refinement type for every combination of integer bounds, we can use a single ‘intrange’ type that defines high and low dependencies.

The template_types mapping takes the dependent type names (e.g. ‘map’) to a tuple of their dependency names (‘key’, ‘value’). The refined_types mapping also accepts keys that are tuples of the following form:

('<type name>', '<dep0-name>', ('dep1-name', 'dep1-type'), ...)

Note that template names may be reused as types of other template parameters:

('name', 'dep0-name', ('dep1-name', 'dep0-name'))

As we have seen, dependent types may either be base types (when based off of template classes), refined types, or both. Their canonical form thus follows the rules above with some additional syntax. The first element of the tuple is still the type name and the last element is still the predicate (default 0). However the type tuples now have a length equal to 2 plus the number of dependencies. These dependencies are placed between the name and the predicate: ('<name>', <dep0>, ..., <predicate>). These dependencies, of course, may be other type names or tuples! Let’s see some examples.

In the simplest case, take analogies to C++ template classes:

('set', 'complex128', 0)
('map', 'int32', 'float64', 0)
('map', ('int32', 'posint'), 'float64', 0)
('map', ('int32', 'posint'), ('set', 'complex128', 0), 0)
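The length rule (2 plus the number of dependencies) can be sketched as follows. The helper fill_template is hypothetical, uses a cut-down template_types table, and does not recurse into the dependencies themselves.

```python
# Cut-down template table: name -> tuple of dependency names.
template_types = {
    'set': ('value_type',),
    'map': ('key_type', 'value_type'),
}


def fill_template(t, templates=template_types):
    """Append the default predicate 0 when a template tuple lacks one."""
    name, args = t[0], tuple(t[1:])
    ndeps = len(templates[name])
    if len(args) == ndeps:
        # Only the dependencies were given; add the default predicate.
        return (name,) + args + (0,)
    if len(args) == ndeps + 1:
        # Dependencies plus an explicit predicate; already full length.
        return (name,) + args
    raise ValueError('wrong number of dependencies for {0!r}'.format(name))


print(fill_template(('set', 'complex128')))
# ('set', 'complex128', 0)
print(fill_template(('map', ('int32', 'posint'), 'float64')))
# ('map', ('int32', 'posint'), 'float64', 0)
```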

Now consider the intrange type from above. This has the following definition and canonical form:

refined_types = {('intrange', ('low', 'int32'), ('high', 'int32')): 'int32'}

# range from 1 -> 2
('int32', ('intrange', ('low', 'int32', 1), ('high', 'int32', 2)))

# range from -42 -> 42
('int32', ('intrange', ('low', 'int32', -42), ('high', 'int32', 42)))

Note that the low and high dependencies here are length three tuples of the form ('<dep-name>', <dep-type>, <dep-value>). How the dependency values end up being used is solely at the discretion of the implementation. These values may be anything, though they are most useful when they are easily convertible into strings in the target language.

Warning

Do not confuse length-3 dependency tuples with length-3 type tuples! The last element here is a value, not a predicate.
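To make the distinction concrete, here is a rough sketch of how the length-3 dependency tuples for ‘intrange’ might be assembled from a shorthand such as ('intrange', 1, 2). The function expand_dependent is hypothetical and handles only dependencies declared as ('<dep-name>', '<dep-type>') pairs.

```python
refined_types = {
    ('intrange', ('low', 'int32'), ('high', 'int32')): 'int32',
}


def expand_dependent(t, refined=refined_types):
    """Pair each supplied value with its declared dependency name and type."""
    name, values = t[0], t[1:]
    for key, supertype in refined.items():
        if isinstance(key, tuple) and key[0] == name:
            deps = key[1:]
            if len(deps) != len(values):
                raise ValueError('expected {0} values'.format(len(deps)))
            # Each entry is (dep-name, dep-type, dep-value): the last
            # element is a value, not a predicate.
            filled = tuple((dname, dtype, value)
                           for (dname, dtype), value in zip(deps, values))
            return (supertype, (name,) + filled)
    raise KeyError(name)


print(expand_dependent(('intrange', 1, 2)))
# ('int32', ('intrange', ('low', 'int32', 1), ('high', 'int32', 2)))
```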

Next, consider a ‘range’ type which behaves similarly to ‘intrange’ except that it also accepts the type as a dependency. This has the following definition and canonical form:

refined_types = {('range', 'vtype', ('low', 'vtype'), ('high', 'vtype')): 'vtype'}

# integer range from 1 -> 2
('int32', ('range', 'int32', ('low', 'int32', 1), ('high', 'int32', 2)))

# positive integer range from 42 -> 65
(('int32', 'posint'), ('range', ('int32', 'posint'),
                                ('low', ('int32', 'posint'), 42),
                                ('high', ('int32', 'posint'), 65)))

Shorthand Forms

The canonical forms for types contain all the information needed to fully describe different kinds of types. However, as human-facing code, they can be exceedingly verbose. Therefore there are a number of shorthand techniques that may also be used to denote the various types. Converting from these shorthands to the fully expanded version may be done via the canon(t) function. This function takes a single type and returns the canonical form of that type. The following are operations that canon() performs:

  • Base types are returned as their name:

    canon('str') == 'str'
    
  • Aliases are resolved:

    canon('f4') == 'float32'
    
  • Expands length-1 tuples to scalar predicates:

    t = ('int32',)
    canon(t) == ('int32', 0)
    
  • Determines the super-type of refinements:

    canon('posint') == ('int32', 'posint')
    
  • Applies templates:

    t = ('set', 'float')
    canon(t) == ('set', 'float64', 0)
    
  • Applies dependencies:

    t = ('intrange', 1, 2)
    canon(t) == ('int32', ('intrange', ('low', 'int32', 1), ('high', 'int32', 2)))
    
    t = ('range', 'int32', 1, 2)
    canon(t) == ('int32', ('range', 'int32', ('low', 'int32', 1), ('high', 'int32', 2)))
    
  • Performs all of the above recursively:

    t = (('map', 'posint', ('set', ('intrange', 1, 2))),)
    canon(t) == (('map',
                 ('int32', 'posint'),
                 ('set', ('int32',
                    ('intrange', ('low', 'int32', 1), ('high', 'int32', 2))), 0)), 0)
    

These shorthands are thus far more useful and intuitive than the canonical form described above. It is therefore recommended that users and developers write code that uses the shorter versions. Note that canon() is guaranteed to return strings, tuples, and integers only, which makes the output of this function hashable.
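Hashability matters because canonical forms can then serve as dictionary keys or set members, for instance in a memoization table. The sketch below illustrates this with a hand-written canonical form; the checker only_hashable_parts and the memo dictionary are hypothetical and are not xdress APIs.

```python
def only_hashable_parts(t):
    """True if t is built solely from strings, ints, and nested tuples."""
    if isinstance(t, (str, int)):
        return True
    if isinstance(t, tuple):
        return all(only_hashable_parts(x) for x in t)
    return False


canonical = ('map', ('int32', 'posint'), ('set', 'complex128', 0), 0)

# A canonical form can key a cache directly.
memo = {canonical: 'cached result'}

# An equal tuple built elsewhere finds the same entry.
same = ('map', ('int32', 'posint'), ('set', 'complex128', 0), 0)
print(memo[same])                      # cached result
print(only_hashable_parts(canonical))  # True
```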

Built-in Template Types

Template type definitions that come stock with xdress:

template_types = {
    'map': ('key_type', 'value_type'),
    'dict': ('key_type', 'value_type'),
    'pair': ('key_type', 'value_type'),
    'set': ('value_type',),
    'list': ('value_type',),
    'tuple': ('value_type',),
    'vector': ('value_type',),
    }

Built-in Refined Types

Refined type definitions that come stock with xdress:

refined_types = {
    'nucid': 'int32',
    'nucname': 'str',
    ('enum', ('name', 'str'), ('aliases', ('dict', 'str', 'int32', 0))): 'int32',
    ('function', ('arguments', ('list', ('pair', 'str', 'type'))), ('returns', 'type')): 'void',
    ('function_pointer', ('arguments', ('list', ('pair', 'str', 'type'))), ('returns', 'type')): ('void', '*'),
    }

Major Classes Overview

Holistically, the following classes are important to the type system:

  • TypeSystem: This is the type system.
  • TypeMatcher: An immutable type for matching types against a pattern.
  • MatchAny: A singleton used to denote patterns.
  • typestr: Various string representations of a type as properties.

Type System API

class xdress.types.system.TypeSystem(base_types=None, template_types=None, refined_types=None, humannames=None, extra_types='xdress_extra_types', dtypes='dtypes', stlcontainers='stlcontainers', argument_kinds=None, variable_namespace=None, type_aliases=None, cpp_types=None, numpy_types=None, from_pytypes=None, cython_ctypes=None, cython_cytypes=None, cython_pytypes=None, cython_cimports=None, cython_cyimports=None, cython_pyimports=None, cython_functionnames=None, cython_classnames=None, cython_c2py_conv=None, cython_py2c_conv=None, typestring=None)[source]

A class representing a type system.

Parameters:

base_types : set of str, optional

The base or primitive types in the type system.

template_types : dict, optional

Template types are types whose instantiations are based on meta-types. This dict maps their names to meta-type names in order.

refined_types : dict, optional

This is a mapping from refinement type names to the parent types. The parent types may either be base types, compound types, template types, or other refined types!

humannames : dict, optional

The human readable names for types.

extra_types : str, optional

The name of the xdress extra types module.

dtypes : str, optional

The name of the xdress numpy dtypes wrapper module.

stlcontainers : str, optional

The name of the xdress C++ standard library containers wrapper module.

argument_kinds : dict, optional

Template types have arguments. This is a mapping from type name to a tuple of utils.Arg kind flags. The order in the tuple matches the value in template_types. This is only valid for concrete types, i.e. (‘vector’, ‘int’, 0) and not just ‘vector’.

variable_namespace : dict, optional

Template arguments may be variables. These variables may live in a namespace which is required for specifying the type. This is a dictionary mapping variable names to their namespace.

type_aliases : dict, optional

Aliases that may be used to substitute one type name for another.

cpp_types : dict, optional

The C/C++ representation of the types.

numpy_types : dict, optional

NumPy’s Cython representation of the types.

from_pytypes : dict, optional

Python types which may be converted to these types.

cython_ctypes : dict, optional

Cython’s C/C++ representation of the types.

cython_cytypes : dict, optional

Cython’s Cython representation of the types.

cython_pytypes : dict, optional

Cython’s Python representation of the types.

cython_cimports : dict, optional

A sequence of tuples representing cimports that are needed for Cython to represent C/C++ types.

cython_cyimports : dict, optional

A sequence of tuples representing cimports that are needed for Cython to represent Cython types.

cython_pyimports : dict, optional

A sequence of tuples representing imports that are needed for Cython to represent Python types.

cython_functionnames : dict, optional

Cython alternate name fragments used for mangling function and variable names. These should try to adhere to a lowercase_and_underscore convention. These may contain template argument names as part of a format string, i.e. {'map': 'map_{key_type}_{value_type}'}.

cython_classnames : dict, optional

Cython alternate name fragments used for mangling class names. These should try to adhere to a CapCase convention. These may contain template argument names as part of a format string, i.e. {'map': 'Map{key_type}{value_type}'}.
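The format-string fragments can be pictured with standard str.format substitution. This is only a sketch: the mangle helper is hypothetical, and the way each template argument is itself rendered (here, the strings 'Int', 'Double', 'int', and 'double') is an assumption.

```python
# Name fragments in the form described above.
cython_classnames = {'map': 'Map{key_type}{value_type}'}
cython_functionnames = {'map': 'map_{key_type}_{value_type}'}

# Template argument names, in order.
template_types = {'map': ('key_type', 'value_type')}


def mangle(t, fragments, argnames=template_types):
    """Fill a name fragment with already-rendered template arguments."""
    name, args = t[0], t[1:]
    substitutions = dict(zip(argnames[name], args))
    return fragments[name].format(**substitutions)


print(mangle(('map', 'Int', 'Double'), cython_classnames))     # MapIntDouble
print(mangle(('map', 'int', 'double'), cython_functionnames))  # map_int_double
```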

cython_c2py_conv : dict, optional

Cython converters from C/C++ types to the representative Python types.

cython_py2c_conv : dict, optional

Cython converters from Python types to the representative C/C++ types. Values are tuples with the form of (body or return, return or False).

typestring : typestr or None, optional

A type that is used to format types to strings in conversion routines.

basename(t)[source]

Retrieves basename from a type, e.g. ‘map’ in (‘map’, ‘int’, ‘float’).

canon(t)[source]

Turns the type into its canonical form. See module docs for more information.

clearmemo()[source]

Clears all method memoizations on this type system instance.

cpp_funcname(name, argkinds=None)[source]

This returns a name for a function based on its name, rather than its type. The name may be either a string or a tuple of the form (‘name’, template_arg1, template_arg2, ...). The argkinds argument here refers only to the template arguments, not the function signature default arguments. This is not meant to replace cpp_type(), but complement it.

cpp_literal(lit)[source]

Converts a literal value to its C++ form.

cpp_type(t)[source]

Given a type t, returns the corresponding C++ type declaration.

cython_c2py(name, t, view=True, cached=True, inst_name=None, proxy_name=None, cache_name=None, cache_prefix='self', existing_name=None)[source]

Given a variable name and type, returns cython code (declaration, body, and return statements) to convert the variable from C/C++ to Python.

cython_c2py_getitem(t)[source]

Helps find the appropriate c2py value for a given concrete type key.

cython_cimport_lines(x, inc=frozenset(['c', 'cy']))[source]

Returns the cimport lines associated with a type or a set of seen tuples.

cython_cimport_tuples(t, seen=None, inc=frozenset(['c', 'cy']))[source]

Given a type t, and possibly previously seen cimport tuples (set), return the set of all seen cimport tuples. These tuples have four possible interpretations based on the length and values:

  • (module-name,) becomes cimport {module-name}
  • (module-name, var-or-mod) becomes from {module-name} cimport {var-or-mod}
  • (module-name, var-or-mod, alias) becomes from {module-name} cimport {var-or-mod} as {alias}
  • (module-name, 'as', alias) becomes cimport {module-name} as {alias}

cython_classname(t, cycyt=None)[source]

Computes classnames for cython types.

cython_ctype(t)[source]

Given a type t, returns the corresponding Cython C/C++ type declaration.

cython_cytype(t)[source]

Given a type t, returns the corresponding Cython type.

cython_funcname(name, argkinds=None)[source]

This returns a name for a function based on its name, rather than its type. The name may be either a string or a tuple of the form (‘name’, template_arg1, template_arg2, ...). The argkinds argument here refers only to the template arguments, not the function signature default arguments. This is not meant to replace cython_functionname(), but complement it.

cython_functionname(t, cycyt=None)[source]

Computes variable or function names for cython types.

cython_import_lines(x)[source]

Returns the import lines associated with a type or a set of seen tuples.

cython_import_tuples(t, seen=None)[source]

Given a type t, and possibly previously seen import tuples (set), return the set of all seen import tuples. These tuples have four possible interpretations based on the length and values:

  • (module-name,) becomes import {module-name}
  • (module-name, var-or-mod) becomes from {module-name} import {var-or-mod}
  • (module-name, var-or-mod, alias) becomes from {module-name} import {var-or-mod} as {alias}
  • (module-name, 'as', alias) becomes import {module-name} as {alias}

Any of these may be used.
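The four interpretations can be written out as a small function. This is a sketch rather than the xdress implementation: import_line is a hypothetical name, and the keyword parameter is an added convenience so the same logic also produces the cimport variants listed for cython_cimport_tuples().

```python
def import_line(tup, keyword='import'):
    """Render one import tuple as a line of source code."""
    if len(tup) == 1:
        # (module-name,)
        return '{0} {1}'.format(keyword, tup[0])
    if len(tup) == 2:
        # (module-name, var-or-mod)
        return 'from {0} {1} {2}'.format(tup[0], keyword, tup[1])
    if len(tup) == 3 and tup[1] == 'as':
        # (module-name, 'as', alias)
        return '{0} {1} as {2}'.format(keyword, tup[0], tup[2])
    if len(tup) == 3:
        # (module-name, var-or-mod, alias)
        return 'from {0} {1} {2} as {3}'.format(tup[0], keyword, tup[1], tup[2])
    raise ValueError('unrecognized import tuple: {0!r}'.format(tup))


print(import_line(('numpy', 'as', 'np')))   # import numpy as np
print(import_line(('os', 'path', 'p')))     # from os import path as p
print(import_line(('libc.stdlib', 'free'), keyword='cimport'))
# from libc.stdlib cimport free
```

The 'as' check must come before the general length-3 case; because 'as' is a reserved word in Python and Cython, it cannot be mistaken for a variable or module name.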

cython_literal(lit)[source]

Converts a literal to a Cython compatible form.

cython_nptype(t, depth=0)[source]

Given a type t, returns the corresponding numpy type. If depth is greater than 0 then this returns a list of numpy types for all internal template types, i.e. the float in (‘vector’, ‘float’, 0).

cython_py2c(name, t, inst_name=None, proxy_name=None)[source]

Given a variable name and type, returns cython code (declaration, body, and return statement) to convert the variable from Python to C/C++.

cython_pytype(t)[source]

Given a type t, returns the corresponding Python type.

cython_variablename(t, cycyt=None)

Computes variable or function names for cython types.

delmemo(meth, *args, **kwargs)[source]

Deletes a single key from a method on this type system instance.

deregister_argument_kinds(t)[source]

Removes a type and its argument kind tuple from the type system.

deregister_class(name)[source]

This function will remove a previously registered class from the type system.

deregister_refinement(name)[source]

This function will remove a previously registered refinement from the type system.

deregister_specialization(t)[source]

This function will remove a previously registered template specialization.

dump(filename, format=None, mode='wb')[source]

Saves a type system out to disk.

Parameters:

filename : str

Path to file.

format : str, optional

The file format to save the type system as. If this is not provided, it is inferred from the filename. Options are:

  • pickle (‘.pkl’)
  • gzipped pickle (‘.pkl.gz’)

mode : str, optional

The mode to open the file with.
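As an illustration of inferring the format from the filename, the sketch below saves and restores an arbitrary picklable object using only the standard library. The helper names and the internal format labels 'pkl' and 'pkl.gz' are assumptions made for this example; they are not taken from xdress.

```python
import gzip
import os
import pickle
import tempfile


def infer_format(filename):
    """Guess the format from the file extension (labels are hypothetical)."""
    if filename.endswith('.pkl.gz'):
        return 'pkl.gz'
    if filename.endswith('.pkl'):
        return 'pkl'
    raise ValueError('cannot infer format from ' + filename)


def dump_obj(obj, filename, format=None, mode='wb'):
    """Pickle obj to filename, gzipping when the format calls for it."""
    format = infer_format(filename) if format is None else format
    opener = gzip.open if format == 'pkl.gz' else open
    with opener(filename, mode) as f:
        pickle.dump(obj, f)


def load_obj(filename, format=None, mode='rb'):
    """Inverse of dump_obj."""
    format = infer_format(filename) if format is None else format
    opener = gzip.open if format == 'pkl.gz' else open
    with opener(filename, mode) as f:
        return pickle.load(f)


# Round trip through a temporary directory.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'example.pkl.gz')
dump_obj({'type_aliases': {'i': 'int32'}}, path)
restored = load_obj(path)
```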

classmethod empty()[source]

This is a class method which returns an empty type system.

gccxml_type(t)[source]

Given a type t, returns the corresponding GCC-XML type name.

humanname(t, hnt=None)[source]

Computes human names for types.

isdependent(t)[source]

Returns whether t is a dependent type or not.

isrefinement(t)[source]

Returns whether t is a refined type.

istemplate(t)[source]

Returns whether t is a template type or not.

classmethod load(filename, format=None, mode='rb')[source]

Loads a type system from disk into a new type system instance. This is a class method.

Parameters:

filename : str

Path to file.

format : str, optional

The file format to load the type system from. If this is not provided, it is inferred from the filename. Options are:

  • pickle (‘.pkl’)
  • gzipped pickle (‘.pkl.gz’)

mode : str, optional

The mode to open the file with.

local_classes(*args, **kwds)[source]

A context manager for making sure the given classes are local.

register_argument_kinds(t, argkinds)[source]

Registers an argument kind tuple into the type system for a template type.

register_class(name=None, template_args=None, cython_c_type=None, cython_cimport=None, cython_cy_type=None, cython_py_type=None, cython_template_class_name=None, cython_template_function_name=None, cython_cyimport=None, cython_pyimport=None, cython_c2py=None, cython_py2c=None, cpp_type=None, human_name=None, from_pytype=None)[source]

Classes are user specified types. This function will add a class to the type system so that it may be used normally with the rest of the type system.

register_classname(classname, package, pxd_base, cpppxd_base, cpp_classname=None, make_dtypes=True)[source]

Registers a class with the type system from only its name, and relevant header file information.

Parameters:

classname : str or tuple

package : str

Package name where headers live.

pxd_base : str

Base name of the pxd file to cimport.

cpppxd_base : str

Base name of the cpppxd file to cimport.

cpp_classname : str or tuple, optional

Name of class in C++, equiv. to apiname.srcname. Defaults to classname.

make_dtypes : bool, optional

Flag for registering dtypes for this class simultaneously with registering the class itself.

register_numpy_dtype(t, cython_cimport=None, cython_cyimport=None, cython_pyimport=None)[source]

This function will add a type to the system as a numpy dtype that lives in the dtypes module.

register_refinement(name, refinementof, cython_cimport=None, cython_cyimport=None, cython_pyimport=None, cython_c2py=None, cython_py2c=None)[source]

This function will add a refinement to the type system so that it may be used normally with the rest of the type system.

register_specialization(t, cython_c_type=None, cython_cy_type=None, cython_py_type=None, cython_cimport=None, cython_cyimport=None, cython_pyimport=None)[source]

This function will add a template specialization so that it may be used normally with the rest of the type system.

register_variable_namespace(name, namespace, t=None)[source]

Registers a variable and its namespace in the type system.

strip_predicates(t)[source]

Removes all outer predicates from a type.

swap_dtypes(*args, **kwds)[source]

A context manager for temporarily swapping out the dtypes value with a new value and replacing the original value before exiting.

swap_stlcontainers(*args, **kwds)[source]

A context manager for temporarily swapping out the stlcontainers value with a new value and replacing the original value before exiting.
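The swap-and-restore pattern behind these context managers can be sketched with contextlib. Both the Settings class and the swap_attr function are hypothetical stand-ins; the actual xdress methods may differ in signature and behavior.

```python
from contextlib import contextmanager


class Settings(object):
    """Hypothetical stand-in for an object with a `dtypes` attribute."""

    def __init__(self):
        self.dtypes = 'dtypes'


@contextmanager
def swap_attr(obj, name, value):
    """Temporarily set obj.<name> to value, restoring the old value on exit."""
    old = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Runs even if the body raises, so the original value always returns.
        setattr(obj, name, old)


settings = Settings()
with swap_attr(settings, 'dtypes', 'my_dtypes'):
    inside = settings.dtypes   # 'my_dtypes' while the block is active
after = settings.dtypes        # back to 'dtypes' afterwards
```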

update(*args, **kwargs)[source]

Updates the type system in-place. Only updates the data attributes named in ‘datafields’. This may be called with any of the following signatures:

ts.update(<TypeSystem>)
ts.update(<dict-like>)
ts.update(key1=value1, key2=value2, ...)

Valid keyword arguments are the same here as for the type system constructor. See this documentation for more detail.

class xdress.types.system.typestr(t, ts)[source]

This is a class whose attributes are properties that expose various string representations of a type. This is useful for the Python string formatting mini-language where attributes of an object may be accessed. For example:

"This is the Cython C/C++ type: {t.cython_ctype}".format(t=typestr(t, ts))

This mechanism is used for accessing type information in conversion strings.
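The attribute access used in that format string works with any object that exposes properties. The following sketch uses a hypothetical TypeView class and a made-up lookup table (mapping 'int32' to 'int') in place of typestr and a real TypeSystem.

```python
class TypeView(object):
    """Hypothetical stand-in for typestr: string forms exposed as properties."""

    def __init__(self, t, ctypes):
        self.t = t
        self._ctypes = ctypes  # assumed table: type name -> C/C++ spelling

    @property
    def cython_ctype(self):
        return self._ctypes[self.t]

    @property
    def type(self):
        # A repr string of the raw type, handy for comments.
        return repr(self.t)


view = TypeView('int32', {'int32': 'int'})
msg = "This is the Cython C/C++ type: {t.cython_ctype}".format(t=view)
print(msg)  # This is the Cython C/C++ type: int
```

Because each property is computed on access, only the representations actually named in a format string are evaluated.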

Parameters:

t : str or tuple

A valid representation of a type in the type system.

ts : TypeSystem

A type system to generate the string representations with.

cython_ctype

The Cython C/C++ representation of the type.

cython_ctype_nopred

The Cython C/C++ representation of the type without predicates.

cython_cytype

The Cython Cython representation of the type.

cython_cytype_nopred

The Cython Cython representation of the type without predicates.

cython_npctype

The Cython C/C++ representation of the NumPy type.

cython_npctype_nopred

The Cython C/C++ representation of the NumPy type without predicates.

cython_npctypes

The expanded Cython C/C++ representation of the NumPy types.

cython_npctypes_nopred

The Cython C/C++ representation of the NumPy types without predicates.

cython_npcytype

The Cython Cython representation of the NumPy type.

cython_npcytype_nopred

The Cython Cython representation of the NumPy type without predicates.

cython_npcytypes

The expanded Cython Cython representation of the NumPy types.

cython_npcytypes_nopred

The Cython Cython representation of the NumPy types without predicates.

cython_nppytype

The Cython Python representation of the NumPy type.

cython_nppytype_nopred

The Cython Python representation of the NumPy type without predicates.

cython_nppytypes

The expanded Cython Python representation of the NumPy types.

cython_nppytypes_nopred

The Cython Python representation of the NumPy types without predicates.

cython_nptype

The Cython NumPy representation of the type.

cython_nptype_nopred

The Cython NumPy representation of the type without predicates.

cython_nptypes

The expanded Cython NumPy representation of the type.

cython_nptypes_nopred

The Cython NumPy representation of the types without predicates.

cython_pytype

The Cython Python representation of the type.

cython_pytype_nopred

The Cython Python representation of the type without predicates.

type

This is a repr string of the raw type (self.t), mostly useful for comments.

type_nopred

This is a repr string of the raw type (self.t) without predicates.