Introducing Functor

In programing, you often embed some domain specific language in a host language. This might be for working with databases, or working with configuration. And when you have such DSL, you will probably want to use functions from your host-language on values from your dsl. Some database libraries are quite bad in this regard, where you set your queries as literal strings:

session.execute("select text,priority from memos where priority>9;")

On the other hand, if you use library, such as sql-alchemy, the interoperabilty with the host-language is much more seamless:

session.query(Memos).filter(Memos.priority > 9)

When you look at the code, it is apparent that some translation needs to happen that would convert the python expression Memos.priority > 9 to the SQL equivalent.

Category theory!

This idea of translating objects from one domain to another is at the core of the mathematical subject of category theory.

Category in mathemathics is defined by three things:

  • collection of objects
  • collection of possible transformations between these objects, called morphisms
  • if we can transform object X to Y, and we can transform object Y to Z, there needs to be transformation from A to C in our transformation collection

You could draw a category using upper-case letters for objects, and arrows to signify possible transformations.

Category

Functor is then mapping between categories. On the image you can see a category of red objects, with morphisms $f$, $g$ and three identity morphisms (i.e. arrows that transform and object to itself) and a functor $F$ that lifts the transformations from red category, to green, where there are only two objects.

Functor

I am not entirely sure I could say, that you could think of i.e. predicates in python as one category, predicates in sql as other category, and that sql-alchemy would form a functor between them. But even if that wouldn't be entirely true, it gives nice intuition for what we would want to achieve with utilizing category-theory in programming.

Functor in Python

So what in practice is a functor?

In general, anything where you can define a sensible fmap(function, ) operation. The anything in question are mostly containers, but it might be a generic object providing aditional context/metadata, for value(s) it could encapsulate, or produce.

Sensible in this context means, that the interface obeys certain laws, but before we state them expricitly, I need to define two functions, id and compose:

In [3]:
def id(x):
    return x

def compose(f,g):
    return lambda x: f(g(x))

The idea goes like this:

  • we have our generic type T, parametrized by type A, or T[A] in typings syntax
  • we interpret T[A], that our T can in some way produce value(s) of type A

T is a functor, if

  • if we can easily convert T[A] (a producer of A's) to T[B] (a producer of A's), by using the function fmap(fn, t), if we have function that accepts A and returns B
  • identity function doesn't indroduce changes:
    • fmap(id, t) == t
  • if we have functions f and g that are composable, it doesn't matter if we compose them first, or use fmap twice:
    • fmap(compose(f,g),t) == fmap(f,fmap(g,t))

The fmap(id, t) == t and fmap(compose(f,g),t) == fmap(f,fmap(g,t)) are the functor laws. I try to explain why should we care about them later, I'd rather show few concrete examples now:

In [11]:
def fmapList(fn, lst):
    return [fn(x) for x in lst]

fmapList(str,[1,2,3])
Out[11]:
['1', '2', '3']
In [12]:
def fmapIterable(fn, iterable):
    return (fn(x) for x in iterable)
In [13]:
class Tree:
    def __init__(self,left=None,data=None,right=None):
        self.left = left
        self.right = right
        self.data = data  
    
def fmapTree(fn, tree):
    if tree == None:
        return None
    else:
        return Tree(
            mapTree(fn,tree.left),
            fn(tree.data),
            mapTree(fn,tree.right)
        )

Interestingly enough, there is a map function in Python, and it works on iterables. Which unfortunately means that it allways returns an iterable. The interesting thing about functors is, that the map function preserves the type. If I map over list, I know it will be a list.

But there are more interesting things than containers that could be functors, for example functions:

In [14]:
def fmapFunction(function, fn):
    def wrapper(*arr):
        return fn(function(*arr))
    return wrapper

Now we can convert function that generates random integers to function that generates strings.

In [15]:
import random
fmapFunction(random.randint, str)(1,10)
Out[15]:
'6'

Lift for our functions

There is a recuring concept, where we use some bit of information about our datatype to convert functions that know nothig about it, to work with it.

We call this lifting.

For example, because I know that lists have map function, I could lift i.e. str to work on lists of things.

In [16]:
def lift(fn):
    def liftedfn(x):
        return fmapList(fn,x)
    return liftedfn 
In [17]:
lifted_str = lift(str)
lifted_str([1,2,3])
Out[17]:
['1', '2', '3']

Making this more usable

Interesting question is, how to make these more pleasan't to use. One approach might be defining fmap as @singledispatch, as we did with mconcat previously. Only thing that is a bit awkward is, that we can't have the function argument first.

In [26]:
from functools import singledispatch, partial

@singledispatch
def fmap(x, fn):
    raise Error("Not implemented for" + a)
    
def lift(fn):
    def liftedfn(x):
        return fmap(x, fn)
    return liftedfn 

Now we could register a few classes that would work with this. I.e. all of the collections such as list:

In [70]:
@fmap.register([].__class__)
def _(lst, fn):
    return [fn(x) for x in lst]

set:

In [74]:
@fmap.register(set().__class__)
def _(lst, fn):
    return set(fn(x) for x in lst)

dictionary:

In [75]:
@fmap.register({}.__class__)
def _(d, fn):
    return {x: fn(d[x]) for x in d}

Why should we care about lawfulness?

For functor, I have some intuition for its laws, because I had abstract algebra course at university and when I look at the laws my mins goes "This kida looks like definition of homomorphism, I remember those to be useful, I guess it makes sense then." I dont really remember the specifics, but because it is vaguely familliar, I am fine with it.

There is better reason to like lawful interfaces, though. Just by using the the laws of the interface, we can define new and usefull stuff.

Lets look at nesting of two different functors, where we would have T[U[A]]. It is easy to see, that for any a, T[U[a]] is a functor as well.

If map for T is tmap, and map for U is umap, we can define the tumap for the nesting of U in T like this:

def tumap(f,tu):
  return tmap(lambda x: umap(f,x), tu)

We could write this as:

$$ \begin{align*} \text{map}_{t_u}(f, t_u) &= {map}_{t} ( \lambda x: map_u(f,x), t_u) \end{align*} $$

And because we know the functor laws we can try to prove that they hold for this implementation as well.

  • for any tu, tumap(id, tu) == tu
$$ \begin{align*} \text{map}_{t_u}(id, t_u) &= {map}_{t} ( \lambda x: map_u(id,x), t_u) \\ &= {map}_{t} ( \lambda x: x, t_u) \\ &= {map}_{t} ( id, t_u) \\ &= t_u \\ \end{align*} $$
  • for any t and any two composable functions f and g, tumap(f,tumap(g,tu)) == map(compose(f,g),tu)
$$ \begin{align*} \text{map}_{t_u}(f, {map}_{tu}(g,t_u)) &= \text{map}_{t_u}(f, {map}_{tu}(g,t_u)) \\ &= {map}_{t} ( \lambda x: map_u(f,x), {map}_{t}( \lambda x: map_u(g,x),t_u)) \\ &= {map}_{t}((\lambda x: map_u(f,x)) \circ (\lambda x: map_u(g,x)), t_u) \\ &= {map}_{t}(\lambda x: map_u(f,map_u(g,x)), t_u) \\ &= {map}_{t} ( \lambda x: map_u(compose(f,g),x), t_u) \\ &= \text{map}_{t_u}(compose(f,g), t_u) \end{align*} $$

Because we have laws for the interface, we can do these kinds of self-contained proofs.

Now we can do nested fmap

Because we know, that functor of a functor is a functor, we can attempt a nested fmap.

In [101]:
def fmapNested(xs, fn):
    def recur(x):
        if x.__class__ in fmap.registry.keys():
            return fmap(x,recur)
        else:
            return fn(x)
    return fmap(xs,recur)

def liftNested(fn):
    return lambda x: fmapNested(x,fn)

Yay, now we can map over nested data-structures!

In [86]:
fmapNested([[1,2],{'a':[3,4]}],str)
Out[86]:
[['1', '2'], {'a': ['3', '4']}]

Effect tracking

Now, the interesting thing you can use functors for, is to track various side-effects and their meta-data with them. This is what i.e. Haskell does. The fmap then gives you the ability to say "I don't care about anything, just let me operate on the inner value(s)". These values might not be there, there might be error instead, there might be many, you might need to wait for a http request to complete, but with fmap, you can choose to ignore it, and just fmap over the values.

In [88]:
class ErrorOrValue:
    def __init__(self,error,value):
        self._error = error
        self._value = value
        
    def map(self,fn):
        if self._error:
            return self
        else:
            return ErrorOrValue(None,fn(self._value))
        
    def __repr__(self):
        if self._error:
            return "Error:" + self._error
        else:
            return "Value:" + repr(self._value)
        
@fmap.register(ErrorOrValue)
def _(x,fn):
    return x.map(fn)

With this, I can implement a silly little square-root function:

  • that would return the both results of square root :-)
  • that returns an error if there is no result
In [100]:
import math
def sqrt(x):
    if x>=0:
        return ErrorOrValue(None,[math.sqrt(x), - math.sqrt(x)])
    else:
        return ErrorOrValue("Can't sqrt "+str(x),None)

print(sqrt(16))
print(sqrt(-16))
Value:[4.0, -4.0]
Error:Can't sqrt -16

On one hand, suddenly I can't just do sqrt(sqrt(sqrt(256)))

In [105]:
sqrt(sqrt(sqrt(256)))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-105-21479426438d> in <module>()
----> 1 sqrt(sqrt(sqrt(256)))

<ipython-input-100-b7a52edf617b> in sqrt(x)
      1 import math
      2 def sqrt(x):
----> 3     if x>=0:
      4         return ErrorOrValue(None,[math.sqrt(x), - math.sqrt(x)])
      5     else:

TypeError: unorderable types: ErrorOrValue() >= int()

But I have my nested lift function, that can fix this:

In [109]:
lsqrt = liftNested(sqrt)
lsqrt(lsqrt(lsqrt([256])))
Out[109]:
[Value:[Value:[Value:[2.0, -2.0], Error:Can't sqrt -4.0], Error:Can't sqrt -16.0]]

In practice, you wouldn't probably use this sort of nesting. But it hints on interesting building blocks that are useful. And for the most part, this is what functors are. A basic building block, that more interesting things are based upon. For example, right now, we can only lift single-param functions. We should be able to solve this somehow.