Introducing Functor¶
In programing, you often embed some domain specific language in a host language. This might be for working with databases, or working with configuration. And when you have such DSL, you will probably want to use functions from your host-language on values from your dsl. Some database libraries are quite bad in this regard, where you set your queries as literal strings:
session.execute("select text,priority from memos where priority>9;")
On the other hand, if you use library, such as sql-alchemy, the interoperabilty with the host-language is much more seamless:
session.query(Memos).filter(Memos.priority > 9)
When you look at the code, it is apparent that some translation needs to happen that would convert the python expression Memos.priority > 9
to the SQL equivalent.
Category theory!¶
This idea of translating objects from one domain to another is at the core of the mathematical subject of category theory.
Category in mathemathics is defined by three things:
- collection of objects
- collection of possible transformations between these objects, called morphisms
- if we can transform object X to Y, and we can transform object Y to Z, there needs to be transformation from A to C in our transformation collection
You could draw a category using upper-case letters for objects, and arrows to signify possible transformations.
Functor is then mapping between categories. On the image you can see a category of red objects, with morphisms $f$, $g$ and three identity morphisms (i.e. arrows that transform and object to itself) and a functor $F$ that lifts the transformations from red category, to green, where there are only two objects.
I am not entirely sure I could say, that you could think of i.e. predicates in python as one category, predicates in sql as other category, and that sql-alchemy would form a functor between them. But even if that wouldn't be entirely true, it gives nice intuition for what we would want to achieve with utilizing category-theory in programming.
Functor in Python¶
So what in practice is a functor?
In general, anything where you can define a sensible fmap(function, )
operation. The anything in question are mostly containers, but it might be a generic object providing aditional context/metadata, for value(s) it could encapsulate, or produce.
Sensible in this context means, that the interface obeys certain laws, but before we state them expricitly, I need to define two functions, id
and compose
:
def id(x):
return x
def compose(f,g):
return lambda x: f(g(x))
The idea goes like this:
- we have our generic type T, parametrized by type A, or T[A] in typings syntax
- we interpret T[A], that our T can in some way produce value(s) of type A
T is a functor, if
- if we can easily convert T[A] (a producer of A's) to T[B] (a producer of A's), by using the function
fmap(fn, t)
, if we have function that accepts A and returns B - identity function doesn't indroduce changes:
fmap(id, t) == t
- if we have functions f and g that are composable, it doesn't matter if we compose them first, or use fmap twice:
fmap(compose(f,g),t) == fmap(f,fmap(g,t))
The fmap(id, t) == t
and fmap(compose(f,g),t) == fmap(f,fmap(g,t))
are the functor laws. I try to explain why should we care about them later, I'd rather show few concrete examples now:
def fmapList(fn, lst):
return [fn(x) for x in lst]
fmapList(str,[1,2,3])
def fmapIterable(fn, iterable):
return (fn(x) for x in iterable)
class Tree:
def __init__(self,left=None,data=None,right=None):
self.left = left
self.right = right
self.data = data
def fmapTree(fn, tree):
if tree == None:
return None
else:
return Tree(
mapTree(fn,tree.left),
fn(tree.data),
mapTree(fn,tree.right)
)
Interestingly enough, there is a map
function in Python, and it works on iterables. Which unfortunately means that it allways returns an iterable. The interesting thing about functors is, that the map function preserves the type. If I map over list, I know it will be a list.
But there are more interesting things than containers that could be functors, for example functions:
def fmapFunction(function, fn):
def wrapper(*arr):
return fn(function(*arr))
return wrapper
Now we can convert function that generates random integers to function that generates strings.
import random
fmapFunction(random.randint, str)(1,10)
Lift for our functions¶
There is a recuring concept, where we use some bit of information about our datatype to convert functions that know nothig about it, to work with it.
We call this lifting.
For example, because I know that lists have map function, I could lift i.e. str
to work on lists of things.
def lift(fn):
def liftedfn(x):
return fmapList(fn,x)
return liftedfn
lifted_str = lift(str)
lifted_str([1,2,3])
Making this more usable¶
Interesting question is, how to make these more pleasan't to use. One approach might be defining fmap as @singledispatch, as we did with mconcat previously. Only thing that is a bit awkward is, that we can't have the function argument first.
from functools import singledispatch, partial
@singledispatch
def fmap(x, fn):
raise Error("Not implemented for" + a)
def lift(fn):
def liftedfn(x):
return fmap(x, fn)
return liftedfn
Now we could register a few classes that would work with this. I.e. all of the collections such as list:
@fmap.register([].__class__)
def _(lst, fn):
return [fn(x) for x in lst]
set:
@fmap.register(set().__class__)
def _(lst, fn):
return set(fn(x) for x in lst)
dictionary:
@fmap.register({}.__class__)
def _(d, fn):
return {x: fn(d[x]) for x in d}
Why should we care about lawfulness?¶
For functor, I have some intuition for its laws, because I had abstract algebra course at university and when I look at the laws my mins goes "This kida looks like definition of homomorphism, I remember those to be useful, I guess it makes sense then." I dont really remember the specifics, but because it is vaguely familliar, I am fine with it.
There is better reason to like lawful interfaces, though. Just by using the the laws of the interface, we can define new and usefull stuff.
Lets look at nesting of two different functors, where we would have T[U[A]]. It is easy to see, that for any a, T[U[a]] is a functor as well.
If map for T is tmap, and map for U is umap, we can define the tumap for the nesting of U in T like this:
def tumap(f,tu):
return tmap(lambda x: umap(f,x), tu)
We could write this as:
$$ \begin{align*} \text{map}_{t_u}(f, t_u) &= {map}_{t} ( \lambda x: map_u(f,x), t_u) \end{align*} $$And because we know the functor laws we can try to prove that they hold for this implementation as well.
- for any
tu
,tumap(id, tu) == tu
- for any t and any two composable functions f and g, tumap(f,tumap(g,tu)) == map(compose(f,g),tu)
Because we have laws for the interface, we can do these kinds of self-contained proofs.
Now we can do nested fmap¶
Because we know, that functor of a functor is a functor, we can attempt a nested fmap.
def fmapNested(xs, fn):
def recur(x):
if x.__class__ in fmap.registry.keys():
return fmap(x,recur)
else:
return fn(x)
return fmap(xs,recur)
def liftNested(fn):
return lambda x: fmapNested(x,fn)
Yay, now we can map over nested data-structures!
fmapNested([[1,2],{'a':[3,4]}],str)
Effect tracking¶
Now, the interesting thing you can use functors for, is to track various side-effects and their meta-data with them. This is what i.e. Haskell does. The fmap then gives you the ability to say "I don't care about anything, just let me operate on the inner value(s)". These values might not be there, there might be error instead, there might be many, you might need to wait for a http request to complete, but with fmap, you can choose to ignore it, and just fmap over the values.
class ErrorOrValue:
def __init__(self,error,value):
self._error = error
self._value = value
def map(self,fn):
if self._error:
return self
else:
return ErrorOrValue(None,fn(self._value))
def __repr__(self):
if self._error:
return "Error:" + self._error
else:
return "Value:" + repr(self._value)
@fmap.register(ErrorOrValue)
def _(x,fn):
return x.map(fn)
With this, I can implement a silly little square-root function:
- that would return the both results of square root :-)
- that returns an error if there is no result
import math
def sqrt(x):
if x>=0:
return ErrorOrValue(None,[math.sqrt(x), - math.sqrt(x)])
else:
return ErrorOrValue("Can't sqrt "+str(x),None)
print(sqrt(16))
print(sqrt(-16))
On one hand, suddenly I can't just do sqrt(sqrt(sqrt(256)))
sqrt(sqrt(sqrt(256)))
But I have my nested lift function, that can fix this:
lsqrt = liftNested(sqrt)
lsqrt(lsqrt(lsqrt([256])))
In practice, you wouldn't probably use this sort of nesting. But it hints on interesting building blocks that are useful. And for the most part, this is what functors are. A basic building block, that more interesting things are based upon. For example, right now, we can only lift single-param functions. We should be able to solve this somehow.