Python: using functors to abstract IO

by Eli Korvigo   Last Updated August 10, 2018 18:05 PM

We are developing a data processing system centred around data samples and various ways one can transform them. The system is supposed to satisfy the following properties:

  1. It must be highly modular: a third-party user/developer should be able to add new and/or replace existing transformations;
  2. A third-party user/developer should be able to write any intermediate transformation into some sort of a database;
  3. A sample is a generic container, that has an ID invariant under any transformation;
  4. It must have a server-client API, that is the server side process (responsible for all the calculations) is supposed to run as a daemon.

To satisfy these properties, we have come up with a system heavily inspired by functional programming. We represent samples as functors, consequently their fmap (map in the following implementation) is responsible for applying any transformation. Apart from that, map also calls a special generic function write_sample that is responsible for writing any transformation. Users can add overloads via the add_writer function.

from typing import TypeVar, Generic
from oslash import Functor

from ourlib.io import add_writer


A = TypeVar('A')
B = TypeVar('B')


class Sample(Functor, Generic[A]):

    def __init__(self, sid: int, value: A):
        self._id = sid
        self._value = value

    @property
    def sid(self) -> int:
        return self._id

    def map(self, fn: Callable[[A], B]) -> 'Sample[B]':
        result = fn(self._value)
        output = type(self)(self.sid, result)
        write_sample(output)
        return output


@add_writer(object)
def write_sample(sample: Sample) -> None:
    # this is a polymorphic function that dispatches on sample content type;
    # users can add overloads for each content type;
    # the default implementation does nothing
    pass

By default write_sample does nothing if there is no strictly compatible overload for a particular Sample.value. So far this system has allowed us to satisfy the requirements without introducing too many interface requirements via abstract method specifications. There is a hiccup, though: we loose any control over possible exceptions in write_sample. This might crash the server-side process. I believe, we can put write_sample in a try/except block, catching all possible exceptions (i.e. except Exception: ...), but this practice is usually frowned upon. Alternatively, we can ask users to only supply write_sample instances that never raise errors, but I'm not sure this is a better thing to do.

P.S.

Ideally, any functor type should satisfy the following property sample.map(f).map(g) == sample.map(compose(f, g)). This property is technically satisfied, but the side-effects might be different: in the second case write_sample is only called once. This is theoretical remark, rather than an actual issue.



Related Questions



Python methods vs builtin functions

Updated June 22, 2017 14:05 PM


Python 3.6: Text Eraser Script

Updated May 24, 2018 09:05 AM