We are developing a data processing system centred around data samples and various ways one can transform them. The system is supposed to satisfy the following properties:
To satisfy these properties, we have come up with a system heavily inspired by functional programming. We represent samples as functors; consequently, their mapping method (`map` in the following implementation) is responsible for applying any transformation. Apart from that, `map` also calls a special generic function, `write_sample`, that is responsible for writing the result of any transformation. Users can add overloads via the `add_writer` decorator:
```python
from typing import Callable, Generic, TypeVar

from oslash import Functor
from ourlib.io import add_writer

A = TypeVar('A')
B = TypeVar('B')


class Sample(Functor, Generic[A]):

    def __init__(self, sid: int, value: A):
        self._id = sid
        self._value = value

    @property
    def sid(self) -> int:
        return self._id

    def map(self, fn: Callable[[A], B]) -> 'Sample[B]':
        result = fn(self._value)
        output = type(self)(self.sid, result)
        write_sample(output)
        return output


@add_writer(object)
def write_sample(sample: Sample) -> None:
    # this is a polymorphic function that dispatches on sample content type;
    # users can add overloads for each content type;
    # the default implementation does nothing
    pass
```
`write_sample` does nothing if there is no strictly compatible overload for a particular `Sample.value`. So far, this system has allowed us to satisfy the requirements without introducing too many interface requirements via abstract method specifications. There is a hiccup, though: we lose any control over possible exceptions in `write_sample`, which might crash the server-side process. I believe we can put `write_sample` in a try/except block, catching all possible exceptions (i.e. `except Exception: ...`), but this practice is usually frowned upon. Alternatively, we can ask users to only supply `write_sample` instances that never raise errors, but I'm not sure this is a better thing to do.
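To make the first option concrete, here is a minimal sketch of a broad `except Exception` confined to a single boundary function that logs the failure instead of discarding it. The names `safe_write`, `working_writer` and `broken_writer` are hypothetical and not part of our library. Whether a failed write should be swallowed at all is an assumption that depends on the requirements.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("sample_writer")


def safe_write(writer: Callable[[Any], None], sample: Any) -> bool:
    """Call writer(sample); log and swallow any Exception.

    Returns True if the writer succeeded, False otherwise.
    """
    try:
        writer(sample)
    except Exception:
        # Broad on purpose: this is the boundary between user-supplied code
        # and the server process. KeyboardInterrupt and SystemExit do not
        # derive from Exception, so they still propagate.
        logger.exception("writer %r failed for sample %r", writer, sample)
        return False
    return True


def working_writer(sample: Any) -> None:
    pass  # pretend the sample was written


def broken_writer(sample: Any) -> None:
    raise ValueError("simulated write failure")


ok = safe_write(working_writer, 42)
failed = safe_write(broken_writer, 42)
```

Returning a status flag lets `map` decide what to do after a failed write, for example continue or mark the sample, without the exception reaching the server loop.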
Ideally, any functor type should satisfy the following property: `sample.map(f).map(g) == sample.map(compose(f, g))`, where `compose(f, g)` applies `f` first and then `g`. This property is technically satisfied, but the side effects might be different: in the second case `write_sample` is only called once. This is a theoretical remark rather than an actual issue.
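The difference in side effects can be shown with a stripped-down stand-in for `Sample`. This sketch drops the `oslash` dependency and replaces `write_sample` with an append to a list. `MiniSample`, `write_log`, `inc` and `double` are hypothetical names, and `compose` is assumed to apply its first argument first.

```python
from typing import Callable, Generic, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")

write_log: List[object] = []  # records every value "written" by map


class MiniSample(Generic[A]):
    def __init__(self, sid: int, value: A):
        self.sid = sid
        self.value = value

    def map(self, fn: Callable[[A], B]) -> "MiniSample[B]":
        output = MiniSample(self.sid, fn(self.value))
        write_log.append(output.value)  # stand-in for write_sample(output)
        return output


def compose(f, g):
    """Left-to-right composition: apply f first, then g."""
    return lambda x: g(f(x))


def inc(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return x * 2


sample = MiniSample(1, 10)

chained = sample.map(inc).map(double)     # two map calls, two writes
writes_chained = len(write_log)

write_log.clear()
fused = sample.map(compose(inc, double))  # one map call, one write
writes_fused = len(write_log)
```

Both paths produce the same value, but the chained version also writes the intermediate result, so the law holds only up to side effects.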