.. _design:

8. Design Notes
===============
This section describes some design decisions in the ThingFlow API
that are or were under discussion.

Closed Issues
-------------
These issues have already been decided, and any recommended changes implemented
in the ThingFlow API. The text for each issue still uses the future tense,
but we provide the outcome of the decision at the end of each section.

Basic Terminology
~~~~~~~~~~~~~~~~~
The terminology has evolved twice. The first change was from the original
*observer* and *observable* terms used by Microsoft's RxPy to *subscribers*
and *publishers*. Our underlying communication model is really an internal
publish/subscribe between the "things". This was the terminology used in our
AntEvents framework. We still found that a bit confusing and changed to the
current terminology of *input things* and *output things*. Rather than topics,
we have ports, which are connected rather than subscribed to. We think this
better reflects the dataflow programming style.

**Outcome**: Changed
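
To make the "ports are connected, not subscribed to" idea concrete, here is a
minimal sketch. The ``Splitter`` class, its ``connect(port, receiver)``
signature, and the port names ``high`` and ``low`` are hypothetical and chosen
for illustration only; they are not ThingFlow's actual API.

```python
from collections import defaultdict


class Splitter:
    """Hypothetical output thing with two named output ports."""

    def __init__(self, threshold):
        self.threshold = threshold
        # Maps a port name to the callables connected to that port.
        self._ports = defaultdict(list)

    def connect(self, port, receiver):
        # A receiver is wired to one specific port, not to a topic.
        self._ports[port].append(receiver)

    def process(self, value):
        # Route each value to exactly one of the two output ports.
        port = "high" if value >= self.threshold else "low"
        for receiver in self._ports[port]:
            receiver(value)


high, low = [], []
splitter = Splitter(threshold=10)
splitter.connect("high", high.append)
splitter.connect("low", low.append)
for value in [3, 12, 7, 10]:
    splitter.process(value)
```

Each downstream receiver only sees the events sent to the port it was
connected to, which is what lets a single thing feed several branches of a
dataflow.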

Output Things, Sensors, and the Scheduler
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Today, sensors are just a special kind of output thing. Depending on whether
a sensor is intended to be blocking or non-blocking, it implements ``_observe`` or
``observe_and_enqueue``. The reasoning behind this was to make it impossible to
schedule a blocking sensor on the main thread. Perhaps this is not so important.
If we relaxed this restriction, we could move the dispatch logic to the
scheduler or the base ``OutputThing`` class.

This change would also allow a single output thing implementation to be used with
most sensors. We could then build a separate common interface for sensors,
perhaps modeled after the Adafruit Unified Sensor Driver
(https://github.com/adafruit/Adafruit_Sensor).

**Outcome**: Changed

We created the *sensor* abstraction and the ``SensorAsOutputThing`` wrapper class to
adapt any sensor to the output thing API. We left the original output thing API,
as there are still cases (e.g. adapters) that do not fit into the sensor
sampling model.
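
The split between a sensor and a wrapper can be sketched roughly as follows.
This is an illustration under assumptions, not the ThingFlow implementation:
``CounterSensor``, ``SensorWrapper``, the ``sample`` and ``observe`` method
names, and the ``(sensor_id, value)`` event shape are all hypothetical
stand-ins.

```python
class CounterSensor:
    """Hypothetical sensor: an id plus a sample() method, nothing else."""

    def __init__(self, sensor_id, start=0):
        self.sensor_id = sensor_id
        self._value = start

    def sample(self):
        # A real sensor would read hardware here.
        self._value += 1
        return self._value


class SensorWrapper:
    """Illustrative stand-in for a wrapper such as SensorAsOutputThing.

    The wrapper owns the connection list and the dispatch logic, so the
    sensor itself never needs to know about downstream things.
    """

    def __init__(self, sensor):
        self.sensor = sensor
        self._connections = []

    def connect(self, on_next):
        self._connections.append(on_next)

    def observe(self):
        # Assumed to be called by a scheduler once per sampling interval.
        event = (self.sensor.sensor_id, self.sensor.sample())
        for on_next in self._connections:
            on_next(event)


received = []
wrapper = SensorWrapper(CounterSensor("counter-1"))
wrapper.connect(received.append)
for _ in range(3):
    wrapper.observe()
```

With this shape, a new sensor only has to provide ``sample``; one wrapper
implementation can serve most sensors.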

Open Issues
-----------
At the end of each issue, there is a line that indicates the current bias for
a decision, either **Keep as is** or **Change**.

Disconnecting
~~~~~~~~~~~~~
In the current system, the ``OutputThing.connect`` method returns a "disconnect"
thunk that can be used to undo the connection. This is modeled after the
``subscribe`` method in Microsoft's Rx framework. Does this unnecessarily
complicate the design? Will real dataflows use this to change their structure
dynamically? If we eventually implement some kind of de-virtualization, it
would be difficult to support disconnecting. Also, it might be more convenient
for ``connect`` to return either the connected object or the output thing, to
allow for method chaining like we do for filters (or is that going to be too
confusing?).

As an argument for keeping the disconnect functionality, we may want to change
scheduled output things so that, if they have no connections, they are
unscheduled (or we could make it an option). That would make it easy to stop a
sensor after a certain number of calls by disconnecting from it.

Bias: **Keep as is**
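
The disconnect thunk, and the idea of stopping a source once nothing is
connected to it, can be sketched as below. The ``OutputThing`` class here is a
simplified stand-in written for this example, not ThingFlow's class, and
``has_connections``, ``dispatch_next``, and ``StopAfter`` are hypothetical
names. The polling loop at the end stands in for a scheduler.

```python
class OutputThing:
    """Minimal stand-in for an output thing (not ThingFlow's class)."""

    def __init__(self):
        self._connections = []

    def connect(self, input_thing):
        self._connections.append(input_thing)

        def disconnect():
            # Undo this one connection; safe to call more than once.
            if input_thing in self._connections:
                self._connections.remove(input_thing)

        return disconnect

    def has_connections(self):
        return bool(self._connections)

    def dispatch_next(self, event):
        # Iterate over a copy so a receiver may disconnect during dispatch.
        for thing in list(self._connections):
            thing.on_next(event)


class StopAfter:
    """Hypothetical input thing that disconnects itself after `limit` events."""

    def __init__(self, limit):
        self.limit = limit
        self.events = []
        self.disconnect = None  # filled in once the connection exists

    def on_next(self, event):
        self.events.append(event)
        if len(self.events) >= self.limit and self.disconnect is not None:
            self.disconnect()


source = OutputThing()
sink = StopAfter(limit=2)
sink.disconnect = source.connect(sink)

# Stand-in for a scheduler that stops sampling a source with no connections.
ticks = 0
while source.has_connections() and ticks < 100:
    source.dispatch_next(ticks)
    ticks += 1
```

If ``connect`` returned the connected object instead, the thunk would have to
be exposed some other way, which is part of the trade-off discussed above.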

Terminology: Reader/Writer vs. Source/Sink
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We introduced the *reader* and *writer* terms to refer to output things that
introduce event streams into the system and input things that consume event
streams with no output, respectively.

A thing that accepts messages from an external source is an *output thing*
in our system and a thing that emits messages to an external destination is an
*input thing*. That is really confusing!

Reader/writer is better, but it might still be confusing that a reader is
injecting messages into a ThingFlow dataflow. Perhaps the terms *source*
and *sink* would be more obvious. Is it worth the change?

Bias: **Keep as is**

The ``on_error`` Callback
~~~~~~~~~~~~~~~~~~~~~~~~~
Borrowing from Microsoft's Rx framework, ThingFlow has three callbacks on each
subscriber: ``on_next``, ``on_completed``, and ``on_error``. The ``on_error`` callback
is kind of strange: since it is defined to be called *at most once*, it is
really only useful for fatal errors. A potentially intermittent sensor error
would have to be propagated in-band (or via another topic in ThingFlow).
In that case, what is the value of an ``on_error`` callback over just throwing a
fatal exception? ThingFlow does provide a ``FatalError`` exception class. Relying
just on the ``on_error`` callbacks makes it too easy to accidentally swallow a fatal
error.
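
The difference between the at-most-once ``on_error`` callback and an in-band
error report can be illustrated with a small sketch. Everything here is an
assumption made for the example: ``Recorder``, ``Stream``, ``read_sensor``,
and the ``{"ok": ..., "value": ...}`` event shape are hypothetical, and a
reading of ``None`` is used to simulate an intermittent failure.

```python
class Recorder:
    """Hypothetical input thing that records every callback it receives."""

    def __init__(self):
        self.calls = []

    def on_next(self, event):
        self.calls.append(("next", event))

    def on_completed(self):
        self.calls.append(("completed",))

    def on_error(self, exc):
        self.calls.append(("error", str(exc)))


class Stream:
    """Stand-in output thing that enforces the at-most-once error callback."""

    def __init__(self, downstream):
        self.downstream = downstream
        self.closed = False

    def next(self, event):
        if not self.closed:
            self.downstream.on_next(event)

    def error(self, exc):
        # Only the first error is delivered; the stream is closed afterwards.
        if not self.closed:
            self.closed = True
            self.downstream.on_error(exc)


def read_sensor(readings, stream):
    """Forward readings, reporting intermittent failures in-band."""
    for reading in readings:
        if reading is None:
            # Transient failure: an ordinary event, so the stream stays open.
            stream.next({"ok": False, "value": None})
        else:
            stream.next({"ok": True, "value": reading})


recorder = Recorder()
stream = Stream(recorder)
read_sensor([21.5, None, 22.0], stream)
stream.error(RuntimeError("sensor unplugged"))
stream.error(RuntimeError("ignored"))      # terminal callback already fired
stream.next({"ok": True, "value": 23.0})   # dropped: the stream is closed
```

The transient failure travels as a normal event and the stream keeps going,
while the first call to ``error`` ends the stream for good.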

There are two reasons I can think of for ``on_error``:

1. Provide downstream components a chance to release resources. However, if we
   are going to stop operation due to a fatal error, we would probably just want to
   call it for all active things in the system (e.g. an unrelated thing may
   need to save some internal state). We could let the system keep running, but
   that may lead to a zombie situation. It is probably better to fail fast and
   let some higher level component resolve the issue (e.g. via a process restart).
2. If a sensor fails, we may want to just keep running and provide
   best guess data going forward in place of that sensor. The ``on_error``
   callback gives us the opportunity to do that without impacting the downstream
   things. However, I am not sure how likely this use case is compared to the
   case where we have an intermittent error (e.g. a connection to a sensor node
   is lost, but we will keep retrying the connection).
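
The second use case, substituting best guess data after a sensor fails, might
look something like the following sketch. ``BestGuess`` and its ``tick``
method are hypothetical, and "repeat the last good reading" is just one
possible guessing strategy picked for illustration.

```python
class BestGuess:
    """Hypothetical filter that hides a failed sensor from downstream things.

    Readings pass straight through. Once the upstream sensor reports a
    fatal error, the filter absorbs it and keeps emitting the last good
    reading each time tick() is called.
    """

    def __init__(self, emit, default=None):
        self.emit = emit            # callable that receives downstream events
        self.last_good = default
        self.failed = False

    def on_next(self, value):
        self.last_good = value
        self.emit(value)

    def on_error(self, exc):
        # Absorb the error instead of forwarding it downstream.
        self.failed = True

    def tick(self):
        # Assumed to be called by a scheduler once per sampling interval.
        if self.failed:
            self.emit(self.last_good)


out = []
guess = BestGuess(out.append)
guess.on_next(20.0)
guess.on_next(20.5)
guess.on_error(RuntimeError("sensor lost"))
guess.tick()
guess.tick()
```

Downstream things never see the error. That is the benefit, and also the risk
of silently swallowing a fatal error mentioned earlier.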

In general, error handling needs more experience and thought.

Bias: **Change, but not sure what to**

Related Work
------------
The architecture was heavily influenced by Microsoft's Rx_ (Reactive Extensions)
framework and the Click_ modular router. We started by trying to simplify Rx for
the IoT case and remove some of the .NETisms. A key addition was the support for
multiple ports, which makes more complex dataflows possible.

.. _Rx: https://msdn.microsoft.com/en-us/data/gg577609.aspx
.. _Click: http://read.cs.ucla.edu/click/click