Tuesday, April 24, 2018

Functional Python Programming 2e -- Type Hints!

You might want to look into this: Functional Python Programming - Second Edition.

Let's talk about the type hints, shall we?

Most of the examples have had type hints added. This means running everything through mypy. And it also means running everything through doctest, as well.

More important than the technical steps, there's a change in viewpoint that comes with type hints.

If you follow a variety of Pythonistas on Twitter, you can see some debates on the merits of type-hinting. Some key points:

  • It's hard.
  • It's so hard, only do it if you absolutely need it.
  • It's too verbose
  • It's hard, but it can help.
  • It's really helpful.
  • It represents a "gap" in the language and without run-time type checking, the whole thing is worthless.
The last point a weird view. I work in a shop that's heavily Pythonic. But. You still hear nonsense. Python a very popular language and it's popularity is growing. The popularity of Python isn't like the popularity of a movie where you're not planning on making a living off of it (I know someone who makes their living off the popularity of movies.) The popularity of Python is like the popularity of automobiles or air travel or electricity.

I hear the "a real language would have prevented that with type-checking." And I respond, "Then why do you unit test?" And they don't really have much of an answer. Python has the same workflow as statically type-checked languages, so the "prevention" thing seems to be nonsense.

Moving on.

"It's hard." Anything new is hard. The complaint is vague, so it's *hard* to respond. (Heh.)

Anything like "only do it if you absolutely need it" bothers me because it seems like a passive-aggressive barricade around things. Also. It's vague.

Verbosity

Verbosity in type hints is a real problem. When creating complex objects from built-in types, we often forget to give names to the intermediate object classes.

Consider Dict[Tuple[Tuple[int, int], Tuple[int, int]], float]

It's long. It describes a structure like this {((12, 13), (14, 15)): 2.8284271247461903, ...} 

Writing something like the following d_map() function without hints is easy. Adding hints seems hard.

def d_map(points):
    return {(p1, p2): hypot(p1[0]-p2[0], p1[1]-p2[1]) for p1, p2 in points)}

The declaration became L.. O... N... G... because we ignored the intermediate types.

def d_map(points: List[Tuple[Tuple[int, int], Tuple[int, int]]]) -> Dict[Tuple[Tuple[int, int], Tuple[int, int]], float]:
    return {(p1, p2): hypot(p1[0]-p2[0], p1[1]-p2[1]) for p1, p2 in points)}

These hints, however, doesn't really describe what's happening. The hints elide important details. The hints don't reflect the underlying semantics of the data structure.

One of Python's strengths is the rich collection of first-class data structures with built-in syntax. We can abbreviate some complex concepts into succinct, expressive code.

However.

We shouldn't lose sight of what the succinct code represents. And in this case, it represents some rather complex concepts.
<rant>
Let me sit in my lawn chair and shake my fist in helpless fury at you kids. When I was your age, we sent half a semester of undergraduate work trying to get linked lists, and simple hash mapping to work. Months of work. Later on, as a professional -- years of actual experience -- it took forever to build a binary tree-based collections.Counter definition to gather simple numbers from a flat file. Nowadays, you just slap a Counter down into your code like it's a nothing. It's not a nothing. It's serious, sophisticated software engineering. It's more than Dict[Any, int]. </rant>

What can we do?

When in doubt, Expose the Intermediate Types.

Point = Tuple[int, int]
Leg = Tuple[Point, Point]
Distances = Dict[Leg, float]
def d_map(points: Iterable[Leg]) -> Distances:
    return {(p1, p2): hypot(p1[0]-p2[0], p1[1]-p2[1]) for p1, p2 in points)}

This exposes the details. In some cases, it causes us to rethink using a two-tuple to represent a point. The p1[0] syntax starts to chafe a little. Perhaps this should have been

class Point(NamedTuple):
    x: int
    y: int

That leads to tiny (almost-but-not-quite trivial) simplifications. Instead of building simple tuples for each point, we can now build named Point tuples and use p1.x and p1.y to make the code more civilized.

One consequence of this is actually avoiding (), [], and {} to build tuples and lists. Yes. This is heresy. I seriously recommend using tuple(), list(), dict(), and set() because we can replace them with equivalent types. And yes, I text my mother with the same fingers that wrote that.

"But," you object, "It's objectively LONGER! You didn't save me anything! You're a fraud!"

My first response is, "Correct." It is objectively longer. And "Correct," I didn't really "save" you anything; I'm not sure what you're saving. Lines of code do have a cost, but I think clarity has value. And finally, "Correct," I've often been wrong, and I may be wrong here, too.

I like this because the type definitions are reusable, I think this can add clarity throughout the application.

When this kind of declaration is part of a reusable module, the goodness spreads like smiles and hugs throughout the application. Before long, other functions have been tweaked and everyone is sending each other little teddy-bear hug gifts with rainbow cupcakes.

(Please don't exchange mylar balloons. They're evil. Also, see this.)

tl;dr

When your type hints seem ungainly and large, consider Exposing the Intermediate Types. Break down a big structural type hint into the constituent pieces.

If you had to create a class definition for EVERY variation on list, dict, set, and tuple, what would your new class be named?

If you had to describe the underlying meaning of a class -- separate from it's structure -- what name would you give it?

Picking names is one of the two hardest problems in computing. It isn't easy. (The other hardest problem? Cache invalidation and off-by-one errors.)

Friday, April 6, 2018

Should I use x.__len__() or len(x)?

In the context of providing type hints, someone had a function like this.

def f(x: Sized) -> Whatever: ...

And, since sized objects have a __len__() method it seemed sensible to use x.__len__(). It was a good question about the use of special methods.

My advice is to avoid using the special methods in general. Use them only when defining classes that need to behave like Python objects.

(I'll make an exception for using x.__dict__, to avoid having to introduce an explicit dictionary object when there's one built-in to most objects.)

Use len(x) and be happy.  The function wrapper around a special method is a common Python feature; it occurs in many places; use it.

Tuesday, April 3, 2018

RESTful Web Services Design

This -- REST is the new SOAP -- has so many demolished strawman arguments that it feels like looking at a van Gogh painting of people harvesting wheat.

I won't dive into listing all the strawmen. Most of my responses are approximately "How is that an actual problem?" or "Yes, it was new to you, so?" or "Yes, people disagreed with each other over an implementation choice."

Some of the observations about "proper REST" vs. "bah, that's not really RESTful" point out the differences between expedient REST-like design and really good REST design. Some of these considerations can be helpful.

The one point worthy of deeper thought is the nature of verb-heavy highly-stateful RPC design and RESTful noun-heavy design. The question here is the definition of state and the nature of state change. Some people appear to be enthralled with many nuanced state changes. I've been doing too much data warehouse and functional design where the data is essentially stateless and CRUD rules are refined down to CRD with a rare U under limited circumstances.

And, yes, that means using relatively "stateless" OO design where an object is wrapped inside a new object that includes derived data or a compositions of stateless objects. The following example leverages duck typing to create immutable objects where the class reflects the state of the object.

class Thing:
    def __init__(self, a, b):
        self.a, self.b = a, b
    def set_c(self, c):
        return DerivedThing(self, c)

class DerivedThing:
    def __init__(self, thing: Thing, c):
        self.thing, self.c = thing, c
    @property
    def a(self):
        return self.thing.a
    @property
    def b(self):
        return self.thing.b
    @property
    def value(self):
        return self.a * self.c + self.b

And, yes, I'm not building things which are absolutely stateless because Python has stateful lists and mappings, and web services rely on stateful persistence. And, yes, I reject functional purism because I'm stupid. Can we move on, now?

Something that seemed essential to me (but appears to be confusing from reading complaints about REST) is understanding the notion of "state." One view of state is an aggregation of details. The final state of an object is a reduction over the changes -- akin to a sum(), max(), or min(), or perhaps something more involved like last(). The paucity of REST verbs is not a problem when you understand current state as the end product of applying a journal of previous state change mementos. Each "change", then, isn't a complex Update (REST Put or Patch) where there aren't enough verbs to describe each nuanced change. It's a Create (REST Post) of the next change memento. The RESTful service can eagerly apply the change to compute the current state. Or it can lazily apply the changes to compute the current state.

Some of the blog post cited above sounds like "it was new and I didn't like it." Therefore, read the article, locate the strawmen, and know there will always be someone who will complain. Some of the complaints will have merit, some will be whining about the novelty.

In a RESTful context, I'm a fan of this kind of pattern.

/things
    post:
        summary: Creates a new thing with a and b
    responses:
        201:
            description: thing was created
/things/{id}/c
    post:
        summary: Sets a value of c for an existing thing, previous value is discarded.
    responses:
        201:
            description: c property of thing {id} was set
       
For more useful advice, start here, for example: RESTful API Designing guidelines — The best practices. Articles like this are useful, too: 10 Best Practices for Better RESTful API.

Tuesday, March 27, 2018

Functional Python Programming 2e -- Now With Type Hints

Functional Python Programming, 2nd ed.

This has been fun to cleanup some rambling, reset the math to be sure it's actually right.

And.

Type Hints.

Almost every example has had type hints added.

(And I raised the pylint scores be rearranging some spacing and what-not.)

Bonus. We will be moving the publication date up from June to possibly April. We're still doing technical reviews and what-not, so things aren't *done*.

What was hardest?

Generics, specifically, decorators can have quite complex type hints. Indeed, type hinting raises important questions about trying to write super-generic functions that can handle too wide a spectrum of types.

def some_function(arg):
    if isinstance(arg, dict):
        do_something(arg)
    elif isinstance(arg, list):
        do_something({i: v for i, v in enumerate(arg)})
    else: 
        do_something(dict(arg=arg))


This kind of thing turns out to be ill-advised. It's probably a bad design. More importantly, it's difficult to annotate, making it difficult to discern if it behaves correctly.

In this case, the argument is Union[Dict, Sequence, Any]. I've got a few examples of Union types, but they're rare because I'm not a fan in the first place. And the few places I used them, the complexity of getting past mypy type checks showed that they add risk and cost without a dramatic reduction in complexity.

In this specific case, the some_function() function is merely a type-converting wrapper around the do_something() function. It's probably better to refactor the type conversion responsibility into the clients of some_function().

The arguments about "encapsulation" or "the client shouldn't know that detail" are generally kind of silly. We're all adults here, we generally have to know what's going on with respect to the conversions in order to use the function correctly and write unit tests.

Tuesday, March 20, 2018

HATEOAS is useless? Or not used enough?

See Why HATEOAS is useless and what that means for REST.

The article provides a background leading up to these observations:
  • There are very few good tools to create a REST API using this style
  • There are no clients widely used to consume these types of APIs
The "useless" in the title is more like "not used enough."

There's a multi-part conclusion that may be more helpful if it's fleshed out further. For now, however, it appears that the big problems center around:
  • You still need to write Open Api Specifications (OAS, f/k/a Swagger). I don't think this is bad. The blog post makes it sound like a problem. I think it's essential.
  • You need to put versioning somewhere. The path is less than idea. I'm big on the Accept header containing application-specific MIME types. For example, application/vnd.com.your-name-here.app.json+v1. This doesn't strike me as a problem, either.
  • The whole approach is "closer to RPC than some REST lovers like to admit." I think this point revolves around the way JSON-RPC or SOAP involves some overheads above basic HTTP that are unhelpful. I don't think the "closer to RPC" follows logically from the lack of tooling for HATEOS, but it certainly could be true that a badly-done API might involve too many of the wrong kinds of overheads.
I think there's a hidden strawman here. The "automatic discovery" idea. I don't think this idea makes a lick of sense. Some people think it's implied (or required) by REST, and any failure to provide for fully-automated semantically rich discovery of an API is some kind of failure.

I don't think full semantic discovery is possible or even desirable. 
  • It's not possible because of the problem of assigning names and meanings to resources and verbs in an end-point. The necessary details can only be exposed with a semantically complete ontology and complex SPARQL queries into the ontology to find resources and end-points. 
  • It's not desirable because we replace a human-focused OAS with a complete ontology that has to be rigorously defined, and tested to be sure that all kinds of automated discovery algorithms can understand the provided details. And none of this addresses the actual application, it's all rich, detailed meta-description of the application.
I don't see why we're trying to replace people. API discovery is actually kind of hard. The resources, their relationships, and the verbs for getting or updating those resources involves an essentially difficult knowledge capture and dissemination problem. 

Friday, March 9, 2018

Python Interviews

The #Python Interviews book is out. Mike Driscoll interviewed a bunch of Python experts. And me, too. get 30% off the Amazon paperback version of the book using the code 30PYTHONhttps://goo.gl/5A3uhq

Here's a flavor of how this went:

Driscoll: So how did you end up becoming an author of Python books?
Lott: Most roles in my career more or less just happened to me, but becoming a writer was a conscious decision.
In this case, I had decided that there could be value in teaching the Python language and the associated software engineering skills. I started to collect notes for a book in 2002. By 2010, I had tried self-publishing several books on Python.
When Stack Overflow started, I was an early participant. There were many interesting Python questions. The questions showed gaps where more information was needed about Python specifically and software engineering in general. Over a few years, I answered thousands of questions about Python and somehow built up a large reputation.

Monday, March 5, 2018

Python Interviews -- Coming Soon from Packt

See https://www.packtpub.com/web-development/python-interviews

I'm honored.

I'll be studying what the other folks have to say in here. Being in the Python community means respecting other's views. And that means understanding them.

This looks like fun because it isn't *deeply* technical, it's about people and technology.