To begin with, let me offer short answers to each of these questions. I will then expand each into a longer detailed answer, but these short ones will hopefully help in navigating those.
No, Alternative
and Monoid
don’t mean different things; Alternative
is for types which have the structure both of Applicative
and of Monoid
. “Picking” and “combining” are two different intuitions for the same broader concept.
Alternative
contains empty
as well as <|>
because the designers thought this would be useful, and because this gives rise to a monoid. In terms of picking, empty
corresponds to making an impossible choice.
We need both Alternative
and Monoid
because the former obeys (or should) more laws than the latter; these laws relate the monoidal and applicative structure of the type constructor. Additionally, Alternative
can’t depend on the inner type, while Monoid
can.
MonadPlus
is slightly stronger than Alternative
, as it must obey more laws; these laws relate the monoidal structure to the monadic structure in addition to the applicative structure. If you have instances of both, they should coincide.
Doesn’t Alternative
mean something totally different from Monoid
?
Not really! Part of the reason for your confusion is that the Haskell Monoid
class uses some pretty bad (well, insufficiently general) names. This is how a mathematician would define a monoid (being very explicit about it):
Definition. A monoid is a set M equipped with a distinguished element ε ∈ M and a binary operator · : M × M → M, denoted by juxtaposition, such that the following two conditions hold:
- ε is the identity: for all m ∈ M, mε = εm = m.
- · is associative: for all m?,m?,m? ∈ M, (m?m?)m? = m?(m?m?).
That’s it. In Haskell, ε is spelled mempty
and · is spelled mappend
(or, these days, <>
), and the set M is the type M
in instance Monoid M where ...
.
Looking at this definition, we see that it says nothing about “combining” (or about “picking,” for that matter). It says things about · and about ε, but that’s it. Now, it’s certainly true that combining things works well with this structure: ε corresponds to having no things, and m?m? says that if I glom m? and m?’s stuff together, I can get a new thing containing all their stuff. But here’s an alternative intuition: ε corresponds to no choices at all, and m?m? corresponds to a choice between m? and m?. This is the “picking” intuition. Note that both obey the monoid laws:
- Having nothing at all and having no choice are both the identity.
- If I have no stuff and glom it together with some stuff, I end up with that same stuff again.
- If I have a choice between no choice at all (something impossible) and some other choice, I have to pick the other (possible) choice.
- Glomming collections together and making a choice are both associative.
- If I have three collections of things, it doesn’t matter if I glom the first two together and then the third, or the last two together and then the first; either way, I end up with the same total glommed collection.
- If I have a choice between three things, it doesn’t matter if I (a) first choose between first-or-second and third and then, if I need to, between first and second, or (b) first choose between first and second-or-third and then, if I need to, between second and third. Either way, I can pick what I want.
(Note: I’m playing fast and loose here; that’s why it’s intuition. For instance, it’s important to remember that · need not be commutative, which the above glosses over: it’s perfectly possible that m?m? ≠ m?m?.)
Behold: both these sorts of things (and many others—is multiplying numbers really either “combining” or “picking”?) obey the same rules. Having an intuition is important to develop understanding, but it’s the rules and definitions that determine what’s actually going on.
And the best part is that these both of these intuitions can be interpreted by the same carrier! Let M be some set of sets (not a set of all sets!) containing the empty set, let ε be the empty set ?, and let · be set union ∪. It is easy to see that ? is an identity for ∪, and that ∪ is associative, so we can conclude that (M,?,∪) is a monoid. Now:
- If we think about sets as being collections of things, then ∪ corresponds to glomming them together to get more things—the “combining” intuition.
- If we think about sets as representing possible actions, then ∪ corresponds to increasing your pool of possible actions to pick from—the “picking” intuition.
And this is exactly what’s going on with []
in Haskell: [a]
is a Monoid
for all a
, and []
as an applicative functor (and monad) is used to represent nondeterminism. Both the combining and the picking intuitions coincide at the same type: mempty = empty = []
and mappend = (<|>) = (++)
.
So the Alternative
class is just there to represent objects which (a) are applicative functors, and (b) when instantiated at a type, have a value and a binary function on them which follow some rules. Which rules? The monoid rules. Why? Because it turns out to be useful :-)
Why does Alternative
need an empty method/member?
Well, the snarky answer is “because Alternative
represents a monoid structure.” But the real question is: why a monoid structure? Why not just a semigroup, a monoid without ε? One answer is to claim that monoids are just more useful. I think many people (but perhaps not Edward Kmett) would agree with this; almost all of the time, if you have a sensible (<|>)
/mappend
/·, you’ll be able to define a sensible empty
/mempty
/ε. On the other hand, having the extra generality is nice, since it lets you place more things under the umbrella.
You also want to know how this meshes with the “picking”?intuition. Keeping in mind that, in some sense, the right answer is “know when to abandon the ‘picking’ intuition,” I think you can unify the two. Consider []
, the applicative functor for nondeterminism. If I combine two values of type [a]
with (<|>)
, that corresponds to nondeterministically picking either an action from the left or an action from the right. But sometimes, you’re going to have no possible actions on one side—and that’s fine. Similarly, if we consider parsers, (<|>)
represents a parser which parses either what’s on the left or what’s on the right (it “picks”). And if you have a parser which always fails, that ends up being an identity: if you pick it, you immediately reject that pick and try the other one.
All this said, remember that it would be entirely possible to have a class almost like Alternative
, but lacking empty
. That would be perfectly valid—it could even be a superclass of Alternative
—but happens not to be what Haskell did. Presumably this is out of a guess as to what’s useful.
Why does the Alternative
type class need an Applicative
constraint, and why does it need a kind of * -> *
? … Why not just [use] liftA2 mappend
?
Well, let’s consider each of these three proposed changes: getting rid of the Applicative
constraint for Alternative
; changing the kind of Alternative
’s argument; and using liftA2 mappend
instead of <|>
and pure mempty
instead of empty
. We’ll look at this third change first, since it’s the most different. Suppose we got rid of Alternative
entirely, and replaced the class with two plain functions:
fempty :: (Applicative f, Monoid a) => f a
fempty = pure mempty
(>|<) :: (Applicative f, Monoid a) => f a -> f a -> f a
(>|<) = liftA2 mappend
We could even keep the definitions of some
and many
. And this does give us a monoid structure, it’s true. But it seems like it gives us the wrong one . Should Just fst >|< Just snd
fail, since (a,a) -> a
isn’t an instance of Monoid
? No, but that’s what the above code would result in. The monoid instance we want is one that’s inner-type agnostic (to borrow terminology from Matthew Farkas-Dyck in a very related haskell-cafe discussion which asks some very similar questions); the Alternative
structure is about a monoid determined by f
’s structure, not the structure of f
’s argument.
Now that we think we want to leave Alternative
as some sort of type class, let’s look at the two proposed ways to change it. If we change the kind, we have to get rid of the Applicative
constraint; Applicative
only talks about things of kind * -> *
, and so there’s no way to refer to it. That leaves two possible changes; the first, more minor, change is to get rid of the Applicative
constraint but leave the kind alone:
class Alternative' f where
empty' :: f a
(<||>) :: f a -> f a -> f a
The other, larger, change is to get rid of the Applicative
constraint and change the kind:
class Alternative'' a where
empty'' :: a
(<|||>) :: a -> a -> a
In both cases, we have to get rid of some
/many
, but that’s OK; we can define them as standalone functions with the type (Applicative f, Alternative' f) => f a -> f [a]
or (Applicative f, Alternative'' (f [a])) => f a -> f [a]
.
Now, in the second case, where we change the kind of the type variable, we see that our class is exactly the same as Monoid
(or, if you still want to remove empty''
, Semigroup
), so there’s no advantage to having a separate class. And in fact, even if we leave the kind variable alone but remove the Applicative
constraint, Alternative
just becomes forall a. Monoid (f a)
, although we can’t write these quantified constraints in Haskell, not even with all the fancy GHC extensions. (No