Suppose I have a record, e.g. Person
, and I want to be able to look this person up through multiple data structures. Maybe there's an index by name, another index by the person's zip code, and another index by the person's current latitude and longitude. And maybe many more data structures. All of which exist because I need to efficiently look up a person or persons with different criteria.
If I just need to read a person's attributes, this is no problem. But now suppose I need to look up a person using one of these data structures and then update the person's data.
In an OOP language, each data structure would point to the same person in memory. So when you update one, you're implicitly updating the referents of the other data structures as well. This is pretty much the definition of side effects and impurity. I know it's totally counter to the Haskell paradigm, and I'm not expecting Haskell to work this way.
So, what is the Haskell-ish way to do it? To be clear, the problem is this: I look up a person by one data structure, and I pass that person (and maybe some other arbitrary data) into a function of type ArbitraryData -> Person -> Person
. How do I propagate this change across all the various lookup structures?
As a relative newcomer to Haskell, my first instinct is to reconstruct every lookup structure with the newly updated person, every time I update a person. But that seems like a lot of ceremony, with plenty of opportunity for me to screw up in a way GHC can't detect, and not at all elegant. Haskell is known for its elegance, and I can't imagine it lacks an elegant solution to such a common and basic problem. So I think I'm missing something.
For reference, this question expands on some of the issues I was discussing in the following questions:
Multiple lookup structures for same data: Memory duplication?
Identity of simulation objects in Haskell
Edit
One solution that just crossed my mind: Don't maintain a copy of each lookup structure in your main state. Just keep one single list of all persons in existence, and that's the only thing we need to update when we update a person. Every time you need to lookup by, say, zip code, pass the list of all persons into a function that generates the efficient by-zip-code data structure. Then perform the lookup on the result.
I don't know if this would be efficient. If it results in the CPU actually recomputing the lookup structure on each use, it's unacceptable. But I know Haskell sometimes can avoid reevaluating identical expressions. Unfortunately, I still haven't figured out when this is the case. So I don't know if this approach is viable.
So in other words: Can I write my functions as if they're computing the lookup each time, when in fact GHC will optimize it away for cases where the underlying data hasn't changed? Because that would be a very elegant solution to the problem I've identified above.
See Question&Answers more detail:
os