Purely Functional Data Structures
All
Stack Overflow 46
This Year
Stack Overflow 1
This Month
Stack Overflow 1
In my experience with FP you start with more of a top down approach in contrast to the bottom up approach discussed in the blog post.
There are many good resources on this. I've referenced a few below. Of course there are many more.
[1] https://www.amazon.com/Pearls-Functional-Algorithm-Design-Ri...
[2] https://www.amazon.com/Purely-Functional-Data-Structures-Oka...
[3] https://www.amazon.com/Algorithm-Design-Haskell-Richard-Bird...
[4] https://www.amazon.com/Structure-Interpretation-Computer-Pro...
For example, if you have a vector of 100 items, and you "mutate" that by adding an item (actually creating a new vector), the language doesn't allocate a new 101-length vector. Instead, we can take advantage of the assumption of immutability to "share structure" between both vectors, and just allocate a new vector with two items (the new item, and a link to the old vector.) The same kind of idea can be used to share structure in associative data structures like hash-maps.
I'm no expert on this, so my explanation is pretty anemic and probably somewhat wrong. If you're curious, the book "Purely Functional Data Structures" [0] covers these concepts in concrete detail.
[0]: https://www.amazon.com/Purely-Functional-Data-Structures-Oka...
Also available as a free PDF: https://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
In addition to what the others said: if you need efficient, but yet immutable data structures (which should be an idiomatic F# way), you have to consider reading Chris Okasaki, Purely Functional Data Structures. There is also a thesis available (on which the book is based).
I think what you are looking for is an immutable map. Functional (or functionally inspired) programming languages (such as Haskell or Scala) have immutable versions of most of the containers you'd find in the STL. Operations such as insertion/removal etc. then return a copy of the map (preserving the original) with the copy containing your requested modification. A lot of work has gone into designing the datastructures so that the copies are able to point to as much of the original datastructure as possible to reduce time and memory complexity of each operation.
You can find a lot more details in a book such as this one: http://www.amazon.co.uk/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
This book has all the answers: http://www.amazon.co.uk/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504
By functional I assume you mean an immutable queue?
If you use F# and .NET there are for example:
If you like to read on how to implement a functional queue I recommend Purely Functional Data Structures by Chris Okasaki.
One of the first ways Okasaki implements a functional queue is using two
List<>
, one you pop from and one you push to. When the pop list is empty the push queue is reversed and becomes the pop list.Bear in mind this is in many ways a rather inefficient queue but it's also rather simple:
https://smile.amazon.com/Purely-Functional-Data-Structures-O...
I think the author is using the term "purely functional data structure" to mean a data structure that lends itself to an implementation in a purely functional language [1,2].
[1] https://www.amazon.com/Purely-Functional-Structures-Chris-Ok...
It's a reasonably complicated topic, but the basic idea is that since the data structures are immutable, much of their structure can be shared between versions with few changes. Most of these data structures end up using a Tree of some sort. Performance characteristics can be influenced to an extent by using bucketing to control the width/height of the tree.
[1] : Purely Functional Data Structures (Amazon: https://www.amazon.com/Purely-Functional-Structures-Chris-Ok...) [2] http://hypirion.com/musings/understanding-persistent-vector-... [3] https://www.youtube.com/watch?v=7BFF50BHPPo
This article, Haskell: Queues without pointers, describes a purely functional queue with O(1) amortized cost (edit: for adding and removing elements). I think the data structure comes from Chris Okasaki and more details are in his book.
The basic idea is to decompose the queue into two lists, one for the front and one for the back. New elements are added to "front". "Back" is stored in reverse order, to facilitate popping elements. When all elements of "back" are gone, "front" is reversed and re-identified as "back". This data structure has O(1) amortized cost for these operations, but apparently with some work it can be reduced to O(1), proper.
Edit: Okasaki's paper describes an elegant, purely functional implementation of queues and double-ended queues (deques). Deques allow adding or removing elements from either end. All such operations are O(1), worst case.
First of all, don't do premature optimisations. Have you measured your code? Maybe there is some concrete bottlenecks?
Since most of the objects consist of smaller objects joined via data stuctures, I think you can get rid of this problem by using persistent data structures .
Here is wonderful talk about some of them by Daniel Spiewak. If you want more, take a look on Purely Functional Data Structures by Chris Okasaki.
You can use a purely functional map implementation, where you just ignore the values.
See http://hackage.haskell.org/packages/archive/containers/0.1.0.1/doc/html/Data-IntMap.html (linked to from https://cstheory.stackexchange.com/questions/1539/whats-new-in-purely-functional-data-structures-since-okasaki ).
(sidenote: For more information on functional datastructures, see http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504 )
I don't think there's a shortcut here.
Just as you imply, Clojure's persistent data structures are quite a different thing from the immutable collections classes that Cocoa provides.
If you want to use Clojure's persistent data structures from Obj-C, the only way to do so is to re-implement them in Objective-C. My understand is that many of these are described in Okasaki's book, Purely Functional Data Structures, and in the papers of Phil Bagwell.
This other answer has some links: What is the data structure behind Clojure's sets?.
A related question was asked before: "Does functional programming replace GoF design patterns", with great responses.
The equivalent of "design patterns" is very vague in FP. In general, every time you see a "pattern" in your code you should create something to cover all of its uses in a uniform way. Often it will be a higher-order function.
For example, the following C code
can be thought of some basic "design pattern" - checking if there's some special element on the list. This snippet could appear many times in code with different conditions. In FP, you simply use a higher order function several times. It's not a "pattern" anymore.
Functional programming has its own practices, but they are much different from "design patterns" in OOP. They include use of polymorphism, lists, higher-order functions, immutability/purity, laziness [not all are essential or specific to FP]... See also "what are core concepts of FP". Also, type classes (Haskell), modules and functors (OCaml), continuations, monads, zippers, finger trees, monoids, arrows, applicative functors, monad transformers, many purely functional data structures (book) etc. Functional pearls, already mentioned by Randall Schulz, form a very rich resource of FP at its best.
To learn how to write idiomatic code, any book/resource on a functional programming language will suffice IMHO (for example, RWH and LYAH); differences between thinking imperatively and functionally are always explained there.
In dynamic languages, Jeff Foster's link is a good collection; here is a very clever use of memoization in JavaScript that could be considered a "design pattern".
You could try references to his book by Haskell or Clojure folk rather than just the CMU pdf : e.g.,
http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504
There was a question here on SO at :
What is the benefit of purely functional data structure?
There is also Clojure area this :
https://github.com/viksit/clojure-datastructures
And there was this on SE :
https://cstheory.stackexchange.com/questions/1539/whats-new-in-purely-functional-data-structures-since-okasaki
Hope something there provides a basis for a search that bears results :-)
You may have to use an academic or biz ref search engine and you may want to look at poster sessions at a conf because search is not obvious here, e.g., Mercury can generate Erlang code ... so searching caching and locality with respect to performance in functional programming in some hardware area dealing with latency.
Canada'a National Research Council (NRC) had some work going on ... you could try a search of their pub's/notices/reports
But note: a search with
bigdata latency locality NRC 2012
gives rather different result from
bigdata functional latency locality NSF 2012
( and I would next drop the 2012 and try using the google search tool date range option for recent results)
This would be a long answer, so please let me apologize in advance for pointing you towards a book and not directly answering. I highly recommend looking at Purely Functional Data Structures which is (legally) available as a PDF from the author. Though it's a good book to have in print/ebook anyway.
and the super short answer is use Clojure's built in
sorted-map
if you want this in practice (though writing your own will of course get nerd-street-cred) because sorted maps use a binary red-black-tree under the hoodIt's because of immutable state. A list is an object + a pointer, so if we imagined a list as a Tuple it might look like this:
Now let's get the first item in this "list" with a "head" function. This head function takes O(1) time because we can use fst:
If we want to swap out the first item in the list with a different one we could do this:
Which can also be done in O(1). Why? Because absolutely no other element in the list stores a reference to the first entry. Because of immutable state, we now have two lists,
tupleList
andtupleList2
. When we madetupleList2
we didn't copy the whole list. Because the original pointers are immutable we can continue to reference them but use something else at the start of our list.Now let's try to get the last element of our 3 item list:
That happened in O(3), which is equal to the length of our list i.e. O(n).
But couldn't we store a pointer to the last element in the list and access that in O(1)? To do that we would need an array, not a list. An array allows O(1) lookup time of any element as it is a primitive data structure implemented on a register level.
(ASIDE: If you're unsure of why we would use a Linked List instead of an Array then you should do some more reading about data structures, algorithms on data structures and Big-O time complexity of various operations like get, poll, insert, delete, sort, etc).
Now that we've established that, let's look at concatenation. Let's concat
tupleList
with a new list,("e", ("f", []))
. To do this we have to traverse the whole list just like getting the last element:The above operation is actually worse than O(n) time, because for each element in the list we have to re-read the list up to that index. But if we ignore that for a moment and focus on the key aspect: in order to get to the last element in the list, we must traverse the entire structure.
You may be asking, why don't we just store in memory what the last list item is? That way appending to the end of the list would be done in O(1). But not so fast, we can't change the last list item without changing the entire list. Why?
Let's take a stab at how that might look:
IF I modify "last", which is an immutable variable, I won't actually be modifying the pointer for the last item in the queue. I will be creating a copy of the last item. Everything else that referenced that original item, will continue referencing the original item.
So in order to update the last item in the queue, I have to update everything that has a reference to it. Which can only be done in optimally O(n) time.
So in our traditional list, we have our final item:
But if we want to change it, we make a copy of it. Now the second last item has a reference to an old version. So we need to update that item.
But if we update the second last item we make a copy of it. Now the third last item has an old reference. So we need to update that. Repeat until we get to the head of the list. And we come full circle. Nothing keeps a reference to the head of the list so editing that takes O(1).
This is the reason that Haskell doesn't have Doubly Linked Lists. It's also why a "Queue" (or at least a FIFO queue) can't be implemented in a traditional way. Making a Queue in Haskell involves some serious re-thinking of traditional data structures.
If you become even more curious about how all of this works, consider getting the book Purely Funtional Data Structures.
EDIT: If you've ever seen this: http://visualgo.net/list.html you might notice that in the visualization "Insert Tail" happens in O(1). But in order to do that we need to modify the final entry in the list to give it a new pointer. Updating a pointer mutates state which is not allowed in a purely functional language. Hopefully that was made clear with the rest of my post.