Functional Programming
Functional programming is a style of programming. Some use the words programming
paradigm. The more standard styles are procedural and objectoriented, which
are part of the imperative paradigm. Functional programs fundamentally evaluate
expressions; imperative programs fundamentally execute statements. Functional
programming uses the simplicity and power of functions to accomplish
programming tasks.
In a purely functional solution to a problem, there will be no mutation to data
structures, and special functions calling functions (not looping) will be the
primary control structure. Primarily this lecture discusses functional
programming using three functions: map, filter, and reduce, which each take a
function as an argument.
Functional languages easily support and frequently use the passing of functions
as arguments to other functions (these other functions are called "higherorder"
functions or functionals) and functions returning other functions as their
results. Python allows both.
There are programming languqages that are purely function (Haskell), others that
are mostly functional (ML whose major implementations are SML and OCaml the
Scheme dialect of Lisp, and Erlang), and still others that at least support a
functional style of programming (some better, some worse) when it is useful.
Python is in this latter category, although features like comprehensions in
Python emphasize its functional programming aspects (lambdas fall into this
category too).
Generally, functional programming is characterized as using immutable objects
and no state changes (not even rebinding of names). Strings, tuples and
frozensets are all immutable in Python, but lists, sets, and dicts are not.
In functional programming, we don't change data structures but produce new ones
that are variants of the old ones. For example, if we want to "change" the first
value of a tuple t to 1, we cannot: instead we create a new tuple whose first
values is 1 and whose other values are the other values in the tuple, using the
concatenation operator. The new tuple is specified as (1,)+t[1:]; note that we
need the comma in (1,) to distinguish it from (1): the former is a tuple
containing the value 1, the later is an int.
Functional programming creates lots of objects and must do so in a time and
space efficient way, and for the most part, functional languages achieve parity
in time/space efficiency with nonfunctional programming languages. Although,
mixed language like Python tend not to do as well when used functionally as true
functional languages. Emerging languages like Scala and Clojure are closing the
gap. Also, because of the simplicity of the semantics of functional programming,
it is easier to automatically transform functional programs to run efficiently
on parallel, cluster, or multicore computers (see the end of this lecture).
Functional programming languages are still not as widely used as imperative
languages, but they continue to find many uses in industry, and in some
industries (telecommunications) they have achieved dominance (at least with
some companies within the industries). Programmers who are trained to use
functional languages think about problems and solve problems differently. All
CS students should be exposed to functional programming as part of their
education (and I mean an exposure longer than one day).
To learn more about Python's use of functional programming, read section 10
(Functional Programming Modules) in Python's Library documentation, discussing
the itertools, functools, and the operator modules.

In this lecture we will look at just three important higherorder functions used
in functional programming: map (transform), filter, and reduce (accumulate).
Each operates on a function and an interable, which means they operate on lists
and tuples easily. We will write versions of these functions, to help explain
what they do, although more general and faster versions are already available
in in Python's builtins module. So mostly to study functional programming we
are concerned with studying how to call these functions.
(1) map/transform: this function takes a unary function and a list and produces
a samesized list of mapped/transformed values based on substituting each
value with the result of calling the parameter function on it. For example,
calling
map( lambda x : x**2, [i for i in irange(0,5)] )
produces a list of the squares of the values of the numbers [0, 1, 2, 3, 4, 5],
which is [0,1,4,9,16,25].
Note that lambdas are frequently (but not exclusively) used in calls to the map
function. We can use regularly defined functions as well, but often we need only
a short function, so lambdas often work out well.
Here is a simple implementation of the map function. Again, Python's builtin
map function is more general and faster; I show this one only to help you
undertand what map does.
def map(f,alist):
answer = []
for v in alist: # generate each value in alist
answer.append(f(v)) # put f(that value) in a list to return
return answer
The simpler definition below uses a comprehension. The use of map functions in
programming preceded the use of comprehensions, which are more powerful. So
comprehensions make the map function trivial to write.
def map(f,alist):
return [f(v) for v in alist]
(2) filter: this function takes a predicate (a unary function returning a bool,
although in Python most values have a bool interpretation) and some list of
values and produces a list (the same size or smaller) with only those values
for which the predicate returns True (or a value that is interpreted as True).
For example, calling
filter( predicate.is_prime, [i for i in irange(2,50)] )
produces a list of all the values between 2 and 50 that are prime:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47].
Here is a simple implementation of the filter function. Again, Python's builtin
filter function is more general and faster; I show this one only to help you
undertand what filter does.
def filter(p,alist):
answer = []
for v in alist:
if p(v):
answer.append(v)
return answer
The simpler definition below uses a comprehension. The use of filter functions
in programming preceded the use of comprehensions, which are more powerful. So
comprehensions make the filter function trivial to write.
def map(p,alist):
return [v for v in alist if p(v)]
(3) reduce/accumulate: this function is different than the previous two: it
takes a binary function and some list of values and typically produces a single
value: it reduces or accumulates these results into a single value.
Unlike map and filter (which are defined and automatically imported from the
builtins module) we must import reduce from the functools module explicitly.
For example, calling
reduce( lambda x,y : x+y, [i for i in irange(1,100)] )
returns the sum of all the values 1 to 100, in the list. Here is a more
interesting call, because uses a noncommutative operator (subtract).
reduce( lambda x,y : xy, [1,2,3] )
which returns 4: or 1  2  3 or (12)3. Technically, this is called LEFT
reduction/accumulation because the operators are applied left to right. If
they had been applied right to left (right reduction), the result would have
been 1(23) = 1  (1) = 2. For all commutative operators, the association
order doesn't make a difference. That is, (1+2)+3 is the same as 1+(2+3). So
for 5 values, the reduce is equivalent to (((1+2)+3)+4)+5.
Note that the operator module defines a variety of functions like add (which
has the same meaning as lambda x,y: x+y) so we could also call this function
as reduce( operator.add, [i for i in irange(1,100)] ) if we had imported
operators.
Here is another interesting example
reduce( max, [4,2,3,8,6] )
which is equivalent to max(max(max(max(4,2),3),8),6) which evaluates as
follows, to compute the maximum of the entire list of values.
max(max(max(max(4,2),3),8),6) > max(max(max(4,3),8),6) >
max(max(4,8),6) > max(8,6) > 8
So, we can easily add up a list or compute its max using functional programming.
Here is the simplest implementation of reduce that I can think of. None is
returned for an empty list; otherwise f is applied to all the operands as shown
above to compute the reduced/accumulated value.
def reduce(f,alist)
if alist == []:
return None
answer = alist[0]
for v in alist[1::
answer = f(answer,v)
return v
Hand simulate this code (or single step the debugger on it) for the examples
above. reduce(f,[a,b]) returns f(a,b); reduce(f,[a,b,c]) returns f(f(a,b),c);
reduce(f,[a,b,c,d]) returns f(f(f(a,b),c),d); etc.
Here is a concrete example of a function style of programming. This expression
computes the length of the longest line in a file.
reduce(max, map(lambda l : len(l.rstrip()), [line for line in open('file')]))
First the comprehension puts all the lines in a list; then map creates a list
of lengths for each rstripped line; finally reduce applies max to the values
in the list to compute the maximum length line.
To return the longest line itself, not the length of the longest line, we could
compute as follows. Here the lambda for reduce (whose arguments will be two
lines from the file) returns the longer of the two lines; when reduced over all
lines in the file, the final result is the largest line in the file. The lambda
to map now strips spaces off the right end, but doesn't map lines to their
lengths.
reduce(lambda x,y: x if len(x) >= len(y) else y,
map(lambda l : l.rstrip(), [line for line in open('file')]))
Functional programmers spend a lot of time using these tools to build up their
idioms of expressions. We are just peeking at this topic now. It is possible
for reduce to return all type of results, not just simple ones: there are for
example, wasy to reduce lists of lists to produce just lists.
For example, these two lambdas end up computing a tuple whose index 0 in the
smallest value in a list and whose index 1 is the largest.
reduce(lambda tupx,tupy: ( min(tupx[0],tupy[0]), max(tupx[1],tupy[1]) )
map(lambda x : (x,x), [4,2,3,8,6] )
The result returned by calling reduce is (3,8). Here's how this function works:
The inner map produces the list [(4,4), (2,2), (3,3), (8,8), (6,6)]. The
reduce reduces (4,4) and (2,2) to (2,4); then reduces (2,4) and (3,3) to
(3,4); then reduces (3,4) and (8,8) to (3,8); then reduces (3,8) and (6,6)
too (3,8) which are the minmum/maximum respectively of the original list.
Two final truths.
1) The map, filter, and reduce function work on anything that is iterable
(which list and tuple are, but so are sets, dicts, and even strings). So I can
call map(lambda x : x.upper(), 'Hello') to produce the list
['H', 'E', 'L', 'L', 'O'].
2) The map and filter functions built into Python produce an iterable as a
result (not a real list). So if we call
print(map(lambda x : x.upper(), 'Hello'))
Python prints: <map object at 0x02DFFE30>
Which just says that the result is a map object, which is iterable. But if we
call
print(list(map(lambda x : x.upper(), 'Hello')))
the list constructor takes this iterable and produces a real list of its values
from it, so Python prints: ['H', 'E', 'L', 'L', 'O']

MapReduce, commutative functions, and parallel processing
MapReduce is a special language implementing the map/reduce functions running on
a parallel, cluster, or multicore computer. If we can write our code within
the MapReduce language, we can guarantee that it runs quickly on the
kinds of computers Google uses for its servers. Generally what it does is run
similar operations on huge amounts of data, combining results, until we get a
single answer. Apache Hadoop is open source version of MapReduce (but to really
see its power, we need a cluster of computer to run our code on).
How does MapReduce work? The story is long, but imagine we have a commutative
operator and want to compute: 1 + 2 + 3 + ... + n
We can evaluate this expression as shown above, one operator at a time left to
right, which would require n1 additions one right after the other (the former
must finish before the later starts). Even if we had multiples cores, doing the
operations in this order would require n1 sequential additions, one after the
other, so only one core at a time would be active.
1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16
    
+++   
   
3   
   
+++  
  
6  
  
+++ 
 
10 
 
+++

15
.... note that one more operand is used at each level
Here each level uses 1 core and there are 15 levels. In general, with N numbers
to add it take N1 time steps.
Now, how can MapReduce handle this problem?
Instead, because of commutivity, we can evaluate this expression in a
different way: add the 1st and 2nd values at the same time as we add the 3rd
and 4th at the same time as the 5th and 6th ... In all, we can add n/2 pairs
simultaneously (if we had n/2 cores). We can use this same trick for the
remaining n/2 sums, simultaneously adding them together; then for the n/4 sums,
..., to the final sum sums (for which only one processor is necessary). Here is
a pictorial representation of this process for 16 values.
1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16
               
+++ +++ +++ +++ +++ +++ +++ +++ 8 cores
       
3 7 11 15 19 23 27 31
       
+++ +++ +++ +++ 4 cores
   
10 26 42 58
   
+++ +++ 2 cores
 
36 100
 
+++ 1 core

136
Here each level uses as many cores as possible there are 4 levels. In general,
with N numbers to add it takes Log2 N times steps. Recall that Log2 1,000 is
10, Log2 1,000,000 is 20, and Log2 1,000,000,000 = 30. To perform 10**9
additions in 30 time steps, we'd need a half billion cores: not likely this is
coming in your lifetime. But if we had tensorhundreds of cores, we could keep
them all busy except for the last few (bottom) levels. So we could get our code
to run tensorhundreds of times faster.