Home > Functional Programming

Functional Programming

Functional Programming





Functional programming is a style of programming. Some use the words programming

paradigm. The more standard styles are procedural and object-oriented, which

are part of the imperative paradigm. Functional programs fundamentally evaluate

expressions; imperative programs fundamentally execute statements. Functional

programming uses the simplicity and power of functions to accomplish

programming tasks.



In a purely functional solution to a problem, there will be no mutation to data

structures, and special functions calling functions (not looping) will be the

primary control structure. Primarily this lecture discusses functional

programming using three functions: map, filter, and reduce, which each take a

function as an argument.



Functional languages easily support and frequently use the passing of functions

as arguments to other functions (these other functions are called "higher-order"

functions or functionals) and functions returning other functions as their

results. Python allows both.



There are programming languqages that are purely function (Haskell), others that

are mostly functional (ML -whose major implementations are SML and OCaml- the

Scheme dialect of Lisp, and Erlang), and still others that at least support a

functional style of programming (some better, some worse) when it is useful.

Python is in this latter category, although features like comprehensions in

Python emphasize its functional programming aspects (lambdas fall into this

category too).



Generally, functional programming is characterized as using immutable objects

and no state changes (not even rebinding of names). Strings, tuples and

frozensets are all immutable in Python, but lists, sets, and dicts are not.



In functional programming, we don't change data structures but produce new ones

that are variants of the old ones. For example, if we want to "change" the first

value of a tuple t to 1, we cannot: instead we create a new tuple whose first

values is 1 and whose other values are the other values in the tuple, using the

concatenation operator. The new tuple is specified as (1,)+t[1:]; note that we

need the comma in (1,) to distinguish it from (1): the former is a tuple

containing the value 1, the later is an int.



Functional programming creates lots of objects and must do so in a time and

space efficient way, and for the most part, functional languages achieve parity

in time/space efficiency with non-functional programming languages. Although,

mixed language like Python tend not to do as well when used functionally as true

functional languages. Emerging languages like Scala and Clojure are closing the

gap. Also, because of the simplicity of the semantics of functional programming,

it is easier to automatically transform functional programs to run efficiently

on parallel, cluster, or multi-core computers (see the end of this lecture).



Functional programming languages are still not as widely used as imperative

languages, but they continue to find many uses in industry, and in some

industries (telecommunications) they have achieved dominance (at least with

some companies within the industries). Programmers who are trained to use

functional languages think about problems and solve problems differently. All

CS students should be exposed to functional programming as part of their

education (and I mean an exposure longer than one day).



To learn more about Python's use of functional programming, read section 10

(Functional Programming Modules) in Python's Library documentation, discussing

the itertools, functools, and the operator modules.



------------------------------------------------------------------------------



In this lecture we will look at just three important higher-order functions used

in functional programming: map (transform), filter, and reduce (accumulate).

Each operates on a function and an interable, which means they operate on lists

and tuples easily. We will write versions of these functions, to help explain

what they do, although more general and faster versions are already available

in in Python's builtins module. So mostly to study functional programming we

are concerned with studying how to call these functions.



(1) map/transform: this function takes a unary function and a list and produces

a same-sized list of mapped/transformed values based on substituting each

value with the result of calling the parameter function on it. For example,

calling



map( lambda x : x**2, [i for i in irange(0,5)] )



produces a list of the squares of the values of the numbers [0, 1, 2, 3, 4, 5],

which is [0,1,4,9,16,25].



Note that lambdas are frequently (but not exclusively) used in calls to the map

function. We can use regularly defined functions as well, but often we need only

a short function, so lambdas often work out well.



Here is a simple implementation of the map function. Again, Python's builtin

map function is more general and faster; I show this one only to help you

undertand what map does.



def map(f,alist):

answer = []

for v in alist: # generate each value in alist

answer.append(f(v)) # put f(that value) in a list to return

return answer



The simpler definition below uses a comprehension. The use of map functions in

programming preceded the use of comprehensions, which are more powerful. So

comprehensions make the map function trivial to write.



def map(f,alist):

return [f(v) for v in alist]



(2) filter: this function takes a predicate (a unary function returning a bool,

although in Python most values have a bool interpretation) and some list of

values and produces a list (the same size or smaller) with only those values

for which the predicate returns True (or a value that is interpreted as True).

For example, calling



filter( predicate.is_prime, [i for i in irange(2,50)] )



produces a list of all the values between 2 and 50 that are prime:

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47].



Here is a simple implementation of the filter function. Again, Python's builtin

filter function is more general and faster; I show this one only to help you

undertand what filter does.



def filter(p,alist):

answer = []

for v in alist:

if p(v):

answer.append(v)

return answer



The simpler definition below uses a comprehension. The use of filter functions

in programming preceded the use of comprehensions, which are more powerful. So

comprehensions make the filter function trivial to write.



def map(p,alist):

return [v for v in alist if p(v)]



(3) reduce/accumulate: this function is different than the previous two: it

takes a binary function and some list of values and typically produces a single

value: it reduces or accumulates these results into a single value.



Unlike map and filter (which are defined and automatically imported from the

builtins module) we must import reduce from the functools module explicitly.



For example, calling



reduce( lambda x,y : x+y, [i for i in irange(1,100)] )



returns the sum of all the values 1 to 100, in the list. Here is a more

interesting call, because uses a non-commutative operator (subtract).



reduce( lambda x,y : x-y, [1,2,3] )



which returns -4: or 1 - 2 - 3 or (1-2)-3. Technically, this is called LEFT

reduction/accumulation because the operators are applied left to right. If

they had been applied right to left (right reduction), the result would have

been 1-(2-3) = 1 - (-1) = 2. For all commutative operators, the association

order doesn't make a difference. That is, (1+2)+3 is the same as 1+(2+3). So

for 5 values, the reduce is equivalent to (((1+2)+3)+4)+5.



Note that the operator module defines a variety of functions like add (which

has the same meaning as lambda x,y: x+y) so we could also call this function

as reduce( operator.add, [i for i in irange(1,100)] ) if we had imported

operators.



Here is another interesting example



reduce( max, [4,2,-3,8,6] )



which is equivalent to max(max(max(max(4,2),-3),8),6) which evaluates as

follows, to compute the maximum of the entire list of values.



max(max(max(max(4,2),-3),8),6) -> max(max(max(4,-3),8),6) ->

max(max(4,8),6) -> max(8,6) -> 8



So, we can easily add up a list or compute its max using functional programming.



Here is the simplest implementation of reduce that I can think of. None is

returned for an empty list; otherwise f is applied to all the operands as shown

above to compute the reduced/accumulated value.



def reduce(f,alist)

if alist == []:

return None



answer = alist[0]

for v in alist[1::

answer = f(answer,v)

return v



Hand simulate this code (or single step the debugger on it) for the examples

above. reduce(f,[a,b]) returns f(a,b); reduce(f,[a,b,c]) returns f(f(a,b),c);

reduce(f,[a,b,c,d]) returns f(f(f(a,b),c),d); etc.



Here is a concrete example of a function style of programming. This expression

computes the length of the longest line in a file.



reduce(max, map(lambda l : len(l.rstrip()), [line for line in open('file')]))



First the comprehension puts all the lines in a list; then map creates a list

of lengths for each rstripped line; finally reduce applies max to the values

in the list to compute the maximum length line.



To return the longest line itself, not the length of the longest line, we could

compute as follows. Here the lambda for reduce (whose arguments will be two

lines from the file) returns the longer of the two lines; when reduced over all

lines in the file, the final result is the largest line in the file. The lambda

to map now strips spaces off the right end, but doesn't map lines to their

lengths.



reduce(lambda x,y: x if len(x) >= len(y) else y,

map(lambda l : l.rstrip(), [line for line in open('file')]))



Functional programmers spend a lot of time using these tools to build up their

idioms of expressions. We are just peeking at this topic now. It is possible

for reduce to return all type of results, not just simple ones: there are for

example, wasy to reduce lists of lists to produce just lists.



For example, these two lambdas end up computing a tuple whose index 0 in the

smallest value in a list and whose index 1 is the largest.



reduce(lambda tupx,tupy: ( min(tupx[0],tupy[0]), max(tupx[1],tupy[1]) )

map(lambda x : (x,x), [4,2,-3,8,6] )



The result returned by calling reduce is (-3,8). Here's how this function works:

The inner map produces the list [(4,4), (2,2), (-3,-3), (8,8), (6,6)]. The

reduce reduces (4,4) and (2,2) to (2,4); then reduces (2,4) and (-3,-3) to

(-3,4); then reduces (-3,4) and (8,8) to (-3,8); then reduces (-3,8) and (6,6)

too (-3,8) which are the minmum/maximum respectively of the original list.



Two final truths.



1) The map, filter, and reduce function work on anything that is iterable

(which list and tuple are, but so are sets, dicts, and even strings). So I can

call map(lambda x : x.upper(), 'Hello') to produce the list

['H', 'E', 'L', 'L', 'O'].



2) The map and filter functions built into Python produce an iterable as a

result (not a real list). So if we call



print(map(lambda x : x.upper(), 'Hello'))



Python prints: <map object at 0x02DFFE30>



Which just says that the result is a map object, which is iterable. But if we

call



print(list(map(lambda x : x.upper(), 'Hello')))



the list constructor takes this iterable and produces a real list of its values

from it, so Python prints: ['H', 'E', 'L', 'L', 'O']



------------------------------------------------------------------------------



MapReduce, commutative functions, and parallel processing



MapReduce is a special language implementing the map/reduce functions running on

a parallel, cluster, or multi-core computer. If we can write our code within

the MapReduce language, we can guarantee that it runs quickly on the

kinds of computers Google uses for its servers. Generally what it does is run

similar operations on huge amounts of data, combining results, until we get a

single answer. Apache Hadoop is open source version of MapReduce (but to really

see its power, we need a cluster of computer to run our code on).



How does MapReduce work? The story is long, but imagine we have a commutative

operator and want to compute: 1 + 2 + 3 + ... + n



We can evaluate this expression as shown above, one operator at a time left to

right, which would require n-1 additions one right after the other (the former

must finish before the later starts). Even if we had multiples cores, doing the

operations in this order would require n-1 sequential additions, one after the

other, so only one core at a time would be active.



1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16

| | | | |

+-+-+ | | |

| | | |

3 | | |

| | | |

+--+--+ | |

| | |

6 | |

| | |

+---+--+ |

| |

10 |

| |

+---+--+

|

15



.... note that one more operand is used at each level



Here each level uses 1 core and there are 15 levels. In general, with N numbers

to add it take N-1 time steps.



Now, how can MapReduce handle this problem?



Instead, because of commutivity, we can evaluate this expression in a

different way: add the 1st and 2nd values at the same time as we add the 3rd

and 4th at the same time as the 5th and 6th ... In all, we can add n/2 pairs

simultaneously (if we had n/2 cores). We can use this same trick for the

remaining n/2 sums, simultaneously adding them together; then for the n/4 sums,

..., to the final sum sums (for which only one processor is necessary). Here is

a pictorial representation of this process for 16 values.



1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16

| | | | | | | | | | | | | | | |

+-+-+ +-+-+ +-+-+ +-+-+ +-+-+ +-+-+ +-+-+ +-+-+ 8 cores

| | | | | | | |

3 7 11 15 19 23 27 31

| | | | | | | |

+---+---+ +---+---+ +----+----+ +----+----+ 4 cores

| | | |

10 26 42 58

| | | |

+-------+-------+ +---------+---------+ 2 cores

| |

36 100

| |

+----------------+-----------------+ 1 core

|

136



Here each level uses as many cores as possible there are 4 levels. In general,

with N numbers to add it takes Log2 N times steps. Recall that Log2 1,000 is

10, Log2 1,000,000 is 20, and Log2 1,000,000,000 = 30. To perform 10**9

additions in 30 time steps, we'd need a half billion cores: not likely this is

coming in your lifetime. But if we had tens-or-hundreds of cores, we could keep

them all busy except for the last few (bottom) levels. So we could get our code

to run tens-or-hundreds of times faster.



Search more related documents:Functional Programming
Download Document:Functional Programming

Set Home | Add to Favorites

All Rights Reserved Powered by Free Document Search and Download

Copyright © 2011
This site does not host pdf,doc,ppt,xls,rtf,txt files all document are the property of their respective owners. complaint#downhi.com
TOP