Wednesday, October 27, 2010

Algorithm to recognize familiar

DEFINITION OF AN ALGORITHM

An algorithm is a method for solving a class of problems. While computer scientists think a lot about
algorithms, the term applies to any method of solving a particular type of problem. The repair manual for your car will describe a procedure, which could also be called an algorithm, for replacing the brake pads. The turn-by-turn travel instructions from MapQuest could be called an algorithm for getting from one place to another.

EXAMPLE—DESIGNING A STAIRCASE
You may be surprised, as we were, to know that every staircase must be custom-designed to fit the circumstances of total elevation (total “rise”) and total horizontal extent (total “run”).

To make stairs fit a person’s natural gait, the relationship of each step’s rise (lift height) to its run (horizontal
distance) should be consistent with a formula. Some say the following formula should be satisfied:
(rise * 2) + run = 25 to 27 inches

Others say the following simpler formula works well:
rise + run = 17 to 18 inches

Many say the ideal rise for each step is 7 in, but some say outdoor steps should be 6 in high because people are more likely to be carrying heavy burdens outside. In either case, for any particular situation, the total rise of the staircase will probably not be an even multiple of 6 or 7 in. Therefore, the rise of each step must be altered to create a whole number of steps. These rules lead to a procedure for designing a staircase; a short code sketch of the procedure follows the numbered list below. Our algorithm for designing a set of stairs will be to:


  1. Divide the total rise by 7 in and round the result to the nearest whole number to get the number of steps.
  2. Divide the total run by (the number of steps − 1) (see Fig. 2-1) to compute the run for each step.
  3. Apply one of the formulas to see how close this pair of rise and run values is to the ideal (the rise of each step is the total rise divided by the number of steps).
  4. Repeat the same computations with one more step and with one fewer step, and compute the formula values for those combinations of rise and run as well.
  5. Accept the combination of rise and run that best fits the formula for the ideal.
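
To make the procedure concrete, here is one way it might look in Python. This is only a sketch: the function name design_stairs, the use of the simpler rise + run rule with a midpoint target of 17.5 in, and the example dimensions are our own illustrative choices.

def design_stairs(total_rise, total_run, ideal_rise=7.0, target=17.5):
    # Step 1: estimate the number of steps from the ideal rise.
    base_steps = round(total_rise / ideal_rise)
    best = None
    # Steps 2-4: try one fewer step, the estimate, and one more step.
    for steps in (base_steps - 1, base_steps, base_steps + 1):
        if steps < 2:
            continue                        # need at least 2 steps to have (steps - 1) runs
        rise = total_rise / steps           # rise of each step
        run = total_run / (steps - 1)       # run of each step
        score = abs((rise + run) - target)  # distance from the "rise + run = 17 to 18 in" ideal
        if best is None or score < best[0]:
            best = (score, steps, rise, run)
    # Step 5: return the combination that best fits the formula.
    return best[1], best[2], best[3]

steps, rise, run = design_stairs(total_rise=105, total_run=130)
print(steps, round(rise, 2), round(run, 2))   # prints: 14 7.5 10.0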

An algorithm is a way of solving a type of problem, and an algorithm is applicable to many particular
instances of the problem. A good algorithm is a tool that can be used over and over again, as is the case for our staircase design algorithm.

EXAMPLE—FINDING THE GREATEST COMMON DIVISOR
In mathematics, a famously successful and useful algorithm is Euclid’s algorithm for finding the greatest
common divisor (GCD) of two numbers. The GCD is the largest integer that will evenly divide the two numbers in question. Euclid described his algorithm about 300 BCE.

Without having Euclid’s algorithm, how would one find the GCD of 372 and 84? One would have to factor
the two numbers, and find the largest common factor. As the numbers in question become larger and larger,
the factoring task becomes more and more difficult and time-consuming. Euclid discovered an algorithm that
systematically and quickly reduces the size of the problem by replacing the original pair of numbers by smaller
pairs until one of the pair becomes zero, at which point the GCD is the other number of the pair (the GCD
of any number and 0 is that number).

Here is Euclid’s algorithm for finding the GCD of any two numbers A and B.
Repeat:
    If B is zero, the GCD is A.
    Otherwise:
        find the remainder R when dividing A by B
        replace the value of A with the value of B
        replace the value of B with the value of R
For example, to find the GCD of 372 and 84, which we will show as:
GCD(372, 84)
Find GCD(84, 36) because 372/84 --> remainder 36
Find GCD(36, 12) because 84/36 --> remainder 12
Find GCD(12, 0) because 36/12 --> remainder 0; Solved! GCD = 12
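
The remainders in that trace are easy to verify with Python's modulo operator (this check is ours, not part of the original example):

print(372 % 84)   # 36
print(84 % 36)    # 12
print(36 % 12)    # 0, so GCD(372, 84) = 12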

More formally, an algorithm is a sequence of computations that operates on some set of inputs and produces
a result in a finite period of time. In the example of the algorithm for designing stairs, the inputs are the total rise and total run. The result is the best specification for the number of steps, and for the rise and run of each step. In the example of finding the GCD of two numbers, the inputs are the two numbers, and the result is the GCD.

Often there are several ways to solve a class of problems, several algorithms that will get the job done. The question then is, which algorithm is best? In the case of algorithms for computing, computer scientists have developed techniques for analyzing the performance and judging the relative quality of different algorithms.

REPRESENTING ALGORITHMS WITH PSEUDOCODE

In computer science, algorithms are usually represented as pseudocode. Pseudocode is close enough to
a real programming language that it can represent the tasks the computer must perform in executing the algorithm. Pseudocode is also independent of any particular language and uncluttered by details of syntax, characteristics that make it attractive for conveying the essential operations of an algorithm to human readers.

There is no standard pseudocode form, and many computer scientists develop a personal style of pseudocode that suits them and their tasks. We will use the following pseudocode style to represent the GCD algorithm:
GCD ( a, b )                Function name and arguments
    While b != 0 {          != means “not equal”; indentation shows what to do while b != 0
        r <-- a modulo b    set r = a modulo b (= remainder of a/b)
        a <-- b             set a = original b
        b <-- r             set b = r (i.e., the remainder)
    }                       border of the “while” repetition
    return a                when b = 0, return value of a as the GCD
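
To show how close such pseudocode is to a real programming language, here is a direct transcription into Python (our illustration; the variable names follow the pseudocode):

def gcd(a, b):
    # Euclid's algorithm: repeatedly replace (a, b) with (b, a modulo b).
    while b != 0:
        r = a % b    # r = remainder of a/b
        a = b        # a takes the original value of b
        b = r        # b takes the remainder
    return a         # when b = 0, a holds the GCD

print(gcd(372, 84))  # prints 12, matching the worked example above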


CHARACTERIZING ALGORITHMS
To illustrate how different algorithms can have different performance characteristics, we will discuss a
variety of algorithms that computer scientists have developed to solve common problems in computing.

Sequential search
Suppose one is provided with a list of people in the class, and one is asked to look up the name Debbie Drawe. A sequential search is a “brute force” algorithm that one can use. With a sequential search, the algorithm simply compares each name in the list to the name for which we are searching. The search ends when the algorithm finds a matching name, or when the algorithm has inspected all names in the list.

Here is pseudocode for the sequential search. The double forward slash “//” indicates a comment. Note,
too, the way we use the variable index to refer to a particular element in list_of_names. For instance,
list_of_names[3] is the third name in the list.
Sequential_Search(list_of_names, name)
    length <-- length of list_of_names
    match_found <-- false
    index <-- 1
    // While we have not found a match AND
    // we have not looked at every person in the list,
    // (The symbol <= means "less than or equal to.")
    // continue ...
    // Once we find a match or get to the end of the list,
    // we are finished
    while match_found = false AND index <= length {
        // The index keeps track of which name in the list
        // we are comparing with the test name.
        // If we find a match, set match_found to true
        if list_of_names[index] = name then
            match_found <-- true
        index <-- index + 1
    }
    // match_found will be true if we found a match, and
    // false if we looked at every name and found no match
    return match_found
end
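
For readers who want to run the sequential search, here is an equivalent version in Python. The class list shown is illustrative data of our own; note that Python indexes lists starting at 0, whereas the pseudocode starts at 1.

def sequential_search(list_of_names, name):
    # Compare each name in the list with the search name, stopping at
    # the first match or after every name has been inspected.
    length = len(list_of_names)
    match_found = False
    index = 0                      # Python lists start at index 0
    while not match_found and index < length:
        if list_of_names[index] == name:
            match_found = True
        index = index + 1
    return match_found

class_list = ["Alan Ames", "Betty Bell", "Debbie Drawe"]   # illustrative data
print(sequential_search(class_list, "Debbie Drawe"))       # True
print(sequential_search(class_list, "Zelda Zane"))         # False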

ANALYZING ALGORITHMS
If we know how long each statement takes to execute, and we know how many names are in the list, we
can calculate the time required for the algorithm to execute. However, the important thing to know about an
algorithm is usually not how long it will take to solve any particular problem. The important thing to know is
how the time taken to solve the problem will vary as the size of the problem changes.

The sequential search algorithm will take longer as the number of comparisons becomes greater. The real
work of the algorithm is in comparing each name to the search name. Most other statements in the algorithm get executed only once, but as long as the while condition remains true, the comparisons occur again and again. If the name we are searching for is in the list, on average the algorithm will have to look at half the names on the list before finding a match. If the name we are searching for is not on the list, the algorithm will have to look at all the names on the list.

If the list is twice as long, approximately twice as many comparisons will be necessary. If the list is a million
times as long, approximately a million times as many comparisons will be necessary. In that case, the time devoted to the statements executed only once will become insignificant with respect to the execution time overall. The running time of the sequential search algorithm grows in proportion to the size of the list being searched. We say that the “order of growth” of the sequential search algorithm is n. The notation for this is T(n).

We also say that an algorithm whose order of growth is within some constant factor of T(n) has a theta of n. We say, “The sequential search has a theta of n.” The size of the problem is n, the length of the list being searched. Since for large problems the one-time-only or a-few-times-only statements make little difference, we ignore those constant or nearly constant times and simply focus on the fact that the running time will grow in proportion to the length of the list being searched. Of course, for any particular search, the time required will depend on where in the list the match occurs.

If the first name is a match, then it doesn’t matter how long the list is. If the name does not occur in the list, the search will always require comparing the search name with all the names in the list. We say the sequential search algorithm is Θ(n) because, in both the average case and the worst case, its performance slows in proportion to n, the length of the list. Sometimes algorithms are characterized by their best-case performance, but usually average-case and, in particular, worst-case performance are reported. The average case is usually better for setting expectations, and the worst case provides a boundary upon which one can rely.
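
A quick way to see this proportionality is to count the comparisons directly. The instrumented search below is our own sketch; it measures the worst case by searching for a name that is not in the list.

def count_comparisons(list_of_names, name):
    # Sequential search that counts how many name comparisons it makes.
    comparisons = 0
    for candidate in list_of_names:
        comparisons = comparisons + 1
        if candidate == name:
            break
    return comparisons

for length in (1000, 2000, 4000):
    names = ["Person %d" % i for i in range(length)]    # illustrative list
    print(length, count_comparisons(names, "Debbie Drawe"))
# Prints 1000 1000, 2000 2000, 4000 4000: doubling the list length
# doubles the comparisons, so the growth is proportional to n.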
