Sunday, June 15, 2014

Big O notation - java

Sometimes comes up

Big-Oh (the "O" stands for "order of") notation is concerned with what happens for very large values of N; therefore only the largest term in a polynomial matters, and all smaller terms are dropped.

Let's brush up on this

Big O notation is used in Computer Science to describe the performance or complexity of an algorithm. Big O specifically describes the worst-case scenario, and can be used to describe the execution time required or the space used (e.g. in memory or on disk) by an algorithm.

O(1) describes an algorithm that will always execute in the same time (or space) regardless of the size of the input data set, e.g. reading the first element of an array.
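A minimal sketch of a constant-time operation (the class and method names here are my own, just for illustration):

```java
// Hypothetical example: accessing an array element by index is O(1) --
// the cost does not depend on the array's length.
public class ConstantTime {
    // Returns the first element; one operation whether the array
    // holds ten elements or ten million.
    public static int first(int[] data) {
        return data[0];
    }

    public static void main(String[] args) {
        System.out.println(first(new int[]{7, 3, 5})); // prints 7
    }
}
```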

O(N) describes an algorithm whose performance will grow linearly and in direct proportion to the size of the input data set, e.g. searching for a string in a linked list.
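A linear search over a linked list might look like this (class name is hypothetical; in the worst case the target is last or absent, so all N elements are visited):

```java
import java.util.LinkedList;
import java.util.List;

public class LinearSearch {
    // Walks the list from the front; worst case visits all N elements,
    // so the running time grows in direct proportion to the list size.
    public static boolean contains(List<String> list, String target) {
        for (String s : list) {          // up to N iterations
            if (s.equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> names = new LinkedList<>(List.of("ant", "bee", "cat"));
        System.out.println(contains(names, "cat")); // true
        System.out.println(contains(names, "dog")); // false
    }
}
```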

O(N^2) represents an algorithm whose performance is directly proportional to the square of the size of the input data set. This is common with algorithms that involve nested iterations over the data set; deeper nesting raises the exponent, e.g. three nested loops over the data would be O(N^3).
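As a sketch of the nested-iteration case (names are my own): comparing every element of an array against every other element requires a loop inside a loop, giving roughly N * N comparisons.

```java
public class QuadraticExample {
    // Two nested loops over the same array: on the order of N^2
    // comparisons in total, so the algorithm is O(N^2).
    public static int countEqualPairs(int[] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {  // inner loop: O(N)
                if (a[i] == a[j]) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // pairs of equal values: (0,2), (0,4), (2,4), (1,3)
        System.out.println(countEqualPairs(new int[]{1, 2, 1, 2, 1})); // 4
    }
}
```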

O(2^N) denotes an algorithm whose growth will double with each additional element in the input data set.
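The classic illustration (my choice of example, not from the text above) is the naive recursive Fibonacci: each call spawns two more calls, so the call tree roughly doubles with every extra unit of n.

```java
public class ExponentialExample {
    // Naive recursive Fibonacci: fib(n) calls fib(n-1) and fib(n-2),
    // so the number of calls grows roughly as 2^N.
    public static long fib(int n) {
        if (n <= 1) {
            return n;
        }
        return fib(n - 1) + fib(n - 2);
    }

    public static void main(String[] args) {
        System.out.println(fib(10)); // 55
    }
}
```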

Logarithm
This type of algorithm is described as O(log N). The iterative halving of the data set described in the binary search example produces a growth curve that rises quickly at the beginning and slowly flattens out as the size of the data set increases, e.g. if an input data set containing 10 items takes one second to complete, a data set containing 100 items takes two seconds, and a data set containing 1,000 items takes three seconds. Doubling the size of the input data set has little effect on its growth: after a single iteration of the algorithm the data set will be halved, and therefore on a par with an input data set half the size. This makes algorithms like binary search extremely efficient when dealing with large data sets.
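A binary search like the one described above can be sketched as follows (class name is mine; the key point is that each loop iteration halves the remaining range, so at most about log2(N) iterations run):

```java
public class BinarySearchExample {
    // Each iteration halves the remaining search range,
    // so the loop runs at most ~log2(N) times: O(log N).
    public static int indexOf(int[] sorted, int target) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // written this way to avoid int overflow
            if (sorted[mid] == target) {
                return mid;
            } else if (sorted[mid] < target) {
                lo = mid + 1;               // discard the lower half
            } else {
                hi = mid - 1;               // discard the upper half
            }
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] data = {2, 4, 8, 16, 32, 64};
        System.out.println(indexOf(data, 16)); // 3
        System.out.println(indexOf(data, 5));  // -1
    }
}
```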


Complexity is another term for the CPU or memory resources an algorithm requires.

Here is a table of typical cases, showing how many "operations" would be performed for various values of N. Logarithms to base 2 (as used here) are proportional to logarithms in any other base, so the choice of base doesn't affect the big-oh formula.

              constant  logarithmic  linear     n log n     quadratic   cubic
    N         O(1)      O(log N)     O(N)       O(N log N)  O(N^2)      O(N^3)
    1         1         1            1          1           1           1
    2         1         1            2          2           4           8
    4         1         2            4          8           16          64
    8         1         3            8          24          64          512
    16        1         4            16         64          256         4,096
    1,024     1         10           1,024      10,240      1,048,576   1,073,741,824
    1,048,576 1         20           1,048,576  20,971,520  ~10^12      ~10^16


Why big-oh notation isn't always useful (I like this)

Complexity analysis can be very useful, but there are problems with it too.
  • Too hard to analyze. Many algorithms are simply too hard to analyze mathematically.
  • Average case unknown. There may not be sufficient information to know what the most important "average" case really is, therefore analysis is impossible.
  • Unknown constant. Both walking and traveling at the speed of light have a time-as-function-of-distance big-oh complexity of O(N). Although they have the same big-oh characteristics, one is rather faster than the other. Big-oh analysis only tells you how the cost grows with the size of the problem, not how efficient it is in absolute terms.

Typical big-oh values for common algorithms

Searching

Here is a table of typical cases.
    Type of Search                              Big-Oh    Comments
    Linear search: array/ArrayList/LinkedList   O(N)
    Binary search: sorted array/ArrayList       O(log N)  Requires sorted data.
    Search: balanced tree                       O(log N)
    Search: hash table                          O(1)
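The hash table row is worth a quick sketch (class and method names are hypothetical): a HashMap computes the key's bucket directly from its hash code, so an average-case lookup does not grow with the number of entries.

```java
import java.util.HashMap;
import java.util.Map;

public class HashLookup {
    // Average-case O(1): the key's hash code points straight at its
    // bucket, so lookup cost does not grow with the map's size.
    public static Integer ageOf(Map<String, Integer> ages, String name) {
        return ages.get(name);   // returns null if the key is absent
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        System.out.println(ageOf(ages, "bob")); // 25
    }
}
```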

Other Typical Operations

    Algorithm         array/ArrayList  LinkedList
    access front      O(1)             O(1)
    access back       O(1)             O(1)
    access middle     O(1)             O(N)
    insert at front   O(N)             O(1)
    insert at back    O(1)             O(1)
    insert in middle  O(N)             O(1)

(Inserting at the front or middle of an array means shifting elements - on average N/2 shifts for a middle insert, which is still O(N). The LinkedList O(1) middle insert assumes you already hold an iterator at the position; reaching that position is itself O(N).)
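The insert-at-front row can be demonstrated directly (class name is mine): `ArrayList.add(0, x)` shifts every existing element right on each call, while `LinkedList.addFirst` just relinks the head node.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class FrontInsert {
    // ArrayList.add(0, x) shifts all existing elements right: O(N) per insert.
    public static List<Integer> viaArrayList(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(0, i);      // shifts i elements on each call
        }
        return list;
    }

    // LinkedList.addFirst(x) relinks the head node: O(1) per insert.
    public static List<Integer> viaLinkedList(int n) {
        LinkedList<Integer> list = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            list.addFirst(i);    // constant-time head relink
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(viaArrayList(5));  // [4, 3, 2, 1, 0]
        System.out.println(viaLinkedList(5)); // [4, 3, 2, 1, 0]
    }
}
```

Both methods produce the same list; the difference is purely in how the work scales as n grows.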


intelligent comment

Time-space tradeoffs

Sometimes it's possible to reduce execution time by using more space, or reduce space requirements by using a more time-intensive algorithm.
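A standard example of trading space for time (the class name and cache design are mine, not from the post): memoizing the naive Fibonacci from earlier. The extra memory for the cache turns an O(2^N) recursion into O(N).

```java
import java.util.HashMap;
import java.util.Map;

public class MemoFib {
    // Extra space: one cached result per distinct n.
    private static final Map<Integer, Long> cache = new HashMap<>();

    // Each fib(n) is computed at most once and then reused,
    // so the O(2^N) naive recursion becomes O(N) time.
    public static long fib(int n) {
        if (n <= 1) {
            return n;
        }
        Long cached = cache.get(n);
        if (cached != null) {
            return cached;       // time saved by spending space
        }
        long result = fib(n - 1) + fib(n - 2);
        cache.put(n, result);
        return result;
    }

    public static void main(String[] args) {
        // Far too slow for the naive version; instant with the cache.
        System.out.println(fib(50)); // 12586269025
    }
}
```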

another way to describe it

Big-O:
  • Describes how the algorithm scales and performs, in terms of either the execution time required or the space used.
  • Is a relative representation of complexity. This allows you to reduce an algorithm to a variable, which in turn allows you to easily compare it to another.
  • Describes an upper limit on the growth of a function, in other words the "worst case scenario".
There is also Big-Omega notation, which looks at the lower bound / "best case scenario", stating that the algorithm will take at least X amount of time, and Big-Theta, which is a tight bound: the function is sandwiched between the same growth rate from below and above.
