I saw in this video that computing clustering coefficient of central node of a star graph using the following algorithm is $\Theta(n^2)$ and for a clique it is $\Theta(n^3)$. is that correct?
def clustering_coefficient(G,v): neighbors = G[v].keys() if len(neighbors) == 1: return 0.0 links = 0.0 for w in neighbors: for u in neighbors: if u in G[w]: links += 0.5 return 2.0*links/(len(neighbors)*(len(neighbors)-1))
Asked By : Sajjad
Answered By : Paresh
Edit: My answer was based on assuming the code given as a pseudo-code, rather than a concrete Python example. For Python, the data structure is a dictionary (hash map) and so finding neighbors (the in
) operation will in fact be $O(1)$ on average, giving $\Theta(n^2)$ overall complexity on average for both graphs. Rare worst cases in hash maps can depend on implementation. In general for pseudo-codes, see my original answer below.
The complexity also depends on the data structure being used for storing the graph $G$. If an adjacency list is used, where for every vertex $v$ in the graph, its neighbors are stored in an array/linked list, then the above complexities are valid. Here's how:
For the central node of the star graph, there are $n-1$ neighboring vertices. Then, the pair of neighbors of this central node - the $w$ and $u$ vertices - can be selected in $\Theta(n^2)$ ways. For each such $w$, checking if $u$ is a neighbor of $w$ can be done in $\Theta(1)$ time, since each such $w$ only has $1$ neighbor - the central node $v$. This gives overall complexity of $\Theta(n^2)$.
For a clique of size $n$, every node has $\Theta(n)$ neighbors. Then, the pair of neighbors $w$ and $u$ for any node $v$ can be chosen in $\Theta(n^2)$ ways - this is the complexity of the two outer for
loops. For every such $w$, checking if $u$ is its neighbor requires going through the list of $\Theta(n)$ neighbors of $w$, and so the if
statement has $\Theta(n)$ complexity. This takes the overall complexity of the two nested for
loops and the if
loop to $\Theta(n^3)$.
However, there are other representations of the graph data structure, which give different (generally better) complexities for the same algorithm. For example, having an adjacency matrix to represent the graph means that the check for $u$ is a neighbor of $w$ can be done in $\Theta(1)$ time, even for a clique. Similarly, using a hash set instead of list/array to store neighbors would mean checking for neighbors can be done in $\Theta(1)$ on average. These are the more common representation. You could also choose an arbitrary representation where neighbors are sorted (just to demonstrate how the underlying data structure can affect algorithm complexity). In such a case, checking for neighbors will take $\Theta(\log n)$, giving overall $\Theta(n^2 \log n)$. Each has its pros and cons, and the appropriate data structure must be chosen according to your situation.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/7624
0 comments:
Post a Comment
Let us know your responses and feedback