Assume that we choose $k$ distinct numbers $N_1, \dots, N_k$ uniformly at random from $\{1, \dots, k\}$ (i.e., a random permutation), and that we have a file of $k$ parts. Consider these two cases:
- We read (or write) the parts sequentially, from part $1$ to part $k$.
- We read (or write) the part indexed $N_1$, then the part indexed $N_2$, ..., and finally the part indexed $N_k$.
Do these two cases have the same complexity in the RAM model?
In which model do these two cases not have the same complexity?
Asked By: user7060
Answered By: Ran G.
The answer critically depends on the model in use, specifically whether or not the memory is cached, and how the cache is being used. In the following I'll explore the two main architectures: with and without a cache.
Throughout, we assume the file consists of $k$ parts $f_1, \ldots, f_k$, where each $f_i$ is 4 bytes (32 bits).
A. The simple case: a single-layered RAM.
In this case, we assume that we have a large memory $M$, where each address holds 32 bits of data. Accessing an address in the memory takes $t_M$ time and fetches the 32 bits stored at that address.
We can assume that the file is stored sequentially in memory (although it doesn't matter), that is, $M[i]=f_i$ for $i\le k$ and $M[i]=0$ otherwise.
In this case, accessing the file in any order will take the same time, namely $k \cdot t_M$.
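For concreteness, here is a minimal Python sketch of this flat model; the cost constant and the file size are illustrative values chosen for the demo, not part of the question:

```python
# A minimal sketch of the single-layered RAM model: every access costs t_M,
# so the order of the accesses does not matter.
# T_M and k are illustrative values, not taken from the question.
import random

T_M = 100          # cost of one memory access, in arbitrary time units
k = 2048           # number of file parts

def access_time(order):
    """Total time to touch every part in the given order: one t_M per access."""
    return sum(T_M for _ in order)

sequential = list(range(k))
shuffled = random.sample(range(k), k)   # the random permutation N_1, ..., N_k

assert access_time(sequential) == access_time(shuffled) == k * T_M
```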
Bottom line: RAM allows random access with the same access time as sequential access (this is why it is called a Random-Access Memory, RAM).
B. A more realistic case: a cached memory
In the more interesting case, the memory is layered. There is a small, fast cache, which we denote $C$, that holds a small number of addresses, say 1024. Each address contains 32 bits of data.
When accessing an address $i$ in the memory, the process is the following:
- If $M[i]$ exists in the cache, say $C[j]=M[i]$, then the access time is $t_C$ (this is called a cache hit).
- Otherwise, the slow memory is accessed and the data is fetched in time $t_C+t_M$. Additionally, $M[i]$ is now stored in the cache as well (this is called a cache miss).
It is assumed that $t_C \ll t_M$. In case of a miss, if the cache is already full, we assume that the new value brought into the cache replaces the least recently used item (the one whose last access is furthest in the past).
B.1 A Single Access
Assume that at the beginning the cache is empty.
If the file is accessed only a single time, then each address we access is a miss, and it takes $k(t_C+t_M)$ time to access the entire file, regardless of the order.
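This single-pass bound is easy to check with a small simulation. Below is a Python sketch of the cache model just described, assuming an LRU cache of 1024 addresses and illustrative costs $t_C = 1$ and $t_M = 100$ (the class and constant names are mine, not from the answer):

```python
# A sketch of the cache model described above: a hit costs t_C, a miss costs
# t_C + t_M and loads the address, evicting the least recently used one if full.
# CACHE_SIZE, T_C and T_M are illustrative values.
from collections import OrderedDict

CACHE_SIZE = 1024
T_C, T_M = 1, 100

class LRUCache:
    def __init__(self, size=CACHE_SIZE):
        self.size = size
        self.slots = OrderedDict()        # address -> None, ordered by recency

    def access(self, addr):
        """Return the time to access one address under the hit/miss rules."""
        if addr in self.slots:            # cache hit
            self.slots.move_to_end(addr)
            return T_C
        if len(self.slots) >= self.size:  # cache full: evict the LRU address
            self.slots.popitem(last=False)
        self.slots[addr] = None           # cache miss: fetch and store
        return T_C + T_M

# Single pass over a k-part file, starting from an empty cache:
k = 2048
cache = LRUCache()
total = sum(cache.access(i) for i in range(k))
assert total == k * (T_C + T_M)           # every access is a miss
```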
B.2 Multiple Access
Now, assume we need to access the file twice. The first time, the cache is empty, so fetching the file takes $k(t_C+t_M)$, as above.
The second time, things are a bit different.
If $k \le 1024$, then on the second access the entire file is in the cache, and it takes $k \cdot t_C$ time to access it again, regardless of the access order.
If $k > 1024$ (say $k = 2048$), then the timing of the second access changes slightly:
- If we access the file sequentially, then we will always have a miss, since the cache holds the 1024 parts we have most recently accessed and never holds the next one we need. The time is $k(t_C+t_M)$.
- If we access it in a random order (independent of the order of the first pass), then the first part accessed will be in the cache with probability $1024/2048 = 1/2$, the second with probability $1023/2047$, and so on (the last 1024 parts will always cause a miss). This gives a slightly better access time.
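A quick way to sanity-check these numbers is to simulate both second passes. The sketch below uses the same assumed LRU cache of 1024 addresses and illustrative costs as before, with $k = 2048$:

```python
# A sketch of the two-pass experiment from B.2 (LRU cache of 1024 addresses,
# t_C = 1, t_M = 100; all values illustrative).
import random
from collections import OrderedDict

CACHE_SIZE, T_C, T_M = 1024, 1, 100
k = 2048

def run(first_pass, second_pass):
    slots = OrderedDict()
    def access(addr):
        if addr in slots:                 # hit
            slots.move_to_end(addr)
            return T_C
        if len(slots) >= CACHE_SIZE:      # evict the least recently used address
            slots.popitem(last=False)
        slots[addr] = None                # miss: fetch into the cache
        return T_C + T_M
    t1 = sum(access(i) for i in first_pass)
    t2 = sum(access(i) for i in second_pass)
    return t1, t2

seq = list(range(k))
rnd = random.sample(range(k), k)

_, t_seq = run(seq, seq)   # sequential second pass: every access is a miss
_, t_rnd = run(seq, rnd)   # random second pass: some early accesses hit the cache
print(t_seq, t_rnd)        # t_seq == k*(T_C + T_M); t_rnd is usually slightly smaller
```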
B.3 Blocked cache
Realistic caches don't fetch a single address at a time; rather, they work in blocks.
Let's say the block size is 256 words (that is, the 1024-cell cache holds 4 blocks of $256 \times 32$ bits each). Now, when you access an address $i$ in the memory, the following process happens:
- If $M[i]$ exists in the cache, then the access time is $t_C$.
- Otherwise, the slow memory is accessed, and the entire block that contains $M[i]$ is brought into the cache. This takes time $t_C + 256t_M$. We assume we always work in aligned blocks of 256 words: if you access any address between 0 and 255, the same block $M[0],\ldots, M[255]$ is fetched into the cache.
Now, if you access the file in sequential order, then accessing $M[0]$ causes a miss and brings the first block into the cache, taking time $t_C + 256t_M$; the next 255 accesses are hits, each taking $t_C$. Repeating this for each of the $k/256$ blocks gives a total time of about $k(t_C + t_M)$.
However, if you access the file randomly, every part has a probability of about $1024/k$ of already being in the cache (this is not exactly correct, but close enough). Thus the time will be approximately $k \cdot \big(1024/k \cdot t_C + (1-1024/k)(t_C + 256t_M)\big)$. This may be substantially worse than in the sequential case.
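The blocked case can be simulated in the same way. The sketch below assumes an LRU cache of 4 aligned blocks of 256 words each, illustrative costs, and an illustrative file size, and compares a sequential pass with a random one:

```python
# A sketch of the blocked cache from B.3: 4 blocks of 256 words, LRU replacement
# at block granularity (t_C = 1, t_M = 100 and the file size are illustrative).
import random
from collections import OrderedDict

BLOCK, NUM_BLOCKS, T_C, T_M = 256, 4, 1, 100
k = 1 << 16                              # 65536 file parts, assumed for the demo

def pass_time(order):
    blocks = OrderedDict()               # block index -> None, ordered by recency
    total = 0
    for addr in order:
        b = addr // BLOCK                # aligned block containing this address
        if b in blocks:                  # hit: only the cache is touched
            blocks.move_to_end(b)
            total += T_C
        else:                            # miss: fetch the whole block
            if len(blocks) >= NUM_BLOCKS:
                blocks.popitem(last=False)
            blocks[b] = None
            total += T_C + BLOCK * T_M
    return total

seq = list(range(k))
rnd = random.sample(range(k), k)
print(pass_time(seq))   # about k*(T_C + T_M): one block miss per 256 sequential accesses
print(pass_time(rnd))   # much larger: most random accesses miss and fetch a whole block
```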
Bottom line: if the cache is blocked, you gain an advantage by accessing the file sequentially. Otherwise, it shouldn't matter much.
C. The (even more) realistic case
Finally, I must add that there are more parameters that affect the analysis. First, the memory usually has several cache layers, e.g., L1, L2, and L3 caches and a main memory; each layer has more memory but is slower than the layer below it.
In order to give a precise analysis, a full description of the memory and the access pattern should be given, since many parameters might have a dramatic effect on the total access time.
Question source: http://cs.stackexchange.com/questions/33223