Just out of interest I tried to solve a problem from "Recent" category of Project Euler ( Digit Sum sequence ). But I am unable to think of a way to solve the problem efficiently. The problem is as follows ( in the original question sequence has two ones in beginning , but it does not change the sequence ) :
The Digit Sum sequence is 1,2,4,8,16,23,28,38,49.... where the $n^{th}$ term of sequence is sum of digits preceding it in the sequence. Find the $10^{15}th$ term of the sequence.
The naive solution can't be implemented because it takes a lot of time. I tried to reduce the problem to a case of matrix exponentiation ( that would takes $O(log ( 10^{15}))$ amount of time ) but could not come up with such a recurrence fitting the linear criteria as recurrence for this sequence is quite peculiar. It can be seen that the sequence is governed by the recurrence :
$$ a_n = a_{n-1} + d( a_{n-1} ) ..... (1 )$$
where $a_n$ is $n^{th}$ term of the sequence and $d$ is a function which when given a natural number as input returns sum of digits of the number ( eg. $\;d(786)=21$ ). My second approach was to try to find some pattern in the sequence. It can be seen that the first few terms of the sequence can be written as
a_1 = 1 a_2 = 1 + d( 1 ) a_3 = 1 + d( 1 ) + d( 1 + d( 1 ) ) a_4 = 1 + d( 1 ) + d( 1 + d( 1 ) ) + d( 1 + d( 1 ) + d( 1 + d( 1 ) ) ) a_5 = 1 + d( 1 ) + d( 1 + d( 1 ) ) + d( 1 + d( 1 ) + d( 1 + d( 1 ) ) ) + d( 1 + d( 1 ) + d( 1 + d( 1 ) ) + d( 1 + d( 1 ) + d( 1 + d( 1 ) ) ) )
From the pattern above it becomes that $n^{th}$ term of the sequence can be generated by the following method :
- Write $2^{n-1}$ $1$'s with addition symbol between them.
- Leaving the first $1$, then apply the function $d$ on the next $2^{0}$ terms then on next $2^{1}$ terms, then on next $2^{2}$ terms and so on.
- Then apply the above method recursively on arguments of each $d$ function applied.
for example if n=3 we perform the following manipulations:
1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 1 + d( 1 ) + d( 1 + 1 ) + d( 1 + 1 + 1 + 1 ) 1 + d( 1 ) + d( 1 + d(1) ) + d( 1 + d( 1 ) + d( 1 +d( 1 ) ) )
By dynamic programming I can generate the $n^{th}$ term using the above method in time $O(log ( 2^{10^{15}} ) )$, which again is no better than the naive solution.
EDIT 1
Another thing that can be observed is that $d(a_n) = d(2^{n-1})$. For example $d(a_6)=d(23)=d(32)=5$. But I am unable to make use of this point. I again tried to find a linear recurrence relation ( for matrix exponentiation ) , but I am unable to find it.
EDIT 2
Following is the graph when the sequence is plotted for smaller range ( first $10^6$ terms of the sequence are plotted ).
PS: I know it is not advisable to ask solutions from Project Euler. But I just want a new direction or a hint, as I have been moving in circles for past few days. If that is also unacceptable I can remove the question if suggested.
Asked By : sashas
Answered By : Evil
Your sequence is described in oeis.org/A004207 as digits sum. There are some good points like sequence mod 9 has repeating pattern $(1, 2, 4, 8, 7, 5)^\infty$, it shares digital roots with oeis.org/A065075 and oeis.org/A001370. If those properties are useful is open problem (because there is no closed form equation for $n-th$ number).
There are some properties of this sequence worth mentioning:
When you calculate $n-th$ number, you need to store only counter (to know which numer it was) and the number itself. To restart there is nothing more needed, as the next number is current number + sum of it's digits.
Taking some steps to ensure speed at first it is good to put numbers into array, avoiding naive mod and div calculations, which are expensive. This gives speedup by constant, but looking at times it does matter.
From starting point you can calculate next one, and next, and it works upto some point, this very point is number of digits change.
What is more important, patterns are changing with increase of numbers.
Digit sums are small in compare to numbers itself, so only the part of number will change on the most operations.
So what really can we cache?
We know that with two numbers with the same sum of digits the addition to get the next number will be the same. What about the next one?
sasha
Spoiler alert, below is quite explicit cache pattern
It depends on additional conditions, like the numbers that does not change in the run, I will call it shift, starting amount as start.
Taking some arbitrary run, like $100$, and taking start from $0$ to $9$ we can cache every pattern from starting point up to $100$, count number of elements (to know how to deal with counter, which is needed to give some $n-th$ number), and memorize it.
Ok. Until now any start is covered, what changes beyond $100$?
Unfortunately this patterns stop working, because if you try from $100$ making start equal $1$, the next number is calculated perfectly, but second one breaks.
Here we have to cover shift set to $1$ and start set to $0$. It also means calculating tables for different shifts.
Do we really need to calculate all of them? Not really, no.
Part of tables is just another one starting item further.
For example starting from $1, 2, 4, 8$ gives the same sequence shifted.
So do we need to calculate longer cache?
No, we calculate it up to shift change to pick up another run, so it will save a lot of memory.
Now when table is covered, we start from initial setting, pick sum from table, add counter and see that: from $1$ we get to $101$, update variables and jump to $218$, repeating the steps $305$ -> $406$ -> $517$ -> $607$ -> $719$ -> $805$ -> $904$ -> $1003$.
Ok. What now?
We can continue until shift is higher than calculated.
Going further we might build more runs on fly, precalculate bigger runs, or observe another patterns (like we can partialy reuse tables already calculated).
Take a look at different shifts like $100, 1000, 10000, 100000, 1000000...$ they all give the same run being the same environment for digit sums, hence we can use the very same tables.
Making bigger tables than $100$ elements speeds up process further, making bigger jumps at once.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/56235
0 comments:
Post a Comment
Let us know your responses and feedback