Reachability is defined as follows: given a digraph $G = (V, E)$ and two vertices $v,w \in V$, is there a directed path from $v$ to $w$ in $G$?
Is it possible to write a polynomial time algorithm for it?
I asked this question on mathematics and got no answer by far.
Asked By : Gigili
Answered By : Artem Kaznatcheev
Although you already know from the other answers that the question is solvable in polynomial time, I thought I would expand on the computational complexity of reachability since you used complexity terminology in your question.
Reachability (or st-connectivity) in digraphs is the prototypical $NL$-complete problem where $NL$ stands for non-deterministic log space and we use deterministic log-space reductions (although I think it remains complete for $NC^1$ reductions, too).
- To see why it is in $NL$, notice that you can guess a next vertex at every step and verify that it is connected to the previous vertex. A series of correct guesses exists if and only if there is a path from $s$ to $t$.
- To see why it is $NL$-hard, notice that the behavior of a non-deterministic Turing machine can be represented by a configuration graph. The nondeterministic machine accepts if and only if there exists a path from the start configuration to an accept configuration, and if the machine only uses $S(n)$ tape entries then the configuration graph is of size $O(|\Gamma|^{S(n)})$ where $\Gamma$ is the tape alphabet. If $S(n)$ is logarithmic, then the resulting graph is of polynomial size and the result follows.
But I don't have access to a non-deterministic machine, so why should I care? Well, we know lots of things about $NL$; one of those is that it is contained in $P$, which you know from the other answers. However, here are tighter facts that can be useful:
- From Savitch's theorem we know that $NL \subseteq \text{DSPACE}((\log n)^2)$: even on a deterministic machine you don't need that much space to solve the question.
- We know that $NL \subseteq NC^2$: this means that in the circuit model, your question can be solved by a polynomial-sized circuit of depth $O((\log n)^2)$. In a more "heuristic" sense, this means that the problem is parallelizable, since Nick's Class captures the idea of quick solutions on a parallel computer.
- We know that $NL \subseteq \text{LOGCFL}$, which means that it is not harder (up to log-space reductions) than membership checking in context-free languages, which can be a good source of intuition.
Finally, the directed nature of the graph is essential. If the graph is undirected then we believe the question is significantly easier. In particular, undirected st-connectivity is complete for $L$ (deterministic log space) under first-order reductions (Reingold 2004; pdf).
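For concreteness, here is a minimal sketch (in Python, assuming the digraph is given as an adjacency-list dictionary) of the polynomial-time algorithm the other answers allude to: a plain breadth-first search from $v$, answering "yes" exactly when $w$ is reached.

```python
from collections import deque

def reachable(graph, v, w):
    """Return True iff there is a directed path from v to w.

    graph: dict mapping each vertex to an iterable of its out-neighbors.
    Runs in O(|V| + |E|) time via breadth-first search.
    """
    seen = {v}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if u == w:
            return True
        for x in graph.get(u, ()):
            if x not in seen:
                seen.add(x)
                queue.append(x)
    return False

# example: edges v -> a -> w
print(reachable({'v': ['a'], 'a': ['w']}, 'v', 'w'))  # True
```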
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/12288
From the book Computer organization and design by Patterson&Hennessy:
Parity is a function in which the output depends on the number of 1s in the input. For an even parity function, the output is 1 if the input has an even number of ones. Suppose a ROM is used to implement an even parity function with a 4-bit input. Then the contents of the ROM is $$\text{Address} \ 0 : \ 0 \\ \text{Address} \ 1: \ 1 \\ \text{Address} \ 2 : \ 0 \\ \text{Address} \ 3 : \ 1 \\ \vdots \\ \text{Address} \ 13 : \ 1 \\ \text{Address} \ 14 : \ 0 \\ \text{Address} \ 15 : \ 1$$
As per my understanding, a ROM which implements the even parity function should store 0 at both Address 1 and Address 2, 1 at Address 3, ..., 0 at both Address 13 and Address 14, and 1 at Address 15, so that Address $k$ holds the value of the function on $(k)_{\text{base}2}$.
Given this, the concept defined above is not clear to me. Can someone clarify the doubt?
Asked By : giuscri
Answered By : David Richerby
In this context, implementing something as a ROM just means a look-up table. If you want to know the parity of $x$, you put the binary coding of $x$ on the ROM's address wires and the value you read out is the value stored at that memory location within the ROM, which will be either 0 or 1.
And, yes, the contents of the ROM that you've quoted are wrong: they seem to be implementing parity in the sense that the output is 1 if, and only if, the input is an odd number, instead of implementing the even parity function.
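To make the look-up-table view concrete, here is a small Python sketch (not from the book) that builds the 16-entry ROM for the even parity function: address $k$ stores 1 exactly when $k$ has an even number of 1 bits.

```python
# Build the 16-entry ROM for the even parity function on 4-bit inputs.
rom = [1 if bin(addr).count('1') % 2 == 0 else 0 for addr in range(16)]

for addr, bit in enumerate(rom):
    print(f"Address {addr:2d} ({addr:04b}): {bit}")

# e.g. rom[0] == 1 (zero ones, even), rom[1] == 0, rom[3] == 1, rom[15] == 1
```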
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/16599
Let $$L = \{\langle M\rangle \mid M \text{ is a Turing Machine that accepts the string 1100}\}\, .$$
To prove that the language $L$ is undecidable I should reduce something to $L$, right?
I tried with the classic $A_{\mathrm{TM}}$, but I could not figure out how to do the reduction properly.
How can I prove that $L$ is undecidable?
Asked By : Rafael Castro
Answered By : David Richerby
The usual reduction from the halting problem: for example, the same reduction that shows the zero-input halting problem to be undecidable.
Suppose you can determine whether a machine $M$ accepts some magic string (such as $1100$). You can then decide whether an arbitrary machine $M$ halts when given input $x$ as follows. Produce a machine $M_x$ that ignores its input and simulates $M$ running on input $x$. If $M$ halts, $M_x$ accepts whatever input it was given; otherwise, $M$ doesn't halt so nor does the simulation $M_x$. So $M_x$ accepts the magic string if, and only if, $M$ halts on input $x$. Hence, if you could decide whether a machine accepts the magic string, you could decide the halting problem.
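To make the construction a bit more tangible, here is a rough Python sketch of the map $(M, x) \mapsto M_x$, where "machines" are modeled simply as Python functions; this is only an illustration of the idea, not an actual encoding of Turing machines.

```python
def build_M_x(M, x):
    """Given a 'machine' M (modeled here as a Python function) and an input x,
    return a machine M_x that ignores its own input, simulates M on x,
    and accepts everything exactly when that simulation halts."""
    def M_x(own_input):
        M(x)                 # diverges iff M does not halt on x
        return "accept"      # reached only if M halts on x
    return M_x

# toy usage: a 'machine' that always halts
M = lambda inp: None
M_x = build_M_x(M, "some input")
print(M_x("1100"))   # accept, because M halts on "some input"
```

So $M_x$ accepts the magic string $1100$ exactly when $M$ halts on $x$, which is the property the reduction needs.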
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/32095
In Algorithmic Randomness and Complexity from Downey and Hirschfeldt, it is stated on page 129 that
$\qquad \displaystyle \sum_{K(\sigma)\downarrow} 2^{-K(\sigma)} \leq 1$,
where $K(\sigma)\downarrow$ means that $K$ halts on $\sigma$, $\sigma$ being a binary string. $K$ denotes the prefix-free Kolmogorov complexity.
When does $K$ halt? I think it only halts on a finite number of inputs, since the classical proof of the non-computability of Kolmogorov complexity gives an upper bound on the domain of $K$. But then, the finite set of inputs on which $K$ halts can be chosen arbitrarily (one just needs to store the finite number of complexities in the source code).
So is this sum well-defined? In other words, is the domain of $K$ well defined?
Asked By : pintoch
Answered By : Raphael
I think you are right; $K$ is a specific function which cannot be computed. The author likely means to use some (arbitrary) approximative implementation; so no, this does not seem to be well-defined if you are pedantic. You can also call it abuse of notation.
Consider this instead:
$\qquad \displaystyle \forall {M \in \mathcal{M}_K}.\ \sum_{M(\sigma)\downarrow} 2^{-K(\sigma)} \leq 1$
with $\mathcal{M}_K = \{M\ \mathrm{TM} \mid M(\sigma)\!\downarrow\ \implies\ M(\sigma)=K(\sigma) \}$ the set of all Turing machines that compute subfunctions of $K$.
In essence, this means: the bound holds no matter for which strings your implementation can compute the Kolmogorov complexity.
As Carl notes in the comments, it is plausible that the notation has nothing to do with halting or computing, as $K$ is not computable. Read $\sum_{K(\sigma)\!\downarrow}$ as sum ranging over the domain of $K$.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/2614
I am looking at an example Turing machine in my textbook, Automata and Computability by Dexter C. Kozen, and I'm confused as to how they determine the number of states this particular machine has. Here is the example:
"Here is a TM that accepts the non-context free set $\{a^nb^nc^n \mid > n\geq 0\}$. Informally, the machine starts in its start state s, then scans to the right over the input string, checking that it is of the form $a^* b^* c^*$. It doesn't write anything on the way across (formally, it writes the same symbol it reads). When it sees the first blank symbol _, it overwrites it with a right endmarker ]. Now it scans left, erasing the first c it sees, then the first b it sees, then the first a it sees, until it comes to the [. It then scans right, erasing one a, one b, and one c. It continues to sweep left and right over the input, erasing one occurrence of each letter in each pass. If on some pass it sees at least one occurrence of one of the letters and and no occurrences of another, it rejects. Otherwise, it eventually erases all the letters and makes one pass between [ and ] seeing only blanks, at which point it accepts.
Formally, this machine has $Q = \{s, q_1, \dots, q_{10}, q_a, q_r\}$, $\Sigma = \{a, b, c\}$, $\Gamma = \Sigma \cup \{[, \_, ]\}$" (page 211, Example 28.1)
Are they simply creating states based on their informal definition? Or is there some methodology they are implementing that determines the number of states? If there is some sort of methodology, is it a general methodology that can be applied to other Turing machines? Any help regarding this would be greatly appreciated.
Asked By : tdark
Answered By : babou
I am afraid you cannot tell the number of states from an informal description. There may be states needed to handle trivial details or limit cases that are not obvious from that description. So the answer is usually: as many states as needed to actually implement the behavior described informally.
Better precision may be required when analyzing some complexity issues, though even then it may be up to some constant.
For simpler problems, people may give a detailed description of the TM where you can count the states. This may also happen when you study techniques for combining 2 TMs in some way. Then you may want a formula that gives you (a bound for) the number of states of the combination from the number of states of the 2 initial TMs (and possibly some other parameters of the two TMs).
But describing a TM in detail is like programming in machine language: very tedious and highly prone to errors and bugs of all kinds. And usually you do not have a testing environment.
If there is a methodology, it is probably in combining TM that can perform some specific tasks. It is like library functions in programming languages, which implement specific algorithms that you combine to solve a problem.
So the informal description may also, to some extent, be understood as a high-level language that does not yet have a compiler. So you do not really know what machine code you will get. But do not take this last sentence too seriously.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/32145
While trying to improve the performance of my collision detection class, I found that ~80% of the time spent on the GPU was spent on if/else conditions just trying to figure out the bounds for the buckets it should loop through.
More precisely:
Each thread gets an ID; by that ID it fetches its triangle from memory (3 integers each) and by those 3 it fetches its vertices (3 floats each).
Then it transforms the vertices into integer grid points (currently 8x8x8) and transforms them into the triangle's bounds on that grid.
To transform the 3 points into bounds, it finds the min/max of each dimension among the points.
Since the programming language I am using is missing a minmax intrinsic, I made one myself; it looks like this:
procedure MinMax(a, b, c):
    local min, max
    if a > b:
        max = a
        min = b
    else:
        max = b
        min = a
    if c > max:
        max = c
    else:
        if c < min:
            min = c
    return (min, max)
So on average it should be 2.5 * 3 * 3 = 22.5 comparisons, which ends up eating way more time than the actual triangle-edge intersection tests (around 100 * 11-50 instructions).
In fact, I found that pre-calculating the required buckets on the cpu (single threaded, no vectorization), stacking them in a gpu view along with bucket definition and making the gpu do ~4 extra reads per thread was 6 times faster than trying to figure out the bounds on the spot. (note that they get recalculated before every execution since I'm dealing with dynamic meshes)
So why is the comparison so horrendously slow on a gpu?
Asked By : user29075
Answered By : Wandering Logic
GPUs are SIMD architectures. In SIMD architectures every instruction needs to be executed for every element that you process. (There's an exception to this rule, but it rarely helps).
So in your MinMax routine, not only does every call need to fetch all three branch instructions (even if on average only 2.5 are evaluated), but every assignment statement takes up a cycle as well (even if it doesn't actually get "executed").
This problem is sometimes called thread divergence. If your machine has something like 32 SIMD execution lanes, it will still have only a single fetch unit. (Here the term "thread" basically means "SIMD execution lane".) So internally each SIMD execution lane has a "I'm enabled/disabled" bit, and the branches actually just manipulate that bit. (The exception is that at the point where every SIMD lane becomes disabled, the fetch unit will generally jump directly to the "else" clause.)
So in your code, every SIMD execution lane is doing:
compare (a > b)
assign (max = a if a>b)
assign (min = b if a>b)
assign (max = b if not(a>b))
assign (min = a if not(a>b))
compare (c > max)
assign (max = c if c>max)
compare (c < min if not(c>max))
assign (min = c if not(c>max) and c<min)
It may be the case that on some GPUs this conversion of conditionals to predication is slower if the GPU is doing it itself. As pointed out by @PaulA.Clayton, if your programming language and architecture has a predicated conditional move operation (especially one of the form if (c) x = y else x = z) you might be able to do better. (But probably not much better.)
Also, placing the c < min conditional inside the else of c > max is unnecessary. It certainly isn't saving you anything, and (given that the GPU has to automatically convert it to predication) may actually be hurting to have it nested inside two different conditionals.
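For comparison, here is a branch-free way to express the same computation, sketched in Python with the built-in min/max; on a GPU the corresponding min/max instructions or predicated selects would avoid the divergence discussed above.

```python
def min_max(a, b, c):
    """Branch-free min/max of three values using min/max primitives
    instead of the nested if/else that causes SIMD divergence."""
    lo = min(min(a, b), c)
    hi = max(max(a, b), c)
    return lo, hi

print(min_max(3.0, 1.0, 2.0))  # (1.0, 3.0)
```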
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/39871
I want to try to prove that the English language is not regular. The alphabet is the set of all words in the English dictionary.
Looking at sentences, I was able to use these patterns:
$\{\,[(\text{determiner})\ (\text{noun})]^n\ (\text{verb})^n \mid n \geq 1\,\} \subseteq$ the English language,
and $\{\,[(\text{determiner})\ (\text{noun})]^m\ (\text{verb})^n \mid m \neq n\,\}$ intersected with the English language.
A determiner is anything like "the", "my", etc.; a noun is any noun and a verb is any verb.
All three of these (determiner, noun, and verb) are finite and disjoint. Would something like the pumping lemma be a good way to approach this?
Asked By : user3295674
Answered By : Shreesh
It is not easy to prove the complexities of natural languages. See Complexity of natural languages for some hints that the English language is not regular.
In fact, in Evidence Against the Context-Freeness of Natural Language, Stuart M. Shieber shows that many natural languages are not even context-free.
However, I only know about Dutch and a dialect of Swiss German for which proofs of non-context-freeness exist. And the proofs are quite old; the results were proved in the 1980s.
For detail you can also refer - The Handbook of Computational Linguistics and Natural Language Processing edited by Alexander Clark, Chris Fox, Shalom Lappin.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/53165
I am aware of using Bellman-Ford on a graph $G=(V,E)$ with no negative cycles to find the single-source single-destination shortest paths from source $s$ to target $t$ (both in $V$) using at most $k$ edges. Assuming we have no negative edge weights at all, can we use Dijkstra's algorithm for the same?
My thoughts/algorithm: I was wondering if, instead of having a $dist[u]$ array (storing the best known distance from $s$ to $u$), we could use a $dist[u][k]$ table to store the best known distance from $s$ to $u$ using at most $k$ edges (dynamic programming, maybe?), and similarly have the priority queue with $(u,n)$ tuples as keys. We can then terminate the algorithm when the tuple popped off the priority queue is $(t,n)$ where $t$ is the target destination and $n \le k$.
Asked By : Mathguy
Answered By : D.W.
If the graph has no negative edges, the problem can be solved in $O(k \cdot (|V|+|E|) \lg |E|)$ time using Dijkstra's algorithm combined with a product construction. We construct a new graph $G'=(V',E')$ with vertex set $V' = V \times \{0,1,2,\dots,k\}$ and edge set
$$E' = \{((v,i), (w,i+1)) : (v, w) \in E\}.$$
In other words, for each edge $v \to w$ in $G$, we have edge $(v,i) \to (w,i+1)$ for all $i$ in $G'$.
Now use Dijkstra's algorithm to find the shortest path in $G'$ from $(s,0)$ to a vertex of the form $(t,i)$ where $i \le k$. This will be the shortest path in $G$ that uses at most $k$ edges.
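Here is a small Python sketch of this product construction, folding the layer index directly into Dijkstra's state instead of materializing $G'$; the graph encoding and names are my own choices.

```python
import heapq

def shortest_path_at_most_k_edges(graph, s, t, k):
    """Dijkstra on the layered product graph G' = V x {0..k}.

    graph: dict mapping vertex -> list of (neighbor, weight), weights >= 0.
    Returns the length of a shortest s->t path using at most k edges,
    or None if no such path exists.
    """
    INF = float('inf')
    dist = {(s, 0): 0}
    pq = [(0, s, 0)]                      # (distance, vertex, edges used)
    while pq:
        d, v, i = heapq.heappop(pq)
        if d > dist.get((v, i), INF):
            continue                      # stale queue entry
        if v == t:
            return d                      # first pop of t is optimal over all layers <= k
        if i == k:
            continue                      # cannot use more edges
        for w, weight in graph.get(v, []):
            nd = d + weight
            if nd < dist.get((w, i + 1), INF):
                dist[(w, i + 1)] = nd
                heapq.heappush(pq, (nd, w, i + 1))
    return None

# tiny usage example (hypothetical graph)
G = {'s': [('a', 1), ('t', 10)], 'a': [('t', 1)]}
print(shortest_path_at_most_k_edges(G, 's', 't', 1))  # 10 (direct edge only)
print(shortest_path_at_most_k_edges(G, 's', 't', 2))  # 2  (s -> a -> t)
```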
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/53192
I'm reading The Art of Multiprocessor Programming and am currently trying to understand Chapter 4 — The Foundations of Shared Memory.
In section 4.2 it is shown how to build a multi-reader, single-writer (MRSW) safe Boolean register from a single-reader, single-writer (SRSW) safe Boolean register. So, the only difference is that the new register supports multiple readers, but the consistency guarantee (safe) is the same.
The implementation (figure 4.6) uses an array of SRSW safe Boolean registers, with one element for each reader. What I don't understand is why can't we treat a SRSW safe Boolean register as a MRSW safe Boolean register, already?
The definition for the safe consistency level says that a register is safe if:
- A read call that does not overlap a write call must return the value written by the most recent write call.
- A read call that overlaps a write may return any value within the register's allowed range of values (in the case of a Boolean register, just 0 and 1).
It seems to me that a SRSW safe Boolean register can already be treated like a MRSW safe Boolean register because any number of reads concurrent with a write will return either 0 or 1. Non-overlapping reads are obviously fine.
Is there a nuance I'm missing here or is it just a self-imposed limitation that a SRSW safe Boolean register can't be used with multiple readers, just by the definition? I understand this whole register derivation tower is pretty artificial and used mostly for theoretical considerations, so I'm thinking such an artificial limitation is very possible.
Asked By : Ionuț G. Stan
Answered By : Tom van der Zanden
You do not know anything about the behavior of the register if read calls overlap.
Perhaps the register is a sensitive electrical component and overlapping reads will blow it up?
Perhaps reading (counter-intuitively) requires changing the state of the register, for instance writing to the target memory location into which the value must be read? If two reads overlap, then it might end up writing the value to the wrong location, or to only one of the two locations.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/44527
I have two sets: $B$, which is recursively enumerable but not recursive, and $A$, which is recursive. Is $A-B$ recursive and/or recursively enumerable? What about $B-A$?
$B-A$ is obviously recursively enumerable (to generate its elements, I can generate $B$'s elements and keep those that are not in $A$).
If A is the empty set or $A \cap B$ is the empty set, it's easy. Otherwise, I believe $B-A$ is not recursive (I can't tell if a number is in B, since B is not recursive) and $A-B$ is not recursively enumerable (I can generate A's elements, but I can't check if they are in B), so it's not recursive either.
Am I wrong? How can I actually (and formally) prove any of those?
Asked By : nowembery
Answered By : babou
The empty set is recursive, hence B cannot be the empty set.
You are right that $B-A$ is recursively enumerable.
You cannot prove B-A is not recursive for a good reason: it may be recursive. Maybe you can imagine an example of A recursive, B not recursive, and B-A recursive. Hint: it is not even necessary that B be recursively enumerable, or A recursive.
Proof that $B-A$ can have properties more specific than recursive enumerability (i.e. compatible with being recursively enumerable), independently of the properties of $B$.
Consider 2 disjoint alphabets $\Sigma$ and $\Sigma'$, such that $\Sigma\cap\Sigma'=\emptyset$. Take $C=\Sigma'\,^*$ and $A=\Sigma^*$. Let $E$ be a non-recursive language in $\Sigma^*$. Let $B=C\cup E$. Then $B$ is a non-recursive language on the alphabet $\Sigma\cup\Sigma'$, because $E$ is not recursive while $C$ is recursive (trivially): a decider for $B$ would yield a decider for $E$. Now you can see that $B-A=C$ since subtracting $A$ removes precisely $E$, i.e. all words in $B$ that are in $\Sigma^*$. So you have $B$ non-recursive, and both $A$ and $B-A$ recursive. Here I chose $A$ recursive. But I could have chosen any non-recursive language on $\Sigma^*$ that contains $E$.
Similarly, you cannot prove anything about A-B. Think why.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/33975
I'm trying to figure out if there is a proper or commonly accepted name for this particular function (f).
float clamp(float min, float max, float v) {
    if (v < min)
        return min;
    else if (v > max)
        return max;
    else
        return v;
}

float f(float min, float max, float v) {
    float t = (v - min) / (max - min);
    return clamp(0.0, 1.0, t);
}
Asked By : Colin Basnett
Answered By : Big Al
This is sometimes called "0-1 normalization" or "feature scaling".
http://en.wikipedia.org/wiki/Feature_scaling
http://en.wikipedia.org/wiki/Normalization_(statistics)
The entire function f() is a Piecewise function, which consists of 3 domains, defined separately for
- v <= min
- v is between min and max, and
- v >= max.
The primary function is the normalization function when v is between min and max. The other two functions just limit the range or codomain of the function.
Piecewise functions are used often in mathematics to make the whole function continuous over a larger domain or to handle special cases like divide by zero, which the original poster might want to handle when min == max.
And what does the function do?
Creates a function f(x) where f(min) returns 0.0, f(max) returns 1.0, and f(v) rescales the input v from 0.0 to 1.0 linearly. Anything less than min maps to 0.0; anything more than max maps to 1.0.
One application of this is to calculate the percentile rank of a data sample (but scaled from 0.0 to 1.0 instead of 0 to 100).
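For illustration, here is the same function in a few lines of Python (the name rescale is just my choice), showing the clamped 0-1 normalization behaviour described above.

```python
def rescale(v, lo, hi):
    """Linearly map [lo, hi] to [0, 1], clamping values outside the range."""
    t = (v - lo) / (hi - lo)          # note: undefined when lo == hi
    return max(0.0, min(1.0, t))

print(rescale(7.0, 5.0, 10.0))   # 0.4
print(rescale(4.0, 5.0, 10.0))   # 0.0 (below min)
print(rescale(12.0, 5.0, 10.0))  # 1.0 (above max)
```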
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/24776
I have designed a classifier M which recognizes gestures and always classifies them under some category. A gesture is classified based on the Hamming distance between the sample time series y and the training time series x. The results of the classifier are probabilistic values. There are 3 classes/categories with labels A, B, C which classify hand gestures, with 100 samples for each class to be classified (single feature and data length = 100). The data are different time series (x coordinate vs time). The training set is used to assign probabilities indicating which gesture has occurred how many times. So, out of 10 training samples, if gesture A appeared 6 times then the probability that a gesture falls under category A is
P(A) = 0.6, and similarly P(B) = 0.3 and P(C) = 0.1.
Now, I am trying to compare the performance of this classifier with Bayes classifier, K-NN, Principal component analysis (PCA) and Neural Network.
- On what basis, parameters, and method should I do it if I consider ROC or cross-validation? Since the features for my classifier are the probabilistic values for the ROC plot, what should the features be for k-NN, Bayes classification and PCA?
- Is there code for this that would be useful?
- What should the value of k be if there are 3 classes of gestures?
Please help. I am in a fix.
Asked By : user1214586
Answered By : Suresh
There isn't an easy way to do this. In fact a recent paper by Charles Parker explains some of the problems with using ROC curve measurements (there's a friendly overview at my blog - self promotion alert!!). His paper does make some recommendations on the best ways to compare different classifiers, so you will find something useful there.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/878
A Universal Turing Machine can be boiled down to two components: an infinite tape and an action table, a finite state machine that moves the read/write head along the tape and writes to it depending on the input provided by the tape.
From this point of view cells have some properties very similar to UTMs: the DNA is in essence a tape of instructions that can be read and written to. The rest of the cell behaves similarly to the action table, defining rules that guide which part of the DNA is read and when it happens, a moving "head" along the DNA "tape".
Subquestion 1: DNA can be used for computation. Can an entire cell be used for similar purposes?
Subquestion 2: If every living organism contains at least one UTM, is it possible that all organisms are in some sense Turing complete?
Asked By : user1561358
Answered By : Raphael
Every living organism has -- to our knowledge -- only a finite amount of resources available. So no, they can not be Turing-complete.
That said, there is quite a number of bio-inspired models of computation that can be studied formally. Sticker systems [1], for instance -- an abstraction of recombining DNA fragments -- can be shown to reach Turing-power when we assume infinite resources. Splicing systems [2] are another example.
I am not aware of any abstractions of entire cells. To my knowledge, we do not have a full understanding of how cells work so that is out of reach -- today.
Note: Turing-completeness requires programmability, that is, universality. Can you elicit any response from a cell by applying suitable stimuli? I wouldn't think so. In my opinion, any given cell is more likely to correspond to a specific program with a specific purpose.
- Sticker Systems by G. Păun and G. Rozenberg (1998).
- Formal language theory and DNA: An analysis of the generative capacity of specific recombinant behaviors by T. Head (1987).
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/55426
Let's say I am building a time clock app. The employee can log into the interface and then enter time for multiple projects so that we can generate reports for billing as well as payroll.
A table in the database will record the time-in and time-out for each project as a row. A user should be able to sign out and back into the same project multiple times throughout the day, and also can enter time out of order (e.g. sign into Project X at 9:00 AM to 2:00 PM, then into Project Y from 7:00 AM to 8:30 AM).
So let's say my table looks like this:
Project   | Time In | Time Out
----------|---------|---------
Project X | 9.25    | 14.25
Project Y | 7.00    | 8.50
(I am representing the data as a decimal number of hours simply because it's easy to read in the example; the data is actually MySQL TIMESTAMPs.)
Assuming that the data is not in any particular order, I am wondering what would be the simplest and/or most optimized way to validate that the next entry does not conflict with the existing time in such a way that the employee is signed into two projects simultaneously.
e.g. If the user tries to sign in at 12:00 PM or at 7:30 AM, it throws an error because the user was already working on Project X at noon and Project Y at 7:30.
I know I could do this by getting all the time clock data and then detecting if time_in <= input[time] <= time_out, but I'm wondering if there's a cleaner method to accomplish this task.
Asked By : M Miller
Answered By : Alexey Birukov
select * from table where end_time > 12.00PM order by end_time
fetch single row, and check if start_time < 12.00PM.
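In case it helps, here is a small Python sketch of the underlying interval check: a proposed sign-in time conflicts exactly when it falls strictly inside some existing (time_in, time_out) row. The representation as decimal hours mirrors the example table and is only illustrative.

```python
def conflicts(rows, new_time):
    """Return True iff new_time falls strictly inside any existing
    (time_in, time_out) interval. rows is a list of (time_in, time_out)
    pairs expressed as decimal hours, as in the example table."""
    return any(time_in < new_time < time_out for time_in, time_out in rows)

rows = [(9.25, 14.25),   # Project X, 9:15 AM - 2:15 PM
        (7.00, 8.50)]    # Project Y, 7:00 AM - 8:30 AM

print(conflicts(rows, 12.0))  # True  (inside Project X)
print(conflicts(rows, 7.5))   # True  (inside Project Y)
print(conflicts(rows, 8.75))  # False (between the two intervals)
```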
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/30492
Suppose I have an alphabet of n symbols. I can efficiently encode them with $\lceil \log_2 n\rceil$-bit strings. For instance, if n=8:
A: 0 0 0
B: 0 0 1
C: 0 1 0
D: 0 1 1
E: 1 0 0
F: 1 0 1
G: 1 1 0
H: 1 1 1
Now I have the additional constraint that each column must contain at most p bits set to 1. For instance for p=2 (and n=8), a possible solution is:
A: 0 0 0 0 0
B: 0 0 0 0 1
C: 0 0 1 0 0
D: 0 0 1 1 0
E: 0 1 0 0 0
F: 0 1 0 1 0
G: 1 0 0 0 0
H: 1 0 0 0 1
Given n and p, does an algorithm exist to find an optimal encoding (shortest length) ? (and can it be proved that it computes an optimal solution?)
EDIT
Two approaches have been proposed so far to estimate a lower bound on the number of bits $m$. The goal of this section is to provide an analysis and a comparison of the two answers, in order to explain the choice for the best answer.
Yuval's approach is based on entropy and provides a very nice lower bound: $\frac{\log n}{h(p/n)}$ where $h(x) = -x\log x - (1-x)\log(1-x)$ is the binary entropy function.
Alex's approach is based on combinatorics. If we develop his reasoning a bit more, it is also possible to compute a very good lower bound:
Given $m$, the number of bits, $\geq\lceil \log_2(n)\rceil$, there exists a unique $k$ such that $$ 1+\binom{m}{1} + ... +\binom{m}{k} \lt n \leq 1+\binom{m}{1} + ... + \binom{m}{k}+\binom{m}{k+1}$$ One can convince oneself that an optimal solution will use the codeword with all bits low, then the codewords with 1 bit high, 2 bits high, ..., k bits high. For the $n-1-\binom{m}{1}-...-\binom{m}{k}$ remaining symbols to encode, it is not clear at all which codewords it is optimal to use but, for sure, the weights $w_i$ of each column will be bigger than they would be if we could use only codewords with $k+1$ bits high and have $|w_i - w_j| \leq 1$ for all $i, j$. Therefore one can lower bound $p=\max(w_i)$ with $$p_m = 0 + 1 + \binom{m-1}{2} +... + \binom{m-1}{k-1} + \lceil \frac{(n-1-\binom{m}{1}-...-\binom{m}{k}) (k+1)}{m} \rceil$$
Now, given $n$ and $p$, try to estimate $m$. We know that $p_m \leq p$, so if $p \lt p_{m'}$, then $m' \lt m$. This gives the lower bound for $m$: first compute the $p_m$, then find the biggest $m'$ such that $p \lt p_{m'}$.
Here is what we obtain if we plot, for $n=1000$, the two lower bounds together, the lower bound based on entropy in green, the one based on the combinatorial reasoning above in blue:
Both look very similar. However, if we plot the difference between the two lower bounds, it is clear that the lower bound based on the combinatorial reasoning is better overall, especially for small values of $p$.
I believe that the problem comes from the fact that the inequality $H(X) \leq \sum H(X_i)$ is weaker when $p$ gets smaller, because the individual coordinates become correlated with small $p$. However this is still a very good lower bound when $p=\Omega(n)$.
Here is the script (python3) that was used to compute the lower bounds:
from scipy.misc import comb
from math import log, ceil, floor
from matplotlib.pyplot import plot, show, legend, xlabel, ylabel

# compute p_m
def lowerp(n, m):
    acc = 1
    k = 0
    while acc + comb(m, k+1) < n:
        acc += comb(m, k+1)
        k += 1
    pm = 0
    for i in range(k):
        pm += comb(m-1, i)
    return pm + ceil((n-acc)*(k+1)/m)

if __name__ == '__main__':
    n = 100
    # compute lower bound based on combinatorics
    pm = [lowerp(n, m) for m in range(ceil(log(n)/log(2)), n)]
    mp = []
    p = 1
    i = len(pm) - 1
    while i >= 0:
        while i >= 0 and pm[i] <= p:
            i -= 1
        mp.append(i + ceil(log(n)/log(2)))
        p += 1
    plot(range(1, p), mp)
    # compute lower bound based on entropy
    lb = [ceil(log(n)/(p/n*log(n/p) + (n-p)/n*log(n/(n-p)))) for p in range(1, p)]
    plot(range(1, p), lb)
    xlabel('p')
    ylabel('m')
    show()
    # plot diff
    plot(range(1, p), [a - b for a, b in zip(mp, lb)])
    xlabel('p')
    ylabel('m')
    show()
Asked By : user3017842
Answered By : Alex Meiburg
There is an additional lower bound we can build, that will address cases like what @user3017842 mentioned in their comment on Yuval's answer. (Cases where $p$ is particularly small.) Suppose we knew $m$ already: Then we have $pm$ bits high total across all codewords. Since we're interested in the cases where $p$ is small, we view these high bits as our limiting resource, and want to build a code with it (and see how many codewords we can possibly get out). We can have 1 codeword with all 0s, then $m$ codewords with a single 1, then $m \choose 2$ with two 1s, etc. If we call the highest number of bits in a codeword $k$, then $$pm = 0\cdot 1 + 1\cdot m + 2\cdot {m \choose 2}+... \le \sum_{i=0}^k i{m \choose i}$$ While our number of codewords $n$ is similarly bounded by $$n \le \sum_{i=0}^k {m \choose i}$$ If we look at the case where $p \le m$, then $k \le 2$ is already implied by the first inequality ($pm = m^2 = m + 2{m \choose 2}$). So then the code would consist of the $0$-word, $m$ single-$1$-words, and $(p-1)m/2$ two-$1$-words. Thus $$ n \le 1 + m + (p-1)m/2 $$ or inverting $$ m \ge \frac{2(n-1)}{p+1} .$$ This will yield the tight lower bound of $m\ge 5$ on the example you provide, but as mentioned before, will probably only be very useful while $p \approx m$ (or $p \approx \sqrt n$).
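As a quick sanity check (a couple of lines of Python of my own, combining the bound above with the trivial $\lceil \log_2 n\rceil$ bound), the example $n=8$, $p=2$ indeed gives $m \ge 5$:

```python
from math import ceil, log2

def lower_bound_m(n, p):
    """Combine the information-theoretic bound ceil(log2 n) with the
    counting bound 2(n-1)/(p+1) derived above."""
    return max(ceil(log2(n)), ceil(2 * (n - 1) / (p + 1)))

print(lower_bound_m(8, 2))   # 5, matching the 5-bit code in the question
```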
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/45783
How would I rewrite an XOR gate into the three basic logic gates (AND, OR, NOT). To be more specific, I have to write it in such a way with 2 NOT gates, 2 OR gates, and 1 AND gate. I also have to do it with 1 OR gate, 2 AND gates, and 1 NOT gate.
I'm not looking for just the answer, I'm looking for a way to come up with the answer.
Thanks!
Asked By : 4everPixelated
Answered By : Pål GD
Hint: $a \oplus b = \neg \big( (a \land b) \lor (\neg a \land \neg b)\big)$ (you can't have both true and you can't have both false). Using De Morgan's, you should be able to break up the negation and the main $\lor$.
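A quick brute-force check of the hinted identity (a throwaway Python sketch over the four input combinations):

```python
for a in (False, True):
    for b in (False, True):
        xor = a != b
        formula = not ((a and b) or ((not a) and (not b)))
        assert xor == formula            # the identity holds on all inputs
        print(a, b, xor)
```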
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/44019
I don't recall hearing that nodes in red-black trees can't have one nil child and one non-nil child. However, I did hear that red-black trees have a worst-case height of $2\log_2(n + 1)$, where $n$ is the number of nodes, and I don't see how this could be the case if nodes can have one nil and one non-nil child, as a tree could be constructed that is simply a straight line/linked list, which would have a height of $n$.
Asked By : Kelmikra
Answered By : hengxin
Can nodes in red-black trees have one nil child and one non-nil child?
Yes. A red-black tree is a special case of binary search tree: each node is colored red or black. In a red-black tree, if a child does not exist, the corresponding pointer field of that node contains the value NIL. NILs are treated as leaves.
red-black trees have a worst-case height of $2 \log_2 (n+1)$
The fact that the height is of $O(\lg n)$ is guaranteed by the red/black coloring rules (copied from CLRS):
- A node is either red or black.
- The root is black.
- All leaves (NIL) are black.
- If a node is red, then both its children are black.
- For each node, all paths from the node to descendant leaves contain the same number of black nodes.
These rules, especially rules 4 and 5, ensure that the tree is approximately balanced: no path is more than twice as long as any other. In terms of tree height, we have
A red-black tree with $n$ internal nodes has height at most $2 \lg (n+1)$.
Finally,
how this could be the case if nodes can have one nil and one non-nil child, as a tree could be constructed that is simply a straight line/linked list, which would have a height of $n$.
By the red/black coloring rules, a non-trivial (with enough number of nodes) red-black tree cannot be a straight line/linked list.
For more details, you may want to read the book "Introduction to Algorithms" by CLRS.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/40583
Paxos is more powerful, and in the well-known paper "Consensus on Transaction Commit" (http://research.microsoft.com/pubs/64636/tr-2003-96.pdf), Jim Gray and Leslie Lamport describe 2PC as a special case of Paxos.
Why do relational databases use 2PC in the real world? Also, 2PC is not fault-tolerant because it uses a single coordinator whose failure can cause the protocol to block.
Asked By : Nitish Upreti
Answered By : Andrei
One of the reasons is the message complexity. For N nodes, 2PC requires 3N messages to be exchanged whereas Paxos requires 4N. Also, Paxos adds sequence numbers to each message, which adds a significant overhead to the overall execution.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/33719
Suppose I have a queue where I pull from left and push to the right, and suppose I have the contents in the queue as $a b c @ d e$ (from left to right, left is head, right is tail).
Is there a simple algorithm that doesn't require extra structures and puts $e$ at the head, i.e. gets us to the queue $eabc@d$?
P.S.: I need an algorithm like that for the purpose of a queue automaton.
Asked By : TheNotMe
Answered By : Gari BN
If you can only push (enqueue) or pull (dequeue) from the queue, then your only option is to pull all the elements and re-enter them.
If you need such an operation, you can use a deque (double-ended queue). See: http://en.wikipedia.org/wiki/Double-ended_queue .
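For the single-queue case, "pull all the elements and re-enter them" amounts to rotating the queue $n-1$ times; here is a small Python sketch, using collections.deque but restricting it to enqueue (append) and dequeue (popleft):

```python
from collections import deque

def bring_tail_to_head(q):
    """Move the last element of the queue to the front using only
    enqueue (append) and dequeue (popleft): rotate n-1 times."""
    for _ in range(len(q) - 1):
        q.append(q.popleft())
    return q

q = deque(['a', 'b', 'c', '@', 'd', 'e'])
print(''.join(bring_tail_to_head(q)))   # eabc@d
```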
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/19452
A drink dispenser requires the user to insert a coin ($\bar c$), then press one of three buttons: $\bar d_{\text{tea}}$ requests a cup of tea $e_{\text{tea}}$, ditto for coffee, and $\bar r$ requests a refund (i.e. the machine gives back the coin: $\bar b$). This dispenser can be modeled by the following CCS process:
$$ M \stackrel{\mathrm{def}}= c.(d_{\text{tea}}.\bar e_{\text{tea}}.M + d_{\text{coffee}}.\bar e_{\text{coffee}}.M + r.\bar b.M)$$
A civil war raises the price of coffee to two coins, while the price of tea remains one coin. We want a modified machine that delivers coffee only after two coins, and acquiesces to a refund after either one or two coins. How can we model the modified machine with a CCS process?
Asked By : Gilles
Answered By : jmad
You can easily profit from warfare that way:
$$ M \stackrel{\mathrm{def}} = c.( d_{\text{tea}}.\bar e_{\text{tea}}.M + r.\bar b.M + c.( d_{\text{coffee}}.\bar e_{\text{coffee}}.M + r.\bar b.\bar b.M ) ) $$
note that you have to press refund to get a tea if you put too many coins. If you don't want that, you can adapt it (or maybe set up a (finite is enough) counter) :
$$ M \stackrel{\mathrm{def}} = c.( d_{\text{tea}}.\bar e_{\text{tea}}.M + r.\bar b.M + c.( d_{\text{coffee}}.\bar e_{\text{coffee}}.M + d_{\text{tea}}.\bar b.\bar e_{\text{tea}}.M + r.\bar b.\bar b.M ) ) $$
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/444
I see that Σ* is claimed to be decidable in many documents, but I have never seen an example or easy demonstration that it is decidable.
What is the proof that Σ* is decidable?
Asked By : Charles
Answered By : Andrej Bauer
Theorem: The set $\Sigma^{*}$ of all words is decidable.
Proof. According to the definition of decidability, we must provide a computable function $d$ which takes a word $w$ and outputs $1$ if $w \in \Sigma^{*}$, and outputs $0$ if $w \not\in \Sigma^{*}$. Such a function is very easily constructed: it is $$d(w) = 1.$$ That is, because every word is in $\Sigma^{*}$, the decision function always says "yes". QED.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/32726
I have as an assignment question to show that $QuadSat=\{\langle\phi\rangle\mid\phi$ is a satisfiable 3CNF formula with at least 4 satisfying assignments$\}$ is $\sf NP$-Complete.
My solution is as follows, which is pretty much copied almost 100% from a textbook example with only an extra requirement for satisfiability at the end...
$$QuadSat\leq_{p} Clique$$ Let $\phi$ be a formula with $k$ clauses such as $$\phi=\bigwedge_{1}^{k}(a_k\vee b_k\vee c_k)$$ The reduction $f$ generates the string $\langle G,k\rangle$, where $G$ is an undirected graph defined as follows:
The nodes in $G$ are organized into $k$ groups of three nodes each, called the triples, $t_1, \dots, t_k$. Each triple corresponds to one of the clauses in $\phi$, and each node in a triple corresponds to a literal in the associated clause. Label each node of $G$ with its corresponding literal in $\phi$.
The edges of $G$ connect all but two types of pairs of nodes in $G$: No edge is present between nodes in the same triple, and no edge is present between two nodes with contradictory labels. $QuadSat$ is satisfiable if and only if the resulting graph $G$ contains four or more $k$-$cliques$. Each unique $k$-$clique$ in $G$ represents a set of satisfying assignments to $QuadSat$.
The reduction runs in polynomial time, because the construction of the graph is a polynomial function; one pass through all the triples to create all the vertices for $V$, and one pass through the same triples to create the edges.
I feel like my explanation as to why my reduction is polynomial in time is severely weak, possibly bordering on wrong. How can I explain this better?
And something else: I think this only proves that QuadSat is in NP, but not necessarily NP Complete. How can I prove this?
Asked By : agent154
Answered By : Shaull
Your reduction indeed shows that $QuadSAT\in NP$, since you showed a reduction to a problem in NP. Perhaps an easier way to have done this, is to simply show that there is a "short witness" for the membership of a formula in $QuadSAT$. In order to show NP-hardness, you must show a reduction from an NP-hard problem. A good candidate for that is $SAT$ (or $3SAT$).
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/9806
I have the following Bayesian Network and need help with answering the following query.
EDITED:
Here are my solutions to questions a and b:
a)
P(A,B,C,D,E) = P(A) * P(B) * P(C | A, B) * P(D | E) * P(E | C)
b)
P(a, ¬b, c, ¬d, e) = P(a) * P(¬b) * P(c | a, b) * P(¬d | ¬b) * P(e | c) = 0.02 * 0.99 * 0.5 * 0.99 * 0.88 = 0.0086
c)
P(e | a, c, ¬b)
This is my attempt:
$$\alpha \times \sum_d P(a, \neg b, c, D = d, e) = \alpha \times \big[\, P(a) \cdot P(\neg b) \cdot P(c \mid a, b) \cdot P(d) \cdot P(e \mid c) + P(a) \cdot P(\neg b) \cdot P(c \mid a, b) \cdot P(\neg d) \cdot P(e \mid c) \,\big]$$
Note that $\alpha$ is the normalization (alpha) constant, $\alpha = 1/P(a, \neg b, c)$.
The problem I have is that I don't know how to compute the constant $\alpha$ that the sum is multiplied by. I would appreciate help because I'm preparing for an exam and have no solutions available for this old exam question.
Asked By : mrjasmin
Answered By : D.W.
You're on the right path. Here's my suggestion. First, apply the definition of conditional probability:
$$ \Pr[e|a,c,\neg b] = {\Pr[e,a,c,\neg b] \over \Pr[a,c,\neg b]}. $$
So, your job is to compute both $\Pr[e,a,c,\neg b]$ and $\Pr[a,c,\neg b]$. I suggest that you do each of them separately.
To compute $\Pr[a,\neg b,c,e]$, it is helpful to notice that
$$ \Pr[a,\neg b,c,e] = \Pr[a,\neg b,c,d,e] + \Pr[a,\neg b,c,\neg d,e]. $$
So, if you can compute terms on the right-hand side, then just add them up and you've got $\Pr[a,\neg b,c,e]$. You've already computed $\Pr[a,\neg b,c,\neg d,e]$ in part (b). So, just use the same method to compute $\Pr[a,\neg b,c,d,e]$, and you're golden.
Another way to express the last relation above is to write
$$ \Pr[a,\neg b,c,e] = \sum_d \Pr[a,\neg b,c,D=d,e]. $$
If you think about it, that's exactly the same equation as what I wrote, just using $\sum$ instead of $+$. You can think about whichever one is easier for you to think about.
Anyway, now you've got $\Pr[e,a,c,\neg b]$. All that remains is to compute $\Pr[a,c,\neg b]$. You can do that using exactly the same methods. I'll let you fill in the details: it is a good exercise. Finally, plug into the first equation at the top of my answer, and you're done.
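To make the bookkeeping concrete, here is a small Python sketch of the marginalize-then-normalize recipe. The joint distribution is passed in as a function; the dummy_joint used in the demo is a uniform placeholder, not the network from the question, so only the structure of the computation is meant to carry over.

```python
from itertools import product

def conditional(joint, e_val, a_val, c_val, b_val):
    """P(E=e | A=a, C=c, B=b) by marginalizing D out of the joint and
    normalizing, exactly as in the answer above. `joint` must map a
    full assignment (a, b, c, d, e) of booleans to its probability."""
    numer = sum(joint(a_val, b_val, c_val, d, e_val)
                for d in (True, False))
    denom = sum(joint(a_val, b_val, c_val, d, e)
                for d, e in product((True, False), repeat=2))
    return numer / denom

# dummy joint (uniform over the 32 assignments) just to show the call;
# plug in the real factorization/CPTs from the network instead.
dummy_joint = lambda a, b, c, d, e: 1 / 32
print(conditional(dummy_joint, True, True, True, False))  # 0.5 with the dummy
```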
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/13803
I read on Wikipedia and in lecture notes that if a lossless data compression algorithm makes a message shorter, it must make another message longer.
E.g. In this set of notes, it says:
Consider, for example, the 8 possible 3 bit messages. If one is compressed to two bits, it is not hard to convince yourself that two messages will have to expand to 4 bits, giving an average of 3 1/8 bits.
There must be a gap in my understanding because I thought I could compress all 3-bit messages this way:
- Encode: If it starts with a zero, delete the leading zero.
- Decode: If message is 3 bit, do nothing. If message is 2 bit, add a leading zero.
- Compressed set: 00,01,10,11,100,101,110,111
What am I getting wrong? I am new to CS, so maybe there are some rules/conventions that I missed?
Asked By : Legendre
Answered By : Shaull
You are missing an important nuance. How would you know if the message is only 2 bits, or if it's part of a bigger message? For that, you must also mark where the message starts and where it ends. This marker has to be a new symbol, because 1 and 0 are already used. If you introduce such a symbol and then re-encode everything in binary, you will end up with an even longer code.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/9938
Let $p$ be the six-variable Boolean function with the following definition:
$p(x_{0},x_{1},x_{2},x_{3},x_{4},x_{5})=\begin{cases} true & \text{if } x_{0}=x_{5} \text{ and } x_{1}=x_{4} \text{ and } x_{2}=x_{3}, \\ false & \text{else.} \end{cases}$
This function obviously yields $true$ iff $x_{0}x_{1}x_{2}x_{3}x_{4}x_{5}$ is a palindrome. Provide a BDD for $p$ relative to a variable ordering of your choice.
My problems begin when I try to define an appropriate variable ordering, so I can only guess it: $x_{0}=x_{5} < x_{1}=x_{4} < x_{2}=x_{3}$. I'm actually pretty lost with this exercise and any help is much appreciated (sorry for not being able to provide a better approach of my own).
Asked By : Uriel
Answered By : Uriel
So finally this should be the correct solution:
The variable ordering is $x_{0} < x_{5} < x_{1} < x_{4} < x_{2} < x_{3}$. The BDD is:
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/11741
I am trying to find out why $(\log(n))^{99} = o(n^{\frac{1}{99}})$. I tried to show that the limit of the following fraction is zero:
$$ \lim_{n \to \infty} \frac{ (\log(n))^{99} }{n^{\frac{1}{99}}} $$
But I'm not sure how I can reduce this expression.
Asked By : David Faux
Answered By : Reza
$\qquad \begin{align} \lim_{x \to \infty} \frac{ (\log(x))^{99} }{x^{\frac{1}{99}}} &= \lim_{x \to \infty} \frac{ (99^2)(\log(x))^{98} }{x^{\frac{1}{99}}} \\ &= \lim_{x \to \infty} \frac{ (99^3) \times 98(\log(x))^{97} }{x^{\frac{1}{99}}} \\ &\vdots \\ &= \lim_{x \to \infty} \frac{ (99^{99})\times 99! }{x^{\frac{1}{99}}} \\ &= 0 \end{align}$
I used L'Hôpital's rule in each step, assuming the natural logarithm.
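As a numerical sanity check of my own (working with logarithms to avoid overflow), the logarithm of the ratio, $99\log(\log n) - \frac{1}{99}\log n$, does eventually become negative and tend to $-\infty$, though only for astronomically large $n$:

```python
from math import log

# log of the ratio (log n)^99 / n^(1/99), written in terms of L = log n
def log_ratio(L):
    return 99 * log(L) - L / 99

for L in (1e3, 1e5, 1e6, 1e9):
    print(L, log_ratio(L))
# positive for small L, but negative once log n is on the order of 10^5,
# i.e. the original ratio does go to 0, just extremely slowly
```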
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/9852
Given a $0,1$ (binary) integer program of the form: $$ \begin{array}{lll} \text{min} & f(x) & \\ \text{s.t.} &A\vec{x} = \vec{b} & \quad \forall i\\ &x_i\ge 0 & \quad \forall i\\ &x_i \in \{0,1\} & \quad \forall i \end{array} $$
Note: the size of $A$ is not fixed in either dimension.
I believe this problem has been shown to be hard to approximate (strongly ${\sf NP}$-complete); see Garey & Johnson.
If so, is this still the case when $A$, $\vec{b}$ have binary entries and $f(x)$ is a linear function ( $f(x) = \sum_i c_i x_i$ )?
Asked By : Jonas Anderson
Answered By : Yuval Filmus
One-in-three 3SAT is NP-complete. Looking at the reduction, it inherits the APX-hardness of 3SAT. You can formulate one-in-three 3SAT as a binary integer program with binary entries, so your problem is APX-hard.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/9810
Today, a talk by Henning Kerstan ("Trace Semantics for Probabilistic Transition Systems") confronted me with category theory for the first time. He has built a theoretical framework for describing probabilistic transition systems and their behaviour in a general way, i.e. with uncountably infinite state sets and different notions of traces. To this end, he goes up through several layers of abstraction to finally end up with the notion of monads, which he combines with measure theory to build the model he needs.
In the end, it took him 45 minutes to (roughly) build a framework to describe a concept he initially explained in 5 minutes. I appreciate the beauty of the approach (it does generalise nicely over different notions of traces) but it strikes me as an odd balance nevertheless.
I struggle to see what a monad really is and how so general a concept can be useful in applications (both in theory and practice). Is it really worth the effort, result-wise?
Therefore this question:
Are there problems that are natural (in the sense of CS) on which the abstract notion of monads can be applied and helps (or is even instrumental) to derive desired results (at all or in a nicer way than without)?
Asked By : Raphael
Answered By : Dave Clarke
Asking whether an occurrence of monad is natural is similar to asking whether a group (in the sense of group theory) is natural. Once you formalise something, in this case as an endofunctor, either it satisfies the axioms of being a monad or not. If it does satisfy the axioms, then one gets a lot of technical machinery for free.
Moggi's paper Notions of computation and monads pretty much seals the deal: monads are incredibly natural and useful for describing computational effects in a unified way. Wadler and others translated these notions to deal with computational effects in functional programming languages, by using the connection that a functor is a data type constructor. This adds the icing on the cake. FP monads allow a treatment of computational effects such as IO which would be extremely unnatural without them. Monads have inspired related useful notion such as arrows and idioms which are also very useful for structuring functional programs. See the Wadler link for references. FP monads are the same thing as category theory monads in the sense that for an FP monad to work the same equations need to hold --- the compiler relies on these. Generally, the presentation of the monad differs (different but equivalent operations and equations), but this is a superficial difference.
A vast amount of the work of Bart Jacobs, to take but one example, uses monads. A lot of work stems from coalgebra, which is a general theory of systems. One of Jacobs' (many) contributions to the area is the development of a generic notion of trace semantics for systems (described as coalgebras) based on monads. One could argue that the notion of trace semantics is natural: What is the semantics of a system? The list of actions that can be observed!
One way to understand monads is to first program in Haskell using monads. Then find one of the many good tutorials available (via google). Start from the programming angle, then move to the theoretical side, starting with some basic category theory.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/138
In Sipser's book there is a section describing how to decide
$\qquad\displaystyle \mathrm{ALL}_\mathrm{NFA} = \{ \langle N \rangle \mid N \text{ is an NFA}, L(N) = \Sigma^*\}$
in polynomial space. To do so, it shows $\overline{\mathrm{ALL}_\mathrm{NFA} }$ is in NPSPACE.
I don't understand this part: If $M$, the NTM deciding the language, rejects any strings, it must reject one of length at most $ 2^q $ where $q$ is the number of states in $M$. For any longer string that is rejected, some states would repeat. But why? Is there any alternative explanation that helps in understanding this part?
Asked By : CentAu
Answered By : Louis
Your question is really about the following statement: If an NFA with $n$ states does not accept every string, then it rejects some string of length at most $2^n$. (The rest of the proof comes from the fact that REACH is in NL.)
To see why this is so, first let's look at a DFA with $k$ states that does not accept every string. The transition function of the DFA corresponds to a directed graph $G$ on $k$ vertices. DFA computations bijectively correspond to walks in $G$ starting from the vertex $s$ representing the start state.
Since the DFA does not accept every string, there is a vertex representing a non-accepting state $j$ reachable by a walk from $s$. In particular, there is a simple path $P$ from $s$ to $j$, which can have length at most $k$ (e.g., DFS will produce one). $P$ corresponds to a string of length at most $k$ that is not accepted.
The NFA question reduces to the DFA one. Using the powerset construction, convert the NFA with $n$ states to a DFA with $k = 2^n$ states and then use the above argument.
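For small instances you can watch this argument in action; here is a rough Python sketch of the powerset construction driven by BFS, returning a shortest rejected string (the NFA encoding and names are my own):

```python
from collections import deque

def shortest_rejected(states, alphabet, delta, start, accepting):
    """NFA given as delta[(q, a)] = set of successor states.
    BFS over subsets of states (the DFA of the powerset construction);
    returns a shortest string not accepted, or None if the NFA accepts everything."""
    start_set = frozenset([start])
    seen = {start_set}
    queue = deque([(start_set, "")])
    while queue:
        S, word = queue.popleft()
        if not (S & accepting):          # non-accepting DFA state reached
            return word
        for a in alphabet:
            T = frozenset(q2 for q in S for q2 in delta.get((q, a), ()))
            if T not in seen:
                seen.add(T)
                queue.append((T, word + a))
    return None

# tiny example: an NFA over {0,1} accepting exactly the strings containing a 1
delta = {('s', '0'): {'s'}, ('s', '1'): {'t'},
         ('t', '0'): {'t'}, ('t', '1'): {'t'}}
print(repr(shortest_rejected({'s', 't'}, '01', delta, 's', {'t'})))  # '' is rejected
```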
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/24390
I'm surprised that people keep adding new types in type theories but no one seems to mention a minimal theory (or I can't find it). I thought mathematicians love minimal stuff, don't they?
If I understand correctly, in a type theory with an impredicative Prop, λ-abstraction and Π-types suffice. By saying suffice I mean it could be used as intuitionistic logic. Other types can be defined as follows:
$$ \bot \stackrel{def}{=} \Pi \alpha: Prop. \alpha \\ \neg A \stackrel{def}{=} A \to \bot \\ A \land B \stackrel{def}{=} \Pi C: Prop. (A \to B \to C) \to C \\ A \lor B \stackrel{def}{=} \Pi C: Prop. (A \to C) \to (B \to C) \to C \\ \exists_{x: S}(P(x)) \stackrel{def}{=} \Pi \alpha: Prop. (\Pi x: S. P x \to \alpha) \to \alpha \\ $$
My first question is, do they (λ, Π) really suffice? My second question is, what do we need minimally if we don't have an impredicative Prop, such as in MLTT? In MLTT, Church/Scott/whatever encoding doesn't work.
Asked By : 盛安安
Answered By : cody
To elaborate on gallais' clarifications, a type theory with impredicative Prop and dependent types can be seen as some subsystem of the calculus of constructions, typically close to Church's type theory. The relationship between Church's type theory and the CoC is not that simple, but has been explored, notably in Geuvers' excellent article.
For most purposes, though, the systems can be seen as equivalent. Then indeed, you can get by with very little, in particular if you're not interested in classical logic, then the only thing you really need is an axiom of infinity: it's not provable in CoC that any types have more than 1 element! But with just an axiom expressing that some type is infinite, say a natural numbers type with the induction principle and the axiom $0\neq 1$, you can get pretty far: most of undergraduate mathematics can be formalized in this system (sort of, it's tough to do some things without the excluded middle).
Without impredicative Prop, you need a bit more work. As noted in the comments, an extensional system (a system with functional extensionality in the equality relation) can get by with just $\Sigma$ and $\Pi$-types, $\mathrm{Bool}$, the empty and unit types $\bot$ and $\top$, and W-types. In the intensional setting that's not possible: you need many more inductives. Note that to build useful W-types, you need to be able to build types by elimination over $\mathrm{Bool}$ like so:
$$ \mathrm{if}\ b\ \mathrm{then}\ \top\ \mathrm{else}\ \bot $$
To do meta-mathematics you'll probably need at least one universe (say, to build a model of Heyting Arithmetic).
All this seems like a lot, and it's tempting to look for a simpler system which doesn't have the crazy impredicativity of CoC, but is still relatively easy to write down in a few rules. One recent attempt to do so is the $\Pi\Sigma$ system described by Altenkirch et al. It's not entirely satisfying, since the positivity checking required for consistency isn't a part of the system "as is". The meta-theory still needs to be fleshed out as well.
A useful overview is the article Is ZF a hack? by Freek Wiedijk, which actually compares the hard numbers on all these systems (number of rules and axioms).
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/54548
Exercise 2.3-7 from "Introduction to Algorithms" by Cormen et al. Third Edition, states:
Describe a O(n lg n)-time algorithm that, given a set S of n integers and another integer x, determines whether or not there exist two elements in S whose sum is exactly x.
At first, I had no idea how to solve it since I thought you couldn't access elements of a set by index, but assuming you could, here was my solution:
First off, we sort the set S, and then for every element y in S, we search if x - y exists in S. If x - y exists then we are done, otherwise continue the process until we have looked at all elements in S.
Since we sort at the beginning it's O(n log n) and then we perform a binary search for every element, so total cost would be O(n log n) + O(n log n), therefore O(n log n).
But the solution they give in the Lecture Notes is:
The following algorithm solves the problem:
- Sort the elements in S.
- Form the set S'= {z : z = x − y for some y ∈ S}.
- Sort the elements in S'.
- If any value in S appears more than once, remove all but one instance. Do the same for S'.
- Merge the two sorted sets S and S'.
- There exist two elements in S whose sum is exactly x if and only if the same value appears in consecutive positions in the merged output.
And they go on to say:
Steps 1 and 3 require O(n lg n) steps. Steps 2, 4, 5, and 6 require O(n) steps. Thus the overall running time is O(n lg n).
So obviously they assume a set is just a regular array (you can access elements by index and the elements don't have to be unique). (And just to clarify my assumption, from the beginning of the book up to this point, they haven't put forth their definition of set, so it was easy to assume they were referring to an actual set).
It seems to me that even though their solution is O(n log n), my solution not only does it reduce the "hidden costs" greatly, it's also much more straightforward than theirs.
(To account for the possibility of repeated elements, nothing needs to be modified since the current y is not inside the searching subarray)
I've checked the published erratas (I know it wouldn't be a mistake per se anyway, but I thought it might've said something there), and there is nothing. So my question is: why is my solution wrong? Is my algorithm incorrect or is it my analysis?
Asked By : Paul McCullough
Answered By : Hendrik Jan
You are right, there is an assumption that the set is available as an unsorted array, but I think that most set representations will allow you to obtain all elements in linear time. I also agree with you that your solution is straightforward, and uses no funny tricks.
I guess their solution is meant to impress on you that only sorting and merging are needed, no such complicated things as binary search. (Insert smiley here.) Perhaps binary search had not yet been introduced at that point?
To illustrate the point Raphael makes (there are infinitely many implementations), here is mine; a sketch of it in code follows below. I believe it is correct, and uses only one sort and a linear pass over the array. So: sort the array (from small to large). To check whether $y,z\in S$ exist such that $y+z=x$, move over the array from both sides: if the sum of the two current elements is larger than $x$, move the larger element one down; if it is less than $x$, move the smaller element one up; stop when the sum equals $x$ (found) or the two positions meet (no such pair). (How does this compare to your proposal?)
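Here is that sketch in Python (my own rendering of the two-sided scan just described):

```python
def has_pair_with_sum(S, x):
    """Two-pointer version of the scheme described above:
    sort, then walk inward from both ends."""
    arr = sorted(S)
    i, j = 0, len(arr) - 1
    while i < j:
        total = arr[i] + arr[j]
        if total == x:
            return True
        if total > x:
            j -= 1      # sum too large: drop the larger element
        else:
            i += 1      # sum too small: drop the smaller element
    return False

print(has_pair_with_sum({3, 8, 1, 5}, 9))   # True  (1 + 8)
print(has_pair_with_sum({3, 8, 1, 5}, 7))   # False
```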
Nevertheless I would not consider their solution as incorrect. I like to think that it is good to have several solutions available. In some circumstances one implementation may outperform another. E.g., if the set was to be given as a sorted linked list, your solution would be hard to implement. (And mine only works if the list is doubly linked.)
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/58263