**Problem Detail:**

As far as I have learned, an approximation algorithm for an optimization problem

- Runs in polynomial time, and
- Whose cost can be bounded by a function of input in terms of distance from the optimal cost.

If we consider the optimization version of subset-sum problem, it says

Given a set $A$ of integers, what is the maximum value of $sum(B) \le t$ where $B \subset A$, $sum(B) = \sum{b\in B}$ and $t \in Z^+$

In this problem, we know that we search for a maximum value that is smaller than or equal to $t$. So, a solution $B_1$ is better than another solution $B_2$ if $\dfrac{sum(B_1)}{sum(OPT)} > \dfrac{sum(B_2)}{sum(OPT)}$, $OPT$ being the optimal subset.

Now, let us consider the WSN localization problem, which is defined as:

We have a set $A = \{1,2,\dots,n\}$ of $n$ points in $d$-dimensions. We only know the coordinates of some points, let us denote them with $B = \{b_1, b_2, \dots, b_m\}$ where $B \subset P$. By using the positions of the nodes in $B$, we aim to assign positions to the nodes in $A$. While doing this, we use the Euclidean distance graph $G = <V,E,W>$ that is given as input. In this graph, each node $i \in V$ corresponds to a point $i \in P$ and each edge $\{i,j\} \in E$ means that there is a distance measurement between the point $i$ and point $j$. The weights of an edge corresponds to the Euclidean distance between two points.

We use a unit disk graph (UDG) model. In a UDG, an edge $\{i,j\}$ exists if and only if the Euclidean distance $\delta_{ij}$ between $i$ and $j$ is smaller than or equal to a specific value $R$. That indicates if there does not exist an edge $\{i,j\}$, then $\delta_{ij} > R$.

We aim to assign coordinates to each node $v \in V$ considering their Euclidean distances. This process is called *localization*.

As the motivation is wireless sensor nodes, we assume that we cannot measure distances with 100% accuracy. Therefore, each distance $\delta_{ij}$ is altered by adding a value $\epsilon$ to model the environmental noise. For any two points $i,j$ whose pairwise distance is $\hat{\delta}_{ij} < R$, the given distance is $\delta_{ij} = \hat{\delta}_{ij} + \epsilon$.

$\epsilon$ is up to $\pm P\%$ of the wireless range $R$ selected from a uniform random distribution. Let us assume that we always can localize %100 of the nodes i.e. the input graph is localizable. Our objective is to minimize the average localization error. This is computed by $\dfrac{\sum\limits_{v \in V}||v_{est} - v_{act}||}{|V|}$, where $v_{est}$ is the estimated position of node $v$, $v_{act}$ is the actual position of node $v$ and $||v_{est} - v_{act}||$ denotes the Euclidean distance between two.

The optimal solution for any instance is clearly $\forall v \in V, v_{est} = v_{act}$ and the cost is $cost(OPT) = 0$.

In subset sum, we know how close we are to the value $t$ and can compare two solutions. In traveling salesman problem, we can compare the solutions by their costs. However, in localization problem, we cannot compare two solutions without knowing the actual positions, which is impossible by the definition of the problem.

My question is: Can we anyhow prove if WSN localization problem is approximable or not by not being able to know how good a solution is? Or is it completely irrelevant?

For further reading, here is the NP-hardness proof of the localization problem with noisy distances. This paper defines the formal theory of the same problem.

###### Asked By : cagirici

###### Answered By : D.W.

## Definition of a well-formed optimization problem

It seems to me the core problem you have here is that the problem you have defined is not what I'd call a well-formed optimization problem. Normally, an optimization problem looks something like this:

Input: $x$

Goal: find $y$ that maximizes $\Phi(x,y)$

where $\Phi$ is some objective function that's specified as part of the problem statement. The key point here is that the objective function has to be computable, as a function of the input and the proposed output.

Your proposed formalization of the WSN localization problem does not have this form: the objective function is not computable as a function of the information known to us. Therefore, it is not a well-formed optimization problem.

This explains the problem you're having. The reason you can't compare how good different solutions are is that you don't have a computable way to measure how good a single solution is, given the information available to you. Consequently, your problem does not fit into the standard framework of optimization problems and approximation algorithms. This means you can't take advantage of all that theory, and strange things can happen.

So, at this point you have two options. First, you can try to formulate your problem as a way that *does* qualify in this framework, i.e., where you do have a computable objective function. The other option is to accept that you don't fit into this framework and you'll have to develop everything from scratch without being able to use the existing framework as is.

Another way to put this: if your objective function depends on values that aren't known (aren't part of the input), then what you haven't is not purely an algorithmic problem. To make this an algorithmic task, you need a way to measure success. One candidate way to do this would be to establish some probability distribution on the locations of the sensors, then frame this as a statistical inference problem, where you want to find the maximum likelihood estimate, or something like that.

Another approach would be to choose a different objective function, that *is* computable. For instance, maybe you could evaluate the accuracy of a solution in terms of how closely it matches the givens: for what fraction of edges $\{v,v'\}$ do we have $||v_\text{est}-v'_\text{est}|| < R+\varepsilon$, and what fraction of non-edges $v,v'$ do we have $||v_\text{est}-v'_\text{est}|| > R-\varepsilon$; maybe the objective function is the sum of these two fractions, or something. Of course, this is a different objective function, so it's maximizing something different, and the solution to that optimization problem might or might not match what you're actually looking for.

For posterity, here's some feedback on an older version of the problem, which has now been addressed by the revised question.

I see some misconceptions in this question, both about how we measure the quality of approximations, and about when approximation algorithms are applicable. The best answer I can give is to clear up these confusions/misconceptions; that will help you think about this kind of problem more clearly.

## Definition of the approximation ratio

You have a confusion about how we normally measure the quality of an approximation, in the standard theory of approximation algorithms. This shows up in your discussion of approximation algorithms for subset sum.

With subset-sum, the objective function we are maximizing is $\text{sum}(B)$. In the standard theory of approximability, we measure the quality of an approximation by how good the value of $\text{sum}(B)$ is, compared to the optimal value of that for the optimal choice of $B$ -- *not* compared to $t$.

When we ask whether there is a $2$-approximation for subset-sum, we ask whether there's an algorithm that outputs a set $B_\text{approx}$ such that

$${\text{sum}(B_\text{approx}) \over \text{sum}(B_\text{opt})} \ge 1/2,$$

where $B_\text{opt}$ is the optimal solution (the one that maximizes $\text{sum}(B)$). The ratio on the left-hand-side is called the *approximation ratio*. Notice how $t$ doesn't appear in this quantity. The approximation ratio is defined with reference to the optimal solution, not with reference to $t$.

## When approximation algorithms are applicable

Approximation algorithms are only applicable to optimization problems.

The "WSN localization problem", as you have defined it, is not an optimization problem: it is a decision problem. Since it's not an optimization problem, the entire concept of an approximation algorithm doesn't enter into it.

Asking "is it approximable?" is a category error: it's like asking "Is 'flying' a mammal?" -- 'flying' isn't even a noun, let alone an animal.

You then talk about two metrics. If you pick *one* of those metrics, you could define an optimization problem that is based upon the "WSN localization problem", where you ask to find the placement of node positions that complies with all the constraints and that maximizes (or minimizes) that metrics. Then you *would* have an optimization problem.

However, when you have two metrics, we don't get a well-defined optimization problem, since it's not clear what would count as optimal. Indeed, there might be multiple solutions that each achieve a different tradeoff between these two metrics, and is "Pareto-optimal". (See also What is a bicriteria approximation algorithm?.)

So, to count as an optimization function, you have to be maximizing/minimizing a *single* objective function; the standard theory of approximation algorithms is only applicable when you have a problem that is of this form. If your problem isn't of this form, it probably doesn't make sense to ask if it is approximable, as standard definitions of approximability don't apply.

Question Source : http://cs.stackexchange.com/questions/46982

**3200 people like this**

## 0 comments:

## Post a Comment

Let us know your responses and feedback