
[Solved]: Classification of 2d arrays of outlines

Problem Detail: 

I am not experienced in machine learning, but I have been looking at the Python toolkit scikit-learn. I have thousands of mammograms, and I have written an algorithm to find the outline of each image. In some cases the outline it produces is not correct. I would like to take these thousands of outlines, which are arrays (of different lengths) of the x and y coordinates of the edge, and classify them. This would allow me to find the outliers and possibly also to categorise the images. Does a clustering algorithm exist that would let me do this?
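For context, one common way to cluster variable-length outlines is to first resample each one to a fixed number of points (so every outline becomes a feature vector of the same length) and then run an outlier detector over those vectors. Below is a minimal sketch, not a definitive implementation: the function names `resample_outline` and `find_outlier_outlines` are hypothetical, and it assumes the outlines are available as a list of (N, 2) NumPy arrays.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def resample_outline(outline, n_points=64):
    """Resample a variable-length (N, 2) outline to n_points via arc-length interpolation."""
    outline = np.asarray(outline, dtype=float)
    # cumulative arc length along the outline, normalised to [0, 1]
    seg = np.linalg.norm(np.diff(outline, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])
    t /= t[-1]
    u = np.linspace(0.0, 1.0, n_points)
    x = np.interp(u, t, outline[:, 0])
    y = np.interp(u, t, outline[:, 1])
    pts = np.stack([x, y], axis=1)
    # normalise translation and scale so we compare shape, not position or size
    pts -= pts.mean(axis=0)
    scale = np.linalg.norm(pts)
    if scale > 0:
        pts /= scale
    return pts.ravel()  # fixed-length feature vector of size 2 * n_points

def find_outlier_outlines(outlines, contamination=0.05):
    """Flag outlines whose resampled shape vector is anomalous; -1 marks an outlier."""
    X = np.array([resample_outline(o) for o in outlines])
    return IsolationForest(contamination=contamination, random_state=0).fit_predict(X)
```

Instead of `IsolationForest` you could use DBSCAN or k-means on the same fixed-length vectors; the resampling step is what makes different-length arrays comparable at all.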

Edit 1:

Expected outline: [image]

Unwanted outline: [image]

Asked By : Codey McCodeface

Answered By : D.W.

It's hard for me to tell, but it sounds like you want to automatically detect the mammograms where the outline is incorrect. I'm assuming the desired outline is the perimeter of some convex region in the image.

I'm not sure whether clustering is going to be the ideal approach here. The first approach that comes to mind for me is something like this: we might want to build a classifier $C$ that accepts an image $I$ and a point $(x,y)$ as input and outputs either 0 or 1 according to whether $(x,y)$ is considered inside the region. As features, you might include at least the following:

  • the grayscale intensity of $I(x,y)$, i.e., the darkness of the pixel at location $(x,y)$ in $I$.

  • the grayscale intensity of the neighborhood of $(x,y)$ in $I$. This could be computed as the intensity of $I'(x,y)$, where $I'$ is the result of applying a Gaussian blur to $I$.

  • for each line $\ell$ that's part of the outline you were given for $I$, the value of $f(\ell,(x,y))$, where you define $f(\ell,(x,y))$ to be $1$ or $0$ depending on which side of $\ell$ the point $(x,y)$ is on. For uniformity, you might choose a common orientation where one side of $\ell$ is towards the "inside" of the convex region given by the outline, and then let $f(\ell,(x,y))=1$ if $(x,y)$ is on that side of $\ell$ and let it be $0$ if $(x,y)$ is on the opposite side.

  • for each line $\ell'$ that's part of the edges you detected by applying an edge detector to $I$, the value of $f(\ell',(x,y))$, defined analogously.
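The first three features above could be computed along the lines of the sketch below. This is an illustrative assumption, not the answer's own code: `side_of_line` implements $f(\ell,(x,y))$ via the sign of a cross product, and `point_features` bundles the raw intensity, the blurred (neighbourhood) intensity, and one side-of-line value per outline segment.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def side_of_line(p0, p1, point):
    """f(l, (x, y)): 1 if `point` lies to the left of the directed line p0 -> p1, else 0.
    Orient each outline segment so its left side faces the inside of the region."""
    d = np.asarray(p1, dtype=float) - np.asarray(p0, dtype=float)
    v = np.asarray(point, dtype=float) - np.asarray(p0, dtype=float)
    return int(d[0] * v[1] - d[1] * v[0] > 0)

def point_features(image, x, y, outline_segments, sigma=3.0):
    """Feature vector for classifier C at pixel (x, y).
    `outline_segments` is a list of (p0, p1) endpoint pairs for the outline's lines."""
    blurred = gaussian_filter(image.astype(float), sigma=sigma)
    feats = [float(image[y, x]), float(blurred[y, x])]  # raw and neighbourhood intensity
    feats += [side_of_line(p0, p1, (x, y)) for p0, p1 in outline_segments]
    return np.array(feats, dtype=float)
```

In practice you would precompute the blurred image once per mammogram rather than inside the per-pixel call, since the Gaussian blur is by far the most expensive step here.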

Given these features, or some other suitable features of your choice, you might try to build a classifier $C$ by training on thousands of mammograms where you have ground truth (where you know what the correct outline is); for each such mammogram image $I$, you might pick a few hundred points $(x,y)$ at random, and then train on the resulting hundreds of thousands of inputs to $C$. Since you know ground truth, you'll know what the desired/correct value for $C(I,(x,y))$ is, so you can apply supervised learning to try to infer a classifier $C$.
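The training procedure described above might be sketched as follows. This is one plausible reading, with hypothetical names: `masks[i]` is assumed to be a ground-truth boolean inside/outside mask for image `i`, and `feature_fn(image, x, y)` is assumed to return a feature vector like the one described in the bullet list.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def sample_training_set(images, masks, feature_fn, points_per_image=300, rng=None):
    """Build (X, y) by sampling a few hundred random pixels from each training image."""
    rng = np.random.default_rng(rng)
    X, y = [], []
    for image, mask in zip(images, masks):
        h, w = image.shape
        xs = rng.integers(0, w, size=points_per_image)
        ys = rng.integers(0, h, size=points_per_image)
        for x_, y_ in zip(xs, ys):
            X.append(feature_fn(image, x_, y_))
            y.append(int(mask[y_, x_]))  # ground truth: inside (1) or outside (0)
    return np.array(X), np.array(y)

def train_classifier(X, y):
    """Fit C with any supervised learner; a random forest is one reasonable default."""
    return RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```

With a few hundred points per image and thousands of images, this yields the "hundreds of thousands of inputs" the answer mentions, which is comfortably within what a random forest can handle.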

Once you've built a classifier $C$, you might then apply it to every new mammogram in your test set to try to infer the correct outline.
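Applying $C$ to a new image could look like the sketch below (again an assumption about the workflow, not code from the answer): classify every pixel to get a predicted inside/outside mask, then take the boundary of the predicted region as the inferred outline.

```python
import numpy as np

def predict_mask(clf, image, feature_fn):
    """Apply classifier C to every pixel of `image` to get an inside/outside mask."""
    h, w = image.shape
    X = np.array([feature_fn(image, x, y) for y in range(h) for x in range(w)])
    return clf.predict(X).reshape(h, w).astype(bool)

def mask_boundary(mask):
    """Inferred outline: inside pixels with at least one outside 4-neighbour."""
    padded = np.pad(mask, 1, constant_values=False)
    inner = padded[1:-1, 1:-1]
    all_neighbours_inside = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                             padded[1:-1, :-2] & padded[1:-1, 2:])
    return inner & ~all_neighbours_inside
```

A morphological cleanup (or taking the convex hull of the predicted region, given the convexity assumption) would likely be needed before extracting the final outline coordinates.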

I don't know whether this will be effective in practice, nor whether it solves the problem you want to solve, but this is the first thing that springs to mind for me.


Question Source : http://cs.stackexchange.com/questions/14019


