I was reading the Wikipedia article on feature vectors, and as far as I can see, it suggests creating new features from already existing features:
"Higher-level features can be obtained from already available features and added to the feature vector, for example for the study of diseases the feature 'Age' is useful and is defined as Age = 'Year of death' - 'Year of birth'. This process is referred to as feature construction."
But assuming that you have already included 'Year of birth' and 'Year of death' as features, will adding 'Age' (that is, 'Year of death' - 'Year of birth') to the feature vector improve it in any way? I'm thinking not, as the new feature is a linear combination of the other two.
If it depends on the machine learning algorithm used, I am mostly interested in SVMs.
Asked By : The Unfun Cat
Answered By : edron79
I believe it can. Consider the following thought experiment: we are attempting to predict whether a person lived to be over 100. Knowing the year of birth provides some predictive power (e.g., at the time of writing, if they were born after 1912, we know with certainty they did not live to be 100). Year of death also provides some information (people who died closer to the present day are likely to have had a longer lifespan). However, "Age" (defined as year of death - year of birth) will be a perfect predictor.
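The thought experiment can be tried on made-up data. The following is only a minimal sketch, not a real study: the birth years and lifespans are synthetic and drawn independently, and `best_stump_accuracy` is a hypothetical helper that finds the best single-threshold rule on one feature. Under those assumptions, the rule on 'Age' classifies every sample correctly, while the rules on year of birth or year of death alone do not.

```python
import random

def best_stump_accuracy(values, labels):
    """Best accuracy of any single-threshold rule on one feature."""
    n = len(values)
    best = 0.0
    for t in sorted(set(values)):
        # Rule: predict True when value > t.
        correct = sum((v > t) == y for v, y in zip(values, labels))
        # The flipped rule (predict True when value <= t) scores n - correct.
        best = max(best, correct / n, (n - correct) / n)
    return best

# Synthetic data (an assumption for illustration, not real demographics).
rng = random.Random(0)
births = [rng.randint(1800, 1900) for _ in range(500)]
lifespans = [rng.randint(60, 110) for _ in range(500)]
deaths = [b + s for b, s in zip(births, lifespans)]

# Constructed feature and the target from the thought experiment.
ages = [d - b for d, b in zip(deaths, births)]
labels = [a > 100 for a in ages]

acc_birth = best_stump_accuracy(births, labels)
acc_death = best_stump_accuracy(deaths, labels)
acc_age = best_stump_accuracy(ages, labels)

print("year of birth:", acc_birth)
print("year of death:", acc_death)
print("age:", acc_age)
```

Note that this only compares rules that look at one feature at a time; it says nothing about models that combine both raw features.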
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/7232