In binary classification there are two types of errors we can make, one associated with each class: false positives and false negatives. Traditionally, algorithms have focused on minimizing the overall probability of error. However, in many practical settings the two types of errors have very different costs. For example, in tumor classification, missing a malignant tumor is typically far more costly than raising a false alarm, so the probability of error is not the most appropriate criterion. The Neyman-Pearson (NP) approach seeks instead to minimize false negatives while constraining false positives to be below a certain significance level. The theory of learning classifiers with respect to the NP criterion has recently begun to emerge, and there is now a need for the development of practical algorithms. We show that support vector machines (SVMs), one of the most powerful families of algorithms for classification, can be adapted to this setting in a natural manner.
Cost-sensitive SVMs for NP Learning
There are two immediate options for adapting SVMs to the NP criterion. Since SVM-based classifiers can be interpreted as hyperplanes in an appropriate feature space, one option is to simply shift the hyperplane to achieve the desired trade-off between false positives and false negatives. Another approach is to use a "cost-sensitive" SVM which introduces class-specific weights, penalizing training errors from one class more than the other. The key challenge in this approach lies in appropriately setting the free parameters in the SVM (in particular those which determine the relative costs for the two error types). With this in mind we have developed the theory of the 2nu-SVM, providing a characterization of the feasible parameter set for this method. Further contributions include a novel heuristic for improved error estimation and a strategy for efficiently searching the parameter space of this method. In the work below we have shown that the 2nu-SVM is much more effective for NP classification than simply shifting the decision boundary.
M.A. Davenport, R.G. Baraniuk, and C.D. Scott,
"Tuning support vector machines for minimax and Neyman-Pearson classification,"
IEEE Trans. on Pattern Analysis and Machine Intelligence, 32(10), pp. 1888-1898, October 2010.
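To make the NP criterion concrete, here is a minimal Python sketch of the simpler of the two options described above: shifting the decision threshold of an already-trained classifier until the empirical false positive rate falls below a significance level alpha. It is not the 2nu-SVM and is not taken from our LIBSVM modifications. The function names (np_threshold, error_rates) and the toy scores are assumptions made purely for illustration, and in practice the error rates should be estimated on held-out data rather than on the training set.

```python
import math


def np_threshold(neg_scores, alpha):
    """Return the smallest threshold whose empirical false positive rate is <= alpha.

    A point is predicted positive when its score is strictly above the threshold.
    (Hypothetical helper written for this illustration.)
    """
    ordered = sorted(neg_scores)
    n = len(ordered)
    # At most floor(alpha * n) negatives may score strictly above the threshold.
    allowed = int(alpha * n)
    if allowed >= n:
        return -math.inf  # the constraint is vacuous: predict everything positive
    # Use the (n - allowed)-th smallest negative score as the threshold.
    return ordered[n - allowed - 1]


def error_rates(neg_scores, pos_scores, threshold):
    """Empirical (false positive rate, false negative rate) at a given threshold."""
    fp = sum(s > threshold for s in neg_scores) / len(neg_scores)
    fn = sum(s <= threshold for s in pos_scores) / len(pos_scores)
    return fp, fn


# Toy decision values, standing in for the output of a trained SVM.
neg = [-2.0, -1.5, -1.0, -0.5, -0.2, 0.1, 0.3, 0.4, 0.8, 1.2]
pos = [-0.3, 0.2, 0.5, 0.9, 1.0, 1.5, 1.8, 2.0, 2.5, 3.0]

t = np_threshold(neg, alpha=0.1)
print("threshold:", t)
print("shifted   (FP, FN):", error_rates(neg, pos, t))
print("unshifted (FP, FN):", error_rates(neg, pos, 0.0))
```

On this toy data the unshifted threshold of 0 gives a false positive rate of 0.5, while the shifted threshold meets the 0.1 constraint at the price of more false negatives. A cost-sensitive SVM instead changes the classifier itself by reweighting the two classes during training, which is why its free parameters need careful tuning.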
We have modified the LIBSVM package to implement the 2nu-SVM (described in the work above). Our modifications can be downloaded here. The version posted is a Matlab interface for LIBSVM 2.8 (based on one written by Jun-Cheng Chen, Kuan-Jen Peng, Chih-Yuan Yang, and Chih-Huai Cheng from National Taiwan University). If you would prefer to use your own version, we have instructions for how to modify LIBSVM to implement the 2nu-SVM here. E-mail mdav-at-gatech-dot-edu if you find any bugs or have any questions.