Machine learning notes: k-nearest neighbors

  • The k-nearest neighbor (k-NN) method has no explicit training step; it is a lazy learning method
  • It can be used for both classification and regression
  • The model is determined by three basic elements: the distance metric, the choice of k, and the classification decision rule (typically majority voting)
  • A kd-tree is a binary tree that represents a partition of k-dimensional space; it is a data structure for fast retrieval of points in that space
  • Constructing a kd-tree amounts to repeatedly splitting the k-dimensional space with hyperplanes perpendicular to the coordinate axes, forming a nested set of k-dimensional hyperrectangular regions. Each leaf node of the constructed kd-tree corresponds to one such region
  • For n instances of k-dimensional data, the time complexity of building a kd-tree is O(k · n · log n)
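The construction described above can be sketched in a few lines: at each depth the splitting axis cycles through the coordinates (axis = depth mod k) and the data is split at the median, so every node's hyperplane is perpendicular to one axis. This is a minimal illustrative sketch (the dict-based node layout and example points are my own choices, not from the original notes):

```python
def build_kdtree(points, depth=0):
    """Recursively build a kd-tree from a list of equal-length tuples.

    At each depth, split on axis = depth % k at the median point, so each
    node's splitting hyperplane is perpendicular to one coordinate axis.
    """
    if not points:
        return None
    k = len(points[0])           # dimensionality of the space
    axis = depth % k             # cycle through the coordinate axes
    points = sorted(points, key=lambda p: p[axis])
    median = len(points) // 2
    return {
        "point": points[median],                       # splitting point
        "axis": axis,                                  # splitting axis
        "left": build_kdtree(points[:median], depth + 1),
        "right": build_kdtree(points[median + 1:], depth + 1),
    }

# Toy 2-D data: levels alternate between the x-axis and the y-axis.
data = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(data)
print(tree["point"], tree["axis"])   # (7, 2) 0
```

Sorting at every level makes this O(k · n · log² n); using a linear-time median selection instead of a full sort recovers the O(k · n · log n) bound stated above.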
  • Choice of k value
    • A smaller k means predictions are based on training examples in a smaller neighborhood; the model becomes more complex and more prone to overfitting
    • A larger k gives a simpler model. In the extreme case k = N, every query point is assigned the majority class of the whole training set, regardless of where it lies
    • So in practice k is usually set to a fairly small value, chosen via cross-validation
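The selection procedure above can be sketched as follows: a brute-force k-NN classifier scored by k-fold cross-validation, with the k that maximizes average held-out accuracy being chosen. This is a minimal sketch on invented toy data (the two Gaussian blobs, the candidate k values, and the 5-fold split are my assumptions for illustration):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k):
    """Brute-force k-NN: majority vote among the k nearest Euclidean neighbors."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(dists)[:k]]   # labels of k closest points
        preds.append(np.bincount(nearest).argmax())
    return np.array(preds)

def cv_accuracy(X, y, k, n_folds=5):
    """Average k-NN accuracy over n_folds cross-validation splits."""
    folds = np.array_split(np.arange(len(X)), n_folds)
    scores = []
    for fold in folds:
        train = np.ones(len(X), dtype=bool)
        train[fold] = False                        # held-out fold
        preds = knn_predict(X[train], y[train], X[~train], k)
        scores.append(np.mean(preds == y[~train]))
    return float(np.mean(scores))

# Toy data (assumed for illustration): two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
idx = rng.permutation(len(X))
X, y = X[idx], y[idx]

# Try small odd k values and keep the one with the best CV accuracy.
best_k = max([1, 3, 5, 7, 9], key=lambda k: cv_accuracy(X, y, k))
print(best_k)
```

Odd values of k are conventional for binary classification because they avoid voting ties.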


Origin blog.csdn.net/lz_peter/article/details/82861021