Multiclass Classification

  • Classification involving more than two classes (i.e., > 2 Classes)
  • Method 1. One-vs.-all (OVA): Learn a classifier one at a time
    • Given m classes, train m classifiers: one for each class
    • Classifier j treats tuples in class j as positive and all others as negative
    • To classify a tuple X, the set of classifiers vote as an ensemble. If classifier j predicts the positive class, then class j gets one vote. If classifier j predicts the negative class, then every class other than j gets one vote.
  • Method 2. All-vs.-all (AVA): Learn a classifier for each pair of classes
    • Given m classes, construct m(m−1)/2 binary classifiers, one for each pair of classes
    • Each classifier is trained using the tuples of its two classes
    • To classify a tuple X, each classifier votes; X is assigned to the class with the maximal vote
  • Comparison

    • All-vs.-all tends to be superior to one-vs.-all
    • Problem: each binary classifier is sensitive to errors, and its errors affect the vote count (both voting schemes are sketched in the code below)
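Both voting schemes can be written out directly. The sketch below is a minimal illustration, assuming scikit-learn's LogisticRegression as the base binary learner and numpy arrays for the data; these library choices are assumptions, not part of the notes.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LogisticRegression


def ova_predict(X_train, y_train, X_test):
    """One-vs.-all: m binary classifiers (one per class) vote as an ensemble."""
    classes = np.unique(y_train)
    votes = np.zeros((len(X_test), len(classes)))
    for idx, c in enumerate(classes):
        clf = LogisticRegression().fit(X_train, (y_train == c).astype(int))
        pred = clf.predict(X_test)          # 1 means "positive for class c"
        votes[pred == 1, idx] += 1          # class c gets one vote
        votes[pred == 0, :] += 1            # every non-c class gets one vote
        votes[pred == 0, idx] -= 1
    return classes[np.argmax(votes, axis=1)]


def ava_predict(X_train, y_train, X_test):
    """All-vs.-all: m(m-1)/2 pairwise classifiers; assign the class with the most votes."""
    classes = np.unique(y_train)
    votes = np.zeros((len(X_test), len(classes)))
    for i, j in combinations(range(len(classes)), 2):
        mask = np.isin(y_train, [classes[i], classes[j]])
        clf = LogisticRegression().fit(X_train[mask], y_train[mask])
        pred = clf.predict(X_test)
        votes[pred == classes[i], i] += 1   # each pairwise classifier casts one vote
        votes[pred == classes[j], j] += 1
    return classes[np.argmax(votes, axis=1)]
```

scikit-learn's OneVsRestClassifier and OneVsOneClassifier wrap the same two ideas if a ready-made implementation is preferred.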

Semi-supervised Classification

  • Semi-supervised: Uses labeled and unlabeled data to build a classifier

  • Self-training:
    • Build a classifier using the labeled data
    • Use it to label the unlabeled data; the tuples with the most confident label predictions are added to the set of labeled data
    • Repeat the above process
    • Advantage: easy to understand; disadvantage: may reinforce errors (a minimal self-training sketch follows this list)
  • Co-training: Use two or more classifiers to teach each other

    • Use two disjoint and independent selections of attributes of each tuple to train two good classifiers, say f1 and f2
    • Then f1 and f2 are used to predict the class labels for the unlabeled data tuples X_u
    • Teach each other: the tuples in X_u with the most confident predictions from f1 are added to the set of labeled training data for f2, and vice versa
    • Retrain the two classifiers on the extended training sets, keeping the same disjoint attribute selections (a co-training sketch also follows this list)
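A minimal self-training sketch, assuming scikit-learn's LogisticRegression, numpy arrays, and an illustrative confidence threshold of 0.95 (none of these choices come from the notes):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def self_train(X_labeled, y_labeled, X_unlabeled, threshold=0.95, max_rounds=10):
    X_l, y_l, X_u = X_labeled.copy(), y_labeled.copy(), X_unlabeled.copy()
    clf = LogisticRegression()
    for _ in range(max_rounds):
        clf.fit(X_l, y_l)                       # 1. build a classifier on the labeled data
        if len(X_u) == 0:
            break
        proba = clf.predict_proba(X_u)          # 2. label the unlabeled data
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # 3. move the most confident predictions into the labeled set and repeat
        X_l = np.vstack([X_l, X_u[confident]])
        y_l = np.concatenate([y_l, clf.classes_[proba[confident].argmax(axis=1)]])
        X_u = X_u[~confident]
    return clf
```

Because pseudo-labels are never revisited, a confidently wrong prediction stays in the labeled set, which is exactly the "may reinforce errors" disadvantage noted above.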
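A minimal co-training sketch, again assuming scikit-learn's LogisticRegression and numpy arrays; the two disjoint attribute selections are passed as column-index lists, and the per-round count k is an illustrative choice:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def co_train(X_l, y_l, X_u, view1, view2, rounds=5, k=5):
    # Each classifier keeps its own growing labeled set, starting from the same data.
    sets = [(X_l.copy(), y_l.copy()), (X_l.copy(), y_l.copy())]
    views = [view1, view2]
    f = [LogisticRegression(), LogisticRegression()]
    for _ in range(rounds):
        for i in (0, 1):
            # Each classifier sees only its own disjoint attribute selection.
            f[i].fit(sets[i][0][:, views[i]], sets[i][1])
        if len(X_u) == 0:
            break
        moved = set()
        for i in (0, 1):                                  # f_i teaches the other classifier
            proba = f[i].predict_proba(X_u[:, views[i]])
            top = np.argsort(proba.max(axis=1))[-k:]      # most confident unlabeled tuples
            new_y = f[i].classes_[proba[top].argmax(axis=1)]
            X_o, y_o = sets[1 - i]
            sets[1 - i] = (np.vstack([X_o, X_u[top]]),
                           np.concatenate([y_o, new_y]))
            moved.update(top.tolist())
        X_u = np.delete(X_u, list(moved), axis=0)         # remove the newly labeled tuples
    return f
```

Each classifier is trained only on its own view, so the confident predictions it passes across carry information the other view cannot see.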

Active Learning

  • Class labels are expensive to obtain

  • Active learner: query human (oracle) for labels
  • Pool-based approach: Uses a pool of unlabeled data
    • D_L: a small subset of the data D that is labeled; D_U: a pool of unlabeled data in D
    • Use a query function to carefully select one or more tuples from D_U and request their labels from an oracle (a human annotator)
    • The newly labeled tuples are added to D_L, and the model is relearned on the enlarged D_L
    • Goal: Achieve high accuracy using as few labeled data as possible
  • Evaluated using learning curves: Accuracy as a function of the number of instances queried (# of tuples to be queried should be small)
  • Research issue: How to choose the data tuples to be queried?

    • Uncertainty sampling: choose the least certain ones
    • Reduce version space, the subset of hypotheses consistent with the training data
    • Reduce expected entropy over D_U: query the tuple whose label would yield the greatest reduction in the expected uncertainty over the remaining unlabeled data (a pool-based sketch with uncertainty sampling follows this list)
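A minimal pool-based loop with uncertainty (least-confidence) sampling, assuming scikit-learn's LogisticRegression and a caller-supplied oracle function that returns the true label of a queried tuple; the query budget is an illustrative parameter:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def active_learn(X_labeled, y_labeled, X_pool, oracle, budget=20):
    X_l, y_l, X_u = X_labeled.copy(), y_labeled.copy(), X_pool.copy()
    clf = LogisticRegression()
    for _ in range(budget):                           # keep the number of queries small
        clf.fit(X_l, y_l)
        if len(X_u) == 0:
            break
        proba = clf.predict_proba(X_u)
        query = int(np.argmin(proba.max(axis=1)))     # least certain tuple in D_U
        label = oracle(X_u[query])                    # ask the oracle for its label
        X_l = np.vstack([X_l, X_u[query:query + 1]])  # move the tuple into D_L
        y_l = np.append(y_l, label)
        X_u = np.delete(X_u, query, axis=0)
    return clf.fit(X_l, y_l)
```

Recording accuracy after each query and plotting it against the number of queries gives the learning curve mentioned above.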

Transfer Learning

  • Transfer learning: Build classifiers for one or more similar source tasks and apply them to a target task

  • vs. traditional learning: build a new classifier from scratch for each new task (a minimal sketch follows)
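One simple form of transfer is parameter transfer: continue training a model built on the source task with a small labeled sample from the target task, instead of starting over. A minimal sketch, assuming scikit-learn's SGDClassifier and hypothetical source/target arrays:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


def transfer(X_source, y_source, X_target_small, y_target_small):
    """Train on the source task, then adapt to the target task with few labeled tuples."""
    classes = np.unique(np.concatenate([y_source, y_target_small]))
    clf = SGDClassifier()
    clf.partial_fit(X_source, y_source, classes=classes)   # learn the source task
    clf.partial_fit(X_target_small, y_target_small)        # continue on the target task
    return clf
```

The traditional-learning baseline would instead fit a fresh classifier on the small target sample alone.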