Performance Evaluation
This package provides tools to assess the performance of a machine learning algorithm.
Classification Performance
- correctrate(gt, pred)
Compute correct rate of predictions given by pred w.r.t. the ground truths given in gt.
- errorrate(gt, pred)
Compute error rate of predictions given by pred w.r.t. the ground truths given in gt.
- confusmat(k, gt, pred)
Compute the confusion matrix of the predictions given by pred w.r.t. the ground truths given in gt. Here, k is the number of classes.
It returns an integer matrix R of size (k, k), such that R[i, j] == countnz((gt .== i) & (pred .== j)). Rows correspond to ground-truth classes and columns to predicted classes.
Examples:
    julia> gt = [1, 1, 1, 2, 2, 2, 3, 3];

    julia> pred = [1, 1, 2, 2, 2, 3, 3, 3];

    julia> C = confusmat(3, gt, pred)   # compute confusion matrix
    3x3 Array{Int64,2}:
     2  1  0
     0  2  1
     0  0  2

    julia> C ./ sum(C, 2)   # normalize per class
    3x3 Array{Float64,2}:
     0.666667  0.333333  0.0
     0.0       0.666667  0.333333
     0.0       0.0       1.0

    julia> trace(C) / length(gt)   # compute correct rate from confusion matrix
    0.75

    julia> correctrate(gt, pred)
    0.75
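The relationship between the confusion matrix and the correct rate can also be sketched without the package. The snippet below is a minimal illustration in plain Julia, assuming a Julia 1.x environment (where tr and sum(C, dims=2) replace the older trace and sum(C, 2)). The names sketch_confusmat, sketch_correctrate and sketch_errorrate are hypothetical helpers written for this example, not functions provided by the package.

```julia
using LinearAlgebra  # provides tr

# Hypothetical helper: tally (ground truth, prediction) pairs into a k-by-k matrix.
function sketch_confusmat(k::Integer, gt::AbstractVector{<:Integer},
                          pred::AbstractVector{<:Integer})
    length(gt) == length(pred) ||
        throw(DimensionMismatch("gt and pred must have equal length"))
    R = zeros(Int, k, k)
    for (g, p) in zip(gt, pred)
        R[g, p] += 1   # row = true class, column = predicted class
    end
    return R
end

# Hypothetical helpers for the two rates described above.
sketch_correctrate(gt, pred) = count(gt .== pred) / length(gt)
sketch_errorrate(gt, pred)   = count(gt .!= pred) / length(gt)

gt   = [1, 1, 1, 2, 2, 2, 3, 3]
pred = [1, 1, 2, 2, 2, 3, 3, 3]

C = sketch_confusmat(3, gt, pred)
rowfrac = C ./ sum(C, dims=2)   # per-class normalization
acc = tr(C) / length(gt)        # correct rate recovered from the diagonal
```

The diagonal of the matrix holds the correctly classified samples, so its sum divided by the sample count should agree with the correct rate computed directly from gt and pred.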
Hit rate (for retrieval tasks)
- hitrate(gt, ranklist, k)
Compute the hitrate of rank k for a ranked list of predictions given by ranklist w.r.t. the ground truths given in gt.
Specifically, if gt[i] is contained in ranklist[1:k, i], then the prediction for the i-th sample is said to hit within rank k. The hitrate of rank k is the fraction of predictions that hit within rank k.
- hitrates(gt, ranklist, ks)
Compute hit-rates of multiple ranks (as given by a vector ks). It returns a vector of hitrates r, where r[i] corresponds to the rank ks[i].
Note that computing hit-rates for multiple ranks jointly is more efficient than computing them separately.
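The definition above can be illustrated with a short sketch in plain Julia (assuming Julia 1.x). The helpers sketch_hitrate and sketch_hitrates are hypothetical names used only for this example. The multi-rank version scans each column once, which is one possible way to share work across ranks and is not necessarily how the package does it.

```julia
# Hypothetical helper: fraction of samples whose true label is among the top k rows.
function sketch_hitrate(gt::AbstractVector, ranklist::AbstractMatrix, k::Integer)
    n = length(gt)
    size(ranklist, 2) == n ||
        throw(DimensionMismatch("ranklist needs one column per sample"))
    hits = count(i -> gt[i] in view(ranklist, 1:k, i), 1:n)
    return hits / n
end

# Hypothetical helper: hit rates for several ranks with a single scan per sample.
function sketch_hitrates(gt::AbstractVector, ranklist::AbstractMatrix, ks::AbstractVector)
    n = length(gt)
    counts = zeros(Int, length(ks))
    for i in 1:n
        # position of the true label in column i, or nothing if it is absent
        pos = findfirst(==(gt[i]), view(ranklist, :, i))
        pos === nothing && continue
        for (j, k) in enumerate(ks)
            pos <= k && (counts[j] += 1)
        end
    end
    return counts ./ n
end

gt = [1, 2, 3, 1]
# Column i lists the ranked predictions for sample i (best first).
ranklist = [1 1 2 2;
            2 2 1 3;
            3 3 3 4]

h2 = sketch_hitrate(gt, ranklist, 2)
hs = sketch_hitrates(gt, ranklist, [1, 2, 3])
```

In this toy data the true label sits at ranks 1, 2 and 3 for the first three samples and is missing for the fourth, so the hit rate grows by one quarter with each additional rank.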
Receiver Operating Characteristics (ROC)
Receiver Operating Characteristics (ROC) is often used to measure the performance of a detector, thresholded classifier, or a verification algorithm.
The ROC Type
This package uses an immutable type ROCNums defined below to capture the ROC of an experiment:
    immutable ROCNums{T<:Real}
        p::T    # positive in ground-truth
        n::T    # negative in ground-truth
        tp::T   # correct positive prediction
        tn::T   # correct negative prediction
        fp::T   # (incorrect) positive prediction when ground-truth is negative
        fn::T   # (incorrect) negative prediction when ground-truth is positive
    end
One can compute a variety of performance measurements from an instance of ROCNums (say r):
- true_positive(r)
The number of true positives (r.tp).
- true_negative(r)
The number of true negatives (r.tn).
- false_positive(r)
The number of false positives (r.fp).
- false_negative(r)
The number of false negatives (r.fn).
- true_positive_rate(r)
The fraction of positive samples correctly predicted as positive, defined as r.tp / r.p.
- true_negative_rate(r)
The fraction of negative samples correctly predicted as negative, defined as r.tn / r.n.
- false_positive_rate(r)
The fraction of negative samples incorrectly predicted as positive, defined as r.fp / r.n.
- false_negative_rate(r)
The fraction of positive samples incorrectly predicted as negative, defined as r.fn / r.p.
- recall(r)
Equivalent to true_positive_rate(r).
- precision(r)
The fraction of positive predictions that are correct, defined as r.tp / (r.tp + r.fp).
- f1score(r)
The harmonic mean of recall(r) and precision(r).
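To make the formulas above concrete, here is a small stand-alone sketch in plain Julia (assuming Julia 1.x, where the keyword is struct rather than immutable). SketchROCNums and the sketch_* functions are hypothetical stand-ins defined only for this example; they mirror the definitions in this section but are not the package's own code.

```julia
# Hypothetical stand-in for the counts type described above.
struct SketchROCNums{T<:Real}
    p::T    # positives in ground truth
    n::T    # negatives in ground truth
    tp::T   # true positives
    tn::T   # true negatives
    fp::T   # false positives
    fn::T   # false negatives
end

sketch_recall(r)    = r.tp / r.p              # same as the true positive rate
sketch_precision(r) = r.tp / (r.tp + r.fp)
sketch_fpr(r)       = r.fp / r.n              # false positive rate

# Harmonic mean of precision and recall.
function sketch_f1score(r)
    pr = sketch_precision(r)
    rc = sketch_recall(r)
    return 2 * pr * rc / (pr + rc)
end

# Toy counts: 4 positives (3 found, 1 missed), 6 negatives (3 rejected, 3 false alarms).
r = SketchROCNums(4, 6, 3, 3, 3, 1)

rc = sketch_recall(r)
pr = sketch_precision(r)
f1 = sketch_f1score(r)
```

With these counts the recall is 3/4 and the precision is 3/6, so the harmonic mean lands between the two, closer to the smaller value.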
Computing ROC Curves
The package provides a function roc to compute an instance of ROCNums or a sequence of such instances from predictions.
- roc(gt, pred)
Compute an ROC instance based on ground-truths given in gt and predictions given in pred.
- roc(gt, scores, thres[, ord])
Compute an ROC instance or an ROC curve (a vector of ROC instances), based on given scores and a threshold thres.
Prediction will be made as follows:
- When ord = Forward: predicts 1 when scores[i] >= thres, otherwise 0.
- When ord = Reverse: predicts 1 when scores[i] <= thres, otherwise 0.
When ord is omitted, it defaults to Forward.
Returns:
- When thres is a single number, it produces a single ROCNums instance;
- When thres is a vector, it produces a vector of ROCNums instances.
Note: Jointly evaluating an ROC curve for multiple thresholds is generally much faster than evaluating for them individually.
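The thresholding rule can be sketched in plain Julia as follows (assuming Julia 1.x). This example assumes that gt holds 0/1 labels with 1 meaning positive; the helper sketch_roc_counts and its rev keyword (standing in for the Forward/Reverse order) are hypothetical and exist only for illustration. The vector method simply maps over the thresholds, so it does not reproduce the speed advantage of joint evaluation mentioned in the note.

```julia
# Hypothetical helper: tally counts for 0/1 ground truth at a single threshold.
function sketch_roc_counts(gt::AbstractVector{<:Integer},
                           scores::AbstractVector{<:Real},
                           thres::Real; rev::Bool=false)
    length(gt) == length(scores) ||
        throw(DimensionMismatch("gt and scores must have equal length"))
    tp = tn = fp = fn = 0
    for (g, s) in zip(gt, scores)
        # Forward order: positive when the score reaches the threshold.
        # Reverse order (rev=true): positive when the score is at or below it.
        predicted = rev ? (s <= thres) : (s >= thres)
        if g == 1
            predicted ? (tp += 1) : (fn += 1)
        else
            predicted ? (fp += 1) : (tn += 1)
        end
    end
    return (p = tp + fn, n = tn + fp, tp = tp, tn = tn, fp = fp, fn = fn)
end

# A vector of thresholds yields one set of counts per threshold.
sketch_roc_counts(gt, scores, thresholds::AbstractVector; rev::Bool=false) =
    [sketch_roc_counts(gt, scores, t; rev=rev) for t in thresholds]

gt     = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

r     = sketch_roc_counts(gt, scores, 0.5)             # forward order
r_rev = sketch_roc_counts(gt, scores, 0.5; rev=true)   # reverse order
curve = sketch_roc_counts(gt, scores, [0.5, 0.85])     # two operating points
```

Raising the threshold from 0.5 to 0.85 removes the single false positive in this toy data, at the cost of one additional missed positive.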
- roc(gt, (preds, scores), thres[, ord])
Compute an ROC instance or an ROC curve (a vector of ROC instances) for multi-class classification, based on given predictions, scores and a threshold thres.
Prediction is made as follows:
- When ord = Forward: predicts preds[i] when scores[i] >= thres, otherwise 0.
- When ord = Reverse: predicts preds[i] when scores[i] <= thres, otherwise 0.
When ord is omitted, it defaults to Forward.
Returns:
- When thres is a single number, it produces a single ROCNums instance.
- When thres is a vector, it produces an ROC curve (a vector of ROCNums instances).
Note: Jointly evaluating an ROC curve for multiple thresholds is generally much faster than evaluating for them individually.
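For the multi-class case, the prediction rule stated above amounts to keeping a predicted label only when its score passes the threshold and replacing it with 0 otherwise. The sketch below shows just that step in plain Julia (assuming Julia 1.x). The helper sketch_threshold_preds and its rev keyword are hypothetical, and the example deliberately leaves out how the resulting labels are tallied into counts, since that is not specified here.

```julia
# Hypothetical helper: keep preds[i] when its score passes the threshold, else 0.
function sketch_threshold_preds(preds::AbstractVector{<:Integer},
                                scores::AbstractVector{<:Real},
                                thres::Real; rev::Bool=false)
    length(preds) == length(scores) ||
        throw(DimensionMismatch("preds and scores must have equal length"))
    return [(rev ? s <= thres : s >= thres) ? p : zero(p)
            for (p, s) in zip(preds, scores)]
end

preds  = [1, 2, 3, 2]
scores = [0.9, 0.4, 0.6, 0.2]

kept_forward = sketch_threshold_preds(preds, scores, 0.5)
kept_reverse = sketch_threshold_preds(preds, scores, 0.5; rev=true)
```

A label of 0 here plays the role of a rejected (negative) prediction, so a higher threshold in forward order rejects more samples.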
- roc(gt, scores, n[, ord])
Compute an ROC curve (a vector of ROC instances), with respect to n evenly spaced thresholds between minimum(scores) and maximum(scores). (See above for details)
- roc(gt, (preds, scores), n[, ord])
Compute an ROC curve (a vector of ROC instances) for multi-class classification, with respect to n evenly spaced thresholds between minimum(scores) and maximum(scores). (See above for details)
- roc(gt, scores, ord)
Equivalent to roc(gt, scores, 100, ord).
- roc(gt, (preds, scores), ord)
Equivalent to roc(gt, (preds, scores), 100, ord).
- roc(gt, scores)
Equivalent to roc(gt, scores, 100, Forward).
- roc(gt, (preds, scores))
Equivalent to roc(gt, (preds, scores), 100, Forward).
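The variants that take a threshold count n can be pictured as building a threshold grid from the score range and evaluating each point. The sketch below does this in plain Julia (assuming Julia 1.x) for 0/1 ground truth in forward order. It assumes that both endpoints, minimum(scores) and maximum(scores), are included in the grid and that n >= 2; whether the package spaces its thresholds exactly this way is not confirmed here. The helper sketch_roc_curve is hypothetical.

```julia
# Hypothetical helper: (false positive rate, true positive rate) at n evenly
# spaced thresholds, assuming 0/1 ground truth and forward order.
function sketch_roc_curve(gt::AbstractVector{<:Integer},
                          scores::AbstractVector{<:Real}, n::Integer=100)
    length(gt) == length(scores) ||
        throw(DimensionMismatch("gt and scores must have equal length"))
    # LinRange includes both endpoints (an assumption about the grid, see above).
    thresholds = LinRange(minimum(scores), maximum(scores), n)
    npos = count(==(1), gt)
    nneg = length(gt) - npos
    curve = Tuple{Float64,Float64}[]
    for t in thresholds
        tp = count(i -> gt[i] == 1 && scores[i] >= t, eachindex(gt))
        fp = count(i -> gt[i] != 1 && scores[i] >= t, eachindex(gt))
        push!(curve, (fp / nneg, tp / npos))
    end
    return thresholds, curve
end

gt     = [1, 0, 1, 0]
scores = [0.9, 0.1, 0.5, 0.3]

thresholds, curve = sketch_roc_curve(gt, scores, 5)
```

At the lowest threshold every sample is predicted positive, and at the highest only the top-scoring sample is, so the curve runs from the upper-right corner of ROC space toward the left edge.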