# Supervised Learning (COMP0078)

Supervised Learning (COMP0078), Coursework 2
Mark Herbster
Due: 11 January 2021
CW2_20v3

## Submission

You should produce a report about your results. You will be assessed not only on the correctness/quality of your answers but also on clarity of presentation. Additionally, make sure that your code is well commented. Please submit on Moodle i) your report as well as ii) a zip file with your source code. Regarding the use of libraries: you should implement regression using matrix algebra directly. Likewise, any basic machine learning routines such as cross-validation should be implemented directly. Otherwise, libraries are okay.

Questions: please e-mail sl-support@cs.ucl.ac.uk.

## 1 PART I [50%]

### 1.1 Kernel perceptron (handwritten digit classification)

**Introduction:** In this exercise you will train a classifier to recognize handwritten digits. The task is quasi-realistic, and you will (perhaps) encounter difficulties in working with a moderately large dataset that you would not encounter on a "toy" problem.

*Figure 1: Scanned Digits*

You may already be familiar with the perceptron. This exercise generalizes the perceptron in two ways: first, we generalize the perceptron to use kernel functions so that we may generate a nonlinear separating surface; second, we generalize the perceptron into a majority network of perceptrons so that instead of separating only two classes we may separate $k$ classes.

**Adding a kernel:** The kernel allows one to map the data to a higher-dimensional space, as we did with basis functions, so that the class of functions learned is larger than simply linear functions. We will consider a single type of kernel, the polynomial $K_d(p, q) = (p \cdot q)^d$, which is parameterized by a positive integer $d$ controlling the degree of the polynomial.

**Training and testing the kernel perceptron:** The algorithm is online, that is, the algorithm operates on a single example $(x_t, y_t)$ at a time.
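The polynomial kernel above is simple to implement directly. The following is a minimal sketch (the function names `poly_kernel` and `gram_matrix` are illustrative, not prescribed by the coursework); precomputing the Gram matrix over the training set is a common way to keep the online loop fast, though the handout does not mandate it:

```python
import numpy as np

def poly_kernel(p, q, d):
    """Polynomial kernel K_d(p, q) = (p . q)^d from the handout."""
    return np.dot(p, q) ** d

def gram_matrix(X, d):
    """Gram matrix K[i, j] = K_d(x_i, x_j) for a data matrix X of shape (m, n).
    Computing it once avoids recomputing kernel values on every epoch."""
    return (X @ X.T) ** d
```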
As may be observed from the update equation, a single kernel function $K(x_t, \cdot)$ is added for each example, scaled by the term $\alpha_t$ (which may be zero). In online training we repeatedly cycle through the training set; each cycle is known as an *epoch*. When the classifier is no longer changing as we cycle through the training set, we say that it has converged. It may be the case for some datasets that the classifier never converges, or it may be the case that the classifier will generalize better if not trained to convergence; for this exercise I leave the choice to you to decide how many epochs to train a particular classifier (alternately, you may research and choose a method for converting an online algorithm to a batch algorithm and use that conversion method).

The algorithm given in the table correctly describes training for a single pass through the data (1st epoch). The algorithm is still correct for multiple epochs; however, explicit notation is not given. Rather, later epochs (additional passes through the data) are represented by repeating the dataset with the $x_i$'s renumbered. I.e., suppose we have a 40-element training set $\{(x_1, y_1), (x_2, y_2), \ldots, (x_{40}, y_{40})\}$; to model additional epochs, simply extend the data by duplication. Hence an $m$-epoch dataset is

$$\underbrace{(x_1, y_1), \ldots, (x_{40}, y_{40})}_{\text{epoch } 1},\ \underbrace{(x_{41}, y_{41}), \ldots, (x_{80}, y_{80})}_{\text{epoch } 2},\ \ldots,\ \underbrace{(x_{(m-1)\times 40+1}, y_{(m-1)\times 40+1}), \ldots, (x_{(m-1)\times 40+40}, y_{(m-1)\times 40+40})}_{\text{epoch } m}$$

where $x_1 = x_{41} = x_{81} = \cdots = x_{(m-1)\times 40+1}$, etc.

Testing is performed as follows: once we have trained a classifier $w$ on the training set, we simply use the trained classifier with only the prediction step for each example in the test set. It is a mistake whenever the prediction $\hat{y}_t$ does not match the desired output $y_t$; thus the test error is simply the number of mistakes divided by the test set size. Remember, in testing the update step is never performed.

**Two Class Kernel Perceptron (training)**
Input: $\{(x_1, y_1), \ldots, (x_m, y_m)\} \in ($
