**Learning Algorithms**. We will compare decision trees (J48), neural networks, and k-nearest neighbors (IBk). You should use the defaults for these algorithms, with the following exceptions:

- **trees > J48**: set unpruned to True.
- **functions > MultilayerPerceptron**: set hiddenLayers to 5 and trainingTime to 1000. (We will experiment with other settings below.)
- **lazy > IBk**: set KNN to 1 (which is the default; we will experiment with other values below).

**Data Sets**. We will apply these three algorithms to the same data sets as for HW2: `hw2-1`, `hw2-2`, and `br`. All of the files are available at http://classes.engr.oregonstate.edu/eecs/spring2005/cs534/data/.

`br` data files:

- `br-test.arff`: br test data file
- `br-train.arff`: br training data file

`hw2-1` data files:

- `hw2-1-10.arff`: 10 training examples
- `hw2-1-20.arff`: 20 training examples
- `hw2-1-50.arff`: 50 training examples
- `hw2-1-100.arff`: 100 training examples
- `hw2-1-200.arff`: 200 training examples
- `hw2-1-400.arff`: 400 training examples
- `hw2-1-test.arff`: test data file

`hw2-2` data files:

- `hw2-2-25.arff`: 25 training examples
- `hw2-2-50.arff`: 50 training examples
- `hw2-2-100.arff`: 100 training examples
- `hw2-2-200.arff`: 200 training examples
- `hw2-2-600.arff`: 600 training examples
- `hw2-2-test.arff`: test data file

You will run the three learning algorithms on each training data file and evaluate the results on the corresponding test data files.

**Results**. You should turn in the following:

- A table in the following format:

`hw2-1`:

| N | J48 | NeuralNet | kNN |
|-----|-----|-----------|-----|
| 10 | xxx | yyy | zzz |
| 20 | xxx | yyy | zzz |
| 50 | xxx | yyy | zzz |
| 100 | xxx | yyy | zzz |
| 200 | xxx | yyy | zzz |
| 400 | xxx | yyy | zzz |

`hw2-2`:

| N | J48 | NeuralNet | kNN |
|-----|-----|-----------|-----|
| 25 | xxx | yyy | zzz |
| 50 | xxx | yyy | zzz |
| 100 | xxx | yyy | zzz |
| 200 | xxx | yyy | zzz |
| 600 | xxx | yyy | zzz |

`br`:

| N | J48 | NeuralNet | kNN |
|-----|-----|-----------|-----|
| 614 | xxx | yyy | zzz |

Here `xxx` gives the error rate of J48, `yyy` gives the error rate of the neural network, and `zzz` gives the error rate of IBk. We will measure error rates on the separate files of test points.

- Graphs of the results for `hw2-1` and `hw2-2` plotting the performance of the three algorithms as a function of the size of the training data set (known as a "learning curve"); a plotting sketch follows this item.
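
The learning-curve graphs are just these error rates plotted against N. A minimal matplotlib sketch for `hw2-1` (the zeros are placeholders for the values from your table):

```python
# Sketch: learning curves for hw2-1 (test error rate vs. training-set size).
# The zeros are placeholders; fill in the error rates from your table.
import matplotlib.pyplot as plt

n = [10, 20, 50, 100, 200, 400]
err = {
    'J48':       [0, 0, 0, 0, 0, 0],  # placeholder values
    'NeuralNet': [0, 0, 0, 0, 0, 0],  # placeholder values
    'kNN':       [0, 0, 0, 0, 0, 0],  # placeholder values
}

for name, rates in err.items():
    plt.plot(n, rates, marker='o', label=name)
plt.xlabel('training set size N')
plt.ylabel('test error rate')
plt.legend()
plt.show()
```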
- Plot of the data points for `hw2-1-200` and `hw2-2-200` with lines showing the decision boundary learned by J48. This will require that you read the decision tree and understand the decision boundary. J48 displays the tree in the following format:

      x1 <= 1.0: positive (75.0/17.0)
      x1 > 1.0
      |   x2 <= 5.0: negative (42.0/12.0)
      |   x2 > 5.0: positive (33.0/10.0)

  The first line indicates a split on feature x1 with threshold 1.0. The first branch leads to a leaf labeled "positive". The numbers in parentheses indicate that this leaf contains 75 data points, of which 17 were misclassified. Indentation indicates child nodes; the vertical bars are there to make the indentation easier to see.

  Note: you should only plot the line segments that separate the two classes (not all splitting lines chosen by J48; see the sketch after this item). You should also plot the optimal decision boundaries as determined on HW2.
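
To illustrate the note above: in the example tree, the x1 = 1.0 split separates the classes only where x2 <= 5.0 (above that line, both sides are positive), and the x2 = 5.0 split separates them only where x1 > 1.0. A minimal plotting sketch, assuming the data can be read with scipy's ARFF loader, that the first two attributes are the coordinates and the last is the class, and that `hw2-1-200.arff` is in the working directory:

```python
# Sketch: scatter the training points and overlay the two class-separating
# segments read off the example J48 tree above.  The file name and
# attribute layout are assumptions; adjust them to your data.
from scipy.io import arff
import matplotlib.pyplot as plt
import numpy as np

data, meta = arff.loadarff('hw2-1-200.arff')      # assumed file name
a1, a2, cls = meta.names()[0], meta.names()[1], meta.names()[-1]
x, y, labels = data[a1], data[a2], data[cls]

for lab in np.unique(labels):
    m = labels == lab
    plt.scatter(x[m], y[m], s=10, label=lab.decode())

# x1 = 1.0 separates positive from negative only where x2 <= 5.0
plt.plot([1.0, 1.0], [y.min(), 5.0], 'k-')
# x2 = 5.0 separates negative from positive only where x1 > 1.0
plt.plot([1.0, x.max()], [5.0, 5.0], 'k-')
plt.legend()
plt.show()
```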

- Plot of the data points for `hw2-1-200` and `hw2-2-200` with a curve showing the decision boundary computed by the neural network. To assist you with this, I have provided an additional file, `grid.arff`, which contains 10201 points on a 0.1 grid for x in [-5,5] and y in [-5,5]. To compute the decision boundary for the neural network, select this file as your "Supplied test set" in WEKA. Then, after training is complete, right-click on the last entry in the Result list and select "Visualize classifier errors". You can visualize the decision boundary by selecting "X: x (Num)" and "Y: y (Num)". All of the points in `grid.arff` are labeled Positive; incorrectly classified points are plotted by WEKA as blue squares, correctly classified points as blue x's. This will allow you to see the boundary.

  To determine the points on the boundary, however, click the "Save" button and choose a file in which to save the output. Each line of this file contains five comma-separated values: the second and third give the X and Y coordinates of the point, the fourth gives the predicted class, and the fifth gives the correct class. You should write a program (or perl script) to find pairs of consecutive lines where the predicted class changes but the X coordinate does not; these points give an approximation to the decision boundary (a sketch follows this item). You should also plot the optimal decision boundaries as determined on HW2.
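
A minimal sketch of such a script in Python rather than perl (`predictions.csv` is a placeholder name; the field layout is the one described above):

```python
# Sketch of the boundary-extraction script described above.  It scans
# WEKA's saved grid predictions for consecutive lines with the same X
# coordinate whose predicted class differs.  'predictions.csv' is a
# placeholder; X and Y are fields 2 and 3, the predicted class field 4.
boundary = []
prev = None
with open('predictions.csv') as f:
    for line in f:
        fields = line.strip().split(',')
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        x, y, pred = fields[1], float(fields[2]), fields[3]
        if prev is not None and prev[0] == x and prev[2] != pred:
            # the boundary crosses between (x, prev_y) and (x, y)
            boundary.append((float(x), (prev[1] + y) / 2.0))
        prev = (x, y, pred)

for bx, by in boundary:
    print(bx, by)
```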
- Plot of the data points for `hw2-1-200` and `hw2-2-200` with a curve showing the decision boundary computed by the IBk (first-nearest-neighbor) rule. As with the neural network, you will need to use `grid.arff` to determine the decision boundary (see the sketch after this item for a way to cross-check it). Again, you should plot the optimal decision boundaries as determined on HW2.
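
Because the first-nearest-neighbor rule is easy to compute directly, you can also cross-check WEKA's grid predictions with a few lines of Python. A sketch, assuming the same file and attribute layout as before:

```python
# Sketch: label each grid point with a 1-NN rule computed directly in
# Python, as a cross-check on WEKA's IBk output.  File names and
# attribute layout are assumptions.
from scipy.io import arff
import numpy as np

train, meta = arff.loadarff('hw2-1-200.arff')     # assumed file name
names = meta.names()
X = np.column_stack([train[names[0]], train[names[1]]]).astype(float)
labels = train[names[-1]]

grid, gmeta = arff.loadarff('grid.arff')
gnames = gmeta.names()
G = np.column_stack([grid[gnames[0]], grid[gnames[1]]]).astype(float)

# squared Euclidean distance from every grid point to every training point
d2 = ((G[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
pred = labels[d2.argmin(axis=1)]

# The same consecutive-line scan used for the neural network then
# recovers the boundary from (G, pred).
```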
- The results of additional experiments with neural networks. Specifically, use the data set `hw2-1-50` and repeat the neural network training with 3 different random seeds of your own choosing. You can set the random seed on the parameter panel for the neural network algorithm. Report the error rate on the test set for each of these three random seeds. Now, on the same data set, change the number of hidden units to 40 by setting hiddenLayers to 40. Train this network with four different random seeds and report the error rate for each.

  An interesting thing to do is to visualize the misclassification errors of each network and compare them. You may also want to visualize the decision boundaries of each network using `grid.arff` inside WEKA. (You do not need to turn in any additional graphs of these boundaries.) Another interesting thing to do is to train the network for a longer period of time by increasing trainingTime. What happens to the weight values of the network as you train longer?
- The results of additional experiments with IBk. Specifically, use the data set `hw2-1-50` and repeat IBk training with the KNN parameter set to 3, 5, and 9. Report the error rate on the test set for each of these settings. Again, an interesting thing to do is to visualize the decision boundaries for these different KNN settings; you should see that the boundary becomes smoother as you increase K.

- A table in the following format: