Machine Learning - Lecture 6: Knowledge test

Chris Thornton


Question

Newspapers sometimes rank universities in terms of numbers of applicants. What is the explicit structure of the data? Suggest some possible forms of implicit structure.

Question

What is the difference between Euclidean and city-block distance? How can we choose between them in a particular application?

Question

What benefits might be obtained by normalizing the values of a discrete variable? How could the normalization be accomplished?

Question

What is the difference between a conditioning and a conditioned value in a defined probability?

Question

Where we have just one class, and one attribute variable, we can work out all conditional probabilities directly from the dataset. Why is this more difficult with more than one attribute?

Question

In a visual display of k-means clustering (with data based on two numeric variables), it appears as if the centroids repel each other. Explain this effect.

Question

How can we obtain predictions from a set of centroids that have been obtained using the k-means algorithm? Specify a reasonable decision rule.

Question

Let's say we use k-means for predicting classifications, but find that with only n means in play, predictions are often wrong. Are we bound to improve prediction performance if we add one more mean (i.e., one more centroid)?

Question

The following data describe individuals in terms of occupation, symptom and ailment.

  SYMPTOM       OCCUPATION    AILMENT
  sneezing      nurse         flu
  sneezing      farmer        hayfever
  headache      builder       concussion
  headache      nurse         flu
  sneezing      teacher       flu
  headache      teacher       concussion
Work out the probability of one of these indviduals being a nurse.

Work out the conditional probability of someone having concussion given they're a builder, and then the conditional probability of a concussed builder having a headache.

Use Bayes rule to get the probability that a sneezing builder has flu. Use the NBC to predict his/her probable ailment.