| Height (cm) | Weight (kg) | T-shirt size |
|---|---|---|
| 158 | 58 | M |
| 158 | 59 | M |
| 158 | 63 | M |
| 160 | 59 | M |
| 160 | 60 | M |
| 163 | 60 | M |
| 163 | 61 | M |
| 160 | 64 | L |
| 163 | 64 | L |
| 165 | 61 | L |
| 165 | 62 | L |
| 165 | 65 | L |
| 168 | 62 | L |
| 168 | 63 | L |
| 168 | 66 | L |
| 170 | 63 | L |
| 170 | 64 | L |
| 170 | 68 | L |
Use KNN with K = 3 to predict the T-shirt size for the following 2 people. Use the Euclidean distance (the square root of the sum of squared differences along each dimension).
The Euclidean distance between a point $(h, w)$ and a sample $(h_i, w_i)$ is:
$$d = \sqrt{(h - h_i)^2 + (w - w_i)^2}$$
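The distance formula and the K = 3 vote can be sketched in a few lines of Python. This is a minimal illustration, not a reference solution: the names `TRAIN`, `euclidean`, and `knn_predict` are made up here, and the query point (161 cm, 61 kg) is a hypothetical person chosen only to exercise the code, not one of the two people the exercise refers to.

```python
import math
from collections import Counter

# Training data copied from the T-shirt table: (height_cm, weight_kg, size).
TRAIN = [
    (158, 58, "M"), (158, 59, "M"), (158, 63, "M"), (160, 59, "M"),
    (160, 60, "M"), (163, 60, "M"), (163, 61, "M"), (160, 64, "L"),
    (163, 64, "L"), (165, 61, "L"), (165, 62, "L"), (165, 65, "L"),
    (168, 62, "L"), (168, 63, "L"), (168, 66, "L"), (170, 63, "L"),
    (170, 64, "L"), (170, 68, "L"),
]


def euclidean(h, w, hi, wi):
    """Euclidean distance between the query (h, w) and a sample (hi, wi)."""
    return math.sqrt((h - hi) ** 2 + (w - wi) ** 2)


def knn_predict(h, w, k=3):
    """Return the majority size among the k training rows closest to (h, w)."""
    ranked = sorted(TRAIN, key=lambda row: euclidean(h, w, row[0], row[1]))
    votes = Counter(size for _, _, size in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical query (an assumption, not taken from the exercise).
print(knn_predict(161, 61))
```

Height and weight are left unscaled here because both features span a similar range in this table; with features on very different scales, normalizing first is the usual choice.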
| Height (cm) | Weight (kg) | Class |
|---|---|---|
| 167 | 51 | Underweight |
| 182 | 62 | Normal |
| 176 | 69 | Normal |
| 173 | 64 | Normal |
| 172 | 65 | Normal |
| 174 | 56 | Underweight |
| 169 | 58 | Normal |
| 173 | 57 | Normal |
| 170 | 55 | Normal |
| 170 | 57 | ? |
Consider the training examples shown in the following table for a binary classification problem. The table is a training set for predicting whether a loan applicant will repay the loan or default on it.
| Tid | Home Owner | Marital Status | Annual Income | Defaulted Borrower |
|---|---|---|---|---|
| 1 | Yes | Single | 125K | No |
| 2 | No | Married | 100K | No |
| 3 | No | Single | 70K | No |
| 4 | Yes | Married | 120K | No |
| 5 | No | Divorced | 95K | Yes |
| 6 | No | Married | 60K | No |
| 7 | Yes | Divorced | 220K | No |
| 8 | No | Single | 85K | Yes |
| 9 | No | Married | 75K | No |
| 10 | No | Single | 90K | Yes |
Using the kNN approach discussed in class, predict the class label for the following test example:
X = (Home Owner = No, Marital Status = Married, Income = $120K).
Assume that k = 3 and that the distance is the L2 norm.
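The exercise does not spell out how the L2 norm should treat the two categorical attributes, so the sketch below makes explicit assumptions that may differ from the approach covered in class: a categorical mismatch contributes 1 and a match contributes 0, and Annual Income is min-max scaled to [0, 1] so that it does not swamp the other two terms. The names `TRAIN`, `scale`, `distance`, and `knn_predict` are illustrative.

```python
import math
from collections import Counter

# Rows from the loan table: (home_owner, marital_status, income_k, defaulted).
TRAIN = [
    ("Yes", "Single", 125, "No"),
    ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"),
    ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"),
    ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"),
    ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"),
    ("No", "Single", 90, "Yes"),
]

INCOMES = [row[2] for row in TRAIN]
LO, HI = min(INCOMES), max(INCOMES)


def scale(income_k):
    """Min-max scale income using the training range (an assumption)."""
    return (income_k - LO) / (HI - LO)


def distance(a, b):
    """L2 norm over (owner mismatch, status mismatch, scaled income gap)."""
    owner = 0.0 if a[0] == b[0] else 1.0    # assumed 0/1 encoding
    status = 0.0 if a[1] == b[1] else 1.0   # assumed 0/1 encoding
    income = scale(a[2]) - scale(b[2])
    return math.sqrt(owner ** 2 + status ** 2 + income ** 2)


def knn_predict(query, k=3):
    """Majority vote on Defaulted Borrower among the k nearest rows."""
    ranked = sorted(TRAIN, key=lambda row: distance(query, row))
    votes = Counter(row[3] for row in ranked[:k])
    return votes.most_common(1)[0][0]


print(knn_predict(("No", "Married", 120)))
```

Under these assumptions the three nearest rows are Tid 2, 9, and 6, all non-home-owning married applicants, so the vote is unanimous. A different encoding of the categorical attributes can change which rows are nearest.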
| ID | Speed | Weight | Qualified |
|---|---|---|---|
| 1 | 2.50 | 600 | no |
| 2 | 3.75 | 800 | no |
| 3 | 2.25 | 550 | no |
| 4 | 3.25 | 825 | no |
| 5 | 2.75 | 750 | no |
| 6 | 4.50 | 500 | no |
| 7 | 3.50 | 525 | no |
| 8 | 3.00 | 325 | no |
| 9 | 4.00 | 400 | no |
| 10 | 4.25 | 375 | no |
| 11 | 2.00 | 200 | no |
| 12 | 5.00 | 250 | no |
| 13 | 8.25 | 850 | no |
| 14 | 5.75 | 875 | yes |
| 15 | 4.75 | 625 | yes |
| 16 | 5.50 | 675 | yes |
| 17 | 5.25 | 950 | yes |
| 18 | 7.00 | 425 | yes |
| 19 | 7.50 | 800 | yes |
| 20 | 7.25 | 575 | yes |
Test instance: X = (Speed = 5.20, Weight = 500). Apply min-max normalization to both features, then run KNN with k = 3 and k = 5. Distance metric: Euclidean (L2 norm).
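One way to carry out the normalize-then-vote procedure is sketched below. It assumes the min and max of each feature are taken from the 20 training rows only and that the test instance is scaled with those same values; the helper names `min_max` and `knn_predict` are illustrative.

```python
import math
from collections import Counter

# Rows from the table: (speed, weight, qualified).
TRAIN = [
    (2.50, 600, "no"), (3.75, 800, "no"), (2.25, 550, "no"), (3.25, 825, "no"),
    (2.75, 750, "no"), (4.50, 500, "no"), (3.50, 525, "no"), (3.00, 325, "no"),
    (4.00, 400, "no"), (4.25, 375, "no"), (2.00, 200, "no"), (5.00, 250, "no"),
    (8.25, 850, "no"), (5.75, 875, "yes"), (4.75, 625, "yes"), (5.50, 675, "yes"),
    (5.25, 950, "yes"), (7.00, 425, "yes"), (7.50, 800, "yes"), (7.25, 575, "yes"),
]


def min_max(values):
    """Return a function mapping v to (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return lambda v: (v - lo) / (hi - lo)


# Scalers fitted on the training rows only (an assumption).
scale_speed = min_max([row[0] for row in TRAIN])
scale_weight = min_max([row[1] for row in TRAIN])


def knn_predict(speed, weight, k):
    """Majority vote among the k nearest rows in normalized feature space."""
    qs, qw = scale_speed(speed), scale_weight(weight)

    def dist(row):
        return math.hypot(scale_speed(row[0]) - qs, scale_weight(row[1]) - qw)

    ranked = sorted(TRAIN, key=dist)
    votes = Counter(row[2] for row in ranked[:k])
    return votes.most_common(1)[0][0]


for k in (3, 5):
    print(k, knn_predict(5.20, 500, k))
```

With this table the test instance maps to (0.512, 0.4) after scaling. The nearest rows are ID 6, 15, and 10 for k = 3, joined by ID 9 and 16 for k = 5, so the vote is 2 to 1 and then 3 to 2 in favor of "no".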
| ID | Height | Age | Weight |
|---|---|---|---|
| 1 | 5 | 45 | 77 |
| 2 | 5.11 | 26 | 47 |
| 3 | 5.6 | 30 | 55 |
| 4 | 5.9 | 34 | 59 |
| 5 | 4.8 | 40 | 72 |
| 6 | 5.8 | 36 | 60 |
| 7 | 5.3 | 19 | 40 |
| 8 | 5.8 | 28 | 60 |
| 9 | 5.5 | 23 | 45 |
| 10 | 5.6 | 32 | 58 |
| 11 | 5.5 | 38 | ? |
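Because the unknown value for ID 11 is numeric, this table reads as a KNN regression task: the prediction is the mean weight of the k nearest rows. The exercise states neither k nor a scaling step, so the sketch below assumes k = 3 and plain Euclidean distance on the raw Height and Age columns; `knn_regress` is an illustrative name.

```python
import math

# Rows from the table: (height, age, weight); ID 11 is the query.
TRAIN = [
    (5.0, 45, 77), (5.11, 26, 47), (5.6, 30, 55), (5.9, 34, 59), (4.8, 40, 72),
    (5.8, 36, 60), (5.3, 19, 40), (5.8, 28, 60), (5.5, 23, 45), (5.6, 32, 58),
]


def knn_regress(height, age, k=3):
    """Predict weight as the mean weight of the k nearest training rows."""
    ranked = sorted(
        TRAIN, key=lambda row: math.hypot(row[0] - height, row[1] - age)
    )
    nearest = ranked[:k]
    return sum(row[2] for row in nearest) / k


# Query row ID 11: Height = 5.5, Age = 38; k = 3 is an assumption.
print(round(knn_regress(5.5, 38), 2))
```

Under these assumptions the nearest rows are ID 6, 5, and 4, giving (60 + 72 + 59) / 3 ≈ 63.67. Age spans a much wider range than Height, so it dominates the unscaled distance; normalizing both columns first could select different neighbors.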