AI

Decision Tree

Gini

$$\text{Gini}(S) = 1 - \sum_{i=1}^{K} p_i^2$$
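
As a quick check on the formula, here is a minimal Python sketch. The function name `gini` and the two sample label lists are illustrative assumptions, not part of the original notes.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_i p_i^2."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumed convention: an empty set counts as pure
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

pure = ["yes"] * 4                  # one class only
mixed = ["yes", "yes", "no", "no"]  # two classes, evenly split

print(gini(pure))   # 0.0
print(gini(mixed))  # 0.5
```

A pure node scores 0, and a two-class node reaches its maximum of 0.5 when the classes are evenly split.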

Gini Split

$$\text{Gini}_{\text{split}} =\sum_{j=1}^{m}\frac{|S_j|}{|S|}\text{Gini}(S_j)$$
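
The weighted sum can be sketched in Python as follows, using the Car Type and Class columns of the Buy Cars table further down in these notes. The helper names `gini` and `gini_split` are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(values, labels):
    """Weighted Gini after partitioning the labels by attribute value."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    return sum(len(g) / n * gini(g) for g in groups.values())

# Car Type and Class columns of the Buy Cars training table (IDs 1 to 6).
car_type = ["Family", "Sports", "Sports", "Family", "Truck", "Family"]
car_class = ["High", "High", "High", "Low", "Low", "High"]

print(round(gini_split(car_type, car_class), 4))  # 0.2222
```

Only the Family group is impure (two High, one Low, Gini 4/9); it holds 3 of the 6 rows, so the split scores 3/6 × 4/9 ≈ 0.2222.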

Best split

$$\text{Best Split}=\arg\min \sum_{j=1}^{m} \frac{|S_j|}{|S|} \text{Gini}(S_j)$$
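
One way to read the arg min is as a loop over candidate attributes that keeps the one with the smallest weighted Gini. The sketch below tries this on the driver-risk training examples listed later in these notes; the helper names and the whitespace-separated data layout are assumptions.

```python
from collections import Counter, defaultdict

COLS = ["time", "gender", "area", "risk"]
DATA = """
1-2 m urban low
2-7 m rural high
>7 f rural low
1-2 f rural high
>7 m rural high
1-2 m rural high
2-7 f urban low
2-7 m urban low
"""
rows = [dict(zip(COLS, line.split())) for line in DATA.strip().splitlines()]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(rows, attr, target="risk"):
    """Weighted Gini of the partition induced by one attribute."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[attr]].append(r[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())

scores = {a: gini_split(rows, a) for a in COLS[:-1]}
best = min(scores, key=scores.get)  # arg min over the candidate attributes

print({a: round(s, 3) for a, s in scores.items()})
# {'time': 0.458, 'gender': 0.467, 'area': 0.2}
print(best)  # area
```

Here `area` wins because every urban example is low risk, which leaves only the rural group impure.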

Entropy

$$H(S) = - \sum_{i=1}^{K} p_i \log_2 p_i$$
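
A minimal Python sketch of the entropy formula, evaluated on the class distribution of the Play Tennis table below (9 Yes, 5 No). The function name `entropy` is an assumption.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits: -sum_i p_i * log2(p_i)."""
    n = len(labels)
    # Counter only holds classes that occur, so log2(0) never arises.
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

play = ["Yes"] * 9 + ["No"] * 5
print(round(entropy(play), 3))  # 0.94
```

For two classes the entropy peaks at 1 bit when the split is 50/50, so 0.94 indicates a fairly mixed set.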

Conditional Entropy

$$H(S \mid X) = \sum_{j=1}^{m} \frac{|S_j|}{|S|}H(S_j)$$
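
To make the weighting concrete, the sketch below computes the entropy of Play given Outlook for the Play Tennis table later in these notes. The function names are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_entropy(values, labels):
    """H(S | X): entropy of each subset S_j, weighted by |S_j| / |S|."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    return sum(len(g) / n * entropy(g) for g in groups.values())

# Outlook and Play columns of the Play Tennis table, in row order.
outlook = ["Sunny", "Sunny", "Overcast", "Rainy", "Rainy", "Rainy", "Overcast",
           "Sunny", "Sunny", "Rainy", "Sunny", "Overcast", "Overcast", "Rainy"]
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

print(round(conditional_entropy(outlook, play), 3))  # 0.694
```

The Overcast subset is pure and contributes nothing; the Sunny and Rainy subsets (5 rows each, split 2/3) contribute about 0.347 each.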

Information Gain

$$\text{Gain}(S, X) = H(S) - \sum_{j=1}^{m} \frac{|S_j|}{|S|} H(S_j)$$
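
The gain is the drop from $H(S)$ to $H(S \mid X)$. A small Python sketch, assuming the Play Tennis rows from later in these notes are stored as whitespace-separated text, compares all four attributes; the helper names are illustrative.

```python
import math
from collections import Counter

COLS = ["Outlook", "Temp", "Humidity", "Windy", "Play"]
DATA = """
Sunny Hot High False No
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
"""
rows = [dict(zip(COLS, line.split())) for line in DATA.strip().splitlines()]

def entropy(rows, target="Play"):
    n = len(rows)
    counts = Counter(r[target] for r in rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain(rows, attr):
    """Gain(S, X) = H(S) - sum_j |S_j|/|S| * H(S_j)."""
    n = len(rows)
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(rows) - remainder

gains = {a: round(gain(rows, a), 3) for a in COLS[:-1]}
print(gains)
# {'Outlook': 0.247, 'Temp': 0.029, 'Humidity': 0.152, 'Windy': 0.048}
```

Outlook has the largest gain, so a gain-based learner would test it first.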

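| Name | Hair | Height | Weight | Lotion used? | Result |
|---|---|---|---|---|---|
| Sarah | Blonde | Average | Light | No | Sunburned |
| Dana | Blonde | Tall | Average | | None |
| Alex | Brown | Short | Average | | None |
| Annie | Blonde | Short | Average | No | Sunburned |
| Emilie | Red | Average | Heavy | No | Sunburned |
| Peter | Brown | Tall | Heavy | No | None |
| John | Brown | Average | Heavy | No | None |
| Kartie | Blonde | Short | Light | | None |

Predict

| Example | Hair | Height | Weight | Lotion used? | Result |
|---|---|---|---|---|---|
| 1 | Blonde | Tall | Light | No | ? |
| 2 | Brown | Short | Heavy | | ? |
| 3 | Red | Average | Light | No | ? |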

Buy Cars

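| ID | Age | Car Type | Class |
|---|---|---|---|
| 1 | 23 | Family | High |
| 2 | 17 | Sports | High |
| 3 | 43 | Sports | High |
| 4 | 68 | Family | Low |
| 5 | 32 | Truck | Low |
| 6 | 20 | Family | High |

Prediction

| Example | Age | Car Type | Predicted Class |
|---|---|---|---|
| 1 | 25 | Family | ? |
| 2 | 70 | Family | ? |
| 3 | 30 | Truck | ? |
| 4 | 40 | Sports | ? |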
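
Age is numeric, so a split on it needs a threshold. A common convention, assumed here rather than stated in the notes, is to try the midpoint between each pair of adjacent sorted values and keep the one with the lowest weighted Gini. The function name `best_threshold` is illustrative.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted Gini) for the best split `value <= t`."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold fits between two equal values
        t = (lo + hi) / 2  # midpoint candidate (assumed convention)
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Age and Class columns of the Buy Cars training table (IDs 1 to 6).
ages = [23, 17, 43, 68, 32, 20]
classes = ["High", "High", "High", "Low", "Low", "High"]

t, score = best_threshold(ages, classes)
print(t, round(score, 4))  # 27.5 0.2222
```

On this data the split `Age <= 27.5` isolates three High rows and scores about 0.2222, which happens to tie with the three-way split on Car Type.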

Buy Computer

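| RID | age | income | student | credit_rating | Class: buys_computer |
|---|---|---|---|---|---|
| 1 | youth | high | no | fair | no |
| 2 | youth | high | no | excellent | no |
| 3 | middle | high | no | fair | yes |
| 4 | senior | medium | no | fair | yes |
| 5 | senior | low | yes | fair | yes |
| 6 | senior | low | yes | excellent | no |
| 7 | middle | low | yes | excellent | yes |
| 8 | youth | medium | no | fair | no |
| 9 | youth | low | yes | fair | yes |
| 10 | senior | medium | yes | fair | yes |
| 11 | youth | medium | yes | excellent | yes |
| 12 | middle | medium | no | excellent | yes |
| 13 | middle | high | yes | fair | yes |
| 14 | senior | medium | no | excellent | no |

Predict

| Example | age | income | student | credit_rating | Predicted (buys_computer) |
|---|---|---|---|---|---|
| 1 | youth | high | yes | fair | ? |
| 2 | youth | low | no | excellent | ? |
| 3 | middle | medium | no | fair | ? |
| 4 | senior | low | yes | excellent | ? |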
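
One possible way to work this exercise is an ID3-style tree that picks the attribute with the highest information gain at each node. The notes do not name an algorithm, so this is a sketch under that assumption; the function names, the nested-dict tree layout, and the majority-class fallback for unseen values are all illustrative choices.

```python
import math
from collections import Counter

COLS = ["age", "income", "student", "credit_rating", "buys_computer"]
TARGET = COLS[-1]
DATA = """
youth high no fair no
youth high no excellent no
middle high no fair yes
senior medium no fair yes
senior low yes fair yes
senior low yes excellent no
middle low yes excellent yes
youth medium no fair no
youth low yes fair yes
senior medium yes fair yes
youth medium yes excellent yes
middle medium no excellent yes
middle high yes fair yes
senior medium no excellent no
"""
rows = [dict(zip(COLS, line.split())) for line in DATA.strip().splitlines()]

def entropy(rows):
    n = len(rows)
    counts = Counter(r[TARGET] for r in rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain(rows, attr):
    n = len(rows)
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(rows) - remainder

def build(rows, attrs):
    """Return a class label (leaf) or a dict describing an internal node."""
    labels = Counter(r[TARGET] for r in rows)
    majority = labels.most_common(1)[0][0]
    if len(labels) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: gain(rows, a))
    rest = [a for a in attrs if a != best]
    children = {v: build([r for r in rows if r[best] == v], rest)
                for v in {r[best] for r in rows}}
    return {"attr": best, "children": children, "default": majority}

def predict(tree, x):
    # Walk down until a leaf; unseen values fall back to the node majority.
    while isinstance(tree, dict):
        tree = tree["children"].get(x[tree["attr"]], tree["default"])
    return tree

tree = build(rows, COLS[:-1])
queries = [
    ("youth", "high", "yes", "fair"),
    ("youth", "low", "no", "excellent"),
    ("middle", "medium", "no", "fair"),
    ("senior", "low", "yes", "excellent"),
]
preds = [predict(tree, dict(zip(COLS, q))) for q in queries]
print(tree["attr"], preds)  # age ['yes', 'no', 'yes', 'no']
```

With this sketch the root is `age`; the youth branch then splits on `student`, the senior branch on `credit_rating`, and the middle branch is a pure "yes" leaf.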

Predict the risk class of a car driver based on the following attributes:

| Attribute | Description | Values |
|---|---|---|
| time | time since obtaining a driver's license, in years | {1-2, 2-7, >7} |
| gender | gender | {male, female} |
| area | residential area | {urban, rural} |
| risk | the risk class | {low, high} |

Manually classified training examples:

| ID | time | gender | area | risk |
|---|---|---|---|---|
| 1 | 1-2 | m | urban | low |
| 2 | 2-7 | m | rural | high |
| 3 | >7 | f | rural | low |
| 4 | 1-2 | f | rural | high |
| 5 | >7 | m | rural | high |
| 6 | 1-2 | m | rural | high |
| 7 | 2-7 | f | urban | low |
| 8 | 2-7 | m | urban | low |

Loan Approval

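| Age | Job | House | Credit | Loan Approved |
|---|---|---|---|---|
| Young | False | No | Fair | No |
| Young | False | No | Good | No |
| Young | True | No | Good | Yes |
| Young | True | Yes | Fair | Yes |
| Young | False | No | Fair | No |
| Middle | False | No | Fair | No |
| Middle | False | No | Good | No |
| Middle | True | Yes | Good | Yes |
| Middle | False | Yes | Excellent | Yes |
| Middle | False | Yes | Excellent | Yes |
| Old | False | Yes | Excellent | Yes |
| Old | False | Yes | Good | Yes |
| Old | True | No | Good | Yes |
| Old | True | No | Excellent | Yes |
| Old | False | No | Fair | No |

Predict

| Age | Job | House | Credit | Loan Approved |
|---|---|---|---|---|
| Young | False | No | Good | ? |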
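
As a worked sketch for this exercise, the code below ranks the attributes by information gain on the full table, then repeats the ranking inside the House = No branch, which is where the query row falls. The helper names and the text encoding of the table are assumptions.

```python
import math
from collections import Counter

COLS = ["Age", "Job", "House", "Credit", "Approved"]
DATA = """
Young False No Fair No
Young False No Good No
Young True No Good Yes
Young True Yes Fair Yes
Young False No Fair No
Middle False No Fair No
Middle False No Good No
Middle True Yes Good Yes
Middle False Yes Excellent Yes
Middle False Yes Excellent Yes
Old False Yes Excellent Yes
Old False Yes Good Yes
Old True No Good Yes
Old True No Excellent Yes
Old False No Fair No
"""
rows = [dict(zip(COLS, line.split())) for line in DATA.strip().splitlines()]

def entropy(rows):
    n = len(rows)
    counts = Counter(r["Approved"] for r in rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain(rows, attr):
    n = len(rows)
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(rows) - remainder

root_gains = {a: round(gain(rows, a), 3) for a in COLS[:-1]}
print(root_gains)
# {'Age': 0.083, 'Job': 0.324, 'House': 0.42, 'Credit': 0.363}

# The query row has House = No, so look inside that branch next.
no_house = [r for r in rows if r["House"] == "No"]
branch_gains = {a: round(gain(no_house, a), 3) for a in ("Age", "Job", "Credit")}
print(branch_gains)
# {'Age': 0.252, 'Job': 0.918, 'Credit': 0.474}

# Outcomes of training rows that match the query on House and Job.
outcomes = {r["Approved"] for r in no_house if r["Job"] == "False"}
print(outcomes)  # {'No'}
```

Under a gain-based tree, House is tested first and Job second. Every training row with House = No and Job = False was rejected, so this sketch would predict No for the query row.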

Play Tennis

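| Outlook | Temp | Humidity | Windy | Play |
|---|---|---|---|---|
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast | Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| Rainy | Cool | Normal | True | No |
| Overcast | Cool | Normal | True | Yes |
| Sunny | Mild | High | False | No |
| Sunny | Cool | Normal | False | Yes |
| Rainy | Mild | Normal | False | Yes |
| Sunny | Mild | Normal | True | Yes |
| Overcast | Mild | High | True | Yes |
| Overcast | Hot | Normal | False | Yes |
| Rainy | Mild | High | True | No |

Predict

| Outlook | Temp | Humidity | Windy | Play |
|---|---|---|---|---|
| Sunny | Hot | Normal | True | ? |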

Buy House

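| City Size | Avg. Income | Local Investors | LOHAS Awareness | Decision |
|---|---|---|---|---|
| Big | High | Yes | High | Yes |
| Med | Med | No | Med | No |
| Small | Low | Yes | Low | No |
| Big | High | No | High | Yes |
| Small | Med | Yes | High | No |
| Med | High | Yes | Med | Yes |
| Med | Med | Yes | Med | No |
| Big | Med | No | Med | No |
| Med | High | Yes | Low | No |
| Small | High | No | High | Yes |
| Small | Med | No | High | No |
| Med | High | No | Med | No |

Predict

| City Size | Avg. Income | Local Investors | LOHAS Awareness | Decision |
|---|---|---|---|---|
| Med | Med | No | Med | ? |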