Data mining assignment using WEKA
$30-250 USD
Paid on delivery
The assignment has two parts.
Part A)
1. Assume that you have a Facebook data set covering 25784 HKUST students in total. Each student is described by 10 features (items, in association rule terminology). Assume you have calculated the counts for three of these variables:
a. likes-Harry-Potter 2324
b. music-major 1029
c. theatre-major 878
1) What would be the largest possible support for associations containing one of these three "items", two of them, and all three of them, respectively?
2) If 166 theatre majors liked Harry Potter (TM --> HP), what would be the support, confidence and lift of this rule?
The other two questions of Part A are in the attachment.
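For the two questions above, the arithmetic can be sketched as follows. This is only an illustration using the standard definitions of support, confidence and lift. The function names `rule_metrics` and `max_support` are made up for this example, and the counts are the ones quoted in the text. The upper bound in `max_support` relies on the observation that an itemset cannot be more frequent than its rarest item.

```python
# Counts taken from the assignment text.
N_STUDENTS = 25784
COUNTS = {"likes-Harry-Potter": 2324, "music-major": 1029, "theatre-major": 878}


def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Return (support, confidence, lift) for the rule antecedent --> consequent."""
    support = n_both / n_total                    # P(A and C)
    confidence = n_both / n_antecedent            # P(C | A)
    lift = confidence / (n_consequent / n_total)  # P(C | A) / P(C)
    return support, confidence, lift


def max_support(counts, n_total, size):
    """Upper bound on the support of any itemset with `size` of these items.

    An itemset cannot be more frequent than its rarest item, so the best case
    is the size-th largest individual count divided by the total.
    """
    ordered = sorted(counts.values(), reverse=True)
    return ordered[size - 1] / n_total


# Rule TM --> HP: 166 theatre majors liked Harry Potter.
support, confidence, lift = rule_metrics(
    N_STUDENTS, COUNTS["theatre-major"], COUNTS["likes-Harry-Potter"], 166
)
print(f"support={support:.4f} confidence={confidence:.4f} lift={lift:.4f}")

for size in (1, 2, 3):
    print(f"{size} item(s): max support = {max_support(COUNTS, N_STUDENTS, size):.4f}")
```

A lift above 1 would indicate that theatre majors like Harry Potter more often than students overall.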
Part B)
The second attachment contains a real hall-of-fame baseball player data set with 1340 players, 125 of whom have actually been selected into the hall of fame. Each player is described by 16 features covering their career statistics, such as number of home runs, positions, triples, etc. You will investigate whether the k-nearest-neighbor method is able to predict hall of fame selection given a player's statistics. In WEKA, the k-nearest-neighbor technique is called IBk and is found under the "Lazy" category of classification methods.
Answer the following questions:
1) Does the k-NN method work? Why?
2) Conceptually, what are the pros and cons of setting k to 1 or to a very large number?
3) Experiment with different values of k and report the CCI (correctly classified instances) you get using 10-fold cross-validation.
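As a rough illustration of the experiment in question 3, the sketch below runs a plain k-NN with 10-fold cross-validation and counts correctly classified instances. It is not WEKA's IBk and does not use the baseball data: the data come from a hypothetical generator, `make_toy_data`, the folds are not stratified, and all function names are invented for this example. The only figures taken from the text are the 1340 players and 125 hall-of-fame members used for the majority-class baseline.

```python
import random
from collections import Counter

# Majority-class baseline from the assignment text: 1340 players, 125 selected.
# A classifier that always answers "not selected" already scores this much.
baseline_accuracy = (1340 - 125) / 1340


def knn_predict(train, query, k):
    """Majority vote among the k training rows closest to `query` (Euclidean)."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validated_cci(data, k, folds=10, seed=0):
    """Count correctly classified instances over `folds`-fold cross-validation."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    correct = 0
    for i in range(folds):
        test = rows[i::folds]  # every folds-th row, starting at offset i
        train = [r for j, r in enumerate(rows) if j % folds != i]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct


def make_toy_data(n=200, positive_rate=0.1, seed=1):
    """Synthetic, imbalanced two-feature data. NOT the baseball data set."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = 1 if rng.random() < positive_rate else 0
        centre = 2.0 if label else 0.0
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return data


data = make_toy_data()
print(f"majority-class baseline on the real set: {baseline_accuracy:.1%}")
for k in (1, 5, 25):
    print(f"k={k}: CCI={cross_validated_cci(data, k)}/{len(data)}")
```

Because only 125 of 1340 players are positive, a CCI figure is worth comparing against the majority-class baseline before concluding that k-NN "works". When k is as large as the training set, every prediction collapses to the majority class.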
Project ID: #4008403
About the project
3 freelancers are bidding on this job, at an average of $45.
Is your project due in 2 days' time? I can help with the project, but that is too short a deadline for me. Perhaps I could provide assistance on how to approach the project instead?