Wednesday, 30 May 2018

SVM Classification using WEKA Java Code


In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples. The vectors (cases) that define the hyperplane are the support vectors. SVM uses a technique called the kernel trick to transform the data and then, based on these transformations, finds an optimal boundary between the possible outputs.


Requirements:
==========
2 Jar Files

2 Datasets

How to Implement:
==============

ClevelandHeartDiseaseTrainingDataset.arff contains a set of patient health records. It has 5 attributes and 1 class attribute:
  1) sex: patient sex (1 = male, 0 = female)
  2) cp: chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic)
  3) slope: the slope of the peak exercise ST segment (1 = upsloping, 2 = flat, 3 = downsloping)
  4) ca: number of major vessels (0-3) colored by fluoroscopy
  5) thal: (3 = normal, 6 = fixed defect, 7 = reversible defect)
  6) class: (0 = no heart disease, 1 = presence of heart disease)
  
The classification algorithm first learns from this training dataset; after training, it generates a classification model (a set of classification rules). It then loads ClevelandHeartDiseaseTestingDataset.arff. The testing dataset contains the same 5 attributes plus the class attribute, whose values are all question marks (?), because the task is to predict whether or not each testing record indicates the presence of heart disease. The algorithm then predicts each record's class value using the rules generated during the training process.
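A minimal sketch of what SVMClassification.java might look like is given below, using WEKA's SMO class as the SVM learner. The dataset and output file names are taken from this post; everything else is an assumption rather than the author's exact implementation.

import java.io.FileWriter;
import java.io.PrintWriter;

import weka.classifiers.functions.SMO;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SVMClassification {
    public static void main(String[] args) throws Exception {
        // Load the training dataset and mark the last attribute as the class.
        Instances train = DataSource.read("ClevelandHeartDiseaseTrainingDataset.arff");
        train.setClassIndex(train.numAttributes() - 1);

        // Build WEKA's SMO classifier (its SVM implementation).
        SMO svm = new SMO();
        svm.buildClassifier(train);

        // Load the testing dataset, whose class values are all '?'.
        Instances test = DataSource.read("ClevelandHeartDiseaseTestingDataset.arff");
        test.setClassIndex(test.numAttributes() - 1);

        // Predict the class of each testing record and fill it in.
        for (int i = 0; i < test.numInstances(); i++) {
            Instance inst = test.instance(i);
            inst.setClassValue(svm.classifyInstance(inst));
        }

        // Write the labelled testing dataset to SVMOutput.txt.
        try (PrintWriter out = new PrintWriter(new FileWriter("SVMOutput.txt"))) {
            out.println(test);
        }
    }
}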


How to Run this Code in Command Prompt:
===================================

>set classpath=%classpath%;weka-3.7.1-beta.jar;

>set classpath=%classpath%;weka-3.7.3.jar;

>javac SVMClassification.java

>java SVMClassification

Output: SVMOutput.txt


SVM Classification Output

@relation ClevelandHeartDiseaseTestingDataset

@attribute sex {0,1}
@attribute cp {1,2,3,4}
@attribute slope {1,2,3}
@attribute ca {0,1,2,3}
@attribute thal {3,6,7}
@attribute class {0,1}

@data
1,1,2,0,3,1
0,1,1,0,3,1
1,2,2,0,3,1
1,3,1,2,7,0


Tuesday, 29 May 2018

Naive Bayes Classification using WEKA Java Code

Naive Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set. It is not a single algorithm for training such classifiers, but a family of algorithms based on a common principle: all naive Bayes classifiers assume that the value of a particular feature is independent of the value of any other feature, given the class variable. For example, a fruit may be considered to be an apple if it is red, round, and about 10 cm in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this fruit is an apple, regardless of any possible correlations between the color, roundness, and diameter features.


Requirements:
==========
2 Jar Files

2 Datasets


How to Implement:
==============

ClevelandHeartDiseaseTrainingDataset.arff contains a set of patient health records. It has 5 attributes and 1 class attribute:
  1) sex: patient sex (1 = male, 0 = female)
  2) cp: chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic)
  3) slope: the slope of the peak exercise ST segment (1 = upsloping, 2 = flat, 3 = downsloping)
  4) ca: number of major vessels (0-3) colored by fluoroscopy
  5) thal: (3 = normal, 6 = fixed defect, 7 = reversible defect)
  6) class: (0 = no heart disease, 1 = presence of heart disease)
  
The classification algorithm first learns from this training dataset; after training, it generates a classification model (a set of classification rules). It then loads ClevelandHeartDiseaseTestingDataset.arff. The testing dataset contains the same 5 attributes plus the class attribute, whose values are all question marks (?), because the task is to predict whether or not each testing record indicates the presence of heart disease. The algorithm then predicts each record's class value using the rules generated during the training process.
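A minimal sketch of what NaiveBayesClassification.java might look like is given below, assuming WEKA's NaiveBayes class. Since this post names no output file, the sketch simply prints the labelled testing dataset to standard output; everything beyond the dataset file names is an assumption.

import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class NaiveBayesClassification {
    public static void main(String[] args) throws Exception {
        // Load the training dataset; the last attribute is the class.
        Instances train = DataSource.read("ClevelandHeartDiseaseTrainingDataset.arff");
        train.setClassIndex(train.numAttributes() - 1);

        // Build the Naive Bayes model from the training data.
        NaiveBayes nb = new NaiveBayes();
        nb.buildClassifier(train);

        // Load the testing dataset, whose class values are all '?'.
        Instances test = DataSource.read("ClevelandHeartDiseaseTestingDataset.arff");
        test.setClassIndex(test.numAttributes() - 1);

        // Replace each '?' with the predicted class value.
        for (int i = 0; i < test.numInstances(); i++) {
            Instance inst = test.instance(i);
            inst.setClassValue(nb.classifyInstance(inst));
        }

        // Print the labelled testing dataset in ARFF form.
        System.out.println(test);
    }
}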

How to Run this Code in Command Prompt:
===================================

>set classpath=%classpath%;weka-3.7.1-beta.jar;

>set classpath=%classpath%;weka-3.7.3.jar;

>javac NaiveBayesClassification.java

>java NaiveBayesClassification



Naive Bayes Classification Output


@relation ClevelandHeartDiseaseTestingDataset

@attribute sex {0,1}
@attribute cp {1,2,3,4}
@attribute slope {1,2,3}
@attribute ca {0,1,2,3}
@attribute thal {3,6,7}
@attribute class {0,1}

@data
1,1,2,0,3,1
0,1,1,0,3,1
1,2,2,0,3,1
1,3,1,2,7,0

Monday, 28 May 2018

MLR Classification Output

@relation ClevelandHeartDiseaseData

@attribute sex {0,1}
@attribute cp {1,2,3,4}
@attribute slope {1,2,3}
@attribute ca {0,1,2,3}
@attribute thal {3,6,7}
@attribute class {0,1}

@data
1,1,2,0,3,1
0,1,1,0,3,1
1,2,2,0,3,1
1,3,1,2,7,0

MLR Classification using WEKA Java Code

MLR is short for Multinomial Logistic Regression. Multinomial logistic regression is known by a variety of other names, including softmax regression, multiclass LR, multinomial logit, polytomous LR, conditional maximum entropy model and maximum entropy (MaxEnt) classifier. It is a classification method that generalizes logistic regression to multiclass problems, i.e. problems with more than two possible discrete outcomes. That is, it is a model used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.).

Requirements:
=============
2 Jar Files

2 Datasets

How to Implement:
=================

ClevelandHeartDiseaseTrainingDataset.arff contains a set of patient health records. It has 5 attributes and 1 class attribute:
  1) sex: patient sex (1 = male, 0 = female)
  2) cp: chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic)
  3) slope: the slope of the peak exercise ST segment (1 = upsloping, 2 = flat, 3 = downsloping)
  4) ca: number of major vessels (0-3) colored by fluoroscopy
  5) thal: (3 = normal, 6 = fixed defect, 7 = reversible defect)
  6) class: (0 = no heart disease, 1 = presence of heart disease)
  
The classification algorithm first learns from this training dataset; after training, it generates a classification model (a set of classification rules). It then loads ClevelandHeartDiseaseTestingDataset.arff. The testing dataset contains the same 5 attributes plus the class attribute, whose values are all question marks (?), because the task is to predict whether or not each testing record indicates the presence of heart disease. The algorithm then predicts each record's class value using the rules generated during the training process.
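A minimal sketch of what MLRClassification.java might look like is given below, assuming WEKA's Logistic class (its multinomial logistic regression model with a ridge estimator) and the file names mentioned in this post; the author's actual code may differ.

import java.io.FileWriter;
import java.io.PrintWriter;

import weka.classifiers.functions.Logistic;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class MLRClassification {
    public static void main(String[] args) throws Exception {
        // Load the training dataset; the last attribute is the class.
        Instances train = DataSource.read("ClevelandHeartDiseaseTrainingDataset.arff");
        train.setClassIndex(train.numAttributes() - 1);

        // Build a multinomial logistic regression model on the training data.
        Logistic mlr = new Logistic();
        mlr.buildClassifier(train);

        // Load the testing dataset and predict each unknown class value.
        Instances test = DataSource.read("ClevelandHeartDiseaseTestingDataset.arff");
        test.setClassIndex(test.numAttributes() - 1);
        for (int i = 0; i < test.numInstances(); i++) {
            Instance inst = test.instance(i);
            inst.setClassValue(mlr.classifyInstance(inst));
        }

        // Save the labelled testing dataset to MLROutput.txt.
        try (PrintWriter out = new PrintWriter(new FileWriter("MLROutput.txt"))) {
            out.println(test);
        }
    }
}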

How to Run this Code in Command Prompt:
===============================

>set classpath=%classpath%;weka-3.7.1-beta.jar;

>set classpath=%classpath%;weka-3.7.3.jar;

>javac MLRClassification.java

>java MLRClassification

Output: MLROutput.txt


Saturday, 26 May 2018

MLP Classification Output

@relation ClevelandHeartDiseaseTestingDataset

@attribute sex {0,1}
@attribute cp {1,2,3,4}
@attribute slope {1,2,3}
@attribute ca {0,1,2,3}
@attribute thal {3,6,7}
@attribute class {0,1}

@data
1,1,2,0,3,1
0,1,1,0,3,1
1,2,2,0,3,1
1,3,1,2,7,0

MLP Classification using WEKA Java Code

MLP is short for Multilayer Perceptron. The multilayer perceptron is one of the best-known and most frequently used types of neural network. In most cases, the signals are transmitted within the network in one direction, from input to output; there is no loop, and the output of each neuron does not affect the neuron itself. Trained with the backpropagation algorithm, multilayer perceptrons can be used for a wide range of applications, from function approximation to prediction in various fields, such as estimating the load of a computing system or modelling the evolution of chemical polymerization reactions described by complex systems of differential equations.


Requirements:
===========
2 Jar Files


2 Datasets


How to Implement:
==============
ClevelandHeartDiseaseTrainingDataset.arff contains a set of patient health records. It has 5 attributes and 1 class attribute:
  1) sex: patient sex (1 = male, 0 = female)
  2) cp: chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic)
  3) slope: the slope of the peak exercise ST segment (1 = upsloping, 2 = flat, 3 = downsloping)
  4) ca: number of major vessels (0-3) colored by fluoroscopy
  5) thal: (3 = normal, 6 = fixed defect, 7 = reversible defect)
  6) class: (0 = no heart disease, 1 = presence of heart disease)
  
  
The classification algorithm first learns from this training dataset; after training, it generates a classification model (a set of classification rules). It then loads ClevelandHeartDiseaseTestingDataset.arff. The testing dataset contains the same 5 attributes plus the class attribute, whose values are all question marks (?), because the task is to predict whether or not each testing record indicates the presence of heart disease. The algorithm then predicts each record's class value using the rules generated during the training process.
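A minimal sketch of what MLPClassification.java might look like is given below, assuming WEKA's MultilayerPerceptron class and the file names mentioned in this post; everything else is an assumption rather than the author's exact code.

import java.io.FileWriter;
import java.io.PrintWriter;

import weka.classifiers.functions.MultilayerPerceptron;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class MLPClassification {
    public static void main(String[] args) throws Exception {
        // Load the training dataset; the last attribute is the class.
        Instances train = DataSource.read("ClevelandHeartDiseaseTrainingDataset.arff");
        train.setClassIndex(train.numAttributes() - 1);

        // Train a multilayer perceptron (backpropagation) on the training data.
        MultilayerPerceptron mlp = new MultilayerPerceptron();
        mlp.buildClassifier(train);

        // Load the testing dataset and predict each unknown class value.
        Instances test = DataSource.read("ClevelandHeartDiseaseTestingDataset.arff");
        test.setClassIndex(test.numAttributes() - 1);
        for (int i = 0; i < test.numInstances(); i++) {
            Instance inst = test.instance(i);
            inst.setClassValue(mlp.classifyInstance(inst));
        }

        // Save the labelled testing dataset to MLPOutput.txt.
        try (PrintWriter out = new PrintWriter(new FileWriter("MLPOutput.txt"))) {
            out.println(test);
        }
    }
}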


How to Run this Code in Command Prompt:
================================

>set classpath=%classpath%;weka-3.7.1-beta.jar;

>set classpath=%classpath%;weka-3.7.3.jar;

>javac MLPClassification.java

>java MLPClassification


Output: MLPOutput.txt


Friday, 25 May 2018

FURIA Classification Output

@relation ClevelandHeartDiseaseTestingDataset

@attribute sex {0,1}
@attribute cp {1,2,3,4}
@attribute slope {1,2,3}
@attribute ca {0,1,2,3}
@attribute thal {3,6,7}
@attribute class {0,1}

@data
1,1,2,0,3,1
0,1,1,0,3,0
1,2,2,0,3,1
1,3,1,2,7,0

FURIA Classification using WEKA Java Code

FURIA is short for Fuzzy Unordered Rule Induction Algorithm. It is an extension of the existing RIPPER algorithm and produces simple and comprehensible rule sets. Instead of conventional rules and ordered rule lists, FURIA learns fuzzy rules and unordered rule sets, and it includes an efficient rule-stretching method to deal with uncovered examples. Compared with RIPPER, C4.5 and other classifiers, FURIA has been reported to give better classification results; in other words, its classification accuracy is very high compared with the others.

Requirements:
===========
2 Jar Files

2 Datasets

How to Implement:
==============

ClevelandHeartDiseaseTrainingDataset.arff contains a set of patient health records. It has 5 attributes and 1 class attribute:
  1) sex: patient sex (1 = male, 0 = female)
  2) cp: chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic)
  3) slope: the slope of the peak exercise ST segment (1 = upsloping, 2 = flat, 3 = downsloping)
  4) ca: number of major vessels (0-3) colored by fluoroscopy
  5) thal: (3 = normal, 6 = fixed defect, 7 = reversible defect)
  6) class: (0 = no heart disease, 1 = presence of heart disease)
  
The classification algorithm first learns from this training dataset; after training, it generates a classification model (a set of fuzzy rules). It then loads ClevelandHeartDiseaseTestingDataset.arff. The testing dataset contains the same 5 attributes plus the class attribute, whose values are all question marks (?), because the task is to predict whether or not each testing record indicates the presence of heart disease. The algorithm then predicts each record's class value using the rules generated during the training process.
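A minimal sketch of what FURIAClassification.java might look like is given below, assuming the FURIA implementation class weka.classifiers.rules.FURIA is on the classpath (it is distributed as a separate WEKA package jar, presumably one of the two jar files this post requires). Since no output file is named here, the sketch prints the labelled testing dataset to standard output; the author's actual code may differ.

import weka.classifiers.rules.FURIA;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class FURIAClassification {
    public static void main(String[] args) throws Exception {
        // Load the training dataset; the last attribute is the class.
        Instances train = DataSource.read("ClevelandHeartDiseaseTrainingDataset.arff");
        train.setClassIndex(train.numAttributes() - 1);

        // Learn the fuzzy unordered rule set from the training data.
        FURIA furia = new FURIA();
        furia.buildClassifier(train);

        // Load the testing dataset and predict each unknown class value.
        Instances test = DataSource.read("ClevelandHeartDiseaseTestingDataset.arff");
        test.setClassIndex(test.numAttributes() - 1);
        for (int i = 0; i < test.numInstances(); i++) {
            Instance inst = test.instance(i);
            inst.setClassValue(furia.classifyInstance(inst));
        }

        // Print the learned rules and the labelled testing dataset.
        System.out.println(furia);
        System.out.println(test);
    }
}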


How to Run this Code in Command Prompt:
=================================

>set classpath=%classpath%;weka-3.7.1-beta.jar;

>set classpath=%classpath%;weka-3.7.3.jar;

>javac FURIAClassification.java

>java FURIAClassification

Thursday, 24 May 2018

C4.5 Classification Output

@relation ClevelandHeartDiseaseTestingDataset

@attribute sex {0,1}
@attribute cp {1,2,3,4}
@attribute slope {1,2,3}
@attribute ca {0,1,2,3}
@attribute thal {3,6,7}
@attribute class {0,1}

@data
1,1,2,0,3,1
0,1,1,0,3,1
1,2,2,0,3,0
1,3,1,2,7,0

Cleveland Heart Disease Testing Dataset

@relation 'ClevelandHeartDiseaseTestingDataset'

@attribute sex {0,1}
@attribute cp {1,2,3,4}
@attribute slope {1,2,3}
@attribute ca {0,1,2,3}
@attribute thal {3,6,7}
@attribute class {0,1}

@data
1,1,2,0,3,?
0,1,1,0,3,?
1,2,2,0,3,?

Cleveland Heart Disease Training Dataset

@relation 'ClevelandHeartDiseaseTrainingDataset'

@attribute sex {0,1}
@attribute cp {1,2,3,4}
@attribute slope {1,2,3}
@attribute ca {0,1,2,3}
@attribute thal {3,6,7}
@attribute class {0,1}

@data
1,1,3,0,6,1
1,4,2,3,3,1
1,4,2,2,7,0
1,3,3,0,3,1

C4.5 Classification using WEKA Java Code

The C4.5 algorithm is an improved version of the ID3 algorithm, designed by Ross Quinlan in 1993. Its implementation in WEKA is called J48. C4.5 has several advantages over ID3:

1) C4.5 can handle both continuous and discrete (categorical) attributes.

2) C4.5 was one of the first decision tree algorithms able to handle missing values. As Quinlan, the author of the algorithm, explains, missing attribute values are simply not used in the gain and entropy calculations.

3) C4.5 performs tree pruning by going back through the tree after it is built, attempting to remove branches that do not help by replacing internal nodes with leaf nodes.

Requirements:
=============
2 Jar Files
2 Datasets


ClevelandHeartDiseaseTrainingDataset.arff contains a set of patient health records. It has 5 attributes and 1 class attribute:
1) sex: patient sex (1 = male, 0 = female)
2) cp: chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic)
3) slope: the slope of the peak exercise ST segment (1 = upsloping, 2 = flat, 3 = downsloping)
4) ca: number of major vessels (0-3) colored by fluoroscopy
5) thal: (3 = normal, 6 = fixed defect, 7 = reversible defect)
6) class: (0 = no heart disease, 1 = presence of heart disease)

This classification algorithm first learns from the training dataset and generates classification rules. It then loads ClevelandHeartDiseaseTestingDataset.arff. The testing dataset contains the same 5 attributes plus the class attribute, whose values are all question marks (?), because the task is to predict whether or not each testing record indicates the presence of heart disease. The algorithm then predicts each record's class value using the rules generated during the training process.
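A minimal sketch of what C45Classification.java might look like is given below, assuming WEKA's J48 class (its C4.5 implementation) and the file names mentioned in this post; everything else is an assumption rather than the author's exact code.

import java.io.FileWriter;
import java.io.PrintWriter;

import weka.classifiers.trees.J48;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class C45Classification {
    public static void main(String[] args) throws Exception {
        // Load the training dataset; the last attribute is the class.
        Instances train = DataSource.read("ClevelandHeartDiseaseTrainingDataset.arff");
        train.setClassIndex(train.numAttributes() - 1);

        // Build the C4.5 (J48) decision tree from the training data.
        J48 tree = new J48();
        tree.buildClassifier(train);

        // Load the testing dataset and predict each unknown class value.
        Instances test = DataSource.read("ClevelandHeartDiseaseTestingDataset.arff");
        test.setClassIndex(test.numAttributes() - 1);
        for (int i = 0; i < test.numInstances(); i++) {
            Instance inst = test.instance(i);
            inst.setClassValue(tree.classifyInstance(inst));
        }

        // Save the labelled testing dataset to C45Output.txt.
        try (PrintWriter out = new PrintWriter(new FileWriter("C45Output.txt"))) {
            out.println(test);
        }
    }
}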




How to Run:
==========

>set classpath=%classpath%;weka-3.7.1-beta.jar;

>set classpath=%classpath%;weka-3.7.3.jar;

>javac C45Classification.java

>java C45Classification

Output: C45Output.txt