Abstract: Heart disease is one of the most harmful diseases and a cause of
death. It can also lead to serious long-term disability. The disease can strike
a person suddenly, so diagnosing patients correctly and in time is the most
challenging task for medical support. A wrong diagnosis can cost a hospital its
reputation. The accurate diagnosis of heart disease is one of the most
important biomedical problems. The purpose of this paper is to develop an
effective approach to treatment using data mining techniques that can help in
remedial situations. Several data mining classification algorithms, such as
decision trees, neural networks, Bayesian classifiers, support vector machines,
association rules, and k-nearest neighbour classification, are used to diagnose
heart disease.

 

Keywords-

 

1. INTRODUCTION

Heart disease is the class of diseases that involve the heart or blood vessels.
It is one of the most widespread diseases of the modern world. Heart disease
must be diagnosed accurately and correctly. Normally it is diagnosed by a
medical specialist. If data mining techniques are integrated with the medical
information system, diagnosis becomes more advantageous and its cost is
reduced. This can be done after comparing different data mining techniques for
their suitability. Data mining combines statistical analysis, machine learning
algorithms and database technology to extract hidden patterns and relationships
from large databases. The diagnosis of heart disease depends on clinical and
pathological data. A heart disease prediction system can assist medical
professionals in predicting heart disease status based on the clinical data of
patients. Researchers have been applying different data mining techniques to
help health care professionals diagnose heart disease with improved accuracy.
Neural networks, Naïve Bayes, genetic algorithms, decision trees,
classification via clustering, and direct kernel self-organizing maps are some
of the techniques used. With data mining techniques, heart disease prediction
can use various patient characteristics to determine whether a person is likely
to suffer a heart attack. Prediction also takes less time, improves the
accuracy of medical diagnosis, and can help reduce the occurrence of heart
attacks. Data mining along with soft computing techniques helps to unravel
hidden relationships and diagnose diseases efficiently even with uncertainties
and inaccuracies.
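
To make the idea of predicting disease status from clinical data concrete, here
is a minimal sketch of k-nearest neighbour classification, one of the
algorithms named in the abstract. The records, the three attributes, and the
scaling to the range 0 to 1 are hypothetical and chosen only for illustration.
They are not taken from any of the surveyed papers.

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training records."""
    # Euclidean distance from the query to every training record, nearest first
    dists = sorted(
        (math.dist(row, query), label) for row, label in zip(train, labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already-scaled records: (age, blood pressure, cholesterol)
train = [
    (0.2, 0.3, 0.2), (0.3, 0.2, 0.3), (0.1, 0.2, 0.1),   # label 0: no disease
    (0.8, 0.9, 0.7), (0.7, 0.8, 0.9), (0.9, 0.7, 0.8),   # label 1: disease
]
labels = [0, 0, 0, 1, 1, 1]

new_patient = (0.75, 0.8, 0.8)
print(knn_predict(train, labels, new_patient, k=3))
```

An odd k avoids tied votes in a two-class problem. In practice the attributes
would need to be scaled consistently, since Euclidean distance is sensitive to
the units of each attribute.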

Data mining is about explaining the past and predicting the future by means of
data analysis. It is a field that combines statistics, machine learning,
artificial intelligence and database technology. The value of data mining
applications is often estimated to be very high. Disease prediction is one of
the most important applications of data mining. Data mining is the process of
discovering interesting patterns and knowledge from huge amounts of data. It is
also called knowledge mining from data, knowledge extraction, or Knowledge
Discovery from Data (KDD). The knowledge discovery process typically involves
data cleaning, data integration, data selection, data transformation, pattern
discovery, pattern evaluation and knowledge presentation. Nowadays, healthcare
organizations generate voluminous data, yet they often lack the information
needed to make the right decisions. Data mining techniques can be used to
extract the needed information from the data of healthcare organizations.
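
The knowledge discovery steps listed above can be sketched as a chain of small
functions. This is only an illustration under stated assumptions: the record
fields, the two data sources, the cholesterol cut-off of 240, and the minimum
count of 2 are all hypothetical choices made for the example.

```python
from collections import Counter

# Hypothetical raw records from two sources (field names are illustrative)
clinic_a = [
    {"id": 1, "age": 63, "chol": 233, "smoker": "yes", "disease": 1},
    {"id": 2, "age": 41, "chol": None, "smoker": "no", "disease": 0},  # missing value
]
clinic_b = [
    {"id": 3, "age": 58, "chol": 284, "smoker": "yes", "disease": 1},
    {"id": 4, "age": 37, "chol": 180, "smoker": "no", "disease": 0},
    {"id": 5, "age": 66, "chol": 250, "smoker": "yes", "disease": 1},
]

def integrate(*sources):
    """Data integration: merge records from several sources."""
    return [rec for src in sources for rec in src]

def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def select(records, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Data transformation: discretise cholesterol (assumed cut-off: 240)."""
    return [
        {"chol_level": "high" if r["chol"] >= 240 else "normal",
         "smoker": r["smoker"], "disease": r["disease"]}
        for r in records
    ]

def discover(records):
    """Pattern discovery: count each (cholesterol, smoker, disease) profile."""
    return Counter((r["chol_level"], r["smoker"], r["disease"]) for r in records)

def evaluate(patterns, min_count=2):
    """Pattern evaluation: keep profiles seen at least `min_count` times."""
    return {p: c for p, c in patterns.items() if c >= min_count}

merged = clean(integrate(clinic_a, clinic_b))
records = transform(select(merged, ["chol", "smoker", "disease"]))
patterns = evaluate(discover(records))
for pattern, count in patterns.items():   # knowledge presentation
    print(pattern, count)
```

Real pipelines usually impute missing values rather than dropping records, and
the pattern discovery step would be one of the mining algorithms surveyed in
the next section.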

 

 

2. LITERATURE REVIEW

Many authors have worked on heart disease prediction systems using various data
mining techniques and algorithms. Their common aim is to achieve better
accuracy and to make the system more efficient so that it can predict the
chances of a heart attack. This paper analyses the data mining techniques
introduced in recent years for heart disease prediction. Different data mining
techniques have been applied to diagnosis over different heart disease
datasets. Knowledge of the risk factors associated with heart disease helps
health care professionals to identify patients at high risk of having heart
disease. Statistical analysis has identified the risk factors associated with
heart disease as age, blood pressure, smoking habit, total cholesterol,
diabetes, hypertension, family history of heart disease, obesity, and lack of
physical activity.

In paper 1, heart disease prediction data were further analysed using three
data mining classification techniques, namely decision tree, artificial neural
network, and SVM. The results were compared, and the accuracies obtained were
79.05%, 80.06%, and 84.12%, respectively. The analysis shows that, of these
classification models, SVM predicts heart disease with the highest accuracy.

Paper 2 describes approaches to identifying the risk factors that cause heart
disease from extracted itemsets. It surveys recent frequent pattern mining
algorithms on data streams to understand their advantages and disadvantages,
and so provides a way to apply new insights in the direction of frequent
pattern mining.
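
As a rough illustration of frequent itemset mining over patient records, the
sketch below runs a simplified level-wise (Apriori-style) search on a static
batch of data. It is not the stream-mining setting surveyed in paper 2, and the
patient records and the support threshold are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for all itemsets meeting min_support.

    Level-wise search: candidates of size k are built only from items that
    still appear in some frequent itemset of size k - 1.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    current = [frozenset([i]) for i in items]
    k = 1
    while current:
        # Support count = number of transactions containing the candidate
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        survivors = sorted({i for c in level for i in c})
        k += 1
        current = [frozenset(c) for c in combinations(survivors, k)]
    return frequent

# Hypothetical patient records expressed as sets of observed factors
patients = [
    {"smoker", "high_bp", "heart_disease"},
    {"smoker", "high_bp", "diabetes", "heart_disease"},
    {"high_bp", "heart_disease"},
    {"smoker", "diabetes"},
    {"obesity"},
]

result = frequent_itemsets(patients, min_support=2)
for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Itemsets that contain the disease label together with other factors, such as
high blood pressure with heart disease here, are the candidates a study would
then examine as possible risk factors.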

Paper 3 reports that neural networks with 15 attributes outperformed all other
data mining techniques. Decision trees showed good accuracy with C4.5, ID3,
CART and J48, and also with the help of a genetic algorithm and feature subset
selection. The Naïve Bayes algorithm gives an average prediction with 90%
accuracy. The following table shows the interpretation of various research
papers we have studied.

In paper 4, BPNN and BNN gave the highest classification accuracy of 78.43%,
while RBF kernel SVM gave the lowest classification accuracy of 60.78%. BNN
presented the best sensitivity of 96.55% and RBF kernel SVM displayed the
lowest sensitivity of 41.38%. Polynomial kernel SVM and RBF kernel SVM
presented the minimum and maximum specificity of 45.45% and 86.36%,
respectively.
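
Accuracy, sensitivity and specificity all derive from the confusion matrix of a
binary classifier. The sketch below shows the arithmetic on hypothetical labels
(1 = disease, 0 = no disease). The data are invented for illustration and do
not reproduce the figures reported in paper 4. The code assumes both classes
are present, otherwise a division by zero occurs.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def evaluate(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # share of diseased patients detected
        "specificity": tn / (tn + fp),   # share of healthy patients cleared
    }

# Hypothetical ground truth and predictions for ten patients
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

A classifier can score well on one measure and poorly on another, which is why
the comparison above reports all three rather than accuracy alone.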

In paper 5, fast algorithms such as decision trees are described as having
relatively poor accuracy compared to other knowledge models such as neural
networks. To overcome this problem, a large number of decision trees are
generated for the same data set and used simultaneously for prediction. Random
forest is one such ensemble-based method commonly used with decision trees. The
system focuses on the supervised learning technique called random forests,
classifying data while varying the hyperparameters of the Random Forests
Classifier to obtain accurate classification results.
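
The ensemble idea behind random forests, many trees trained on resampled data
and combined by majority vote, can be sketched in a few lines. This is a
deliberately reduced illustration, not the system of paper 5: each "tree" is a
one-level decision stump on one randomly chosen attribute, whereas a real
random forest grows deeper trees and samples attributes at every split. The
data and the parameter names `n_trees` and `seed` are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature):
    """Fit a one-level decision tree on one feature, minimising training errors."""
    values = sorted(set(r[feature] for r in rows))
    majority = Counter(labels).most_common(1)[0][0]
    best = (len(rows) + 1, None, majority, majority)  # errors, threshold, left, right
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2   # candidate threshold between two observed values
        left = [y for r, y in zip(rows, labels) if r[feature] <= thr]
        right = [y for r, y in zip(rows, labels) if r[feature] > thr]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if errors < best[0]:
            best = (errors, thr, l_lab, r_lab)
    _, thr, l_lab, r_lab = best
    if thr is None:   # feature was constant in this sample
        return lambda row: majority
    return lambda row: l_lab if row[feature] <= thr else r_lab

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]   # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest

def predict(forest, row):
    """Combine the individual predictions by majority vote."""
    return Counter(tree(row) for tree in forest).most_common(1)[0][0]

# Hypothetical scaled records with two attributes; label 1 = disease
rows = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.25), (0.15, 0.1),
        (0.8, 0.9), (0.7, 0.8), (0.9, 0.7), (0.85, 0.75), (0.75, 0.95)]
labels = [0] * 5 + [1] * 5

forest = train_forest(rows, labels)
print(predict(forest, (0.95, 0.95)), predict(forest, (0.05, 0.05)))
```

The number of trees is one example of the hyperparameters that a study of this
kind would vary. Tree depth and the number of attributes considered per split
are others.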

 

In paper 6, three classification techniques in data mining are compared for
predicting heart disease with a reduced number of attributes: Naïve Bayes,
Decision Tree and Classification by Clustering. A genetic algorithm is used to
determine the attributes that contribute most towards the diagnosis of heart
ailments, which indirectly reduces the number of tests a patient needs to take.
Fourteen attributes are reduced to 6 attributes using genetic search. The
observations show that the Decision Tree technique outperforms the other two
techniques after incorporating feature subset selection, at the price of a
relatively high model construction time. Naïve Bayes performs consistently
before and after reduction of attributes with the same model construction time.
Classification via clustering performs poorly compared to the other two
methods.
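
Genetic search over attribute subsets can be sketched by encoding each subset
as a bit mask and evolving a population of masks. The sketch below is an
assumption-laden toy, not the procedure of paper 6: the relevance scores, the
per-attribute cost, and the fitness function are invented for illustration. In
a real study the fitness would be the validated accuracy of a classifier
trained on the selected attributes.

```python
import random

# Hypothetical relevance score per attribute (illustrative numbers only)
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.2, 0.6,
             0.1, 0.75, 0.15, 0.05, 0.65, 0.1, 0.2]
COST_PER_ATTRIBUTE = 0.3   # assumed penalty standing in for the cost of a test

def fitness(mask):
    """Toy fitness: relevance of the selected attributes minus their cost."""
    return sum(r - COST_PER_ATTRIBUTE for r, bit in zip(RELEVANCE, mask) if bit)

def genetic_search(n_attrs, pop_size=20, generations=40,
                   mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Start from the full attribute set plus random subsets
    population = [[1] * n_attrs] + [
        [rng.randint(0, 1) for _ in range(n_attrs)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = population[:2]                     # elitism: keep the two best
        while len(next_gen) < pop_size:
            # Parents come from the fitter half of the population
            a, b = rng.sample(population[: pop_size // 2], 2)
            cut = rng.randrange(1, n_attrs)           # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation)
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

best = genetic_search(len(RELEVANCE))
selected = [i for i, bit in enumerate(best) if bit]
print("selected attribute indices:", selected)
print("fitness:", round(fitness(best), 3))
```

Because the best masks are carried over unchanged in every generation, the
result is never worse than the full attribute set it started from. Whether the
search reaches the best possible subset depends on the population size, the
number of generations and the random seed.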

In paper 7, two experiments were conducted: one with all 13 attributes and one
with a reduced dataset of 6 attributes obtained by applying an attribute
selection method. The observation was that SVM (97.9%, 89.4%), Simple logistic
(69.2%, 71.6%) and Multilayer perceptron (74.3%, 79.1%) achieved different
accuracies in the two scenarios. This shows that SVM has the greatest accuracy.

 

 

 

 

 

 

 

 

 

3. CONCLUSION

Heart disease is fatal by nature. It causes life-threatening complications such
as heart attack and death. The importance of data mining in the medical domain
is recognized, and steps are being taken to apply relevant techniques to
disease prediction. Various research works by different authors that use
effective techniques were studied.

The study reveals that neural networks with 15 attributes outperformed all
other data mining techniques. Decision trees showed good accuracy with C4.5,
ID3, CART and J48, and also with the help of a genetic algorithm and feature
subset selection. The Naïve Bayes algorithm gives an average prediction with
90% accuracy. The following table shows the interpretation of various research
papers we have studied:

REFERENCES

[1] Salha M. Alzahani, Afnan Althopity, "An Overview of Data Mining Techniques
Applied for Heart Disease Diagnosis and Prediction"

[2] H K Shifali, Dr. B. Srinivasu, Rajashekar Shastry, B N Ranga Swamy, "Mining
of Medical Data to Identify Risk Factors of Heart Disease Using Frequent
Itemset"

[3] Prof. Mamta Sharma, Farheen Khan, Vishnupriya Ravichandran, "Comparing Data
Mining Techniques Used For Heart Disease Prediction"

[4] Abhishek Taneja, "Heart Disease Prediction System Using Data Mining
Techniques"

[5] Priya R Patil, "Automated Diagnosis of Heart Disease using Data Mining
Techniques"

[6] Shamsher Bahadur Patel, Pramod Kumar Yadav, Dr. D. P. Shukla, "Predict the
Diagnosis of Heart Disease Patients Using Classification Mining Techniques"

[7] M. Hanumathappa, "Heart Disease Prediction Using Classification Techniques
with Feature Selection Method"
