In the Preprocess tab, you can view attributes in the input file, properties of the selected attribute, and visualisation of class distribution for each attribute.
Building a Naïve Bayes classifier with 10-fold cross-validation. The correctly classified instances can be viewed by right-clicking on the classifier in the Results window.
Visualization: displays a matrix of two-dimensional scatter plots, one for each pair of attributes.
Preparing input

Much of the effort in data mining and machine learning goes into preparing the input. To analyze data using Weka, you need to prepare it in the Attribute-Relation File Format (ARFF) and then load it into the Explorer. Spreadsheets, Comma Separated Value (CSV) files and databases can be converted to ARFF. An ARFF file uses an @relation tag, @attribute tags and an @data tag to represent the dataset name, the attribute information and the values respectively.
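To make the three ARFF tags concrete, here is a minimal Python sketch that renders a small table as ARFF text. It is only an illustration: the helper name to_arff, the attribute names and the values are all hypothetical and are not taken from the age.arff file used in this article. It also assumes simple names and values with no spaces or commas, which would otherwise need quoting.

```python
def to_arff(relation, attributes, rows):
    """Render a dataset as ARFF text (hypothetical helper, not part of Weka).

    attributes is a list of (name, type) pairs. The type is either an ARFF
    keyword such as "numeric" or a list of labels for a nominal attribute.
    Assumes names and values contain no spaces or commas.
    """
    # @relation carries the dataset name.
    lines = [f"@relation {relation}", ""]
    # One @attribute line per column: name followed by its type.
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {kind}")
    # @data introduces the values, one comma-separated instance per line.
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Made-up word counts and age groups, for illustration only.
example = to_arff(
    "age",
    [
        ("lol", "numeric"),
        ("work", "numeric"),
        ("age_group", ["teens", "twenties", "thirties"]),
    ],
    [[12, 3, "teens"], [1, 20, "thirties"]],
)
print(example)
```

The printed text can be saved with an .arff extension and opened from the Preprocess tab.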
Classifying data

Weka is best used through its graphical user interface, called 'Explorer', rather than through the command-line interface. The other two interfaces are the 'Knowledge Flow Interface', which supports design configuration for streamed data processing, and the 'Experimenter', which helps users compare a variety of learning techniques. In this example, we use an ARFF file named age.arff. Its attributes are a few selected words, and its @data section contains the number of occurrences of each word per 10,000 words in a dataset of blogs written by bloggers belonging to various age groups.
1. Open the file you want to analyze using the Open file option in the Preprocess tab of the Weka Explorer; here, open the age data file, age.arff.
2. Once the input file has been opened, all its attributes are shown in the Attributes window. Properties of the selected attribute, such as attribute name, attribute type and number of missing values, are displayed in the 'Selected attribute' window. Here, you can select the attributes that you want to include in the working relation, e.g. for age prediction.
3. Select the classifier algorithm in the Classify tab. In this example, we selected Naïve Bayes with 10-fold cross-validation. Next, click Start. The result is displayed in the Classifier output window, as shown in the figure on the left.
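For readers unfamiliar with the 10-fold cross-validation chosen in step 3, the following Python sketch shows the general idea: the instances are split into ten folds, and each fold is held out once for testing while the other nine are used for training. This is a plain, unstratified split written for illustration; the function name k_fold_indices and the seed are assumptions, and Weka's own procedure may differ in details such as how it shuffles or balances classes across folds.

```python
import random


def k_fold_indices(n_instances, k=10, seed=1):
    """Yield (train, test) index lists for k-fold cross-validation.

    Illustrative sketch only: a simple shuffled, unstratified split.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set: every instance that is not in the held-out fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example: 25 instances split into 10 folds.
splits = list(k_fold_indices(25, k=10))
for train, test in splits:
    print(len(train), len(test))
```

Each instance is tested exactly once, so the reported accuracy is computed over the whole dataset even though no model is ever tested on data it was trained on.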
Copyright © CyberMedia India Online Ltd. All rights reserved. Usage of content from web site is subject to Terms and Conditions.