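Abstract: A Bayesian classifier takes the prior probability of an object and uses Bayes' formula to compute its posterior probability, that is, the probability that the object belongs to a given class. It then assigns the object to the class with the largest posterior probability. The four Bayesian classifiers that currently receive the most research attention are Naive Bayes, TAN, BAN, and GBN.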
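A Bayesian network is a directed acyclic graph annotated with probabilities. Each node in the graph represents a random variable. An arc between two nodes indicates that the corresponding random variables are probabilistically dependent; the absence of an arc indicates that the two variables are conditionally independent. Every node X in the network has an associated conditional probability table (CPT), which gives the conditional probability of X for each possible combination of values of its parent nodes. If X has no parent nodes, its CPT is its prior probability distribution. The structure of the network together with the CPTs of its nodes defines the probability distribution over the variables in the network.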
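Naive Bayes (NB) classification, which is grounded in Bayesian theory, is a simple yet effective classification method and one of the most widely used classification algorithms in machine learning. This paper introduces the basic principles of the naive Bayes classification algorithm and studies data classification based on it. Practical application shows that naive Bayes is an effective classification algorithm.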
Keywords: naive Bayes algorithm; text classification; data
Abstract: A Bayesian classifier starts from the prior probability of an object and applies Bayes' formula to obtain its posterior probability, i.e. the probability that the object belongs to a particular class. The object is then assigned to the class with the maximum a posteriori probability. Four kinds of Bayesian classifier are currently studied most: Naive Bayes, TAN, BAN, and GBN.
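The decision rule described above can be written compactly. The notation is an assumption introduced here for illustration: $x$ denotes the observed object and $c$ ranges over the set of classes $C$.

```latex
P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)},
\qquad
\hat{c} = \arg\max_{c \in C} P(c \mid x)
        = \arg\max_{c \in C} P(x \mid c)\, P(c)
```

The evidence $P(x)$ is the same for every class, so it can be dropped when only the maximizing class is needed.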
A Bayesian network is a directed acyclic graph annotated with probabilities, in which each node represents a random variable. If there is an arc between two nodes, the corresponding random variables are probabilistically dependent; if there is no arc, the two variables are conditionally independent. Every node X in the network has a conditional probability table (CPT) that gives the conditional probability of X for each possible combination of values of its parent nodes. If node X has no parents, its CPT is its prior probability distribution. The network structure and the CPTs of the nodes together define the probability distribution of the variables in the network.
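As a rough illustration of how structure and CPTs together define a distribution, the sketch below multiplies one CPT entry per node to obtain the probability of a full assignment. The three-node network (`Rain`, `Sprinkler`, `WetGrass`) and all of its numbers are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical structure: Rain -> WetGrass <- Sprinkler.
# Each node maps to the tuple of its parent nodes.
parents = {
    "Rain": (),
    "Sprinkler": (),
    "WetGrass": ("Rain", "Sprinkler"),
}

# Each CPT maps a tuple of parent values to P(node = True | parents).
# A node without parents has a single entry: its prior probability.
cpt = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    },
}


def joint_probability(assignment):
    """Probability of a full True/False assignment to every node."""
    p = 1.0
    for node, node_parents in parents.items():
        key = tuple(assignment[parent] for parent in node_parents)
        p_true = cpt[node][key]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p


# Summing over all assignments should give 1 for a valid set of CPTs.
nodes = list(parents)
total = sum(
    joint_probability(dict(zip(nodes, values)))
    for values in product([True, False], repeat=len(nodes))
)
```

For example, `joint_probability({"Rain": True, "Sprinkler": False, "WetGrass": True})` multiplies 0.2, 0.9 (the probability of no sprinkler), and 0.9.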
Naive Bayes (NB) classification, which is based on Bayesian theory, is a simple and effective classification method and one of the most widely used classification algorithms in machine learning. This paper describes the basic principles of the naive Bayes classification algorithm and studies data classification based on it. Practical application shows that naive Bayes is an effective classification algorithm.
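A minimal sketch of naive Bayes applied to text classification is given below. It is not the paper's implementation: the function names `train` and `classify`, the toy training data, the multinomial word model, and the add-one (Laplace) smoothing are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """Count classes and per-class words from (words, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in documents:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def classify(words, model):
    """Return the label with the largest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior P(c), estimated from class frequencies.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in words:
            if word not in vocabulary:
                continue  # Words never seen in training are ignored.
            # Add-one smoothing keeps unseen (word, class) pairs nonzero.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only.
training_data = [
    (["cheap", "pills", "buy"], "spam"),
    (["buy", "cheap", "now"], "spam"),
    (["meeting", "schedule", "today"], "ham"),
    (["project", "meeting", "notes"], "ham"),
]
model = train(training_data)
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied. The "naive" assumption appears in the inner loop, where each word contributes to the score independently of the others given the class.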
Keywords: naive Bayes algorithm; text classification; data