Learning with Bayesian Networks
David Heckerman
Presented by Colin Rickert
Introduction to Bayesian Networks
Bayesian networks apply general Bayesian probability to sets of interrelated variables. A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. The model has several advantages for data analysis over rule-based methods such as decision trees.
Outline
1. Bayesian vs. classical probability methods
2. Advantages of Bayesian techniques
3. The coin toss prediction model from a Bayesian perspective
4. Constructing a Bayesian network with prior knowledge
5. Optimizing a Bayesian network with observed knowledge (data)
6. Exam questions
Bayesian vs. the Classical Approach
The Bayesian probability of an event x represents a person’s degree of belief or confidence in that event’s occurrence, based on prior and observed facts. Classical probability refers to the true or actual probability of the event and is not concerned with degrees of belief.
Bayesian vs. the Classical Approach
The Bayesian approach restricts its prediction to the next (N+1st) occurrence of an event, given the N previously observed events. The classical approach predicts the likelihood of any given event regardless of the number of prior occurrences.
Example
Imagine a coin with irregular surfaces, such that the probabilities of landing heads and tails are not equal. The classical approach would be to analyze the surfaces and build a physical model of how the coin is likely to land on any given throw. The Bayesian approach simply restricts attention to predicting the next toss from the previous tosses.
Advantages of Bayesian Techniques
How do Bayesian techniques compare to other learning models?
1. Bayesian networks can readily handle incomplete data sets.
2. Bayesian networks allow one to learn about causal relationships.
3. Bayesian networks readily facilitate the use of prior knowledge.
4. Bayesian methods provide an efficient way to avoid overfitting the data (there is no need for pre-processing).
Handling of Incomplete Data
Imagine a data sample where two attribute values are strongly anti-correlated. With decision trees, both values must be present to avoid confusing the learning model. A Bayesian network needs only one of the values to be present and can infer the likely state of the other. For example, imagine two variables, one for gun owner and the other for peace activist: the data should indicate that you do not need to check both values (see the sketch below).
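As a rough illustration (all variable names and numbers below are hypothetical, not from the original slides), a joint distribution over the two anti-correlated variables lets the model form a confident belief about one variable when only the other is observed:

```python
# Minimal sketch: inference with a missing value, from a hypothetical
# joint distribution over two anti-correlated binary variables,
# "gun_owner" (G) and "peace_activist" (A). Keys are (G, A).
joint = {
    (True, True): 0.02,    # P(G=T, A=T): rare, strongly anti-correlated
    (True, False): 0.38,
    (False, True): 0.40,
    (False, False): 0.20,
}

def conditional(a_value, g_value):
    """P(A = a_value | G = g_value), derived from the joint table."""
    p_g = sum(p for (g, a), p in joint.items() if g == g_value)
    return joint[(g_value, a_value)] / p_g

# With only G observed, the model still yields a confident belief about A:
print(conditional(True, True))   # P(A=T | G=T) = 0.02 / 0.40 = 0.05
```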
Learning about Causal Relationships
We can use observed data to test the validity of the acyclic graph that represents the Bayesian network. For instance, is running a cause of knee damage? Prior knowledge may indicate that it is; observed data may strengthen or weaken that argument.
Use of Prior Knowledge and Observed Behavior
Encoding prior knowledge is relatively straightforward: construct “causal” edges between any two factors that are believed to be correlated. The causal network represents prior knowledge, and the weights of the directed edges can then be updated in a posterior manner as new data arrive, as sketched below.
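A minimal sketch of this idea, reusing the running/knee-damage example from the previous slide (the edge, pseudo-counts, and data counts are all illustrative assumptions): prior knowledge fixes the structure, while the strength of an edge is updated as data arrive.

```python
# Prior belief about structure: "running" is a cause of "knee_damage".
edges = [("running", "knee_damage")]

# Prior weight P(knee_damage=T | running=T), encoded as Beta pseudo-counts.
alpha_damage, alpha_no_damage = 2.0, 8.0   # prior belief: about 0.2

# Observed data: among runners, 6 with knee damage, 14 without.
damage, no_damage = 6, 14

# Posterior update is just adding observed counts to the pseudo-counts.
posterior_mean = (alpha_damage + damage) / (
    alpha_damage + alpha_no_damage + damage + no_damage
)
print(posterior_mean)  # (2 + 6) / (10 + 20) = 8/30 ~ 0.27: belief strengthened
```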
Avoidance of Overfitting Data
Contradictions do not need to be removed from the data.
Data can be “smoothed” so that all available data can be used; one reading of this is sketched below.
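One way to read “smoothing” here (a sketch with assumed numbers, not the slides’ own example): Bayesian priors act as pseudo-counts, so every outcome, even one never observed, retains some probability, and no data has to be discarded or pre-processed.

```python
# Smoothing with prior pseudo-counts: unseen or contradictory outcomes
# keep nonzero probability, so all data can be used as-is.
counts = {"heads": 0, "tails": 12}      # raw data: "heads" never observed
alpha  = {"heads": 1.0, "tails": 1.0}   # uniform prior pseudo-counts

total = sum(counts.values()) + sum(alpha.values())
smoothed = {k: (counts[k] + alpha[k]) / total for k in counts}
print(smoothed)  # {'heads': 0.071..., 'tails': 0.928...} -- no hard zeros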
The “Irregular” Coin Toss from a Bayesian Perspective
Start with a set of parameters Θ = {θ1,…,θn} for our hypothesis. For the coin toss we have only one parameter, θ, representing our belief that we will toss heads, with 1−θ for tails. We predict the outcome of the next (N+1st) flip based on the previous N flips:
D represents the information we have observed thus far, D = {X1=x1,…, XN=xN} for tosses 1,…,N.
We want to know the probability that XN+1 = heads.
Bayesian Probabilities
Posterior probability, p(θ|D,ξ): probability of a particular value of θ given that D has been observed (our final belief about θ).
Prior probability, p(θ|ξ): probability of a particular value of θ given no observed data (our previous belief), where ξ denotes our background knowledge.
Observed probability or “likelihood”, p(D|θ,ξ): likelihood of the sequence of coin tosses D being observed given that θ takes a particular value.
p(D|ξ): raw probability of observing D under our background knowledge alone.
Bayesian Formulas for Weighted Coin Toss (Irregular Coin)
p(θ|D,ξ) = p(D|θ,ξ) p(θ|ξ) / p(D|ξ)
where
p(D|ξ) = ∫ p(D|θ,ξ) p(θ|ξ) dθ
*Only the likelihood p(D|θ,ξ) and the prior p(θ|ξ) need to be specified; the rest can be derived.
Integration
To find the probability that XN+1 = heads, we must integrate over all possible values of θ to find the average value of θ, which yields:
p(XN+1 = heads | D, ξ) = ∫ θ p(θ|D,ξ) dθ = E[θ|D,ξ]
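A minimal numeric sketch of this integration (the flat prior and the data counts are assumptions for illustration): approximate the posterior on a grid, then average θ under it to predict the next toss.

```python
import numpy as np

theta = np.linspace(0.001, 0.999, 999)     # grid of possible values of theta
dtheta = theta[1] - theta[0]

prior = np.ones_like(theta)                # p(theta | xi): flat prior belief
h, t = 7, 3                                # observed data D: 7 heads, 3 tails
likelihood = theta**h * (1 - theta)**t     # p(D | theta, xi)

posterior = prior * likelihood             # Bayes' rule, numerator only...
posterior /= posterior.sum() * dtheta      # ...normalized to integrate to 1

# p(X_{N+1}=heads | D, xi) = integral of theta * p(theta | D, xi) dtheta
p_next_heads = (theta * posterior).sum() * dtheta
print(p_next_heads)                        # ~ (7+1)/(10+2) = 0.667 for a flat prior
```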
Expansion of Terms
1. Expand the likelihood p(D|θ,ξ): p(D|θ,ξ) = θ^h (1−θ)^t, where h and t are the numbers of heads and tails observed in D.
2. Expand the prior p(θ|ξ): p(θ|ξ) = Beta(θ|αh, αt) = [Γ(αh+αt) / (Γ(αh)Γ(αt))] θ^(αh−1) (1−θ)^(αt−1).
*The Beta prior yields a bell-shaped curve, a typical probability distribution; its parameters αh and αt can be viewed as encoding our expectation of the shape of the curve before any data are observed.
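Combining the two expansions is a standard worked step (Beta-Binomial conjugacy, with N = h + t tosses observed): the posterior is again a Beta distribution, and the integration from the previous slide reduces to its mean.

```latex
\begin{aligned}
p(\theta \mid D, \xi)
  &\propto p(D \mid \theta, \xi)\, p(\theta \mid \xi)
   \propto \theta^{h}(1-\theta)^{t}\,\theta^{\alpha_h - 1}(1-\theta)^{\alpha_t - 1} \\
  &= \theta^{\alpha_h + h - 1}(1-\theta)^{\alpha_t + t - 1}
   \;\propto\; \mathrm{Beta}(\theta \mid \alpha_h + h,\ \alpha_t + t), \\
p(X_{N+1} = \text{heads} \mid D, \xi)
  &= \int_0^1 \theta\, p(\theta \mid D, \xi)\, d\theta
   = \frac{\alpha_h + h}{\alpha_h + \alpha_t + N}.
\end{aligned}
```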