钟喜佳

Model Representation 模型表示

To establish notation for future use, we'll use x^{(i)} to denote the “input” variables (living area in this example), also called input features, and y^{(i)} to denote the “output” or target variable that we are trying to predict (price). A pair (x^{(i)}, y^{(i)}) is called a training example, and the dataset that we'll be using to learn—a list of m training examples {(x^{(i)}, y^{(i)}); i = 1, …, m}—is called a training set. Note that the superscript “(i)” in the notation is simply an index into the training set, and has nothing to do with exponentiation. We will also use X to denote the space of input values, and Y to denote the space of output values. In this example, X = Y = ℝ.
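As a concrete illustration of this notation (the numbers below are hypothetical, not from the text), a training set of m examples can be stored as parallel lists, one entry per training example:

```python
# Hypothetical housing data: x = living area (sq. ft.), y = price ($1000s).
# Each index i gives one training example (x^(i), y^(i)).
x = [2104.0, 1600.0, 2400.0, 1416.0]  # input features
y = [400.0, 330.0, 369.0, 232.0]      # target values

m = len(x)                      # number of training examples
training_set = list(zip(x, y))  # list of (x^(i), y^(i)) pairs

print(m)                # 4
print(training_set[0])  # (2104.0, 400.0)
```

Here the superscript index simply becomes a Python list index, which makes it clear that it is not exponentiation.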

To describe the supervised learning problem slightly more formally, our goal is, given a training set, to learn a function h : X → Y so that h(x) is a “good” predictor for the corresponding value of y. For historical reasons, this function h is called a hypothesis. Seen pictorially, the process is therefore: the training set is fed to a learning algorithm, which outputs a hypothesis h; given a new input x, the hypothesis produces the predicted output h(x).
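For a single real-valued input, one common form of hypothesis (a sketch under that assumption, not mandated by the text above) is a linear function h_θ(x) = θ₀ + θ₁x:

```python
def h(x, theta0, theta1):
    """Linear hypothesis: predict y from input x given parameters theta0, theta1."""
    return theta0 + theta1 * x

# Hypothetical parameter values; a learning algorithm would choose these
# from the training set rather than by hand.
prediction = h(2104.0, 50.0, 0.1)  # predicted price for a 2104 sq. ft. house
print(prediction)  # 260.4
```

A learning algorithm's job is then to pick the parameters θ₀, θ₁ so that h(x^{(i)}) is close to y^{(i)} across the training set.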

When the target variable that we're trying to predict is continuous, such as in our housing example, we call the learning problem a regression problem. When y can take on only a small number of discrete values (such as if, given the living area, we wanted to predict if a dwelling is a house or an apartment, say), we call it a classification problem.
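The distinction can be made concrete in code: a regression target is a real number, while a classification target takes one of a small set of labels (the example values and the 0/1 encoding below are hypothetical):

```python
# Regression: y is continuous (e.g. price in $1000s).
y_regression = [400.0, 330.0, 369.0]

# Classification: y takes one of a few discrete values.
# Here 0 = "apartment", 1 = "house" (hypothetical encoding).
y_classification = [1, 0, 1]

print(all(isinstance(v, float) for v in y_regression))  # True
print(set(y_classification) <= {0, 1})                  # True
```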


Writing these notes takes effort; when reposting, please credit 钟喜佳笔记, 《Model Representation 模型表示》.