吴恩达-机器学习笔记

2018/08/06

课程链接

https://www.coursera.org/learn/machine-learning/home/welcome

Welcome

Welcome to Machine Learning! I’m excited to have you in the class and look forward to helping you become an expert in machine learning.

After you finish watching the Week 1 lectures, there’s also a set of Review Questions to help you check your understanding. You should be able to complete the review questions in a few minutes. You can attempt the review questions as many times as you like, and we will only use your highest score.

This machine learning class was the class that had started Coursera. I’m excited to be teaching it again. By the time you finish this class, you’ll know how to apply the most advanced machine learning algorithms to such problems as anti-spam, image recognition, clustering, building recommender systems, and many other problems. You’ll also know how to select the right algorithm for the right job, as well as become expert at ‘debugging’ and figuring out how to improve a learning algorithm’s performance.

I hope you’ll have a lot of fun learning about machine learning.

Machine Learning Honor Code

We strongly encourage students to form study groups, and discuss the lecture videos (including in-video questions). We also encourage you to get together with friends to watch the videos together as a group. However, the answers that you submit for the review questions should be your own work. For the programming exercises, you are welcome to discuss them with other students, discuss specific algorithms, properties of algorithms, etc.; we ask only that you not look at any source code written by a different student, nor show your solution code to other students.

Guidelines for Posting Code in Discussion Forums

Scenario 1: Code to delete

Learner Question/Comment: “Here is the code I have so far, but it fails the grader. Please help me fix it.”

Why Delete?: The reason is that if there is a simple fix provided by a student, a quick copy and paste with a small edit will provide credit without individual effort.

Learner Question: A student substitutes words for the math operators, but includes the variable names (or substitutes the equivalent greek letters (θ for ‘theta’, etc). This student also provides a sentence-by-sentence, line by line, description of exactly what their code implements. “The first line of my script has the equation “hypothesis equals theta times X”, but I get the following error message…”.

Why Delete?: This should be deleted. “Spelling out” the code in English is the same as using the regular code.

Scenario 2: Code not to delete

Learner Question: How do I subset a matrix to eliminate the intercept?

Mentor Response: This probably would be okay, especially if the person posting makes an effort to not use familiar variable names, or to use a context which has nothing to do with the contexts in the assignments.

It is clearly ok to show examples of Octave code to demonstrate a technique. Even if the technique itself is directly applicable to a programming problem at hand. As long as what is typed cannot be “cut and pasted” into the program at hand.

E.g. how do I set column 1 of a matrix to zero? Try this in your Octave work area:

A = magic(3)

A(:,1) = 0

The above is always acceptable (in my understanding). Demonstrating techniques and learning the language/syntax are important Forum activities.


What is Machine Learning?

Two definitions of Machine Learning are offered. Arthur Samuel described it as: “the field of study that gives computers the ability to learn without being explicitly programmed.” This is an older, informal definition.

Tom Mitchell provides a more modern definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Example: playing checkers.

E = the experience of playing many games of checkers

T = the task of playing checkers.

P = the probability that the program will win the next game.

In general, any machine learning problem can be assigned to one of two broad classifications:

Supervised learning and Unsupervised learning.


Supervised Learning

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Supervised learning problems are categorized into “regression” and “classification” problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

Example 1:

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.

We could turn this example into a classification problem by instead making our output about whether the house “sells for more or less than the asking price.” Here we are classifying the houses based on price into two discrete categories.

Example 2:

(a) Regression - Given a picture of a person, we have to predict their age on the basis of the given picture

(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.


Unsupervised Learning

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don’t necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results.

Example:

Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

Non-clustering: The “Cocktail Party Algorithm”, allows you to find structure in a chaotic environment. (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).


Mentor Program Overview:

Community Mentors are successful, dedicated Coursera learners who volunteer to assist with support and discussion forum moderation in courses that they have already completed. They have been recruited by Coursera to encourage newer learners, answer questions, set an example by posting thoughtful and timely content, and report platform bugs and inappropriate content to Coursera.

As you use the discussion areas, please be aware that the ideas expressed by participants in this course, including the Mentors, do not represent the views of Stanford University. The mentors are not employed by Stanford University and they have not been vetted by Stanford University as experts on course content or course facilitation.


Get to Know Your Classmates

Overview

Working well with your classmates is an important part of this online course. Thus, at the beginning of this course, we would like you to take time to break the ice and get to know each other. You may already know some of your classmates or have just met them. Establishing personal interaction with other students will make your online learning experience much more enjoyable and engaging. As such, we encourage you to participate in this activity, though it is optional.

Meet and Greet

Tell everyone your story! Optionally, you are asked to provide a brief introduction to your classmates. If you don’t know what to include in your introduction, you may want to provide information that you’d like to share with your classmates by answering some of the following questions.

Suggested Topics

Where are you from? If you wish to include this information in your post, you can also include it below the body of your post in the “tags” area. For example, include your state (if living within the United States) or country in the tags section. Career and education? What is your educational background? What do you currently do? Are you currently pursuing a change in careers and/or more education? Hopes? Why did you decide to take this course? What are your expectations of this course? What problem are you trying to solve? What do you hope to put into place in your life the day this course is over? Other info? Share with us any other information that might help others in the class find you when searching the forums. What common interests might you share with your classmates? We have tens of thousands of students enrolled in this course – put something in your post that will help others who are like you to find you. Go to the Meet and Greet forum and click the New Thread button to begin a new thread. Use your name and a brief summary as the subject of your post. For example, Robert Smith: Exploring Career Options. Read some your classmates’ postings. Pick at least 2 classmates’ postings that are most interesting to you and add your friendly comments.

Updating Your Profile

Optionally, please consider updating your profile, which can also be accessed by clicking the Profile link in the menu that appears when you click on your name at the top-right corner of this screen. When people find you in the forums, they can click on your name to view your complete profile and get to know you more.

Time

This activity will take approximately 1 hour to complete.

These guidelines for interacting with fellow classmates via the forums were originally compiled by The University of Illinois.


Model Representation

To establish notation for future use, we’ll use x^{(i)}x (i) to denote the “input” variables (living area in this example), also called input features, and y^{(i)}y (i) to denote the “output” or target variable that we are trying to predict (price). A pair (x^{(i)} , y^{(i)} )(x (i) ,y (i) ) is called a training example, and the dataset that we’ll be using to learn—a list of m training examples (x(i),y(i));i=1,…,m—is called a training set. Note that the superscript “(i)” in the notation is simply an index into the training set, and has nothing to do with exponentiation. We will also use X to denote the space of input values, and Y to denote the space of output values. In this example, X = Y = ℝ.

To describe the supervised learning problem slightly more formally, our goal is, given a training set, to learn a function h : X → Y so that h(x) is a “good” predictor for the corresponding value of y. For historical reasons, this function h is called a hypothesis. Seen pictorially, the process is therefore like this:

When the target variable that we’re trying to predict is continuous, such as in our housing example, we call the learning problem a regression problem. When y can take on only a small number of discrete values (such as if, given the living area, we wanted to predict if a dwelling is a house or an apartment, say), we call it a classification problem.


Cost Function

We can measure the accuracy of our hypothesis function by using a cost function. This takes an average difference (actually a fancier version of an average) of all the results of the hypothesis with inputs from x’s and the actual output y’s.


Cost Function - Intuition I

Cost Function - Intuition II

Gradient Descent

Gradient Descent Intuition

Gradient Descent For Linear Regression



扫描关注:Andy风雨无阻

(转载本站文章请注明作者和出处 风雨无阻

Show Disqus Comments

Post Directory