Guocheng's Space
https://wei170.github.io/
Recent content on Guocheng's Space. Generated by Hugo (gohugo.io). Language: en.

Week 7 - Machine Learning
https://wei170.github.io/blog/coursera/ml/ml-stanford-7/
Mon, 22 Jul 2019 00:00:00 +0000

Additional Note for Improving Deep Neural Network
https://wei170.github.io/blog/coursera/ml/improve-dnn/
Sun, 07 Jul 2019 00:00:00 +0000

Practical aspects of Deep Learning: Regularization

What we learn in Week 3 is L2 regularization.
L1 regularization uses the absolute value of $\theta$ instead of its square.
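The difference between the two penalties can be seen in a short sketch (my own illustration with made-up values; `lam` is the regularization strength $\lambda$ and `m` the number of training examples):

```python
import numpy as np

def l2_penalty(theta, lam, m):
    # L2 regularization: lambda/(2m) * sum of squared parameters
    return lam / (2 * m) * np.sum(theta ** 2)

def l1_penalty(theta, lam, m):
    # L1 regularization: lambda/(2m) * sum of absolute values (no square)
    return lam / (2 * m) * np.sum(np.abs(theta))

theta = np.array([3.0, -4.0])
print(l2_penalty(theta, lam=1.0, m=10))  # 25/20 = 1.25
print(l1_penalty(theta, lam=1.0, m=10))  # ~= 7/20 = 0.35
```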
Implementation tip: if you implement gradient descent, one way to debug it is to plot the cost function $J$ against the number of iterations of gradient descent; with regularization, you want to see $J$ decrease monotonically after every iteration.

Week 6 - Machine Learning
https://wei170.github.io/blog/coursera/ml/ml-stanford-6/
Sat, 06 Jul 2019 00:00:00 +0000

Deciding What to Try Next

Errors in your predictions can be addressed by:
- Getting more training examples
- Trying smaller sets of features
- Trying additional features
- Trying polynomial features
- Increasing or decreasing $\lambda$

Model Selection and Train/Validation/Test Sets

Test error: $$ J_{test}(\Theta) = \dfrac{1}{2m_{test}} \sum_{i=1}^{m_{test}}(h_\Theta(x^{(i)}_{test}) - y^{(i)}_{test})^2 $$
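The test error above can be computed directly; a minimal sketch, assuming a linear hypothesis $h_\Theta(x) = \Theta^T x$ and a made-up toy test set:

```python
import numpy as np

def j_test(theta, X_test, y_test):
    # J_test = 1/(2 m_test) * sum of squared errors on the test set
    m_test = len(y_test)
    predictions = X_test @ theta          # h_theta(x) for each test example
    return np.sum((predictions - y_test) ** 2) / (2 * m_test)

# Toy example: hypothesis y = 1 + 2x (theta = [1, 2]), two test points
theta = np.array([1.0, 2.0])
X_test = np.array([[1.0, 1.0], [1.0, 2.0]])  # first column is the bias term
y_test = np.array([3.0, 6.0])
print(j_test(theta, X_test, y_test))  # errors 0 and -1 -> (0 + 1)/4 = 0.25
```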
Just because a learning algorithm fits a training set well does not mean it is a good hypothesis. The error of your hypothesis measured on the data set used to train the parameters will be lower than its error on any other data set.

Week 5 - Machine Learning
https://wei170.github.io/blog/coursera/ml/ml-stanford-5/
Wed, 03 Jul 2019 00:00:00 +0000

Neural Network Cost Function

$$ \begin{gather} J(\Theta) = - \frac{1}{m} \sum_{i=1}^m \sum_{k=1}^K \left[y^{(i)}_k \log ((h_\Theta (x^{(i)}))_k) + (1 - y^{(i)}_k)\log (1 - (h_\Theta(x^{(i)}))_k)\right] + \frac{\lambda}{2m}\sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} ( \Theta_{j,i}^{(l)})^2 \end{gather} $$
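A numerical sketch of this cost (my own illustration with made-up values; `H` stands in for the forward-pass outputs $h_\Theta(x)$, and `Thetas` is a list of weight matrices with the bias weights in column 0, which the regularization term skips):

```python
import numpy as np

def nn_cost(Y, H, Thetas, lam):
    # Y: (m, K) one-hot labels; H: (m, K) network outputs in (0, 1)
    m = Y.shape[0]
    # Double sum: logistic-regression cost over every output unit of every example
    cross_entropy = -np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H)) / m
    # Triple sum: squares of all non-bias weights across all layers
    reg = lam / (2 * m) * sum(np.sum(T[:, 1:] ** 2) for T in Thetas)
    return cross_entropy + reg

Y = np.array([[1.0, 0.0]])                     # one example, K = 2 classes
H = np.array([[0.8, 0.2]])                     # made-up network outputs
Theta1 = np.array([[0.1, 0.5], [0.2, -0.5]])   # column 0 holds bias weights
print(nn_cost(Y, H, [Theta1], lam=0.0))  # -2*ln(0.8) ~= 0.446
```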
Some notation:

- $L$ = total number of layers in the network
- $s_l$ = number of units (not counting the bias unit) in layer $l$
- $K$ = number of output units/classes

Note:
The double sum simply adds up the logistic regression costs calculated for each unit in the output layer; the triple sum simply adds up the squares of all the individual $\Theta$ values in the entire network.

Week 4 - Machine Learning
https://wei170.github.io/blog/coursera/ml/ml-stanford-4/
Mon, 01 Jul 2019 00:00:00 +0000

Non-linear Hypotheses

If we create a hypothesis with all degree-$r$ polynomial terms from $n$ features, there will be $\frac{(n+r-1)!}{r!(n-1)!}$ terms. For quadratic terms, the count grows as $O(n^{2}/2)$, which is not practical to compute for large $n$.
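The term count can be checked with a couple of lines (the 50x50-pixel example is my own illustration, not from the notes):

```python
from math import comb

def num_poly_terms(n, r):
    # Number of degree-r monomials in n features:
    # C(n + r - 1, r) = (n + r - 1)! / (r! * (n - 1)!)
    return comb(n + r - 1, r)

print(num_poly_terms(100, 2))   # 5050 quadratic terms from 100 features
print(num_poly_terms(2500, 2))  # 3126250: ~3.1M terms for a 50x50-pixel image
```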
Neural networks offer an alternative way to perform machine learning when we have complex hypotheses with many features.
Neurons and the Brain

There is evidence that the brain uses only one "learning algorithm" for all its different functions. Scientists have tried cutting (in an animal brain) the connection between the ears and the auditory cortex and rewiring the optic nerve to the auditory cortex, finding that the auditory cortex literally learns to see.

Week 3 - Machine Learning
https://wei170.github.io/blog/coursera/ml/ml-stanford-3/
Sat, 29 Jun 2019 00:00:00 +0000

Classification

Now we are switching from regression problems to classification problems. Don't be confused by the name "Logistic Regression"; it is named that way for historical reasons and is actually an approach to classification problems, not regression problems.
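As a quick illustration of why it is a classification method, logistic regression squashes any real input into $(0, 1)$ with the standard sigmoid $g(z) = 1/(1+e^{-z})$ and thresholds the result to get a discrete label (a sketch; the 0.5 threshold is the usual default):

```python
import math

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(z, threshold=0.5):
    # Classification, not regression: the output is a discrete label 0 or 1
    return 1 if sigmoid(z) >= threshold else 0

print(sigmoid(0))           # 0.5: exactly on the decision boundary
print(predict_class(2.0))   # 1
print(predict_class(-2.0))  # 0
```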
Binary Classification Problem

$y$ can take on only two values, 0 and 1.
Hypothesis Representation

We could approach the classification problem ignoring the fact that $y$ is discrete-valued, and use our old linear regression algorithm to try to predict $y$ given $x$.

Week 2 - Machine Learning
https://wei170.github.io/blog/coursera/ml/ml-stanford-2/
Fri, 28 Jun 2019 00:00:00 +0000

Multiple Features

Linear regression with multiple variables is also known as multivariate linear regression.
The notation for equations:
$$ x_j^{(i)} = \text{value of feature } j \text{ in the }i^{th}\text{ training example} $$
$$ x^{(i)} = \text{the input (features) of the }i^{th}\text{ training example} $$
$$ m = \text{the number of training examples} $$
$$ n = \text{the number of features} $$
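This notation maps directly onto a design matrix; a sketch with a made-up training set of 3 examples and 2 features (note that the notes index from 1 while arrays index from 0):

```python
import numpy as np

# Made-up training set: m = 3 examples (rows), n = 2 features (columns)
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [852.0,  2.0]])

m, n = X.shape     # m = number of training examples, n = number of features
x_2 = X[1]         # x^{(2)}: the input (features) of the 2nd training example
x_2_1 = X[1, 0]    # x_1^{(2)}: value of feature 1 in the 2nd training example

print(m, n)    # 3 2
print(x_2_1)   # 1416.0
```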
The multivariable form of the hypothesis function is $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$.

Week 1 - Machine Learning
https://wei170.github.io/blog/coursera/ml/ml-stanford-1/
Thu, 27 Jun 2019 00:00:00 +0000

The Hypothesis Function

$$\hat{y} = h_\theta(x) = \theta_0 + \theta_1 x$$
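This hypothesis can be fit by batch gradient descent; a self-contained sketch on made-up data (the learning rate `alpha` and iteration count are arbitrary choices of mine), tracking the squared-error cost at each step:

```python
# Fit h(x) = theta0 + theta1 * x by batch gradient descent on toy data,
# recording the cost J at every iteration (it should decrease monotonically).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # generated by y = 1 + 2x, so an exact fit exists
m = len(xs)
theta0, theta1, alpha = 0.0, 0.0, 0.05

def cost(t0, t1):
    # J(theta0, theta1) = 1/(2m) * sum of squared prediction errors
    return sum((t0 + t1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

history = []
for _ in range(500):
    grad0 = sum(theta0 + theta1 * x - y for x, y in zip(xs, ys)) / m
    grad1 = sum((theta0 + theta1 * x - y) * x for x, y in zip(xs, ys)) / m
    theta0 -= alpha * grad0
    theta1 -= alpha * grad1
    history.append(cost(theta0, theta1))

# With a small enough alpha, J decreases monotonically toward 0
assert all(a >= b for a, b in zip(history, history[1:]))
```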
Cost Function

To measure the accuracy of the hypothesis function, we take an average (actually a fancier version of an average) of all the results of the hypothesis on the inputs $x$ compared to the actual outputs $y$.
$$J(\theta_0, \theta_1) = \dfrac{1}{2m} \displaystyle \sum _{i=1}^m \left( \hat{y}_i- y_i \right)^2 = \dfrac{1}{2m} \displaystyle \sum _{i=1}^m \left (h _\theta(x_i) - y_i \right)^2$$

Overview of The Cellular Drone Development
https://wei170.github.io/blog/cellular-drone-development/
Mon, 24 Jun 2019 00:00:00 +0000

Overview

I named this project DCenter, for Drone Center, hoping that it can develop into a centralized platform that supports commercial drones flying semi-autonomously with collision avoidance while streaming data and 4K video to any user.
Currently, I have set up a simplified infrastructure on Heroku and GCP, and built an Android app that serves as both the adaptor and the controller of the drone. The Cellular Drone blog post gives a more comprehensive view of the whole project and an evaluation of its current stage.

Cellular Drone
https://wei170.github.io/blog/cellular-drone/
Fri, 31 May 2019 00:00:00 +0000

Motivation

Over the past few years, drones have become central to the functions of various businesses and governmental organizations and have managed to pierce through areas where certain industries were either stagnant or lagging behind.
However, almost all commercial drones are controlled by a remote controller over a WiFi connection, which severely restricts their mobility and the scale of exploration. In addition, the WiFi network is not secure.
https://wei170.github.io/blog/coursera/ml/cnn/
The main benefits of padding are the following:
It allows you to use a CONV layer without necessarily shrinking the height and width of the volumes. This is important for building deeper networks, since otherwise the height/width would shrink as you go to deeper layers. An important special case is the “same” convolution, in which the height/width is exactly preserved after one layer.
It helps us keep more of the information at the border of an image.
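Both benefits follow from the convolution output-size formula; a sketch, assuming a square $n \times n$ input, an $f \times f$ filter, padding $p$, and stride 1 (the example sizes are my own):

```python
def conv_output_size(n, f, p, stride=1):
    # Output height/width of a convolution: floor((n + 2p - f) / stride) + 1
    return (n + 2 * p - f) // stride + 1

def same_padding(f):
    # "Same" convolution: choose p so the output size equals the input size
    # (requires an odd filter size f and stride 1): p = (f - 1) / 2
    return (f - 1) // 2

print(conv_output_size(n=6, f=3, p=0))                # 4: unpadded, 6 shrinks to 4
print(conv_output_size(n=6, f=3, p=same_padding(3)))  # 6: "same" conv preserves size
```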