Engineer. I write about machine learning, engineering, and careers. Follow me here and on Twitter for future content: https://twitter.com/logancyang

If you have taken any machine learning course before, you have probably come across logistic regression at some point. There is this sigmoid function that links the linear predictor to the final prediction. Depending on the course, the sigmoid may be pulled out of thin air and introduced as the function that maps the real number line to the desired range [0, 1]. But there are infinitely many functions that could do this mapping, so why choose this one? One critical point to focus on is that the output of the sigmoid is interpreted as a *probability*. It’s obvious that…
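As a quick illustration of the mapping itself, here is a minimal sketch (the function and values are my own toy example, not from the course material):

```python
import numpy as np

def sigmoid(z):
    """Squash any real-valued linear predictor into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# The outputs can then be read as probabilities; sigmoid(0) is exactly 0.5,
# and the function is monotonic, so larger predictors mean larger probabilities.
z = np.array([-10.0, 0.0, 10.0])
print(sigmoid(z))
```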

In my previous article, Visualizing Optimization Trajectory of Neural Nets, I showed a number of animated figures demonstrating the training process of neural networks. Since then, some readers have expressed interest in the code, so I ported it to a Python package and published it on PyPI. With this package, you can produce similar plots easily, experiment with the default datasets and models out of the box, or plot your own data modules and/or models by implementing an interface in the style of **PyTorch Lightning**.

To install directly from PyPI:

`pip install loss-landscape-anim`

PCA reduces dimensionality by projecting the data onto a lower-dimensional space.

Intuitively, it just draws orthogonal lines from the data points to a lower-dimensional hyperplane and keeps the landing spots. The axes of that hyperplane are chosen to *maximize the variance they capture*: the more variance they capture, the more *information* the projections retain. Through this axis-selection scheme, PCA minimizes information loss and achieves a good *compression* of the data.
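That intuition can be made concrete in a few lines of NumPy. A minimal sketch with toy 2-D data (the data and variable names are mine for illustration): the principal axes are the eigenvectors of the covariance matrix, ordered by the variance they capture.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy 2-D data, stretched much more along one direction than the other
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])
Xc = X - X.mean(axis=0)  # PCA works on centered data

# principal axes = eigenvectors of the covariance matrix,
# sorted by the variance (eigenvalue) each axis captures
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
top_axis = eigvecs[:, order[0]]

# "drawing orthogonal lines" to the top axis = a dot product
projection = Xc @ top_axis
print(projection.shape)  # each 2-D point is compressed to a single number
```

Keeping only the top axis discards the direction with the least variance, which is exactly the "minimal information loss" the axis-selection scheme is after.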

I moved to New York City from California two years ago. Having been spoiled by the Californian sunshine and nice weather for several years, I forgot what it’s like living in a city with a lot of precipitation. Sometimes I feel the rain never stops. No matter the season, it can rain for several days in a row in NYC. And these are not just drizzles; they are heavy thunderstorms. It’s not an exaggeration to say that one of them can exceed the amount of precipitation Los Angeles gets in an entire year.

My wife doesn’t understand…

Earlier this year I came across a viral Twitter thread by Randall Kanna about how to create one’s own computer science degree from free online content. It was not only excellent for people without prior knowledge of computer science, but also valuable for new software engineers who didn’t major in CS in college. That’s when I thought about creating my own full-stack machine learning engineering degree. Having been in the tech industry and learned it the hard way, I believe a custom-designed curriculum will be valuable to myself and to others who have similar goals like…

In this article, I’m going to code up a special kind of squiggly shape that might not help us colonize the galaxy any time soon, but will probably bend your mind and make math fun again. The subject is called **fractals**.

Before answering this question, let’s first look at some examples. Consider the length of the coastline of Britain. How long do you think it is? After a bit of Googling, you may find:

11,073 miles according to the mapping authority for the United Kingdom

But it also mentions something called the **Coastline Paradox**. …
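The paradox becomes quantitative with an idealized fractal coastline. A minimal sketch, assuming a Koch curve (my choice of example): each refinement step replaces every segment with four segments of one-third the length, so the measured length grows without bound as the measuring ruler shrinks.

```python
def koch_length(n, base=1.0):
    """Total length of a Koch curve after n refinement steps."""
    # each step multiplies the total length by 4/3
    return base * (4.0 / 3.0) ** n

for n in (0, 1, 5, 20):
    print(n, koch_length(n))
```

The finer the ruler, the longer the answer, with no limit, which is why different sources report wildly different coastline lengths.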

When you want some values from a certain probability distribution, say, a normal distribution, you can simply call `rnorm` in R, or `numpy.random.normal` in Python. But have you ever wondered how they do it under the hood? The underlying idea is incredibly simple yet powerful. In this article, I'm going to explain it visually without boring you with any math symbols. In 3 minutes, you'll be able to implement your own custom distribution simulator. Let's get started!

Any computer system should come with a pseudorandom number generator that is able to give you (pseudo) uniformly distributed random numbers. This is…
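One standard way to turn those uniform numbers into samples from another distribution is inverse transform sampling: feed uniforms through the inverse CDF of the target distribution. A minimal sketch for an exponential distribution (my own toy example, not necessarily what `rnorm` or NumPy use internally):

```python
import numpy as np

def sample_exponential(size, lam=1.0, seed=0):
    """Inverse transform sampling: uniform numbers pushed through
    the inverse CDF of the Exponential(lam) distribution."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=size)       # (pseudo) uniform on [0, 1)
    return -np.log(1.0 - u) / lam    # inverse CDF of Exp(lam)

samples = sample_exponential(100_000, lam=2.0)
print(samples.mean())  # close to the true mean 1/lam = 0.5
```

Swap in any other invertible CDF and the same two lines give you a custom distribution simulator.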

As the beating heart of deep learning, a solid understanding of backpropagation is required of any deep learning practitioner. Although there are already many good resources that explain backpropagation on the internet, most of them approach it from very different angles, each suited to a certain type of audience. In this post, I’m going to combine intuition, animated graphs, and code for beginners and intermediate-level students of deep learning, for easier consumption. A good test of your understanding of any algorithm is whether you can code it yourself from scratch. …
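For a taste of what "from scratch" means here, a minimal sketch (the toy network and names are mine, not this post's code): backpropagation is the chain rule applied layer by layer, and a finite-difference check confirms the analytic gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3,))                         # one input sample
W1 = rng.normal(size=(4, 3))                      # hidden layer weights
W2 = rng.normal(size=(1, 4))                      # output layer weights
y = 1.0                                           # target

# forward pass
h = np.tanh(W1 @ x)
y_hat = W2 @ h
loss = 0.5 * (y_hat - y) ** 2

# backward pass: chain rule, one layer at a time
dy = y_hat - y                                    # dL/dy_hat
dW2 = np.outer(dy, h)                             # dL/dW2
dh = W2.T @ dy                                    # dL/dh
dW1 = np.outer(dh * (1 - h ** 2), x)              # tanh'(a) = 1 - tanh(a)^2

# sanity check: compare one entry against a finite difference
eps = 1e-5
W1p = W1.copy(); W1p[0, 0] += eps
loss_p = 0.5 * (W2 @ np.tanh(W1p @ x) - y) ** 2
grad_fd = (loss_p[0] - loss[0]) / eps
print(abs(grad_fd - dW1[0, 0]))  # tiny if the analytic gradient is right
```

If you can write this loop for an arbitrary stack of layers, you have backpropagation.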

*Update: I have ported the code to a Python package **here**. Feel free to experiment and produce plots like the ones in this post!*

In the previous post, I showed some animated plots of the training process of linear regression and logistic regression. Developing a good “feel” for how they “learn” is helpful because they can serve as a baseline before applying more complex models. Although most deep neural networks also use gradient-based learning, similar intuition is much harder to come by. One reason is that the parameters are very high-dimensional and there are a lot of…
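For the simple baseline cases, the full trajectory is easy to record. A minimal sketch (my own toy setup, not the post's plotting code): gradient descent on 1-D linear regression, logging the (w, b) path that such animations trace out.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 0.5 + rng.normal(scale=0.1, size=100)  # true w=2.0, b=0.5

w, b, lr = 0.0, 0.0, 0.1
trajectory = []
for _ in range(200):
    err = w * x + b - y
    w -= lr * 2 * (err * x).mean()   # gradient of MSE w.r.t. w
    b -= lr * 2 * err.mean()         # gradient of MSE w.r.t. b
    trajectory.append((w, b))

print(trajectory[-1])  # converges near the true (2.0, 0.5)
```

With only two parameters, the trajectory can be plotted directly on the loss surface; the difficulty the post addresses is doing something comparable when the parameters number in the millions.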

*This is one post in a series for machine learning optimization animations. Each plot can serve as a flashcard for easy consumption.*

If you are like me, you may prefer pictures that move over pages of Greek symbols when it comes to learning math. It’s more intuitive, more fun, and a great way to look under the hood and debug when things go wrong. So here I’m not going to bore you with equations. Equations and long derivations are important, but you already have countless books and notes for those.