mcramer (u/mcramer) - Readit News

mcramer commented on ML on Apple ][+ mdcramer.github.io/apple-... · Posted by u/mcramer

gwbas1c · 4 months ago

Any particular reason why the author chose to do this on an Apple ][?

(I mean, the pictures look cool and all.)

IE, did the author want to experiment with older forms of basic; or were they trying to learn more about old computers?

mcramer · 4 months ago

I wrote about my motivation at https://mdcramer.github.io/apple-2-blog/motivation/, which is obviously tongue in cheek. Tl;dr, I refurbished my Apple ][+ to try to recover a game I wrote in high school (https://mdcramer.github.io/apple-2-blog/recover/). After being unable to find the floppy with the game, I thought I'd try something just for giggles.

mcramer commented on ML on Apple ][+ mdcramer.github.io/apple-... · Posted by u/mcramer

amilios · 4 months ago

Bit of a weird choice to draw a decision boundary for a clustering algorithm...

mcramer · 4 months ago

How so? Drawing decision boundary is a pretty common visualization technique for understanding how an algorithm partitions a data space.

mcramer commented on ML on Apple ][+ mdcramer.github.io/apple-... · Posted by u/mcramer

windsignaling · 4 months ago

I'm surprised no one else has commented that a few of the conceptual comments in this article are a bit odd or just wrong.

> The final accuracy is 90% because 1 of the 10 observations is on the incorrect side of the decision boundary.

Who is using K-means for classification? If you have labels, then a supervised algorithm seems like a more appropriate choice.

> K-means clustering is a recursive algorithm

It is?

> If we know that the distributions are Gaussian, which is very frequently the case in machine learning

It is?

> we can employ a more powerful algorithm: Expectation Maximization (EM)

K-means is already an instance of the EM algorithm.

mcramer · 4 months ago

> Who is using K-means for classification? If you have labels, then a supervised algorithm seems like a more appropriate choice. The generated data is labeled but we can imagine those labels don't exist when running k-means. There are many applications for unsupervised clustering. I don't, however, think that there are many applications for running much of anything on an Apple ][+.

> K-means clustering is a recursive algorithm My bad. It's iterative. I'll fix that. Thanks.

> If we know that the distributions are Gaussian, which is very frequently the case in machine learning Gaussian distributions are very frequent and important in machine learning because of the Central Limit Theorem but, beyond that, you are correct. While many natural phenomena are approximately normal, the reason for the Gaussian's frequent use is often mathematical mathematical convenience. I'll correct my post.

> we can employ a more powerful algorithm: Expectation Maximization (EM) Excellent point. I will fix that, too. "While k-means is simple, it does not take advantage of our knowledge of the Gaussian nature of the data. If we know that the distributions are at least approximately Gaussian, which is frequently the case, we can employ a more powerful application of the Expectation Maximization (EM) framework (k-means is a specific implementation of centroid-based clustering that uses an iterative approach similar to EM with 'hard' clustering) that takes advantage of this." Thank you for pointing out all of this!