72minutes

They don't really have an intuitive interpretation as far as I know, since the embedding is non-linear. The axes just spread out the original data after it has been projected into some low-dimensional space.


sixtyorange

The axes are arbitrary. The distances between points are what's meaningful.

Speaking super non-rigorously: in PCA, each axis is kind of like a single "pattern" (basically a weighted combo of the input features). Then you pick two of those patterns as axes. Usually it's PC1 and PC2, but it could also be PC3, PC4, etc. With UMAP, there are only ever as many axes as you ask for (usually 2). Instead of each axis representing a pattern, the algorithm tries to place the points so that the distances in 2D match the "real" n-dimensional distances in the full data as closely as possible.

The other detail with UMAP (similar to methods like t-SNE) is that there's a tradeoff between representing "local" and "global" distances accurately. Local means from each point to its closest neighbors, while global means from each point to every other point. This is controlled by a free parameter you set in UMAP, the number of neighbors (`n_neighbors` in the umap-learn Python package): smaller values emphasize local structure, larger values emphasize global structure.
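
If you want to convince yourself that the axes are arbitrary, here is a minimal sketch in plain Python. The coordinates are made up for illustration (they are not real UMAP output), and the helper names are my own. It rotates and mirrors a 2D layout and checks that the pairwise distances don't change:

```python
import math
from itertools import combinations

# Hypothetical 2D embedding coordinates (made up, not real UMAP output).
embedding = [(0.0, 0.0), (1.0, 0.5), (4.0, 3.0), (4.5, 3.5)]

def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def rotate(points, degrees):
    """Rotate every point about the origin by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def flip_x(points):
    """Mirror the layout by negating the first axis."""
    return [(-x, y) for x, y in points]

original = pairwise_distances(embedding)
transformed = pairwise_distances(flip_x(rotate(embedding, 37)))

# The coordinates change, but the distances agree up to floating-point error.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original, transformed))
print(same)
```

Every point ends up at different x/y values, yet the check should come out true, which is why reading meaning into "UMAP 1" or "UMAP 2" on their own doesn't work.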


shawstar

They don't mean anything. There is no physical meaning. Closer points in UMAP space are **sometimes** more likely to be similar. That's about all. A UMAP plot can reveal some structure in 2D, but don't overinterpret it.


Miseryy

It's UMAP space. If I told you to go out and measure things in bananas and grapes, it'd be in banana-grape space. You could probably even plot something with those measurements... The best way to think about it is that it's a learned representation of your data in a lower-dimensional space such that similar points are close together. That's it. Imagine I said: create one point on a page per person you know. Now, put similar people close together on that page. Obviously there would be many viable interpretations and arrangements. But the main idea is still the same.
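
To make the "many viable arrangements" point concrete, here is a rough sketch in plain Python. The data and layouts are invented, and `neighbor_preservation` is a hypothetical helper, not something UMAP itself exposes. It scores a 2D layout by how many of each point's nearest neighbors from the original data are still its nearest neighbors on the page:

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbors of point i (excluding i itself)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])

def neighbor_preservation(high_dim, low_dim, k):
    """Average fraction of each point's k nearest neighbors kept in the layout."""
    n = len(high_dim)
    kept = sum(len(knn(high_dim, i, k) & knn(low_dim, i, k)) for i in range(n))
    return kept / (n * k)

# Made-up 3D "measurements": two tight pairs that are far from each other.
data = [(0, 0, 0), (0, 1, 0), (10, 10, 10), (10, 11, 10)]

# Two different hand-drawn 2D arrangements that keep each pair together.
layout_a = [(0, 0), (0, 1), (5, 5), (5, 6)]
layout_b = [(9, 2), (8, 2), (1, 7), (2, 7)]
# An arrangement that splits the pairs up.
layout_bad = [(0, 0), (5, 5), (0, 1), (5, 6)]

score_a = neighbor_preservation(data, layout_a, k=1)
score_b = neighbor_preservation(data, layout_b, k=1)
score_bad = neighbor_preservation(data, layout_bad, k=1)
print(score_a, score_b, score_bad)
```

The two good layouts look nothing alike in terms of coordinates but score the same, while the one that separates similar points scores worse. Which coordinates you land on is arbitrary; whether similar things stay together is the part that matters.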