How can I turn a series into an array and keep the index?

I'm running a k-means algorithm (k=5) to cluster my data. To check the stability of the clustering, I first run the algorithm once on my whole dataset, and afterwards I run it multiple times on 2/3 of my dataset (using a different random state for each split). I use each of those fits to predict the cluster of the remaining 1/3 of my data. Finally, I want to compare the predicted clusters with the clusters I get when I run k-means on the whole dataset. This is where I get stuck.
Since k-means assigns arbitrary labels, the (more or less) same clusters can get different labels on each run, so I can't just compare the labels directly. I tried using .value_counts() to reassign the labels 0 to 4 based on their frequency. But because I run this check multiple times, I need something that works in a loop.
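To make the frequency idea concrete, here is a minimal sketch of a relabeling step that could be called inside the loop. The function name relabel_by_frequency and the example series run_a and run_b are hypothetical, not from my actual code, and the sketch assumes the labels are held in a pandas Series. It only gives consistent labels when the cluster sizes are clearly different. If two clusters have similar sizes, their ranks can swap between runs.

```python
import pandas as pd


def relabel_by_frequency(labels: pd.Series) -> pd.Series:
    """Rename clusters so 0 is the most frequent label, 1 the next, and so on."""
    # value_counts() sorts by descending frequency; its index holds the old labels.
    order = labels.value_counts().index
    mapping = {old: new for new, old in enumerate(order)}
    return labels.map(mapping)


# Hypothetical labels from two runs that found the same clusters
# but gave them different names.
run_a = pd.Series([4, 4, 4, 0, 0, 2])
run_b = pd.Series([1, 1, 1, 3, 3, 0])

relabeled_a = relabel_by_frequency(run_a)
relabeled_b = relabel_by_frequency(run_b)
same = relabeled_a.equals(relabeled_b)
```

After relabeling, the two runs can be compared element by element, for example with (relabeled_a == relabeled_b).mean() to get the share of matching assignments.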
Basically, when I use .value_counts() I get something like this:

     PredictedCluster  
4              55555  
0              44444
2              33333
1              22222
3              11111

If I could turn this into an array like this

a = [[4, 55555],[0,44444],...,[3,11111]]

I would be fine. Basically I want to get an array where the labels are sorted by size.
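For illustration, one way this conversion might look is sketched below. The small labels series is made-up data standing in for my PredictedCluster column. The sketch relies on reset_index(), which moves the index (the cluster labels) into an ordinary column so that to_numpy() returns [label, count] pairs.

```python
import pandas as pd

# Made-up labels standing in for the real PredictedCluster column.
labels = pd.Series([4, 4, 4, 0, 0, 2], name="PredictedCluster")

# value_counts() returns the counts sorted by descending frequency,
# with the cluster labels as the index.
counts = labels.value_counts()

# Move the index into a regular column, then convert to a 2-D array
# whose rows are [label, count].
a = counts.reset_index().to_numpy()

# If only the label order is needed, the index alone is enough.
labels_by_size = counts.index.to_numpy()
```

If a plain Python list of pairs is preferred over a NumPy array, list(counts.items()) gives (label, count) tuples in the same order.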

Can anyone please tell me how to do this or what other approach I could use to solve my problem?



Read more here: https://stackoverflow.com/questions/64892135/how-can-i-turn-a-series-into-an-array-and-keep-the-index

Content Attribution

This content was originally published by jjunk at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
