Election for World Government

I actually was able to fix two bugs from looking at this data. First, I figured out that my quota was being rounded down rather than up, leading to people being unnecessarily switched to poorly fitting clusters even when the switch didn't make the cluster sizes any more even.

Next I figured out that you really should decide who gets kicked out of an above-quota cluster by who is closest to another, below-quota one, rather than simply kicking out whoever is furthest from the center of that cluster. Previously I had been getting kicked out of the center-left cluster, because I was so distantly left, and into a below-quota right-wing one with Tom and Tsuke, which I realized made no sense. After the adjustment, you get a pretty reasonable distribution from right to left in the two-seat and three-seat groupings.
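A minimal sketch of the two fixes (hypothetical data shapes and function names, not my actual code): round the quota up instead of down, and when a cluster is over quota, move whichever member is closest to some below-quota cluster's center, rather than the member furthest from its own center.

```python
import math

def quota(n_voters, n_clusters):
    # Fix 1: round the quota up, not down.
    return math.ceil(n_voters / n_clusters)

def best_transfer(over_members, under_centers):
    """Fix 2: from an over-quota cluster, pick the member closest to some
    below-quota cluster's center, not the one furthest from its own center.
    Members and centers are coordinate tuples (hypothetical shapes)."""
    best = None
    for member in over_members:
        for idx, center in enumerate(under_centers):
            d = math.dist(member, center)
            if best is None or d < best[0]:
                best = (d, member, idx)
    return best[1], best[2]  # (member to move, receiving cluster index)

# Example: (5, 5) is closest to the under-quota center at (6, 6), so it is
# the one moved, even though (0, 0) is further from its own cluster's center.
member, target = best_transfer([(0, 0), (5, 5)], [(6, 6)])
```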

Now I think it's nearly equivalent to the elki solution:

https://elki-project.github.io/tutorial/same-size_k_means
 
lol what do the numbers mean?

The affinity numbers? It's distance. If you consider each candidate a dimension, you can consider each ballot/person as a point in N-dimensional space (here, it's 22-dimensional). You take the distance between the two points using the formula for Euclidean distance in n-dimensional space:

https://en.wikipedia.org/wiki/Euclidean_distance#n_dimensions

d(p, q) = √((p₁ − q₁)² + (p₂ − q₂)² + … + (pₙ − qₙ)²)


And that shows how alike they are.
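As a concrete sketch (made-up ballots; the real ones would have 22 scores each, four shown here for brevity):

```python
import math

# Two hypothetical ballots: one score per candidate, so each voter is a point.
ballot_a = [5, 0, 3, 1]
ballot_b = [4, 1, 3, 2]

# Euclidean distance in n dimensions: this is the "affinity" number.
affinity = math.sqrt(sum((a - b) ** 2 for a, b in zip(ballot_a, ballot_b)))
print(affinity)  # sqrt(1 + 1 + 0 + 1) ≈ 1.732
```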

Basically, the lower the number is, the more similar your opinions are to someone else's, and the more likely you are to be put in the same group with that person by the grouping algorithm, K-Means.

As for the numbers under the candidates... that's the average score they got in that group. The winners are always the ones with the highest average score. Or, if there are ties, this beta version just picks the first one in the list, which I need to fix.
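In code, the per-group averaging might look something like this (hypothetical ballot shapes; Python's max() happens to reproduce the first-in-list tie-break described above):

```python
def group_winner(ballots, candidates):
    """Average each candidate's score over a group's ballots; the highest
    average wins. On a tie, max() keeps the first candidate in list order,
    the same behaviour as the beta described above."""
    averages = {c: sum(b[c] for b in ballots) / len(ballots) for c in candidates}
    return max(candidates, key=averages.get), averages

# A hypothetical three-ballot group scoring two candidates.
group = [{"Paul": 5, "King": 3}, {"Paul": 2, "King": 4}, {"Paul": 5, "King": 4}]
winner, avgs = group_winner(group, ["Paul", "King"])
print(winner)  # Paul (average 4.0 vs King's 3.67)
```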
 
Distance is a kind of simple measure of affinity though... there's something called a Pearson correlation, and I won't go heavily into this, but it basically takes into account the shape of the ballot.

Basically imagine you have three sets of scores:

A: 1,2,3
B: 2,2,2
C: 2,4,6

Euclidean distance says that voter A is more similar to voter B, because those two points are closer. But the Pearson correlation would say that voter A is more similar to voter C, because the slope of their scores is more alike.
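A quick check of the A/B/C example with plain-Python formulas (note B's scores are constant, which makes its Pearson correlation undefined, a division by zero variance):

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def pearson(p, q):
    """Pearson correlation of two score vectors: covariance divided by the
    product of standard deviations. Undefined if either vector is constant."""
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    cov = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    sp = math.sqrt(sum((a - mp) ** 2 for a in p))
    sq = math.sqrt(sum((b - mq) ** 2 for b in q))
    return cov / (sp * sq)

A, B, C = [1, 2, 3], [2, 2, 2], [2, 4, 6]
print(euclidean(A, B))  # sqrt(2)  ≈ 1.41 — A is "closer" to B by distance
print(euclidean(A, C))  # sqrt(14) ≈ 3.74
print(pearson(A, C))    # 1.0 — identical slope, perfect correlation
# pearson(A, B) would divide by zero: B has no variance.
```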
 
Data clustering is a fascinating subject, with many uses in data mining and machine learning. Truly it is worthy of praising the lord Allah that he gave us such a gift.
 
I selected Paul, Trump, Deng, Hitler, and King.

King and Paul wouldn't be doing any actual ruling. Paul is there to make sure the other three don't turn everything into a dictatorship, and King is there for minorities. Also he's black so I'm officially nawt racist!

Deng is there because you need someone who has the long-term view instilled into him and is a good leader. China, with its 2,000-year history, will do that to you. Trump because, more than anyone, he understands that perception of reality becomes reality, and he is the best propagandist I have seen. I have never seen someone destroy another person more completely, and with as little effort, as he did with JEB! by placing the "low energy" moniker on him. He may have even destroyed any future prospects of a Warren presidency with "Pocahontas". Hitler because he was able to take a country that lost a world war a generation earlier and almost win the next.

I have no idea what people see in George Washington. At best he was a middling general, and any accomplishments during his presidency seem to have been Jefferson's doing.
 
K-means works by computing initial cluster centers, assigning each person to the cluster they're closest to, then taking the mean of everyone in the group and using that as the new center. It rinses and repeats until the centers stop changing.

Technically this is a non-deterministic process, and my implementation's results changed wildly from run to run. But the scikit implementation runs many times and keeps the best result (FYI, I set it to run 100 times), and its output never seems to change.
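The loop plus the best-of-many-restarts trick can be sketched in plain Python (a toy version, nothing like scikit's optimized implementation; the names are mine):

```python
import random

def dist2(p, q):
    # Squared Euclidean distance (no need for the sqrt when comparing).
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    return [sum(coord) / len(points) for coord in zip(*points)]

def kmeans(points, k, n_init=100, max_iter=50, seed=0):
    """Lloyd's algorithm with random restarts: keep the labelling whose
    total within-cluster distance (inertia) is lowest across n_init runs."""
    rng = random.Random(seed)
    best_labels, best_inertia = None, float("inf")
    for _ in range(n_init):
        centers = rng.sample(points, k)  # k random points as initial centers
        for _ in range(max_iter):
            # Assign each point to its nearest center.
            labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                      for p in points]
            # Recompute each center as the mean of its members.
            new_centers = [mean([p for p, l in zip(points, labels) if l == c])
                           if any(l == c for l in labels) else centers[c]
                           for c in range(k)]
            if new_centers == centers:  # rinse and repeat until stable
                break
            centers = new_centers
        inertia = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
        if inertia < best_inertia:
            best_inertia, best_labels = inertia, labels
    return best_labels

labels = kmeans([[1, 1], [1, 2], [8, 8], [9, 8]], k=2)
# The two left points share one label; the two right points share the other.
```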

Technically I suppose the ideal solution would be to run it on *all* possible cluster centers and choose the one that minimized average distance the most while having equal cluster sizes. But such a solution would never scale and would take a crazy amount of processing power.
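One way to see why the exhaustive version can't scale: just count the number of ways to split n ballots into k equal-size groups (standard multinomial arithmetic, nothing specific to my implementation):

```python
import math

def equal_partitions(n, k):
    """Number of ways to split n ballots into k unlabelled groups of
    equal size n // k (assumes k divides n evenly)."""
    size = n // k
    return math.factorial(n) // (math.factorial(size) ** k * math.factorial(k))

print(equal_partitions(12, 3))  # 5775 — already non-trivial
print(equal_partitions(30, 3))  # roughly 10**12: hopeless to enumerate
```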
 
"And almost win the next"

Lol no.

Few people in history have done as much damage to their country as Hitler. Hitler tested a concept, that you could just throw away all rules of justice and consideration for the enemy, attack wildly, and overwhelm the enemy that way. The lesson was: payback is a bitch.
 