Given a list of words from Google News for which a 'semantic' distance was available
Take the 150 most semantically similar word pairs for each of the 70k words
Calculate phonetic similarity for each word pair (using
Remove all pairs where the two words have the same stem (using Porter stemming).
Output: just under 2 million word pairs (1,959,712).
Pretty pictures and some interesting word pairs.
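
A rough sketch of the pipeline steps listed above, in Python, on toy data. This is not the original code. The names build_pairs, toy_stem, spelling_similarity and toy_semantic_sim are hypothetical. toy_stem is a crude suffix stripper standing in for a real Porter stemmer, and spelling_similarity (difflib on spellings) stands in for the phonetic measure, which these notes do not name.

```python
import heapq
from difflib import SequenceMatcher


def toy_stem(word):
    # Assumption: crude stand-in for Porter stemming, strips a few common suffixes.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def spelling_similarity(a, b):
    # Assumption: stand-in for the phonetic measure, compares spellings, result in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


def build_pairs(words, semantic_sim, k, stem=toy_stem, phonetic_sim=spelling_similarity):
    """Return (word, neighbour, semantic score, phonetic score) tuples."""
    pairs = []
    for w in words:
        others = (o for o in words if o != w)
        # Keep the k most semantically similar neighbours of w.
        for o in heapq.nlargest(k, others, key=lambda o: semantic_sim(w, o)):
            if stem(w) == stem(o):
                continue  # drop pairs where both words share a stem
            pairs.append((w, o, semantic_sim(w, o), phonetic_sim(w, o)))
    return pairs


# Toy semantic similarities (made-up numbers, for illustration only).
toy_sims = {
    frozenset(("walk", "walks")): 0.9,
    frozenset(("walk", "stroll")): 0.8,
    frozenset(("walk", "talk")): 0.3,
    frozenset(("walks", "stroll")): 0.7,
    frozenset(("walks", "talk")): 0.2,
    frozenset(("stroll", "talk")): 0.1,
}


def toy_semantic_sim(a, b):
    return toy_sims[frozenset((a, b))]


words = ["walk", "walks", "stroll", "talk"]
pairs = build_pairs(words, toy_semantic_sim, k=2)
for w, o, sem, phon in pairs:
    print(w, o, sem, round(phon, 2))
```

With k=150 and the real 70k-word list, the stem and phonetic functions would be swapped for the real ones. The sketch keeps (a, b) and (b, a) as separate pairs; whether the original run merged them is not stated here.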