2022-02-23

A starting word for WORDLE

Do you play Wordle? Then you need a good starting word. I picked mine as the word with the smallest mean Levenshtein distance to all the other words in the dictionary. The Levenshtein distance is the minimum number of single-letter edits (insertions, deletions, or substitutions) needed to change one word into another; for example, the distance between trope and trove is one. Averaging this distance over the entire dictionary gives one score per candidate word.
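To make the metric concrete, here is a minimal pure-Python sketch of the Levenshtein distance and of the averaging step. The function names and the three-word toy dictionary are my own illustration and are not part of the script further down, which relies on a dedicated module instead.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-letter insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def mean_distance(candidate: str, words: list) -> float:
    """Average distance from candidate to every other word in the list."""
    others = [w for w in words if w != candidate]
    return sum(levenshtein(candidate, w) for w in others) / len(others)


sample = ["trope", "trove", "grove"]   # hypothetical toy dictionary
print(levenshtein("trope", "trove"))   # 1
print(mean_distance("trove", sample))  # 1.0
print(mean_distance("trope", sample))  # 1.5
```

In this toy dictionary, trove would be the better starting word because it is, on average, closer to the other words than trope.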

Python Virtualenv

It is useful to create a separate Python virtual environment for this. At the Windows command prompt, type:

   py -m venv venv_wordle

and enable the virtual environment with:

   venv_wordle\scripts\activate

We also need two Python modules. The NLTK module can be installed with:

   pip install nltk

and the dedicated module that implements the Levenshtein distance with:

   pip install levenshtein

Python Script

The script below computes the mean distance for each candidate word. From the full word list, it keeps only five-letter words that consist purely of letters and contain no repeated letter.

    # filename: wordle.py
    # Requires the WordNet corpus; download it once with:
    #   python -m nltk.downloader wordnet
    from nltk.corpus import wordnet
    import Levenshtein

    # Keep five-letter words made only of letters, with no repeated letter.
    wordle = [n for n in wordnet.all_lemma_names()
              if len(n) == 5 and len(set(n)) == 5 and n.isalpha()]

    mindistance = 5  # upper bound: two five-letter words differ by at most 5 edits
    firstguess = ""
    for startword in wordle:
        # Sum of the distances to every word; the distance to itself is 0.
        total = sum(Levenshtein.distance(startword, word) for word in wordle)
        mean = total / (len(wordle) - 1)
        if mean < mindistance:
            # Found a better starting word.
            firstguess = startword
            mindistance = mean
            print(f"new first guess {firstguess} distance = {mean}")
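
The filter in the list comprehension can be tried on its own without downloading anything. The snippet below is only a sketch: the helper name is_candidate and the sample words are hypothetical, chosen to show one case per rejection rule.

```python
def is_candidate(word: str) -> bool:
    """Five characters, letters only, and no letter used twice."""
    return len(word) == 5 and word.isalpha() and len(set(word)) == 5


# Hypothetical sample: a repeated letter, a digit, and a wrong length get rejected.
sample = ["trope", "caret", "apple", "1750s", "route", "trees", "word"]
print([w for w in sample if is_candidate(w)])  # ['trope', 'caret', 'route']
```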


    

The output of the script is shown in the figure below:

Caret is the word with the smallest distance that is also included in the word list of the game.

Hopefully, this will help you to become a successful Wordle player.