Generating wizardly words.
I had the idea to make a random word generator while writing my stories. After all, naming is hard. What's a convincing name for a novel city? Or a person? Or a food?
I'm no neural network genius, so I took a hash-table approach to figuring out probable letter combinations: the program counts which letter follows each short run of letters. You can feed it a file, say a list of French words, and it will determine the possible letter combinations along with their probabilities.
For example, while it parses the word "wizardly", it knows that "wiz" is likely followed by the letter "a" (in this case, the likelihood is 100% because the sample data is a single word 🤔).
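To make the counting idea concrete, here is a minimal sketch in Python (chosen for brevity; SmallCharacterModel itself is a Swift package and its real API and internals may well differ). The names `build_model` and `next_letter_probabilities` are made up for this illustration, and using shorter contexts at the start of a word is an assumption on my part.

```python
from collections import Counter, defaultdict

def build_model(words, context_len=3):
    """Map each context (up to context_len letters) to counts of the next letter.

    Assumption: at the start of a word, where fewer than context_len
    letters are available, the shorter prefix is used as the context.
    """
    model = defaultdict(Counter)
    for word in words:
        for i in range(1, len(word)):
            context = word[max(0, i - context_len):i]
            model[context][word[i]] += 1
    return model

def next_letter_probabilities(model, context):
    """Turn the raw counts for one context into probabilities."""
    counts = model.get(context, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: n / total for letter, n in counts.items()}

model = build_model(["wizardly"])
print(next_letter_probabilities(model, "wiz"))  # {'a': 1.0}
```

With a single training word every context has exactly one follower, which is why "wiz" leads to "a" with a probability of 1.0. A larger word list spreads the counts across several letters.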
The way it works is similar to large language models, kinda: it uses context (in most cases 3 letters, but it can be more) to determine the next most likely letter. ChatGPT uses a context window of up to 128k tokens to predict the next token. Wizardly Words uses 1-5 letters. Yes, I know, not quite as beefy as ChatGPT, but it also doesn't cost a reported $700,000 a day to run. In fact, it's... free! 💸
Back to how it works: context matters, but it's also important not to overfit. With a context of 5 letters instead of 3, we run the risk of creating words that sound too familiar. Here's the difference between words generated from the English dictionary using a context of 3 letters vs. 5 letters:
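Generation with a small context can be sketched roughly like this. It is a Python illustration under my own assumptions, not the package's Swift code: the end-of-word marker `END`, the `max_len` cutoff, and the choice to sample letters in proportion to their counts are all hypothetical details.

```python
import random
from collections import Counter, defaultdict

END = "$"  # hypothetical end-of-word marker, assumed absent from the data

def build_model(words, context_len=3):
    """Count which symbol (letter or END) follows each context."""
    model = defaultdict(Counter)
    for word in words:
        padded = word + END
        for i in range(len(padded)):
            context = padded[max(0, i - context_len):i]
            model[context][padded[i]] += 1
    return model

def generate_word(model, context_len=3, max_len=12, rng=random):
    """Grow a word one letter at a time from the last context_len letters."""
    word = ""
    while len(word) < max_len:
        context = word[max(0, len(word) - context_len):]
        counts = model.get(context)
        if not counts:
            break  # defensive: unseen context
        # Sample the next symbol in proportion to how often it was seen.
        symbol = rng.choices(list(counts), weights=list(counts.values()))[0]
        if symbol == END:
            break
        word += symbol
    return word

# With one training word there is only one path through the table.
print(generate_word(build_model(["wizardly"])))  # wizardly

# With several words the paths can cross and produce new combinations.
model = build_model(["wizard", "lizard", "wisdom", "blizzard"])
print([generate_word(model, rng=random.Random(seed)) for seed in range(5)])
```

Passing a seeded `random.Random` keeps runs repeatable, which is handy when comparing context lengths.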
3 Character Context | 5 Character Context |
---|---|
strate | parcel |
potem | cale |
votingl | talist |
omicrom | fade |
negapost | ventra |
twent | stimulale |
scro | adonize |
intraino | squa |
sumper | butter |
aire | encourage |
You'll notice that the 5-character context comes up with more "realistic" sounding words, but it often lands on words that already exist (nearly half of the sample above). The 3-character context produces words that sound less cohesive, but it also allows for more interesting variants.
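One rough way to put a number on "too familiar" is to check how many generated words already appear in a word list. The Python sketch below does that for the two columns above. The function name `share_already_known` is made up, and `tiny_dictionary` is a hypothetical stand-in for a real dictionary file, so the exact figures only illustrate the idea.

```python
def share_already_known(generated, dictionary):
    """Fraction of generated words that already appear in the dictionary."""
    if not generated:
        return 0.0
    known = {w.lower() for w in dictionary}
    return sum(w.lower() in known for w in generated) / len(generated)

# Hypothetical stand-in; a real run would load the full training word list.
tiny_dictionary = ["parcel", "fade", "butter", "encourage", "wizard"]

three_char = ["strate", "potem", "votingl", "omicrom", "negapost",
              "twent", "scro", "intraino", "sumper", "aire"]
five_char = ["parcel", "cale", "talist", "fade", "ventra",
             "stimulale", "adonize", "squa", "butter", "encourage"]

print(share_already_known(three_char, tiny_dictionary))  # 0.0
print(share_already_known(five_char, tiny_dictionary))   # 0.4
```

A fuller dictionary would likely flag a few more of these, so treat the output as a lower bound rather than a measurement.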
If you're interested in trying out the word generator in your own program, feel free to install SmallCharacterModel as a Swift Package and train it on your own data! Make sure to report any bugs or leave comments / questions.
Otherwise, you can play around with my sample app that uses the model builder and word generator. Here's what it looks like: