The limit of Lojban

How many gismu are possible?

A gismu is a 5-letter root word that functions as a verb in the Lojban language.
It has a specific structure of the form CVCCV or CCVCV, where C is a consonant and V is a vowel. In the CCVCV form the first two consonants must form a permitted initial cluster such as pl.

There is a rule that a gismu can't exist if it is too similar to an already existing gismu. For example, there is a gismu dansu (to dance), so a could-be gismu tansu can't exist, because t is blocked by the similar sound d. By the same token, two official gismu can't differ in the last vowel only: there can be no dansa, since dansu already exists.

This rule applies to official gismu only. What if we extended it to all possible gismu? How many of them would we be able to get?
The answer is...

around 17895
possible experimental gismu
+ 1392 official gismu

How did I count them? Here is the algorithm.

  • Generate all possible gismu forms (e.g. larfa, panfo).
  • Try adding them one by one to the list of already existing gismu, which starts out with the official ones.
  • Add a candidate to the list only if it is not blocked by any similarity rule.
  • Finally, count how many new gismu you got.
  • Repeat the whole operation from the first step, but shuffle the list of possible gismu first so that the candidates are tried in a different order.

To my surprise, the reshuffling in the last step doesn't change the result much. It's always around 17890 to 17900 possible gismu. Future research could determine the maximum number of experimental gismu obtainable if the optimal order of candidates is used.
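
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the code from the repository: the function names (blocked_forms, greedy_fill) are made up for this example, and the consonant similarity table is an assumption based on the gismu creation rules in the Lojban reference grammar, so check it against the official source before relying on it. Generating the full candidate space is left out, because it needs the tables of permitted consonant clusters.

```python
import random

VOWELS = "aeiou"

# Assumed table of "too similar" consonants (symmetric by construction).
SIMILAR = {
    "b": "pv", "c": "js", "d": "t", "f": "pv", "g": "kx", "j": "cz",
    "k": "gx", "l": "r", "m": "n", "n": "m", "p": "bf", "r": "l",
    "s": "cz", "t": "d", "v": "bf", "x": "gk", "z": "js",
}


def blocked_forms(word):
    """Yield the word itself and every 5-letter form that is too similar to it."""
    yield word
    # Rule 1: forms that differ only in the final vowel.
    for v in VOWELS:
        if v != word[4]:
            yield word[:4] + v
    # Rule 2: forms that differ in exactly one consonant, replaced by a similar one.
    # Vowels are not in SIMILAR, so vowel positions produce nothing here.
    for i in range(4):
        for c in SIMILAR.get(word[i], ""):
            yield word[:i] + c + word[i + 1:]


def greedy_fill(candidates, official, seed=None):
    """Try candidates in shuffled order; keep each one that nothing blocks."""
    order = list(candidates)
    random.Random(seed).shuffle(order)
    taken = set(official)
    added = []
    for word in order:
        if not any(form in taken for form in blocked_forms(word)):
            taken.add(word)
            added.append(word)
    return added


# tansu and dansa are blocked by the existing dansu; larfa and panfo are free.
new_gismu = greedy_fill(["tansu", "dansa", "larfa", "panfo"], ["dansu"], seed=0)
print(sorted(new_gismu))  # ['larfa', 'panfo']
```

Because the similarity table is symmetric, checking whether any too-similar form of a candidate is already taken is equivalent to comparing the candidate against every accepted gismu, but it only needs a handful of set lookups per candidate. Calling greedy_fill with different seed values corresponds to the reshuffling step.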

The code on GitHub implementing this algorithm

Speed of depletion of free gismu space

Obviously, at first almost every candidate taken arbitrarily from the possible space can become a new gismu without being blocked by any other gismu. Over time, it gets harder and harder to find a vacant one. Here is the chart showing the speed of depletion. The X-axis shows the number of gismu assigned so far. The Y-axis shows how many attempts it takes before a new gismu is successfully assigned. It looks like the Y-axis needs a double logarithmic scale.
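
The data behind such a chart could be collected along these lines. This is only a sketch under assumptions: depletion_curve is a hypothetical helper, it counts only newly assigned gismu on the X-axis, and the toy rule final_vowel_forms covers just the final-vowel rule so that the example stays short.

```python
def depletion_curve(order, blocked_forms, taken=()):
    """Return (gismu assigned so far, attempts needed for the next one) pairs."""
    taken = set(taken)
    curve = []
    assigned = 0
    attempts = 0
    for word in order:
        attempts += 1
        if any(form in taken for form in blocked_forms(word)):
            continue  # blocked: this attempt failed, keep counting
        curve.append((assigned, attempts))
        taken.add(word)
        assigned += 1
        attempts = 0
    return curve


def final_vowel_forms(word):
    """Toy similarity rule: the word itself and all final-vowel variants."""
    return [word[:4] + v for v in "aeiou"]


points = depletion_curve(["dansu", "dansa", "danse", "larfa"], final_vowel_forms)
print(points)  # [(0, 1), (1, 3)]
```

Each pair is one point of the chart: the first gismu is assigned on the first attempt, while the second one needs three attempts because dansa and danse are blocked by dansu.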