Friday, March 12, 2010

The Firm Size Entroplet

The statistical growth of a sampled set of company sizes (measured by number of employees) should follow an entropic dispersion. The growth of an arbitrary firm behaves much like the adaptation of a species, as I described recently in a post called "Dispersion, Diversity, and Resilience". The two avenues for growth are a maximum entropy variation in time intervals (T) and a maximum entropy variation in innovation or preferential attachment of employees to large firms (X). The combination of these two stochastic variates leads to the entroplet function.

Data from this Science article by RL Axtell parsimoniously supports the model. The probability density function has to normalize to 1 and we have one free parameter (N) with which to fit the data.
p(Size) = N / (Size+N)^2
The figure below shows the best fit to the data superimposed on Axtell's Zipf law straight line. The model suggests that the characteristic dispersed firm size is N=2 employees. Axtell obtained a regression fit for an exponent of 2.059, which agrees well with the MaxEnt value of 2. The entroplet also works better for the small firm data, where it looks like Zipf's law should truncate. The data in the large size tail region suffers from relatively poor statistics due to the low frequency of occurrence.
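As a rough illustration of why the entroplet agrees with a power law of exponent 2 in the tail yet departs from it for small firms, here is a minimal Python sketch. The function names are my own labels for this example, and N=2 is simply the fitted value quoted above. The local log-log slope of p(Size) works out to -2*Size/(Size+N), which tends to -2 for large firms and flattens toward 0 for small ones.

```python
def entroplet_pdf(size, n=2.0):
    """Entroplet density p(Size) = N / (Size + N)^2."""
    return n / (size + n) ** 2


def entroplet_cdf(size, n=2.0):
    """Integral of the density from 0 to size: Size / (Size + N)."""
    return size / (size + n)


def local_loglog_slope(size, n=2.0):
    """d ln p / d ln Size = -2 * Size / (Size + N)."""
    return -2.0 * size / (size + n)


if __name__ == "__main__":
    # N = 2 is the characteristic size from the fit described in the text.
    for s in (1, 2, 10, 100, 10_000):
        print(f"Size={s:>6}  pdf={entroplet_pdf(s):.3e}  "
              f"cdf={entroplet_cdf(s):.4f}  slope={local_loglog_slope(s):.3f}")
```

With this form, N also serves as the median: half of the probability lies below Size = N.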


I will give a quick outline[1] of another way to derive the entroplet, one that differs from the derivation I used for species adaptation (which used a cdf instead of a pdf). We want to find the pdf of the ratio of two random variables, R = X/T, where X and T are independent and each exponentially distributed with the mean scaled to unity. Then the pdf is
p(r) = Integral( t * p(t*r | t) * p(t) dt )
where t ranges over the values of T, so that for a fixed t the ratio R = X/t is just a scaled version of X. In other words, conditioning on T = t turns the reciprocal into a multiplicative factor, which makes the conditional probability trivial to evaluate; integrating over all t then accounts for the uncertainty in T (see this post for a similar derivation). Substituting the two unit-mean exponentials gives
p(r) = Integral( t * exp(-t*r) * exp(-t) dt )
which for r > 0 reduces to the normalized function
p(r) = 1/(1+r)^2
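For readers who want the intermediate step, the reduction can be written out as follows. This is just the standard integral of t times a decaying exponential, taken over t from 0 to infinity (the limits are implied by the exponential distribution of T), followed by a check of the normalization.

```latex
\[
p(r) = \int_0^\infty t\, e^{-tr}\, e^{-t}\, dt
     = \int_0^\infty t\, e^{-(1+r)t}\, dt
     = \frac{1}{(1+r)^2},
\qquad \text{using } \int_0^\infty t\, e^{-at}\, dt = \frac{1}{a^2} \text{ for } a > 0 .
\]
\[
\int_0^\infty \frac{dr}{(1+r)^2} = \left[ -\frac{1}{1+r} \right]_0^\infty = 1 .
\]
```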
To use this for general modeling, we replace the unit scale with the parameter N and change the rate R to a proportional Size, so that Size = N*R.
p(Size) = N/(N+Size)^2
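The derivation can also be checked numerically. The sketch below is a simple Monte Carlo test, not part of the original analysis: it draws X and T from unit-mean exponentials, forms Size = N*X/T, and compares the empirical cdf against Size/(Size+N), which is the integral of the density above. The function names, the sample count, and the seed are arbitrary choices for illustration.

```python
import random


def sample_sizes(count, n=2.0, seed=0):
    """Draw Size = N * X / T with X, T independent unit-mean exponentials."""
    rng = random.Random(seed)
    sizes = []
    while len(sizes) < count:
        x = rng.expovariate(1.0)
        t = rng.expovariate(1.0)
        if t > 0.0:  # guard against the (vanishingly rare) exact zero
            sizes.append(n * x / t)
    return sizes


def empirical_cdf(samples, threshold):
    """Fraction of samples at or below the threshold."""
    return sum(1 for s in samples if s <= threshold) / len(samples)


def entroplet_cdf(size, n=2.0):
    """Closed-form cdf of p(Size) = N / (N + Size)^2."""
    return size / (size + n)


if __name__ == "__main__":
    samples = sample_sizes(200_000)
    for s in (0.5, 2.0, 10.0, 100.0):
        print(f"Size<={s:>6}: sampled={empirical_cdf(samples, s):.4f}  "
              f"model={entroplet_cdf(s):.4f}")
```

The sampled and model columns should agree to within sampling noise, which shrinks as the sample count grows.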




I continue to walk through these case studies because the math model invariably fits the data to a tee.

In the words of Joseph McCauley, the realm of econophysics occupies the world "where noise rather than foresight reigns supreme". In this case, we have no idea how the growth of an arbitrary company will proceed; yet we know the average rates of growth and other metrics. This gives us enough of a toehold so that we can estimate the entire distribution from those numbers with the help of our noisy friend, entropy.


---

[1] Adapted from "Probability and Random Processes for Electrical Engineering" by Alberto Leon-Garcia, p.273.