Friday, October 24, 2008

Estimating URR from Dispersive Field Size Aggregation

This post continues from the analysis of oil field size distribution from a few days ago. That discussion ended with questions regarding the utility of the analysis for arbitrary regions. It seemed to work well for North Sea oil, but from eyeballing the Mexico data, the Parabolic Fractal Law looked like a better empirical fit.

I did not talk much about the statistics of the rank histograms in the previous post. To remedy that I ran a few Monte Carlo simulations to determine what noise we can expect on the histograms. In particular, I went back to the North Sea data. The figure below shows a Monte Carlo run for the Dispersive Aggregation model where I sampled from the inverted distribution with P acting as a stochastic variate:
Size = C*(1/P-1) where P=[0..1]
For a run of 200 samples, the results look like:

One can linearize this curve by taking the reciprocals of the variates and replotting. Note the sparseness of the endpoints which means that random fluctuations could change the local slope significantly (which has big implications for the Parabolic Fractal Model as well).
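As a rough sketch of the sampling step described above, the snippet below draws variates from Size = C*(1/P-1) and sorts them into a rank histogram. The function names, the seed, and the value C=20 are illustrative assumptions, not values from the original run. The linearization shown is only one reading of "taking the reciprocals": if rank/N stands in for P, then N/rank should track Size/C + 1.

```python
import random

def sample_field_sizes(n, c, seed=0):
    """Draw n sizes by inverting the dispersive distribution: Size = c*(1/P - 1)."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n):
        p = 1.0 - rng.random()  # uniform on (0, 1], so 1/p is always defined
        sizes.append(c * (1.0 / p - 1.0))
    return sizes

def rank_histogram(sizes):
    """Pair each size with its rank; rank 1 is the largest field."""
    ordered = sorted(sizes, reverse=True)
    return list(enumerate(ordered, start=1))

def linearize(ranked):
    """Return (size, N/rank) pairs; under the model N/rank ~ size/c + 1."""
    n = len(ranked)
    return [(size, n / rank) for rank, size in ranked]

if __name__ == "__main__":
    # c=20 is a placeholder value chosen for illustration only
    ranked = rank_histogram(sample_field_sizes(200, c=20.0))
    for size, inv_rank in linearize(ranked)[:5]:
        print(f"size={size:10.1f}  N/rank={inv_rank:6.1f}")
```

Re-running with different seeds gives a feel for how much the few top-ranked points jump around from run to run.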

Plotting the MC data simulated for 430 points on top of the actual North Sea data, we get this:

The following figure gives a range for the single adjustable parameter in the model. For the North Sea oil, I replotted using the MaxRank and two values of C which bounded the maximum value.



The parameter C acts like a multiplier so it essentially moves the curves up or down on a log-log plot.


One can try to estimate URR from the closed-form solution but as I said before, the lack of a "top" to the data makes it unreliable. The actual distribution goes like 1/Size so that this integrates to the logarithm of a potentially large number -- in other words, it diverges. So unless one can put a cap on a maximum field size, à la the PFM's curvature, the URR can look infinite. From the model's perspective, one can emulate this behavior by not allowing a narrow window of probability for those large reservoir sizes.

In terms of geological time, we have one finish line, corresponding to the current time. But the growth lifetimes for the dispersion to occur correspond roughly to all the points between now and the early history of oil formation some 300 million years ago. So we have to integrate to average out these times.
Cumulative (PDF(Size)) =
Integral (PDF(Size)dSize) from 0 .. Now =
Integral (PDF(kt)dt) from 0 .. Now
where the value of Now you can consider as roughly 300 million years from the start of the oil age. Small values of T correspond to dispersion that started longer ago, and higher values correspond to times closer to the present (Now). The number T itself scales proportionately to the rank index on a field distribution plot if dispersion proceeds more or less linearly with time (kT ~ Size). Also, a rank value of 1 corresponds to the largest value on a rank histogram plot, from which we can estimate the Maximum Field Size. Given a mature enough set of field data, this comes close to the ceiling above which fields cannot aggregate.

We essentially blank-out a probability window for field sizes above a certain value. This gives the following from the Dispersion relation:
P(Size) = K*Integral (C/(x+C)^2 dx) from x = [0..Size], with K = (L+C)/L chosen so that P(L) = 1
P(Size) = Size*(L+C)/(Size+C)/L


... inverting

Size = C*P/(1+C/L-P)
where P=[0..1]
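A minimal sketch of the capped model, assuming the two relations above are taken at face value. The names `capped_cdf`, `capped_size`, and `sample_capped`, and the values C=20 and L=1000, are hypothetical choices for illustration. The point is to show that the inversion round-trips and that no sampled size exceeds L.

```python
import random

def capped_cdf(size, c, cap):
    """P(Size) = Size*(L+C)/((Size+C)*L): the dispersive CDF renormalized on [0, L]."""
    return size * (cap + c) / ((size + c) * cap)

def capped_size(p, c, cap):
    """Inverse of capped_cdf: Size = C*P/(1 + C/L - P) for P in [0, 1]."""
    return c * p / (1.0 + c / cap - p)

def sample_capped(n, c, cap, seed=0):
    """Draw n sizes by feeding uniform variates through the inverse CDF."""
    rng = random.Random(seed)
    return [capped_size(rng.random(), c, cap) for _ in range(n)]

if __name__ == "__main__":
    c, cap = 20.0, 1000.0  # illustrative values; cap plays the role of L
    sizes = sample_capped(200, c, cap)
    print("largest sampled size:", max(sizes))
    print("round trip at P=0.3:", capped_cdf(capped_size(0.3, c, cap), c, cap))
```

Setting P=1 in `capped_size` returns L itself, which is how the constraint shows up as a hard ceiling at rank 1.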
The following set of curves shows the dispersive aggregate growth models under the conditions of a maximum field size constraint, set to L=1000.


Converting this graph to a rank histogram, you can notice an interesting stretching going on. Since we do have a constraint on field size, we can calculate an equivalent URR for the area under the curve.


We need to use the rank histogram to get the counting correct. Then the URR derives to:
URR = MaxRank * C * ((1+C/L)*ln (1+L/C) - 1)
for most cases, this approximates to:
URR ~ MaxRank * C * (ln (L/C) - 1)
Note that the URR has a stronger dependence on the parameter C than on the maximum field size L, which enters only through a weak logarithmic term. I will discuss the case of the USA further down this post, but keep in mind that Americans have more oil fields by far than anyone else in the world, i.e. a huge MaxRank, yet our URR does not swamp out everyone else.
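The closed form can be sanity-checked numerically. The sketch below assumes that the area under the rank histogram equals MaxRank times the integral of Size = C*P/(1+C/L-P) over P from 0 to 1, and compares a simple midpoint sum with both URR expressions. Function names and parameter values are illustrative assumptions.

```python
import math

def urr_closed_form(max_rank, c, cap):
    """URR = MaxRank * C * ((1 + C/L)*ln(1 + L/C) - 1)."""
    return max_rank * c * ((1.0 + c / cap) * math.log(1.0 + cap / c) - 1.0)

def urr_approx(max_rank, c, cap):
    """Large-L/C approximation: URR ~ MaxRank * C * (ln(L/C) - 1)."""
    return max_rank * c * (math.log(cap / c) - 1.0)

def urr_numeric(max_rank, c, cap, steps=200_000):
    """Midpoint-rule integral of C*P/(1 + C/L - P) over P in [0, 1], times MaxRank."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        total += c * p / (1.0 + c / cap - p)
    return max_rank * total * h

if __name__ == "__main__":
    c, cap = 20.0, 1000.0  # illustrative values only
    print("closed form:", urr_closed_form(1, c, cap))
    print("numeric    :", urr_numeric(1, c, cap))
    print("approximate:", urr_approx(1, c, cap))
    # Sensitivity: doubling C moves the result far more than doubling L
    print("double C:", urr_closed_form(1, 2 * c, cap))
    print("double L:", urr_closed_form(1, c, 2 * cap))
```

The approximation is only as good as the ratio L/C is large; at L/C = 50 it is off by a few percent, while at ratios in the tens of thousands the two forms are nearly indistinguishable.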

To test the model against reality, I retrieved field size data from Khebab's post on oil field sizes and Laherrere's paper on "Estimates of Oil Reserves".
  • North Sea (see above)
  • Mexico
  • Norway
  • World (minus USA/Canada)
  • West Siberia
  • Niger
Plus one estimate
  • USA
This chart from Laherrere shows data from Mexico superimposed with the Dispersive Aggregation model (no field size constraint). Note that the Cantarell field may fall on the predicted path rather than forming some sort of outlier (as some have suggested due to its meteorite-impact origin).

For Norway (courtesy of Khebab and Laherrere) we get the following two curves with data separated in time by several years. Note how the Maximum Rank shifts right as the value of C grows with time. Actually this shows that we may have some difficulty in separating out the decision of not developing smaller fields from an actual physical limit on the number of small fields that we count as production-level discoveries. (Caution: The values for C are in Gb, so they have to be multiplied by 1000 to match the other C's in this post)

Old data

Recent data

The World data plot (excluding USA and Canada) from Laherrere does not collect rank info from the smaller oil fields, so the vertical asymptote shown here gives a prediction of a Maximum Rank, approximately 9000 fields worldwide.

This gives a range in URRs from 1100 Gb to 1850 Gb, for values of C from 15 to 25 Mb and a Maximum Field Size of 250 Gb. I estimated the MaxRank for this model from Robelius' thesis.
An article by Ivanhoe and Leckie (1993) in Oil & Gas Journal reported the total amount of oil fields in the world to almost 42000, of which 31385 are in the USA. According to the latest Oil & Gas Journal worldwide production survey, the total number of oil fields in the USA is 34969 (Radler, 2006). The number of fields outside the USA is estimated to 12500, which is in good accordance with the number 12465 given by IHS Energy (Chew, 2005). Thus, the total number of oil fields in the world is estimated to 47 500.
From Khebab, the PFM gives a low end estimate ignoring the missing parts of the rank histogram:
Using his (Laherrere's) parameters, we can compute a world URR (excluding the US and Canada, conventional oil) equals to 1.250 Trillions of Barrels (Tb) without considering oil fields with sizes below 50 Mb.
This chart from West Siberia bins histogram data on a linear plot.



Niger Delta data does not work very well at all. This could potentially work well as a candidate for constrained reservoir sizes, yet we cannot rule out the possibility that some large fields have avoided discovery thus far.

I haven't found a field size distribution yet for the USA alone, but I generated the following figure as a prediction. I used a maximum rank of 34500 from Robelius's thesis and came up with two curves, with one assuming a maximum field size of 10 Gb (lower curve). The latter corresponds to a URR of 185 Gb. If I used C=0.7 and Max Field of 15 Gb, then I get a URR of 217 Gb. Ideally, I would like to get data from the USA (fat chance) to see how closely the Dispersive theory will agree with such a large (34,500) statistical sample.
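The USA arithmetic can be reproduced in a few lines. This is a sketch that assumes C=0.7 is expressed in Mb (consistent with the other C values in this post) and that field sizes must be converted from Gb to Mb before applying URR = MaxRank * C * ((1+C/L)*ln(1+L/C) - 1). The function name `urr_mb` is hypothetical.

```python
import math

def urr_mb(max_rank, c_mb, cap_mb):
    """URR in Mb from MaxRank * C * ((1 + C/L)*ln(1 + L/C) - 1), with C and L in Mb."""
    ratio = cap_mb / c_mb
    return max_rank * c_mb * ((1.0 + 1.0 / ratio) * math.log(1.0 + ratio) - 1.0)

MAX_RANK_USA = 34500  # field count quoted in the text
C_MB = 0.7            # assumed here to be in Mb

if __name__ == "__main__":
    for cap_gb in (10.0, 15.0):
        # Convert the cap from Gb to Mb going in, and the URR from Mb to Gb coming out
        urr_gb = urr_mb(MAX_RANK_USA, C_MB, cap_gb * 1000.0) / 1000.0
        print(f"L = {cap_gb:4.1f} Gb -> URR ~ {urr_gb:.0f} Gb")
```

Under these assumptions the 15 Gb case lands close to the 217 Gb quoted above. The 10 Gb line reuses C=0.7 only to show how weakly the result depends on L; it is not meant to reproduce the 185 Gb curve, whose C value the text does not state.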



Overall, most of the characteristic size (C) parameters for all the field size distribution curves fall in the range of 15 to 30 Mb (Siberia at 44), except for the USA, which looks definitely less than 1 Mb. What exactly does this mean? For one, it means that the USA has a much higher fraction of smaller oil fields than the rest of the world. Is this actually due to more resources invested into prospecting for smaller oil fields than in the rest of the world? Or is it because the USA has a physical preponderance of smaller oil fields? I don't know. Yet, the latter does make some sense considering how much more reserve growth the USA shows than the rest of the world (and the number of stripper wells we have). Slower reserve growth occurs for exactly the same reason -- slower relative dispersion in comparison to distance involved -- that it does for dispersive aggregation. After all, I constructed the underlying model in identical ways, substituting natural discovery in Dispersive Aggregation for man-made discovery in Dispersive Reserve Growth.

You can well ask why the curve nosedives so steeply near the maximum rank. It really only looks that way on a log-log plot. Actually the distribution flattens out near zero and this creates a graphical illusion of sorts. The dispersion model says that up to a certain recent time in geological history, many of the oil fields have not started dispersing significantly -- at this point the slow rates have not yet made their impact and the fast rates haven't had any time to evolve. This manifests as an unknown distribution of sizes for oil fields before this point. (If you plot a population's yearly income on a rank histogram you will see this same effect, in this case due to a similar truncation due to a slow income growth early in a career). The USA field essentially has a much slower dispersive evolution than the rest of the world, so we have a much higher fraction of small fields that have not aggregated.

The Dispersive Field Size Aggregation falls into the Dispersion Theory category of models that seem to have a high degree of cohesion and connectedness. It looks like we can actually connect the dots from dispersive field sizes to the Logistic shape of the Hubbert Peak since the underlying statistical fundamentals have much commonality in terms of temporal and spatial behaviors. For now I can't find a derivation of the Parabolic Fractal Law, which to me looks like a heuristic. I always base my observations on a model. I don't lock-step believe in heuristics, mainly because I have a perhaps unhealthy obsession with understanding why a heuristic works at all (review my railing against the "derivation" of the Logistic that I have frequently written about, until I figured it out to my satisfaction). By definition, a heuristic does not have to explain anything, it just has to describe the results. And describing the results in a mathematical equation does not cut it for me. For all I know, all the Wall Street quantitative analysts (the "quants") have based all their derivative and hedge fund "models" on heuristics -- and look at where that has got us.

In my mind, Dispersive Aggregation makes a lot of sense and it seems to fit the data. I smell another TOD post brewing.