Wealth of Nations


I recently read ‘one of the world’s most important books’, An Inquiry into the Nature and Causes of the Wealth of Nations, written by Adam Smith and published in 1776. I read the book as a detailed analysis of how capitalism spread through western civilization and how capitalism is an economy of greed.

The first three books can be categorized as economics/finance related:

  • Book I: Of the Causes of Improvement in the productive Powers of Labour.
  • Book II: Of the Nature, Accumulation, and Employment of Stock.
  • Book III: Of the different Progress of Opulence in different Nations.

The fourth and fifth books can be categorized as politics/policy related:

  • Book IV: Of Systems of political Economy.
  • Book V: Of the Revenue of the Sovereign or Commonwealth.

The book is highly detailed, to the point that Smith wanders into long digressions, and it is written in ‘eighteenth-century’ language. Therefore, while reading it I also read Eamonn Butler’s ‘The Condensed Wealth of Nations’ to refer to after each chapter.

Final Thought

Smith says it’s ‘good’ to allow your instinct toward selfishness to rule your life; he calls this capitalism. It works for a century or so before the robbed and disenfranchised revolt and kill off the greedy ones who rule society to their benefit.


Duration (Intro)

Duration based measures are used to capture the market risk sensitivities of bonds. These measures:

  • Calculate price movements relative to yield changes for yield sensitive bond instruments.
  • Can be added together to generate an overall price sensitivity relative to yield.
  • Do not capture price changes due to embedded convexity/optionality.
  • Do not capture the different sources of yield changes, such as distinguishing between those caused by changing perceptions of credit and those derived from changing perceptions of interest rates.

Institutions typically examine the price risk inherent in a bond using the notion of duration. This involves calculating the degree to which a bond’s price moves given a particular change in market yields. The simplest form of duration is Macaulay duration, which represents the weighted average maturity of a bond: the present value of each cash flow is weighted by the period when it is due, and the resultant sum is divided by the market price. The units of measurement are years. Once we have calculated the Macaulay duration we can calculate the modified duration. Modified duration is the percentage change in bond price for (sensitivity to) a change in yield, and it gives a linear approximation of the relationship between changes in price and changes in yield: the duration number represents the percentage price change for a 1% (100 basis point) move in yield. The modified duration number can be amended to generate an actual price change given a specific yield change, aka dollar duration, PVBP, PV01 or DV01. The regularly used formula for PVBP is:

PVBP = Modified Duration × Bond Price × 0.0001


For fixed rate bonds the modified duration will increase as the bond’s maturity increases, and decrease as the bond’s interest rate increases. The further the cash flows are from today, the greater the impact of discounting, and hence of interest rate changes, on value. For zero coupon bonds there are no cash flows before maturity, therefore the duration will always be equal to the maturity. For floating rate instruments, from an interest rate perspective, each time the rate resets the instrument returns to fair value. The change in value due to interest rate changes is dependent on the period between the rate change and the next reset date. The Macaulay duration of these instruments is equivalent to the time to the next coupon reset.
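The mechanics above can be sketched in a few lines of Python; the function name and the bond figures below are made-up illustrations, not examples from the text.

```python
# Sketch of Macaulay duration, modified duration and PVBP for a plain
# fixed-rate bond. All inputs are illustrative.

def bond_metrics(face, coupon_rate, ytm, years, freq=2):
    """Return (price, macaulay, modified, pvbp)."""
    n = years * freq                        # total coupon periods
    c = face * coupon_rate / freq           # coupon per period
    y = ytm / freq                          # periodic yield
    pv_total, weighted = 0.0, 0.0
    for t in range(1, n + 1):
        cf = c + (face if t == n else 0.0)
        pv = cf / (1 + y) ** t              # present value of this cash flow
        pv_total += pv
        weighted += pv * (t / freq)         # weight PV by time due, in years
    macaulay = weighted / pv_total          # years
    modified = macaulay / (1 + y)           # % price change per 1% yield change
    pvbp = modified * pv_total * 0.0001     # price change for a 1bp yield move
    return pv_total, macaulay, modified, pvbp

price, mac, mod, pvbp = bond_metrics(face=100, coupon_rate=0.05, ytm=0.04, years=10)
```

Note that running the same function with a zero coupon rate returns a Macaulay duration exactly equal to the maturity, as described above.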

Duration can be applied to a portfolio of positions as it allows the aggregation of multiple risks. Calculating the price sensitivity of a portfolio consisting of multiple instruments is straightforward, with the PV01 of the portfolio being equal to the sum of the individual measurements.

Limitations of Duration

Convexity and Non-Linearity


  • The estimation of price changes given changes in yield using duration is linear. It assumes that the price move if yields move ten basis points is ten times higher than when yields move one basis point, which is incorrect. The relationship between price and yield for a regular bond is convex. For most simple fixed income instruments the amount of price change for a given yield change increases as overall yields fall. The more convex the relationship, the greater the amount of relative price changes as yields fall.
  • Estimating price changes using duration is equivalent to using the tangent to the price/yield curve – as the size of the yield change increases, so too will the difference between a duration based estimate and a true value change. For a standard interest-bearing instrument, duration estimates overstate declines in value as yields rise and understate rises in value as yields fall.
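A small Python sketch (with made-up bond figures) makes the tangent-line point concrete: as the size of the yield move grows, the duration-only estimate progressively overstates the fall in value relative to full repricing.

```python
# Compare a duration (tangent-line) estimate with a full repricing as the
# size of the yield move grows. Bond inputs are illustrative.

def price(face, coupon_rate, ytm, years, freq=2):
    n, c, y = years * freq, face * coupon_rate / freq, ytm / freq
    return sum((c + (face if t == n else 0)) / (1 + y) ** t for t in range(1, n + 1))

face, cpn, y0, yrs = 100, 0.05, 0.05, 10
p0 = price(face, cpn, y0, yrs)

# Modified duration estimated numerically from a 1bp bump either side
bump = 0.0001
mdur = (price(face, cpn, y0 - bump, yrs) - price(face, cpn, y0 + bump, yrs)) / (2 * p0 * bump)

for dy in (0.0001, 0.001, 0.01):            # 1bp, 10bp, 100bp rises in yield
    true_chg = price(face, cpn, y0 + dy, yrs) - p0
    est_chg = -mdur * p0 * dy               # linear duration estimate
    # Convexity means the tangent estimate overstates the decline:
    assert est_chg <= true_chg
```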

No Single Source of Price Change Across Instruments

Summing together the sensitivities of different instruments to generate a single sensitivity can be attractive, but can be misleading:

  • Maturity: yield changes differ according to maturity, for instance yields on 10-Year bonds might increase by five basis points while those on 2-Year bonds might only move by one basis point.
  • Credit Risk: duration measurements cannot account for moves due to changes in credit risk, for instance credit perceptions can change on specific bonds and therefore have different effects on different types of bonds i.e. government bonds, corporate bonds etc.

Greeks Δ Γ ϒ Θ ρ (Intro)

The Greeks are the quantities representing the sensitivity of the price of derivatives such as options to a change in underlying parameters on which the value of an instrument or portfolio of financial instruments is dependent. Each risk variable of the (major) Greeks is defined below:

  • Δ (Delta) represents the sensitivity in the options value to a change in the underlying price.
  • Γ (Gamma) represents the rate of change between an options delta and the underlying price changes, (second-order derivative).
  • ϒ (Vega) represents the sensitivity of an options value to the underlying’s volatility.
  • Θ (Theta) represents the sensitivity of an options value to time.
  • ρ (Rho) represents the sensitivity of an options value to interest rate changes.

For options traders the Greeks are essential, as they provide a guide to analysing the current and potential exposures of a position. If the option holder is planning to keep the option to maturity the Greeks become potentially irrelevant. However if the trader wants to manage the exposures of options then the sensitivities are fundamentally important. If buying an option gives a trader a positive delta exposure, then that trader needs to hedge by generating a short position. Similarly, estimates of the change in position will guide a trader as to which future trades may be necessary to rebalance the exposure.

For a portfolio of positions, the sensitivities are broadly additive, and care should be taken when the maturities of instruments differ. For example a single vega exposure may be misleading since the volatilities of instruments with different maturities/strikes may change at different rates. As time passes the sensitivities of a portfolio change irrespective of movements in other market factors and option prices will change along with the delta and gamma of the options (bleed). Combinations of options give rise to variability in the gamma of a portfolio as the underlying price changes. In this instance the amount of gamma may rise and fall but additionally the gamma may change sign.


Delta (Δ) is the sensitivity of the option price to movements in the price of the underlying asset. Delta is the most important measure of option sensitivity. Delta can be represented as (where C = security price, and S = asset price):

Δ = ∂C/∂S


For an individual option, delta is usually expressed as a percentage. For instance, if an option had a 50% delta, then if a ‘small change in the price of the underlying asset’ was USD 1, the option price should increase by about USD 0.50. Note that while the delta for a simple call option is positive, the delta of a put will be negative. As the price of the underlying asset rises, the value of a put option will decrease. When talking about a portfolio of positions, the idea of delta as a percentage becomes less meaningful. Instead, delta is better considered as the amount of market exposure to some underlying asset. The concept of delta as positional exposure extends beyond the option market. If a trader owns a security, this can be represented as having 100% delta in the amount of that security. Many non-option trading desks therefore use the term delta in describing position risks.

Delta neutrality can be achieved in a multitude of ways. The delta of an option can be eliminated not only through the purchase/sale of an underlying asset, but also by buying/selling an option that has (or options that have) equivalent delta. Similarly, a trader looking to be ‘long’ some market can achieve this either by buying the underlying security, by buying call options, or by selling put options. Although the payoff from each of the strategies may be quite different, in each case the trader is better off if the market rises. The position is additive; as long as a trader is consistent about the units of measurement, the instantaneous position risk of a portfolio will be the sum of the deltas from the different instruments. Note that the delta of a position will change with time. This is especially the case with options, where the delta must theoretically be rebalanced continuously. The correct hedge for today will not be appropriate for tomorrow.
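As a rough illustration of the additivity described above, the Black-Scholes delta and a book-level net delta can be sketched in Python; the spot, strike, volatility and position inputs are invented for the example.

```python
# Black-Scholes delta for European options, and the additive delta of a
# small book expressed in units of the underlying. Inputs are illustrative.
from math import log, sqrt, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def bs_delta(S, K, T, r, sigma, kind="call"):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return norm_cdf(d1) if kind == "call" else norm_cdf(d1) - 1

# (strike, option type, contracts); long calls and long puts
book = [(100, "call", 1000), (100, "put", 1000)]
S, T, r, vol = 100, 0.5, 0.02, 0.25

# Portfolio delta is the sum of per-position deltas
net_delta = sum(qty * bs_delta(S, K, T, r, vol, kind) for K, kind, qty in book)
# To be instantaneously flat, trade -net_delta units of the underlying
```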


Gamma (Γ) is the rate of change of the delta of an option, with respect to a change in the underlying price. An option always has a positive gamma, although gamma may tend to zero. A positive gamma means that as the price of the underlying asset rises, the effective position thrown up by ownership of the option also increases; for a call option the positive delta goes up, for a put option the negative delta is reduced. Positive gamma generates the ‘super-trader’ effect; the position gets longer as the market rises, shorter as it falls. There is a flipside to the ‘super-trader’ effect of being long gamma – the option has to be paid for first. The profit that a long option trader makes from rebalancing may well be less than the premium outlay. Indeed, one way of thinking about the premium price of an option is that it is ‘a fair price’ equivalent to the amount of gamma profit that a buyer should generate over time. An increase in volatility will increase the gamma of options that have strikes away from ‘the money’, while decreasing the gamma of at-the-money options. Gamma can be represented as (where C = security price, and S = asset price):

Γ = ∂²C/∂S²

The delta exposure generated by an option can be eliminated by trading in linear securities, such as an underlying asset or a forward contract on that asset. The delta for these securities is constant and the gamma is zero. In order to hedge the gamma of a portfolio, it is necessary to trade in securities that also generate gamma positions – essentially an option trader must hedge options with other options. However, just as with delta hedging, gamma hedging will itself require rebalancing, especially since the rate of change of gamma with respect to time will itself differ between securities. If the perfect continuous-time/price securities market idealized in the Black-Scholes formulas actually existed, then there would be no need to worry about the gamma of a portfolio. Option positions could be hedged by continuous trading in the underlying asset. However, in reality rebalancing costs money and involves some risk; market conditions will not perfectly equate to the idealized situation. If a portfolio can be constructed so that delta rebalancing occurs less frequently, then this may be advantageous. Since the gamma position of an option trader represents the amount of rebalancing required, then that position needs to be diminished in order to reduce the requirement to rebalance.

Gamma is almost at a maximum for ‘at-the-money’ options. This makes sense as deeply ‘in’ or ‘out’ of the money options will reach extreme values, which will not change particularly if the underlying price changes. Gamma is at a maximum when delta equals 0.5, i.e. when the option is close to the money, but not at the money.
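A quick Python sketch (with invented inputs) shows the Black-Scholes gamma peaking for strikes near the money and decaying towards zero for deep ITM/OTM strikes.

```python
# Black-Scholes gamma across a range of strikes. Gamma is the same for
# calls and puts. Spot, rate and volatility inputs are illustrative.
from math import log, sqrt, exp, pi

def bs_gamma(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    pdf = exp(-0.5 * d1 * d1) / sqrt(2 * pi)   # standard normal density at d1
    return pdf / (S * sigma * sqrt(T))

S, T, r, vol = 100, 0.25, 0.02, 0.2
gammas = {K: bs_gamma(S, K, T, r, vol) for K in range(60, 141, 5)}
peak = max(gammas, key=gammas.get)             # strike with the largest gamma
```

With these inputs the peak lands at a strike close to (but not exactly at) the spot price, matching the point made above.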


Vega represents the sensitivity of a security price with respect to the change in volatility. It is also known as kappa or zeta. Vega can be represented as (where C = security price, and σ = volatility):

ϒ = ∂C/∂σ


The volatility used in these calculations will generally be the implied volatility that is currently associated in the market with options of this type (think of volatility as a market price, not an estimation of market movements). Conventionally, a vega value is multiplied by the underlying position to express a cash amount that would be made or lost if the implied volatility used to price an option increased by 1%. An operator who is long vega (typically the owner of an option/s) will make money if implied volatility rises. Simple European-style puts and calls with identical strikes/maturities will have identical vega exposure. For simple options vega is always positive, though it may be close to zero.


Theta (θ) measures the rate of change in an option value with respect to a change in the remaining maturity (time) of the option. If all the parameters (asset price, risk-free rate, volatility, and so on) remain constant, the value of the option will be constantly reduced as time passes. This is because the option’s time value is lost. Therefore, theta is a measure of time decay. Theta is nearly always negative as, if nothing else changes, an option price declines as it approaches expiry and the ‘optionality’ disappears. As an option reaches expiry, the absolute level of theta for ATM options increases, whilst for OTM/ITM it decreases. In many respects theta is a simple reverse of gamma with one key exception: the passage of time is a certainty whereas changes in the underlying price are unpredictable. Theta can be represented as (where C = security price, and T = time period):

Θ = −∂C/∂T


Rho measures the sensitivity of a security price to the change in risk-free rate. The price of a simple call option on a non-futures contract increases with rises in rates because the forward price rises – the converse is true for a put. The amount of increase/decrease will rise with time. Deeply ITM calls and puts will have the largest (absolute) amount of rho, since these options are almost certain to be exercised (they have a delta of approximately +/-100%), and the value will move in step with the change in the forward prices. Rho can be represented as (where C = security price, and r = risk-free rate):

ρ = ∂C/∂r

Table of Greeks


Regression (Intro)


Regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modelling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable (or ‘criterion variable’) changes when any one of the independent variables is varied, while the other independent variables are held fixed. Regression analysis is widely used for prediction. Regression analysis is also used to understand which among the independent variables are related to the dependent variable, and to explore the forms of these relationships. In restricted circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables. However this can lead to spurious relationships, so caution is advisable; i.e. correlation does not imply causation.

Simple and Multiple Linear Regression

  • Simple Linear Regression; fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible. The slope of the fitted line is equal to the correlation between y and x corrected by the ratio of standard deviations of these variables. The intercept of the fitted line is such that it passes through the centre of mass (x̄, ȳ) of the data points.
  • Multiple Linear Regression; attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Every value of the independent variable x is associated with a value of the dependent variable y.


Beta, also known as the ‘beta coefficient’, is a measure of the volatility, or systematic risk, of an asset or portfolio in comparison to the market as a whole. Beta is used in the capital asset pricing model (CAPM), a model that calculates the expected return of an asset based on its beta and expected market returns. Beta is calculated using regression analysis, and you can think of beta as the tendency of a security’s returns to respond to swings in the market. A beta of 1 indicates that the security’s price will move with the market. A beta of less than 1 means that the security will be less volatile than the market. A beta of greater than 1 indicates that the security’s price will be more volatile than the market. In equity markets, market risk is captured by beta calculations. Beta seeks to capture risk in an equity asset/portfolio by relating changes in value to changes in overall equity market conditions. Betas are not universally calculated the same way, and this can cause significant differences between estimates among different operators.

Beta is usually defined as the covariance of an asset relative to the variance of the market/index. See the formula below, where a = asset and i = index:

βₐ = Cov(rₐ, rᵢ) / Var(rᵢ)

The majority of approaches to Beta use some form of regression analysis, however there is no consensus on precisely how this should be performed. Limitations to Beta calculations arise when an operator selects the choice of index and choice of time period.

  • Choice of Index; calculations can become distorted based on the selection of index, i.e. when the operator has to choose the index; S&P 500, FTSE 100 or a world index.
  • Choice of Time Period; past estimates can be unreliable as companies change significantly (for example due to restructuring and acquisitions), i.e. when the operator has to choose between one year, five years or ten years for example.


R-squared is a statistical measure that represents the percentage of a fund or asset’s movements that can be explained by movements in a benchmark index. It is used either to predict future outcomes or to test a hypothesis. R-squared values range from 0 to 100. An R-squared of 100 means that all movements of a security are completely explained by movements in the index. A high R-squared (between 85 and 100) indicates the fund’s performance patterns have been in line with the index. A fund with a low R-squared (70 or less) doesn’t act much like the index. A higher R-squared value will indicate a more useful beta figure. For example, if a fund has an R-squared value close to 100 but a beta below 1, it is most likely offering higher risk-adjusted returns. A low R-squared means you should ignore the beta. Formula for R-squared below, where SS = sums of squares:

R² = 1 − SS(residual) / SS(total)
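Both measures can be computed directly from two return series. The sketch below uses beta as covariance over variance, and R-squared as the squared correlation between the series; the short return series is fabricated purely for illustration.

```python
# Beta as cov(asset, index) / var(index), and R-squared as the share of the
# asset's variance explained by the index. Return series are made-up.

def mean(xs):
    return sum(xs) / len(xs)

def beta_and_r2(asset, index):
    ma, mi = mean(asset), mean(index)
    cov = sum((a - ma) * (i - mi) for a, i in zip(asset, index)) / len(asset)
    var_i = sum((i - mi) ** 2 for i in index) / len(index)
    var_a = sum((a - ma) ** 2 for a in asset) / len(asset)
    beta = cov / var_i
    r2 = (cov * cov) / (var_i * var_a)   # squared correlation, between 0 and 1
    return beta, r2

index = [0.01, -0.02, 0.015, 0.005, -0.01]
asset = [0.012, -0.025, 0.02, 0.004, -0.013]   # tracks the index closely
b, r2 = beta_and_r2(asset, index)
# b slightly above 1 (asset amplifies the index), r2 close to 1
```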

Capital Asset Pricing Model (CAPM)

The CAPM describes the relationship between risk and expected return and is used in the pricing of risky assets:

E(Rₐ) = R_f + βₐ(E(R_m) − R_f)

For the CAPM, investors need to be compensated in two ways: time value of money and risk. The time value of money is represented by the risk-free rate in the formula and compensates the investors for placing money in any investment over a period of time. The other half of the formula represents risk and calculates the amount of compensation the investor needs for taking on additional risk. This is calculated by taking a risk measure (beta) that compares the returns of the asset to the market over a period of time, multiplied by the market premium. The CAPM gives the expected return of an asset/portfolio as the rate on a risk-free asset plus a risk premium. If this expected return does not meet or beat the required return, then the investment should not be undertaken.
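The calculation reduces to one line of Python; the rate and beta inputs below are illustrative only.

```python
# CAPM expected return: risk-free rate plus beta times the market risk premium.

def capm_expected_return(risk_free, beta, market_return):
    return risk_free + beta * (market_return - risk_free)

# Illustrative inputs: 3% risk-free rate, beta of 1.2, 8% expected market return
er = capm_expected_return(risk_free=0.03, beta=1.2, market_return=0.08)
# 0.03 + 1.2 * 0.05 = 0.09, i.e. a 9% required return
```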

Market Stress Testing (Intro)

Stress testing can be defined as a simulation used on a portfolio to determine reactions to different financial situations. Stress tests are generally computer-generated simulation models that test hypothetical scenarios. The test is seen as a useful method for determining how a portfolio will fare during a period of financial crisis. Monte Carlo simulation is one of the most widely used methods of stress testing. Stress testing should be used as a supplement to VaR, as VaR has major limitations. Stress testing can be broken down into two analyses, sensitivity and scenario analysis.

Sensitivity Analysis: Given a shift in a specific risk factor, what is the outcome (sensitivity) for a portfolio? Sensitivity analysis looks at changes in outcomes over a broad range of variables movements, with present values recalculated at certain price changes. The analysis can be used to determine how different values of an independent variable will impact a particular dependent variable under a given set of assumptions. It can be used within specific boundaries that depend on one or more input variables, such as the effect that changes in interest rates will have on a bond’s price. This can also include asymmetrical variables (option like).

Scenario Analysis: Stress events rarely impact one risk variable alone (such as interest rates), so sensitivity analysis has major limitations. When a stress event occurs, multiple risk variables change simultaneously, including the correlations between variables. As a result, sensitivity analysis should be supplemented with scenario analysis. Scenario analysis focuses on approximating what a portfolio’s value would decrease to if an unfavorable event, or worst-case scenario, occurred, shocking several variables at once. There are different ways to approach scenario analysis; one method is to determine the standard deviation of each variable’s returns, and then compute what value would be expected for the portfolio if each variable were shocked by three standard deviations above/below its average return. Arguably this is a stressed version of a VaR model/method, as the outcomes are still a representation from a probability distribution.
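A minimal Python sketch of the three-standard-deviation shock just described, applied to a toy two-factor portfolio; the means, volatilities and exposures are all made-up.

```python
# Scenario-analysis sketch: shock every risk factor simultaneously by three
# standard deviations below its average return and revalue the portfolio.
factors = {
    "equities": {"mean": 0.07, "stdev": 0.15, "exposure": 600_000},
    "rates":    {"mean": 0.02, "stdev": 0.05, "exposure": 400_000},
}

stressed_pnl = 0.0
for name, f in factors.items():
    shocked_return = f["mean"] - 3 * f["stdev"]   # 3-sigma adverse move
    stressed_pnl += f["exposure"] * shocked_return

# equities: 600k * (0.07 - 0.45) = -228k ; rates: 400k * (0.02 - 0.15) = -52k
```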

Relevance of Stress Testing


Stress testing is useless unless it is actionable – the results need to inform the operator, in order to make specific decisions. Therefore the operator should decide the objectives of the stress test before applying any shocks. Examples of potential objectives include:

  • Setting risk limits.
  • Setting risk appetite and consistently assessing risk appetite.
  • Contingency planning.
  • Liquidity management.
  • Long term strategic business planning.
  • Model testing; do the results align with other risk models (VaR, etc.)?

Stress Testing Across Risk Stripes

  • Market Stress Testing: Market stress tests are intended to capture exposure to unlikely but plausible events in abnormal markets. They involve running stress tests on market-related risks using multiple sensitivities and scenarios that assume significant changes in risk factors such as credit spreads, equity prices, interest rates, currency rates or commodity prices.
  • Credit Stress Testing: Credit stress tests measure and manage credit risk in credit portfolios. The process assesses the potential impact of alternative economic and business scenarios on estimated credit losses. For example scenarios can be articulated in terms of macroeconomic factors, and stress test results may indicate credit migration, changes in delinquency trends and potential losses in a credit portfolio.
  • Country Stress Testing: Country stress testing aims to identify potential losses arising from a country crisis by capturing the impact of large asset price movements in a country based on market shocks combined with counterparty specific assumptions.
  • Liquidity Stress Testing: Liquidity stress tests are intended to ensure sufficient liquidity under a variety of adverse scenarios, for example when the operator’s positions are large relative to illiquid markets.

VaR (Intro to Methods)

Value at Risk (VaR) is a widely used measure of the risk of loss on a specific portfolio of financial assets. VaR is the maximum loss not exceeded with a given probability, defined as the confidence level, over a given period of time.

There are three main methods of calculating VaR: Delta-Normal VaR, Historical VaR, and Monte Carlo VaR:

VaR Methods

Delta-Normal (Parametric) VaR:

The Delta-Normal (variance-covariance) method requires use of a normal distribution, because it utilizes the expected return and standard deviation of returns. The four moments of a normal distribution are:

  1. Mean = 0 (typically assumed for short-horizon returns) – the simple mathematical average of a set of two or more numbers.
  2. Standard Deviation (Volatility) – measures the amount of variation or dispersion from the average. The volatility is the second moment and produces the tails of the normal distribution.
  3. Skewness = 0 (for normal dist) – a measure of the asymmetry of the probability distribution of a real valued random variable about its mean.
  4. Kurtosis = 3 (for normal dist) – A statistical measure used to describe the distribution of observed data around the mean. Sometimes referred to as the volatility of volatility.

The variable inputs in the calculation are the mean and standard deviation of the portfolio/asset. We calculate the returns over the chosen time frame (continuously compounded log returns, which are time consistent). Then we calculate the mean (the peak of the normal distribution) and the standard deviation (volatility). Finally we multiply the volatility by the z-score corresponding to our chosen confidence level, usually 95% (1.645) or 99% (2.326), to return our VaR as a percentage.
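The steps above can be sketched in Python; the price series and the USD 1,000,000 position are invented for the example, and the one-tailed 99% z-score is hard-coded rather than derived.

```python
# Delta-normal VaR sketch: mean and volatility of log returns, scaled by the
# z-score for the chosen confidence level. The price series is made-up.
from math import log, sqrt

prices = [100, 101, 99.5, 100.5, 102, 101, 100, 99, 100.5, 101.5]
rets = [log(b / a) for a, b in zip(prices, prices[1:])]   # daily log returns

mu = sum(rets) / len(rets)                                # mean return
vol = sqrt(sum((r - mu) ** 2 for r in rets) / (len(rets) - 1))  # sample stdev

z_99 = 2.326                       # one-tailed z-score for 99% confidence
var_pct = z_99 * vol - mu          # 1-day VaR as a fraction of portfolio value
position = 1_000_000
var_cash = position * var_pct      # 1-day VaR in cash terms
```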

Advantages of Delta-Normal VaR

  • Easy to implement and data readily available.
  • Simple efficient calculations.
  • Conducive to analysis because risk factors, correlations and volatilities are defined.

Disadvantages of Delta-Normal VaR

  • The need to assume a normal distribution.
  • Unable to properly account for distributions with fat tails, either because of unidentified time variation, or unidentified risk factors and/or correlations.
  • Not suitable for non-linear relationships like options because it cannot capture the instability of options delta/sensitivities.

Historical (Non-Parametric) VaR:

The Historical method simply re-organizes actual historical returns, putting them in order from worst to best. It then assumes that history will repeat itself, from a risk perspective. To calculate historical VaR, accumulate a number of past daily returns, rank the returns from worst to best, and identify the chosen percentile of worst returns; the boundary return of that percentile is our VaR.
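A minimal sketch of the historical method, using a fabricated 20-day return series:

```python
# Historical VaR sketch: sort past returns worst-first and read off the
# boundary of the chosen tail percentile. The return series is made-up.
returns = [0.012, -0.025, 0.004, -0.011, 0.008, -0.03, 0.015,
           -0.002, 0.02, -0.018, 0.006, -0.009, 0.01, -0.022, 0.003,
           -0.005, 0.017, -0.014, 0.001, -0.007]

def historical_var(rets, confidence=0.95):
    worst_first = sorted(rets)                   # worst to best
    cutoff = int((1 - confidence) * len(rets))   # e.g. the worst 5% of days
    return -worst_first[cutoff]                  # loss expressed as a positive

var_95 = historical_var(returns, 0.95)
# With 20 returns and 95% confidence, the tail holds one observation, so the
# VaR is the second-worst return: 2.5%
```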

Advantages of Historical VaR

  • Easy to implement and data readily available.
  • Simple efficient calculations.
  • Valuation based on actual prices.
  • Includes all correlations as embedded in market price changes.
  • Not exposed to model risk.

Disadvantages of Historical VaR

  • Includes all correlations and volatilities in the specific time period.
  • A small number of observations may lead to insufficient distribution tails.
  • Time variation of risk in the past may not represent variation in the future.

Monte Carlo VaR:

The Monte Carlo method generates multiple simulations and possible outcomes from the distribution of inputs specified by the operator. We calculate each possible outcome by taking the mean plus the volatility multiplied by a random draw (random number generator) based on a normal distribution. Some stochastic processes used in Monte Carlo VaR (stochastic meaning having a random probability distribution or pattern that may be analysed statistically but may not be predicted precisely):

  • Brownian Motion (Random Walk) – is a simple continuous stochastic process for modelling random behavior that evolves over time. Examples of such behavior are the random movements of fluctuations in an asset’s price. This model requires an assumption of perfectly divisible assets and a frictionless market (i.e. that no transaction costs occur either for buying or selling). Another assumption is that asset prices have no jumps; there are no surprises in the market.
  • Geometric Brownian Motion – is a continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian motion with drift (stochastic drift is the change of the average value of a stochastic process).
  • Mean Reverting Stochastic Process (Ornstein–Uhlenbeck) – the process can be considered to be a modification of the random walk in continuous time, or Wiener process, in which the properties of the process have been changed so that there is a tendency of the walk to move back towards a central location, with a greater attraction when the process is further away from the centre.

Once we have calculated a possible outcome we apply it to our initial asset/portfolio price, and repeat multiple times. Using the portfolio expected return and standard deviation, which are part of the Monte Carlo output, VaR is calculated in the same way as in the Delta-Normal method. Monte Carlo VaR will look more normally distributed as the sample size increases, because the simulations provide the variance/volatility based on the mean.
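A minimal Monte Carlo sketch, assuming geometric Brownian motion for a single asset; the drift and volatility parameters, seed and simulation count are all illustrative.

```python
# Monte Carlo VaR sketch: simulate 1-day geometric-Brownian-motion price moves
# and read the loss at the chosen percentile. Parameters are illustrative.
import random
from math import exp, sqrt

random.seed(42)
mu, vol, dt = 0.05, 0.20, 1 / 252     # annual drift/vol, one trading day
s0, n_sims = 100.0, 100_000

losses = []
for _ in range(n_sims):
    z = random.gauss(0, 1)            # standard normal shock
    s1 = s0 * exp((mu - 0.5 * vol**2) * dt + vol * sqrt(dt) * z)
    losses.append(s0 - s1)            # positive value = loss

losses.sort(reverse=True)             # largest loss first
var_99 = losses[int(0.01 * n_sims)]   # 99% 1-day VaR
```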

Advantages of Monte Carlo VaR

  • The most powerful model, and the most flexible, as it can incorporate additional risk factors easily.
  • It can account for linear and non-linear risks.
  • Large numbers of scenarios/simulations can produce well-described distributions.

Disadvantages of Monte Carlo VaR

  • Lengthy computational time as valuations escalate.
  • Subject to model risk of the stochastic process chosen (BM, GBM, MR).
  • Subject to sampling variation at low numbers of simulations.

Hybrid VaR:

There are a number of additional VaR methods that use hybrid models.

The Black Swan 101

This post lists 100 points Nassim Nicholas Taleb makes in his 2007 literary/philosophical (trading) book “The Black Swan: The Impact of the Highly Improbable”. In Part 1 Taleb covers ‘how we seek validation’, in Part 2 ‘how we predict’, in Part 3 the ‘technical argument (for Parts 1 & 2)’, and in Part 4 ‘how Taleb gets even with the Black Swan’. I highly recommend reading the book to gain a greater depth of understanding.


  1. History is written with hindsight and retrospective always, unless it was written in real-time.
  2. Hotel journalism; writing about a war without leaving the hotel in the country at war – not empirical journalism.
  3. Scalable/Unscalable. E.g. a profession that is “scalable” is one in which you are not paid by the hour and thus not subject to the limitations of the amount of your labour. Unscalable professions – dentists, consultants, or massage professionals – cannot be scaled: there is a cap on the number of patients or clients you can see in a given period of time.
  4. Turkey analogy – after being fed for 1,000 days, the turkey expects to be fed on day 1,001, by confirmation of the previous 1,000 days; however, the next day could be Thanksgiving/Christmas day.
  5. Confirmation is dangerous, look for supporting evidence you will find it, therefore look for instances that make you wrong.
  6. Try to avoid interpretation.
  7. Tunnelling; is focusing on a small number of sources of uncertainty, or causes of known Black Swans.
  8. Myths impart order to the disorder of human perception.
  9. Constant second-guessing of your past actions is harmful, make a narrative of it and write it down.
  10. Journos put a cause in a headline to make you swallow and follow a “wanted” narrative. Journalists can teach us how not to learn.
  11. Events that are nonrepeatable are ignored before their occurrence, and overestimated after (for a while).
  12. Narratives can be lethal, check stats for facts. The way to avoid the ills of the narrative fallacy is to favour experimentation over storytelling, experience over history, and clinical knowledge over theories.
  13. Linear progression, a Platonic idea, is not the norm, e.g. studying.
  14. In some strategies, you gamble dollars to win a succession of pennies while appearing to be winning all the time. In others, you risk a succession of pennies to win dollars (bleed or blow up).
  15. Be skeptical, nonacademic, antidogmatic, and obsessively empirical.
  16. Silent evidence: like judging extinct species from extant fossils, or whether crime pays from the criminals that get caught (ingrain silent evidence into your mindset).
  17. What you see and what you don’t see: Katrina support could harm cancer patients, as governments use public money for the relief effort. Likewise, bin Laden caused more road deaths, as people afraid to fly drove instead.
  18. A life saved is a statistic; a person hurt is an anecdote. Statistics are invisible; anecdotes are salient. Likewise, the risk of a Black Swan is invisible.
  19. We are told that an optimistic bent is supposed to be good for us. This appears to justify general risk taking as a positive enterprise, one glorified in common culture. Danny Kahneman has evidence that we generally take risks not out of bravado but out of ignorance and blindness to probability.
  20. A problem with the universe and the human race is that we are the surviving Casanovas, being here is a consequential low-probability occurrence, and we tend to forget it.
  21. Roulette: if you beat high odds, take into account other people’s losses; it was luck.
  22. The cosmetic “because”: it is not always so simple to look for causes and effects, due to randomness; handle “because” with care, particularly in cases of silent evidence.
  23. In addition to the confirmation error and the narrative fallacy, the manifestations of silent evidence further distort the role and importance of Black Swans. In fact, they cause a gross overestimation at times (say, with literary success), and underestimation at others (the stability of history; the stability of our human species). We are made to be superficial, to heed what we see and not heed what does not vividly come to mind. Out of sight, out of mind: we ignore the cemetery.
  24. Silent evidence from those who don’t succeed (in the cemetery), CEO linked success with attributes in autobiography example.
  25. Ludic fallacy, casino probabilities and risk example (tiger maimed, daughter kidnap).
  26. Dogma-prone newspaper readers and opera lovers who have cosmetic exposure to culture and shallow depth.
  27. The cosmetic and the Platonic rise naturally to the surface. This is a simple extension of the problem of knowledge. It is simply that one side of a library, the one we never see, has the property of being ignored. This is also the problem of silent evidence. It is why we do not see Black Swans.
  28. We are not manufactured, in our current edition of the human race, to understand abstract matters—we need context. Randomness and uncertainty are abstractions. We respect what has happened, ignoring what could have happened. We are naturally shallow and superficial—and we do not know it.
  29. Steps to a higher form of life: you may have to denarrate (shut down the television set, minimize time spent reading newspapers, ignore the blogs). Train your reasoning abilities to control your decisions; nudge System 1 (the heuristic or experiential system) out of the important ones. Train yourself to spot the difference between the sensational and the empirical. This insulation from the toxicity of the world will have an additional benefit: it will improve your well-being. Also, bear in mind how shallow we are with probability, the mother of all abstract notions. Above all, learn to avoid “tunnelling.”
  30. To be able to “focus” is a great virtue if you’re a brain surgeon or chess player. But the last thing you need to do when you deal with uncertainty is to “focus”. This “focus” makes you a sucker; it translates into prediction problems.

PART 2 – PREDICTION:
  31. We have seen how good we are at narrating backward, at inventing stories that convince us that we understand the past. For many people, knowledge has the remarkable power of producing confidence instead of measurable aptitude.
  32. Prediction fail: ask a room full of people to estimate, with 98% confidence, a range of possible values for a number I am thinking of (something unknown).
  33. Epistemic arrogance bears a double effect: we overestimate what we know, and underestimate uncertainty, by compressing the range of possible uncertain states (i.e., by reducing the space of the unknown).
  34. We underestimate our error rate even with Gaussian variables (Mediocristan), no chance of predicting in Extremistan.
  35. Guessing (what I don’t know, but what someone else may know) and predicting (what has not taken place yet) are the same thing.
  36. When you develop your opinions on the basis of weak evidence, you will have difficulty interpreting subsequent information that contradicts these opinions, even if this new information is obviously more accurate. Two mechanisms are at play here: the confirmation bias, and belief perseverance, the tendency not to reverse opinions you already have. We treat ideas like possessions, and it will be hard for us to part with them.
  37. The more detailed knowledge one gets of empirical reality, the more one will see the noise (i.e., the anecdote) and mistake it for actual information. Remember that we are swayed by the sensational. Listening to the news on the radio every hour is far worse for you than reading a weekly magazine, because the longer interval allows information to be filtered.
  38. Economists’ forecasts: tested, they failed to do better than random forecasts. Economists’ forecasts tend to be similar; nobody wants to be off the wall – they want to sell their forecasts with “confidence”. (Pro traders rarely hire economists for their own consumption, but rather to provide stories for their less sophisticated clients.) A study by Bouchaud researched two thousand predictions by security analysts. It showed that these brokerage-house analysts predicted nothing—a naive forecast made by someone who takes the figures from one period as predictors of the next would not do markedly worse. Worse yet, the forecasters’ errors were significantly larger than the average difference between individual forecasts, which indicates herding.
  39. One’s ideological commitments influence one’s perception. You invoke the outlier.
  40. Tetlock’s expert study: there is something in us designed to protect our self-esteem. We humans are the victims of an asymmetry in the perception of random events. We attribute our successes to our skills, and our failures to external events outside our control, namely to randomness. We feel responsible for the good stuff, but not for the bad. This causes us to think that we are better at what we do for a living than we are. The other effect of this asymmetry is that we feel a little unique, unlike others, for whom we do not perceive such an asymmetry.
  41. Economics is the most insular of fields. Economics is perhaps the subject that currently has the highest number of philistine scholars—scholarship without erudition and natural curiosity can close your mind and lead to the fragmentation of disciplines.
  42. Plans fail because of what we have called tunnelling, the neglect of sources of uncertainty outside the plan itself. In fact, the more routine the task, the better you learn to forecast planning. But there is always something nonroutine in our modern environment. We are too narrow-minded a species to consider the possibility of events straying from our mental projections, but furthermore, we are too focused on matters internal to the project to take into account external uncertainty, the “unknown unknown”. There is also the nerd effect, which stems from the mental elimination of off-model risks, or focusing on what you know. You view the world from within a model. We cannot truly plan, because we do not understand the future—but this is not necessarily bad news. We could plan while bearing in mind such limitations. It just takes guts.
  43. Once a forecast is on a page or on a computer screen, or, worse, in a PowerPoint presentation, the projection takes on a life of its own, losing its vagueness and abstraction and becoming what philosophers call reified, invested with concreteness; it takes on a new life as a tangible object. A classical mental mechanism, called anchoring, seems to be at work here. You lower your anxiety about uncertainty by producing a number, then you “anchor” on it, like an object to hold on to in the middle of a vacuum. Danny Kahneman’s wheel-of-fortune study: the subjects first looked at the number on the wheel, which they knew was random, then they were asked to estimate the number of African countries in the United Nations. Those who had a low number on the wheel estimated a low number of African nations; those with a high number produced a higher estimate.
  44. We use reference points in our heads, say sales projections, and start building beliefs around them because less mental effort is needed to compare an idea to a reference point than to evaluate it in the absolute (System 1 at work!). We cannot work without a point of reference. So the introduction of a reference point in the forecaster’s mind will work wonders.
  45. Each day will bring you closer to your death but further from the receipt of a letter you are waiting for. This subtle but extremely consequential property of scalable randomness is unusually counterintuitive. We misunderstand the logic of large deviations from the norm.
  46. Forecasting without incorporating an error rate uncovers three fallacies, all arising from the same misconception about the nature of uncertainty. The first fallacy: variability matters. The first error lies in taking a projection too seriously, without heeding its accuracy. We do not teach you to build an error rate around prediction. The second fallacy lies in failing to take into account forecast degradation as the projected period lengthens. Forecasting by bureaucrats tends to be used for anxiety relief rather than for adequate policy making. The third fallacy, and perhaps the gravest, concerns a misunderstanding of the random character of the variables being forecast. Owing to the Black Swan, these variables can accommodate far more optimistic—or far more pessimistic—scenarios than are currently expected.
  47. Five-year outlooks for banks example: the Black Swan of the Russian financial default of 1998 and the accompanying meltdown of the values of Latin American debt markets. It had such an effect on the firm that none of the five planners was still employed there a month after the sketch of the 1998 five-year plan. Yet the firm continued to produce the plan with new managers.
  48. Inadvertent discovery like the Internet, penicillin and the laser. Research involves a large element of serendipity, which can pay off big as long as one knows how serendipitous the business can be and structures it around that fact. Viagra, which changed the mental outlook and social mores of retired men, was meant to be a hypertension drug.
  49. Prediction requires knowing about technologies that will be discovered in the future. But that very knowledge would almost automatically allow us to start developing those technologies right away. Ergo, we do not know what we will know. Why don’t we take this into account? The answer lies in a pathology of human nature. Remember the psychological discussions on asymmetries in the perception of skills. We see flaws in others and not in ourselves. Once again, we seem to be wonderful self-deceit machines.
  50. Lorenz was producing a computer model of weather dynamics, and he ran a simulation that projected a weather system a few days ahead. Later he tried to repeat the same simulation with the exact same model and what he thought were the same input parameters, but he got wildly different results. Lorenz realized that the consequential divergence in his results arose not from error, but from a small rounding in the input parameters. This became known as the butterfly effect, since a butterfly moving its wings in India could cause a hurricane in New York, two years later. Lorenz’s findings generated interest in the field of chaos theory.
  51. To clarify, Platonic is top-down, formulaic, closed-minded, self-serving, and commoditized; a-Platonic is bottom-up, open-minded, skeptical, and empirical.
  52. Legions of empirical psychologists of the heuristics and biases school have shown that the model of rational behaviour under uncertainty is not just grossly inaccurate but plain wrong as a description of reality. Their results also bother Platonified economists because they reveal that there are several ways to be irrational. So if people make inconsistent choices and decisions, the central core of economic optimization fails. You can no longer produce a “general theory,” and without one you cannot predict. You have to learn to live without a general theory.
  53. Let’s say your high school teacher asks you to extend a series of dots. With a linear model, you can run a single straight line from the past to the future. If you project from the past in a linear way, you continue a trend. But possible future deviations from the course of the past are infinite. Linear analysis; Remember that “R-square” is unfit for Extremistan; it is only good for academic promotion.
  54. What is the most potent use of our brain? It is precisely the ability to project conjectures into the future and play the counterfactual game. Used correctly and in place of more visceral reactions, the ability to project effectively frees us from immediate, first-order natural selection—as opposed to more primitive organisms that were vulnerable to death and only grew by the improvement in the gene pool through the selection of the best. In a way, projecting allows us to cheat evolution: it now takes place in our head, as a series of projections and counterfactual scenarios. This ability to mentally play with conjectures, even if it frees us from the laws of evolution, is itself supposed to be the product of evolution—it is as if evolution has put us on a long leash whereas other animals live on the very short leash of immediate dependence on their environment. For Dennett, our brains are “anticipation machines”; for him the human mind and consciousness are emerging properties, those properties necessary for our accelerated development.
  55. Why do we listen to experts and their forecasts? A candidate explanation is that society reposes on specialization, effectively the division of knowledge. We have a natural tendency to listen to the expert, even in fields where there may be no experts.
  56. When we think of tomorrow, we just project it as another yesterday. Go to the primate section of the Bronx Zoo where you can see our close relatives in the happy primate family leading their own busy social lives. You can also see masses of tourists laughing at the caricature of humans that the lower primates represent. Now imagine being a member of a higher-level species (say a “real” philosopher, a truly wise person), far more sophisticated than the human primates. You would certainly laugh at the people laughing at the nonhuman primates. Clearly, to those people amused by the apes, the idea of a being who would look down on them the way they look down on the apes cannot immediately come to their minds—if it did, it would elicit self-pity. Just as autism is called “mind blindness,” this inability to think dynamically, to position oneself with respect to a future observer, we should call “future blindness.”
  57. In practice, randomness is fundamentally incomplete information. Randomness, in the end, is just unknowledge. The world is opaque and appearances fool us.
  58. One should learn under severe caution. History is certainly not a place to theorize or derive general knowledge, nor is it meant to help in the future, without some caution. We can get negative confirmation from history, which is invaluable, but we get plenty of illusions of knowledge along with it.
  59. It is not possible to hold a situation in one’s head without some element of bias.
  60. Accept that being human involves some amount of epistemic arrogance in running your affairs. Do not be ashamed of that. Do not try to always withhold judgment—opinions are the stuff of life. Do not try to avoid predicting—yes, after this diatribe about prediction I am not urging you to stop being a fool. Just be a fool in the right places. What you should avoid is unnecessary dependence on large-scale harmful predictions. Avoid the big subjects that may hurt your future: be fooled in small matters, not in the large. Do not listen to economic forecasters or to predictors in social science (they are mere entertainers). Avoid government social security forecasts for the year 2040. Know how to rank beliefs not according to their plausibility but by the harm they may cause.
  61. The bottom line: be prepared! Narrow-minded prediction has an analgesic or therapeutic effect. Be aware of the numbing effect of magic numbers. Be prepared for all relevant eventualities.
  62. Maximize the serendipity around you.
  63. Indeed, we have psychological and intellectual difficulties with trial and error, and with accepting that series of small failures are necessary in life. Understand that we humans have a mental hang-up about failures: “You need to love to lose” is a good motto. People are often ashamed of losses, so they engage in strategies that produce very little volatility but contain the risk of a large loss. I learned about this problem from the finance industry, in which we see “conservative” bankers sitting on a pile of dynamite but fooling themselves because their operations seem dull and lacking in volatility.
  64. Barbell Strategy; If you know that you are vulnerable to prediction errors, and if you accept that most “risk measures” are flawed, because of the Black Swan, then your strategy is to be as hyperconservative and hyperaggressive as you can be instead of being mildly aggressive or conservative. Instead of putting your money in “medium risk” investments (how do you know it is medium risk? by listening to tenure-seeking “experts”?), you need to put a portion, say 85 to 90%, in extremely safe instruments, like Treasury bills—as safe a class of instruments as you can manage to find on this planet. The remaining 10 to 15 % you put in extremely speculative bets, as leveraged as possible (like options), preferably venture capital-style portfolios. Make sure that you have plenty of these small bets; avoid being blinded by the vividness of one single Black Swan. Have as many of these small bets as you can conceivably have. That way you do not depend on errors of risk management; no Black Swan can hurt you at all, beyond your “floor,” the nest egg that you have in maximally safe investments. Or, equivalently, you can have a speculative portfolio and insure it (if possible) against losses of more than, say, 15%. You are “clipping” your incomputable risk, the one that is harmful to you. The average will be medium risk but constitutes a positive exposure to the Black Swan. More technically, this can be called a “convex” combination.
  65. In these businesses you are lucky if you don’t know anything— particularly if others don’t know anything either, but aren’t aware of it. And you fare best if you know where your ignorance lies, if you are the only one looking at the unread books, so to speak. This dovetails into the “barbell” strategy of taking maximum exposure to the positive Black Swans while remaining paranoid about the negative ones. For your exposure to the positive Black Swan, you do not need to have any precise understanding of the structure of uncertainty. I find it hard to explain that when you have a very limited loss you need to get as aggressive, as speculative, and sometimes as “unreasonable” as you can be.
  66. Don’t look for the precise and the local. Simply, do not be narrow-minded. Do not try to predict precise Black Swans—it tends to make you more vulnerable to the ones you did not predict. Remember that infinite vigilance is just not possible.
  67. Beware of precise plans by governments. Let governments predict (it makes officials feel better about themselves and justifies their existence) but do not set much store by what they say. Remember that the interest of these civil servants is to survive and self-perpetuate—not to get to the truth.
  68. Do not waste your time trying to fight forecasters, stock analysts, economists, and social scientists, except to play pranks on them. It is ineffective to moan about unpredictability: people will continue to predict foolishly, especially if they are paid for it, and you cannot put an end to institutionalized frauds. If you ever do have to heed a forecast, keep in mind that its accuracy degrades rapidly as you extend it through time. If you hear a “prominent” economist using the word equilibrium, or normal distribution, do not argue with him; just ignore him.
  69. These recommendations have one point in common: asymmetry. Put yourself in situations where favourable consequences are much larger than unfavourable ones. Indeed, the notion of asymmetric outcomes is the central idea of this book: I will never get to know the unknown since, by definition, it is unknown. However, I can always guess how it might affect me, and I should base my decisions around that. We can have a clear idea of the consequences of an event, even if we do not know how likely it is to occur. I don’t know the odds of an earthquake, but I can imagine how San Francisco might be affected by one. This idea that in order to make a decision you need to focus on the consequences (which you can know) rather than the probability (which you can’t know) is the central idea of uncertainty. You can build an overall theory of decision making on this idea. All you have to do is mitigate the consequences. As I said, if my portfolio is exposed to a market crash, the odds of which I can’t compute, all I have to do is buy insurance, or get out and invest the amounts I am not willing to ever lose in less risky securities.

PART 3 – TECHNICAL (HEART) ARGUMENT:
  70. Matthew effect, cumulative advantage in sociology – luck and randomness. Preferential attachment, another way to think about the process of inequality: the more you use a word, the less effortful you will find it to use that word again, so you borrow words from your private dictionary in proportion to their past use. This explains why out of the sixty thousand main words in English, only a few hundred constitute the bulk of what is used in writings, and even fewer appear regularly in conversation. For an idea to be contagious, a mental category must agree with our nature. Preferential-attachment theories are intuitively appealing, but they do not account for the possibility of being supplanted by newcomers.
  71. Consider the following sobering statistic. Of the five hundred largest U.S. companies in 1957, only seventy-four were still part of that select group, the Standard and Poor’s 500, forty years later. Only a few had disappeared in mergers; the rest either shrank or went bust.
  72. Capitalism is, among other things, the revitalization of the world thanks to the opportunity to be lucky. Luck is the grand equalizer, because almost everyone can benefit from it.
  73. Randomness has the beneficial effect of reshuffling society’s cards, knocking down the big guy. In the arts, fads do the same job. A newcomer may benefit from a fad, as followers multiply thanks to a preferential attachment style epidemic. Then, guess what? He too becomes history.
  74. Page examines the effects of cognitive diversity on problem solving and shows how variability in views and methods acts like an engine for tinkering. It works like evolution. By subverting the big structures we also get rid of the Platonified one way of doing things—in the end, the bottom-up theory-free empiricist should prevail.
  75. People live longer in societies that have flatter social gradients. Winners kill their peers: those in a steep social gradient live shorter lives, regardless of their economic condition.
  76. Remember this: the Gaussian-bell curve variations face a headwind that makes probabilities drop at a faster and faster rate as you move away from the mean, while “scalables,” or Mandelbrotian variations, do not have such a restriction.
  77. In the Gaussian framework, inequality decreases as the deviations get larger, owing to the increasing rate of decrease in probability. Not so with the scalable: inequality stays the same throughout. The inequality among the superrich is the same as the inequality among the simply rich—it does not slow down. Consider this effect. Take a random sample of any two people from the U.S. population who jointly earn $1 million per annum. What is the most likely breakdown of their respective incomes? In Mediocristan, the most likely combination is half a million each. In Extremistan, it would be $50,000 and $950,000. For any large total, the breakdown will be more and more asymmetric. Why is this so? Persons with high incomes are rare enough that such middle combinations are less likely than lopsided ones.
  78. Some use the rule to imply that 80% of the work is done by 20% of the people. As far as axioms go, this one wasn’t phrased to impress you the most: it could easily be called the 50/01 rule, that is, 50% of the work comes from 1% of the workers. This formulation makes the world look even more unfair, yet the two formulae are exactly the same. How? Well, if there is inequality, then those who constitute the 20% in the 80/20 rule also contribute unequally—only a few of them deliver the lion’s share of the results. This trickles down to about one in a hundred contributing a little more than half the total. The 80/20 rule is only metaphorical; it is not a rule, even less a rigid law. Note here that it is not all uncertainty. In some situations you may have a concentration, of the 80/20 type, with very predictable and tractable properties, which enables clear decision making, because you can identify beforehand where the meaningful 20% are.
  79. We can make good use of the Gaussian approach in variables for which there is a rational reason for the largest not to be too far away from the average. If there is gravity pulling numbers down, or if there are physical limitations preventing very large observations, we end up in Mediocristan. If there are strong forces of equilibrium bringing things back rather rapidly after conditions diverge from equilibrium, then again you can use the Gaussian approach. Note the following principle: the rarer the event, the higher the error in our estimation of its probability—even when using the Gaussian. The Gaussian bell curve sucks randomness out of life—which is why it is popular. We like it because it allows certainties! How? Through averaging.
  80. Correlation measures will likely exhibit severe instability; they will depend on the period over which they were computed. Yet people talk about correlation as if it were something real, making it tangible, investing it with a physical property, reifying it. The same illusion of concreteness affects what we call “standard” deviations. Take any series of historical prices or values. Break it up into subsegments and measure its “standard” deviation. Surprised? Every sample will yield a different “standard” deviation. Then why do people talk about standard deviations? Go figure. Note here that, as with the narrative fallacy, when you look at past data and compute one single correlation or standard deviation, you do not notice such instability.
  81. If you use the term statistically significant, beware of the illusions of certainties. Odds are that someone has looked at his observation errors and assumed that they were Gaussian, which necessitates a Gaussian context, namely, Mediocristan, for it to be acceptable.
  82. Once you get a bell curve in your head it is hard to get it out.
  83. If you’re dealing with qualitative inference, such as in psychology or medicine, looking for yes/no answers to which magnitudes don’t apply, then you can assume you’re in Mediocristan without serious problems. The impact of the improbable cannot be too large. But if you are dealing with aggregates, where magnitudes do matter, such as income, your wealth, return on a portfolio, or book sales, then you will have a problem and get the wrong distribution if you use the Gaussian, as it does not belong there. One single number can disrupt all your averages; one single loss can eradicate a century of profits. You can no longer say “this is an exception.”
  84. Temperature degrees are, in a way, a means for your mind to translate some external phenomena into a number. Likewise, the Gaussian bell curve is set so that 68.2% of the observations fall between minus one and plus one standard deviations away from the average. I repeat: do not even try to understand whether standard deviation is average deviation—it is not, and a large (too large) number of people using the word standard deviation do not understand this point. Standard deviation is just a number that you scale things to, a matter of mere correspondence if phenomena were Gaussian. These standard deviations are often nicknamed “sigma.” People also talk about “variance” (same thing: variance is the square of the sigma, i.e., of the standard deviation). The main point of the Gaussian bell curve is that most observations hover around the mediocre, the mean, while the odds of a deviation decline faster and faster (exponentially) as you move away from the mean. If you need to retain one single piece of information, just remember this dramatic speed of decrease in the odds as you move away from the average. Outliers are increasingly unlikely. You can safely ignore them.
  85. I have not for the life of me been able to find anyone around me in the business and statistical world who was intellectually consistent in that he both accepted the Black Swan and rejected the Gaussian and Gaussian tools. Many people accepted my Black Swan idea but could not take it to its logical conclusion, which is that you cannot use one single measure for randomness called standard deviation (and call it “risk”); you cannot expect a simple answer to characterize uncertainty.
  86. The world, epistemologically, is literally a different place to a bottom-up empiricist. We don’t have the luxury of sitting down to read the equation that governs the universe; we just observe data and make an assumption about what the real process might be, and “calibrate” by adjusting our equation in accordance with additional information. As events present themselves to us, we compare what we see to what we expected to see. It is usually a humbling process, particularly for someone aware of the narrative fallacy, to discover that history runs forward, not backward. What I am talking about is opacity, incompleteness of information, the invisibility of the generator of the world. History does not reveal its mind to us—we need to guess what’s inside of it.
  87. The problem of the circularity of statistics (aka the statistical regress argument) is as follows. Say you need past data to discover whether a probability distribution is Gaussian, fractal, or something else. You will need to establish whether you have enough data to back up your claim. How do we know if we have enough data? From the probability distribution—a distribution does tell you whether you have enough data to “build confidence” about what you are inferring. If it is a Gaussian bell curve, then a few points will suffice (the law of large numbers once again). And how do you know if the distribution is Gaussian? Well, from the data. So we need the data to tell us what the probability distribution is, and a probability distribution to tell us how much data we need. This causes a severe regress argument. This regress does not occur if you assume beforehand that the distribution is Gaussian. It happens that, for some reason, the Gaussian yields its properties rather easily. Extremistan distributions do not do so.
  88. Most models, of course, attempt to be precisely predictive, not just descriptive; I find this infuriating. They are nice tools for illustrating the genesis of Extremistan, but I insist that the “generator” of reality does not appear to obey them closely enough to make them helpful in precise forecasting.
  89. If the world of finance were Gaussian, an episode such as the 1987 crash (more than twenty standard deviations) would take place every several billion lifetimes of the universe. If you come fresh to the business, do not rely on the old theoretical tools, and do not have a high expectation of certainty. Yet portfolio theory was turned into rigorous economic theory, a contagion that allows business people to blame the flawed scientific method when things go wrong.
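A back-of-the-envelope check of the 1987 claim, under assumptions of my own (a 252-trading-day year and one-day returns treated as independent Gaussians):

```python
import math

def gaussian_tail(k: float) -> float:
    """One-sided probability of a move beyond k standard deviations
    under the Gaussian: 0.5 * erfc(k / sqrt(2))."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

p = gaussian_tail(20.0)            # odds of a single 20-sigma down day
trading_days = 252                 # assumed trading days per year
wait_years = 1.0 / (p * trading_days)
universe_age_years = 1.38e10       # rough current estimate
print(f"P(one 20-sigma day) = {p:.2e}")
print(f"expected wait = {wait_years / universe_age_years:.1e} universe lifetimes")
```

The waiting time comes out at an absurd multiple of the universe's age, which is the substance of Taleb's charge: under the Gaussian, 1987 simply could not have happened.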
  90. So the Gaussian pervaded our business and scientific cultures, and terms such as sigma, variance, standard deviation, correlation, R square, and the eponymous Sharpe ratio, all directly linked to it, pervaded the lingo. If you read a mutual fund prospectus, or a description of a hedge fund’s exposure, odds are that it will supply you, among other information, with some quantitative summary claiming to measure “risk.” That measure will be based on one of the above buzzwords derived from the bell curve and its kin. Today, for instance, pension funds’ investment policy and choice of funds are vetted by “consultants” who rely on portfolio theory. If there is a problem, they can claim that they relied on standard scientific method.
  91. Scholes and Merton made the formula dependent on the Gaussian, but their “precursors” subjected it to no such restriction. Not only does an option on a very long shot benefit from Black Swans, but it benefits disproportionately from them, something Scholes and Merton’s “formula” misses. The option payoff is so powerful that you do not have to be right on the odds: you can be wrong on the probability, but get a monstrously large payoff. I’ve called this the “double bubble”: the mispricing of the probability and that of the payoff.
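A toy arithmetic sketch of the "double bubble" (all numbers are hypothetical, chosen only to show the asymmetry):

```python
def call_payoff(price: float, strike: float) -> float:
    """Terminal payoff of a plain call option."""
    return max(price - strike, 0.0)

strike = 150.0   # a long shot: the spot price is assumed to be 100 today

# The seller's model says a jump to 200 happens with probability 1/1000,
# so the option is priced at its model expectation:
premium = 0.001 * call_payoff(200.0, strike)

# Suppose reality is fat-tailed: the jump is ten times more likely
# AND overshoots to 400. Both errors compound in the buyer's favour:
expected_edge = 0.01 * call_payoff(400.0, strike) - premium
print(f"premium: {premium:.2f}, buyer's expected edge: {expected_edge:.2f}")
```

Being wrong about the probability by a factor of ten matters far less than the payoff being wrong too: in this sketch the buyer pays 0.05 for an expectation of 2.50.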
  92. Not one of these users of portfolio theory, in twenty years of debates, explained how they could accept both the Gaussian framework and large deviations. Not one.
  93. People did not understand the elementary asymmetry involved: you need one single observation to reject the Gaussian, but millions of observations will not fully confirm the validity of its application. Why? Because the Gaussian bell curve disallows large deviations, but tools of Extremistan, the alternative, do not disallow long quiet stretches.
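The asymmetry shows up in a quick simulation (the sample size and seed are arbitrary): even a million Gaussian draws never stray far from the mean, so a single genuine 20-sigma observation is decisive disconfirmation, while any finite quiet sample confirms nothing.

```python
import random

random.seed(42)  # arbitrary seed, for repeatability only
n = 1_000_000
# Largest deviation, in sigmas, across a million Gaussian draws:
biggest = max(abs(random.gauss(0.0, 1.0)) for _ in range(n))
print(f"max |deviation| in {n:,} Gaussian draws: {biggest:.2f} sigma")
# Expect something near 5 sigma, nowhere near 20: the Gaussian cannot
# produce 1987, but a fat-tailed process happily produces long calm runs.
```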
  94. Skeptical empiricism advocates the opposite method. I care about the premises more than the theories, and I want to minimize reliance on theories, stay light on my feet, and reduce my surprises. I want to be broadly right rather than precisely wrong. Elegance in the theories is often indicative of Platonicity and weakness; it invites you to seek elegance for elegance’s sake. A theory is like medicine (or government): often useless, sometimes necessary, always self-serving, and on occasion lethal. So it needs to be used with care, moderation, and close adult supervision.
  95. See table on page 284 for “Two ways to approach randomness”, skeptical empiricism vs. the platonic approach.
  96. The notion of a practitioner: his thinking is rooted in the belief that you cannot go from books to problems, but only the reverse, from problems to books.
  97. We no longer believe in papal infallibility; we seem to believe in the infallibility of the Nobel.
  98. The antidote to Black Swans is precisely to be noncommoditized in thinking. But beyond avoiding being a sucker, this attitude lends itself to a protocol of how to act—not how to think, but how to convert knowledge into action and figure out what knowledge is worth.

PART 4 – HOW TALEB GETS EVEN WITH THE BLACK SWAN:
  99. Half the time I am a hyperskeptic; the other half I hold certainties and can be intransigent about them. Of course I am hyperskeptic where others, particularly those I call bildungsphilisters, are gullible, and gullible where others seem skeptical. I am skeptical about confirmation—though only when errors are costly—not about disconfirmation. Having plenty of data will not provide confirmation, but a single instance can disconfirm. I am skeptical when I suspect wild randomness, gullible when I believe that randomness is mild. Half the time I hate Black Swans, the other half I love them. I like the randomness that produces the texture of life, the positive accidents. Half the time I am hyperconservative in the conduct of my own affairs; the other half I am hyperaggressive. This may not seem exceptional, except that my conservatism applies to what others call risk taking, and my aggressiveness to areas where others recommend caution. I worry less about small failures, more about large, potentially terminal ones. I worry far more about the “promising” stock market, particularly the “safe” blue chip stocks, than I do about speculative ventures—the former present invisible risks, the latter offer no surprises since you know how volatile they are and can limit your downside by investing smaller amounts. I worry less about advertised and sensational risks, more about the more vicious hidden ones. I worry less about terrorism than about diabetes, less about matters people usually worry about because they are obvious worries, and more about matters that lie outside our consciousness and common discourse (I also have to confess that I do not worry a lot—I try to worry about matters I can do something about). I worry less about embarrassment than about missing an opportunity.
In the end this is a trivial decision-making rule: I am very aggressive when I can gain exposure to positive Black Swans—when a failure would be of small moment—and very conservative when I am under threat from a negative Black Swan. I am very aggressive when an error in a model can benefit me, and paranoid when the error can hurt. This may not be too interesting except that it is exactly what other people do not do. In finance, for instance, people use flimsy theories to manage their risks and put wild ideas under “rational” scrutiny. Half the time I am intellectual, the other half I am a no-nonsense practitioner. I am no-nonsense and practical in academic matters, and intellectual when it comes to practice. Half the time I am shallow, the other half I want to avoid shallowness. I am shallow when it comes to aesthetics; I avoid shallowness in the context of risks and returns. My aestheticism makes me put poetry before prose, Greeks before Romans, dignity before elegance, elegance before culture, culture before erudition, erudition before knowledge, knowledge before intellect, and intellect before truth. But only for matters that are Black Swan free. Our tendency is to be very rational, except when it comes to the Black Swan.
  100. We are quick to forget that just being alive is an extraordinary piece of good luck, a remote event, a chance occurrence of monstrous proportions. Imagine a speck of dust next to a planet a billion times the size of the earth. The speck of dust represents the odds in favour of your being born; the huge planet would be the odds against it. Stop sweating the small stuff. Stop looking the gift horse in the mouth—remember that you are a Black Swan.
  101. Thank you for reading this post – @SamuelDunlopfx