Minimizing Rules


Complexity as a Methodology, Point of View, Theory
Bruce Kogut
EIASM and Oxford University
June 2006
               What Complexity Seems
                to Mean In Practice
Interdisciplinary sharing of knowledge and creating a larger community
   of scholarship.

Appreciation of the ‘a-linear’ view of the world

The importance of ‘events’ for triggering change.

Analyzing the statistical properties of large datasets

Understanding local interactions by micro-rules whose effects depend
  on topology (structure) but whose interpretations rely upon
  contextual knowledge.

More attention to what Elster and Hedstrom call ‘mechanisms’ as
  opposed to ‘causes’
  Research Strategies: 1. Some old ones, 2. Some
      opportunistic ones, 3. Some new ones.

1. Greco-Latin Squares to Charles Ragin’s
   comparative methods

2. Borrowed simulation ‘structures and

3. Graph dynamics relying on new
   estimation techniques
     Example of (1): Old Method
• In economics and management, we would like to
  determine the complementarities, or
  interactions, that compose ‘best practices’ to
  improve performance.
• Economics gave us an elegant analysis of
  complementarities but poor methods.
• The empirical problem of complementarities is
  saturating an experimental design. (This is
  identical to the theory of monotone comparative
  statics: the power set of combinations has to be
  tested for its effect on performance.)
Example of (2): a useful opportunistic strategy is
    the NK model applied to complementarities
•   Consider an NK model in which a technological landscape is hardwired (the number of
    nodes is given, they are connected, and K, the number of interactions, is given), but N and K can
    be varied. Fitness values are randomly assigned to nodes and hence to their

     –   Random boolean nets have been useful in biochemistry in which there are rules of ‘and’, ‘or’,
         ‘not and’. (However, from genes to phenotypic expression, there are many things that
         intervene: RNA, proteins. And fitness can be ‘endogenous’: my fitness can depend on how fit
         is your fitness in a given space.)

     –   We know a priori the central results from simulations in other fields: there are finitely many
         optima that for some K (such as K=2) have known expected values. We also know much about
         search time to optima. (This has received a lot of attention in science and little in social
         science: is the rate by which we have gotten to where we are ‘explainable’?)

     –   The apparatus sneaks in a language: long jumps, landscapes, iterations, that imply firms are
         engaged in search over a technological terrain that awaits to be discovered. (This is not
         social construction.)

     –   Unfortunately, it is hard to feed data to the model.
Example of (3) is the application of graphs to
            understanding data
 1.       We now have a much better appreciation that static representations of networks cannot
          easily isolate ‘endogeneity’ (I smoke because I am weak or because you smoke)
          but, more importantly, cannot easily identify ‘social rules’.

 2.       We have though a better understanding even for static graphs that some
          properties are consistent or inconsistent with important social behaviors.
      •       For example, we know that the absence of a power law in degrees is inconsistent with
              ‘preferential attachment’. Since preferential attachment is a reasonable way to represent
              such concepts as ‘prestige’ and ‘reputation’, a non-finding is important.

 3.       We still have a long way to go regarding ‘estimations’:
      •       our models are convenient (even if very hard: exponential random graph models)
      •       It is hard to rule out other explanations not specified.

 4.       Exciting space (for me) is the combination of estimation and simulation to arrive at
          better understandings of the ‘possible’ interpretations.
      •       Formal models will also be critical.
Return to Example (1): What can
old methods say to complexity?
Consider a question:
 What are good corporate and labor
 institutions for generating growth?

Some argue that there are two prototypes:
 Coordinated (e.g. Germany) and market
 (e.g. US), and each is good for growth (?)
     Here are data from Hall and Gingerich who want to
   show there are 2 best configurations for setting policy
COUNTRY          Growth   Degree of wage   Level of wage   Labor      Shareholder   Stock market   Dispersion
                          coordination     coordination    turnover   power         size           of control
Austria          1        1                1               1          1             1              1
Germany          1        1                1               1          1             1              1
Italy            1        1                0               1          1             1              1
Belgium          1        0                1               1          1             1              1
Norway           1        1                1               1          0             1              1
Finland          1        1                1               0          1             1              1
Portugal         1        0                1               1          1             1              1
Sweden           0        0                1               1          1             0              1
France           0        0                1               1          1             1              1
Denmark          1        1                1               0          1             1              0
Japan            1        1                1               1          0             0              0
Netherlands      0        0                1               1          1             0              1
Switzerland      0        1                1               1          1             0              0
Spain            1        0                0               0          0             1              1
Ireland          1        0                0               0          0             1              0
Australia        0        0                0               0          0             0              0
New Zealand      0        0                0               0          0             0              0
Canada           0        0                0               0          0             0              0
United Kingdom   0        0                0               0          0             0              0
United States    0        0                0               0          0             0              0

The coordination dichotomies are all coded in the same direction, with a score of 1 signaling conformity with “coordinated”
market economies and a score of 0 signaling conformity with “liberal” market economies.
    Observations on the Data
• N of 20 countries and yet 6 variables, hence 64 possible
  combinations (2^6).

• Of these 64, only 12 are observed in the table above.

• We are making inferences based on a poorly populated
   – Sparseness may reflect the operation of a maximizing hand that
     rules out inefficient (?) combinations.
   – It may reflect “path dependency” and hence the paths not taken
     (even if perhaps better).
   – It may reflect cultural preferences that rule out certain
     institutions, such as stock markets historically in some countries.
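
The sparseness can be checked mechanically. The sketch below is an illustration, not part of the original analysis: it transcribes the table as printed in these slides, counts the distinct configurations of the six conditions, and lists any configuration that appears with both outcomes. The function names are hypothetical; the condition names follow the labels used in the solutions.

```python
from collections import Counter

CONDITIONS = ("degreewc", "levelwc", "turnover", "sharehld", "stockmkt", "dispersn")

# (country, growth, six dichotomies in the order of CONDITIONS),
# transcribed from the table as printed in these slides.
ROWS = [
    ("Austria", 1, (1, 1, 1, 1, 1, 1)),
    ("Germany", 1, (1, 1, 1, 1, 1, 1)),
    ("Italy", 1, (1, 0, 1, 1, 1, 1)),
    ("Belgium", 1, (0, 1, 1, 1, 1, 1)),
    ("Norway", 1, (1, 1, 1, 0, 1, 1)),
    ("Finland", 1, (1, 1, 0, 1, 1, 1)),
    ("Portugal", 1, (0, 1, 1, 1, 1, 1)),
    ("Sweden", 0, (0, 1, 1, 1, 0, 1)),
    ("France", 0, (0, 1, 1, 1, 1, 1)),
    ("Denmark", 1, (1, 1, 0, 1, 1, 0)),
    ("Japan", 1, (1, 1, 1, 0, 0, 0)),
    ("Netherlands", 0, (0, 1, 1, 1, 0, 1)),
    ("Switzerland", 0, (1, 1, 1, 1, 0, 0)),
    ("Spain", 1, (0, 0, 0, 0, 1, 1)),
    ("Ireland", 1, (0, 0, 0, 0, 1, 0)),
    ("Australia", 0, (0, 0, 0, 0, 0, 0)),
    ("New Zealand", 0, (0, 0, 0, 0, 0, 0)),
    ("Canada", 0, (0, 0, 0, 0, 0, 0)),
    ("United Kingdom", 0, (0, 0, 0, 0, 0, 0)),
    ("United States", 0, (0, 0, 0, 0, 0, 0)),
]


def configuration_counts(rows):
    """Count how many countries fall in each observed configuration."""
    return Counter(config for _, _, config in rows)


def mixed_outcomes(rows):
    """Configurations observed with both high (1) and low (0) growth."""
    outcomes = {}
    for _, growth, config in rows:
        outcomes.setdefault(config, set()).add(growth)
    return sorted(c for c, seen in outcomes.items() if len(seen) > 1)


counts = configuration_counts(ROWS)
possible = 2 ** len(CONDITIONS)  # size of the fully saturated design
print(f"{len(counts)} of {possible} configurations are observed")
print("configurations with mixed outcomes:", mixed_outcomes(ROWS))
```

Configurations with mixed outcomes matter for the crisp approach that follows, because a configuration cannot be coded cleanly as a cause of high growth if it also appears with low growth.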
          Approach One: Crisp Logic:
We can try to find good causal ‘combinations’ by
    borrowing from electrical engineering.
 Using three logical gates (join, meet, null), what are the minimal circuits
 you need, or what are the fewest elements you need to ‘cause’

             Absorption: A + AB = A

             Reduction:  AB + Ab = A(B + b) = A(1) = A
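
Both rules can be confirmed by brute force over the truth table. The helper below is a small sketch written for this note (the name `equivalent` is hypothetical); it treats upper case as "condition present" and lower case as "condition absent".

```python
from itertools import product


def equivalent(f, g, n):
    """True if f and g agree on every assignment of n Boolean inputs."""
    return all(f(*bits) == g(*bits) for bits in product((False, True), repeat=n))


# Absorption: A + AB = A
absorption_holds = equivalent(lambda a, b: a or (a and b), lambda a, b: a, 2)

# Reduction: AB + Ab = A
reduction_holds = equivalent(
    lambda a, b: (a and b) or (a and not b), lambda a, b: a, 2
)

print(absorption_holds, reduction_holds)
```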

Advantage of this method: it is simple and intuitive.

Problem is that with too much sparseness, we won’t get much
 Solution for High Growth/Low Initial GDP per
   capita, without simplifying assumptions:
1. degreewc levelwc turnover sharehld STOCKMKT +
   stockmkt dispersn
              Simplifying Assumptions
Consider the case where there are two solutions:
                                    ABC + aBc
No reduction is possible, because the two terms differ in more than one condition.

If we permit two assumptions, we can achieve a simplification:

Y = ABC + aBc + ABc + aBC
      = (ABC + aBC) + (aBc + ABc)
      = (BC) + (Bc)
      = B(C + c)
      = B

This is a type of simulation, but done by intuition (called theory) on unobservables.

It is a theory that explicitly reduces the complexity. It posits: ‘let’s imagine that if we had
     the data, or if nature had been more experimental, we would indeed observe two
     more cases with positive outcomes.’ These cases are ABc and aBC. Once we do this, we
     arrive at B.
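
The reduction of the four-term expression for Y can be reproduced with repeated pairwise merging, in the spirit of the Quine-McCluskey procedure. This is a minimal sketch under stated assumptions: terms are strings with one letter per condition, upper case for presence and lower case for absence, and the functions `merge` and `minimize` are hypothetical names. It returns the terms that cannot be merged further, which is enough for this example but is not a full minimal-cover search.

```python
from itertools import combinations


def merge(t1, t2):
    """Merge two terms that differ in the case of exactly one condition."""
    if {c.upper() for c in t1} != {c.upper() for c in t2}:
        return None  # the terms are not over the same conditions
    diff = set(t1) ^ set(t2)
    if len(diff) == 2 and len({c.upper() for c in diff}) == 1:
        return "".join(c for c in t1 if c not in diff)
    return None


def minimize(terms):
    """Apply the reduction rule until no pair of terms can be merged."""
    terms = set(terms)
    while True:
        merged, used = set(), set()
        for t1, t2 in combinations(sorted(terms), 2):
            m = merge(t1, t2)
            if m is not None:
                merged.add(m)
                used.update((t1, t2))
        if not merged:
            return sorted(terms)
        terms = (terms - used) | merged


print(minimize(["ABC", "aBc", "ABc", "aBC"]))  # the four terms of Y
print(minimize(["ABC", "aBc"]))  # terms differing in two conditions do not merge
```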

This is very similar to Michael Hannan’s recent work in propositional logic in which
   premises are fed to a computer program that derives logical propositions. We
   simply say: let’s use nature as far as we can to ‘infer’ propositions and then add in
   theory to derive more simple expressions.
Solution for Low Growth/High Initial GDP per capita
 C. without simplifying assumptions:

 1. degreewc levelwc turnover sharehld stockmkt dispersn +
    dispersn +
 3. degreewc LEVELWC TURNOVER SHAREHLD stockmkt

 D. with simplifying assumptions:

 1. degreewc stockmkt +
 2. SHAREHLD stockmkt
Let’s do better by understanding more
 clearly the Limited Diversity in data
Table 6: Mapping Limited Diversity and Assessing Simplifying Assumptions*

                                  Configurations of Labor Institutions
Configurations of
Corporate Institutions     dlt    dlT    dLt    Dlt    dLT    DlT    DLt    DLT
psc                        5      0      0      0      0      0      0      1
psC                        0      0      0      0      0      0      0      0
pSc                        1      0      0      0      0      0      0      0
Psc                        0      0      0      0      0      0      0      1
pSC                        1      0      0      0      0      0      0      1
PsC                        0      0      0      0      2      0      0      0
PSc                        0      0      0      0      0      0      1      0
PSC                        0      0      0      0      3      1      1      2

Corporate Institutions (upper case denotes corporatist elements):
P = low shareholder power; p = high shareholder power
S = small stock market; s = large stock market
C = low dispersion of control; c = high dispersion of control
Labor Institutions (upper case denotes corporatist elements):
D = high degree of wage coordination; d = low degree of wage coordination
L = high level of wage coordination; l = low level of wage coordination
T = low level of labor turnover; t = high level of labor turnover

*Shaded portion of the table shows cells covered by the equation for high growth.
        Logical exploration of the Not-Observed
1. Reconsider the result for low growth:

low_growth = degreewc*stockmkt + SHAREHLD*stockmkt

2. We did not though combine our knowledge of what determines ‘low
   growth’ with that for what determines ‘high growth’. We can do this

Apply De Morgan’s Law by reversing the outcome, changing all upper-case
  to lower-case and vice versa, and changing intersection to union
  and vice versa:

   high_growth = (DEGREEWC + STOCKMKT)*(sharehld + STOCKMKT)

We have now arrived at the maximal saturation of our experimental
  design, filling in as many of the 64 cells as we can.
     After maximal saturation…

3. Finally, simplify the terms using Boolean algebra:

high_growth = DEGREEWC*sharehld + DEGREEWC*STOCKMKT + STOCKMKT*sharehld + STOCKMKT*STOCKMKT

By the absorption rule, and since STOCKMKT*STOCKMKT = STOCKMKT,

         high_growth = STOCKMKT + DEGREEWC*sharehld

4. And if we are not happy with two explanations, we can theorize what we
should observe by simplifying assumptions and reduce further.

This is a combination of an incomplete saturated design methodology that
analyzes complex non-linear interactions by a combination of logic, theory,
and simulation using DATA.
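
The De Morgan step and the final simplification can be verified by enumerating all assignments of the three conditions involved. The sketch below is an illustration only; each argument is True when the upper-case form of the condition holds, and the function names are hypothetical.

```python
from itertools import product


def low_growth(degreewc, sharehld, stockmkt):
    # low_growth = degreewc*stockmkt + SHAREHLD*stockmkt
    return (not degreewc and not stockmkt) or (sharehld and not stockmkt)


def high_growth_de_morgan(degreewc, sharehld, stockmkt):
    # high_growth = (DEGREEWC + STOCKMKT)*(sharehld + STOCKMKT)
    return (degreewc or stockmkt) and (not sharehld or stockmkt)


def high_growth_simplified(degreewc, sharehld, stockmkt):
    # high_growth = STOCKMKT + DEGREEWC*sharehld
    return stockmkt or (degreewc and not sharehld)


all_agree = all(
    (not low_growth(*bits))
    == high_growth_de_morgan(*bits)
    == high_growth_simplified(*bits)
    for bits in product((False, True), repeat=3)
)
print(all_agree)
```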
Example Two Reviewed: NK model
• NK models impose a large penalty on experimentation:

   – Landscapes are rugged and organizations easily get trapped.

   – Long-jumps are random.

• Consider Fontana’s idea of Neutrality (and the
  implementation by Lobo, Fontana, and Miller)
   – Fitness is discretized into bands such that organizations are inert
     to small changes in fitness caused by experiments in
   – However, for large changes in fitness, experiments can lead to
     adoption of new configurations.
Simulating Neutrality in the Kauffman/Levinthal NK
 Model (Amit Jain Implementation and Simulation)
                          Simulation run for neutrality in rugged landscapes

                          N= 10
                          Number of organizations = 100
                          Number of time periods = 100
                          Number of runs = 50
                          Only local search
                          Landscape does not change (p=0)
                          Runs made for M=∞ (in program this is M=0) (standard N-K), 10, 25, 100

                          Time periods in graph
                          1-100     Standard N-K run (M = ∞ / 0)
                          100-200 M=10
                          200-300 M=25
                          300-400 M=100

                     Four simulations are run: first panel is the
                     standard, the next 3 vary neutrality from fitness
                     bands of 10%, 25%, 100%: that is, change only if
                     change in fitness hits the band.
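
A minimal sketch of neutrality in an NK landscape follows. It is not the implementation referenced above. Assumptions made for illustration: K=2 (the slide gives only N), circular neighbourhoods, fitness discretized into M equal-width bands, one-bit local search, and an acceptance rule in which a trial configuration is adopted if it lands in a higher band, or in the same band when bands are in use (neutral drift). Here `m=None` stands for the standard N-K run; all function names are hypothetical.

```python
import itertools
import random


def make_landscape(n, k, rng):
    """Random NK landscape; locus i depends on itself and its k right-hand neighbours."""
    tables = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(config):
        total = 0.0
        for i in range(n):
            key = tuple(config[(i + j) % n] for j in range(k + 1))
            total += tables[i][key]
        return total / n  # mean contribution, in [0, 1)

    return fitness


def band(value, m):
    """Discretize fitness into m bands; m=None means no discretization."""
    return value if m is None else int(value * m)


def local_search(fitness, config, m, periods, rng):
    """One-bit experiments; assumed rule: adopt on a higher band, or an equal band if m is set."""
    config = list(config)
    for _ in range(periods):
        trial = config.copy()
        trial[rng.randrange(len(config))] ^= 1
        old, new = band(fitness(config), m), band(fitness(trial), m)
        if new > old or (m is not None and new == old):
            config = trial
    return tuple(config)


def average_final_fitness(n, k, m, organizations, periods, seed):
    """Average fitness across organizations at the end of the run."""
    rng = random.Random(seed)
    fitness = make_landscape(n, k, rng)
    finals = []
    for _ in range(organizations):
        start = tuple(rng.randrange(2) for _ in range(n))
        finals.append(fitness(local_search(fitness, start, m, periods, rng)))
    return sum(finals) / len(finals)


if __name__ == "__main__":
    for m in (None, 10, 25, 100):
        print(m, round(average_final_fitness(10, 2, m, 100, 100, seed=1), 3))
```

Because neutral moves are accepted, an organization can drift within a band and reach configurations that strict hill climbing would never visit. Whether this raises average fitness depends on the parameters and the acceptance rule assumed here.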
 Under Neutrality, Organizations
Discover ‘Ridges’ Between Peaks
Comparing results:

1. Fitness value is higher under neutrality (for this number of simulations).
    In other words, local traps are less confining.
2. More organizational forms are ‘viable’ over the short-run.
3. We believe, but are still checking, that there is more exploration of the possible space.
    If N=10, then we have 1024 combinations. But with only 100 organizations, how
    many combinations are actually explored in a period of time? Here we are returning to
    the type of question: how long should it take to see a possible universe realized?
4. We still don’t know how to fit data to this simulation.

Neutrality is a reasonable concept by which to capture the capability of firms to learn by
   trial and error before engaging in massive ‘retooling’ or ‘reengineering’.

It also captures the idea of ‘institutions’ and ‘institutional transplants’: many institutions
     can cross borders because they are ‘neutral’.
Example (3): Topologies, Graphs,
       Data, Inferences
1.    Science of the complexity should be engagement of
      theory, data, estimation, simulation, imagining the
2.    To understand ‘large complex systems’, we need a lot
      of data.
     1.   A lot of what we do uses small data sets from which we try to
          make claims about asymptotic significance.
3.    Alternatively, we can see social action as driven by agents
      who interact by rules.
     1.   We would like to liberate them from strong topological
          impositions (e.g. regular graphs, or NK landscapes) but still
          come to understand the relation of local and macro structures
          on behavior, and vice versa.
    Analyses of large data sets:
      Venture Capital In US
•   We know little about entrepreneurial
    activities in terms of network dynamics.
•   Many good studies on venture capital,
    but we have no global picture.
•   We have no studies on dynamics.
                 Theories on VCs
Two common hypotheses:
1.   VCs do deals to signal prestige: this should lead to the prestigious getting
     richer. Graph prediction: power law in degree.

2.    VCs do deals to find ‘complements’ in expertise. Graph prediction: power
      law in ‘weighted link strength’.

Implication of (1) for components and clusters:

Venture capital is ‘clustered’ in geographies and a few prestigious companies
     come later to bridge them.

Implication of (2) for components and clusters:

VC firms will seek new partners when new expertise is required and we will
      thus see ‘repeated ties’ for investments in known areas and ‘new ties’ for
      investments in new areas.

Thus we will have a dynamic between the conservational rule of relying on
     proven expertise and the diversity rule of seeking new partners.
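
The two graph predictions refer to different node statistics. The sketch below shows one way to compute them from a list of syndicated deals; it is an illustration with hypothetical firm names and function names, and it assumes that degree means the number of distinct co-investors and strength means the total number of co-investments (the sum of tie weights), which the slides do not define explicitly.

```python
from collections import Counter
from itertools import combinations


def tie_weights(deals):
    """Count how often each pair of investors appears in the same deal."""
    weights = Counter()
    for investors in deals:
        for a, b in combinations(sorted(set(investors)), 2):
            weights[(a, b)] += 1
    return weights


def degree_and_strength(weights):
    """Degree = distinct partners; strength = sum of tie weights (assumed definitions)."""
    degree, strength = Counter(), Counter()
    for (a, b), w in weights.items():
        for node in (a, b):
            degree[node] += 1
            strength[node] += w
    return degree, strength


def distribution(values):
    """Share of nodes at each value; a power law looks roughly linear on log-log axes."""
    freq = Counter(values.values())
    total = sum(freq.values())
    return {value: count / total for value, count in sorted(freq.items())}


# Hypothetical deals, each listing the VC firms in the syndicate.
deals = [("VC1", "VC2"), ("VC1", "VC2"), ("VC1", "VC3"), ("VC2", "VC3", "VC4")]
degree, strength = degree_and_strength(tie_weights(deals))
print(dict(degree))
print(dict(strength))
print(distribution(strength))
```

Repeated ties raise strength without raising degree, which is why the two hypotheses can be told apart in the data.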
            Deal structure
• Over 150,000 transactions over 40 years.

• Several thousand VC investors, targets

• Let’s start by posing a simple question:
  – Do regional markets grow and then become
  – Or is Braudel right: regions develop in relation
    to global (national) dynamics.
Deals distribution among Firms
[Figure: percentage of deals (vertical axis) against percentage of firms (horizontal axis).]
Number of Deals
[Figure: number of deals (left axis, 0 to 30,000) and number of deals per firm (right axis, 0 to 6); an annotation marks a high number of deals per firm.]
Technological breaks create opportunities for new entrants.
National Component Grew Early and Connected
  Regions and Sectors: So much for Clusters




[Figure: percentage of nodes, 1962 to 1986, for three series: sectors covered by the giant component, geographies covered by the giant component, and size of the giant component.]
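
The size of the giant component over time can be tracked with a union-find pass over the cumulative set of ties. This is a minimal sketch with hypothetical data and function names; it assumes that ties persist once formed and that only firms with at least one tie count as nodes.

```python
from collections import Counter


def giant_component_share(nodes, edges):
    """Fraction of nodes that belong to the largest connected component."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    sizes = Counter(find(n) for n in nodes)
    return max(sizes.values()) / len(nodes)


def share_by_year(dated_edges, years):
    """Giant component share using all ties formed up to each year."""
    result = {}
    for year in years:
        edges = [(a, b) for (y, a, b) in dated_edges if y <= year]
        nodes = {n for edge in edges for n in edge}
        result[year] = giant_component_share(nodes, edges) if nodes else 0.0
    return result


# Hypothetical ties: (year, firm, firm)
dated_edges = [(1962, "A", "B"), (1962, "C", "D"), (1965, "B", "C"), (1968, "E", "F")]
print(share_by_year(dated_edges, [1962, 1965, 1968]))
```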
We do not find a power law in degrees: VC syndications don’t seem to
be the product of ‘preferential attachment’
[Figure: degree distribution panels; horizontal axis is degree K.]

Inference by abduction: the dog did not bark, the graph does not have a power
law in degree, hence the culprit of rich get richer is innocent and released.
We do have Power Laws in Strength:
Incumbents like to rely upon trusted partners

• Most deals are incumbent to incumbent.
• Hence we find power laws in repeated ties.
• Trusted expertise based on experience, not signalling of prestige, seems to matter.

[Figure: distribution of strength s on logarithmic axes; strength s runs from 1 to 100,000.]

VC networks have far more repeated ties than the Broadway networks studied by Guimerà, Uzzi et al.
Percentage of Deals Where ‘Local’
   cluster is greater than global


[Figure: percentage of deals, 1961 to 2004, with separate series for geography and sector.]

In other words, clusters are stronger globally than ‘within’ region or sector.
   New Links are Formed when…
• A VC company goes to a new geography or sector
  – that is, when it needs new expertise.
• VC firms are drawn to successful targets.

[Table: estimates as extracted. Column headings on the slide: Diversification, Degree, VC, Target, Year. Column alignment was lost, so values are listed in their original order.]
New geography     -1.577    -1.504
                  -0.023    -0.023    -0.023    -0.024
New sector        -1.723
Interaction        1.502
Firm degree        0.002     0.002     0.002
                   0         0         0
Target degree     -0.046    -0.056
                  -0.002    -0.002
Constant           1.398     1.24      1.487     23.439
                  -0.015    -0.018    -0.02      26,313.25
     Conclusions to Example 3
•    What we showed:

1.   Looking at dynamics of graph properties rules out certain micro behaviors.
2.   Clustering and giant component analysis confirms the Braudel hypothesis:
     clusters develop in relation to the national graph.

•    What we did not show:

1.   A formal model of the choice between new and old tie that is the equivalent
     to the Simon/Barabasi model of preferential attachment.
2.   Agent based models that test more precisely the micro rules employed.
     1. Why not shown: I don’t think we have a good empirical model yet.
     2. We are out of time.
• Social systems are harder than physical systems if we play only by
  the rules of the latter.
    – We don’t ask an electron ‘where y’a been and when were y’a d’ere?’
    – We can ask people this question.
• Physical systems have given topologies:
    – forests are reasonably viewed as 3 dimensional lattices.
    – American suburbs are often 2 dimensional lattices, but Paris is not, and
      people move around.
    – Geographical space is not always the same as social space.
• Engineers often like to get rid of people because the problem is hard
    – Systems are most often, even today, socio-technical.
    – Machines and people inhabit the same graph.
Interactions in physical systems. High Power Items - Jet Engine
Adding in People makes this much harder

[Diagram: linked boxes, listed here in extraction order.]
• Function is physical and cannot be represented logically and symbolically
• Side effects are high power and can’t be isolated
• Modules display multiple behaviors in multiple energy domains
• The design can be converted to a picture
• High power
• Severe back-loading
• Module behavior changes when combined into system
• Modules are independent in design
• The picture is an incomplete abstract representation of the design
• Modules must be designed anew specifically for their function
• Modules must be validated physically
• Interfaces must be tailored to function
• Separate module and system validation steps are needed
• Main function carriers can’t be standardized
• A construction process exists that eliminates most assembled interfaces
• Systems cannot be designed with good confidence that they will work

  From Whitney, MIT.
                  A conclusion
• Complexity is a point of view that the pursuit of
  plausibility is more rewarding than certainty.

• Social science needs to move to an open science
  model, where we spend more time in projects, less time
  collecting data.

• Simulations and estimation should be seen as part of the
  interpretative methodology to identify plausible
  mechanisms as opposed to verify causes.

• Interactions, rules, non-saturated designs, simulations,
  estimations, graph theory are the words in the new

• But the going will not be easy… Consider Wings and
  Engines and ….. People
    Appendix: If Time Permitted: Extend This Method of
   Experimental Design and Simulated and Real Data to
          Complementarities in Manufacturing

• Consider activity systems that describe how auto companies
  manufacture efficiently with quality, including the work teams and
  social organization.

• Can we identify better ‘prototypical’ strategies that are robust across
      Strategy and Prototypes
Consider strategy as the problem of choosing capabilities
  and markets; that is, the sets C and M are the givens, and
  the decision is to choose the pair in C x M that maximizes
  performance, S* = argmax over C x M.
This can rarely if ever be solved, so that people think
  heuristically instead by prototypes that represent the
  “best configuration”: differentiate, cut cost, have religion.
This formulation is close to the theory of complementarities
  a la Milgrom and Roberts.
The empirical question is: Can we pick out the best
  configurations from the data?
                 Fuzzy Sets:
• Data are no longer “crisp”.
• Important consideration is coding and functional
• Rules are set theoretic: a necessary condition means
  that the outcome is a subset of the condition; a sufficient
  condition means that the condition is a subset of the outcome.
• Values are calculated for combinations using fuzzy set
  algebra. These values are compared to the value of the
  outcome. If the outcome value is larger, the
  combination/element is sufficient; if smaller, the
  combination/element is necessary.
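
The subset rules can be sketched in a few lines. The example assumes the standard fuzzy set operators (minimum for intersection, maximum for union, one minus membership for negation) and applies the strict rule that the inequality must hold in every case; the membership scores and function names are hypothetical, not taken from the auto plant data.

```python
def fuzzy_and(*memberships):
    """Intersection of fuzzy sets: the minimum membership."""
    return min(memberships)


def fuzzy_or(*memberships):
    """Union of fuzzy sets: the maximum membership."""
    return max(memberships)


def fuzzy_not(membership):
    """Negation of a fuzzy set."""
    return 1 - membership


def is_sufficient(condition, outcome):
    """Condition is a subset of the outcome: condition <= outcome in every case."""
    return all(c <= o for c, o in zip(condition, outcome))


def is_necessary(condition, outcome):
    """Outcome is a subset of the condition: condition >= outcome in every case."""
    return all(c >= o for c, o in zip(condition, outcome))


# Hypothetical membership scores for four plants.
teams = [0.8, 0.3, 0.9, 0.2]
automation = [0.6, 0.9, 0.9, 0.1]
quality = [0.7, 0.4, 0.9, 0.5]

teams_and_automation = [fuzzy_and(t, a) for t, a in zip(teams, automation)]
print(is_sufficient(teams_and_automation, quality))
print(is_necessary(teams_and_automation, quality))
```

In practice a looser threshold is often used instead of requiring the inequality in every case; the strict version is shown here because it matches the subset wording most directly.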
Benchmarking the fuzzy configuration against Data
[Figure: two scatter plots, actual productivity against predicted productivity and actual quality against predicted quality.]

The data are 70 or so auto plants around the world and consist of
observations on teams, technologies, work processes, scale, etc. These
practices were analyzed to find unique combinations of ‘minimal’
practices sufficient to achieve performance.
