
Monday, August 30, 2010

Criteria for assessing economic models


How can we assess the epistemic warrant of an economic model that purports to represent some aspects of economic reality? The general problem of assessing the credibility of an economic model can be broken down into more specific questions concerning the validity, comprehensiveness, robustness, autonomy, and reliability of the model. Here are initial definitions of these concepts.
  • Validity is a measure of the degree to which the assumptions employed in the construction of the model are thought to correspond to the real processes underlying the phenomena represented by the model. 
  • Comprehensiveness is the degree to which the model is thought to succeed in capturing the major causal factors that influence the features of the behavior of the system in which we are interested. 
  • Robustness is a measure of the degree to which the results of the model persist under small perturbations in the settings of parameters, formulation of equations, etc. 
  • Autonomy refers to the stability of the model's results in the face of variation of contextual factors. 
  • Reliability is a measure of the degree of confidence we can have in the data employed in setting the values of the parameters. 
These are features of models that can be investigated more or less independently and prior to examination of the empirical success or failure of the predictions of the model.

Let us look more closely at these standards of adequacy. The discussion of realism elsewhere suggests that we may attempt to validate the model deductively, by examining each of the assumptions underlying construction of the model for its plausibility or realism (link). (This resembles Mill's "deductive method" of theory evaluation.) Economists are highly confident in the underlying general equilibrium theory. The theory is incomplete (or, in Daniel Hausman's language, inexact; link), in that economic outcomes are not wholly determined by purely economic forces. But within its scope economists are confident that the theory identifies the main causal processes: an equilibration of supply and demand through market-determined prices.

Validity can be assessed through direct inspection of the substantive economic assumptions of the model: the formulation of consumer and firm behavior, the representation of production and consumption functions, the closure rules, and the like. To the extent that the particular formulation embodied in the model is supported by accepted economic theory, the validity of the model is enhanced. On the other hand, if particular formulations appear to be ad hoc (introduced, perhaps, to make the problem more tractable), the validity of the model is reduced. If, for example, the model assumes linear demand functions and we judge that this is a highly unrealistic assumption about the real underlying demand functions, then we will have less confidence in the predictive results of the model.

Unfortunately, there can be no fixed standard of evaluation concerning the validity of a model. All models make simplifying and idealizing assumptions; so to that extent they deviate from literal realism. And the question of whether a given idealization is felicitous or not cannot always be resolved on antecedent theoretical grounds; instead, it is necessary to look at the overall empirical adequacy of the model. The adequacy of the assumption of fixed coefficients of production cannot be assessed a priori; in some contexts and for some purposes it is a reasonable approximation of the economic reality, while in other cases it introduces unacceptable distortion of the actual economic processes (when input substitution is extensive). What can be said concerning the validity of a model's assumptions is rather minimal but not entirely vacuous. The assumptions should be consistent with existing economic theory; they should be reasonable and motivated formulations of background economic principles; and they should be implemented in a mathematically acceptable fashion.

Comprehensiveness too is a weak constraint on economic models. It is plain that all economic theories and models disregard some causal factors in order to isolate the workings of specific economic mechanisms; moreover, there will always be economic forces that have not been represented within the model. So judgment of the comprehensiveness of a model depends on a qualitative assessment of the relative importance of various economic and non-economic factors in the particular system under analysis. If a given factor seems to be economically important (e.g. input substitution) but unrepresented within the model, then the model loses points on comprehensiveness.

Robustness can be directly assessed through a technique widely used by economists, sensitivity analysis. The model is run a large number of times, varying the values assigned to parameters (reflecting the range of uncertainty in estimates or observations). If the model continues to have qualitatively similar findings, it is said to be robust. If solutions vary wildly under small perturbations of the parameter settings, the model is rightly thought to be a poor indicator of the underlying economic mechanisms.
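The procedure described here can be sketched in a few lines of code. The snippet below is a minimal illustration, not an actual CGE model: it uses a hypothetical toy market (linear demand and supply) and jitters each parameter within a stated uncertainty band, then reports the spread of the resulting equilibrium solutions. All names and parameter values are invented for the example.

```python
import random

def equilibrium_price(a, b, c, d):
    # Toy linear market: demand Q = a - b*p, supply Q = c + d*p.
    # Setting demand equal to supply gives p = (a - c) / (b + d).
    return (a - c) / (b + d)

def sensitivity_analysis(base, noise=0.05, runs=1000, seed=0):
    # Re-solve the model many times, jittering each parameter by up to
    # +/- `noise` (relative), reflecting uncertainty in its estimate.
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        perturbed = {k: v * (1 + rng.uniform(-noise, noise))
                     for k, v in base.items()}
        results.append(equilibrium_price(**perturbed))
    mean = sum(results) / len(results)
    spread = max(results) - min(results)
    return mean, spread

# Baseline price is (100 - 10) / (2 + 1) = 30.
base = {"a": 100.0, "b": 2.0, "c": 10.0, "d": 1.0}
mean, spread = sensitivity_analysis(base)

# A small spread relative to the mean suggests robustness; a large one
# suggests the model's conclusions hinge on precise parameter values.
print(f"baseline = {equilibrium_price(**base):.2f}")
print(f"mean = {mean:.2f}, spread = {spread:.2f}")
```

In a realistic application the "model" would be a full simulation rather than a closed-form formula, and the perturbations would be drawn from the estimated error distributions of the data, but the logic of the assessment is the same: qualitatively stable results under perturbation count in the model's favor.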

Autonomy is the theoretical equivalent of robustness. It is a measure of the stability of the model under changes of assumptions about the causal background of the system. If the model's results are highly sensitive to changes in the environment within which the modeled processes take place, then we should be suspicious of the results of the model.

Assessment of reliability is also somewhat more straightforward than comprehensiveness and validity. The empirical data used to set parameters and exogenous variables have been gathered through specific well-understood procedures, and it is mandatory that we give some account of the precision of the resulting data.

Note that reliability and robustness interact; if we find that the model is highly robust with respect to a particular set of parameters, then the unreliability of estimates of those parameters will not have much effect on the reliability of the model itself. In this case it is enough to have "stylized facts" governing the parameters that are used: roughly 60% of workers' income is spent on food, 0% is saved, etc.

Failures along each of these lines can be illustrated easily.
  1. The model assumes that prices are determined on the basis of markup pricing (costs plus a fixed exogenous markup rate and wage). In fact, however, we might believe (along neoclassical lines) that prices, wages, and the profit rate are all endogenous, so that markup pricing misrepresents the underlying price mechanism. This would be a failure of validity; the model is premised on assumptions that may not hold. 
  2. The model is premised on a two-sector analysis of the economy. However, energy production and consumption turn out to be economically crucial factors in the performance of the economy, and these effects are overlooked unless we represent the energy sector separately. This would be a failure of comprehensiveness; there is an economically significant factor that is not represented in the model. 
  3. We rerun the model assuming a slightly altered set of production coefficients, and we find that the predictions are substantially different: the increase in income is only 33% of what it was, and deficits are only half what they were. This is a failure of robustness; once we know that the model is extremely sensitive to variations in the parameters, we have strong reason to doubt its predictions. The accuracy of measurement of parameters is limited, so we can be confident that remeasurement would produce different values. So we can in turn expect that the simulation will arrive at different values for the endogenous variables. 
  4. Suppose that our model of income distribution in a developing economy is premised on the international trading arrangements embodied in GATT. The model is designed to represent the domestic causal relations between food subsidies and the pattern of income distribution across classes. If the results of the model change substantially upon dropping the GATT assumption, then the model is not autonomous with respect to international trading arrangements. 
  5. Finally, we examine the data underlying the consumption functions and we find that these derive from one household study in one Mexican state, involving 300 households. Moreover, we determine that the model is sensitive to the parameters defining consumption functions. On this scenario we have little reason to expect that the estimates derived from the household study are reliable estimates of consumption in all social classes all across Mexico; and therefore we have little reason to depend on the predictions of the model. This is a failure of reliability. 
These factors--validity, comprehensiveness, robustness, autonomy, and reliability--figure into our assessment of the antecedent credibility of a given model. If the model is judged to be reasonably valid and comprehensive; if it appears to be fairly robust and autonomous; and if the empirical data on which it rests appears to be reliable; then we have reason to believe that the model is a reasonable representation of the underlying economic reality. But this deductive validation of the model does not take us far enough. These are reasons to have a priori confidence in the model. But we need as well to have a basis for a posteriori confidence in the particular results of this specific model. And since there are many well-known ways in which a generally well-constructed model can nonetheless miss the mark--incompleteness of the causal field, failure of ceteris paribus clauses, poor data or poor estimates of the exogenous variables and parameters, proliferation of error to the point where the solution has no value, and path-dependence of the equilibrium solution--we need to have some way of empirically evaluating the results of the model.

(Here is an application of these ideas to computable general equilibrium (CGE) models in an article published in On the Reliability of Economic Models: Essays in the Philosophy of Economics; link.  See also Lance Taylor's reply and discussion in the same volume.)

Friday, April 16, 2010

Feyerabend as artisanal scientist


I've generally found Paul Feyerabend's position on science to be a bit too extreme. Here is one provocative statement in the analytical index of Against Method:
Thus science is much closer to myth than a scientific philosophy is prepared to admit. It is one of the many forms of thought that have been developed by man, and not necessarily the best. It is conspicuous, noisy, and impudent, but it is inherently superior only for those who have already decided in favour of a certain ideology, or who have accepted it without having ever examined its advantages and its limits. And as the accepting and rejecting of ideologies should be left to the individual it follows that the separation of state and church must be supplemented by the separation of state and science, that most recent, most aggressive, and most dogmatic religious institution. Such a separation may be our only chance to achieve a humanity we are capable of, but have never fully realised.
Fundamentally my objection is that Feyerabend seems to leave no room at all for rationality in science: no scientific method, no grip for observation, and no force to scientific reasoning. A cartoon takeaway from his work is a slogan: science is just another language game, a rhetorical system, with no claim to rational force based on empirical study and reasoning. Feyerabend seems to be the ultimate voice for the idea of relativism in knowledge systems -- much as Klamer and McCloskey seemed to argue with regard to economic theory in The Consequences of Economic Rhetoric.

This isn't a baseless misreading of Feyerabend. In fact, it isn't a bad paraphrase of Against Method. But it isn't the whole story either.  And at bottom, I don't think it is accurate to say that Feyerabend rejects the idea of scientific rationality.  Rather, he rejects one common interpretation of that notion: the view that scientific rationality can be reduced to a set of universal canons of investigation and justification, and that there is a neutral and universal set of standards of inference that decisively guide choice of scientific theories and hypotheses.  So I think it is better to understand Feyerabend as presenting an argument against a certain view in the philosophy of science rather than against science itself.

Instead, I now want to understand Feyerabend as holding something like this: that there is "reasoning" in scientific research, and this reasoning has a degree of rational credibility.  However, the reasoning that scientists do is always contextual and skilled, rather than universal and mechanical.  And it doesn't result in proofs and demonstrations, but rather a preponderance of reasons favoring one interpretation rather than another.  (Significantly, this approach to scientific justification sounds a bit like the view I argued for with regard to sociological theories in an earlier posting.)

Here are a few reasons for thinking that Feyerabend endorses some notion of scientific rationality.

First, Feyerabend is a philosopher and historian of science who himself demonstrates a great deal of respect for empirical and historical detail.  The facts matter to Feyerabend, in his interpretation of the history of science.  He establishes his negative case with painstaking attention to the details of the history of science -- Newton, optics, quantum mechanics. This is itself a kind of empirical reasoning about the actual intellectual practices of working scientists. But if Feyerabend were genuinely skeptical of the enterprise of offering evidence in favor of claims, this work would be pointless.

Second, his own exposition of several scientific debates demonstrates a realist's commitment to the issues at stake. Take his discussion of the micro-mechanisms of reflection and light "rays". If there were in principle no way of evaluating alternative theories of these mechanisms, it would be pointless to consider the question. But actually, Feyerabend seems to reason on the assumption that one theory is better than another, given the preponderance of reasons provided by macro-observations and mathematical-physical specification of the hypotheses.

Third, he takes a moderate view on the relation between empirical observation and scientific theory in "How to Be a Good Empiricist":
The final reply to the question put in the title is therefore as follows. A good empiricist will not rest content with the theory that is in the centre of attention and with those tests of the theory which can be carried out in a direct manner. Knowing that the most fundamental and the most general criticism is the criticism produced with the help of alternatives, he will try to invent such alternatives. (102)
This passage is "moderate" in a specific sense: it doesn't give absolute priority to a given range of empirical facts; but neither does it dismiss the conditional epistemic weight of a body of observation.

So as a historian of science, Feyerabend shows no hesitation about engaging in empirical reasoning and persuasion himself, and he seems to grant a degree of locally compelling reasoning in the context of specific physical disputes.  And he appears to presuppose a degree of epistemic importance -- always contestable -- for a body of scientific observation and discovery.

What he seems most antagonistic to is the positivistic idea of a universal scientific method -- a set of formally specified rules that guide research and the evaluation of theories. Here is how he puts the point in "On the Limited Validity of Methodological Rules" (collected in Knowledge, Science and Relativism). 
It is indubitable that the application of clear, well-defined, and above all 'rational' rules occasionally leads to results. A vast number of discoveries owe their existence to the systematic procedures of their discoverers. But from that, it does not follow that there are rules which must be obeyed for every cognitive act and every scientific investigation. On the contrary, it is totally improbable that there is such a system of rules, such a logic of scientific discovery, which permeates all reasoning without obstructing it in any way. The world in which we live is very complex. Its laws do not lay open to us, rather they present themselves in diverse disguises (astronomy, atomic physics, theology, psychology, physiology, and the like). Countless prejudices find their way into every scientific action, making them possible in the first place. It is thus to be expected that every rule, even the most 'fundamental', will only be successful in a limited domain, and that the forced application of the rule outside of its domain must obstruct research and perhaps even bring it to stagnation. This will be illustrated by the following examples. (138)
It is the presumed attainability of a universal, formal philosophy of science that irritates him. Instead, he seems at bottom to be advocating a limited and conditioned form of local rationality -- not a set of universal maxims but a set of variable yet locally justifiable practices. The scientist is an artisan rather than a machinist.  Here is a passage from the concluding chapter of Against Method:
The idea that science can, and should, be run according to fixed and universal rules, is both unrealistic and pernicious. It is unrealistic, for it takes too simple a view of the talents of man and of the circumstances which encourage, or cause, their development. And it is pernicious, for the attempt to enforce the rules is bound to increase our professional qualifications at the expense of our humanity. In addition, the idea is detrimental to science, for it neglects the complex physical and historical conditions which influence scientific change. It makes our science less adaptable and more dogmatic: every methodological rule is associated with cosmological assumptions, so that using the rule we take it for granted that the assumptions are correct. Naive falsificationism takes it for granted that the laws of nature are manifest and not hidden beneath disturbances of considerable magnitude. Empiricism takes it for granted that sense experience is a better mirror of the world than pure thought. Praise of argument takes it for granted that the artifices of Reason give better results than the unchecked play of our emotions. Such assumptions may be perfectly plausible and even true. Still, one should occasionally put them to a test. Putting them to a test means that we stop using the methodology associated with them, start doing science in a different way and see what happens. Case studies such as those reported in the preceding chapters show that such tests occur all the time, and that they speak against the universal validity of any rule. All methodologies have their limitations and the only 'rule' that survives is 'anything goes'.
His most basic conclusion is epistemic anarchism, expressed in the "anything goes" slogan, but without the apparent relativism suggested by the phrase: there is no "organon," no "inductive logic," and no "Scientific Method" that guides the creation and validation of science.  But scientists do often succeed in learning and defending important truths about nature nonetheless.

(Here is an online version of the analytical contents and concluding chapter of Against Method.  And here is a link to an article by John Preston on Feyerabend in the Stanford Encyclopedia of Philosophy.)


Sunday, January 24, 2010

Defining and specifying social phenomena



Insect (df): a class within the arthropods whose members have a chitinous exoskeleton, a three-part body (head, thorax, and abdomen), three pairs of jointed legs, compound eyes, and two antennae.

What is involved in offering a definition of a complex social phenomenon such as "fascism", "rationality", "contentious politics", "social capital", or "civic engagement"? Is there any sense in which a definition can be said to be correct or incorrect, given the facts we find in the world? Are some definitions better than others? Does a definition correspond to the world in some way? Or is a definition no more than a conventional stipulation about how we propose to use a specific word?

There are several fundamental questions that need answering when we consider the meaning of a term such as "fascism" or "contentious politics". What do we intend the term to refer to?  How is the term used in ordinary language?  What are the paradigm cases? What are the ordinary criteria of application of the term -- the necessary and sufficient conditions, the rules of application? What characteristics do we mean to pick out in using the term? What is our proto-theory that guides our use and application of the concept?

From the scientific point of view, the use of a concept is to single out a family of objects or phenomena that can usefully be considered together for further analysis and explanation. "Metals" are a group of materials that have similar physical properties such as conductivity and ductility. And it turns out that these phenomenologically similar materials also have important underlying physical properties in common, that explain the phenomenological properties. So it is possible to provide a physical theory of metals that unifies and explains their observable similarities. The scientist's interest, then, is in the phenomena and not the concept or its definition.

In order to investigate further we need to do several kinds of work. We need to specify more exactly what it is that we are singling out. What is "civic engagement"?  Does this concept single out a specific range of behaviors and motivations? Would we count a spontaneous gift to a fund for a family who lost their home to a fire as "civic engagement"? What about membership in a college fraternity? So we have to say what we mean by the term; we have to indicate which bits of the world are encompassed by the term; and perhaps we need to give some reason to expect that these phenomena are relevantly similar.

Several semantic acts are relevant in trying to do this work. "Ostension" is the most basic: pointing to the clear cases of civic engagement or fascism and saying "By civic engagement I mean things like these and things relevantly similar to them." If we go this route then we put a large part of the burden of the semantics in the world and in the judgment of the observer: is this next putative example of the stuff really similar to the paradigm examples?

But there is also an intensional part of the work: what do we intend to designate in pointing to this set of paradigm cases? Is it the motivation of the activity, the features of social connections involved in the activity, or the effects of the activity that are motivating the selection of cases? Is fascism a kind of ideology, a type of social movement, or a type of political organization? These questions aren't answered by the gesture of ostension; rather, the observer needs to specify something about the nature of the phenomena that are intended to be encapsulated by the concept.

Once we have stipulated the extension and criteria of application of the term, we can then take a further step and offer a theory of this stuff. It may be a theory in materials science intended to explain the workings of some common characteristics of this stuff -- electrical or thermal conductivity, melting point, hardness. Or it may be a social theory of the origins and institutional tendencies of the stuff (fascism, social movements, civic engagement). Either way, the theory goes beyond semantics and makes substantive empirical statement about the world.

It is not the case that all scientific concepts are constructed through a process of abstraction from observable phenomena.  A theoretical concept is one whose meaning exceeds the observable associations or criteria associated with the concept. It may postulate unobservable mechanisms or structures which are only indirectly connected to observable phenomena, or it may hypothesize distinctions and features that help to explain the gross behavior of the phenomena. The value of a theoretical concept is not measured by its fit with ordinary language usage or its direct applicability to the observable world; instead, a theoretical concept is useful if it helps the theorist to formulate hypotheses about the unobservable mechanisms that underlie a phenomenon and that help to provide some empirical order to the phenomena.

In order to support empirical research, theoretical concepts need somehow to be related to the world of observation and experience.  An important activity is “operationalizing” a theoretical concept. This means specifying a set of observable or experimental characteristics that permit the investigator to apply the concept to the world. But the operational criteria associated with a concept do not exhaust its meaning, and different investigators may provide a different set of operational criteria for the same concept. And a specific scheme of operationalization of a concept like "social capital" or "civic engagement" may itself be debated.
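The gap between a theoretical concept and its operational criteria can be made concrete with a toy example. The indicator names and weights below are entirely hypothetical; the point is only that two investigators can operationalize the same concept -- here, "civic engagement" -- in different ways and thereby assign different scores to the same respondent.

```python
# Two hypothetical operationalizations of "civic engagement" as a survey index.
# Indicators and weights are invented for illustration, not drawn from any study.

def engagement_index_a(r):
    # Investigator A: equal weight to voting, volunteering, and meeting attendance.
    return (r["voted"] + r["volunteered"] + r["attended_meeting"]) / 3

def engagement_index_b(r):
    # Investigator B: adds charitable giving and weights voting more heavily.
    return (0.4 * r["voted"] + 0.2 * r["volunteered"]
            + 0.2 * r["attended_meeting"] + 0.2 * r["donated"])

respondent = {"voted": 1, "volunteered": 0, "attended_meeting": 1, "donated": 1}
print(engagement_index_a(respondent))  # score under operationalization A
print(engagement_index_b(respondent))  # score under operationalization B
```

Neither index exhausts the meaning of the concept; each is one contestable scheme for connecting it to observable data, which is precisely why rival operationalizations of concepts like "social capital" can themselves become objects of debate.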

The idea of a "natural kind" arises in the natural sciences. Concepts like metal, acid, insect, and gene are linguistic elements that are thought to refer to a family or group of entities that share fundamental properties in common. Kinds are thought to exist in the world, not simply in conceptual schemes. So having identified the kind, we can then attempt to arrive at a theory of the underlying nature of things like this. (It is an important question to consider whether there are any "social kinds;" in general, I think not.)

These reflections raise many of the intellectual problems associated with defining a field of empirical research in the social sciences.  Research always forces us to single out some specific body of phenomena for study.  This means specifying and conceptualizing the phenomena.  And eventually it means arriving at theories of how these sorts of things work.  But there is a permanent gap between concept and the world that means that certain questions can't be answered: for example, what is fascism really?  There are no social essences that definitions might be thought to identify.  Instead, we can offer analysis and theory about specific fascist movements and regimes, based on this or that way of specifying the concept of fascism.  But there is nothing in the world that dictates how we define fascism and classify, specify, and theorize historical examples of fascism.  The semantic ideas of family resemblance, ideal type, and cluster concept work best for concepts in the social sciences.

Defining and specifying social phenomena



Insect (df): a class within the arthropods that have a chitinous exoskeleton, a three-part body (head, thorax, and abdomen), three pairs of jointed legs, compound eyes, and two antennae.
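The classical definition above can be read as a conjunction of individually necessary and jointly sufficient conditions. Here is a minimal sketch of that idea in Python; the feature names are invented for illustration, not a serious taxonomy:

```python
# A classical ("necessary and sufficient conditions") definition treats
# membership as a strict conjunction of criteria: satisfy all of them and
# you are in; fail any one and you are out.

INSECT_CRITERIA = [
    "chitinous_exoskeleton",
    "three_part_body",        # head, thorax, abdomen
    "three_leg_pairs",
    "compound_eyes",
    "two_antennae",
]

def is_insect(features: set[str]) -> bool:
    """An organism counts as an insect iff it satisfies every criterion."""
    return all(c in features for c in INSECT_CRITERIA)

ant = {"chitinous_exoskeleton", "three_part_body", "three_leg_pairs",
       "compound_eyes", "two_antennae"}
spider = {"chitinous_exoskeleton", "two_part_body", "four_leg_pairs"}

print(is_insect(ant))     # True
print(is_insect(spider))  # False
```

The interesting question for the social sciences, pursued below, is whether any of its concepts admit of this strict form.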

What is involved in offering a definition of a complex social phenomenon such as "fascism", "rationality", "contentious politics", "social capital", or "civic engagement"? Is there any sense in which a definition can be said to be correct or incorrect, given the facts we find in the world? Are some definitions better than others? Does a definition correspond to the world in some way? Or is a definition no more than a conventional stipulation about how we propose to use a specific word?

There are several fundamental questions that need answering when we consider the meaning of a term such as "fascism" or "contentious politics". What do we intend the term to refer to?  How is the term used in ordinary language?  What are the paradigm cases? What are the ordinary criteria of application of the term -- the necessary and sufficient conditions, the rules of application? What characteristics do we mean to pick out in using the term? What is our proto-theory that guides our use and application of the concept?

From the scientific point of view, the use of a concept is to single out a family of objects or phenomena that can usefully be considered together for further analysis and explanation. "Metals" are a group of materials that have similar physical properties such as conductivity and ductility. And it turns out that these phenomenologically similar materials also have important underlying physical properties in common that explain the phenomenological properties. So it is possible to provide a physical theory of metals that unifies and explains their observable similarities. The scientist's interest, then, is in the phenomena, not the concept or its definition.

In order to investigate further we need to do several kinds of work. We need to specify more exactly what it is that we are singling out. What is "civic engagement"?  Does this concept single out a specific range of behaviors and motivations? Would we count a spontaneous gift to a fund for a family who lost their home to a fire as "civic engagement"? What about membership in a college fraternity? So we have to say what we mean by the term; we have to indicate which bits of the world are encompassed by the term; and perhaps we need to give some reason to expect that these phenomena are relevantly similar.

Several semantic acts are relevant in trying to do this work. "Ostension" is the most basic: pointing to the clear cases of civic engagement or fascism and saying "By civic engagement I mean things like these and things relevantly similar to them." If we go this route then we put a large part of the burden of the semantics in the world and in the judgment of the observer: is this next putative example of the stuff really similar to the paradigm examples?
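The ostensive route can be made explicit as a similarity judgment against paradigm cases. The following sketch treats "relevantly similar" as a Jaccard-similarity threshold over feature sets; the features, paradigm cases, and cutoff are all illustrative assumptions, not a proposed measure:

```python
# Ostension puts the semantic burden on similarity to paradigm cases.
# This sketch makes the observer's judgment explicit: a new case counts
# as "civic engagement" if its feature set is similar enough (Jaccard
# similarity) to at least one paradigm. All features are invented.

def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two feature sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b)

PARADIGMS = [
    {"voluntary", "public_good", "collective", "unpaid"},  # e.g., a PTA
    {"voluntary", "public_good", "political", "unpaid"},   # e.g., a rally
]

def is_civic_engagement(case: set[str], threshold: float = 0.5) -> bool:
    """Count a case in if it is 'relevantly similar' to some paradigm."""
    return any(jaccard(case, p) >= threshold for p in PARADIGMS)

volunteer_fire_dept = {"voluntary", "public_good", "collective", "unpaid"}
fraternity = {"voluntary", "collective", "private_benefit"}

print(is_civic_engagement(volunteer_fire_dept))  # True
print(is_civic_engagement(fraternity))           # False
```

Notice that the threshold itself encodes the contestable judgment call: move it and the fraternity case may flip.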

But there is also an intensional part of the work: what do we intend to designate in pointing to this set of paradigm cases? Is it the motivation of the activity, the features of social connections involved in the activity, or the effects of the activity that are motivating the selection of cases? Is fascism a kind of ideology, a type of social movement, or a type of political organization? These questions aren't answered by the gesture of ostension; rather, the observer needs to specify something about the nature of the phenomena that are intended to be encapsulated by the concept.

Once we have stipulated the extension and criteria of application of the term, we can then take a further step and offer a theory of this stuff. It may be a theory in materials science intended to explain the workings of some common characteristics of this stuff -- electrical or thermal conductivity, melting point, hardness. Or it may be a social theory of the origins and institutional tendencies of the stuff (fascism, social movements, civic engagement). Either way, the theory goes beyond semantics and makes substantive empirical statements about the world.

It is not the case that all scientific concepts are constructed through a process of abstraction from observable phenomena.  A theoretical concept is one whose meaning exceeds the observable associations or criteria associated with the concept. It may postulate unobservable mechanisms or structures that are only indirectly connected to observable phenomena, or it may hypothesize distinctions and features that help to explain the gross behavior of the phenomena. The value of a theoretical concept is not measured by its fit with ordinary language usage or its direct applicability to the observable world; instead, a theoretical concept is useful if it helps the theorist to formulate hypotheses about the unobservable mechanisms that underlie a phenomenon and to bring some empirical order to it.

In order to support empirical research, theoretical concepts need somehow to be related to the world of observation and experience.  An important activity is “operationalizing” a theoretical concept. This means specifying a set of observable or experimental characteristics that permit the investigator to apply the concept to the world. But the operational criteria associated with a concept do not exhaust its meaning, and different investigators may provide a different set of operational criteria for the same concept. And a specific scheme of operationalization of a concept like "social capital" or "civic engagement" may itself be debated.
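A toy illustration of how rival operationalizations diverge: two invented scoring rules for "civic engagement," applied to the same hypothetical survey record, classify it differently. All field names and cutoffs here are assumptions made up for the example, not real survey instruments:

```python
# The same theoretical concept, two operationalizations. Neither exhausts
# the meaning of "civic engagement"; each is a debatable measurement choice.

record = {"memberships": 1, "volunteer_hours_per_month": 0,
          "voted_last_election": True, "donations_per_year": 2}

def engaged_association_style(r: dict) -> bool:
    # Operationalization A: associational membership plus active volunteering.
    return r["memberships"] >= 1 and r["volunteer_hours_per_month"] > 0

def engaged_participation_style(r: dict) -> bool:
    # Operationalization B: any act of civic participation counts.
    return (r["voted_last_election"]
            or r["donations_per_year"] > 0
            or r["volunteer_hours_per_month"] > 0)

print(engaged_association_style(record))    # False
print(engaged_participation_style(record))  # True
```

The disagreement between the two verdicts is exactly the point of the paragraph above: operational criteria are partial, and choosing among them is itself a substantive dispute.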

The idea of a "natural kind" arises in the natural sciences. Concepts like metal, acid, insect, and gene are linguistic elements that are thought to refer to a family or group of entities that share fundamental properties. Kinds are thought to exist in the world, not simply in conceptual schemes. So having identified the kind, we can then attempt to arrive at a theory of the underlying nature of things of this kind. (It is an important question to consider whether there are any "social kinds"; in general, I think not.)

These reflections raise many of the intellectual problems associated with defining a field of empirical research in the social sciences.  Research always forces us to single out some specific body of phenomena for study.  This means specifying and conceptualizing the phenomena.  And eventually it means arriving at theories of how these sorts of things work.  But there is a permanent gap between concept and the world that means that certain questions can't be answered: for example, what is fascism really?  There are no social essences that definitions might be thought to identify.  Instead, we can offer analysis and theory about specific fascist movements and regimes, based on this or that way of specifying the concept of fascism.  But there is nothing in the world that dictates how we define fascism and classify, specify, and theorize historical examples of fascism.  The semantic ideas of family resemblance, ideal type, and cluster concept work best for concepts in the social sciences.
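The contrast between a classical definition and a family-resemblance or cluster concept can be sketched directly: under the cluster treatment no single feature is necessary, and membership requires only "enough" shared features. The feature list for "fascism" below is a rough placeholder for illustration, not a serious analysis of the phenomenon:

```python
# A cluster (family-resemblance) concept, in contrast to the strict
# conjunction of a classical definition: membership is a matter of
# sharing enough features with the cluster, with no feature required.

FASCISM_CLUSTER = {"ultranationalism", "paramilitarism", "leader_cult",
                   "anti_liberalism", "mass_mobilization"}

def falls_under(case: set[str], cluster: set[str], minimum: int = 3) -> bool:
    """Membership requires only 'enough' overlapping features."""
    return len(case & cluster) >= minimum

movement_a = {"ultranationalism", "paramilitarism", "leader_cult"}
movement_b = {"ultranationalism", "anti_liberalism"}

print(falls_under(movement_a, FASCISM_CLUSTER))  # True  (3 shared features)
print(falls_under(movement_b, FASCISM_CLUSTER))  # False (only 2)
```

Where the `minimum` falls, and which features belong in the cluster at all, are exactly the kinds of questions the paragraph above says the world does not settle for us.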

Sunday, December 20, 2009

What makes a sociological theory compelling?



In the humanities it is a given that assertions and arguments have a certain degree of rational force, but that ultimately reasonable people may differ about virtually every serious claim. An interpretation of Ulysses, an argument for a principle of distributive justice, or an attribution of certain of Shakespeare's works to Christopher Marlowe -- each may be supported by "evidence" from texts and history, and those who disagree need to produce evidence of their own to rebut the thesis; but no one imagines there is such a thing as an irrefutable argument to a conclusion in literature, art history, or philosophy.  Conclusions are to some degree persuasive, well supported, or compelling; but they are never ineluctable or uniquely compatible with the available evidence. There is no such thing as the final word, even on a well-formulated question. (Naturally, a vague or ambiguous question can always be answered in multiple ways.)

In physics the situation appears to be different.  Physical theories are about a class of things with what we may assume are uniform characteristics throughout the range of this kind of thing (electrons, electromagnetic waves, neutrinos); and these things come together in ensembles to produce other kinds of physical phenomena.  So we often believe that physical theories are deductive systems that attribute a set of mathematical properties to more-or-less fundamental physical entities; and then we go on to derive descriptions of the behavior of the things and ensembles made up of these entities.  And the approximate truth of the physical theory can be assessed by the degree of success its deductive implications have for the world of observable ensembles of physical things.  So it seems that we can come to fairly definitive conclusions about various theories in the natural sciences: natural selection is the mechanism of species evolution; gravitational force is the cause of the elliptical shapes of planetary orbits; the velocity of light is a constant.  And we are confident in the truth of these statements because they play essential roles within deductive systems that are highly confirmed by experiment and experience.

So what about the empirical domains of sociology or history? Is it possible for careful empirical and theoretical research to provide final answers to well-formulated questions in these fields? Is it possible to put together an argument in sociology on a particular question that is so rationally and empirically compelling that no further disagreement is possible?

There is a domain of factual-empirical questions in sociology where the answer is probably affirmative. What was the population of Detroit in 2000? What percentage of likely voters supported McCain in October 2008? What is the full unemployment rate among African-American young people in Ohio? When was the International Brotherhood of Teamsters recognized in Cleveland? There are recognized sources of data and generally accepted methods that would allow us to judge that a particular study definitively answers one or another of these questions. This isn't to make a claim of unrevisability; but it is to assert that some social inquiries have as much empirical decidability as, say, questions in descriptive ecology ("what is the range of the Helicinidae land snail in North America?") or planetary astronomy ("what is the most likely origin of the planetary body Pluto?").

But most major works in sociology do not have this characteristic. They do not primarily aim at establishing a limited set of sociological or demographic facts. Instead, they take on larger issues of conceptualization, explanation, interpretation, and theory formation. They make use of empirical and historical facts to assess or support their arguments. But it is virtually never possible to conclude, "given the available body of empirical and historical evidence, this theory is almost certainly true." Rather, we are more commonly in a position to say something like this: "Given the range of empirical, historical, and theoretical considerations offered in its support, theory T is a credible explanation of P."  And it remains for other scholars to advance a more comprehensive or better-supported theory, to undermine the evidence offered for T, or to provisionally accept T as approximately correct.  Durkheim put forward a theory of suicide based on the theoretical construct of anomie (Suicide).  He argued that the rate of suicide exhibited by a population is caused by the degree of anomie characteristic of that society; and he offered a few examples of how the theoretical construct of anomie might be operationalized in order to allow us to measure or compare different societies in these terms.  But his theory can be challenged from numerous directions: that it is monocausal, that it assumes that suicide is a homogeneous phenomenon across social settings, and even that it overstates the degree of consistency between "high anomie" and "high suicide" social settings.

Take Michael Mann's analysis of fascism (Fascists). He considers a vast range of historical and sociological evidence in arriving at his analysis. His theory is richly grounded in empirical evidence. But numerous elements of his account reflect the researcher's best judgment about a question, rather than a conclusive factual argument. Take Mann's view that "materialist" and class-based theories of fascist movements are incorrect.  I doubt that his richly documented arguments to this conclusion make it rationally impossible to continue to find support for the materialist hypothesis. Or take his definition of fascism itself. It is a credible definition; but different researchers could certainly offer alternatives that would lead them to weigh the evidence differently and come to different conclusions. So even such simple questions as these lack determinate answers: What is fascism?  Why did fascist movements arise? Who were the typical fascist followers? There are credible answers to each of these questions, and this is exactly what we want from a sociological theory of fascism; but the answers that a given researcher puts forward are always contestable.

Or take Howard Kimeldorf's analysis of the different political trajectories of dock-workers' unions on the East and West Coasts (Reds or Rackets?: The Making of Radical and Conservative Unions on the Waterfront). Kimeldorf provides an explanatory hypothesis about the differences between these unions on the East and West Coasts of the United States, and he offers a rich volume of historical and factual material to fill in the case and support the hypothesis.  It is an admirable example of reasoning in comparative historical sociology.  Nonetheless, there is ample room for controversy.  Has he formulated the problem in the most perspicuous way?  Are the empirical findings unambiguous?  Are there perhaps other sources of data that would support a different conclusion?  It is reasonable to say that Kimeldorf makes a credible case for his conclusions; but there are other possible interpretations of the facts, and even other possible interpretations of the phenomena themselves.

These are both outstanding examples of sociological theory and analysis; so the point here isn't that there is some important defect in either of them.  Rather, the point is that there is a wide range of indeterminacy in each of these examples: in the way in which the problem is formulated, in the basic conceptual assumptions that the author makes, and in the types and interpretation of the data that the author provides.  Each provides a strong basis, in fact and in theory, for accepting the conclusions offered.  But each is contestable within the general framework of scientific rationality.  And this seems to suggest that for the difficult, complex problems that arise in sociology and history, there is no basis for imagining that there could be a final and rationally compulsory answer to questions like "Why fascism?" and "Why Red labor unions on the Pacific Coast?"  And perhaps it suggests something else as well: that the logic of scientific reasoning in the social sciences is as close to arguments in the humanities as it is to reasoning in the physical sciences.  It is perhaps an instance of "inference to the best explanation" rather than an example of hypothetico-deductive testing and confirmation.