$\begingroup$

I know of two explanations in cognitive science for such seemingly irrational behaviour. Neither of them really justifies using the simple reward-maximizing model in economics.

Rule Rationality versus Act Rationality

Act Rationality is the notion that every decision an agent makes is made in order to maximize his utility. Rule Rationality is the notion that decision making follows rules. These rules are rational, in the sense that among all possible rules, the chosen rule is the one that on average (over choices made) maximizes the agent's utility. Importantly, not every decision is optimal.

In the example you stated, this works as follows. Usually, my actions towards others affect their actions towards me. In this specific game that is not the case, but people cannot take that into account because of their rule. Their rule says "Assume people are good, try to cooperate, and defect only as a last resort, or if the gain is very high". If they are told explicitly what the other player has chosen, then it is obvious that their choice cannot affect his (so they compute the utility of both choices and choose the higher one). If they're not told the other player's decision, then they assume their decision will affect his to some extent, because that is the way it usually works in the real world.
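A toy sketch of why such a rule can be rational on average even though it is suboptimal in this particular game. Everything here is an assumption made for illustration: the payoff numbers, the share of "real world" interactions in which the other person mirrors my move (`p_reciprocal`), and the chance that an independent partner cooperates (`q_coop`).

```python
# Hypothetical utilities for (my_move, other_move); higher is better.
PAYOFF = {
    ("C", "C"): 3,  # both cooperate
    ("C", "D"): 0,  # I cooperate, the other defects
    ("D", "C"): 5,  # I defect, the other cooperates
    ("D", "D"): 1,  # both defect
}


def reciprocal_utility(me):
    """Typical real-world interaction: the other person mirrors my move."""
    return PAYOFF[(me, me)]


def independent_utility(me, q_coop=0.5):
    """The experiment's game: the other's move does not depend on mine."""
    return q_coop * PAYOFF[(me, "C")] + (1 - q_coop) * PAYOFF[(me, "D")]


def average_utility(rule, p_reciprocal=0.8, q_coop=0.5):
    """Average utility of always playing `rule` across both kinds of situation."""
    return (p_reciprocal * reciprocal_utility(rule)
            + (1 - p_reciprocal) * independent_utility(rule, q_coop))


for rule in ("C", "D"):
    print(rule, independent_utility(rule), average_utility(rule))
```

Under these assumed numbers, defecting is the better act in the independent game, while "cooperate" is the better rule once the average is taken over all situations. That is the gap between act rationality and rule rationality.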

This explanation is closely related to the concept of 'Bounded Rationality', which assumes rational decisions subject to bounds on the amount of information or processing power available.

Relation to Homo economicus

This explanation contradicts the assumptions of H. economicus: actions are not taken to optimize a utility function in every situation. They are taken according to rules (which are themselves optimized, perhaps during evolution, to maximize average reward).

Unknown optimization function

People do choose optimally; they are just not optimizing what you think they are.

For example, suppose the agent also has some 'reputation' and worries about how his choice will affect it (in addition to minimizing the time spent in prison, in the prisoner's dilemma). Perhaps that can explain the result you mentioned. When the 'time in prison' outcome is clear (the other person's choice is known), it gets a high weight in the combined optimization problem. When the 'time in prison' outcome is not clear (the other person's choice is not known), it gets a smaller weight (no point in working too hard to optimize goals you can't predict), and the reputation result (I want to be perceived as a 'good'/'cooperating' person) is given more weight.
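One way to make this concrete is a weighted sum of two goals, with the weight depending on whether the other player's choice is known. This is only a sketch: the prison terms, the reputation values, the weights `w_known` and `w_unknown`, and the 50/50 guess about an unknown partner are all hypothetical numbers chosen for illustration, not estimates from any experiment.

```python
# Hypothetical prison terms in years for (my_move, other_move).
YEARS = {
    ("C", "C"): 2,
    ("C", "D"): 10,
    ("D", "C"): 0,
    ("D", "D"): 5,
}
MAX_YEARS = 10

# Hypothetical reputation payoff: cooperating makes me look 'good'.
REPUTATION = {"C": 1.0, "D": 0.0}


def prison_utility(me, other=None):
    """Utility in [0, 1] from avoiding prison; `other=None` means unknown."""
    if other is not None:
        return 1 - YEARS[(me, other)] / MAX_YEARS
    # Unknown partner: assume a 50/50 guess over his two moves.
    return 0.5 * (prison_utility(me, "C") + prison_utility(me, "D"))


def choose(other=None, w_known=0.9, w_unknown=0.3):
    """Pick the move that maximizes the weighted sum of both goals."""
    w = w_known if other is not None else w_unknown

    def total(me):
        return w * prison_utility(me, other) + (1 - w) * REPUTATION[me]

    return max("CD", key=total)


print(choose("C"), choose("D"), choose())
```

With these assumed weights, the agent defects whenever the other player's choice is known and cooperates when it is not, even though every choice maximizes the same kind of objective.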

This explanation is more 'rationalistic' by nature: It assumes behaviour is an optimization problem - we (the experimenters/scientists) just don't necessarily know the optimization function.

Relation to Homo economicus

This explanation is somewhat consistent with H. economicus. The problem is that we don't know the utility function in the general case. It can be very complex and take different factors into account in different situations. This means that the basic assumption of H. economicus (optimizing some utility function) is correct, but the model becomes less useful because the specific function being optimized is not known in general.