```{r, echo=FALSE, eval=FALSE}
library(bookdown); library(rmarkdown); rmarkdown::render("07-Modeling.Rmd", "pdf_book")
```
# DEA Modeling Issues and Questions
## Introduction
This section gives some general guidelines in developing DEA models. There is no single recipe for developing a model or diagnostic tool for evaluating the quality of an analysis. Therefore, it is still about as much of an art as it is a science.
## Model Scope
One way to approach DEA modeling is to examine a series of questions. *"What is the goal of the analysis?"* A DEA evaluation can be used in a variety of ways, but it is important to decide on a goal to guide the rest of the modeling. Goals include, but are not limited to, evaluating management, setting targets, and determining best practices.
*"How do you measure the goal of the analysis?"* There are a variety of possible measures in a DEA evaluation. One measurement may be to determine estimates of efficiency scores for each DMU. In other cases, ranking the DMUs to differentiate the most inefficient DMUs from the most efficient ones may suffice. An advantage to ranking is that it generally requires less DMUs than estimating efficiency scores for a given model size.
*"What are the decision making units?"* Deciding on which items to compare may be relatively straightforward after determining the goal.
*"What is the scope of the data set?"* The scope of the data set implies whether or not you are comparing DMUs within a single company, within an industry, or across industries. For example, in the area of warehouse evaluation, organizations such as General Motors or the Defense Logistics Agency may choose to evaluate their warehouses internally. These companies may have enough warehouses to perform a meaningful DEA evaluation but they would not be making use of comparisons against warehouses in other companies unless they include other companies. This means that they may not be able to identify units with better practices.
## Inputs and Outputs
*"What are the important factors related to the operation of the DMUs?"* Leaving out an important factor or including an irrelevant one can significantly change the results. For example, an analysis of warehouses that didn't include automation might not be meaningful.
*"Which relevant factors can be measured directly?"* For example, analyzing convenience store operations might include factors such as monthly sales and the number of gasoline pumps which can be easily measured.
*"Which relevant factors that can't be measured directly can be measured indirectly by a proxy?"* The aforementioned measure of automation in warehouses becomes quite tricky. For example, how do you count automated storage/retrieval systems and automatic guided vehicles? One plausible approach is to aggregate all of the automation equipment into a single dollar value. This raises obvious questions though of how to determine a dollar value such as whether to use a depreciated value, replacement cost, or rental price.
*"What are the inputs and outputs?"* Once the relevant factors are determined, the next step is to sort them into inputs and outputs. One initial way of sorting these is to determine which factors are goods and which are bads. A factor is a "good" if the DMU would consider higher values of that factor to be better. For example, monthly sales is a good for a store. In contrast labor as measured by full time equivalents would be considered a bad by a store. DEA requires that inputs be bads and outputs be goods. This would seem to resolve the issue of defining the model but remember that inputs should result in the production of outputs. In some cases, it may turn out that a factor should be classified as an input even though it is a good. In this case, a transformation may be applied to convert the good to a bad. One way to do this is to invert the factor. Another method is to subtract the factor from a constant.
*"Are the data for the inputs and outputs in the appropriate form?"* In other words, are the data values complete and nonnegative? If a DMU is missing a factor or its data is considered unreliable, it may be appropriate to exclude it from consideration by setting the virtual multiplier for that DMU to be zero for all other DMUs.
*"What are the important factors that are neither inputs nor outputs?"* For example, in the case of convenience store management, it may be decided to use years of experience as an explanatory variable for efficiency. Regression might then be used to relate efficiency scores and years of experience.
*"Does the DMU have full discretion over each of the inputs and outputs?"* In the case of convenience store operations, location might be an important input yet not something that the individual manager can change. In this case, perhaps a DEA model that allows for nondiscretionary inputs and outputs would help generate more realistic results.
*"Is there a reasonable number of DMUs for the model size that you specify in order to perform a DEA evaluation?"* In linear regression there is a strict rule that you need more data points than inputs. Unfortunately in DEA there doesn't exist a firm bound on the number of DMUs that are needed relative to model complexity. A commonly cited rule of thumb is that you need two or three times as many DMUs as inputs plus outputs. In general, you should also have more DMUs than the number of inputs times the number of outputs too.
*"Are there relationships between factors that can make the analysis more realistic?"* In the baseball batting example, there was an obvious dominance relationship between home runs and singles. In other applications, the relationships may be more ambiguous but still important. Bounding the relative factor weights has the effect of reducing the number of DMUs needed.
*"Can inputs and outputs be assumed to have continuous ranges?"* In the case of convenience store operations, if one input is the number of gas pumps, it is not a meaningful target to compare a store to a target store with 2.5 gas pumps. A gas pump is an inherently integer valued input. In this case, it may be necessary to constrain the targets to be integer valued.
## Model Structure
*"Is it appropriate to use multiple time periods for each DMU?"* A time series approach may be used in certain analyses to attempt to spot trends. One question that arises is whether or not to allow comparison between time periods and how many periods to allow comparison across. Allowing multiple periods can multiply the number of DMUs in your data set quickly and can help overcome problems with small data set size. It can also raise complications though. For example if, you are analyzing retail store sales, it may be very misleading to compare the Christmas season in a data with slower sales periods.
*"Which type of returns to scale should be used?"* The decision to assume CRS versus VRS is often arbitrary and not justified on the basis of the characteristics of the application. This issue deserves careful consideration in the modeling stage since it may have a significant effect on the results. Although CRS will help mitigate the problems of small data set size relative to model size, this is not an adequate justification for selecting CRS over other returns to scale assumptions.
*"Is noise a significant factor?"* The noise issue has already been discussed in detail but suffice it to say that it can be important. As an extreme point technique, DEA is susceptible to outliers. Noise can take the form of either natural random variation in the production function or measurement error. Complicated proxy measures may introduce more measurement error too.
## Validation of Results
Perhaps this is the most challenging and neglected topic in most DEA applications. Unlike regression or traditional statistical methods, where we can rely on a variety of model diagnostics such as $R^2$ or *p-values*, DEA offers few standard diagnostics for judging the quality of an analysis.
*"Are the results realistic?"* An important step of DEA modeling (and modeling in general) is to carefully examine the results to see if they are realistic. Just because DEA was run and numerical results were generated does not mean that they have value. Ideally there should be new insights and perhaps some surprising learnings from the analysis but they should also be realistic. For example, a variation of DEA that I analyzed years ago found that in baseball batting, Babe Ruth was a poor hitter in what was arguably the most dominant year in the history of the sport. Also, the analysis claimed that triples were consistently more valuable than home runs. Both of these criticisms invalidate the underlying model. For non-baseball fans, this would be equivalent to saying that Michael Jordan or LeBron James was not even as good as an average NBA starter and that a two point shot was more valuable than a three point shot.
*"Are the target DMUs feasible targets?"* Another question useful in testing a DEA evaluation is to try to consider how a low scoring DMU responds to their score. They may respond by saying that the model leaves out a critical aspect of their operations that differ from the other DMUs. For example, in the case of military depots, comparing them to commercial operations may find them to be overstaffed or overly automated for their regular operations. An important part of their mission could be to rapidly ramp up volume in time of war. The result is that it may not actually be meaningful to include them in the data set.
There remain fundamental modeling questions that have not been well addressed yet. One major area is statistical testing of DEA results.
This is a daunting list of questions that should be addressed in a DEA evaluation. In reality, these questions are largely interrelated, so it is important to make multiple passes through them. It may be appropriate to perform multiple analyses to determine the robustness of the results with respect to the modeling assumptions used.
## Common DEA Problems
*Overinterpreting the precision of the results.* Data that is accurate to only one or two digits cannot yield DEA efficiency scores that are reliable to three or more digits of precision.
*The kitchen sink model.* This refers to including everything possible, including the kitchen sink: a large collection of inputs and outputs used just because they are available, without considering whether they should be used.
*Relying on correlation matrices to select inputs and outputs.* High correlations among variables cause problems in regression but do not cause the same kind of problems in DEA.
*Not looking at the dual results for potential model debugging.* In other words, while the analyst might be implementing the envelopment model, the weights from the dual (multiplier) model may be insightful. Similarly, a multiplier model still generates the lambda values by duality, so why not look at those as well.
*Being obsessed with zero-valued weights in the multiplier model.*
*Getting locked into just one model without considering whether another model is a better fit.* For example, some people have only seen the input-oriented envelopment model, so they don't consider whether an output orientation or a multiplier model with weight restrictions might be a better fit.
*Not including an application area expert on the project.* DEA will produce numerical results that look highly precise, but this does not replace the need for a deep understanding of the application.
## Summary
Many of these same questions and issues apply to other DEA-based models that we will be discussing in later chapters.