1. Introduction

Empirical evaluation is a way of gaining knowledge through observation or experiment. It has become important in almost every field, including technology, where it is also called technology evaluation. In software development, a central question is which software is best suited to the job. Empirical evaluation helps to determine which technology, among the available alternatives, is the best one to use.

2. Tasks

  1. Why should software practitioners conduct empirical evaluations?


The purpose of conducting an empirical evaluation is to identify the best option in a way that is unbiased and based on research. It begins with a hypothesis and then reports a result grounded in facts and research findings. In software engineering, empirical evaluation mainly pursues two goals. The first is to expand our knowledge of what is most useful in software engineering. The second is to learn which of the available solutions is the most cost effective. Several areas of artificial intelligence also rely on empirical evaluation: search algorithms are benchmarked in specific domains, planning for software is informed by empirical evaluation, and machine learning algorithms are tested on real data sets. More generally, the effectiveness, efficiency and usability of a system need to be established through empirical evaluation.

Empirical evaluation focuses on studying and collecting observations of the real world. As in any evaluation, the initial claim is the central point: empirical evaluation depends on the claim made at the beginning of the evaluation process (Elmas and Chen), and it is well suited to assessing whether that claim holds. An empirical evaluation succeeds when its results can be generalized, which depends on the sampling strategy. This supports the development of new techniques and methods rather than continued reliance on old ones. The evaluation starts with an assumption, and that assumption is key to the success of the process. When software engineers look for cost-reducing solutions, asymptotic bounds are often used for evaluation, but they have not been successful in this domain (Lanubile, 1997). Empirical evaluation, by contrast, copes well with complexity and is good at convincing people to adopt a cost-saving method.
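To make the idea of benchmarking search algorithms concrete, the following is a minimal sketch of a small empirical comparison. It is only an illustration under stated assumptions: the two search functions, the list size, the sample count and the fixed random seed are all choices made for this example and do not come from the cited sources. The sketch counts probes instead of measuring time so that the result is repeatable.

```python
import random


def linear_search(items, target):
    """Scan left to right; return (index, number of probes)."""
    probes = 0
    for index, value in enumerate(items):
        probes += 1
        if value == target:
            return index, probes
    return -1, probes


def binary_search(items, target):
    """Search a sorted list; return (index, number of probes)."""
    low, high = 0, len(items) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, probes


def mean_probes(search, items, targets):
    """Average number of probes a search needs over a sample of targets."""
    return sum(search(items, t)[1] for t in targets) / len(targets)


def run_benchmark(size=1000, samples=200, seed=42):
    """Compare both searches on the same sample (all values are assumptions)."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    items = list(range(size))  # already sorted, as binary search requires
    targets = [rng.randrange(size) for _ in range(samples)]
    return {
        "linear": mean_probes(linear_search, items, targets),
        "binary": mean_probes(binary_search, items, targets),
    }


if __name__ == "__main__":
    print(run_benchmark())
```

The claim here would be "binary search needs fewer probes than linear search on sorted data", and the measured averages are the empirical evidence for or against it. How far the result generalizes depends on how the targets were sampled, which is the sampling point made above.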

  2. What factors might make it difficult to conduct an empirical evaluation of the scenario?

Explain five factors and relate each of them to the scenario.

  1. Looking at the two options without knowing the requirements cannot tell us which option is best (Salo and Abrahamsson, 2004). In this scenario my friend did not state his requirements, yet the empirical evaluation has to establish whether the NetBeans Integrated Development Environment or the BlueJ environment would suit him better.
  2. Even a desktop evaluation would not be enough for this scenario. A survey would also be required, and a survey needs a set of important points to ask about.
  3. These development environments are not suitable for every sort of application. My friend has only said that he plans to develop a very simple application, so it is very hard to choose between the NetBeans Integrated Development Environment and the BlueJ environment for him.
  3. Using the Fenton and Pfleeger model, why is it hard to show that an Integrated Development Environment (such as those stated in the scenario) leads to improvements in the software project, or in the quality of the software produced?


A survey study is also helpful when we need to gather the views of users and experts. However, the information a survey provides is limited to the questions that respondents are asked. A survey is best conducted when we are sure about our requirements.


Let us first look at what the scenario tells us:

  1. He has not studied technology as a subject.
  2. He is aware of the technologies available for developing a Java application.
  3. He gives no details about the application.
  4. He wants to choose between the NetBeans and BlueJ environments.


Now we turn to the model of software engineering presented by Fenton and Pfleeger, which is shown in the figure below.

The model contains six basic components:

  1. Activity
  2. Inputs
  3. Outputs
  4. Controls
  5. Resources
  6. Time

We will now consider whether the given scenario supplies each of these components.


The first component of the model is the activity. My friend has said that he is interested in developing an application, so the activity is present. However, we are not told what sort of activity it will be, and this much information about the application is not enough. Without knowing the type of activity, we cannot support a claim that, for example, NetBeans is the best option for him. What inputs will the activity take, and what controls and resources will be employed to produce the output? None of these components is discussed in the scenario, so we cannot predict which development environment would improve the application. Because the scenario does not supply what the Fenton and Pfleeger model requires, the environment that would lead to an improvement cannot be chosen.
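The gap analysis above can be sketched as a simple checklist. This is only an illustrative sketch: the class name `ProcessModel`, its field names and the `missing` helper are assumptions made for this example, taken from the six components listed earlier, and are not part of Fenton and Pfleeger's own notation.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ProcessModel:
    """One slot per component of the model; None means 'not stated'."""
    activity: Optional[str] = None
    inputs: Optional[str] = None
    outputs: Optional[str] = None
    controls: Optional[str] = None
    resources: Optional[str] = None
    time: Optional[str] = None

    def missing(self) -> List[str]:
        """Names of the components the scenario leaves unspecified."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# The scenario as described in the essay: only the activity is mentioned,
# and even that without any detail about its type.
scenario = ProcessModel(activity="develop a simple Java application (type unknown)")

print(scenario.missing())
```

With five of the six components unspecified, there is nothing measurable to compare between the two environments, which is the difficulty the paragraph above describes.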


  4. Critically discuss the benefits and limitations of the case study and the survey study to answer the friend’s question.

My friend is looking for the best development environment for his application. Two different kinds of study could help answer his question: a case study and a survey study. Each has benefits and limitations. We discuss the case study first.

  1. Case Study

There are benefits and limitations to the case study in empirical evaluation.

  • Benefits

A case study is a continuous process that runs alongside the whole software engineering process. One benefit is that it is conducted in detail and helps in evaluating the process (Weibelzahl and Weber). Another is that it provides depth of information.

  • Limitations

Case studies have limitations as well. The first is that it is very hard, if not impossible, to control the situation being studied. The second is that if the situation studied is not the current one, the case study has to be extended.

  2. Survey Study

Having looked at the benefits and limitations of the case study, we now consider those of the survey study.

  • Benefits

The first benefit of a survey study is that it is not limited to a single situation or case: one survey can cover several situations. The second is that it is not a repetitive process. Where the case study gives depth of information, the survey study gives breadth.

  • Limitations

The first limitation of a survey study is that it lacks detail, because it highlights only the points considered important. Like a case study, a survey study cannot control the situation being studied (Weibelzahl and Weber). The other limitation is that if the situation occurs again, the survey has to be carried out again.


By adopting one of these studies, we can conclude which development environment should be used. The survey study seems the more suitable here, since it can also draw on the suggestions of experts. The survey study has shown that the BlueJ development environment is better for a person who is new to computing: it is good for a beginner because it makes development easy. Since I am not sure about the requirements, it is better to choose a survey study, which covers more situations and will give me better results for the selection. Even so, I cannot rely on the survey study alone because, as noted above, the case study gives depth of information while the survey study gives breadth.


  5. Define the terms “claim”, “argument”, “evidence” and “empirical evidence”, and explain how they relate to each other.


Claim

Before any evaluation is conducted, a hypothesis is presented that has to be tested through research and different methodologies. This initial hypothesis is known as the claim. An evaluation cannot begin with its conclusion, so the claim serves as its starting point.


Argument

Once an empirical evaluation has started from a hypothesis, arguments follow, which may count for or against it (Wood, Brooks, Miller and Roper, 1995). Arguments should not be biased but based on research findings; their purpose is not simply to write in favor of the hypothesis.


Evidence

Evidence is what is used in the evaluation process as proof that something is true. It supports the arguments, enables a proper evaluation, and helps to show whether the claim is correct.



Empirical evidence

Empirical evidence is knowledge obtained from observation and experiment (Rainer and Beecham, 2008). It may justify the claim, or it may refute it; which of the two happens depends on the evidence itself, because the claim is only the starting point of the empirical evaluation.
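One way to picture how the four terms relate is as a small data structure: a claim is backed by arguments, each argument draws on evidence, and some of that evidence is empirical. The sketch below is only an illustration under that reading. The class names, the `empirical_balance` helper and all of the example entries are hypothetical and are not findings from any real study.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Evidence:
    description: str
    empirical: bool       # True if obtained by observation or experiment
    supports_claim: bool  # False if it counts against the claim


@dataclass
class Argument:
    reasoning: str
    evidence: List[Evidence] = field(default_factory=list)


@dataclass
class Claim:
    statement: str
    arguments: List[Argument] = field(default_factory=list)

    def empirical_balance(self) -> Tuple[int, int]:
        """Count empirical evidence items for and against the claim."""
        supporting = opposing = 0
        for argument in self.arguments:
            for item in argument.evidence:
                if not item.empirical:
                    continue  # opinion is evidence, but not empirical evidence
                if item.supports_claim:
                    supporting += 1
                else:
                    opposing += 1
        return supporting, opposing


# Hypothetical placeholder data, not real findings.
claim = Claim(
    statement="BlueJ suits a beginner better than NetBeans",
    arguments=[
        Argument(
            reasoning="Beginners need a simple environment",
            evidence=[
                Evidence("observed beginners completing a task in BlueJ", True, True),
                Evidence("a colleague's opinion", False, True),
            ],
        ),
        Argument(
            reasoning="Larger projects need more tooling",
            evidence=[
                Evidence("observed a project outgrowing BlueJ", True, False),
            ],
        ),
    ],
)

print(claim.empirical_balance())
```

The structure reflects the point made above: the claim is fixed at the start, while the empirical evidence attached to the arguments may end up supporting it or refuting it.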


  6. Briefly explain the difference between Evidence Based Software Engineering (EBSE) and Systematic Literature Reviews (SLR).

Evidence Based Software Engineering (EBSE) focuses on collecting and analyzing all the empirical data obtained from observations or experiments (Kitchenham, Dyba and Jorgensen, 2004). Its aim is to produce unbiased results that form a sound basis for evaluation. Systematic Literature Reviews (SLR), on the other hand, are the core tool used within EBSE. An SLR makes it easier to find answers to empirical questions. It is a secondary study: it draws together the information and results reported by different research studies and experiments, and it aims to provide unbiased results.

  7. Critically discuss the limitations of EBSE as a methodology for evaluating Integrated Development Environments.


There are certain limitations of evidence based software engineering. It has been very effective and carries a number of advantages that make it adaptable to many kinds of research (Kitchenham, Dyba and Jorgensen, 2004), but its limitations mean that it does not fit every situation. The first limitation is that EBSE is not for the lone researcher: it needs more than one researcher and cannot be completed by the efforts of one person. In this scenario, I alone have to examine both development environments and decide which to suggest to my friend. The second limitation concerns bias. EBSE relies on several researchers to leave little room for biased opinion, but if the only help I had came from a friend, our shared preferences could raise rather than lower the chance of a biased result, and the evaluation would not succeed. The third limitation is that EBSE is not as simple as other software evaluations; it takes a lot more effort and hard work to achieve a successful evaluation. As a lone researcher, I would need extra time to conduct it and help my friend choose a development environment. The final limitation is that EBSE is not compatible with the requirements of short papers. In this scenario our requirement is only to choose the better development environment, which is too narrow a task for EBSE. For these reasons, EBSE is not suitable for this scenario.


  1. Lanubile, F. 1997. Empirical Evaluation of Software Maintenance Technologies. Department of Computer Science, University of Maryland.
  2. Wood, M., Brooks, A., Miller, J. and Roper, M. Empirical Evaluation Of Software Quality Attributes. Department of Computer Science, University of Strathclyde.
  3. Salo, A. and Abrahamsson, P. 2004. Empirical Evaluation of Agile Software Development: The Controlled Study Approach. VTT Technical Research Centre of Finland.
  4. Weibelzahl, S. and Weber, G. Advantages, Opportunities and Limits of Empirical Evaluation: Evaluating Adaptive Systems. University of Education Freiburg, Germany.
  5. Whiteson, S. and Littman, M. 2010. Empirical Evaluations in Reinforcement Learning.
  6. Fernandez, N.C., Daneva, M., Sikkel, K., Wieringa, R., Dieste, O. and Pastor, O. 2009. A Systematic Mapping Study on Empirical Evaluation of Software Requirements Specifications Techniques. Third International Symposium on Empirical Software Engineering and Measurement.
  7. Wainer, J., Barsottini, C.G., Lacerda, D and Magalhaes, L.R. 2009. Empirical Evaluation in Computer Science. Research published by ACM.
  8. Elmas, T. and Chen, X. The Core Evaluation of a Software Engineering Research Result Must Be Empirical in Nature.
  9. Rainer, A. and Beecham, S. 2008. A follow-up empirical evaluation of evidence based software engineering by undergraduate students. School of Computer Science, University of Hertfordshire, U.K.
  10. Szajna, B. 2001. Empirical Evaluation of the Revised Technology Acceptance Model. Department of Management, Texas Christian University.
  11. Briand, L.C. 2011. Empirical Evaluation in Software Engineering: Role, Strategy, and Limitations.
  12. Wohlin, C., Runeson, P., Host, M., Ohlsson, C., Regnell, B. and Wesslén, A. 2000. Experimentation in Software Engineering: an Introduction. Retrieved 24th December 2012 from http://www.citeulike.org/group/1374/article/3945036.
  13. Devanbu, P., Karstu, S., Melo, W. and Thomas, W. Analytical and Empirical Evaluation of Software Reuse Metrics. Retrieved 23rd December 2012 from http://dl.acm.org/citation.cfm?id=227760
  14. Basili, V.R., Selby, R.W. and Hutchens, D.H. 1986. Experimentation in software engineering. Retrieved 24th December 2012 from http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6312975&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6312975.
  15. Sjoberg, D.I.K., Dyba, T. and Jorgensen, M. 2007. The Future of Empirical Methods in Software Engineering Research. Retrieved 24th December 2012 from http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4221632&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D4221632.
  16. Kitchenham, B.A., Dyba, T. and Jorgensen, M. 2004. Evidence-based Software Engineering. Retrieved 24th December 2012 from http://www.cs.colostate.edu/~bieman/CS614/Papers/Kitchenham-etal.icse04.pdf.