Articles
Received: 01/10/2020
Approved: 01/01/2021
Juan David Pinto-Corredor
Vanessa Agredo-Delgado
Pablo H. Ruiz
Cesar A. Collazos
Usability evaluation comprises a set of methods for analyzing the quality in use of a system at different stages of the development life cycle. There is a wide variety of evaluation methods (EMs), each using particular techniques to measure different aspects, and the choice among them depends not only on the question being asked but on multiple factors. The problem arises when deciding which of the existing EMs are appropriate for evaluating the usability of Virtual Learning Environments (VLEs), given that there are no combined or specific evaluation methods for this type of software that yield a complete and consistent evaluation covering factors such as effectiveness, efficiency, satisfaction, and reasonable times, among others. This raises the following question: which combination of usability EMs is appropriate to apply in VLEs? This paper presents the initial process leading to the construction of a guide, derived from a combination of EMs, for evaluating usability in VLEs. The process was carried out, initially, through a bibliographic analysis of existing usability EMs for interactive systems and a comparison among them, from which the first version of the combination was obtained; based on this version, a set of metrics was defined that will be applied to the VLE under study and will allow selecting the methods that are useful for evaluating usability in this context. The result of applying these metrics will be the combination of usability EMs that will form the VLE evaluation guide in upcoming research.
Due to the enormous growth of the Internet in recent decades, online education has become a strong alternative to traditional education. Likewise, educational institutions use available technologies and advances to deliver information to a growing audience. As the proposals and modalities of online education systems grow, so does the number of people who use them, making it necessary to consider the diversity of people's needs and characteristics when designing Virtual Learning Environments (VLEs) [1]. This contributes to designing and building online education systems that people can use in a simple, effective, and efficient way, providing a positive user experience. Because the audience using these systems keeps growing, the User Experience (UX) is a fundamental part of a VLE's success [2]. UX refers to "how people feel about a product and their satisfaction when they use it, look at it, hold it, open it or close it" [2]. UX covers different aspects related to software product quality, such as accessibility, emotionality, and usability, among others [3]. In this sense, the current research focuses exclusively on the UX attribute of "usability", and in particular on "ease of learning", defined as how quickly a user who has never seen an interface can learn to use it well and perform basic operations: how long does it take a typical user of the community to learn to use the relevant commands for a set of tasks? [4]. Specifically, the focus is the study of VLE usability.
Usability evaluation has been defined as the activity comprising a set of methods that analyze the quality in use of an interactive system at different stages of the development life cycle [5]. It is necessary to perform usability evaluation to validate that the final product meets the requirements and is easy to use. The evaluation's main objectives are to assess the scope of the system's functionality and accessibility, to evaluate the user's experience during interaction, and to identify specific problems [6]; these are the objectives pursued when evaluating a VLE with a guide based on a combination of methods. To perform usability evaluation there are different Usability Assessment Methods (UAMs), whose applicability depends on variables such as cost, time availability, and human resources, among others [7]. Hence, choosing methods to evaluate VLE usability is not an easy task [8]. A series of UAMs can be applied to a VLE, but the concern lies in how precise the resulting information is and in which combination of methods to use. Similarly, there is no standardization regarding what, how, and when to perform usability evaluation; methods have been developed and used in isolation, with specific criteria to evaluate a particular product [9]. Usability assessment methods have strengths and weaknesses, and each focuses on evaluating certain aspects or usability requirements. It is therefore advisable to combine them in an evaluation so that they complement each other's strengths and cover a greater number of evaluation aspects [10]. The selection and combination of evaluation methods will depend on financial and time constraints, the phase of the development cycle, and the nature of the system under development [11].
Based on the above, the problem arises when deciding which of the existing evaluation methods, or which combination of them, is appropriate for evaluating VLE usability so that the evaluation is carried out completely and consistently, yielding concrete results on usability and considering factors such as effectiveness, efficiency, satisfaction, and reasonable times, among others [11]. For this reason, the following research question emerges: which of the existing UAMs are appropriate to apply in Virtual Learning Environments? This paper therefore focuses on the study of a set of methods for evaluating usability in VLEs. These methods, after being selected, characterized, and analyzed, will constitute a new combined method for evaluating VLE usability, capable of providing more complete and integral usability information than applying evaluation methods indiscriminately and independently. This paper is structured as follows: Section 2 provides a theoretical context for the relevant research topics; Section 3 covers work related to usability evaluation in VLEs; Section 4 describes the process used to build the first version of the UAM combination; and Section 5 presents conclusions and future work.
Important theoretical references for the process of developing the guide for usability evaluation in Virtual Learning Environments are outlined below:
VLEs belong to the set of computer applications designed for online educational purposes, which aim to achieve educational objectives by providing tools that facilitate user and course management, communication processes, evaluation, collaboration, and content distribution [11]. They offer a series of functionalities to support teaching and learning processes, which can be deployed through a software tool according to the needs of each specific context [12].
The term User Experience (UX) refers to "how people feel about a product and their satisfaction when they use it, look at it, hold it, open it or close it" [2]. There are different definitions of UX used by professionals in the HCI (Human-Computer Interaction) area; one of the most prominent is the definition of the ISO 9241-210 standard [13]: "a person's perceptions and responses resulting from the use of a product, system, or service". UX covers different aspects related to software product quality. The ISO/IEC 25010 standard [13] considers, in a general way, the following UX aspects: accessibility, dependability, emotivity, playability, and usability, among others.
The term usability is generally defined as ease of use, whether of a web page, a computer application, or any other system that interacts with a user [15]. It is one of the most important quality attributes of web applications, alongside reliability and security [14], and it determines the user's satisfaction when interacting with the system. A usable system, and constant improvements to it, lead to a significant increase in the quality of the user's experience.
Usability evaluation has been defined as the activity comprising a set of methods that analyze the quality in use of an interactive system at different stages of the development life cycle [15]. It is necessary to perform usability evaluation to validate that the final product meets the requirements and is usable [7]. Usability evaluation is a fundamental part of the iterative approach to software development, because evaluation activities can produce design solutions to be applied in the following development cycle or, at least, greater knowledge about the nature of the detected interaction problems [9].
UAMs have become an interesting subject of study for usability researchers, owing to their application characteristics, the variety of existing methods, and the results they generate [15]. They allow the evaluation of usability characteristics such as ease of learning, ease and efficiency of use, ease of remembering how the system works, and the frequency and severity of errors [16].
Below, some related works are presented that justify the need for the research presented in this paper (also analyzed in [17]). Otaiza [18] presents a study of UAMs for transactional web applications, contrasting their characteristics and generating a methodological evaluation proposal to obtain the largest amount of relevant information about the usability of this kind of application.
In the same way, the authors of [19] examine usability evaluation methods for e-learning, compare them, and propose a set of criteria that should be consulted when choosing the appropriate method to evaluate the usability of e-learning systems. The research shows that none of the examined methods allows an integral usability evaluation of e-learning platforms, and that none of them addresses all the specific topics relevant to learning systems and modules.
In [15], methodological criteria are presented to evaluate the usability of Course Management Systems (CMS). The evaluation was carried out by combining different methods and instruments with potential users of the platform: a group of teachers and language students. Traditional usability evaluation methods were used and mixed, and some new ones were created to evaluate not only the elements that make up usability but also the functionality and pedagogical aspects of the CMS.
In [20], a model to evaluate VLE quality is proposed, with usability as its central axis. The model is called MUSA because it is based on usability, and it is oriented toward evaluating products in use. Its general ideas rest on a strategy of four levels, or evaluation layers, which proceed from the general to the particular, where the definitions of usability in terms of attributes and heuristics form the core of the model.
Similarly, [21] is a research effort focused on analyzing the usability of virtual learning environments for undergraduate university students, emphasizing psycho-pedagogical aspects that allow evaluating both the quality of the contents and the system that contains them. The implications of students' frustration for their cognitive process are analyzed, establishing this emotion as an immediate consequence of bad interface design.
This work follows a research methodology based on multi-cycle action research with bifurcation [22]. The strategy starts with an initial research cycle in which three problems are identified: conceptual, methodological, and evaluative. This allows the work to be divided into three research cycles: a conceptual cycle, a methodological cycle, and an evaluation cycle.
In this cycle, a contextual analysis is carried out to delimit the problem to be studied. The cycle has three research phases. The first is a literature study in which virtual learning environments, user experience, usability, usability evaluation, and the characteristics, attributes, or elements related to usability assessment methods are analyzed through an extensive literature review. The second phase is the identification of the UAMs appropriate for execution in VLEs, where, based on the literature study, a candidate set of methods is established to make up the combination proposed in this research. The final phase is the definition of the activities, resources, and responsible person for each of the selected methods.
Besides, the methods reviewed and analyzed in the literature, and the resulting candidate combination of UAMs for VLEs, are examined with respect to what is presented in the project "Usability integration in the software development process framework" [3].
As evaluation methods, and following the research carried out by Muñoz et al. [23], three main groups were analyzed: inspection, inquiry, and testing. The methods of each group were compared with each other in order to determine their advantages and disadvantages, analyzing their most relevant characteristics: the development stage at which the method should be used, the place where it is performed, whether it generates quantitative data, whether it can be done remotely, the time it takes to carry out, the number of evaluators required, the number of users required to execute it, whether it can be performed automatically, and whether the method analyzes usability characteristics such as intelligibility, learnability, operability, errors, esthetics, and accessibility. The following related works were considered: [24], [18], [19], [15], [20], [21].
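To make these comparison dimensions concrete, the following minimal sketch (ours, not part of the original study; all field names and example values are illustrative) shows how one row of a Table 1/2/3-style comparison could be modeled in Python:

```python
from dataclasses import dataclass, field

@dataclass
class MethodProfile:
    """One row of a Table 1/2/3-style comparison of evaluation methods."""
    name: str
    development_stage: str   # stage where the method should be used
    place: str               # e.g., "laboratory" or "field"
    quantitative_data: bool  # does it generate quantitative data?
    remote: bool             # can it be performed remotely?
    time_required: str       # rough effort: "low", "medium", "high"
    evaluators_required: int # number of evaluators needed
    users_required: int      # number of users needed to execute it
    automatable: bool        # can it be performed automatically?
    usability_aspects: set = field(default_factory=set)

# Example row (illustrative values, not the paper's Table 1):
heuristic = MethodProfile(
    name="Heuristic evaluation",
    development_stage="design",
    place="laboratory",
    quantitative_data=False,
    remote=True,
    time_required="low",
    evaluators_required=3,
    users_required=0,
    automatable=False,
    usability_aspects={"learnability", "operability", "errors", "esthetics"},
)
print(heuristic.name, sorted(heuristic.usability_aspects))
```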
From the comparison of the inspection methods shown in Table 1, it can be said that, in certain respects, heuristic evaluation has advantages over the other methods, mainly because of the ease of carrying it out, which the other methods do not match. For action analysis and standards inspection, experts of the highest level are required, and the walkthroughs (cognitive and pluralistic) require task-definition methodologies and certain training, characteristics that increase the complexity of carrying them out. However, the potential of the inspection methods is not in doubt: if the time and the necessary conditions to carry them out are available, good results can be obtained [23]. The comparison shows that, with respect to factors such as time, equipment, and the required level of expertise, heuristic evaluation remains the simplest method to perform. The characteristics of the cognitive walkthrough are very similar to those of heuristic evaluation, except for the required level of expert experience. Considering the aforementioned factors, standards inspection and action analysis become the most complex methods to perform, but other factors set them apart favorably from the other methods, such as the type of data they obtain and the evaluators needed to carry them out.
Table 1 convention: Heuristic Evaluation: 1, Cognitive Walkthrough: 2, Action Analysis: 3, Pluralistic Walkthrough: 4, Formal Inspection: 5, Standards Inspection: 6.
Thus, it is possible to establish certain comparisons among the test methods, shown in Table 2. The interrogation methods (questionnaires and interviews) are the simplest test methods to perform. Their characteristics allow satisfactory results to be obtained regarding the system's usability with few economic resources and a preparation that does not take much time. The interrogation methods aim to obtain subjective information about the system under evaluation, in many cases gathering information that cannot be collected through other evaluation methods.
Table 2 convention: Focus Group: 1, Thinking Aloud: 2, Constructive Interaction: 3, Questionnaires: 4, Interviews: 5, Surveys: 6, Formal Experiments: 7, Use Recording: 8, Performance Measurement: 9, Driver's Method: 10, Retrospective Testing: 11.
Finally, it is also possible to establish certain comparisons among the inquiry methods, shown in Table 3. Contextual inquiry offers a deep understanding of the user's work, but it can only be used in the early stages of development, and it also generates so much information that assimilating and analyzing it becomes difficult. Participatory inquiry has the advantages that it does not waste users' time, that it can be carried out remotely, and that it does not require experts to use it, although it can be time-consuming when dealing with complex system tasks.
Each UAM evaluates specific usability aspects, and choosing among them to obtain the best process for evaluating usability [25] is influenced by different factors, such as time, simplicity, the type of results, the phases of the development cycle, economic resources, and the number of users and experts, among others.
Due to the large number of UAMs, it is necessary to select a smaller set of them as the object of study in this research [3]. Hence, the following criteria were taken as a reference: need for training, closeness to Software Engineering, presence of the user, applicability, contribution versus effort, and representativeness, rating each as: of little use, useful, or very useful. Based on these criteria, the UAM classification parameters were considered and determined as shown in Tables 1, 2, and 3.
The selected inspection methods are:
• Heuristic evaluation
• Cognitive walkthrough
The UAMs pluralistic walkthrough, standards inspection, action analysis, and formal inspection are not considered in this project.
As shown in Table 1, the pluralistic walkthrough is not considered because its simulation is impractical and because the number of participants per session makes the analysis difficult. Standards inspection is not considered either, due to the broad knowledge of standards it demands (a high level of training) and its failure to consider the actions to be evaluated. Action analysis was not selected mainly because it requires a higher-level expert, which is expensive for most organizations. Formal inspection was also discarded because of the need for highly experienced evaluators and a large team, which increases the cost of applying it.
Below, the selected test methods are mentioned:
• Formal experiments
• Questionnaires and interviews.
• Constructive interaction
• Driver's method
The UAMs thinking aloud, use recording, performance measurement, and retrospective testing are not considered in this research. Based on the information in Table 2, thinking aloud is not considered because it interferes with the user's normal behavior, which influences the interaction with the system. Use recording is not considered because performing it requires a high level of training on the part of the evaluators, the effort to set up the equipment is high, and, in addition, it is indicated mainly for analyzing websites (a low level of applicability). Performance measurement was not selected because it does not ensure that the obtained measure targets usability; it also does not use subjective information such as opinions, attitudes, and satisfaction, and the environment used is not natural for users, which can distort their performance. Retrospective testing was not selected because it takes at least twice as much time as any other method.
Inquiry methods selection
The inquiry methods will not be considered for the combination of VLE usability evaluation methods because they were discarded in the related works [24], [18], [19], [15], [20], [21].
It should be highlighted that this initial selection of UAMs is linked to fundamental usability aspects and to different variables that must be considered during development. The chosen methods can be applied correctly, effectively, and simply. However, these methods do not take into account the characteristics specific to a VLE, and for this reason they do not evaluate the usability of those features, being mainly focused on general applications. Thus, it is expected that the usability evaluation in VLEs will subsequently be carried out under the selected usability metrics, which will yield the UAMs most appropriate for evaluating usability in VLEs, since those evaluations consider the usability characteristics of VLEs.
The objective of this cycle is to design an evaluation guide based on a formal combination of methods for usability evaluation in virtual learning environments. To do so, the following activities were carried out: a literature review to choose a set of metrics that enables the execution of the UAMs chosen in the previous cycle and the selection of those applicable to VLEs; the selection of the VLE that is the object of study; the execution of the UAMs in order to choose the methods that will make up the combination for usability evaluation in VLEs; and finally, as a subsequent activity, the preparation of the guide that will be validated in the evaluation cycle.
For the analysis of results, it is necessary to define a set of metrics that makes it possible to objectively measure the results obtained from executing the UAMs on the object of study. For this, after an observation process and a literature review [26], a series of metrics was obtained from the different evaluation and execution methods, grouped into the following characteristics (a sketch of this catalog follows the list):
Feature N° 1: Usability problem detection
• Total number of identified problems
• Number of critical/severe problems
• Number of frequent problems
• Number of non-critical problems
• Number of problems per functionality
Feature N° 2: Human resources
• Number of experts/evaluators
• Number of users
• Number of people involved
• Experts'/evaluators' experience (in years)
Feature N° 3: Equipment
• Required amount of software tools / technologies
• Required amount of hardware devices
• Required amount of materials
Feature N° 4: Time
• Time used to complete a task
• Time invested in recovering from errors
• Time used to complete the method
• Time used to complete the planning stage
• Time used to complete the execution stage
• Time used to complete the analysis of the results
Feature N° 5: Task
• Number of proposed tasks
• Number of completed tasks
• Number of completed tasks per user profile
• Percentage of completed tasks
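As a reading aid, the catalog above can be represented as a simple grouped structure. The following sketch is only illustrative; the identifiers are our own shorthand for the metrics listed above, not names defined by the study:

```python
# Preliminary metric catalog grouped by feature (identifiers are our shorthand).
METRICS = {
    "problem_detection": [
        "total_problems", "critical_problems", "frequent_problems",
        "non_critical_problems", "problems_per_functionality",
    ],
    "human_resource": [
        "num_experts", "num_users", "num_involved", "expert_experience_years",
    ],
    "equipment": [
        "num_software_tools", "num_hardware_devices", "num_materials",
    ],
    "time": [
        "task_time", "error_recovery_time", "method_time",
        "planning_time", "execution_time", "analysis_time",
    ],
    "task": [
        "proposed_tasks", "completed_tasks",
        "completed_tasks_per_profile", "completed_tasks_pct",
    ],
}
```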
After defining the preliminary set of metrics, a survey was drawn up in order to identify, according to the experience and knowledge of a group of experts, the most relevant metrics: those that would allow choosing the UAMs for VLEs and carrying out the analysis of results. The survey was developed following the SUS (System Usability Scale) format [27], so each question has 5 response options. It was answered by 10 experts in usability evaluation of interactive systems (each of whom performs at least 3 evaluations per year).
Once the survey results were collected and processed (including averages and standard deviations), the most relevant metrics were identified according to their high averages. These are the metrics rated as "important" and "very important" based on the experience of the participants who completed the survey. The identified metrics correspond to the characteristics of usability problem detection, human resources, and time. However, when comparing the UAMs, only the metrics generated by the methods themselves should be considered, so the human resource metrics will not be used as criteria to discriminate among the UAMs under study. The reason is that these metrics are not related to the evaluation method itself but to the particular test session in which it is used, which is different. For example, the number of people involved in executing a method (the "number of people involved" metric) should not be a criterion for comparing several UAMs, because that would attribute to the method a value that it does not generate itself (it reflects the requirements of the method or of the work around it). The same happens with the experts'/evaluators' experience metric (in years), since the experience of the evaluators participating in a method reveals nothing about the method.
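As an illustration of this processing step, the following hedged sketch computes the average and standard deviation per candidate metric and applies a selection threshold. The ratings and the 4.0 cut-off are invented for the example, not data from the study:

```python
import statistics

# Hypothetical 5-point ratings (1 = not important .. 5 = very important)
# given by the 10 surveyed experts; one list of scores per candidate metric.
ratings = {
    "total_problems":          [5, 4, 5, 5, 4, 5, 4, 5, 5, 4],
    "expert_experience_years": [3, 2, 4, 3, 3, 2, 3, 4, 2, 3],
    "planning_time":           [4, 5, 4, 4, 5, 4, 5, 4, 4, 5],
}

THRESHOLD = 4.0  # assumed cut-off separating "important"/"very important"

for metric, scores in ratings.items():
    mean = statistics.mean(scores)
    spread = statistics.stdev(scores)
    verdict = "selected" if mean >= THRESHOLD else "discarded"
    print(f"{metric}: mean={mean:.2f}, stdev={spread:.2f} -> {verdict}")
```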
The selected metrics correspond to base (or direct) measures according to measurement theory, which indicates that they do not depend on any other measure and that their form of measurement is a measurement method [28]. On the other hand, the metrics belonging to the usability problem detection feature are associated with an absolute scale type [29], because there is only one possible way to measure them: counting; while the time feature metrics are associated with a ratio scale type [30], which has a fixed reference point, zero (no value can be less than zero).
Once the measurement process is done, the metric values do not lie between 0 and 1 (they exceed 1), so a standardization table must be used to bring them to a scale between 0 and 1. After normalization, each metric yields a real number in the range 0 to 1, and the metrics provide positive evidence when their values are close to 1. For metrics related to time, in which "good" values are those close to zero, a complementary calculation is needed: Vc = 1 - V. Thus, when the value of the metric (V) is closer to zero, the complementary value (Vc) will be closer to 1, so the metrics can be expressed as positive (or increasing) values. Regarding the metrics of the time characteristic, a baseline time has not been established for carrying out the planning, execution, and result-analysis stages, because the time may vary according to the number of evaluators and users participating in the evaluation process; what is compared, instead, is the speed with which each UAM completes those stages (planning, execution, and analysis of the results).
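A minimal sketch of this normalization, assuming a simple division by the largest observed value as the standardization step (the paper's standardization table is not reproduced here), and applying the complement Vc = 1 - V to a time metric, could look as follows:

```python
def normalize(value, max_value):
    """Scale a non-negative metric value into [0, 1].
    Assumed scheme: division by the largest observed value."""
    return value / max_value if max_value else 0.0

def to_increasing(v):
    """Complement a 'lower is better' normalized metric: Vc = 1 - V."""
    return 1.0 - v

# Example: execution-stage times (in minutes) observed for three methods.
times = {"heuristic_evaluation": 90, "cognitive_walkthrough": 120, "questionnaires": 45}
longest = max(times.values())
scores = {m: to_increasing(normalize(t, longest)) for m, t in times.items()}
print(scores)  # values closer to 1 indicate less time spent
```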
Here are the metrics that will be considered:
• Number of identified problems
• Number of critical problems
• Number of frequent problems
• Time used to complete the planning stage
• Time used to complete the execution stage
• Time used to complete the analysis of the results
As previously mentioned, the metrics belonging to the human resources feature will not be considered to discriminate among the executed UAMs. Nevertheless, the "number of people involved" metric provides positive evidence when the method involves a number of users greater than or equal to the established one, several evaluators (a minimum of 3), and at least one representative of the organization. Finally, the experts'/evaluators' experience metric (in years) provides positive evidence the higher it is, because it directly influences the quantity and quality of the results obtained in the execution of the methods (inspection and test).
According to the above, the idea behind the final set of metrics is to apply them when executing the UAMs (selected in the conceptual cycle) on a VLE chosen as the object of study, and, from the values obtained, to finally choose the UAMs suitable for VLEs in order to define the final guide.
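As an illustration of how that final choice could be supported computationally, the following sketch averages the normalized, increasing metric values per UAM and ranks the methods. The values and the equal-weight aggregation are our assumptions; the study does not prescribe a weighting scheme:

```python
# Invented normalized values (all increasing, in [0, 1]) for two of the
# selected UAMs, keyed by the six metrics listed above.
normalized = {
    "heuristic_evaluation": {
        "identified": 0.9, "critical": 0.8, "frequent": 0.7,
        "planning": 0.8, "execution": 0.6, "analysis": 0.7,
    },
    "cognitive_walkthrough": {
        "identified": 0.6, "critical": 0.5, "frequent": 0.6,
        "planning": 0.7, "execution": 0.8, "analysis": 0.6,
    },
}

# Equal-weight average per method; higher scores suggest a better fit.
ranking = sorted(
    ((sum(vals.values()) / len(vals), name) for name, vals in normalized.items()),
    reverse=True,
)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```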
This is where the research currently stands. In the project planning, a set of activities has been defined that will allow the construction of the final guide for usability evaluation in virtual learning environments. The activities defined for this cycle, which have not yet been carried out, are the following:
• To identify the Virtual Learning Environment that will be the object of study
• To execute the usability evaluation methods on the VLE under study, applying the selected metrics
• To process the obtained results in order to identify the appropriate UAMs that will make up the combination of methods for usability evaluation in Virtual Learning Environments
• To prepare a guide for usability evaluation in VLEs based on the combination made in the previous step
In the evaluation cycle, a case study will be designed and executed to validate the proposed guide, followed by the analysis of the obtained results and the redefinition of the guide in light of those results.
After the literature analysis, it was found that there is no standardization regarding how usability evaluation is configured for VLEs. Nowadays, methods are used in isolation and with specific criteria to evaluate a particular product, and those methods are not designed to evaluate VLE usability. Therefore, what this proposal intends to improve is addressed by the defined guide, which will contain a combination of evaluation methods to be applied in virtual learning environments, given the current boom of this type of software system and the need to think of a satisfied end user.
The methods selected to form the initial combination were chosen because they are the fittest for evaluating usability specifically in VLEs. This selection relied on several evaluation criteria, as shown above, from which it was possible to select, among the inspection methods, heuristic evaluation and cognitive walkthrough, and among the test methods, formal experiments, questionnaires and interviews, constructive interaction, and the driver's method.
The selected inspection and test methods will be applied to a VLE at later stages of the research and will be executed together with the previously defined and selected metrics: number of identified problems, number of critical problems, number of frequent problems, time used to complete the planning stage, time used to complete the execution stage, and time used to complete the analysis of the results. The purpose is to apply a further filter to the methods and keep those that really contribute to the evaluation of a VLE; as a result, a final combination is expected that will form the basic guide to apply to a VLE, generating the usability information required in these contexts.
Based on the results obtained in this research, we consider that the proposed UAM selection strategy for VLEs is one feasible way to choose them, without setting aside other strategies that could contribute a selection better fitted to the needs of VLEs.