Determination of the Biomass Concentration Required for Inoculation

5 Laboratory-Scale Inoculation with the Isolate

5.2.3 Determination of the Biomass Concentration Required for Inoculation

Here we answer the research questions posed in Section 1.3:

Is there a need and/or wish by the intended users for a recommendation engine for high school profiles, sectors and courses? In every interview we asked whether the interviewee thought the system would be useful and whether they would use it. All responses were positive (see also the interview transcripts in Appendix B). Students indicated that they had already spent a long time thinking about this important decision, and that they would definitely consider recommendations given by the system, especially since the cost is very low (it does not take much of their time to receive a recommendation). There is therefore a wish for such a recommendation engine. Students who made their choice longer ago indicated that not enough information had been available at the time, and that they would make different choices if they could choose again.

The recommender system does provide additional information to the students: an objective recommendation based on their grades and a large historical data set. In the online experiment, we asked school counselors whether they would use the recommendation in a conversation with the student. Most of the time, school counselors replied that they would use it (see Figure 7.17). Some school counselors told us that they were looking for additional methods to help students choose, and that they thought an objective recommendation could be helpful.

There is indeed a need for more information regarding high school course choices. We did not explore whether a course recommendation engine is the best option, but we did ask students for alternative ways to improve the availability of information, and they did not propose any.

What are requirements for a recommendation engine for high school profiles, sectors and courses? The requirements for the high school course recommender system were described in Section 3. The system must comply with all regulations imposed by the government and by the school the student attends, and the recommendations should be valuable, diverse and, if possible, novel. Requirements were gathered through interviews with stakeholders, a brainstorming session, discussions with the supervisors and a literature study.

Which methods exist for making recommendations? The state of the art in recommender systems was presented in Section 2.2. General recommendation methods are collaborative filtering, content-based filtering, and systems based on demographic data, utility functions or knowledge. Furthermore, Section 2.4 presented related work on course recommender systems and the techniques that were used in such systems.

What are performance metrics for a recommender system? Section 2.3 presented several important performance metrics for recommender systems: accuracy, coverage, novelty, learning rate, confidence, user satisfaction and diversity. Online and offline measures for each of these metrics were described in this section as well. As was discussed in Section 8, novelty may be (one of) the most important performance metrics. In the domain of high school courses, it is (almost) impossible to recommend any truly novel courses, which may have decreased the usefulness of recommendations.

Which recommender methods are applicable in a “Big Data” context? All recommender algorithms that were used (item average, user item average, user-based collaborative filtering and item-based collaborative filtering) were able to produce a recommendation within a few seconds on a dataset of about 30,000 students. Furthermore, recommendations do not have to be generated in real time; they can be prepared at whatever moment suits the system best (for example, in the middle of the night, when the workload is low).

Are there relevant ethical issues that need to be taken care of? If so, what are they and in what ways can we avoid them or deal with them? There are potential privacy issues that have to be dealt with carefully when designing a recommender system. We discussed these issues in Section 2.2.4 and concluded that, because of the limited set of courses and combinations of courses, the privacy risks for this system are very low.

Another kind of ethical issue was dealt with in the online experiments. Because we wanted high school students to participate, we sent a research proposal to the ethics committee of the University of Twente, as can be seen in Appendix E. We made sure that all participating students had their parents' permission.

If the system is going to be used in practice, then no special permission is needed to make recommendations for the students. The grades are already registered in SOM and can freely be used by the school counselor to guide the students in choosing courses. The recommender does not reveal any personal information about other students to a student using the system, so there is no privacy risk.

What information is available and/or useful to make good recommendations? Available and useful:

• grades for every course, every period

• progress information (implicit)

Not available, but could possibly enhance certain performance metrics such as accuracy:

• explicit feedback: did they regret their decision or not?

• friends

• teacher ratings

For this recommender system, we have only used five kinds of grade data: average grade in the final year, last grade, average grade over the first three years, central exam grade and average exam grade. A student's progress information can be used to determine whether their choice was a good one, but this is left as future work.

The three types of information that were not available, but could improve certain performance metrics, were not explored further in this project. Since they were not available in SOM, they could not be used at present to build a recommender system. Information about friends and favorite teachers could be useful for making recommendations that students like, but this is not the kind of information that school counselors and parents want students to base their choice on, and it undermines the idea of an objective recommendation based on historical data. If the system were to use such information, school counselors and students would be less likely to have confidence in it. Explicit feedback on courses taken, on the other hand, could be more useful for producing high-quality recommendations. Students could be asked for explicit feedback on their courses after they have finished high school, possibly even 10 years later. Their final opinion on each course may be more telling than the grade they received, because even courses in which students receive low grades may be very interesting or useful for them.

Which recommender method has the best performance? First of all, ‘the best performance’ does not have one clear definition. In the offline and online experiments, we measured accuracy, coverage, novelty, serendipity, user satisfaction, confidence and diversity. No single recommender performed best on all of these metrics.

From the results of the offline tests, we selected two recommenders that performed very well on accuracy-related metrics:

• User-based collaborative filtering, ThresholdUserNeighborhood (0.6), Pearson Correlation Similarity, using the final grade from students in their second or third year and the average exam grade;

• User-based collaborative filtering, Euclidean Distance Similarity (n=25, minsim = 0.6), using the final grade from students in their second or third year and a choice boolean (whether they chose a course or not).

These have been tested for serendipity, user satisfaction, confidence and novelty. The online test shows that the three recommenders that were tested (the two listed above, plus a random recommender) performed quite similarly on each of these aspects. Most recommendations were considered somewhat useful and received sufficient grades from students and school counselors in the online experiment, so user satisfaction is sufficient. The novelty of the recommendations was quite high in the first iteration of the online experiment, but it dropped a bit when we removed courses that were not taught at the students' schools. Most students had at least some confidence in the advice, and a majority indicated that they understood how the recommendations were made.
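As an illustration of the first selected configuration, the sketch below shows user-based collaborative filtering with a Pearson correlation similarity and a threshold neighborhood of 0.6. It is a minimal sketch and not the implementation evaluated in this project: the function names (`pearson`, `predict`, `recommend`), the dictionary layout and the sample grades are all hypothetical, and the prediction is assumed to be a similarity-weighted average of the neighbors' grades.

```python
import math


def pearson(a, b):
    """Pearson correlation over the courses both students have grades for."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return None  # too little overlap to compute a correlation
    xs = [a[c] for c in common]
    ys = [b[c] for c in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # constant grades: correlation is undefined
    return cov / math.sqrt(var_x * var_y)


def predict(target, others, course, threshold=0.6):
    """Similarity-weighted average grade for `course`, using only
    students whose similarity to `target` is at least `threshold`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for grades in others:
        if course not in grades:
            continue
        sim = pearson(target, grades)
        if sim is None or sim < threshold:
            continue  # outside the threshold neighborhood
        weighted_sum += sim * grades[course]
        weight_total += sim
    return weighted_sum / weight_total if weight_total > 0 else None


def recommend(target, others, candidates, threshold=0.6, top_n=3):
    """Rank candidate courses by predicted grade, highest first."""
    scored = []
    for course in candidates:
        predicted = predict(target, others, course, threshold)
        if predicted is not None:
            scored.append((predicted, course))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [course for _, course in scored[:top_n]]


# Hypothetical sample data: grades on a 1-10 scale.
target = {"math": 8, "physics": 7, "history": 5}
others = [
    {"math": 9, "physics": 8, "history": 6, "chemistry": 8, "economics": 6},
    {"math": 5, "physics": 6, "history": 9, "chemistry": 4, "economics": 9},
    {"math": 7, "physics": 7, "history": 4, "chemistry": 7},
]
ranking = recommend(target, others, ["chemistry", "economics"])
```

In this toy data set the second historical student has a negative correlation with the target student and therefore falls outside the neighborhood, so only the first and third students contribute to the predictions.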

Many different answers could be given to the question ‘which recommender method has the best performance’, because 630 different recommenders were evaluated on many different types of performance metrics:

• For accuracy of the recommendations, the user-based collaborative filtering algorithms performed best, and results differed slightly depending on which parameters and similarity measures were used.

• Many item-based collaborative filtering algorithms performed well on coverage, even better than the random recommenders. However, coverage is not relevant for single recommendations and is therefore not an important evaluation metric.

• The random, slope one and frequency recommenders required the least time, but almost all recommenders were able to produce recommendations within a few seconds. Considering that these recommendations can be prepared when the system load is low, for example during the night, we conclude that time is not an issue.

• In the online experiments, the three recommender systems performed very similarly on each of the performance metrics.
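To make the remark about coverage concrete, the sketch below computes one common variant, catalog coverage: the fraction of all courses that appear in at least one student's recommendation list. This is an assumed definition for illustration only (the definition used in the evaluation is the one given in Section 2.3), and the function name and sample data are hypothetical. Because the metric aggregates over many students, it says little about the quality of any single recommendation.

```python
def catalog_coverage(recommendations, catalog):
    """Fraction of catalog items recommended to at least one user.

    recommendations: dict mapping a student id to a list of recommended courses
    catalog: iterable of all courses that could be recommended
    """
    catalog = set(catalog)
    if not catalog:
        return 0.0
    recommended = set()
    for courses in recommendations.values():
        recommended.update(courses)
    return len(recommended & catalog) / len(catalog)


# Hypothetical sample data.
sample_recommendations = {
    "student_1": ["chemistry", "economics"],
    "student_2": ["chemistry"],
}
sample_catalog = ["chemistry", "economics", "geography", "history"]
coverage = catalog_coverage(sample_recommendations, sample_catalog)
```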

What qualities should a recommender system have to encourage users to consider recommendations in a serious application domain like high school courses? First of all, it is very important that the recommendations produced by the recommender system meet all requirements. The prototype that we built met all legal requirements, but ignored school-specific requirements, and recommendations that did not meet those requirements received bad reviews as a result. Using data available in SOM improves the situation but is still not ideal, because the database does not know whether anything will change in the next year, and certain rules (such as ‘the student needs at least 16 points for the courses Economy, History and Geography in order to choose E and M’) may not be possible to infer from the data. These are things that are known only to the school counselor.

Secondly, it must be clear to the student and/or school counselor how the recommendation has been made. One of the online experiments took place without a researcher on site. The participants did receive a letter explaining the research and had the opportunity to call the researcher in case of problems or difficulties. The students and school counselor at this school generally gave lower ratings than the participants at other schools, and in the open questions they said that they had difficulty understanding the recommendations and the questions that they had to answer in the survey.

Finally, it would be valuable if the recommender system could take personality traits and other contextual information into account. Many participants indicated that it would be better if the system knew, for example, what they wanted to do after high school or which courses they did or did not like.
