
The model proposed in this report did not perform well when predicting quantiles of the citation distribution for a group of articles from the same country or university. It did predict the quantiles correctly for a group of articles from different countries and universities. This means that the Impact Factor and the number of recent citations are insufficient to predict the quantiles for separate countries or universities correctly. Apparently, the model is missing some factor that captures the differences between countries and universities. Predicting the performance of publications from a particular university or country would require a different model.

In this report, all publications from the scientific field of Physics were taken into account. This is a very broad field of science with many subfields, and citation behavior might differ between those subfields. This was not taken into account in this report and might need further research.

The model presented in this report uses two factors to predict the quantiles: the Impact Factor and the number of citations in the first year after publication. More factors could be added to this model, such as the number of pages of the article or the number of references it contains. Further research could investigate the effect of adding these factors. In particular, it would be interesting to see whether adding one or more covariates resolves the inaccuracy in the predictions for a set of papers from the same country or university.

Another possibility for further research lies in the fitness factor we used in the model. We assumed that the fitness factor was a product of the two covariates to a certain power. Other variants of the fitness factor could also be investigated.

Appendix A

Quantile Regression

In Quantile Regression, the function:

$$\min_{\xi} \sum_i \rho_j(y_i - x_i\xi), \quad \text{where } \rho_j(z) = zj - z\,\mathbf{1}_{z<0}, \tag{A.1}$$

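is minimized. To show that this quantile regression indeed fits the quantiles of the distribution, we look at the directional derivatives of equation (A.1) [14]. The values of $y_i - x_i\xi$ are just the residuals, which we call $r_i$. The directional derivative of the function that is minimized along vector $u$ is:

$$\begin{aligned}
\frac{d}{dt}\sum_i \rho_j\bigl(y_i - x_i(\xi + ut)\bigr)\Big|_{t=0} &= -\sum_i x_i u\,\rho_j'(r_i - x_i u t)\Big|_{t=0} \\
&= -\sum_i x_i u\,\varphi_j(r_i, -x_i u),
\end{aligned}$$

where

$$\varphi_j(a, b) = \begin{cases} j - \mathbf{1}_{a<0}, & a \neq 0, \\ j - \mathbf{1}_{b<0}, & a = 0. \end{cases}$$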
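As a numerical sanity check of this expression, the following minimal Python sketch compares the closed-form directional derivative with a one-sided finite difference of the objective in (A.1). The data, the coefficient vector and the direction are made-up illustrative values, not taken from this report, and the helper names (`rho`, `phi`, `objective`, `directional_derivative`) are hypothetical.

```python
def rho(z, j):
    """Check function rho_j(z) = z*j - z*1[z < 0]."""
    return z * (j - (1.0 if z < 0 else 0.0))


def phi(a, b, j):
    """phi_j(a, b): j - 1[a < 0] if a != 0, else j - 1[b < 0]."""
    s = a if a != 0 else b
    return j - (1.0 if s < 0 else 0.0)


def dot(x, v):
    """Inner product of a design row x_i with a vector."""
    return sum(xk * vk for xk, vk in zip(x, v))


def objective(X, y, xi, j):
    """Sum of check losses over all residuals y_i - x_i xi."""
    return sum(rho(yi - dot(xrow, xi), j) for xrow, yi in zip(X, y))


def directional_derivative(X, y, xi, u, j):
    """Closed form: -sum_i (x_i u) * phi_j(r_i, -x_i u)."""
    total = 0.0
    for xrow, yi in zip(X, y):
        r = yi - dot(xrow, xi)
        xu = dot(xrow, u)
        total -= xu * phi(r, -xu, j)
    return total


# Hypothetical toy data: rows of X are (1, covariate).
# With xi = (1, 1) the residuals are 0, 0.5, -1 and 1.
X = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
y = [1.0, 2.5, 2.0, 5.0]
xi = (1.0, 1.0)
u = (1.0, -0.5)
j = 0.25

h = 1e-6  # small step for the one-sided finite difference (t -> 0+)
xi_step = tuple(a + h * b for a, b in zip(xi, u))
numeric = (objective(X, y, xi_step, j) - objective(X, y, xi, j)) / h
analytic = directional_derivative(X, y, xi, u, j)
print(analytic, numeric)
```

Because the objective is piecewise linear in $t$, the one-sided difference agrees with the closed form up to rounding for a small enough step. The first observation has a residual of exactly zero, so it exercises the $a = 0$ branch of $\varphi_j$.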

The vector $\xi$ minimizes equation (A.1) if the directional derivatives of the function are larger than or equal to zero in all directions, so if they satisfy:

$$-\sum_i x_i u\,\varphi_j(r_i, -x_i u) \geq 0, \quad \forall u.$$

If we take $u^\top = (1, 0, 0)$, then because of the definition of $X$, $x_i u = 1$ for all $i$. Now the directional derivative with respect to this vector should of course be larger than or equal to zero:

$$\begin{aligned}
-\sum_i \varphi_j(r_i, -1) &\geq 0 \\
-\sum_{i:\, r_i = 0} (j-1) - \sum_{i:\, r_i > 0} j - \sum_{i:\, r_i < 0} (j-1) &\geq 0 \\
-N_z(j-1) - N_p j - N_n(j-1) &\geq 0.
\end{aligned}$$

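Here $N_z$, $N_p$ and $N_n$ are respectively the number of observations with residuals of value zero, the number with positive residuals and the number with negative residuals. Similarly, for $u^\top = (-1, 0, 0)$, $x_i u = -1$ for all $i$, so:

$$\begin{aligned}
\sum_i \varphi_j(r_i, 1) &\geq 0 \\
\sum_{i:\, r_i = 0} j + \sum_{i:\, r_i > 0} j + \sum_{i:\, r_i < 0} (j-1) &\geq 0 \\
N_z j + N_p j + N_n(j-1) &\geq 0.
\end{aligned}$$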

The total number of articles in the regression is $n$, where $n = N_z + N_p + N_n$. By rewriting the previous two inequalities and combining them, the following inequalities can be obtained:

$$\frac{N_n}{n} \leq j \leq \frac{N_z + N_n}{n}, \qquad \frac{N_p}{n} \leq 1 - j \leq \frac{N_z + N_p}{n}.$$

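This means that minimizing equation (A.1) indeed fits the model onto the empirical quantiles: at most a fraction $j$ of the observations has a value less than the predicted $j$-th quantile, and at most a fraction $1-j$ has a value greater than it. Apart from the observations with a residual of exactly zero, a fraction $j$ of the observations therefore lies below the predicted $j$-th quantile.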
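The bounds can be checked numerically in the simplest special case, a model with only an intercept. The sketch below is an illustration under stated assumptions, not the procedure used in this report: it minimizes the check loss by brute force over the observed values, which is sufficient because the objective is piecewise linear with kinks only at the data points. The citation counts and the function names are hypothetical.

```python
def check_loss(z, j):
    """Check function rho_j(z) = z*j - z*1[z < 0]."""
    return z * (j - (1.0 if z < 0 else 0.0))


def fit_intercept(y, j):
    """Minimize sum_i rho_j(y_i - c) over c by trying every observed value."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: sum(check_loss(yi - c, j) for yi in y))


def residual_counts(y, c):
    """Return (N_n, N_z, N_p): counts of negative, zero and positive residuals."""
    n_neg = sum(1 for yi in y if yi < c)
    n_zero = sum(1 for yi in y if yi == c)
    n_pos = sum(1 for yi in y if yi > c)
    return n_neg, n_zero, n_pos


def bounds_hold(y, j):
    """Check N_n/n <= j <= (N_z+N_n)/n and N_p/n <= 1-j <= (N_z+N_p)/n."""
    c = fit_intercept(y, j)
    n_neg, n_zero, n_pos = residual_counts(y, c)
    n = len(y)
    lower_side = n_neg / n <= j <= (n_zero + n_neg) / n
    upper_side = n_pos / n <= 1 - j <= (n_zero + n_pos) / n
    return lower_side and upper_side


# Hypothetical citation counts for ten articles (illustrative only).
citations = [0, 1, 1, 2, 3, 5, 8, 13, 21, 40]

for j in (0.25, 0.5, 0.75):
    c = fit_intercept(citations, j)
    print(j, c, residual_counts(citations, c), bounds_hold(citations, j))
```

The quantile levels are chosen to be exactly representable in floating point, which keeps the comparisons in `bounds_hold` free of rounding effects. For a model with covariates, the same counts could be computed from the residuals of any fitted quantile regression.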

Bibliography

[1] J. Beirlant, W. Glänzel, A. Carbonez, and H. Leemans. Scoring research output using statistical quantile plotting. Journal of Informetrics, 1(3):185–192, 2007.

[2] L. Bornmann. The problem of citation impact assessments for recent publication years in institutional evaluations. Journal of Informetrics, 7(3):722–729, 2013.

[3] L. Bornmann, L. Leydesdorff, and J. Wang. How to improve the prediction based on citation impact percentiles for years shortly after the publication date? Journal of Informetrics, 8(1):175–180, 2014.

[4] Q. L. Burrell. The nth-citation distribution and obsolescence. Scientometrics, 53(3):309–323, 2002.

[5] Q. L. Burrell. Predicting future citation behavior. Journal of the American Society for Information Science and Technology, 54(5):372–378, 2003.

[6] P. Cirillo. Are your data really pareto distributed? Physica A: Statistical Mechanics and its Applications, 392(23):5947–5962, 2013.

[7] A. Clauset, C. Shalizi, and M. Newman. Power-law distributions in empirical data. SIAM Review, 51(4):661–703, 2009.

[8] D. J. de Solla Price. A general theory of bibliometric and other cumulative advantage processes. Journal of the American Society for Information Science, 27(5):292–306, 1976.

[9] A. L. Dekkers, J. H. Einmahl, and L. De Haan. A moment estimator for the index of an extreme-value distribution. The Annals of Statistics, 17(4):1833–1855, 1989.

[10] E. Garfield. The history and meaning of the journal impact factor. The Journal of the American Medical Association, 295(1):90–93, 2006.

[11] M. Golosovsky and S. Solomon. Stochastic dynamical model of a growing citation network based on a self-exciting point process. Physical Review Letters, 109(9):098701, 2012.

[12] B. M. Hill. A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3(5):1163–1174, 1975.

[13] W. Ke. A fitness model for scholarly impact analysis. Scientometrics, 94(3):981–998, 2013.

[14] R. Koenker and G. Bassett Jr. Regression quantiles. Econometrica: Journal of the Econometric Society, 46(1):33–50, 1978.

[15] R. Koenker and J. A. Machado. Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association, 94(448):1296–1310, 1999.

[16] J. M. Levitt and M. Thelwall. A combined bibliometric indicator to predict article impact. Information Processing & Management, 47(2):300–308, 2011.

[18] J. Mingers and Q. L. Burrell. Modeling citation behavior in management science journals. Information Processing and Management, 42(6):1451–1464, 2006. Special Issue on Informetrics.

[19] H. P. F. Peters and A. F. J. van Raan. On determinants of citation scores: A case study in chemical engineering. Journal of the American Society for Information Science, 45(1):39–49, 1994.

[20] F. Radicchi, S. Fortunato, and C. Castellano. Universality of citation distributions: Toward an objective measure of scientific impact. Proceedings of the National Academy of Sciences, 105(45):17268–17272, 2008.

[21] S. Redner. How popular is your paper? An empirical study of the citation distribution. The European Physical Journal B-Condensed Matter and Complex Systems, 4(2):131–134, 1998.

[22] P. O. Seglen. The skewness of science. Journal of the American Society for Information Science, 43(9):628–638, 1992.

[23] D. I. Stern. High-ranked social science journal articles can be identified from early citation information. Crawford School Research Paper No. 14-06, 2014.

[24] M. J. Stringer, M. Sales-Pardo, and L. A. Nunes Amaral. Effectiveness of journal ranking schemes as a tool for locating information. PLoS ONE, 3(2):e1683, 2008.

[25] M. L. Wallace, V. Larivière, and Y. Gingras. Modeling a century of citation distributions. Journal of Informetrics, 3(4):296–303, 2009.

[26] D. Wang, C. Song, and A.-L. Barabási. Quantifying long-term scientific impact. Science, 342(6154):127–132, 2013.

[27] M. Wang, G. Yu, J. Xu, H. He, D. Yu, and S. An. Development a case-based classifier for predicting highly cited papers. Journal of Informetrics, 6(4):586–599, 2012.
