Adequate but improvable

Jaan Kalda

Academic Committee of IPhO 2012

Let us analyze the overall difficulty of the problem set using the score-rank relationship presented in Fig 1. For an ideal, well-balanced problem set, the net scores of the contestants should be equally spaced: the point-rank graph should be a nearly straight line starting at the winner with the maximal score and ending at the last-placed contestant with 0 points. Bearing in mind that the performance of the contestants is not fully predictable, such a problem set is hardly achievable. One can also argue that in order to determine the absolute winner and the gold medalists more reliably, it is better to have larger point differences at the small ranks.

Fig 1. Point-rank graph for the total score; blue line - before moderation; black line - after moderation.

The graph in Fig 1 tells us that the simple questions might have been slightly too simple: the linear trend at the middle of the graph breaks at the right-hand-side of the graph, where the curves turn steeply down. Meanwhile, the difficult questions were difficult, indeed, and provided a good separation between the very best contestants; this is evidenced by another steeper segment at the left-hand-side.
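The comparison with the ideal straight line can be made concrete with a short sketch. The function names and the example scores below are hypothetical illustrations, not the data or the procedure behind Fig 1; the sketch only assumes that the ideal line runs from the winner's score at rank 1 down to 0 points at the last rank.

```python
def point_rank_curve(scores):
    """Sort scores in descending order and pair each with its rank (1 = best)."""
    ordered = sorted(scores, reverse=True)
    return list(enumerate(ordered, start=1))


def ideal_scores(n_contestants, max_score):
    """Equally spaced scores: max_score at rank 1, falling to 0 at the last rank."""
    if n_contestants < 2:
        raise ValueError("need at least two contestants")
    step = max_score / (n_contestants - 1)
    return [max_score - step * i for i in range(n_contestants)]


def deviation_from_ideal(scores):
    """Actual minus ideal score at each rank; positive means above the straight line."""
    ordered = sorted(scores, reverse=True)
    ideal = ideal_scores(len(ordered), ordered[0])
    return [actual - target for actual, target in zip(ordered, ideal)]


# Hypothetical scores, not taken from the IPhO 2012 results.
example = [50, 45, 40, 35, 5, 0]
print(point_rank_curve(example))
print(deviation_from_ideal(example))
```

A run of positive deviations followed by a sharp drop near the last ranks corresponds to a curve that stays high and then turns steeply down at the right-hand side.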


It can also be of interest to compare the curves before and after the moderation. This year, the typical distance between the two curves is less than 1 pt; without going into details, one can say that this is quite a reasonable result. Ideally, a smaller distance between the two curves should imply a better initial grading. However, another factor here is the “flexibility” of the graders during the moderation. For a partially solved problem, the decision of how many points should be awarded is always slightly subjective. While the markers try to settle at the middle of the uncertainty interval (to provide as fair a grading as possible), the leaders tend to ask for as many points as possible. (This is why the graders had been instructed to grade “generously”: when in doubt between two options, opt for more points.) However, being too generous, i.e. giving too many marks for a partially solved problem, is unfair to those who have solved the problem flawlessly. Thus, regardless of how good the initial grading was, there will always be some room for negotiation during the moderation, and a score shift is sometimes explained by the willingness of the graders to accept the arguments of the leaders. As an example, this year there were two “compromises” due to which the graders went through all the examination papers a second time. For all the experimental tasks, marks were added for correct plotting of wrong data points, and for Problems T3-iv and T3-v, the partial credit for an incorrectly written first law of thermodynamics was increased.
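One possible way to quantify the “typical distance” between the two curves is the median of the rank-by-rank score differences. This is only a sketch under that assumption: the text does not say how the distance was estimated, and the function name and the scores below are hypothetical.

```python
from statistics import median


def typical_curve_distance(before, after):
    """Median vertical gap between two point-rank curves, compared rank by rank."""
    if len(before) != len(after):
        raise ValueError("both score lists must cover the same contestants")
    before_sorted = sorted(before, reverse=True)
    after_sorted = sorted(after, reverse=True)
    return median(a - b for a, b in zip(after_sorted, before_sorted))


# Hypothetical scores before and after moderation (not real IPhO data).
before = [40.0, 30.5, 20.0, 10.0]
after = [40.5, 31.0, 21.0, 10.5]
print(typical_curve_distance(before, after))
```

Note that the comparison is made between curves, i.e. between the scores at equal ranks, and not between the two scores of the same contestant.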

Next, let us have a look at the distribution of the theoretical and experimental marks separately.


Fig 2. Point-rank graphs for the scores of the experiment and of the theory.

For the theoretical examination, the linear trend extends down to the rightmost corner of the graph. Therefore, none of the questions were too simple! As for the experiment, the graph is qualitatively very similar to the graph of the total scores.

Let us delve even deeper into detail and have a look at the distribution of points for each of the theoretical problems.

Fig 3. Point-rank graphs for the scores of the Problems T1, T2 and T3.

The simple questions of Problem T1 were, indeed, simple: together they were worth 3 × 0.8 = 2.4 pts, and ca 40% of the contestants got at least this much. However, the supposedly medium-difficulty questions, each worth 1.2 pts and totalling 3.6 pts, were actually quite difficult: answering all the simple and medium questions would have resulted in 6 pts, and only 6% of the contestants got 6 points or more.
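The thresholds above and the share of contestants reaching them can be checked with a few lines of code. This is a minimal sketch: the number of medium questions (three) is inferred from 3.6 / 1.2, the scores are hypothetical, and everything is kept in tenths of a point so that the arithmetic is exact.

```python
# Scores are stored as integers in tenths of a point to avoid rounding issues.
SIMPLE_TENTHS = 3 * 8    # three simple questions at 0.8 pts each -> 2.4 pts
MEDIUM_TENTHS = 3 * 12   # three medium questions (inferred) at 1.2 pts each -> 3.6 pts
SIMPLE_AND_MEDIUM_TENTHS = SIMPLE_TENTHS + MEDIUM_TENTHS  # 6.0 pts


def fraction_at_or_above(scores_tenths, threshold_tenths):
    """Share of contestants whose score is at least the threshold."""
    reached = sum(1 for score in scores_tenths if score >= threshold_tenths)
    return reached / len(scores_tenths)


# Hypothetical T1 scores in tenths of a point (not the real distribution).
scores = [121, 60, 30, 24, 24, 10, 5, 0, 0, 0]
print(fraction_at_or_above(scores, SIMPLE_TENTHS))
print(fraction_at_or_above(scores, SIMPLE_AND_MEDIUM_TENTHS))
```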

The contestants with top scores are as follows (all gold medalists).

12.1: Attila Szabó (HUN);

10.9: Eric Schneider (USA);

10.2: Hengyun Zhou (CHN);

10.1: Yijun Jiang (CHN);

9.3: Ilya Vilkoviskiy (KAZ);

9.1: Paphop Sawasdee (THA), Wenzhuo Huang (CHN);

8.7: Chien-An Wang (TWN);

8.2: Wonseok Lee (KOR);

7.7: Jun-Ting Hsieh (TWN);

7.6: Ding Yue (SGP);

7.5: Kuan Jun Jie, Joseph (SGP);

7.2: Sooshin Kim (KOR), Siyuan Wei (CHN);

7.0: Phi Long Ngo (VNM).

Next, about Problem T2. As you can see, this is a problem with a perfect balance between simple and difficult questions: an almost straight line connects the upper left corner with the lower right corner. The contestants with top scores are as follows (all gold medalists, unless otherwise noted).

8.0 pts: Chien-An Wang (TWN), Yijun Jiang (CHN);

7.9 pts: Jun-Ting Hsieh (TWN), Tudor Giurgică-Tiron (ROU);

7.8 pts: Hengyun Zhou (CHN), Chi Shu (CHN), Rahul Trivedi (IND), David Frenklakh (RUS, Silver), Kacper Oreszczuk (POL, Bronze);

7.7 pts: Wenzhuo Huang (CHN), Siyuan Wei (CHN), Jaemo Lim (KOR), Tanel Kiis (EST, Silver).

Finally, Problem T3. A slight score saturation can be observed for this problem: the curve “hits the roof” (i.e. the maximal value of 9.0 pts) at the upper left corner. There would probably have been a better balance between difficult and easy questions if the hint about Kepler’s laws had not been given in the text of the problem. However, including the hint was the wish of the International Board, and the problem set as a whole was difficult enough even with the hint included ...

The contestants with a full score (9.0 pts; all gold medalists): Attila Szabó (HUN), Paphop Sawasdee (THA), Chien-An Wang (TWN), Siyuan Wei (CHN), Yuichi Enoki (JPN), Rahul Trivedi (IND), Puthipong Worasaran (THA), Tudor Giurgică-Tiron (ROU), Ngoc Hai Dinh (VNM), Alexandra Vasilyeva (RUS, Silver), Volodymyr Sivak (UKR, Silver), Bijoy Singh Kochar (IND, Silver), Nurzhas Aidynov (KAZ, Silver), Cristian Zanoci (MDA, Bronze).

Fig 4. Point-rank graphs for the scores of the Problems E1 and E2.

Problem E1 has a nice distribution at the upper left corner, but too steep a fall-off at the right edge – the simplest tasks of this problem were perhaps too simple. The contestants with top scores are as follows (all gold medalists, unless otherwise noted).

10: Jaan Toots (EST);

9.9: Kai-Chi Huang (TWN), Ivan Ivashkovskiy (RUS);

9.8: Wei-Jen Ko (TWN);

9.7: Attila Szabó (HUN), Siyuan Wei (CHN);

9.6: Allan Sadun (USA);

9.5: Hengyun Zhou (CHN), Chien-An Wang (TWN);

9.4: Wonseok Lee (KOR);

9.3: Jun-Ting Hsieh (TWN);

9.2: Eric Schneider (USA), Wenzhuo Huang (CHN);

9.1: Ngoc Hai Dinh (VNM), Alexandra Vasilyeva (RUS, Silver), Chi Shu (CHN).

Meanwhile, Problem E2 was intended to be a difficult problem, aimed at finding the winner of the best experimentalist’s prize. And difficult it was: it actually had no easy tasks, as evidenced by the concave shape of the curve. The contestants with top scores are as follows (all gold medalists, unless otherwise noted).

8.8: Chi Shu (CHN);

8.5: Kai-Chi Huang (TWN);

8.3: Christoph Schildknecht (CHE, Silver);

8.1: Ivan Ivashkovskiy (RUS);

7.7: Attila Szabó (HUN);

7.5: Hengyun Zhou (CHN), Huan Yan Qi (SGP);

7.4: Lev Ginzburg (RUS);

7.2: Abdurrahman Akkas (TUR, Silver);

7: Kristjan Kongas (EST, Silver);

6.9: Yu-Ting Liu (TWN), Adam Brown (GBR, Silver);

6.7: Kevin Zhou (USA), Frank Bloomfield (GBR, Bronze).

And now, it is time to have a look at the most difficult questions (tasks). Let us start with the three parts of Problem T1.

Fig 5. Point-rank graphs for the scores of Tasks T1A, T1B, and T1C.

I was quite sure that question iii of Part A would be very difficult for the contestants, and question iii of Part C would be extremely difficult, and I was not mistaken. I did believe, though, that question iii of Part B was not that difficult (just difficult, not “very” or “extremely”), and my colleagues from the Academic Committee agreed. Here we were mistaken: Part B turned out to be the most difficult part!

Part 1A: only ca 20% of students were able to figure out the correct shape of the trajectory. Meanwhile, there was also a considerable number of those who got everything right, including q. iii! This is an interesting case because only a moderate physics education is needed to solve this problem. This is evidenced by the fact that among the best solvers of Part 1A, there are several students whose overall results were not so good. One can only hypothesize that had they completed a full physics course covering the whole Syllabus of the IPhO, they would have been able to get gold medals. The contestants with top scores are as follows (all gold medalists, unless otherwise noted).

4.5 pts: Attila Szabó (HUN), Hengyun Zhou (CHN), Eric Schneider (USA), Wenzhuo Huang (CHN), Yijun Jiang (CHN), Rahul Trivedi (IND), Ilya Vilkoviskiy (KAZ), Kuan Jun Jie (SGP), Joseph Ramadhiansyah Ramadhiansyah (IDN, Honourable Mention);

4.4 pts: Paphop Sawasdee (THA), Jeffrey Cai (USA, Silver), Puthipong Worasaran (THA), Nathanan Tantivasadakarn (THA);

4.2 pts: Michele Fava (ITA, Bronze);

4.1 pts: Ding Yue (SGP);

4 pts: Hakon Tásken (NOR, Participation Certificate).

Part 1B: the list of top-solvers is shorter than before because all the others just did not get enough marks for q. iii to be qualified as someone who really solved this problem. As usual, everyone below got a gold medal, unless otherwise noted.

3.9 pts: Jun-Ting Hsieh (TWN);

3.8 pts: Attila Szabó (HUN);

3.6 pts: Sooshin Kim (KOR);

3.5 pts: Yijun Jiang (CHN);

3.4 pts: Siyuan Wei (CHN), Kai-Chi Huang (TWN), Wonseok Lee (KOR);

3.3 pts: Ihar Lobach (BLR);

3.1 pts: Wenzhuo Huang (CHN);

3 pts: Georgijs Trenins (LVA, Silver), Karlo Sepetanc (HRV, Honourable Mention).

Finally, Part 1C: note that about half of those who performed very well here (listed below) lost 0.2 pts in q. i for drawing field lines that were too curved. There were only four contestants who got the idea of magnetic charges and carried it out flawlessly (these are the first three in the list below, and Kunal Singhal). However, owing to the fact that there is also another way of calculating the force (via integrating over dipoles), several contestants got a correct estimate of the force and thereby collected enough marks to be listed below.

4.5 pts: Ilya Vilkoviskiy (KAZ);

4.3 pts: Chien-An Wang (TWN);

4.2 pts: Paphop Sawasdee (THA);

4 pts: Wonseok Lee (KOR), Wei-Jen Ko (TWN);

3.9 pts: Eric Schneider (USA);

3.8 pts: Attila Szabó (HUN), Yuichi Enoki (JPN), Hengyun Zhou (CHN);

3.7 pts: Phi Long Ngo (VNM);

3.6 pts: Kazumi Kasaura (JPN);

3.5 pts: Kunal Singhal (IND, Silver);

3.4 pts: Yu-Ting Liu (TWN);

3 pts: Jun-Ting Hsieh (TWN), Siyuan Wei (CHN), Ivan Tadeu Ferreira Antunes Filho (BRA).

The rest of the theoretical test was not that difficult (q iii of Problem T3 would have been quite difficult, but with the hint inserted by the International Board it no longer was). So we won’t dwell more on the theoretical results, and we’ll switch to the really tricky experimental tasks: A-iv and B of Problem E2.

Fig 6. Point-rank graphs for the scores of the Tasks E2B and E2A-iv.

In the case of Task A-iv, the number of those who got the correct idea of how to measure C(V) was really small: essentially only those who are listed below.


2.6 pts: Kuan Jun Jie, Joseph (SGP), Kai-Chi Huang (TWN), Lev Ginzburg (RUS), Ivan Ivashkovskiy (RUS);

2.5 pts: Chi Shu (CHN), Qiao Gu (DEU), Kevin Zhou (USA), Allan Sadun (USA);

2.4 pts: Yu-Ting Liu (TWN), Kristjan Kongas (EST, Silver), Adrian Nugraha Utama (IDN), Tudor Giurgică-Tiron (ROU), Sebastian Linß (DEU, Silver).

Finally, Task B. Surprisingly few contestants noticed that the difference between the graphs of Part A and Part B is localized to the region of negative differential resistance. As for the explanation of the phenomenon (which consists of three key elements), none of the contestants managed to list all the key elements flawlessly, and the only one to get it almost done (with some omissions in the average current part) was Christoph Schildknecht; the next two in the list below mentioned one key element. And so, the best results for Task 2B:

2.9 pts: Christoph Schildknecht (CHE, Silver);

2.3 pts: Attila Szabó (HUN);

2.1 pts: Chi Shu (CHN);

2 pts: Kai-Chi Huang (TWN), Luka Ivanovskis (LVA, Honourable Mention).

For those who want to go beyond this statistical analysis, there is also an Excel file http://www.ioc.ee/~kalda/ipho/Results_for_web.xlsx (the names of those who got less than 12.4 pts have been stripped).

The problems of the 43rd IPhO have been thought to be difficult, and it has even been stated that the problem set was the most difficult one of the last 20 years. In order to make a comparative study of how difficult the problems actually were, several types of data are needed, which are not freely available for all the Olympiads. Still, I managed to get more or less what is needed (overall number of participants, number of medals, medal boundaries in points, the scores of the absolute winners) for the period covering 1994–2012. The graph is shown below. (The last two digits of the year are shown alongside the curve – except for some curves in the central densely populated region; note that the curves which are based only on the number of medals and on the medal boundaries are interpolated and smooth.)


Fig 7. Point-rank graphs for the overall scores of the last nineteen IPhOs (for those years for which the complete data were unavailable, the curves are interpolated between the data points corresponding to the medal boundaries).

Even with these data, IPhOs as much as 19 years apart are not fully comparable. I have a feeling that, on average, the preparation level of the leading group of contestants has risen significantly. So, the graph here does not allow a comparison of the absolute difficulties of the problems, but only of the relative ones – relative to the preparation level of the students. One should also bear in mind that the scores of absolute winners have a high intrinsic variability (there is essentially no statistical averaging); cf. this year: the first and second places were separated by a huge margin of 3 pts.
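For the years with incomplete data, a point-rank curve has to be reconstructed from a handful of known points: the winner's score and the medal boundaries at the ranks where each medal category ends. The sketch below uses plain piecewise-linear interpolation as a stand-in; the smooth interpolation actually used for Fig 7 is not specified in the text, and the knot values here are hypothetical.

```python
def interpolate_score(rank, knots):
    """Piecewise-linear score estimate at a given rank.

    knots: list of (rank, score) pairs with strictly increasing ranks, e.g. the
    winner followed by the last gold, silver and bronze medalists.
    """
    if len(knots) < 2:
        raise ValueError("need at least two knots")
    if not knots[0][0] <= rank <= knots[-1][0]:
        raise ValueError("rank lies outside the range covered by the knots")
    for (r0, s0), (r1, s1) in zip(knots, knots[1:]):
        if r0 <= rank <= r1:
            return s0 + (s1 - s0) * (rank - r0) / (r1 - r0)


# Hypothetical knots: (rank, score) for the winner and three medal boundaries.
knots = [(1, 45.0), (40, 30.0), (110, 20.0), (200, 12.0)]
print(interpolate_score(75, knots))
```

Because only the boundary points are known, any such curve says nothing reliable about the score distribution inside a medal category or below the bronze boundary.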

Therefore, the conclusion is that the claim that this was the most difficult problem set of the last 20 years is slightly exaggerated. The problems in Beijing in 1994 were even more difficult, at least in relative terms, and at least when leaving out the contestants with ranks from 2 to 6.

Minutes of the Meetings of the International Board