Capítulo 4 Evaluación de procesos
4.4. Operación:
distributed frameworks on thousands of nodes to answer basic questions.
Now, I can do almost the same stuff, and I can tweak a learning algorithm that teaches students in the best way they can learn. There’s a whole feedback system that says, “you should answer these questions, and then five minutes from now, we’ll come back and repeat it, and then we’ll come back a week later and repeat it again.”
The wonderful thing is that those algorithms, that whole pattern, is being replicated from galaxies to psychology and cognition. All of these high topics of knowledge are beginning to trickle down, and they’re actually making a real impact on day- to-day interactions. There is not a single company on NASDAQ that doesn’t use some aspect of this. Your Facebook Newsfeed is highly tweaked to give you everything that you think is relevant, and new content to test your preferences.
LinkedIn is using all kinds of graph networks. Square is using all these fraud detection techniques. HealthTap is fielding all of these questions, and training a computer to understand what these questions are. And there really are doctors who will be answering a lot of those medical questions.
The cool thing here is that they can take a doctor and clone him virtually. He can answer a question, and that might reduce patient time in a hospital somewhere. And when you take that power, and you multiply it by the number of patients in the whole world — it’s a huge number. These are real things. We’re not limited to theoretical worlds. You really can go out and have awesome effects immediately, and they’re tangible. We’re collecting more and more data, to the point that there are not that many aspects of life that aren’t becoming data driven. So it’s super exciting.
Imagine if you were able to go back to the beginning of your graduate school career, and you meet yourself coming in the corridors and you have a five minute window to speak to yourself. Would you tell yourself to do anything differently? A lot of it would have centered on working more with people. I joined an open source project, and that was the single best decision in all of graduate school. I learned how to code in a collaborative way.
The second most important thing probably would have been communication. Every week, I would deliver a presentation on my results during the past week, and usually, it
I joined an open source project, and that was the single best decision in all of graduate school. I learned how to code in a collaborative way.
CHRIS MOODY
93
would boil down to giving a two or three minute feedback session at the end of that. So I was already doing a lot of communication and I wouldn’t have changed that.
My programming context was great; maybe I should have started that earlier and taken more formal programming classes. If you were to design the curriculum, I’d say you have to have a lot of programming. A lot of classes are like, ‘go and do this assignment.’ The real world is, ‘go do this assignment but you only have to do this module, and someone else will do the next module. You guys need to be working collaboratively.’
They should also be doing lots of statistics, and they should be able to do it as quickly as possible. People love to talk about this Pareto Principle, where 80% of the outcomes result from 20% of the effort. The hard part is trying to figure out where that 80% line actually is, and once you realize you’re at it, stop.
How can people find open source projects to participate in?
A lot of the time, they already exist. You probably already know what they are because you hear about them. The biggest thing is not to be shy about it, and not to be scared off. It took me a long time to work up the courage to actually push code back out and be able to take the criticism. No matter where you’re working, there are other people working with similar problems. Just go out and search for them. If they haven’t solved your specific niche problem, join the effort. It’s a worthwhile process. It’s really hard to convince graduate students about this, who are already overwhelmed with a lot of other things, but it is definitely the best part of those five years.
Your advisor is going to be pushing you for results, and my advisor said it had been years since he’d written any code. So you might not realize how important this is. But in a world that is becoming way more team-based, both in industry and science, it’s super important to push everything into a team-based context. Also, if you’re in science, you’re all about trying to communicate your results. One of the best ways to do that is through your open source network. They have an audience there, waiting for you, and they might be really interested. A lot of it is, ‘I built this feature onto this project.’ They’ll go try it out and maybe they’ll write a paper about it, and then you get an extra citation.
It’s a little unfortunate that the primary currency of science is citations and not source code, even though that’s a big infrastructure push. I think that will have to change going forward because everything is being done in a team-based context.
CHRIS MOODY
94
There are a lot of extra indirect effects. The direct effect is that you’ll be better. The indirect effects are that there are a lot of other people who will benefit, and that will reflect very well on you.
It’s a little unfortunate that the primary currency of science is citations and not source code, even though that’s a big infrastructure push. I think that will have to change going forward because everything is being done in a team-based context. To do science more efficiently, it has to be that way. There’s no other alternative.