Winner Interview
Using Data Science to Predict Video Popularity | Winner’s Interview with José Joaquim
Insights from our Data Science Competition Winner
Our Video Popularity Prediction Challenge recently came to a close, so we reached out to the 1st place winner, José Joaquim. Here is what we talked about.
Meet José Joaquim!
Please introduce yourself, including your name and academic/professional background.
I’m José, 22 years old, studying for my bachelor’s degree in Medicine. My interest in Data Science started a year ago when I realized that Machine Learning will be widely present in health fields in the near future. I’m proud to say that my main background is hard work and consistent study. I used to study 5 hours daily on average. I think there’s nothing that one cannot learn by working hard, regardless of their main field. This was my second data science competition and I’m very glad to be the winner. This result motivates me to keep studying even more, keeping in mind that next year I’ll be better at DS.
Are you currently working as a data scientist?
I’m currently a member of a University research group. That’s not exactly work 🙂 But I’m always open to changes/opportunities.
Why did you decide to join this Video Popularity Prediction challenge?
I decided to join the competition because business is very important to me. Although I’m graduating in medicine, I plan to have my own business in the near future, and the problem that Video Popularity Prediction posed is a real-world problem. I enjoy solving real-world problems despite my academic profile.
Have you ever participated in a competition like this before?
Yes.
Let’s get Technical
What was your impression of the dataset and problem statement for this competition?
My impression was that it posed a problem worth solving. That was the main thing that drew me into participating: the DataGateway platform can strongly differentiate its service with a good views predictor.
Please explain your winning solution and the process you used to build it.
I’m inclined to prefer simple solutions (not simplistic) rather than complex ones. My main idea was to work with reduced dimensionality, since I had many variables available to build a model. With fewer variables, a model will probably generalize better!
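Editor’s note: the interview does not say which technique José used to reduce dimensionality. As one possible illustration of the general idea, here is a minimal sketch that drops near-constant variables and variables that almost duplicate one already kept. The function names, thresholds and toy data are all hypothetical, and it uses plain Python so it runs anywhere; in practice the same filter would more likely be written with NumPy or Pandas.

```python
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(columns, min_variance=1e-8, max_abs_corr=0.95):
    """Return the names of the columns to keep.

    columns: dict mapping feature name -> list of numeric values.
    Thresholds are illustrative assumptions, not values from the competition.
    """
    kept = []
    for name, values in columns.items():
        if pvariance(values) <= min_variance:
            continue  # near-constant: carries almost no information
        if any(abs(pearson(values, columns[k])) > max_abs_corr for k in kept):
            continue  # nearly duplicates a feature that was already kept
        kept.append(name)
    return kept


# Hypothetical toy data, not taken from the competition dataset.
toy = {
    "duration": [10.0, 20.0, 30.0, 40.0],
    "duration_ms": [10000.0, 20000.0, 30000.0, 40000.0],  # rescaled copy of duration
    "constant_flag": [1.0, 1.0, 1.0, 1.0],  # zero variance
    "upload_hour": [3.0, 18.0, 7.0, 12.0],
}

print(select_features(toy))  # ['duration', 'upload_hour']
```

A filter like this is only one of many ways to shrink a feature set; projection methods such as PCA are a common alternative when the individual variables do not need to stay interpretable.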
Why do you think you were able to create a winning solution?
Mainly because I strive for a simple solution.
Do you have a standard step-by-step approach that you use in data science competitions like this one? If so, could you share it?
No. I refuse to memorize even the NumPy and Pandas functions that I use every day. I’m much more inclined to stop and think about each problem that I face than to have a standard approach. But I always start by importing libraries.
Did you face any problems or difficulties in this challenge? Please explain.
I found the Bitgrit website and team very professional and clear. The competition timeline, rules and schedule were easy to understand, and the datasets were easily available. However, I found it a little hard when I started, because I had never worked with image datasets. I did some research on Google and paused for a while to think about how to deal with the image dataset. That was my main challenge.
Is there anything you would do differently if you could start over on this challenge?
No. In my life I always try to do the best that I can. So, at that moment, I gave it my best.
What did you think about this competition overall?
I think that it’s a good competition, a good problem to think about and a good goal.
Words of wisdom
Did you learn anything new by participating in this challenge? If so, what?
Of course, I learned many things. The first was that I’m not less capable than people with academic backgrounds in Data Science/math/statistics or related fields. Nothing can beat dedication and sweat. For me, the main skill necessary to be good at building Data Science solutions is the ability to think deeply and work hard to grasp the problem as fully as possible.
Do you have any advice for newbies looking to get started on a machine learning challenge like this?
Just start. Don’t go with the standard theory-based approach. Start empirically. As Feynman said, if you can’t explain an idea to a kid, you did not grasp it well. Don’t be shy or fearful if you do not have a PhD in math or statistics! Learn the main concepts empirically.
That’s all!