Learner reviews and feedback for How to Win a Data Science Competition: Learn from Top Kagglers from National Research University Higher School of Economics

708 ratings
151 reviews

About the course

If you want to break into competitive data science, then this course is for you! Participating in predictive modelling competitions can help you gain practical experience, improve and harness your data modelling skills in various domains such as credit, insurance, marketing, natural language processing, sales forecasting and computer vision, to name a few. At the same time you get to do it in a competitive context against thousands of participants where each one tries to build the most predictive algorithm. Pushing each other to the limit can result in better performance and smaller prediction errors. Being able to achieve high ranks consistently can help you accelerate your career in data science. In this course, you will learn to analyse and solve competitively such predictive modelling tasks.

When you finish this class, you will:

- Understand how to solve predictive modelling competitions efficiently and learn which of the skills obtained can be applicable to real-world tasks.
- Learn how to preprocess the data and generate new features from various sources such as text and images.
- Be taught advanced feature engineering techniques like generating mean-encodings, using aggregated statistical measures or finding nearest neighbors as a means to improve your predictions.
- Be able to form reliable cross validation methodologies that help you benchmark your solutions and avoid overfitting or underfitting when tested with unobserved (test) data.
- Gain experience of analysing and interpreting the data. You will become aware of inconsistencies, high noise levels, errors and other data-related issues such as leakages and you will learn how to overcome them.
- Acquire knowledge of different algorithms and learn how to efficiently tune their hyperparameters and achieve top performance.
- Master the art of combining different machine learning models and learn how to ensemble.
- Get exposed to past (winning) solutions and codes and learn how to read them.

Disclaimer: This is not a machine learning course in the general sense. This course will teach you how to get high-rank solutions against thousands of competitors with focus on practical usage of machine learning methods rather than the theoretical underpinnings behind them.

Prerequisites:

- Python: work with DataFrames in pandas, plot figures in matplotlib, import and train models from scikit-learn, XGBoost, LightGBM.
- Machine Learning: basic understanding of linear models, K-NN, random forest, gradient boosting and neural networks.

Do you have technical problems? Write to us:

Top reviews


Mar 29, 2018

Top Kagglers gently introduce one to Data Science Competitions. One will have a great chance to learn various tips and tricks and apply them in practice throughout the course. Highly recommended!


Nov 10, 2017

This course is fantastic. It's chock full of practical information that is presented clearly and concisely. I would like to thank the team for sharing their knowledge so generously.

Reviews for How to Win a Data Science Competition: Learn from Top Kagglers

by Kostyantyn B

Aug 10, 2018

I am very conflicted about this series, as well as this particular course (How to win a Data Science Competition). Let me try to summarize it.

Pros:


- This is not an introductory level course. You have a chance to learn some advanced, state-of-the-art Data Science techniques. There are not that many advanced DS/ML courses available, so I was very excited when I found this one.

- You get your foot in the door of Competitive Data Science, something you may not have the courage to do on your own (it was certainly the case with me).

Cons... Where do I begin?...

- The courses of this series have been available for quite some time now. Yet the learning materials still feel very raw: I can live with occasional typos, but I have seen some mistakes that I found unacceptable, including wrong math formulas, an improperly set up Docker environment, and incorrect "correct" answers (one can actually get credit for the last question in Programming Assignment 1 only if a wrong answer is submitted! This was pointed out on the Forum months before I took it, yet here we are).

- The course content is somewhat strange. It is a mix of introductory-level material and some pretty advanced tricks. Of course, it is the latter that is most appealing to many students like myself. But the problem is that too many of these topics are covered in a very superficial way, providing very little substance. I remember getting all excited when the instructors would start talking about the Kaggle competitions they personally participated in... only to be left disappointed with how little I learned from their experience. I am not finished with the course but this has already happened more than once...

- Finally, with all due respect, the instructors are not what you might call outstanding educators. I realize that not everybody can be like, say, Andrew Ng, and you certainly get to see a broad spectrum of teaching skills among the instructors at Coursera. Still, in my opinion some (in fact, most) of the instructors who participated in creating this series have a long way to go...

I don't mean to be harsh. I certainly appreciate what they are trying to achieve here and it is a noble goal. But the execution is flawed and I felt that I had to say something. I know that they can do better (they are, without a doubt, a talented bunch of people) and I really hope they can learn from their mistakes.

by Fabrice L

Apr 11, 2019

A looooot of content!!!

I like the fact that it talks about broad data science topics and doesn't specialize in one specific domain. You gain some good tricks about pandas, EDA, modeling, feature engineering, etc. The skill coverage is very wide.

This is definitely advanced, and challenging sometimes, but you'll learn a lot.

by Yu Q

Dec 05, 2018

I completed this course in almost 3 months, far more time than I planned. I spent most of that time creating new features via feature engineering and verifying the cross-validation method. This course was difficult, but very helpful and inspirational. Thanks to each teacher and tutor!

by Caio A A O

Nov 20, 2017

Some cool tips in the first week, but then in the second one we have a whole section about how to exploit data leaks in competitions, and that's worth 12% of the final grade. This sucks... If it wasn't part of the Advanced Machine Learning specialization, I wouldn't care, but it is. This, plus a peer-graded assignment with really broad criteria, really got me thinking about whether it's worth doing it for a verified certificate.

by andy

Aug 05, 2018

Assignments are terribly written. Quizzes don't make sense at times.

by Nicholas C

Mar 10, 2018

Should have been labeled "How to Cheat a Data Science Competition". An entire week is dedicated to data leakage and how to exploit it rather than, in the spirit of the competition, how to create a model that actually solves the problem.

by Steffen R

Oct 16, 2018

Extremely poorly supported.

by Nick

Mar 04, 2018

Questions are unclear; the authors clearly do not understand what they're asking.

by 李继杨霖

Oct 17, 2018

This is the first course I've finished on Coursera. At the beginning, my motivation for taking this course was only to get a better score in Kaggle competitions, because I major in statistics and am interested in data science. But during the process of learning, I found many important ideas and experiences for dealing with real problems and enjoyed communicating with other people on the forum and Kaggle. I also acquired some special experience such as peer review, which is not only very fun but also gives me different angles from which to look again at the problem I'm dealing with. Thanks Dmitry Ulyanov, Alexander Guschin, Mikhail Trofimov, Dmitry Altukhov, and Marios Michailidis for sharing your important knowledge and experience with us.

by Carlos V

Sep 30, 2018

This course is unique and highly recommended to anyone who wants to push their machine learning skills. The assignments are excellent and super challenging. After completing the final assignment, I had a better understanding of how to improve an ML model; the course pushes you to understand how to build a machine learning model that is competitive on Kaggle.

All the techniques explained can also help you create better ML models in general.

Thanks very much to all professors for putting together this fantastic course.

Looking forward to a more advanced version in the future.

by yanqiang

Oct 29, 2018


by Greg W

Feb 19, 2019

Really excellent. Very practical advice from top competitors. This specialization is much more information-dense than most machine learning MOOCs. You really get your money's worth.

by Stephane H

Apr 10, 2019

Great course with truly invaluable information, and also the hardest I've ever done; it took me months and a couple hundred hours. The knowledge and experience you gain is incredible, though it is not for the faint of heart.

by Xiukun H

Feb 25, 2019

Great course to learn practical skills. I love the painful final project.

by Голубев К О

Sep 27, 2018

Great course with excellent tutors. The 1C Predict Price Competition is the best InClass Kaggle competition in which I took part.

But, IMHO, there's not enough practice. For example, it would be very useful to practice different validation strategies or stacking of something more than 2 simple models. Also, there are mistakes in the KNN notebook and unclear instructions.

by Aman S

Jun 03, 2019

Teaching style is not engaging at all. I am very confused.

by Mithun G

Jan 14, 2018

Content is really good. But delivery is at times incomprehensible. Assignment questions are also not very clear.

by Milos V

Mar 08, 2019

Very interesting course, and the most practical and useful one. However, lectures are usually too theoretical and super-simple, while assignments are tough and very code oriented. So often there is no real connection between the two (except for Dmitry Altukhov). And the final project is too difficult in the sense that my Alienware 16 RAM was not enough, so I had to go to Google Cloud Platform. Also, I am not sure that anybody who is learning Machine Learning could do the final task in "6 hours", as the runs alone could last for a day...

by Lun Y

May 07, 2019

There are too many things the learner needs to investigate by themselves. We are here to learn, not to guess. And the condition for completing the course is very hard to achieve. I'd say it is not a well designed course, in both its contents and how they are organized.


Dec 23, 2018

This course is just what I was looking for as I am really interested in competitive Machine Learning and data science. Hopefully, I will be able to perform better in competitions from now on.

The only downside I can think of is that the programming assignments are pretty difficult at times, but nonetheless it was a great experience.

by Oleg O

Dec 09, 2018

Very handy course, except I wasn't motivated enough to do home assignments. However, I learned a lot of new concepts.

by robert

Jan 02, 2019

Challenging in a fun way, puts things I've learnt before in a different perspective. Overall very practical knowledge with lots of use-cases and not much theory. It's like an awesome lab in grad school.

by Igor B

Jan 27, 2019

This course requires much time, but gives hardcore experience in practical data science and machine learning. The final project, which is a proving ground for the acquired skills, is both an interesting competition to participate in and a real-world task.

by Diego A G S

Feb 04, 2019

Very good course