Python and SQL remain the two most common programming skills for data scientists

VS Code is now used by over 50% of working data scientists

Notebooks are a popular environment as well.

Colab notebooks are the most popular cloud-based Jupyter notebook environment

This makes sense, especially since both Colab and Kaggle are owned by Google.

Machine Learning

Scikit-learn is the most popular ML framework, while PyTorch has been growing steadily year over year (Kaggle DS & ML Survey 2022)

LightGBM and XGBoost are also among the top frameworks.

Transformer architectures are becoming more popular for deep learning models on both image and text data

Cloud computing

All major cloud computing providers saw strong year-over-year growth in 2022

Specialized hardware like Tensor Processing Units (TPUs) is gaining initial traction with Kaggle data scientists


