Examples and Resources

Here you can find additional resources for learning about Modin. For more advanced usage, please refer to the :doc:Usage Guide </usage_guide/index> section.

Usage Examples
''''''''''''''

The following notebooks demonstrate how Modin can be used for scalable data science:

  • Quickstart Guide to Modin [Source <https://github.com/modin-project/modin/tree/main/examples/quickstart.ipynb>__]
  • Using Modin with the NYC Taxi Dataset [Source <https://github.com/modin-project/modin/blob/main/examples/jupyter/Modin_Taxi.ipynb>__]
  • Modin for Machine Learning with scikit-learn [Source <https://github.com/modin-project/modin/blob/main/examples/modin-scikit-learn-example.ipynb>__]

Tutorials
'''''''''

The following tutorials cover the basic usage of Modin. Here <https://www.youtube.com/watch?v=NglkafEmbhE>__ is a one-hour video tutorial that walks through these basic exercises.

  • Exercise 1: Introduction to Modin [Source PandasOnRay <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_ray/local/exercise_1.ipynb>, Source PandasOnDask <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_dask/local/exercise_1.ipynb>]
  • Exercise 2: Speed Improvements with Modin [Source PandasOnRay <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_ray/local/exercise_2.ipynb>, Source PandasOnDask <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_dask/local/exercise_2.ipynb>]
  • Exercise 3: Defaulting to pandas with Modin [Source PandasOnRay <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_ray/local/exercise_3.ipynb>, Source PandasOnDask <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_dask/local/exercise_3.ipynb>]
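
For orientation before opening the notebooks, basic usage generally amounts to replacing the pandas import with ``modin.pandas``. The following is a minimal sketch, not an excerpt from the tutorials: it assumes Modin and an execution engine such as Ray or Dask are installed, and the sample data and the ``summarize`` helper are hypothetical. It falls back to plain pandas so that it still runs where Modin is unavailable.

```python
# Illustrative sketch only; the data and helper below are hypothetical.
try:
    import modin.pandas as pd  # drop-in replacement for the pandas API
except ImportError:
    import pandas as pd  # fallback so the sketch runs without Modin


def summarize(frame):
    """Return the mean fare for each passenger count."""
    return frame.groupby("passenger_count")["fare"].mean()


# Small made-up dataset standing in for a real file read with pd.read_csv.
df = pd.DataFrame(
    {"passenger_count": [1, 1, 2, 2], "fare": [10.0, 14.0, 8.0, 12.0]}
)

result = summarize(df)
print(result)
```

Because only the import line changes, the rest of the code is written exactly as it would be for pandas.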

The following tutorials cover more advanced features in Modin:

  • Exercise 4: Experimental Features in Modin (Spreadsheet, Progress Bar) [Source PandasOnRay <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_ray/local/exercise_4.ipynb>, Source PandasOnDask <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_dask/local/exercise_4.ipynb>]
  • Exercise 5: Setting up Modin in a Cluster Environment [Source PandasOnRay <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_ray/cluster/exercise_5.ipynb>__]
  • Exercise 6: Running Modin in a Cluster Environment [Source PandasOnRay <https://github.com/modin-project/modin/blob/main/examples/tutorial/jupyter/execution/pandas_on_ray/cluster/exercise_6.ipynb>__]

To learn how to install the required dependencies for the tutorial notebooks and how to run them, please refer to the respective README.md <https://github.com/modin-project/modin/tree/main/examples/tutorial/jupyter/README.md>__ file.

Talks & Podcasts
''''''''''''''''

  • Scaling Interactive Data Science with Modin and Ray <https://www.youtube.com/watch?v=ycSf1IbBGWk>_ (20 minute, Ray Summit 2021)
  • Unleash The Power Of Dataframes At Any Scale With Modin <https://www.pythonpodcast.com/modin-parallel-dataframe-episode-324/>_ (40 minute, Python Podcast 2021)
  • [Russian] Distributed Data Processing and XGBoost Training and Prediction with Modin <https://www.youtube.com/watch?v=oo_lxUjsFTM&t=1s>_ (30 minute, PyCon Russia 2021)
  • [Russian] Efficient Data Science with Modin <https://www.youtube.com/watch?v=cOM82kHRwkM&t=6568s>_ (30 minute, ISP RAS Open 2021)
  • Modin: Scaling the Capabilities of the Data Scientist, not the Machine <https://www.youtube.com/watch?v=NglkafEmbhE>_ (1 hour, RISE Camp 2020)
  • Modin: Pandas Scalability with Devin Petersohn <https://softwareengineeringdaily.com/2020/07/23/modin-pandas-scalability-with-devin-petersohn/>_ (1 hour, Software Engineering Daily Podcast 2020)
  • Introduction to the DataFrame and Modin <https://www.youtube.com/watch?v=_0eVVLXrtfY>_ (20 minute, RISE Camp 2019)
  • Scaling Interactive Pandas Workflows with Modin <https://www.youtube.com/watch?v=-HjLd_3ahCw>_ (40 minute, PyData NYC 2018)

Community contributions
'''''''''''''''''''''''

Here are some blog posts and articles about Modin:

  • Anaconda Blog: Scale your pandas workflow with Modin by Vasilij Litvinov <https://www.anaconda.com/blog/scale-your-pandas-workflow-with-modin>_
  • The Modin view of Scaling Pandas by Devin Petersohn <https://towardsdatascience.com/the-modin-view-of-scaling-pandas-825215533122>_
  • Data Science at Scale with Modin by Areg Melik-Adamyan <https://medium.com/intel-analytics-software/data-science-at-scale-with-modin-5319175e6b9a>_
  • Speed up Pandas using Modin by Eric D. Brown, D.Sc. <https://pythondata.com/quick-tip-speed-up-pandas-using-modin/>_
  • Explore Python Libraries: Make Your DataFrames Parallel With Modin by Zachary Bennett <https://www.pluralsight.com/guides/explore-python-libraries:-make-your-dataframes-parallel-with-modin>_
  • Get faster pandas with Modin, even on your laptops by Parul Pandey <https://towardsdatascience.com/get-faster-pandas-with-modin-even-on-your-laptops-b527a2eeda74>_
  • How to speedup pandas by changing one line of code by Shrivarsheni <https://www.machinelearningplus.com/python/modin-speedup-pandas/>_
  • How To Accelerate Pandas With Just One Line Of Code by Analytics India <https://analyticsindiamag.com/how-to-accelerate-pandas-with-just-one-line-of-code-modin/>_
  • An Easy Introduction to Modin: A Step-by-Step Guide to Accelerating Pandas by Intel <https://www.intel.com/content/www/us/en/developer/articles/technical/modin-step-by-step-guide-to-accelerating-pandas.html#gs.c69er5>_

Here are some articles contributed by the international community:

  • [Chinese] 用 Modin 来提速 pandas 工作流程 by Python Chinese Community <https://blog.csdn.net/BF02jgtRS00XKtCx/article/details/90709222>_
  • [German] Was ist Modin? by Dipl.-Ing. (FH) Stefan Luber <https://www.bigdata-insider.de/was-ist-modin-a-982826/>_
  • [Russian] Ускоряем Pandas при помощи модуля modin by Разработка <https://vc.ru/dev/187095-uskoryaem-pandas-pri-pomoshchi-modulya-modin>_
  • [Korean] modin 으로 pandas 더 빠르게 사용하기 by 분석뉴비 <https://data-newbie.tistory.com/279>_

If you would like your articles to be featured here, please submit a pull request <https://github.com/modin-project/modin/pulls>_ to let us know!