The KCAU Library

Mastering large datasets with Python : parallelize and distribute your Python code / J.T. Wolohan.

By: J.T. Wolohan
Material type: Text
Publication details: Shelter Island, New York : Manning, 2019.
Description: xx, 289 pages : illustrations ; 24 cm
ISBN:
  • 9781617296239
Other title:
  • Large datasets with Python
LOC classification:
  • QA76.73.P98 W65 2019
Summary: Programming techniques that work well on laptop-sized data can slow to a crawl, or fail altogether, when applied to massive files or distributed datasets. By mastering the map and reduce paradigm, along with the Python-based tools that support it, you can write data-centric applications that scale efficiently without requiring codebase rewrites as your requirements change. "Mastering large datasets with Python" teaches you to write code that can handle datasets of any size. You'll start with laptop-sized datasets that teach you to parallelize data analysis by breaking large tasks into smaller ones that can run simultaneously. You'll then scale those same programs to industrial-sized datasets on a cluster of cloud servers. With the map and reduce paradigm firmly in place, you'll explore tools like Hadoop and PySpark to efficiently process massive distributed datasets, speed up decision-making with machine learning, and simplify your data storage with AWS S3.
Holdings
  • Item type: Main Short; Current library: Martin Oduor-Otieno Library (this item is located on the library ground floor); Collection: Non-fiction; Call number: QA76.73.P98 W65 2019; Vol info: 31692/24; Status: Available; Barcode: MOOL24030011
  • Item type: Main Short; Current library: Martin Oduor-Otieno Library (this item is located on the library ground floor); Collection: Non-fiction; Call number: QA76.73.P98 W65 2019; Vol info: 31693/24; Status: Available; Barcode: MOOL24030012

Includes index.

KCAU Library,
KCA University,
Thika Road Ruaraka
P. O. Box 56808 – 00200 Nairobi, Kenya