Modin dataframes and IBM Cloud Object Storage
Modin is a Python framework that efficiently scales Pandas dataframes. To achieve this, Modin uses Ray, a high-performance distributed framework. This short post explains how to use Modin to read data objects from IBM Cloud Object Storage.

Requirements

IBM Cloud Object Storage account

If you don't have one already, navigate to IBM Cloud and choose IBM Cloud Object Storage. Using the dashboard, create a new bucket and upload some CSV objects to it. You will need to obtain HMAC credentials for the bucket; just follow the simple steps described here.

Python and dependencies

I used Python 3.6, but I assume other versions will work as well. Install the following packages: IBM COS SDK for Python, smart_open (we will use smart_open to access IBM Cloud Object Storage), and modin.

Example

    import modin.pandas as pd
    import ibm_boto3
    import smart_open

    if __name__ == '__main__':
        acces...
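As a rough sketch of how these three packages can fit together (not necessarily the exact code used in this post), the snippet below builds an S3-style URI for an object, creates an ibm_boto3 client from HMAC credentials, and hands the opened stream to Modin. The helper names `cos_object_uri` and `read_csv_from_cos`, as well as the bucket, key, and endpoint values in the usage comment, are hypothetical placeholders. It also assumes smart_open 5 or newer, where a client is passed through `transport_params`; older smart_open releases used a different way to pass credentials.

```python
# Assumed install step: pip install ibm-cos-sdk smart_open modin


def cos_object_uri(bucket: str, key: str) -> str:
    """Build the s3:// style URI that smart_open uses to address an object."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def read_csv_from_cos(bucket, key, access_key, secret_key, endpoint_url):
    """Read one CSV object from IBM Cloud Object Storage into a Modin dataframe."""
    # Imports are deferred so that cos_object_uri can be used on its own.
    import ibm_boto3
    import modin.pandas as pd
    import smart_open

    # HMAC credentials are passed the same way as AWS-style keys.
    client = ibm_boto3.client(
        "s3",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        endpoint_url=endpoint_url,
    )

    # Assumption: smart_open >= 5, which takes the client via transport_params.
    uri = cos_object_uri(bucket, key)
    with smart_open.open(uri, "rb", transport_params={"client": client}) as f:
        return pd.read_csv(f)


# Hypothetical usage; keep the call under the __main__ guard, as in the
# example above:
#
# if __name__ == "__main__":
#     df = read_csv_from_cos("my-bucket", "data/file.csv",
#                            "<access key>", "<secret key>",
#                            "<COS endpoint URL>")
#     print(df.head())
```

Because the object is passed to `read_csv` as an open stream rather than a path, Modin may fall back to the plain Pandas reader for the parsing step; the resulting dataframe is still a Modin dataframe.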