Loader

The Loader class is responsible for downloading the MNIST dataset and preparing a DataLoader for it.

The class initializes with a batch size and provides methods to download the MNIST dataset and create a DataLoader object. The DataLoader can then be used to iterate over the dataset in batches.

Attributes:

  • batch_size (int): The size of the batches to divide the dataset into.

Methods:

  • __init__(self, batch_size=32): Constructor for the Loader class.
  • download_mnist(self): Downloads the MNIST dataset and applies transformations.
  • create_loader(self, mnist_data=None): Creates a DataLoader from the MNIST dataset.
Source code in data_loader.py
import logging

import joblib
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


class Loader:
    """
    The `Loader` class is responsible for downloading the MNIST dataset and preparing a DataLoader for it.

    The class initializes with a batch size and provides methods to download the MNIST dataset and create a DataLoader object. The DataLoader can then be used to iterate over the dataset in batches.

    ## Attributes:
    - `batch_size` (int): The size of the batches to divide the dataset into.

    ## Methods:
    - `__init__(self, batch_size=32)`: Constructor for the Loader class.
    - `download_mnist(self)`: Downloads the MNIST dataset and applies transformations.
    - `create_loader(self, mnist_data=None)`: Creates a DataLoader from the MNIST dataset.
    """

    def __init__(self, batch_size=32):
        """
        Initializes the Loader class with a specified batch size.

        ### Parameters:
        - `batch_size` (int): The size of the batches in which the dataset will be split. Default is 32.
        """
        self.batch_size = batch_size

    def download_mnist(self):
        """
        Downloads the MNIST dataset and applies the necessary transformations.

        The method applies a composition of transformations to the dataset: converting images to tensors and normalizing them.

        ### Returns:
        - `mnist_data` (Dataset): The MNIST dataset with applied transformations.
        """
        transform = transforms.Compose(
            [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))]
        )
        mnist_data = datasets.MNIST(
            root="./data/raw/",
            train=True,
            transform=transform,  # use the composed transform, not a bare ToTensor()
            download=True,
        )
        return mnist_data

    def create_loader(self, mnist_data=None):
        """
        Creates a DataLoader from the provided MNIST dataset.

        The method initializes a DataLoader with the given dataset and batch size, then persists the DataLoader object with joblib. Any exception raised while saving is logged.

        ### Parameters:
        - `mnist_data` (Dataset, optional): The MNIST dataset to be loaded into the DataLoader. If not provided, an error is logged.

        ### Side Effects:
        - Creates a DataLoader object and saves it to `./data/processed/dataloader.pkl`.
        - Logs an exception if saving the DataLoader fails.
        """
        if mnist_data is not None:
            dataloader = DataLoader(
                mnist_data, batch_size=self.batch_size, shuffle=True
            )
            try:
                logging.info("Saving dataloader")
                joblib.dump(
                    value=dataloader, filename="./data/processed/dataloader.pkl"
                )
            except Exception:
                logging.exception("Error occurred while saving dataloader")
        else:
            logging.error("No data provided")

__init__(batch_size=32)

Initializes the Loader class with a specified batch size.

Parameters:
  • batch_size (int): The size of the batches in which the dataset will be split. Default is 32.
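
The batch size determines how many batches one pass over the data yields: a DataLoader produces `ceil(len(dataset) / batch_size)` batches per epoch. A quick sanity check, assuming the standard 60,000-image MNIST training split:

```python
import math

mnist_train_size = 60_000  # size of the standard MNIST training split
batch_size = 32            # the Loader default

# A DataLoader yields ceil(len(dataset) / batch_size) batches per epoch
num_batches = math.ceil(mnist_train_size / batch_size)
print(num_batches)  # 1875 -- 60,000 divides evenly by 32
```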
Source code in data_loader.py
def __init__(self, batch_size=32):
    """
    Initializes the Loader class with a specified batch size.

    ### Parameters:
    - `batch_size` (int): The size of the batches in which the dataset will be split. Default is 32.
    """
    self.batch_size = batch_size

create_loader(mnist_data=None)

Creates a DataLoader from the provided MNIST dataset.

The method initializes a DataLoader with the given dataset and batch size, then persists the DataLoader object with joblib. Any exception raised while saving is logged.

Parameters:
  • mnist_data (Dataset, optional): The MNIST dataset to be loaded into the DataLoader. If not provided, an error is logged.
Side Effects:
  • Creates a DataLoader object and saves it to ./data/processed/dataloader.pkl.
  • Logs an exception if saving the DataLoader fails.
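
The persisted file can later be restored with `joblib.load`. `joblib.dump`/`joblib.load` follow the same save-and-reload pattern as the standard library's `pickle`; a minimal round-trip sketch, using `pickle` as a dependency-free stand-in for joblib and a dict as a stand-in for the DataLoader object:

```python
import os
import pickle
import tempfile

payload = {"batch_size": 32, "shuffle": True}  # stand-in for the DataLoader

path = os.path.join(tempfile.mkdtemp(), "dataloader.pkl")
with open(path, "wb") as f:
    pickle.dump(payload, f)    # analogous to joblib.dump(value=..., filename=...)
with open(path, "rb") as f:
    restored = pickle.load(f)  # analogous to joblib.load(filename=...)

print(restored == payload)  # True
```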
Source code in data_loader.py
def create_loader(self, mnist_data=None):
    """
    Creates a DataLoader from the provided MNIST dataset.

    The method initializes a DataLoader with the given dataset and batch size, then persists the DataLoader object with joblib. Any exception raised while saving is logged.

    ### Parameters:
    - `mnist_data` (Dataset, optional): The MNIST dataset to be loaded into the DataLoader. If not provided, an error is logged.

    ### Side Effects:
    - Creates a DataLoader object and saves it to `./data/processed/dataloader.pkl`.
    - Logs an exception if saving the DataLoader fails.
    """
    if mnist_data is not None:
        dataloader = DataLoader(
            mnist_data, batch_size=self.batch_size, shuffle=True
        )
        try:
            logging.info("Saving dataloader")
            joblib.dump(
                value=dataloader, filename="./data/processed/dataloader.pkl"
            )
        except Exception:
            logging.exception("Error occurred while saving dataloader")
    else:
        logging.error("No data provided")

download_mnist()

Downloads the MNIST dataset and applies necessary transformations.

The method applies a composition of transformations to the dataset: converting images to tensors and normalizing them.

Returns:
  • mnist_data (Dataset): The MNIST dataset with applied transformations.
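
`Normalize((0.5,), (0.5,))` applies `(x - mean) / std` per channel, which maps `ToTensor`'s `[0, 1]` pixel range onto `[-1, 1]`. A scalar sketch of that formula:

```python
def normalize(x, mean=0.5, std=0.5):
    """Scalar version of torchvision's Normalize: (x - mean) / std."""
    return (x - mean) / std

# ToTensor scales pixels into [0, 1]; Normalize((0.5,), (0.5,)) maps that to [-1, 1]
print(normalize(0.0))  # -1.0
print(normalize(1.0))  # 1.0
```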
Source code in data_loader.py
def download_mnist(self):
    """
    Downloads the MNIST dataset and applies the necessary transformations.

    The method applies a composition of transformations to the dataset: converting images to tensors and normalizing them.

    ### Returns:
    - `mnist_data` (Dataset): The MNIST dataset with applied transformations.
    """
    transform = transforms.Compose(
        [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))]
    )
    mnist_data = datasets.MNIST(
        root="./data/raw/",
        train=True,
        transform=transform,  # use the composed transform, not a bare ToTensor()
        download=True,
    )
    return mnist_data