The Loader class is responsible for downloading the MNIST dataset and preparing a DataLoader for it.
The class initializes with a batch size and provides methods to download the MNIST dataset and create a DataLoader object. The DataLoader can then be used to iterate over the dataset in batches.
Attributes:
batch_size (int): The size of the batches to divide the dataset into.
Methods:
__init__(self, batch_size=32): Constructor for the Loader class.
download_mnist(self): Downloads the MNIST dataset and applies transformations.
create_loader(self, mnist_data=None): Creates a DataLoader from the MNIST dataset.
Source code in data_loader.py
# Imports this excerpt relies on, assumed to sit at the top of data_loader.py.
import logging

import joblib
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


class Loader:
    """
    The `Loader` class downloads the MNIST dataset and prepares a DataLoader for it.

    The class is initialized with a batch size and provides methods to download the MNIST dataset and to create a DataLoader object. The DataLoader can then be used to iterate over the dataset in batches.

    ## Attributes:
    - `batch_size` (int): The number of samples in each batch.

    ## Methods:
    - `__init__(self, batch_size=32)`: Constructor for the Loader class.
    - `download_mnist(self)`: Downloads the MNIST dataset and applies transformations.
    - `create_loader(self, mnist_data=None)`: Creates a DataLoader from the MNIST dataset.
    """

    def __init__(self, batch_size=32):
        """
        Initializes the Loader class with a specified batch size.

        ### Parameters:
        - `batch_size` (int): The size of the batches in which the dataset will be split. Default is 32.
        """
        self.batch_size = batch_size

    def download_mnist(self):
        """
        Downloads the MNIST dataset and applies necessary transformations.

        The method applies a composition of transformations to the dataset: converting images to tensors and normalizing them.

        ### Returns:
        - `mnist_data` (Dataset): The MNIST dataset with applied transformations.
        """
        transform = transforms.Compose(
            [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))]
        )
        mnist_data = datasets.MNIST(
            root="./data/raw/",
            train=True,
            # Pass the composed transform so that normalization is actually applied.
            transform=transform,
            download=True,
        )
        return mnist_data

    def create_loader(self, mnist_data=None):
        """
        Creates a DataLoader from the provided MNIST dataset.

        The method initializes a DataLoader with the given dataset and batch size. It also handles the saving of the DataLoader object using joblib. Any exceptions during DataLoader creation or saving are logged.

        ### Parameters:
        - `mnist_data` (Dataset, optional): The MNIST dataset to be loaded into the DataLoader. If not provided, an error is logged and no DataLoader is created.

        ### Side Effects:
        - Creates a DataLoader object and saves it to a file.
        - Logs exceptions if the DataLoader creation or saving fails.
        """
        if mnist_data is not None:
            try:
                # Creation sits inside the try block so that its failures are logged too.
                dataloader = DataLoader(
                    mnist_data, batch_size=self.batch_size, shuffle=True
                )
                logging.info("Saving dataloader".capitalize())
                joblib.dump(
                    value=dataloader, filename="./data/processed/dataloader.pkl"
                )
            except Exception:
                logging.exception(
                    "Error occurred while creating or saving dataloader".capitalize()
                )
        else:
            # No exception is active here, so log a plain error rather than a traceback.
            logging.error("No data provided".capitalize())
__init__(batch_size=32)
Initializes the Loader class with a specified batch size.
Parameters:
batch_size (int): The size of the batches in which the dataset will be split. Default is 32.
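To make the effect of `batch_size` concrete, here is a minimal pure-Python sketch of how a dataset is split into batches. It is not the PyTorch implementation: `iter_batches` and `num_batches` are hypothetical helpers written for this illustration, and unlike `create_loader` the sketch does not shuffle. It does mirror the DataLoader default of keeping the final, smaller batch.

```python
import math


def iter_batches(items, batch_size=32):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def num_batches(n_samples, batch_size=32):
    """Number of batches when the last, partial batch is kept."""
    return math.ceil(n_samples / batch_size)


# 100 samples with batch_size=32 give three full batches and one batch of 4.
samples = list(range(100))
batches = list(iter_batches(samples, batch_size=32))
```

With the default batch size of 32, the 60,000 images of the MNIST training split would be covered by `num_batches(60000)` batches per epoch.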
Source code in data_loader.py
def __init__(self, batch_size=32):
    """
    Initializes the Loader class with a specified batch size.

    ### Parameters:
    - `batch_size` (int): The size of the batches in which the dataset will be split. Default is 32.
    """
    self.batch_size = batch_size
create_loader(mnist_data=None)
Creates a DataLoader from the provided MNIST dataset.
The method initializes a DataLoader with the given dataset and batch size. It also handles the saving of the DataLoader object using joblib. Any exceptions during DataLoader creation or saving are logged.
Parameters:
mnist_data (Dataset, optional): The MNIST dataset to be loaded into the DataLoader. If not provided, an error is logged and no DataLoader is created.
Side Effects:
- Creates a DataLoader object and saves it to a file.
- Logs exceptions if the DataLoader creation or saving fails.
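The control flow described above (check the input, try to save, log any failure) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in data_loader.py: `pickle` stands in for joblib, a plain list stands in for the DataLoader, and `save_object` and `load_object` are hypothetical names. The sketch also creates the target directory first, which the original method does not do.

```python
import logging
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    """Pickle `obj` to `path`; return True on success, False otherwise."""
    if obj is None:
        # No exception is active here, so a plain error message is enough.
        logging.error("No data provided")
        return False
    path = Path(path)
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("wb") as fh:
            pickle.dump(obj, fh)
    except Exception:
        # logging.exception records the message plus the active traceback.
        logging.exception("Error occurred while saving object")
        return False
    return True


def load_object(path):
    """Read back an object written by save_object."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "processed" / "batches.pkl"
    saved = save_object([[1, 2], [3]], target)
    restored = load_object(target)
    skipped = save_object(None, target)
```

Returning a status flag is a choice made for this sketch; `create_loader` itself returns nothing and reports problems only through the log.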
Source code in data_loader.py
def create_loader(self, mnist_data=None):
    """
    Creates a DataLoader from the provided MNIST dataset.

    The method initializes a DataLoader with the given dataset and batch size. It also handles the saving of the DataLoader object using joblib. Any exceptions during DataLoader creation or saving are logged.

    ### Parameters:
    - `mnist_data` (Dataset, optional): The MNIST dataset to be loaded into the DataLoader. If not provided, an error is logged and no DataLoader is created.

    ### Side Effects:
    - Creates a DataLoader object and saves it to a file.
    - Logs exceptions if the DataLoader creation or saving fails.
    """
    if mnist_data is not None:
        try:
            # Creation sits inside the try block so that its failures are logged too.
            dataloader = DataLoader(
                mnist_data, batch_size=self.batch_size, shuffle=True
            )
            logging.info("Saving dataloader".capitalize())
            joblib.dump(
                value=dataloader, filename="./data/processed/dataloader.pkl"
            )
        except Exception:
            logging.exception(
                "Error occurred while creating or saving dataloader".capitalize()
            )
    else:
        # No exception is active here, so log a plain error rather than a traceback.
        logging.error("No data provided".capitalize())
download_mnist()
Downloads the MNIST dataset and applies necessary transformations.
The method applies a composition of transformations to the dataset: converting images to tensors and normalizing them.
Returns:
mnist_data (Dataset): The MNIST dataset with applied transformations.
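The two transformations named above amount to simple per-pixel arithmetic. The sketch below reproduces that arithmetic in plain Python for a single 8-bit grayscale value; it is not torchvision code, and `to_unit_range` and `normalize` are hypothetical helper names. It assumes the usual behavior of `ToTensor` (scaling 0..255 to 0..1) and `Normalize` (subtracting the mean, then dividing by the standard deviation), with the mean and standard deviation of 0.5 used in `download_mnist`.

```python
def to_unit_range(pixel):
    """Scale an 8-bit pixel value (0..255) to the range 0..1."""
    return pixel / 255.0


def normalize(value, mean=0.5, std=0.5):
    """Shift and scale a value: (value - mean) / std."""
    return (value - mean) / std


# Black and white pixels end up at the two ends of the range -1..1.
darkest = normalize(to_unit_range(0))
brightest = normalize(to_unit_range(255))
```

With these parameters every pixel lands in the range -1 to 1, centered on zero.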
Source code in data_loader.py
def download_mnist(self):
    """
    Downloads the MNIST dataset and applies necessary transformations.

    The method applies a composition of transformations to the dataset: converting images to tensors and normalizing them.

    ### Returns:
    - `mnist_data` (Dataset): The MNIST dataset with applied transformations.
    """
    transform = transforms.Compose(
        [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))]
    )
    mnist_data = datasets.MNIST(
        root="./data/raw/",
        train=True,
        # Pass the composed transform so that normalization is actually applied.
        transform=transform,
        download=True,
    )
    return mnist_data