Colaboratory (Colab) is a Google initiative intended to help promote machine learning in education and science. It is a Jupyter notebook environment that needs very little configuration and runs entirely in the cloud.
Step 1: Install the dependencies
Let's start by installing the dependencies. In a Colab cell, type
!pip install kaggle
and run it. The cell output shows the Kaggle package being installed.
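To confirm the install worked before moving on, a quick check along these lines may help. This is only a sketch: is_installed is a hypothetical helper name, not something the Kaggle package provides.

```python
import importlib.util

def is_installed(package_name):
    """Return True if a top-level package can be found by the import system."""
    # find_spec only locates the package; it does not import it
    return importlib.util.find_spec(package_name) is not None

print("kaggle installed:", is_installed("kaggle"))
```

If this prints False, re-run the install cell before continuing.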
Step 2: Get the Kaggle API token
Next, sign in to your Kaggle account and open the “My Account” settings. There, click “Create New API Token”. This downloads the kaggle.json file to your machine.
Step 3: Import the Kaggle API token into the Google Colab notebook
Once that is done, come back to your Google Colab notebook and type
from google.colab import files
files.upload()
and execute the cell. A file dialog appears below the cell. Click “Choose Files” and select the kaggle.json file you downloaded earlier. The upload progress is shown in the cell output.
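Before copying the token anywhere, it can be worth checking that the uploaded file really is a Kaggle token. The sketch below assumes the token is a JSON object with "username" and "key" fields, which is how kaggle.json files are typically laid out. check_token is a hypothetical helper, and the sample values are made up.

```python
import json

def check_token(raw_bytes):
    """Return the username if the bytes look like a Kaggle token, else raise ValueError."""
    try:
        token = json.loads(raw_bytes.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as err:
        raise ValueError("token is not valid JSON") from err
    if not isinstance(token, dict):
        raise ValueError("token is not a JSON object")
    # Assumption: a usable token carries both of these fields
    missing = [field for field in ("username", "key") if field not in token]
    if missing:
        raise ValueError("token is missing: " + ", ".join(missing))
    return token["username"]

# In Colab this would be called on the upload result, for example:
#   uploaded = files.upload()
#   check_token(uploaded["kaggle.json"])
sample = b'{"username": "example_user", "key": "0123456789abcdef"}'
print(check_token(sample))
```

The upload call returns a dictionary that maps each file name to its contents as bytes, which is why the helper takes bytes rather than a path.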
Once the upload reaches 100%, the next commands are
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle
!chmod 600 ~/.kaggle/kaggle.json
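Execute them. They create the ~/.kaggle folder, copy the token into it, and restrict the file's permissions so that only your user can read and write it.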
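For readers who prefer to stay in Python, the same three steps can be sketched with the standard library. install_token is a hypothetical helper written for this illustration, not part of the Kaggle tooling.

```python
import os
import shutil
import stat

def install_token(token_path, config_dir):
    """Copy the token into config_dir and restrict it to owner read/write."""
    os.makedirs(config_dir, exist_ok=True)           # like: mkdir -p
    target = os.path.join(config_dir, "kaggle.json")
    shutil.copyfile(token_path, target)              # like: cp
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)    # like: chmod 600
    return target

# In Colab, after the upload step:
#   install_token("kaggle.json", os.path.expanduser("~/.kaggle"))
```

The shell commands are shorter, so this version is mainly useful when the setup needs to live inside a larger Python function.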
Step 4: Get your dataset API command
To import a Kaggle dataset, go to the Kaggle website and search for the dataset you want. I am working with a fruit image dataset, so I selected the Fruits 360 dataset; choose one that fits your own requirements. Kaggle hosts a huge number of datasets. Just make sure the dataset is smaller than the storage available in your Google Colab session. My available storage is 70 GB, so I chose a dataset smaller than 70 GB. On the dataset's Kaggle page, click “Copy API command”. Then go back to your Google Colab notebook, paste the copied command, and add an exclamation mark in front of it. For me it is
!kaggle datasets download -d moltean/fruits
Next, execute this command.
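Since the dataset has to fit in the available storage, it may help to check the free space from the notebook before downloading. This is a minimal sketch using the standard library; has_room is a hypothetical helper, and the required size is something you read off the dataset page yourself.

```python
import shutil

def has_room(required_bytes, path="."):
    """Return True if the filesystem holding path has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes

# Report the free space on the current working directory's filesystem
free_gb = shutil.disk_usage(".").free / 1e9
print(f"Free space: {free_gb:.1f} GB")
```

Keep in mind that the zip file and its extracted contents will both sit on disk for a while, so leave some headroom beyond the download size.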
Step 5: Extract the dataset
Once that finishes, the Kaggle dataset has been downloaded to your Google Colab session. Refresh the file browser in the left sidebar to see it. The dataset is in zip format, so to work with it we need to extract the data from the zip file.
To extract it, paste the code below and change the file name to match your dataset:
from zipfile import ZipFile
file_name = "fruits.zip"  # Your dataset name
with ZipFile(file_name, 'r') as zip:
    zip.extractall()  # Extract everything into the current folder
    print('Done')
This code extracts the dataset from the zip file. After executing it you will have to wait a while, depending on the size of your dataset. When extraction finishes, the cell output shows “Done”. Refresh the file browser again to see the extracted dataset in your Google Colab session.
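If you would rather extract into a separate folder and get a quick sanity check, a variant like the following may be useful. extract_dataset is a hypothetical helper, and the destination folder in the usage comment is only an example.

```python
from zipfile import ZipFile

def extract_dataset(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the number of files extracted."""
    with ZipFile(zip_path, "r") as archive:
        archive.extractall(dest_dir)
        # Directory entries end with "/", so count only real files
        return sum(1 for name in archive.namelist() if not name.endswith("/"))

# In Colab, for example:
#   print(extract_dataset("fruits.zip", "/content/fruits"))
```

Comparing the returned count with the number of files listed on the dataset page is a cheap way to spot an incomplete download.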
So, you have now learnt how to import a Kaggle dataset into a Google Colab notebook.
Step 6: Work with the dataset (Optional)
Now you are ready to work with your Kaggle dataset in Google Colab. As a reference for how to work with it, here is a pre-processing operation on my fruit image dataset: I calculate the average color value of each image and store it in a variable, and I do this for every image in the folder.
First things first:
import cv2, os, glob
Write this in a cell and execute it. After that, get the path of your image dataset folder. To do so, go to the dataset folder in the file browser and right-click it. Choose the “Copy path” option to copy the dataset path.
Once you have the dataset's path, run the code below to get your result.
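img_dir = "/content/fruits-360/Test/Apple Braeburn"  # Enter directory of all images
data_path = os.path.join(img_dir, '*g')  # Matches file names ending in "g" (.jpg, .jpeg, .png)
files = glob.glob(data_path)
data = []
for f1 in files:
    img = cv2.imread(f1)[:, :, ::-1]  # OpenCV loads BGR; reverse the channels to get RGB
    average = img.mean(axis=0).mean(axis=0)  # Average color over all pixels
    data.append(average)  # Store the result for this image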
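The '*g' pattern in the code above matches any file name that ends in the letter g, so it picks up .jpg, .jpeg and .png files but would also catch something like an .svg file, and it would miss upper-case extensions on a case-sensitive filesystem. If that matters for your dataset, a stricter listing could look like the sketch below. list_images and its default extension list are assumptions made for this illustration.

```python
import glob
import os

def list_images(img_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted paths in img_dir whose extension is in extensions (case-insensitive)."""
    # glob.escape protects against special characters in the folder name
    paths = glob.glob(os.path.join(glob.escape(img_dir), "*"))
    return sorted(p for p in paths if os.path.splitext(p)[1].lower() in extensions)

# In Colab, for example:
#   files = list_images("/content/fruits-360/Test/Apple Braeburn")
```

The returned list can be used in place of the files variable in the loop above.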
Thanks for reading. Feel free to like and share.