Also known as Google Colaboratory, Google Colab is a free Jupyter Notebook environment that runs entirely in Google Cloud. Moreover, it has many libraries already installed to manipulate data and train Machine Learning models, even using the cloud machine’s GPU.
That is to say, it’s a powerful tool created by Google to help spread Machine Learning practice and collaboration.
To illustrate how this amazing tool works, it’s important to first talk about IPython and Jupyter.
1. IPython
IPython is a tool that adds interactive features to the Python programming language. For example, it provides syntax highlighting, extra syntax for running shell commands, and autocompletion.
In addition, with IPython, it’s possible to write scripts in its interactive shell and develop applications using parallel computing.
IPython was born back in 2001 to help scientists do calculations and computing interactively. As a result of its academic beginnings, it’s very common to find it used alongside scientific libraries like SciPy.
2. Jupyter Notebooks
Jupyter Notebook is a web-based environment for running code interactively.
It’s a spin-off of IPython, which implemented a similar environment called IPython Notebook.
Jupyter was built to be used with several programming languages like R, C, C++ and, of course, Python. In order to do that, it uses a kernel for each language; for example, the Python kernel is IPython.
Most importantly, it was created to help Data Scientists do their work in a simple web environment, allowing them to create reports at the same time they are coding.
3. Google Colaboratory
Colaboratory is a product run by Google Research. It is based on Jupyter Notebooks for Python, and it’s free to use for almost all the learning tasks that a beginner Data Scientist carries out.
Of course, nothing is entirely free: the product’s resource limits are not guaranteed and can change without notice. If you need more processing power, you can pay for Colab Pro.
3.1 Code cells
The basic unit of Colab is a cell, which is just a space to write and execute code.
For example, a first cell can import the pandas library, and a second one can read a CSV file and show the first five rows of the DataFrame using the head() method.
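A minimal sketch of those two cells could look like the following. The CSV content, the column names and the variable names are hypothetical, and the data is inlined with io.StringIO only so that the example runs without an external file; in a real notebook you would pass a file path or URL to read_csv instead.

```python
# Cell 1: import the libraries
import io
import pandas as pd

# Cell 2: read a CSV and show the first five rows
# Hypothetical CSV content, inlined so the sketch is self-contained.
csv_text = """name,age,city
Ana,34,Lima
Luis,28,Quito
Marta,45,Bogota
Pedro,39,Santiago
Sofia,23,Madrid
Juan,51,Lisbon
"""

df = pd.read_csv(io.StringIO(csv_text))  # parse the CSV into a DataFrame
first_rows = df.head()                   # head() returns the first five rows by default
print(first_rows)
```

In a notebook, leaving df.head() as the last expression of a cell is enough to display the table, so the print call is only needed when running this as a plain script.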
Each cell can be executed independently; however, the code flows from top to bottom, which means you have to run the cells in that order to avoid errors.
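To see why the order matters, here is a small simulation in plain Python. It is only a sketch: the strings stand in for notebook cells, and the run_cells helper is a hypothetical function written for this illustration, not part of Colab or Jupyter. All cells share one namespace, so a cell that uses a variable fails if the cell that defines it has not run yet.

```python
# Each string stands in for one notebook cell.
cell_1 = "total = 10"
cell_2 = "result = total * 2"


def run_cells(cells):
    """Run the given cells in order, sharing one fresh namespace."""
    namespace = {}
    for source in cells:
        exec(source, namespace)
    return namespace


# Running the cells top to bottom works.
in_order = run_cells([cell_1, cell_2])
print(in_order["result"])  # → 20

# Running the second cell first fails: 'total' does not exist yet.
try:
    run_cells([cell_2, cell_1])
    out_of_order_error = None
except NameError as err:
    out_of_order_error = err
print(type(out_of_order_error).__name__)  # → NameError
```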
The execution shortcut is Command/Control + Enter; if you prefer to use the mouse, you can click the play button on the left side of the cell.
3.2 Rich text
One of the most important characteristics of Jupyter Notebooks and Colab is that they allow you to combine executable code and rich text in the same document. Consequently, you can use images, HTML, LaTeX or Markdown.
3.3 Data Science and Machine Learning
Colab was built for Data Science and Machine Learning practitioners, so you can use the power of Python and its libraries to extract, transform, load or analyze data.
Here is a small list of the main Python libraries for DS and ML:
- Scikit Learn
I will write an article about these libraries later.
In conclusion, Google Colab is a powerful tool where everyone can execute Python code interactively in a web browser and create beautiful reports with live code. Best of all, it’s not necessary to install anything: just open a new notebook and start to code.
You can find the official documentation and tutorials at https://colab.research.google.com/