Frank Neugebauer

Google Colab — First Impressions

From a Microsoft ML Studio and Visual Studio Code Fan

Photo by Mitchell Luo on Unsplash

The Quick

I’m going to cut right to the quick of it — while I really like Azure ML Studio combined with (or without) Visual Studio Code for use with Python Jupyter Notebooks, if you just look at the steps to get up and running, Google Colab (Colaboratory) is hands-down the easier to use.

Assuming you have a Gmail (Colab) or Outlook (ML Studio) account, the steps to create a Jupyter Notebook in ML Studio are:

  1. Create an Azure account
  2. Create an Azure ML Studio resource in Azure
  3. Create a Compute Instance in Azure
  4. Start the Compute Instance
  5. Code away

Compare that to the steps required for Google Colab, and you’ll see why it’s so much easier:

  1. Navigate to https://colab.research.google.com/
  2. Code away
Image by Author

Yeah, that’s all there is to it.

First Take and Tour

Super simple start-up aside, here are my impressions of Google Colab from start to finish.

When you launch Colab for the first time, a default “What is Colaboratory” notebook is opened. On its own, it’s very informative and gives some clues about how a Colab Notebook can be structured.

Image by Author

A few things really caught my attention with this notebook:

  • Notice I’m logged in with my Gmail account (top right)
  • As such, I can copy the notebook to Google Drive (“Copy to Drive”)
  • There’s the option to create a table of contents
  • The right pane looks like what I’m familiar with in a notebook

I naturally double-clicked the first cell to see if it works like my Visual Studio Code Notebooks and it does — and it doesn’t. I can edit the markdown cell as I expected but I get not only the code but a preview (very useful).

Image by Author

You’ll notice that the markdown for this cell is all HTML. I’ll do something with pure markdown (or Text, as it’s called in Colab) when I build my own later on, but the interesting thing is that preview on the right. Well played, Google.

My First Notebook

Creating a notebook is simple enough; I just selected File → New notebook. The result is a Colab Notebook, which is very much like a Jupyter Notebook with some extra features. Here’s the blank notebook that appears after creating it.

Initial Notebook — Image by Author

If you look at the icons on the far-left side, they give a glimpse into the Colab Notebook functionality.

  • There’s a table of contents icon (I selected that before taking that last screenshot, which is why the TOC is visible)
  • There’s a search (which has find and replace)
  • The “<>” icon is pretty sweet; it’s the “Code snippets” option and selecting it reveals a very nice feature of Colab (which reminds me of RStudio)
Image by Author

That’s pretty amazing.

  • The next icon is a file icon, which allows you to navigate the Colab file structure, mount a Google Drive, or upload files to the session storage for the Colab
  • The next icon is the Command palette, another really well-done feature of Colab Notebooks. Selecting this icon gives me a bunch of things I can do (it’s a bit like the Visual Studio Code Command Palette):
Command Palette Snippet — Image by Author

The image only shows a very small snippet of the commands available.

  • Finally, there’s a terminal. If you’re using the free Colab, you get a message that says you need to have Colab Pro to access the terminal. I don’t often need the terminal, given I can use pip within the notebook (more on this in a moment).

If you look again at that Initial Notebook image, you’ll also see some expected features such as the name of the notebook (Untitled1.ipynb), menu options to do things like manipulate cells, create a scratch code cell (nice feature there), and get help. (I renamed my notebook to first_notebook.ipynb.)

If you look at the far right (below my account), you’ll also see an indicator for the RAM and Disk being used. Selecting the arrow next to those indicators allows me to connect to a different runtime (like Google Cloud Platform), manage the session, and view resources.

Entering a Section and Text

To create a section that’s visible in the TOC, I selected the Table of contents icon on the left and then “+ Section.” This adds a section under the default-provided cell. I moved the section up with the up arrow.

Adding and Moving a Section — Image by Author

Editing the name of the section was not intuitive. I thought I could just select it and type, but alas, that’s not the case. It’s not counter-intuitive either, though: I had to double click it. Geez, Google, can’t you just anticipate my intuition?

At this point I have a new section (that’s been renamed) and a cell under it. I renamed the section to About My Project and here I’ll put some HTML to do just that.

By default, the cell that’s created when I created the notebook is a code cell. I put some HTML in it and (as expected) it didn’t work (as HTML) when I selected SHIFT+ENTER to run the cell. (But something pretty cool surfaced.)

Search Stack Overflow on Code Error — Image by Author

I got the syntax error as expected but check out that button at the bottom — “SEARCH STACK OVERFLOW”. That’s a nice touch.
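
To see why the cell failed, here is a minimal sketch that asks Python to compile a line of HTML as if it were code. The `is_valid_python` helper and the sample HTML string are made up for illustration; the markup I actually typed is only visible in the screenshot.

```python
def is_valid_python(source):
    """Return True if `source` compiles as Python, False on a SyntaxError."""
    try:
        compile(source, "<cell>", "exec")
    except SyntaxError:
        return False
    return True


# Hypothetical cell contents, used only to illustrate the point.
html_cell = "<h1>About My Project</h1>"
code_cell = "project = 'About My Project'"

print(is_valid_python(html_cell))  # HTML is not Python, so this reports False
print(is_valid_python(code_cell))  # an ordinary assignment reports True
```

HTML belongs in a Text cell. If it has to come from a code cell, IPython's `IPython.display.HTML` can render it in the cell output.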

Changing the cell from Code to Text is simple enough, but not intuitive. There doesn’t seem to be a way to do this graphically; you have to enter CTRL+M M (that’s M twice) to go from Code to Text and CTRL+M Y (M then Y) to go from Text to Code. I think this feature could use a graphical method.

Now, when I changed the cell to Text, I got (another) nice surprise: text editing features. Check it out.

The Text Editor — Image by Author

This is nice. While I have HTML in my Text cell, I really don’t need it. By using the icons, the corresponding Markdown is inserted (e.g., selecting text and selecting B for bold adds the “**” around the selected text). This is actually really nice.

Using PIP

Before I get to the code, I want to circle back to something I mentioned earlier about using pip in a cell. Normally, when I have a new programming environment, I like to see what Python packages I have available. Doing this is simple enough in both Studio Code and Colab Notebooks — it’s just pip list within a cell.

pip list Snippet — Image by Author

This is super handy (for both Studio Code and Colab). Honestly, I didn’t know I could do this in a code cell until I tried it in Colab. This method also lets you install and uninstall packages within a cell.
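
Outside a notebook cell, roughly the same listing can be produced from plain Python. This is a sketch built on the standard library's `importlib.metadata`, not a description of what `pip list` does internally, and `installed_packages` is a name chosen for illustration.

```python
from importlib import metadata


def installed_packages():
    """Return (name, version) pairs for installed distributions, sorted by name."""
    found = set()
    for dist in metadata.distributions():
        try:
            meta = dist.metadata
        except Exception:
            continue  # skip distributions whose metadata cannot be read
        name = meta.get("Name") if meta is not None else None
        if name:
            found.add((name, meta.get("Version") or "unknown"))
    return sorted(found, key=lambda pair: pair[0].lower())


packages = installed_packages()
for name, version in packages[:10]:  # first ten only, to keep the output short
    print(f"{name}=={version}")

# In a Colab or Jupyter code cell the equivalents are magics, for example:
#   %pip list
#   %pip install <package>
#   %pip uninstall -y <package>
```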

Enter Code

The default Colab Notebook provides a bunch of stuff — all the basics you’d expect (pandas, sklearn), but also some packages I normally have to add (pandas-profiling, spacy, Keras, tensorflow).

Let me take a moment to describe files in the (free) Colab environment. While you can’t get to a terminal for the VM, you can get to the files, which is helpful because I can manipulate them as you’d expect (add/delete files and folders). But I can also mount my Google Drive, which you’ll soon see is a superior option.

First, I’ll add a folder and .csv to my Colab file system. By default, the current working directory in a Colab Notebook is /content. If you select the File icon in Google Colab, that’s the folder that shows up containing a sample_data folder. I explored the file system a bit using the navigator and found things I’d have expected — /home, /usr, /var, etc. I uploaded a CSV file named insurance.csv, which I got here. I put it right in the /content folder for simplicity. I did so by first selecting the folder, then the vertical ellipses, then Upload.

Uploading to Colab — Image by Author
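
The same exploration can be done from a code cell instead of the file navigator. The sketch below assumes nothing beyond the standard library; `list_entries` is a hypothetical helper, and the `/content` working directory is what I saw in Colab, so it will differ elsewhere.

```python
from pathlib import Path


def list_entries(folder):
    """Return the names in `folder`, sorted, with a trailing slash on directories."""
    return sorted(
        entry.name + ("/" if entry.is_dir() else "")
        for entry in Path(folder).iterdir()
    )


cwd = Path.cwd()  # expected to be /content in Colab
print(cwd)
for name in list_entries(cwd):
    print(" ", name)
```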

That was easy enough, but when I uploaded the file, the main reason for not doing this appeared:

Don’t Use Colab File System — Image by Author

That’s right: as soon as I kill the runtime, the files are gone, which is entirely intuitive. The entire runtime, including the file system, is spun up when I start Colab. That’s why it’s better to use Google Drive for permanent storage. For brevity, I’m not going to show that, but it works the same basic way with the additional step of first mounting your Google Drive.
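
For reference, mounting Drive from a code cell might look like the sketch below. `drive.mount` comes from the `google.colab` package, which only exists inside Colab. The `persistent_data_dir` wrapper and its fallback to the current directory are assumptions added so the snippet also runs elsewhere.

```python
from pathlib import Path


def persistent_data_dir(mount_point="/content/drive"):
    """Mount Google Drive in Colab and return a folder that outlives the runtime.

    Outside Colab, google.colab cannot be imported, so this falls back to the
    current working directory (an assumption made for illustration).
    """
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return Path.cwd()
    drive.mount(mount_point)  # asks for authorization the first time
    return Path(mount_point) / "MyDrive"


data_dir = persistent_data_dir()
print(f"Keep data files under: {data_dir}")
```

Anything written under the returned folder in Colab lands in Google Drive rather than in the session storage that disappears with the runtime.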

Now I can enter code.

First things first — load the data I just uploaded.

Loading Data — Image by Author

I didn’t put this into a gist because this piece isn’t about the code; it’s about code in Colab. Either way, it works as you’d expect it to. One nice feature compared to Studio Code: code completion worked out of the box, with no need for an extension like Pylance.
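
For readers who can’t see the screenshot, loading the uploaded file could look roughly like this. It is a sketch, not the code from the image: it uses only the standard library so it runs anywhere, and `load_rows` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def load_rows(csv_path):
    """Read a CSV file with a header row into a list of dicts (values stay strings)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


csv_path = Path("/content/insurance.csv")  # the upload location described above
if csv_path.exists():
    rows = load_rows(csv_path)
    columns = list(rows[0]) if rows else []
    print(f"{len(rows)} rows, columns: {columns}")
else:
    print(f"{csv_path} not found; upload it (or mount Drive) first")

# With pandas, which Colab provides by default, the equivalent is:
#   import pandas as pd
#   df = pd.read_csv(csv_path)
```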

I wanted to see a bit more code and a couple of visualizations, so I wrote some code and corresponding description.

QQ Plot and Histogram in Colab — Image by Author

Again, this works as expected because it’s just Python running in a notebook. At this point, I am pretty well fully functional with Google Colab Notebooks.
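
To give a feel for what sits behind those two visuals, here is a sketch that computes the numbers a QQ plot and a histogram are drawn from. It uses the standard library only and synthetic data, not insurance.csv. The function names are illustrative, and the plotting itself (done with a plotting library in the screenshot) is left out.

```python
import random
from statistics import NormalDist


def qq_points(sample):
    """Pair each sorted sample value with the standard-normal quantile at the same rank."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()
    # (i + 0.5) / n keeps every probability strictly between 0 and 1
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def histogram_counts(sample, bins=5):
    """Count values in `bins` equal-width buckets between the min and max."""
    low, high = min(sample), max(sample)
    width = (high - low) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for value in sample:
        index = min(int((value - low) / width), bins - 1)  # max value goes in the last bin
        counts[index] += 1
    return counts


rng = random.Random(0)
synthetic = [rng.gauss(50, 10) for _ in range(200)]  # made-up data for illustration

for theoretical, observed in qq_points(synthetic)[:3]:
    print(f"normal quantile {theoretical:+.2f} -> sample value {observed:.1f}")
print(histogram_counts(synthetic, bins=8))
```

If the QQ points fall close to a straight line, the sample is roughly normal, which is the reading a QQ plot supports.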

One Last Detail — The Jupyter Notebook

If you look around the Colab file system, one thing you won’t find is the Jupyter Notebook itself. That’s because Google Drive is automatically used to store those. When I created the Colab Notebook, Google Colab created a directory in the root of my Google Drive called Colab Notebooks, which is exactly where my first_notebook.ipynb file was saved.

My Google Drive — Image by Author

Conclusions

Some takeaways for me in doing this:

  1. Setup is superior in Colab in just about every way
  2. Text cell editing is superior in Colab
  3. There are more ease-of-use features in Colab, such as going to Stack Overflow automatically when there’s an error
  4. If you can’t use your Google Drive for some reason, the Colab file system really won’t work because either you need to copy the data to/from Colab every time you use it or you need to re-create it every time
  5. I don’t like Azure ML Studio / Studio Code any less; I just have another exceptional option if the situation merits it (e.g., at work in my day job, Colab is a non-starter but for my work in teaching, it’s exceptional)

Happy coding!
