A .csv (comma-separated values) file is a plain-text file that stores tabular data, one row per line, with the values in each row separated by commas. Spreadsheet programs such as Excel and Google Sheets can open and export it, and it is one of the most common and easy-to-use data formats in Machine Learning and Data Science.
How you load a file will vary based on whether you are running Jupyter Notebooks locally or in the cloud. For this course, we'll be using Google Colab, so we'll be in the cloud. If you end up working locally later, don't worry: that will be a lot easier.
This is the syntax to take data from a .csv file and create a corresponding dataframe:
df = pd.read_csv("file_path")
The read_csv() function takes the file path of the comma-separated values file and converts it into a dataframe, which is stored in df.
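As a quick illustration, here is a minimal sketch of read_csv() in action. To keep it self-contained, it reads from an in-memory string (through io.StringIO) instead of a real file; the column names and values are made up for the example.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would live in a .csv file.
csv_text = """name,age
Ana,31
Ben,25
Cy,40
"""

# read_csv() accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names taken from the first line
```

With a real file, you would pass the path string instead of the StringIO object.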
In case you're not sure what your file path is, let's see how to get it. Let's assume your data is on your computer:
In this case, you can use the following commands to upload a .csv file that you can then access.
from google.colab import files
uploaded = files.upload()
Now a button to upload a file should show up. Click it, and select the data file. Now your file path is just the following: "file_name.csv"
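If it helps to see what comes back, files.upload() returns a dictionary that maps each uploaded file name to that file's contents as bytes. The sketch below imitates that dictionary with made-up data so it can run outside Colab; in Colab you would use the real uploaded variable instead.

```python
import io

import pandas as pd

# Stand-in for the result of files.upload() in Colab (hypothetical data).
uploaded = {"file_name.csv": b"city,temp\nOslo,4\nCairo,29\n"}

# Take the first (here, the only) uploaded file name.
file_name = next(iter(uploaded))

# Read the bytes directly, without going through a file path.
df = pd.read_csv(io.BytesIO(uploaded[file_name]))
print(df)
```

Reading from the dictionary is just an alternative; passing the file name as the path, as described above, is usually the simpler option.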
However, if your data is on Google Drive and you don't want to download it and then upload it (although that can be a little easier), the process will be a little different.
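One common way to do this, sketched here under assumptions, is to mount your Drive inside Colab with drive.mount() and then read the file through the mounted folder. The mount point /content/drive is the conventional one, and the file name and its location at the top level of your Drive are placeholders you would replace with your own.

```python
import pandas as pd

MOUNT_POINT = "/content/drive"  # conventional mount point in Colab

try:
    # google.colab only exists inside a Colab runtime.
    from google.colab import drive

    drive.mount(MOUNT_POINT)  # asks you to authorize access to your Drive
    in_colab = True
except ImportError:
    in_colab = False

# Assumed location: a file at the top level of your Drive (placeholder name).
file_path = MOUNT_POINT + "/MyDrive/file_name.csv"

if in_colab:
    df = pd.read_csv(file_path)
else:
    print("Not running in Colab; would read:", file_path)
```

If your file sits inside a folder on Drive, the folder names go between MyDrive and the file name in the path.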
Excel files (.xlsx or .xls) are the familiar Excel/Sheets spreadsheet files, so it can be useful to know how to import them. The function required is similar:
df = pd.read_excel("file_path")
We referred to this before, but just as a refresher, here's how to convert a dictionary to a dataframe:
df = pd.DataFrame(dictionary)
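For a concrete picture, here is a small sketch with made-up data. Each key of the dictionary becomes a column name, and each list holds that column's values.

```python
import pandas as pd

# Hypothetical data: keys become columns, lists become the column values.
dictionary = {
    "name": ["Ana", "Ben", "Cy"],
    "age": [31, 25, 40],
}

df = pd.DataFrame(dictionary)
print(df)
```

Note that the lists need to have the same length, since each position across the lists forms one row.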
That's all for this lesson. You can practice this in an exercise, but really, these are things that you will become familiar with as time passes. And you can always refer back to this lesson when you need to.
3.2 Pandas 2: Dataframe operations