Everything you need to know to add, rename, and delete columns using Pandas!

Add, Rename, and Delete Columns in Pandas

Dylan | Jul 27, 2020


When working with real-world data in Pandas DataFrames, nearly every project will require you to add, delete, or rename columns. Whether you’re working with Pandas for the first time, or just looking for a quick refresher, in this post, we’ll break down in simple terms how to apply these operations to DataFrames in your projects.

Add a New Column

Let’s create a DataFrame object to begin.



import pandas as pd
df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})

Method One

We can declare a new DataFrame column the same way we would insert a new key into a dictionary in Python.



df['total'] = [171, 3738, 3150, 258]

We just have to be sure to assign a list whose length exactly matches the current number of rows in our DataFrame object. Otherwise, Pandas will raise a ValueError.
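As a quick illustration of that length check, here is a minimal sketch that assigns three values to our four-row DataFrame and catches the resulting error. The variable name error_message is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})

try:
    # Three values for a four-row DataFrame: the lengths do not match.
    df['total'] = [171, 3738, 3150]
    error_message = None
except ValueError as exc:
    # Pandas refuses the assignment instead of padding or truncating the list.
    error_message = str(exc)

print(error_message)
```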

For this example, we could save ourselves some manual calculations by multiplying the 'price' and 'amount' columns together to generate the data for our new 'total' column.



df['total'] = df['price'] * df['amount']

Both of the previous lines of code generate the same new column, 'total', as you can see in the image below.
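If you want to check this yourself, the short sketch below builds the column with the multiplication and compares it against the hand-typed list from Method One. The names manual and matches are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})

# The hand-calculated values from the first example.
manual = [171, 3738, 3150, 258]

# The computed version: element-wise product of the two columns.
df['total'] = df['price'] * df['amount']

# True when both approaches produce the same values.
matches = df['total'].tolist() == manual

print(df)
#    price  amount  total
# 0      3      57    171
# 1     89      42   3738
# 2     45      70   3150
# 3      6      43    258
```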

Method Two

If you want to specify where your new column should be inserted in the DataFrame, you can use the DataFrame.insert() method. The insert method has four parameters:

  • loc: the column insertion index
  • column: new column label
  • value: desired row data
  • allow_duplicates: (optional) defaults to False, in which case Pandas raises a ValueError if a column with the same label already exists

Starting from the original DataFrame (before 'total' was added with Method One), we can insert our new 'total' column at index 0 using the following code.



df.insert(0, 'total', df['price'] * df['amount'], allow_duplicates=False)
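To see both the column order and the duplicate check in action, here is a small sketch. It inserts 'total' at position 0, then tries to insert the same label again with the default settings. The names columns_after_insert and duplicate_rejected are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})

# insert() modifies the DataFrame in place and returns None.
df.insert(0, 'total', df['price'] * df['amount'])
columns_after_insert = list(df.columns)  # 'total' now comes first

try:
    # A second insert with the same label fails by default.
    df.insert(0, 'total', df['price'] * df['amount'])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(columns_after_insert, duplicate_rejected)
```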



Delete a Column

The best way to delete DataFrame columns in Pandas is with the DataFrame.drop() method. The drop method is flexible: it can drop rows or columns, and it can drop several columns at once, identified either by label or by position (via df.columns). It accepts seven parameters, but dropping a column only takes two of them. Let’s look at the three most common:

  • labels: index or column labels to drop
  • axis: whether to drop labels from the index (0 or 'index') or columns (1 or 'columns')
  • inplace: if True, perform the operation in place and return None

Check out the documentation if you want to learn more about the drop method.

Let’s pretend we wanted to delete the 'total' column from the following DataFrame.

We could do it with the following code.



df.drop('total', axis=1, inplace=True)

If, starting again from the DataFrame with all three columns, we wanted to drop both 'amount' and 'total' by their positions rather than their labels, we could use the following code.



df.drop(df.columns[[1, 2]], axis=1, inplace=True)

This leaves only the 'price' column.
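If you would rather keep the original DataFrame untouched, one alternative is to leave out inplace and use the columns keyword, which drop also accepts in place of labels plus axis. The sketch below is just one way to write it; the names trimmed and by_position are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})
df['total'] = df['price'] * df['amount']

# Without inplace=True, drop() returns a new DataFrame and leaves df unchanged.
trimmed = df.drop(columns=['amount', 'total'])

# The same result, selecting the columns by position through df.columns.
by_position = df.drop(columns=df.columns[[1, 2]])

print(list(trimmed.columns), list(df.columns))
```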

Rename a Column

The simplest way to rename a column in Pandas is with the DataFrame.rename() method. The method accepts eight parameters; however, for basic situations, you can get by with only the following two:

  • columns: dictionary-like transformations to apply to the column labels
  • inplace: if True, perform the operation in place and return None

To further explore the capabilities of DataFrame.rename, please refer to the Pandas documentation linked above.

Imagine we have the following DataFrame object, with the columns 'price', 'amount', and 'total'.

Now pretend we want to relabel the 'amount' column to 'quantity'. We could achieve that with the following code.



df.rename(columns={'amount': 'quantity'}, inplace=True)
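The same dictionary can hold more than one mapping, and leaving out inplace gives you a renamed copy instead of changing the original. Here is a minimal sketch; the label 'unit_price' and the name renamed are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})
df['total'] = df['price'] * df['amount']

# Rename two columns at once; rename() returns a new DataFrame here.
renamed = df.rename(columns={'amount': 'quantity', 'price': 'unit_price'})

print(list(renamed.columns))  # the renamed copy
print(list(df.columns))       # the original labels are unchanged
```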

If I overlooked anything in this guide, let me know in the comments below. I’m always happy to answer any questions and do whatever I can to help. As always, it’s best to roll up your sleeves and try out the examples above in the wild. Until next week, happy coding from Nimble Coding!
