How to Combine Two String Columns in Pandas
Understanding how to concatenate two string columns into a new column more efficiently in pandas DataFrames

Introduction
Creating new columns by concatenating other columns is a fairly common task. In today’s short guide we will showcase how to concatenate the content of string DataFrame columns into a new column. We will explore a few different options that you should always consider based on the dataset size you are working on.
Some of these approaches tend to be more efficient when applied to small datasets whilst other approaches may be faster to execute when applied over larger datasets.
Additionally, we will also explore how to concatenate string with non-string (e.g. integer) columns.
First, let’s create an example DataFrame that we’ll reference throughout this article in order to demonstrate a few concepts.
import pandas as pddf = pd.DataFrame(
[
(1, '2017', 10, 'Q1'),
(2, '2017', 20, 'Q2'),
(3, '2016', 35, 'Q4'),
(4, '2019', 25, 'Q2'),
(5, '2020', 44, 'Q3'),
(6, '2021', 51, 'Q3'),
],
columns=['colA', 'colB', 'colC', 'colD']
)print(df)
colA colB colC colD
0 1 2017 10 Q1
1 2 2017 20 Q2
2 3 2016 35 Q4
3 4 2019 25 Q2
4 5 2020 44 Q3
5 6 2021 51 Q3Concatenating string columns in small datasets
For relatively small datasets (up to 100–150 rows) you can use pandas.Series.str.cat() method that is used to concatenate strings in the Series using the specified separator (by default the separator is set to '').
For example, if we wanted to concatenate columns colB and colD and then store the output into a new column called colE, the following statement would do the trick:
df['colE'] = df.colB.str.cat(df.colD) print(df)
colA colB colC colD colE
0 1 2017 10 Q1 2017Q1
1 2 2017 20 Q2 2017Q2
2 3 2016 35 Q4 2016Q4
3 4 2019 25 Q2 2019Q2
4 5 2020 44 Q3 2020Q3
5 6 2021 51 Q3 2021Q3Now if we wanted to specify a separator that will be placed between the concatenated columns, then we simply need to pass sep argument:
df['colE'] = df.colB.str.cat(df.colD, sep='-')print(df)
colA colB colC colD colE
0 1 2017 10 Q1 2017-Q1
1 2 2017 20 Q2 2017-Q2
2 3 2016 35 Q4 2016-Q4
3 4 2019 25 Q2 2019-Q2
4 5 2020 44 Q3 2020-Q3
5 6 2021 51 Q3 2021-Q3Alternatively, you can also use a list comprehension which is a bit more verbose but slightly faster:
df['colE'] = [''.join(i) for i in zip(df['colB'], df['colD'])]print(df)
colA colB colC colD colE
0 1 2017 10 Q1 2017Q1
1 2 2017 20 Q2 2017Q2
2 3 2016 35 Q4 2016Q4
3 4 2019 25 Q2 2019Q2
4 5 2020 44 Q3 2020Q3
5 6 2021 51 Q3 2021Q3Concatenating string columns in larger datasets
Now if you are working with large datasets, the more efficient way to concatenate two columns is using the + operator.
df['colE'] = df['colB'] + df['colD']print(df)
colA colB colC colD colE
0 1 2017 10 Q1 2017Q1
1 2 2017 20 Q2 2017Q2
2 3 2016 35 Q4 2016Q4
3 4 2019 25 Q2 2019Q2
4 5 2020 44 Q3 2020Q3
5 6 2021 51 Q3 2021Q3If you want to include a separator then simply place it as a string in-between the two columns:
df['colE'] = df['colB'] + '-' + df['colD']Concatenating string with non-string columns
Now let’s assume that one of the columns you are trying to concatenate is not in string format:
import pandas as pddf = pd.DataFrame(
[
(1, 2017, 10, 'Q1'),
(2, 2017, 20, 'Q2'),
(3, 2016, 35, 'Q4'),
(4, 2019, 25, 'Q2'),
(5, 2020, 44, 'Q3'),
(6, 2021, 51, 'Q3'),
],
columns=['colA', 'colB', 'colC', 'colD']
)print(df.dtypes)colA int64
colB int64
colC int64
colD object
dtype: objectIn this case, you can simply cast the column using pandas.DataFrame.astype() or map() methods.
# Option 1
df['colE'] = df.colB.astype(str).str.cat(df.colD)# Option 2
df['colE'] = df['colB'].astype(str) + '-' + df['colD']# Option 3
df['colE'] = [
''.join(i) for i in zip(df['colB'].map(str), df['colD'])
]Final Thoughts
In today’s short guide we discussed about concatenating string columns in pandas DataFrames. Depending on the size of the dataset you are working with, you may have to select the most appropriate method that will be executed more efficiently.
Furthermore, we also showcased how to concatenate string with non-string columns by making use of the astype() method in pandas.
Become a member and read every story on Medium. Your membership fee directly supports me and other writers you read. You’ll also get full access to every story on Medium.
You may also like






