Split values into columns of a pandas DataFrame
Setup
# Import libraries
import pandas as pd
import numpy as np
Split string and expand result in columns
Apply str.split() to the column with the parameter expand=True. When no separator is given, the string is split on whitespace.
# Create sample DataFrame
df = pd.DataFrame({'name':['John A. Doe', 'Aby K. Parker', 'Bob T. Morris', 'Alice D. Allen']})
df
|   | name |
|---|---|
| 0 | John A. Doe |
| 1 | Aby K. Parker |
| 2 | Bob T. Morris |
| 3 | Alice D. Allen |
# Split string and create multiple columns
df['name'].str.split(expand=True)
|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | John | A. | Doe |
| 1 | Aby | K. | Parker |
| 2 | Bob | T. | Morris |
| 3 | Alice | D. | Allen |
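The expanded columns are labelled 0, 1, 2 by default. As a minimal sketch of how they might be given names and attached back to the original frame, the example below assigns labels and joins on the index. The column names first, middle and last, and the variable names parts, named and first_rest, are assumptions made for this illustration only.

```python
import pandas as pd

df = pd.DataFrame({'name': ['John A. Doe', 'Aby K. Parker', 'Bob T. Morris', 'Alice D. Allen']})

# Split on whitespace and expand into columns labelled 0, 1, 2
parts = df['name'].str.split(expand=True)

# Hypothetical column names, chosen only for this illustration
parts.columns = ['first', 'middle', 'last']

# Attach the new columns to the original frame (rows are aligned on the index)
named = df.join(parts)

# n=1 limits the number of splits, so only the first word is separated
first_rest = df['name'].str.split(n=1, expand=True)
```

Limiting the splits with n can be useful when the number of words per row varies and only the leading token matters.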
To extract only the nth element of each resulting list, use .str[n]:
# Take only the last element of each split string
df['name'].str.split().str[-1]
0 Doe
1 Parker
2 Morris
3 Allen
Name: name, dtype: object
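One detail worth knowing is what happens when a row has fewer elements than the requested position. The sketch below uses a hypothetical two-row Series (not the sample data above) to illustrate that .str[n] yields a missing value in that case rather than raising an error, and that .str.get(n) behaves the same way.

```python
import pandas as pd

# Hypothetical data: the second entry has only one word
names = pd.Series(['John A. Doe', 'Cher'])

# Each element becomes a list of words
tokens = names.str.split()

# Position 0 exists in every row
first = tokens.str[0]

# Position 2 is missing in the second row, so the result there is NaN
third = tokens.str[2]

# .str.get(n) is the method form of the same element access
same = tokens.str.get(2)
```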
Split lists into columns
If a column holds list-like values and you want to select some of them and expand them into columns, pass result_type='expand' to apply(). Note that result_type only takes effect with axis=1:
# Create sample DataFrame
df2 = pd.Series([np.random.randint(0, 9, size=5) for i in range(5)]).to_frame()
df2
|   | 0 |
|---|---|
| 0 | [1, 2, 3, 2, 7] |
| 1 | [5, 6, 5, 8, 3] |
| 2 | [4, 0, 2, 5, 3] |
| 3 | [0, 4, 3, 0, 0] |
| 4 | [3, 0, 1, 4, 8] |
# Extract some values without expansion
df2.apply(lambda x: (x[0][0], x[0][-1]), axis=1)
0 (1, 7)
1 (5, 3)
2 (4, 3)
3 (0, 0)
4 (3, 8)
dtype: object
# Expand result to columns
df2.apply(lambda x: [x[0][0], x[0][-1]], result_type='expand', axis=1)
|   | 0 | 1 |
|---|---|---|
| 0 | 1 | 7 |
| 1 | 5 | 3 |
| 2 | 4 | 3 |
| 3 | 0 | 0 |
| 4 | 3 | 8 |
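When every list has the same length and all of its values are wanted, a different approach from apply() is to build a new DataFrame directly from the lists. This is a sketch of that alternative, not the method used above; it assumes a small hand-written frame of Python lists (the names lists, expanded and first_last are hypothetical) so the result is reproducible.

```python
import pandas as pd

# Hypothetical fixed data, mirroring the shape of the random sample frame
lists = pd.Series([[1, 2, 3, 2, 7], [5, 6, 5, 8, 3], [4, 0, 2, 5, 3]]).to_frame()

# Build one column per list position, keeping the original index
expanded = pd.DataFrame(lists[0].tolist(), index=lists.index)

# Select the first and last positions afterwards
first_last = expanded.iloc[:, [0, -1]]
```

Constructing the frame in one step avoids calling a Python function per row, which may matter on larger data.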