Setup
# Import libraries
import pandas as pd
import numpy as np
Split string and expand result in columns
Apply str.split() to the column with the parameter expand=True. When no separator is given, the string is split on whitespace, and each piece goes into its own column.
# Create sample DataFrame
df = pd.DataFrame({'name':['John A. Doe', 'Aby K. Parker', 'Bob T. Morris', 'Alice D. Allen']})
df
  | name |
0 | John A. Doe |
1 | Aby K. Parker |
2 | Bob T. Morris |
3 | Alice D. Allen |
# Split string and create multiple columns
df['name'].str.split(expand=True)
  | 0 | 1 | 2 |
0 | John | A. | Doe |
1 | Aby | K. | Parker |
2 | Bob | T. | Morris |
3 | Alice | D. | Allen |
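The integer column labels produced by expand=True can be swapped for descriptive ones before joining the result back to the original frame. A minimal sketch, assuming every name splits into exactly three parts; the labels first, middle and last and the variables parts and result are illustrative choices, not part of the original example:

```python
import pandas as pd

df = pd.DataFrame({'name': ['John A. Doe', 'Aby K. Parker', 'Bob T. Morris', 'Alice D. Allen']})

# Split on whitespace into three columns labelled 0, 1, 2
parts = df['name'].str.split(expand=True)

# Rename the columns (hypothetical labels chosen for this sketch)
parts.columns = ['first', 'middle', 'last']

# Attach the new columns to the original DataFrame by index
result = df.join(parts)
print(result)
```

Working on a separate result frame leaves df unchanged, so the later examples still see only the name column.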
To extract only the nth element of each resulting list, use .str[n]:
# Take only the last element of each split string
df['name'].str.split().str[-1]
0 Doe
1 Parker
2 Morris
3 Allen
Name: name, dtype: object
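When rows split into different numbers of pieces, .str[n] gives a missing value for rows that have no element at position n instead of raising an error. A small sketch with made-up names (they are not part of the sample data above):

```python
import pandas as pd

# Hypothetical names with one, two and three parts
names = pd.Series(['John A. Doe', 'Cher', 'Bob Morris'])

# Take the second piece of each split string; 'Cher' has none
second = names.str.split().str[1]
print(second)
```

Negative positions such as .str[-1] avoid this problem when only the last piece is needed, since every non-empty list has a last element.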
Split lists into columns
If a column holds list-like values and you want to select some of them and expand them into columns, pass result_type='expand' to apply().
# Create sample DataFrame: one column holding arrays of 5 random integers from 0 to 8
df2 = pd.Series([np.random.randint(0, 9, size=5) for i in range(5)]).to_frame()
df2
  | 0 |
0 | [1, 2, 3, 2, 7] |
1 | [5, 6, 5, 8, 3] |
2 | [4, 0, 2, 5, 3] |
3 | [0, 4, 3, 0, 0] |
4 | [3, 0, 1, 4, 8] |
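Because the sample values are random, they change on every run. If reproducible values are preferred, a seeded generator can be used instead. This is a sketch only: the seed 0 is arbitrary, the name df2_seeded is hypothetical, and the values it produces will differ from those shown above.

```python
import numpy as np
import pandas as pd

# Seeded generator so the same arrays are produced on every run
rng = np.random.default_rng(0)  # seed value is an arbitrary assumption

# Five rows, each holding an array of 5 integers from 0 to 8 (upper bound 9 is exclusive)
df2_seeded = pd.Series([rng.integers(0, 9, size=5) for _ in range(5)]).to_frame()
print(df2_seeded)
```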
Extract some values without expansion:
# Extract some values without expansion
df2.apply(lambda x: (x[0][0], x[0][-1]), axis=1)
0 (1, 7)
1 (5, 3)
2 (4, 3)
3 (0, 0)
4 (3, 8)
dtype: object
Expand result into columns:
# Expand result to columns
df2.apply(lambda x: [x[0][0], x[0][-1]], result_type='expand', axis=1)
  | 0 | 1 |
0 | 1 | 7 |
1 | 5 | 3 |
2 | 4 | 3 |
3 | 0 | 0 |
4 | 3 | 8 |
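The expanded columns can be given descriptive labels, and when every element is wanted rather than a selection, building a DataFrame directly from the lists is an alternative to apply(). A sketch under stated assumptions: fixed lists stand in for the random values, and the names selected, expanded, first and last are illustrative only.

```python
import pandas as pd

# Fixed lists standing in for the random sample data
df2 = pd.Series([[1, 2, 3, 2, 7], [5, 6, 5, 8, 3], [4, 0, 2, 5, 3]]).to_frame()

# Select the first and last element of each list and expand into two columns
selected = df2.apply(lambda row: [row[0][0], row[0][-1]], axis=1, result_type='expand')
selected.columns = ['first', 'last']  # hypothetical labels
print(selected)

# Alternative to apply(): one column per list position, keeping the original index
expanded = pd.DataFrame(df2[0].tolist(), index=df2.index)
print(expanded)
```

The second form assumes all lists have the same length; shorter lists would leave missing values in the trailing columns.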