Hash values in Python

Hashing values, for example to pseudonymise personal information such as email addresses, is straightforward with Python's built-in hashlib module:

# Import library
import hashlib

# Apply SHA1 hash
email = "lara@croft.org"
print(hashlib.sha1(email.encode()).hexdigest())
# Output: 8104be5f19b5be4cec9185173a26aa121935d656
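If you hash values in several places, it can help to wrap the call in a small helper. The sketch below is one possible way to do that, not part of the original snippet: the function name hash_value, the whitespace/lowercase normalisation, and the SHA-256 default are all assumptions made for illustration. It relies on hashlib.new, which selects the algorithm by name.

```python
import hashlib


def hash_value(value: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of `value` using the named hashlib algorithm.

    Hypothetical helper: the name, the normalisation step and the default
    algorithm are illustrative choices, not requirements.
    """
    # Normalise so that " Lara@Croft.org " and "lara@croft.org" give the same hash
    normalised = value.strip().lower()
    return hashlib.new(algorithm, normalised.encode("utf-8")).hexdigest()


email = "lara@croft.org"
print(hash_value(email, "sha1"))  # SHA-1, as in the example above (40 hex characters)
print(hash_value(email))          # SHA-256 by default (64 hex characters)
```

Normalising before hashing matters because even a single differing character, such as an uppercase letter or a trailing space, produces a completely different digest.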

If you work with a pandas dataframe, you can apply the function to a full column:

# Import pandas
import pandas as pd

# Create dataframe with plain text emails
df = pd.DataFrame({
    'email': [
        'harry@potter.com',
        'sherlock@holmes.co.uk',
        'lara@croft.org',
        'frodo@baggins.net',
        'arsene@lupin.fr'
    ]
})

# Hash email column
df['hash'] = df['email'].apply(lambda x: hashlib.sha1(x.encode()).hexdigest())
                   email                                      hash
0       harry@potter.com  964721fca55c89f80dc59a86f077bffb6e14ffc4
1  sherlock@holmes.co.uk  76cb17778847d74fb4558d453e67aedbac573c88
2         lara@croft.org  8104be5f19b5be4cec9185173a26aa121935d656
3      frodo@baggins.net  6589497c3ad91249aa51ddf9f5165fb1becf95d0
4        arsene@lupin.fr  84edb980256dad67a693ce96ba8caef8eadb9a13
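Real-world columns often contain missing values, and calling .encode() on a missing entry (None or NaN) raises an AttributeError. The following is a minimal sketch of one way to handle that, assuming you want missing emails to stay missing rather than be hashed. The helper name sha1_or_none and the small example dataframe are made up for illustration.

```python
import hashlib

import pandas as pd


def sha1_or_none(value):
    """Hash a value with SHA-1; pass missing values through as None.

    Hypothetical helper, named here only for illustration.
    """
    if pd.isna(value):
        return None
    return hashlib.sha1(str(value).encode("utf-8")).hexdigest()


# Example dataframe with one missing email
df = pd.DataFrame({"email": ["lara@croft.org", None, "arsene@lupin.fr"]})

# Hash email column, leaving the missing entry untouched
df["hash"] = df["email"].apply(sha1_or_none)
print(df)
```

Using a named function instead of a lambda keeps the missing-value check readable and lets you reuse the same logic on other columns.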