Linguistics PhD
import pandas as pd
df = pd.DataFrame({'Discount': [10, 8, 20, 15, 10],
                   'Product': [' UMbreLla', ' maTress', 'BeDmintoN ', 'Shuttle', 'jaCket '],
                   'Updated_Price': [880, 1250, 1450, 1550, 400],
                   'Date': ['10/2/2011', '10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011']})
df
| | Discount | Product | Updated_Price | Date |
|---|---|---|---|---|
| 0 | 10 | UMbreLla | 880 | 10/2/2011 |
| 1 | 8 | maTress | 1250 | 10/2/2011 |
| 2 | 20 | BeDmintoN | 1450 | 11/2/2011 |
| 3 | 15 | Shuttle | 1550 | 12/2/2011 |
| 4 | 10 | jaCket | 400 | 13/2/2011 |
Here are some ways to (i) strip whitespace from the product names and (ii) capitalize their first letter. Note that `str.capitalize()` also lowercases the remaining letters.
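# Example 1: clean the names in a list comprehension, then swap the column
ls = df['Product'].tolist()
new_ls = [item.strip().capitalize() for item in ls]
# Replace the original column with the cleaned one at the same position
df = df.drop(['Product'], axis=1)
df.insert(1, 'Product_new', new_ls)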
df
| | Discount | Product_new | Updated_Price | Date |
|---|---|---|---|---|
| 0 | 10 | Umbrella | 880 | 10/2/2011 |
| 1 | 8 | Matress | 1250 | 10/2/2011 |
| 2 | 20 | Bedminton | 1450 | 11/2/2011 |
| 3 | 15 | Shuttle | 1550 | 12/2/2011 |
| 4 | 10 | Jacket | 400 | 13/2/2011 |
# Example 2: same cleanup with apply (run on the original df, where the column is still named 'Product')
df['Product'] = df['Product'].apply(lambda x: x.strip().capitalize())
df
| | Discount | Product | Updated_Price | Date |
|---|---|---|---|---|
| 0 | 10 | Umbrella | 880 | 10/2/2011 |
| 1 | 8 | Matress | 1250 | 10/2/2011 |
| 2 | 20 | Bedminton | 1450 | 11/2/2011 |
| 3 | 15 | Shuttle | 1550 | 12/2/2011 |
| 4 | 10 | Jacket | 400 | 13/2/2011 |
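A third option, sketched here as an alternative rather than taken from the two examples, is pandas' vectorized `.str` accessor. It chains `strip` and `capitalize` without a lambda, and it passes missing values (`NaN`) through instead of raising, which the `apply` version would not. The variable name `cleaned` is only for illustration.

```python
import pandas as pd

# Same sample data as above: product names with stray whitespace and mixed case.
df = pd.DataFrame({
    'Discount': [10, 8, 20, 15, 10],
    'Product': [' UMbreLla', ' maTress', 'BeDmintoN ', 'Shuttle', 'jaCket '],
    'Updated_Price': [880, 1250, 1450, 1550, 400],
    'Date': ['10/2/2011', '10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
})

# Strip surrounding whitespace, then capitalize (first letter upper, rest lower).
df['Product'] = df['Product'].str.strip().str.capitalize()

cleaned = df['Product'].tolist()  # illustrative name, not from the original
print(cleaned)
# ['Umbrella', 'Matress', 'Bedminton', 'Shuttle', 'Jacket']
```

This keeps the column name and position unchanged, so no `drop`/`insert` step is needed.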
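The code below renames certain values using a regex. Applied to the DataFrame from Example 1, it replaces a leading "Be" with "Ba" in every string column, which turns "Bedminton" into "Badminton".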
df.replace(to_replace=r'^Be', value='Ba', regex=True, inplace=True)
df
| | Discount | Product_new | Updated_Price | Date |
|---|---|---|---|---|
| 0 | 10 | Umbrella | 880 | 10/2/2011 |
| 1 | 8 | Matress | 1250 | 10/2/2011 |
| 2 | 20 | Badminton | 1450 | 11/2/2011 |
| 3 | 15 | Shuttle | 1550 | 12/2/2011 |
| 4 | 10 | Jacket | 400 | 13/2/2011 |
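Because `df.replace(..., regex=True)` checks every string column, it can change values you did not mean to touch. A minimal sketch of limiting the replacement to one column with `Series.str.replace` follows. The `Note` column and the names `products`, `fixed`, and `notes` are hypothetical, added only to show a column that should stay unchanged.

```python
import pandas as pd

# Small frame mirroring the cleaned product names; 'Note' is a hypothetical
# extra column whose first value also happens to start with "Be".
products = pd.DataFrame({
    'Product_new': ['Umbrella', 'Matress', 'Bedminton', 'Shuttle', 'Jacket'],
    'Note': ['Best seller', 'Home', 'Sports', 'Sports', 'Outerwear'],
})

# Apply the regex only to Product_new; other columns are left alone.
products['Product_new'] = products['Product_new'].str.replace(r'^Be', 'Ba', regex=True)

fixed = products['Product_new'].tolist()
notes = products['Note'].tolist()
print(fixed)  # ['Umbrella', 'Matress', 'Badminton', 'Shuttle', 'Jacket']
print(notes)  # ['Best seller', 'Home', 'Sports', 'Sports', 'Outerwear']
```

With the frame-wide `df.replace`, "Best seller" in this sketch would have become "Bast seller" as well.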