How to assign subject IDs

import pandas as pd

sample = pd.read_csv('./sampledata1.csv')
sample.head(3)

	Unnamed: 0	Movement	Island_Type	Island	Distance	Item	Sentence	Subj_id	List	Score
0	0	WH	whe	non	sh	1	Who thinks that Paul stole the necklace?	31WPPC	1	6
1	1	WH	whe	non	sh	2	Who thinks that Matt chased the bus?	31WPPC	1	2
2	2	WH	whe	non	sh	3	Who thinks that Tom sold the television?	31WPPC	1	3

sample['Subj_id'].unique()

array(['31WPPC', 'MLOT0C', 'QUCYBY', '3HM9R4', 'TNZ93A', 'RE7119',
       'IKH3NF', '0R04SW', 'S7VOS9', 'JO1B7Q', '0HY4IC', 'MNSV2I',
       'IOEK50', 'LXP23M', '7NXUBG', '4EQFWR'], dtype=object)

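Let’s simplify the subject IDs in this dataset by replacing them with integers. One way to do this is to group the rows by Subj_id and number the groups with ngroup(). Passing sort=False numbers the subjects in order of first appearance in the data rather than in sorted order of the original IDs.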

# Number each subject in order of first appearance; ngroup() starts counting at 0
sample['Subj_id'] = sample.groupby('Subj_id', sort=False).ngroup()
# Add 1 to each ID if you do not want the first ID to be 0
sample['Subj_id'] = sample['Subj_id'] + 1

sample['Subj_id'].unique()

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16])
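Overwriting Subj_id discards the original IDs. If you may need to trace a number back to its original ID later, one option is pd.factorize, which returns both the integer codes and the unique original values. The sketch below is only an illustration: the toy DataFrame, the Subj_num column, and the id_lookup name are hypothetical stand-ins and are not part of the sample dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for sampledata1.csv
toy = pd.DataFrame({
    "Subj_id": ["31WPPC", "31WPPC", "MLOT0C", "QUCYBY", "MLOT0C"],
    "Score": [6, 2, 3, 5, 4],
})

# factorize() numbers values by order of first appearance, starting at 0,
# and also returns the unique original values in that same order
codes, originals = pd.factorize(toy["Subj_id"])

# Store the numeric IDs in a new column (1-based) and keep the original column
toy["Subj_num"] = codes + 1

# Lookup table from numeric ID back to the original ID
id_lookup = pd.Series(originals, index=range(1, len(originals) + 1), name="Subj_id")

print(toy)
print(id_lookup)
```

On this toy data the numbering matches what groupby('Subj_id', sort=False).ngroup() + 1 produces, and id_lookup.loc[2] returns the original ID of subject 2.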