If you need to extract a portion (a substring) of a string variable, you can use the substr() function.
The authors of this guide can happily reveal that they have used it a lot when working with ICD codes (a classification system for diagnoses).
Function
Basic command
gen newvarname=substr(oldvarname,start,length)
Explanations
newvarname
Insert the name of the new variable (containing the substring).
oldvarname
Insert the name of the old variable (the original string variable).
substr
Extract a portion of the string variable.
start
Specify the position of the first character of the substring (the first character of the string is position 1).
length
Specify the length of the substring.
More information help substr
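To see how start and length work together, here is a minimal sketch that applies substr() to a literal string instead of a variable. The value "I219" is a made-up ICD-10-style code used only for illustration. The last two lines rely on two documented features of substr(): a missing length (.) returns the rest of the string, and a negative start counts from the end of the string.

```stata
* Toy illustration with a literal string; "I219" is a made-up code
display substr("I219", 1, 3)     // start at position 1, length 3
display substr("I219", 4, 1)     // start at position 4, length 1
display substr("I219", 2, .)     // start at position 2, rest of the string
display substr("I219", -2, 2)    // start 2 characters from the end, length 2
```

The same expressions can be used on the right-hand side of gen to create a new string variable.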
Practical example
Dataset
StataData1.dta
Variable name
cvd_date_str
Variable label
Date of out-of-patient care due to CVD (Ages 41-50, Years 2011-2020)
Value labels
N/A
We have a string variable called cvd_date_str that contains the date of out-patient care due to cardiovascular disease (CVD), coded as YYYYMMDD. Suppose that we want to extract the year (YYYY), month (MM), and day (DD) into separate variables.
gen cvd_year_str = substr(cvd_date_str,1,4)
gen cvd_month_str = substr(cvd_date_str,5,2)
gen cvd_day_str = substr(cvd_date_str,7,2)
As the commands above show, for year we specify 1 as the starting position and 4 as the length, since YYYY occupies characters 1 to 4. For month, we specify 5 and 2 (characters 5 to 6). And, finally, for day, we specify 7 and 2 (characters 7 to 8).
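If you want to try this without the dataset at hand, the sketch below builds a small toy dataset and runs the same three extractions. The three dates are made up and only stand in for StataData1.dta; the variable names follow the example above. The assert line is an optional sanity check we add here: joining the three parts back together should reproduce the original string.

```stata
* Toy data standing in for StataData1.dta (the dates below are made up)
clear
input str8 cvd_date_str
"20110315"
"20151201"
"20200704"
end

* The same three extractions as in the example
gen cvd_year_str  = substr(cvd_date_str, 1, 4)
gen cvd_month_str = substr(cvd_date_str, 5, 2)
gen cvd_day_str   = substr(cvd_date_str, 7, 2)

* Sanity check: year + month + day should equal the original string
assert cvd_year_str + cvd_month_str + cvd_day_str == cvd_date_str

list
```

Note that the new variables are still strings, just like the original variable.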