################################################################################### # U.S. Department of Health and Human Services # Office of Inspector General (OIG) # Office of Evaluation and Inspections # Toolkit for Calculating Opioid Levels and Identifying Patients At Risk of Misuse # or Overdose: R and SQL Programming Code # OEI-02-17-00561, May 2020 # This code uses prescription drug data to analyze patients' opioid levels and # identify patients who are at risk of opioid misuse or overdose. It is based on # the methodology OIG developed for its extensive work on opioid use in Medicare # Part D, including its recent data brief entitled, Opioid Use Decreased in # Medicare Part D, While Medication-Assisted Treatment Increased (OEI-02-19-00390). # This code is being shared as part of the OIG's Toolkit for Calculating Opioid # Levels and Identifying Patients At Risk of Misuse or Overdose: R or SQL # Programming Code (OEI-02-17-00561). Please read the full toolkit. It provides # important information about preparing the data for the analysis, including how to # merge data from the Centers for Disease Control and Prevention. The toolkit may # be accessed at www.oig.hhs.gov/oei/reports/oei-02-17-00560.asp. # OIG has also released versions of this code using Structured Query Language (SQL) # and SAS. For the code, see www.oig.hhs.gov/oei/reports/oei-02-17-00560.asp. # The code calculates patients' average daily morphine equivalent dose (MED), which # converts various prescription opioids and strengths into one standard value: # equivalent milligrams of morphine. This measure is also called morphine # milligram equivalent (MME). Users can use these data to identify patients at # varying levels of risk. # The code creates three patient-level datasets: # 1) OVERVIEW_MED dataset provides an overview of each patient's opioid # utilization. # 2) DAILY_MED dataset provides detailed daily MED amounts for selected patients. # 3) AT_RISK_MED dataset identifies at-risk patients who had certain MED levels. # Please have this information ready to enter into the code below: # . The file directory of the data and the name of the dataset. # . The number of days in the timeframe of the analysis, the first date of the # timeframe, the date format, and the number of days in the timeframe. For # example, a timeframe of calendar year 2019 begins on January 1, 2019, uses # %d%m%Y date format, and has 365 days in the timeframe. # . The field names in the data for: prescription fill date, patient ID, # quantity dispensed, days supply, strength per unit, morphine milligram # equivalent conversion factor, pharmacy ID (optional), and prescriber ID # (optional). # . The number of days in the period of MED analysis. The default is set at # 3 months, or 90 days. ################################################################################### ### BEFORE RUNNING CODE INSTALL REQUIRED R PACKAGES libs <- c("readr","dplyr","tictoc","purrr","RcppRoll","data.table") new.libs <- libs[!(libs %in% installed.packages()[,"Package"])] if(length(new.libs)) install.packages(new.libs) lapply(libs, require, character.only=TRUE) ## INSERT THE FILE DIRECTORY OF THE DATA HERE ## setwd("LINK TO DATA") ################################################################################### ### 1) OVERVIEW_MED: AN OVERVIEW OF EACH PATIENT'S OPIOID UTILIZATION # This section of the code provides an overview of each patient's opioid # utilization. The output data contains the following fields for each patient: # PATIENT_ID Unique patient identifier. # OPIOID_DAYS Days of opioid use, or the number of days during the # timeframe when the patient has a daily MED that is greater # than zero. # ALL_AVG_MED Average daily MED for the entire study timeframe. For # example, for a timeframe of 1 year, this is the patient's # average daily MED for that year. # MAX_AVG_MED_90 Highest average daily MED for any 90 days. This is the # maximum average daily MED over any 90-day period in the # timeframe. # Brief overview of the code: # 1a. Establish function parameters. # 1b. Establish global variables for use through the code. # 1c. Read and prepare the prescription data. # 1c1. Map parameter values to internal functional variables. Filter out any # missing values. # 1c2. Calculate values: # . RX_FILL_DATE: prescription fill date # . RX_END_DATE: prescription end date is the last date of prescription. # This is the prescription fill date + days supply - 1. # . AVG_DAILY_MED: average daily MED # . PT0: the relative number of days before or after the timeframe start # date that a prescription was filled plus one. Negative values # indicate that the prescriptions were filled before the first day of # the timeframe. # . PT1: the relative number of days before or after the timeframe start # date that a prescription ended. Negative values indicate that the # prescriptions ended before the first day of the timeframe. # 1c3. Keep prescriptions within the timeframe. # 1c4. Set day indexes for the day of the timeframe that a prescription starts # and ends. # . T0_IDX: the day index within the timeframe a prescription starts. # Prescriptions starting before the first day of the timeframe are # assigned a day index of 1. # . T1_IDX: the day index within the timeframe a prescription ends. # Prescriptions ending after the last day of the timeframe are # assigned a day index equivalent to the total days in timeframe. # 1d. Create a list of dataframes. Each dataframe is specific to a single # patient ID. Each dataframe is ordered by the prescription date. Each # prescription contains the AVG_DAILY_MED, T0_, and T1_ day indexes. # 1e. Initialize a vector (INIT_VEC) with an element length equal to the number # of days in timeframe with each element set to 0. # 1f. Define an empty list (RESULTS_LIST) with a vector length equivalent to the # total number of patient IDs. This list will be populated with summarized # opioid metrics for each patient ID. # 1g. Loop through each prescription. For each prescription in each dataframe in # the ANALYSIS_LIST: # . Define a parameter vector with the AVG_DAILY_MED and T1_/T0_ day indexes. # . Create a copy of INIT_VEC named MED_VEC. Insert the AVG_DAILY_MED into # the range bookended by T1_ and T0 day indexes. # . Sum the resulting list of vectors to get the AVG_DAILY_MED for each day # of the timeframe and store it in the vector (MED_VEC). # . Define variables for each patient ID and compute opioid variables. # Combine them into a vector inserted iteratively into RESULTS_LIST. # 1h. Convert RESULTS_LIST to a rectangular dataframe (OVERVIEW_MED). # 1i. Write the overview dataset to CSV file. # 1a overview_med <- function(data_cleaned_csv, days_in_timeframe, window, first_date, date_format, patient_id, prescriber_id, pharmacy_id, prescription_fill_date, quantity_dispensed, days_supply, strength_per_unit, mme_conversion_factor) { # 1b timeframe_start_date <<- as.Date(first_date, format = date_format) time_frame_end_date <<- timeframe_start_date + days_in_timeframe - 1 first_date <<- first_date days_in_timeframe <<- days_in_timeframe date_format <<- date_format # 1c prescriptions_df <<- fread(paste0("./",data_cleaned_csv)) %>% # 1c1 rename('pat_id' = !! patient_id, 'rx_fill_date' = !! prescription_fill_date, 'quantity' = !! quantity_dispensed, 'days' = !! days_supply, 'strength' = !! strength_per_unit, 'mme_con_factor' = !! mme_conversion_factor, 'presc_id' = !! prescriber_id, 'pharm_id' = !! pharmacy_id) %>% filter(!is.na(pat_id)) %>% filter(!is.na(rx_fill_date)) %>% filter(!is.na(quantity)) %>% filter(!is.na(days)) %>% filter(!is.na(strength)) %>% filter(!is.na(mme_con_factor)) %>% # 1c2 mutate( rx_fill_date = as.Date(rx_fill_date, format = date_format), rx_end_date = as.Date(rx_fill_date) + days - 1, avg_daily_med = ((strength * mme_con_factor * quantity) / days), pt0_idx = as.numeric(rx_fill_date - timeframe_start_date) + 1, pt1_idx = as.numeric(pt0_idx) + days - 1) %>% # 1c3 filter(rx_end_date >= timeframe_start_date) %>% filter(rx_fill_date <= time_frame_end_date) %>% # 1c4 mutate( t0_idx = ifelse(rx_fill_date <= timeframe_start_date, 1, rx_fill_date - timeframe_start_date + 1), t1_idx = ifelse(rx_end_date > time_frame_end_date, days_in_timeframe, rx_end_date - timeframe_start_date + 1)) %>% select(presc_id, pharm_id, pat_id, rx_fill_date, rx_end_date, days, avg_daily_med, pt0_idx, pt1_idx, t0_idx, t1_idx) %>% arrange(t0_idx) # 1d analysis_list <<- prescriptions_df %>% mutate(rx_rank = as.numeric(as.Date(rx_fill_date, format = date_format))) %>% group_by(pat_id) %>% arrange(rx_rank) %>% mutate(rx_idx = row_number()) %>% ungroup() %>% arrange(pat_id, rx_rank) %>% select(rx_idx, pat_id, avg_daily_med, t0_idx, t1_idx) %>% filter_all(all_vars(!is.infinite(.))) %>% split(.$pat_id) # 1e init_vec <- numeric(days_in_timeframe) # 1f results_list <- vector("list", length = length(analysis_list)) # 1g for (i in seq_along(analysis_list)) { param_vecs <- pmap(analysis_list[[i]][c("avg_daily_med","t0_idx","t1_idx")], c, use.names = FALSE) med_vec <- param_vecs %>% map(function(x){ init_vec[x[2]:x[3]] <- x[1] return(init_vec) }) %>% reduce(`+`) pat_id <- analysis_list[[i]][1,2] opioid_days <- length(med_vec[med_vec > 0]) all_avg_med <- sum(med_vec) / days_in_timeframe max_avg_med <- max(roll_mean(med_vec, window)) results_list[[i]] <- c(pat_id, opioid_days, all_avg_med, max_avg_med) } # 1h overview_med_df <<- do.call(rbind.data.frame, results_list) %>% set_names( "patient_id","opioid_days","all_avg_med","max_avg_med") # 1i overview_med_df %>% set_names("patient_id","opioid_days",paste0("all_avg_med_", days_in_timeframe), paste0("max_avg_med_",window)) %>% fwrite("./overview_med.csv",row.names = FALSE, col.names = TRUE) } ## ENTER TIMEFRAME, DATASET NAMES, AND FIELD NAMES ## # Enter the timeframe, the name of the dataset that contains the prescription # drug data, and the field names, as described below. tic("Overview MED: An Overview of Each Patient's Opioid Utilization") overview_med( ## ENTER FIRST DATE OF THE TIMEFRAME ## # For example, for a timeframe of calendar year 2019 (January 1, # 2019 to December 31, 2019), the first date is January 1, 2019. first_date = "01JAN2019", ## SPECIFY THE DATE FORMAT ## # The date format needs to use a format where each calendar date # is equal to 1. This is required for the calculations to work # properly. If the field uses another format (i.e., date time), # some manipulation of the data may be needed to make it a date # format. Search 'R date formats' online for more information. date_format = "%d%b%Y", ## ENTER THE NUMBER OF DAYS IN THE TIMEFRAME ## # For example, a timeframe of calendar year 2019 (January 1, 2019 # to December 31, 2019) has 365 days. days_in_timeframe = 365, ## ENTER NAME OF CLEANED DATASET OF OPIOID PRESCRIPTION DATA ## # Please note that the code does not exclude cancer and hospice # patients. Please see the toolkit text for more information # about excluding these patients from the analysis. data_cleaned_csv = "data_cleaned.csv", ## ENTER PRESCRIPTION FILL DATE FIELD NAME ## # Again, the date format needs to use a format where each # calendar date is equal to 1. This is required for the # calculations to work properly. If the field uses another # format (i.e., date time), some manipulation of the data may # be needed to make it a date format. Search 'R date formats' # online for more information. prescription_fill_date = 'prescription_fill_date', ## ENTER UNIQUE PATIENT IDENTIFIER FIELD NAME ## # Note: the analysis relies on each patient having a unique # identifier. If the prescription drug data do not contain a # unique identifier for each patient, additional steps should be # taken to create one. patient_id = 'patient_id', ## ENTER QUANTITY DISPENSED FIELD NAME ## quantity_dispensed = 'quantity_dispensed', ## ENTER DAYS SUPPLY FIELD NAME ## days_supply = 'days_supply', ## ENTER STRENGTH PER UNIT FIELD NAME ## strength_per_unit = 'strength_per_unit', ## ENTER MORPHINE MILLIGRAM EQUIVALENT (MME) CONVERSION FACTOR # FIELD NAME ## mme_conversion_factor = 'MME_Conversion_Factor', ## If available: ENTER PHARMACY UNIQUE IDENTIFIER FIELD NAME ## # Note: This field is optional. pharmacy_id = 'pharmacy_id', ## If available: ENTER PRESCRIBER UNIQUE IDENTIFIER FIELD NAME ## # Note: This field is optional. prescriber_id = 'prescriber_id', ## ENTER NUMBER OF DAYS IN PERIOD OF MED ANALYSIS ## # The default is set at 3 months, or 90 days. window = 90) toc() ################################################################################### ### 2) DAILY_MED: DETAILED DAILY MED AMOUNTS FOR SELECTED PATIENTS # This code provides the daily MED for each day in the study timeframe for selected # patients. This code can be used to look more closely at the daily MED for # selected patients with concerning patterns. For example, if a patient has an # extremely high average daily MED, these data will show which days during the # timeframe the patient received the highest amounts. # The code works similar to OVERVIEW_MED. Brief overview of the code: # 2a. Initialize a vector (MED_VEC) with an element length equal to the number of # days in timeframe with each element set to 0. # 2b. Subset ANALYSIS_LIST by select PATIENT_ID dataframe and loop through # prescriptions: # . Define a parameter vector with the AVG_DAILY_MED and T1_/T0_ day indexes. # . Insert the AVG_DAILY_MED into the range bookended by T1_ and T0_ day # indexes. # . Sum the resulting list of vectors to get the AVG_DAILY_MED for each day # of the timeframe and store it in the vector (MED_VEC). # . Define variables for each patient ID and compute opioid variables. # Combine them into a vector inserted iteratively into MED_VEC. # 2c. Generate a dataframe with sequential rows for the date in timeframe and the # corresponding total average daily MED. # 2d. Write the patient MED dataset to CSV file. daily_med <- function(patient_id) { # 2a med_vec <- numeric(days_in_timeframe) # 2b param_vecs <- pmap(analysis_list[[patient_id]][c("avg_daily_med", "t0_idx","t1_idx")], c, use.names = FALSE) med_vec <- param_vecs %>% map(function(x){ med_vec[x[2]:x[3]] <- x[1] return(med_vec) }) %>% reduce(`+`) # 2c patient_med_df <<- data.frame(patient_id = patient_id, date = seq(as.Date(first_date, format = date_format), as.Date(first_date, format = date_format) + (days_in_timeframe - 1), "days"), daily_med = med_vec) # 2d patient_med_df %>% fwrite(paste0("./patient_daily_med_",patient_id,".csv"), row.names = FALSE, col.names = TRUE) } tic("Daily MED: Detailed Daily MED Amounts for Selected Patients") daily_med(patient_id = c( ## ENTER PATIENT ID HERE ## 'PATIENT_ID')) toc() ################################################################################# ### 3) AT_RISK_MED: AT-RISK PATIENTS WHO HAD CERTAIN MED LEVELS # This code identifies patients who, based on their average daily MED, are at # risk of opioid misuse or abuse. The code identifies patients who met the # criteria for the measures used in a series of OIG data briefs, including Opioid # Use Decreased in Medicare Part D, While Medication-Assisted Treatment Increased # (OEI-02-19-00390). # This section of the code can be adjusted depending on each user's needs. For # example, if the purpose of the analysis is to identify patients who may # benefit from additional case management, the MED thresholds can be lowered. # If the purpose is to refer incidences of possible drug diversion to law # enforcement, the MED thresholds can be increased. Similarly, if the user's # analysis is based on a shorter or longer timeframe than 1 year, the days of # opioid use can be changed. # The code identifies patients who appear to be doctor shopping. This analysis # requires unique pharmacy identifiers and unique prescriber identifiers in the # opioid prescription data. For each patient, it counts the number of distinct # pharmacies and prescribers that had at least 1 opioid prescription with a fill # date in the timeframe. As noted below, this section of the code is optional # and should be skipped if these unique identifiers are not available. # The output dataset contains the following fields in addition to the fields # from the OVERVIEW_MED dataset: # HIGH_MED Patients who received high amounts of opioids. Patients # with an average daily MED greater than 120 mg for any # 90-day period and had at least 90 days of opioid use. # 1=Yes, 0=No. # EXTREME_MED Patients who received extreme amounts of opioids. # Patients with an average daily MED that was greater than # 240 mg for the entire year and had at least 360 days of # opioid use. 1=Yes, 0=No. # DOCTORSHOP_MED Patients who appear to be doctor shopping. Patients with # a high amount of opioids (i.e., average daily MED # greater than 120 mg for any 90-day period) and who # received opioid prescriptions from four or more # prescribers and four or more pharmacies during the year. # 1=Yes, 0=No. # Brief overview of the code: # 3a. Count the number of distinct pharmacies and prescribers that had at least # 1 opioid prescription with a fill date in the timeframe. Do not run this # if unique pharmacy identifier or unique patient identifier data are not # available. # 3b. Join prescriber and pharmacy counts and apply conditions to flag patients. # If unique pharmacy identifier or unique patient identifier data are not # available, then remove join and the DR_SHOPPER parts. # 3c. Write at-risk overview data to CSV file. at_risk_med <- function(high_opioid_days, extreme_opioid_days, high_med_threshold, extreme_med_threshold, prescriber_count, pharmacy_count) { # 3a count_pharms_presc <- prescriptions_df %>% filter(as.Date(rx_fill_date, format = date_format) >= as.Date(first_date, format = date_format)) %>% filter(as.numeric(as.Date(rx_fill_date, format = date_format)) <= as.numeric(as.Date(rx_fill_date, format = date_format)) + days_in_timeframe - 1) %>% rename('patient_id' = pat_id) %>% group_by(patient_id) %>% mutate(prescriber_cnt = n_distinct(presc_id, na.rm = TRUE), pharmacy_cnt = n_distinct(pharm_id, na.rm = TRUE)) %>% select(patient_id, prescriber_cnt, pharmacy_cnt) %>% distinct() # 3b at_risk_overview <<- overview_med_df %>% left_join(count_pharms_presc) %>% mutate(high_med = case_when((opioid_days >= high_opioid_days & max_avg_med > high_med_threshold) ~ 1, TRUE ~ 0), extreme_med = case_when((opioid_days >= extreme_opioid_days & all_avg_med > extreme_med_threshold) ~ 1, TRUE ~ 0), dr_shopper = case_when((high_med == 1 & pharmacy_cnt >= pharmacy_count & prescriber_cnt >= pharmacy_count) ~ 1, TRUE ~ 0)) %>% select(patient_id, opioid_days, all_avg_med, max_avg_med, pharmacy_cnt, prescriber_cnt, high_med, extreme_med, dr_shopper) %>% replace(., is.na(.), 0) # 3c at_risk_overview %>% set_names("patient_id","opioid_days",paste0("all_avg_med_", days_in_timeframe),paste0("max_avg_med_",high_opioid_days), "pharmacy_count","prescriber_count","high_med","extreme_med", "doctorshop_med") %>% fwrite("./at-risk-med.csv", row.names = FALSE, col.names = TRUE) } tic("At-Risk Patients: At-Risk Patients Who had Certain MED Levels ") at_risk_med( ## The MED values (MED_THRESHOLDs), days of opioid use # (OPIOID_DAYS), prescriber count, and pharmacy count may be # adjusted as needed. high_med_threshold = 120, extreme_med_threshold = 240, high_opioid_days = 90, extreme_opioid_days = 360, prescriber_count = 4, pharmacy_count = 4) toc() ### END OF CODE ################################################################################# # Note that the Office of Inspector General (OIG) is providing this toolkit, # including the associated programming code, to assist users in analyzing large # datasets of prescription drug claims to identify individuals at risk of # potential opioid abuse or misuse. This toolkit was prepared as a technical # resource and is not intended to, and does not, create any rights, privileges, # or benefits, substantive or procedural, enforceable by a party against the # United States; its agencies or instrumentalities; its officers or employees; or # any other person. The toolkit is provided in "as-is" condition, and OIG and # its employees, agents, and staff disclaim any express or implied # representation, warranty, or guarantee, including, but not limited to, the # implied warranties of merchantability and fitness for a particular purpose. # In particular, no representation is made that the information included in the # toolkit, or any data the toolkit produces, is error free. The toolkit should # not be used as the sole basis to determine whether an individual is abusing or # overdosing on opioids or other prescription drugs, or in any determinations of # coverage or dispensing by an insurer, pharmacy, provider, or other individual # or organization. The toolkit is not intended to be used to determine # compliance with any laws, regulations or other guidance. In no event shall # OIG or its employees, agents, or staff be liable for any claim, damages, or # liability, whether in an action of contract, tort or otherwise, and including # direct, indirect, incidental, special, exemplary, or consequential damages, # however caused, and on any theory of liability, arising in any way out of the # use of this toolkit or its associated code, even if advised of the possibility # of such damage. Compatibility of the toolkit with any user systems is not # guaranteed, and any manipulation or alteration of the code is the sole # responsibility of the user. #################################################################################