How to Create Flag Fields with Hive Arrays
Purpose
Creating a field to flag specific values for a field represented by many different fields (i.e., ICD Diagnosis Codes, Condition Codes, etc.), can be done by putting the fields into an array and searching the entire array for a particular value.
Best Practice
The syntax below shows an example of putting the potential 30 condition code fields into an array and create a flag that identifies if any of the field values is equal to a particular value of interest.
- It creates and calls a macro that selects and renames condition code field 1 through 30 into a macro variable that is called to extract from Hive claims table.
- Then that table is used to create a new Hive Table that uses the Hive array function and searches for a particular value ('44' in this example) throughout the array.
- The output value is set to 1 if found, or 0 if not found in the newly created field (cond44).
/*MACRO TO CREATE CALLS FOR FIELDS WITH UP TO 30 VALUES*/ %MACRO loop30; %GLOBAL cd cn; %DO i=1 %TO 30; %IF &i=1 %THEN %DO; %LET cd = %STR(h.clm_rlt_cond_cd_&i as cond&i,); %LET cn = %STR(cond&i,); %END; %ELSE %if &i=30 %THEN %DO; %LET cd = &cd %STR(h.clm_rlt_cond_cd_&i as cond&i); %LET cn = &cn %STR(cond&i); %END; %ELSE %DO; %LET cd = &cd %STR(h.clm_rlt_cond_cd_&i as cond&i,); %LET cn = &cn %STR(cond&i,); %END; %END; %MEND; %loop30; /*DROP AND PURGE HIVE TABLE*/ %hive_drop_table([schema], [table1]); /*EXTRACT 30 CONDITION CODE FIELDS FROM HIVE CLAIMS TABLE*/ %hive_exec_sql( %STR(CREATE TABLE [schema].[table1] STORED AS parquet TBLPROPERTIES ('parquet.compress'='SNAPPY') AS SELECT &cd FROM [hive_db].[hive_claims_table] h)); /*DROP AND PURGE HIVE TABLE*/ %hive_drop_table([schema], [table2]); /*PUT 30 CONDITION CODE FIELDS INTO ARRAY AND CREATE VALUE FLAG INDICATOR*/ %hive_exec_sql( %STR(CREATE TABLE [schema].[table2] STORED AS parquet TBLPROPERTIES ('parquet.compress'='SNAPPY') AS SELECT array_contains(array(&cn),cast("44" as varchar(2))) as cond44 FROM [schema].[table1]));
Related articles