Loading Data into CASLIB is Slow
Purpose
This guide explains how to speed up loading data into CASLIB when the load takes a significant amount of time.
Solution
Use VARLIST to limit the variables that are loaded into memory and use a WHERE clause to filter and limit data being loaded into memory. This should improve overall system performance for all users.

PROC CASUTIL INCASLIB="claims_sample" OUTCASLIB="&mylib.";
    /* Load some data from HHA_HEADER; */
    LOAD PROMOTE CASDATA="hha_header" CASOUT="hha_header"
        /* Only select a small subset of fields for our analysis; */
        VARLIST = {"clm_fac_type_cd","clm_from_dt","clm_dgns_cd_1","clm_pmt_amt","nch_prvdr_state_cd"}
        /* Limit the data using Hive date syntax to subset; */
        OPTIONS = {USERNAME='sasuser',WHERE = "clm_from_dt > DATE('2019-10-01') LIMIT 10000" };
QUIT;
When loading data into CASLIB, users must also compress their data to speed up the loading process. See the related article: How to Compress Data.
Related articles