Loading Data into CASLIB is Slow

Purpose

This guide explains how to speed up loading data into a CASLIB when the load takes a significant amount of time.

Solution

Use VARLIST to load only the variables you need, and use a WHERE clause to filter the rows, so that less data is loaded into memory.
Loading less data shortens the load time and should improve overall system performance for all users.

PROC CASUTIL INCASLIB="claims_sample" OUTCASLIB="&mylib.";
 /* Load some data from HHA_HEADER; */
 LOAD PROMOTE CASDATA="hha_header" CASOUT="hha_header"
 /* Only select a small subset of fields for our analysis; */
 VARLIST = {"clm_fac_type_cd","clm_from_dt","clm_dgns_cd_1","clm_pmt_amt","nch_prvdr_state_cd"}
 
 /* Limit the data using Hive date syntax to subset; */
 OPTIONS = {USERNAME='sasuser',WHERE = "clm_from_dt > DATE('2019-10-01') LIMIT 10000" };
QUIT;
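
After the load completes, it can help to confirm that the in-memory table holds only the expected columns and rows. The following is a minimal sketch, not part of the original procedure. It assumes an active CAS session and that &mylib. resolves to the same target caslib used in the LOAD step above.

```sas
/* Sketch only: assumes &mylib. is the caslib that received hha_header */
PROC CASUTIL INCASLIB="&mylib.";
 /* List the in-memory tables in the target caslib */
 LIST TABLES;
 /* Show row count and column metadata for the loaded table */
 CONTENTS CASDATA="hha_header";
QUIT;
```

If the column list or row count is larger than expected, review the VARLIST and WHERE settings before loading again.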

When loading data into a CASLIB, users must also compress their data to speed up the loading process. See the related article: How to Compress Data.